teawd's blog

#tech [RSS]

10.04.2022

Speech to text diary scripts

First of all, this blog post is about a somewhat specific problem. Despite this, I think it is quite useful to most readers looking for some Linux knowledge as the tools I used are very general. Let's start with the explanation of

The problem

Sometimes I have good ideas when I'm outside. Sometimes I really wanna preserve these ideas, but it's very inconvenient to use phone keyboards, especially on the move. One solution is recording my voice. But then, I would want to transfer those recordings to my computer for future reference. But what if I also want to read the text version of those voice recordings? Let's handle these problems one at a time.

Syncing files

Syncthing

Syncthing is an open source, cross-platform program that can sync folders across multiple devices. It's perfect for this problem, as it allows for P2P syncing (for example from a phone to a computer).
It's very easy to set up as well: you download it on both devices, open a browser tab at 127.0.0.1:8384 (Syncthing has a web frontend) and scan the device ID QR code with your phone's camera. Then share the folder that holds your audio recordings.

If your init system is systemd, you can start Syncthing (and enable autostart at boot) by running the following, where USER is your Linux user:

    systemctl enable syncthing@USER.service --now

Syncthing has configuration options for encryption and file filtering. If you want more robust syncing, you could also host a Syncthing instance on a server.

Speech to text

To convert my speech to text I use the SpeechRecognition Python package:

    pip install SpeechRecognition

It supports many speech recognition options, including offline ones. It's also really easy to use; check my script.

Options (present in my script as comments):

Pocketsphinx (low-accuracy offline option): pip install pocketsphinx
Google translate (has a limit on length)
Google cloud (free, requires registration)

There are also other online and offline options. (SpeechRecognition documentation)
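For orientation, here is a minimal sketch of what a transcription call with SpeechRecognition can look like. It is not the script from the repo: the names `transcribe`, `recognizer_method` and the engine labels are made up for illustration. `sr.AudioFile` reads WAV, AIFF and FLAC files, so recordings in other formats (like .m4a) have to be converted first.

```python
# Map a short engine label (hypothetical names) to the Recognizer method
# that SpeechRecognition provides for it.
ENGINES = {
    "sphinx": "recognize_sphinx",  # offline, needs pocketsphinx
    "google": "recognize_google",  # online, Google Web Speech API
}


def recognizer_method(engine):
    """Return the Recognizer method name for an engine label."""
    try:
        return ENGINES[engine]
    except KeyError:
        raise ValueError(f"unknown engine: {engine!r}") from None


def transcribe(audio_path, engine="sphinx"):
    """Transcribe a WAV/AIFF/FLAC file and return the recognized text."""
    # Imported lazily so the helper above works without the package installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_path) as source:
        audio = recognizer.record(source)  # read the whole file
    try:
        return getattr(recognizer, recognizer_method(engine))(audio)
    except sr.UnknownValueError:
        return ""  # the engine could not make out any speech
```

One way to do the conversion (an assumption, not necessarily what the script does) is `ffmpeg -i recording.m4a recording.wav` before calling `transcribe("recording.wav")`.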

Storing recordings in notes

Vimwiki

I use vimwiki, a (neo)vim plugin that makes managing markdown files in a wiki-like manner extremely easy. Vimwiki also has a diary option. To store the recording path and text in my notes, I wrote a few scripts.

The speech-to-diary.sh script calls the speech recognition script on a selected file, appends the transcription to a vimwiki diary and (optionally) moves the voice recording to the vimwiki folder. It has a bunch of flags; you can get a help message by running:

    ./speech-to-diary.sh -h

The entry it writes looks like this:

    20220328_162342.m4a 16:24:05 Whatever speech recognition returned /home/tea/vimwiki/diary/resources/20220328_162342.wav

I have a vim shortcut that plays the audio file under the cursor, so I can also listen to it if the transcription is not accurate enough.

If you don't use vim or vimwiki, you can just change the last line of the script to save the text to a file of your choosing.
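If you would rather do the saving step in Python, a minimal sketch could look like the following. Everything here is an assumption for illustration: the function names, the `YYYY-MM-DD.md` file naming, and the three-line layout of the entry (recording name and time, text, audio path) are guesses based on the sample entry, not taken from the script.

```python
from datetime import datetime
from pathlib import Path


def format_entry(recording_name, text, audio_path, when):
    """Build one diary entry: recording name and time, text, audio path."""
    return f"{recording_name} {when:%H:%M:%S}\n{text}\n{audio_path}\n"


def append_to_diary(diary_dir, entry, when):
    """Append the entry to the diary file for that day and return its path."""
    diary_file = Path(diary_dir) / f"{when:%Y-%m-%d}.md"  # assumed file naming
    diary_file.parent.mkdir(parents=True, exist_ok=True)
    with diary_file.open("a", encoding="utf-8") as f:
        f.write(entry + "\n")  # blank line between entries
    return diary_file
```

Calling `append_to_diary` several times on the same day keeps adding entries to the same file, which is the behaviour you want for a diary.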

Tying it all together

Actual Linux knowledge

Now, we need a way to react to new files appearing in the syncing directory. For that, we can use the Linux inotify interface, which reports changes to files. For Syncthing, the following command works:

    inotifywait -c -r -m -e attrib "$PATH_TO_WATCH"

Here -c prints events as CSV, -r watches subdirectories recursively, -m keeps monitoring instead of exiting after the first event, and -e attrib limits the output to attribute changes. You can also see it used in a script. Be sure to read the README in the repo for some additional information on setting up a systemd service (if you don't know how).
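The same watch loop can be sketched in Python by reading the CSV lines that `inotifywait -c` prints (watched directory, event names, file name). This is an illustration, not the watch script from the repo; `parse_event`, `watch`, `handle` and the extension list are hypothetical.

```python
import csv
import os
import subprocess

AUDIO_EXTENSIONS = {".m4a", ".wav", ".mp3", ".ogg"}  # assumed recording formats


def parse_event(line):
    """Parse one `inotifywait -c` line into (path, set of event names)."""
    # The csv module handles quoted fields such as "ATTRIB,ISDIR".
    watched_dir, events, filename = next(csv.reader([line]))
    return os.path.join(watched_dir, filename), set(events.split(","))


def watch(path_to_watch, handle):
    """Call handle(path) for every audio file that gets an ATTRIB event."""
    cmd = ["inotifywait", "-c", "-r", "-m", "-e", "attrib", path_to_watch]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            path, events = parse_event(line.rstrip("\n"))
            extension = os.path.splitext(path)[1].lower()
            if "ATTRIB" in events and extension in AUDIO_EXTENSIONS:
                handle(path)
```

For example, `watch("/path/to/sync", print)` would print the path of each recording as it arrives; in practice `handle` would start the transcription.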

Now, if you autostart syncthing and the watchscript at system boot, it will process all new audio files.

GH Repo SpeechRec Docs

13.03.2022

Simple video editing with MPV

#tech #linux

MPV is one of my favorite open source projects. It's a simple-looking video player with an enormous amount of features, and it is highly extensible through the Lua programming language.

MPV-splice Script

MPV-splice is a script that allows you to use MPV as a video editor. You set the start and end timings for fragments, and then the script cuts the video and concatenates the fragments.

I made a fork of it to extend its abilities.

I added an option to automatically upload the video (to any platform, but the default config works for Streamable).
I added an option to reencode the video, making the timings frame perfect. By default, the script does not reencode the video, allowing for extremely fast results.
I fixed the editing of online videos. The script will only download the selected portions of the video, which allows you to cut out fragments of even hours-long YouTube videos, for example.
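The difference between the fast default and the reencode option can be illustrated with the ffmpeg commands involved. This is a sketch of the general technique, not the Lua code of mpv-splice; the function names are hypothetical. With stream copy (`-c copy`) nothing is re-encoded, so cutting is fast but cuts snap to keyframes; dropping `-c copy` makes ffmpeg re-encode, which is slower but frame accurate.

```python
def cut_command(source, start, end, output, reencode=False):
    """Build an ffmpeg command that extracts start..end (seconds) from source."""
    cmd = ["ffmpeg", "-ss", str(start), "-i", source, "-t", str(end - start)]
    if not reencode:
        cmd += ["-c", "copy"]  # stream copy: fast, but cuts snap to keyframes
    return cmd + [output]


def concat_list(fragment_paths):
    """Contents of the list file read by ffmpeg's concat demuxer."""
    return "".join(f"file '{path}'\n" for path in fragment_paths)


def concat_command(list_file, output):
    """Build an ffmpeg command that joins the fragments named in list_file."""
    return ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_file,
            "-c", "copy", output]
```

Each command is returned as a list so it can be passed straight to `subprocess.run` without shell quoting problems.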

Another useful script I forked is 8mb, which compresses your video to 8 MB, changing its FPS and resolution to appropriate values.
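The arithmetic behind hitting a target file size is simple: the size budget divided by the duration gives the total bitrate, and whatever the audio does not use is left for the video. The sketch below shows that calculation only; the 96 kbit/s audio bitrate and the function name are assumptions, and the 8mb script may compute its values differently.

```python
def video_bitrate_kbps(duration_s, target_mb=8.0, audio_kbps=96.0):
    """Video bitrate (kbit/s) that keeps the whole file at target_mb."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    total_kbit = target_mb * 8 * 1000  # 1 MB taken as 10^6 bytes = 8000 kbit
    video = total_kbit / duration_s - audio_kbps
    if video <= 0:
        raise ValueError("clip is too long for this size and audio bitrate")
    return video
```

A 64 second clip gets 64000 / 64 - 96 = 904 kbit/s for video. The longer the clip, the lower this number, which is why lowering FPS and resolution becomes necessary to keep the picture watchable.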

mpv-splice repo public-scripts(includes 8mb)

20.02.2022

Music Playing: Local & Remote

#tech #linux

The most basic way to play music is using some website. Most people prefer Spotify, but I never liked it.

For the longest time I've been using SoundCloud. It has most of the music I like, especially the quirky electronic type. It also allows you to upload songs without any additional setup. But, of course, there are songs that are not on there. The worst part is that some more mainstream artists have their songs region-locked or only available with SoundCloud premium.

Local Music Library

A year back I decided to begin storing my music locally. This approach ensures that I have access to all my favorite music, in the best quality, offline, forever.

It is honestly really freeing and opened up a bunch of possibilities for me.

DeaDBeeF Music Player

I first settled on DeaDBeeF, as it's highly customisable, extensible and actively developed.

Configuring takes some time to get used to. You have to enable 'design mode' in the 'View' submenu to start adding and removing panes. It allows you to use a lot of GTK3 (UI library) widgets to structure your window however you like. You can find my config in the GitHub repo linked below.

I especially like the visualization plugins that I found. On Arch-based systems it is extremely easy to install plugins:

    yay -S deadbeef-plugin-spectrogram-gtk3-git \
        deadbeef-plugin-musical-spectrum-gtk3-git \
        deadbeef-plugin-rating \
        deadbeef-plugin-waveform-gtk3-git

Some plugins are configured through settings, others through the right-click menu.

There are 2 main problems with DeaDBeeF for me:

Lack of dynamic playlists, or of a way to go through artists/albums without creating playlists for them manually.
No playback queue display (it only shows the song position in the queue next to it).

Both of these features are planned, so I hope we'll get them at some point in the future.

MPD Music Server

Recently I started using MPD. It's a bit technical.

It might sound scary at first but all you need to do to set it up is create 1 config file and run the server. There are a lot of clients that allow you to connect to an MPD server with relatively conventional interfaces. I am currently using Cantata, as it's the most feature rich one I could find.

The great thing about MPD is that you can connect as many clients as you want and they will all be synchronized with each other. There are clients for scrobbling your music to last.fm (with mpdas), setting your Discord 'Rich Presence' and controlling playback with media keys or a notification (with mpdris2). It's the most modular music playing experience and I love it. My config looks like this:

    music_directory    "~/Music"
    db_file            "~/.config/mpd/database"
    playlist_directory "~/.config/mpd/playlists"
    sticker_file       "~/.config/mpd/sticker.sql"
    log_file           "syslog"
    auto_update        "yes"
    restore_paused     "yes"

    audio_output {
        type "pulse"
        name "pulse audio"
    }

    audio_output {
        type   "fifo"
        name   "my_fifo"
        path   "/tmp/mpd.fifo"
        format "22050:16:2"
    }

'sticker_file' allows for custom music ratings (supported by Cantata), and the 'fifo' audio output allows for latency-less audio visualization with Glava.
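The reason so many clients exist is that MPD speaks a simple line-based text protocol: the server greets with `OK MPD <version>`, each command is one line, and a reply is a series of `key: value` lines ending in `OK` (or a line starting with `ACK` on error). A minimal sketch of a client asking for the playback status, assuming MPD listens on its default port 6600 on localhost; the function names are made up for illustration:

```python
import socket


def parse_response(lines):
    """Turn 'key: value' lines from MPD into a dict, stopping at 'OK'."""
    result = {}
    for line in lines:
        if line == "OK":
            break
        if line.startswith("ACK"):
            raise RuntimeError(f"MPD error: {line}")
        key, _, value = line.partition(": ")
        result[key] = value
    return result


def mpd_status(host="localhost", port=6600):
    """Connect to MPD, send `status` and return the reply as a dict."""
    with socket.create_connection((host, port), timeout=5) as sock:
        reader = sock.makefile("r", encoding="utf-8")
        greeting = reader.readline()  # the server speaks first
        if not greeting.startswith("OK MPD"):
            raise RuntimeError(f"unexpected greeting: {greeting!r}")
        sock.sendall(b"status\n")
        return parse_response(line.rstrip("\n") for line in reader)
```

With a running server, `mpd_status()["state"]` would be `play`, `pause` or `stop`. Real clients like Cantata do the same thing with many more commands.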

To install everything I mentioned on Arch you would run:

    yay -S mpd mpd-discord-rpc-git mpdas mpdris2 cantata

Remote Music Library

But what if I want to play my music on a device where I don't have my audio library?

MPD Does it All

You can create an HTTP output by putting the following lines in your config:

    audio_output {
        type        "httpd"
        name        "My HTTP Stream"
        encoder     "lame"
        port        "8000"
        bitrate     "192"
        format      "44100:16:1"
        max_clients "0"    # 0=no limit
    }

I host an MPD instance on my server and can control playback with any client. On my phone I use MPDroid (which can also stream the audio). To stream audio on desktop you have to connect to the specified port through a music player that can play HTTP streams, like mpv, or a web browser. MPD clients, such as Cantata, can also play HTTP audio streams. The delay is a little annoying. There is a setup that allows for zero-delay playback (with a local MPD using a remote music library) that I saw here: www.joram.io/blog/android-streaming-mpd/
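As a small illustration of the desktop side, this sketch builds the stream URL from the port in the httpd config and hands it to mpv. The host name is a placeholder and the helper names are hypothetical; `--no-video` simply tells mpv not to open a video window.

```python
import subprocess


def stream_url(host, port=8000):
    """URL of the MPD httpd output (port as set in the config)."""
    return f"http://{host}:{port}"


def mpv_command(host, port=8000):
    """Command line that plays the stream with mpv, audio only."""
    return ["mpv", "--no-video", stream_url(host, port)]


def play(host, port=8000):
    """Run mpv until it exits; requires mpv to be installed."""
    return subprocess.run(mpv_command(host, port), check=False).returncode
```

At 192 kbit/s the stream uses roughly 86 MB per hour, which is worth keeping in mind on mobile data.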

But I want my recommendations!

MPD :)

There is a client that automatically queues new songs based on last.fm recommendations.

mpv-splice repo

06.02.2022

Updating the Website

#tech

Incomprehensible and unstructured jumble of code scattered across a range of highly specific tools that is one typo away from crumbling into ruins. This is how I would describe

My static-site generator

To write a new article, I go to my blog-post directory, open a new file with vim and use a snippet (powered by UltiSnips) that inserts an HTML template with today's date. It also uses some Python to automatically set ids for article titles.
After that I am free to use any HTML I please inside the article body. When I'm done, I set a flag (an HTML comment) to indicate to my site generator that the article should be published/updated/deleted.

Yep, it's all plaintext. I don't know what will happen if I mess up any of it.
This is how I roll.

The site generator itself is written in Python. It just parses each file for the fields I mentioned and then compiles all blog posts into HTML pages. It's NOT recommended for use yet, but you can check it out. What I would recommend is Luke Smith's lb.
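To give an idea of what "parsing the file for the fields" can mean, here is a minimal sketch. It is not the code from the repo: the flag syntax (`<!-- publish -->` and friends), the assumption that titles live in `<h2>` tags, and all names are invented for illustration.

```python
import re

# Hypothetical flag syntax: an HTML comment naming the action.
FLAG_RE = re.compile(r"<!--\s*(publish|update|delete)\s*-->")
# Assumes the post title is the first <h2> element.
TITLE_RE = re.compile(r"<h2[^>]*>(.*?)</h2>", re.S)


def slugify(title):
    """Turn a title into an id usable as a URL fragment."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def parse_post(html):
    """Extract the action flag, title and id from one post file."""
    flag = FLAG_RE.search(html)
    title = TITLE_RE.search(html)
    if title is None:
        raise ValueError("post has no title")
    text = title.group(1).strip()
    return {
        "action": flag.group(1) if flag else None,
        "title": text,
        "id": slugify(text),
    }
```

With this scheme, a post titled "Music Playing: Local & Remote" would get the id `music-playing-local-remote`.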

So what's new?

New and fresh additions to the site:

Posts are now tagged and each tag has its own page.
Each blog-post now has an id, try clicking on the title of a post to get a link directly to it!
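Both additions boil down to small data transformations. A sketch under the assumption that each post is a dict with `id` and `tags` keys (hypothetical names, not the generator's real data model):

```python
from collections import defaultdict


def build_tag_pages(posts):
    """Map each tag to the posts that carry it, keeping the input order."""
    pages = defaultdict(list)
    for post in posts:
        for tag in post["tags"]:
            pages[tag].append(post)
    return dict(pages)


def permalink(page, post):
    """Link straight to a post on a page via its id (URL fragment)."""
    return f"{page}#{post['id']}"
```

Every tag page is then rendered from its own list of posts, and the title of each post links to `permalink(...)`.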

I also rewrote a lot of the code to make future changes easier.

What's planned?

I didn't have enough time to implement everything. In the future you can expect:

RSS feeds.
An index page and a separate page for each post.
Better explanation of the inner workings and a more general implementation.

GitHub Repo

30.01.2022 [upd 06.02.2022]

Small GUI to Make Weird Designs

#tech #art

Hello! The first week has already proved challenging. Unfortunately I fell ill, which also made it more difficult to finish something meaningful in time, but I have some results.

The Idea

I really like art based around very simple concepts. An example of such a concept is pixel art. The idea of pixel art is about as simple as it gets - you draw with little squares. So, I was doodling around trying to come up with something similar that looks cool. I came up with a modification of pixel art, with non-parallel lines and variable line width. I tried recreating it in Inkscape, but quickly encountered multiple problems, like differently angled lines being at different distances from each other and the ends of lines looking off.

How to GUI

In the end I just decided to write my own program to make this kind of design. The only experience writing GUIs I had was with Qt (C++); it was painful. I decided instead to use Godot, as using a Python-like language with dynamically typed variables seemed like a much better idea for a GUI.

I had a few ideas I wanted to implement in my Goo-e. One was shortcuts that the user can set/unset just by right-clicking the element instead of going into a settings menu and searching for it there (I prototyped it but haven't used it yet). Another idea was giving the user the ability to save any value as default (again, by right-clicking); for now I implemented it with a button next to a value.

How do I optimize this ;-;

As you can see, the UI is incredibly bare-bones. The main features are there: you can pan, zoom, draw and erase lines on a grid. Changing the angle of the grid is incredibly laggy, as the code has to update a huge amount of individual nodes (objects). I will probably completely rewrite the entire program at some point.

Was it worth it?

Writing an entire program just to make this very specific type of design might seem silly, especially considering that I realised this effect can be achieved just by applying a simple image transform to a picture made using a normal square grid.
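To make the "simple image transform" idea concrete, here is a sketch that skews the cells of a square grid. The post doesn't say which transform is meant; a horizontal shear is assumed here purely as an example, and the function names are hypothetical.

```python
def shear_x(point, k):
    """Shear a point horizontally: x grows with y, y is unchanged."""
    x, y = point
    return (x + k * y, y)


def transform_cells(cells, k):
    """Apply the shear to the four corners of every unit grid cell."""
    out = []
    for col, row in cells:
        corners = [(col, row), (col + 1, row), (col + 1, row + 1), (col, row + 1)]
        out.append([shear_x(corner, k) for corner in corners])
    return out
```

Every square cell becomes a parallelogram, so a drawing made on the square grid keeps its structure while all its vertical lines tilt by the same amount.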

Regardless, the experience is valuable. I felt good programming, abusing the hell out of Godot's signal system and other features that make my life just a little bit easier. And damn does it feel good accomplishing something at the end of the week and posting about it here. Here is something I doodled in the program: see if you can figure out what it says; each letter is made out of one line. That's it for this week! Let's see what lies ahead :)

GitHub Repo