Travelling alone has its bright and dark sides. You can’t share the excitement of spotting a sea lion swimming next to the boat, or of looking all 227 meters down from the Golden Gate Bridge. But you are free to do whatever you like, without discussing your options with anyone. For instance, you can trade one more day in San Francisco for a visit to the Computer History Museum in Mountain View, which probably isn’t how many people would choose to spend their holidays.
In the museum you will find a former ballistic missile that, after its retirement, started an academic career as a university general-purpose computer.
Renting a car
The lady at the Avis counter tried really hard to give me this Camaro.
“The Mustang is an entry-level sports car. But the Camaro (Camaro!) is really luxurious.”
She also told me that they had a red Camaro, while the Mustang (which I had originally booked online) was only available in black. What could I do? I chose luxury and agreed to take the Camaro. When it came to signing the deal, the Avis lady told me that the Camaro was only $30 more expensive than the Mustang, per day. Given this unexpected piece of information, I decided to stick with my original choice. However, she wouldn’t take “no” for an answer:
“OK, there’s one more thing. Please tell me which airline you used to get here.”
“Lufthansa,” I said, adding that I didn’t have the ticket anymore.
“No worries, what matters is that you are a Lufthansa customer. For such special clients we have a discount: only a $10 extra fee for the Camaro.”
Really, I didn’t stand a chance.
Pacific Coast Highway
If you’re going from San Francisco to Los Angeles, you can either take one of the standard highways and see the Hollywood sign in about 5 hours, or choose the Pacific Coast Highway, part of State Route 1, and admire the views for 8+ hours. They were the best 8 hours of the whole trip. The road is simply beautiful, with the Pacific Ocean on one side and a steep wall of mountains on the other. It really deserves to be contemplated, and a lot of people have done so: you’ll find multiple viewpoints along the way. Seriously, you should see it, at least in Street View.
I was listening to the latest Iggy Pop album. There’s this song, Paraguay, about escaping from civilization, from all this information and knowledge, that really resonated with me there, at the western edge of the North American continent.
The main museum in Los Angeles is LACMA. It isn’t unusual for a museum to have a small park in front of it, so you can enjoy trees and fresh air before or after enjoying the works of art. However, in the case of LACMA, the fresh air part is not entirely true. You can smell oil or petrol here and there in the park, but it isn’t caused by vehicles. There are a few small lakes across the park (and one quite big one) filled with thick, black oil. They are fenced off (as they would probably be fatal for a potential swimmer) and bubble from time to time in the way that only a pool of thick, black oil can bubble.
It’s really strange to see such a lake in the middle of a huge metropolis.
A really good way to feel the Americans’ love for their cars is to go to the Mulholland Highway on a Sunday morning. The road is very curvy and scenic, and runs through the mountains. Every week you can spot multiple sports cars there, from old BMWs to new Ferraris and from Corvettes to Fiskers.
On the last day I hailed an Uber to get to the airport. A black lady was driving and listening to some really nice music. She wasn’t very talkative (unlike the other Uber drivers), but that was fine. I used Shazam to find out what the song was. A minute later the whole album was downloaded to my phone. We exchanged a few words about crazy taxi prices and she wished me a good flight. Later, I left an Uber comment on the ride: “great music, thanks!”. Now it’s time to go home.
In the previous post I described the process of creating an ad-blocker for the Polish radio station “Trójka”. It uses cross-correlation and the fast Fourier transform to detect the ad jingles in the internet radio stream, silences the commercials and outputs the result. In other words: it plays the internet radio stream without ads. Creating the app wasn’t my final goal, though. I wanted to bring the same feature to my home amplituner (Yamaha R-N301), which receives “Trójka” in the traditional way, via the FM broadcast.
So, I already have the algorithm, but I’m missing 3 elements required to run it in the real world of my flat:
a device on which I can run the analysis,
the broadcast, which will be used as a source for the analyzer,
a way to turn the volume up and down on the amplituner.
The device is the easy one. I already have a Raspberry Pi 2, which works great as a home media center. It should be able to run the Java analyzer.
the broadcast source
The internet broadcast is delayed by about 30 seconds, so I can’t use it to analyze the signal and silence the FM tuner. Instead, I need to analyze the real-time FM broadcast. How do you receive it on a Raspberry Pi? Well, it will cost about $20. There are a lot of cheap DVB-T USB sticks on the market and apparently (thanks to the RTL-SDR library) they can be used to receive all kinds of signals, FM radio stations included.
After plugging in the stick, the following command will receive the broadcast on 89.5 MHz:
rtl_fm -f 89.5M -M wbfm
The stream will be written to the standard output as 16-bit, little-endian, 32 kHz, 1-channel PCM data, perfect for the Java analyzer.
controlling the amplituner
Yamaha calls the R-N301 a “Network HiFi Receiver”, which means (among other things) that it can be controlled from an iPhone app. Apparently, Yamaha network-enabled devices expose a RESTful interface, which can be called with curl. The following command will set the volume level to 55:
Polish Radio Three (known as Trójka) is famous for broadcasting good music and having inoffensive presenters. On the other hand, it suffers from the number of commercial blocks between programmes. The ads, usually for medicines or electronics, are loud and irritating. Trójka accompanies me at home and at work most of the time, so I wondered if there’s something that can be done about the ads. It seems there is.
digital signal processing
My aim is to create an app that mutes the ads. The commercial block starts and finishes with a jingle, so the potential software should recognize these specific sounds and turn off the volume between them.
I know that the area of maths/computer science dealing with these kinds of problems is called digital signal processing, but it has always seemed like magic to me, so this is a great opportunity to learn something new. I spent a day or two trying to find out what mechanism can be used to analyze an audio stream looking for a jingle. And I found it, eventually: it’s called cross-correlation.
People usually describe cross-correlation by referring to the MATLAB implementation. MATLAB is an expensive application that makes it easy to perform complex mathematical operations, including DSP operations. Fortunately, there’s a free alternative to MATLAB, called Octave. It seems it’s quite easy to run cross-correlation on two audio files using Octave. All you have to do is run the following commands:
As you can see, there’s a peak marking the position of jingle.wav within audio.wav. What surprised me is the simplicity of the method: xcorr() does all the work, and the rest of the Octave code is just for reading the files and displaying the result.
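To make the idea of the peak more concrete, here is a minimal Java sketch of cross-correlation computed directly, as a sliding dot product. It is not what xcorr() does internally (Octave uses the FFT) and it is far too slow for real audio; the class name, method name and sample values are made up for illustration.

```java
public class NaiveXcorr {
    /**
     * Slides the jingle over the audio and returns the offset at which
     * the dot product (the correlation score) is highest.
     */
    static int bestOffset(double[] audio, double[] jingle) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int lag = 0; lag + jingle.length <= audio.length; lag++) {
            double score = 0;
            for (int i = 0; i < jingle.length; i++) {
                score += audio[lag + i] * jingle[i];
            }
            if (score > bestScore) {
                bestScore = score;
                best = lag;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] jingle = {1, -2, 3, -1};   // toy "jingle"
        double[] audio = new double[16];    // silence...
        System.arraycopy(jingle, 0, audio, 5, jingle.length); // ...with the jingle at offset 5
        System.out.println(bestOffset(audio, jingle)); // prints 5
    }
}
```

The score is highest where the two signals line up, which is exactly the peak visible in the plot.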
I wanted to reimplement the same algorithm in Java, so that I’d have a tool that:
reads the audio stream from standard input (e.g. provided by ffmpeg),
analyses it looking for the jingles,
outputs the same stream on stdout, muting it when needed.
Using stdin and stdout makes it possible to connect the new analyzer with other apps, responsible for providing the audio stream and playing the result.
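The stdin/stdout design can be sketched roughly as below. This is only an illustration of the pass-through idea, not the actual analyzer: the AdDetector interface, the chunk size and the class name are hypothetical, and the placeholder detector in main() never mutes anything.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

public class MutingFilter {
    /** Hypothetical hook: decides whether a chunk of PCM data belongs to an ad block. */
    interface AdDetector {
        boolean isAd(byte[] chunk, int length);
    }

    /** Copies PCM data from in to out, replacing ad chunks with silence. */
    static void copy(InputStream in, OutputStream out, AdDetector detector, int chunkSize)
            throws IOException {
        byte[] chunk = new byte[chunkSize];
        int n;
        while ((n = in.read(chunk)) > 0) {
            if (detector.isAd(chunk, n)) {
                // In signed PCM, all-zero bytes are silence.
                Arrays.fill(chunk, 0, n, (byte) 0);
            }
            out.write(chunk, 0, n);
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder detector: lets everything through unchanged.
        copy(System.in, System.out, (chunk, length) -> false, 4096);
    }
}
```

Because the filter only touches standard input and output, the audio source and the player can be swapped without changing the Java code.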
reading sound files
The first step of the Java implementation is to read the jingle (saved as a .wav file) into an array. A .wav file contains some extra info, like headers, metadata, etc., while we need the raw data. The format I was looking for is PCM: it simply contains the list of numbers representing the sound. Converting a wav to PCM can be done using ffmpeg:
In this case each sample will be saved as a 16-bit number, little endian. In Java such a number is called a short, and the ByteBuffer class may be used to automatically transform the input stream into a list of short values:
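ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN); // both classes are in java.nio
// ...put the 4 bytes of one stereo frame into buf, then call buf.flip()...
short leftChannel = buf.getShort(); // stereo stream
short rightChannel = buf.getShort();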
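For a whole block of data, the same decoding could look like the sketch below. Mixing both channels down to mono by averaging is my assumption here (the text doesn’t say how the channels are combined), and the class and method names are made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class PcmDecoder {
    /**
     * Decodes interleaved 16-bit little-endian stereo PCM bytes and
     * averages the two channels into a single mono sample per frame.
     * Trailing bytes that don't form a full 4-byte frame are ignored.
     */
    static short[] toMono(byte[] pcm) {
        ByteBuffer buf = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        short[] mono = new short[pcm.length / 4];
        for (int i = 0; i < mono.length; i++) {
            short left = buf.getShort();
            short right = buf.getShort();
            mono[i] = (short) ((left + right) / 2);
        }
        return mono;
    }

    public static void main(String[] args) {
        // Two frames: (16, 32) and (-2, -4)
        byte[] pcm = {0x10, 0x00, 0x20, 0x00,
                      (byte) 0xFE, (byte) 0xFF, (byte) 0xFC, (byte) 0xFF};
        System.out.println(Arrays.toString(toMono(pcm))); // prints [24, -3]
    }
}
```

Without the explicit LITTLE_ENDIAN order, ByteBuffer would read the bytes as big-endian and every sample would come out wrong.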
In order to implement the xcorr() function in Java, I looked into Octave’s source code. Without changing the final result, I was able to replace the xcorr() invocation with the following lines, which need to be rewritten in Java:
N = length(audio);
M = 2 ^ nextpow2(2 * N - 1);
pre = fft(postpad(prepad(jingle(:), length(jingle) + N - 1), M));
post = fft(postpad(audio(:), M));
cor = ifft(pre .* conj(post));
R = real(cor(1:2 * N));
It looks quite scary, but most of the functions are trivial array operations. The heart of the cross-correlation is applying the fast Fourier transform to the sound sample.
fast Fourier transform
As someone who had no earlier experience with DSP, I simply treated the FFT as a function that takes the array describing the sound sample and returns an array of complex numbers representing the frequencies. This minimalistic approach worked well: I was able to run the FFT implementation from the JTransforms package and got the same results as in Octave. I guess there’s a bit of cargo cult here, but hey, it works!
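For the curious, here is a self-contained sketch of FFT-based cross-correlation in Java. To keep it runnable without dependencies it uses a tiny textbook radix-2 FFT instead of JTransforms, and it is a simplified variant rather than a line-by-line port of the Octave code above (the padding and lag indexing differ). All names are made up for illustration.

```java
import java.util.Arrays;

public class FftXcorr {
    /** Recursive radix-2 FFT, in place; the array length must be a power of two. */
    static void fft(double[] re, double[] im, boolean inverse) {
        int n = re.length;
        if (n == 1) return;
        int h = n / 2;
        double[] evenRe = new double[h], evenIm = new double[h];
        double[] oddRe = new double[h], oddIm = new double[h];
        for (int i = 0; i < h; i++) {
            evenRe[i] = re[2 * i];     evenIm[i] = im[2 * i];
            oddRe[i] = re[2 * i + 1];  oddIm[i] = im[2 * i + 1];
        }
        fft(evenRe, evenIm, inverse);
        fft(oddRe, oddIm, inverse);
        double sign = inverse ? 2 : -2;
        for (int k = 0; k < h; k++) {
            double angle = sign * Math.PI * k / n;
            double wr = Math.cos(angle), wi = Math.sin(angle);
            double tr = wr * oddRe[k] - wi * oddIm[k];
            double ti = wr * oddIm[k] + wi * oddRe[k];
            re[k] = evenRe[k] + tr;      im[k] = evenIm[k] + ti;
            re[k + h] = evenRe[k] - tr;  im[k + h] = evenIm[k] - ti;
        }
    }

    /** Returns the offset in audio at which the jingle correlates best. */
    static int bestOffset(double[] audio, double[] jingle) {
        int m = 1;
        while (m < audio.length + jingle.length - 1) m <<= 1; // zero-pad to a power of two
        double[] ar = Arrays.copyOf(audio, m), ai = new double[m];
        double[] jr = Arrays.copyOf(jingle, m), ji = new double[m];
        fft(ar, ai, false);
        fft(jr, ji, false);
        for (int k = 0; k < m; k++) {
            // FFT(audio) times the complex conjugate of FFT(jingle)
            double r = ar[k] * jr[k] + ai[k] * ji[k];
            double i = ai[k] * jr[k] - ar[k] * ji[k];
            ar[k] = r;
            ai[k] = i;
        }
        fft(ar, ai, true); // unscaled inverse; scaling would not move the peak
        int best = 0;
        for (int lag = 1; lag <= audio.length - jingle.length; lag++) {
            if (ar[lag] > ar[best]) best = lag;
        }
        return best;
    }

    public static void main(String[] args) {
        double[] jingle = {1, -2, 3, -1};
        double[] audio = new double[16];
        System.arraycopy(jingle, 0, audio, 5, jingle.length);
        System.out.println(bestOffset(audio, jingle)); // prints 5
    }
}
```

The result matches what a direct sliding dot product would give; the FFT only makes it fast enough for long signals.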
running xcorr on a stream
As you can see, the algorithm above assumes that the audio is an array in which we are looking for the jingle. That’s not exactly the case for a radio broadcast, where we have a continuous stream of sound. In order to run the analysis, I created a ring buffer, slightly longer than the jingle I’m looking for. The incoming stream fills the buffer and once it’s full, I run the cross-correlation test. If nothing is found, I discard the oldest part of the buffer and then wait until it’s full again.
I experimented a bit with the buffer length and got the best results with a buffer 1.5 times longer than the jingle.
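The buffering scheme could be sketched like this. Several details are my assumptions rather than facts from the analyzer: the detection threshold, the normalization by the jingle’s energy, the choice to keep the newest jingle-length samples after a miss, and the use of a plain array that gets shifted instead of a true ring buffer. The correlation itself is the slow, direct version to keep the example short.

```java
public class StreamDetector {
    private final double[] jingle;
    private final double[] buffer;      // 1.5x the jingle length, as in the text
    private final double threshold;     // hypothetical detection threshold
    private final double jingleEnergy;  // sum of squares, used to normalize the score
    private int filled = 0;

    StreamDetector(double[] jingle, double threshold) {
        this.jingle = jingle.clone();
        this.buffer = new double[jingle.length * 3 / 2];
        this.threshold = threshold;
        double energy = 0;
        for (double v : jingle) energy += v * v;
        this.jingleEnergy = energy;
    }

    /** Feeds one sample; returns true if the jingle was found in the just-filled buffer. */
    boolean feed(double sample) {
        buffer[filled++] = sample;
        if (filled < buffer.length) return false;
        boolean found = bestScore() >= threshold;
        // After a miss, keep the newest jingle.length samples so that a jingle
        // straddling two windows is still seen whole; after a hit, start over.
        int keep = found ? 0 : jingle.length;
        System.arraycopy(buffer, buffer.length - keep, buffer, 0, keep);
        filled = keep;
        return found;
    }

    /** Best normalized correlation over all offsets; an exact copy of the jingle scores 1.0. */
    private double bestScore() {
        double best = Double.NEGATIVE_INFINITY;
        for (int lag = 0; lag + jingle.length <= buffer.length; lag++) {
            double score = 0;
            for (int i = 0; i < jingle.length; i++) score += buffer[lag + i] * jingle[i];
            best = Math.max(best, score / jingleEnergy);
        }
        return best;
    }

    public static void main(String[] args) {
        double[] jingle = {1, -2, 3, -1};
        double[] stream = new double[20];
        System.arraycopy(jingle, 0, stream, 9, jingle.length); // jingle at samples 9..12
        StreamDetector detector = new StreamDetector(jingle, 0.99);
        for (int t = 0; t < stream.length; t++) {
            if (detector.feed(stream[t])) {
                System.out.println("jingle detected by sample " + t); // sample 13
            }
        }
    }
}
```

With a buffer of 1.5 jingle lengths and half a jingle discarded after each miss, every occurrence of the jingle ends up fully inside at least one window.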
putting it all together
Getting the stream in PCM format is easy and can be done using the aforementioned ffmpeg. The command below redirects the stream into the Java program’s standard input and then plays the result:
I also prepared a simple standalone version of the analyzer that connects to the Trójka stream on its own (without an external ffmpeg) and plays the result using javax.sound. The whole thing is a single JAR file and contains a basic start/stop UI. It can be downloaded here: radioblock.jar. If you feel uneasy about running a foreign JAR on your machine (as you should), all the sources can be found on my GitHub.
The final goal is to mute ads on a hardware amplituner, receiving a “real” FM signal rather than an internet stream. This will be covered in the next blog post.