Researching more on audio feature extraction last week gave me a lot to think about. As it settled in, I came up with great ideas for hypothesis to test and specific applications for them. The kind of ideas that lead to turning off your cell phone, skipping lunch and dinner, and get things working.
Unfortunately, as mentioned last week, I do not have the time learn audio processing algorithms to the extent that I can implement them, so I will have to make a choice on a library that will help. A further complication was getting code to run in Android. Libraries exist to run Python code in Android, but the Python libraries I’ve found for audio analysis are lacking in features and I would need to use a combination.
The most comprehensive library I’ve found is Essentia. Of course, it’s so comprehensive I won’t be able to use the code for a professional app without a commercial license. Luckily there is a noncommercial license available that will allow me to get results for the project and determine if there are commercial applications.
Essentia is a C++ library. So the good news is I get to use JNI and the Android NDK (Native Development Kit) to run C++ code. Getting C++ code to run from an Android Activity is straightforward enough, but I do worry about potential complications in running Essentia. There are a number of dependencies that I fear might cause trouble with a feature I might want in the future. These are kept to a minimum with a special flag during compilation for Android. But alas, my paranoia strikes.
Because Essentia is open source, I am at least able to see implementations of the audio processing algorithms, and the code is well-documented with references to studies. Signal processing is a degree, not just a semester-long project. I certainly appreciate that fact more over the past couple of weeks. But this will be a great overview and using the code will still require understanding of the underlying processes.
Progress was made on converting files from audio to basic byte data. When I was considering Python libraries, I was under the assumption I could use WAV files and easily get byte data (for example, with librosa). Android’s MediaRecorder doesn’t support saving WAV files, so other formats must be used.
From the blog CS@Worcester – Inquiries and Queries by James Young and used with permission of the author. All other rights reserved by the author.