Last week, I gave an overview of my planned independent study project. This week, I’ll give a bit more detail on what I’ve done.
I have a habit of preparing for classes before they start. I buy the text books, get an overview of the material, and prepare to apply learning techniques throughout the semester. This helps me identify problems before the semester starts and address them in class as I go. Likewise with this project, I had hoped to get as much as I could done with machine learning before this semester started to hit the ground running with the software portion. Naturally, my ignorance led me to assume the problem was easier than it actually is. After a great conversation with a communications professor, I realized the problems I was trying to solve had to be broken down.
Counter-intuitively, the broken down problem is more difficult. To recap, my project involves machine learning and audio signal processing. Although great leaps have been made in this field and many problems have been solved, they mostly use clever tricks to achieve the results they get. Take speech recognition for example: your text-to-speech software transcribes nearly 100% correctly. Machine learning models can use huge datasets of audio, as well as commonly-spoken phrases to decide which words you’re most likely to say. The result is that mumbling, stuttering, or ambient noise is a bit more forgivable. On the contrary, transcribing each and every syllable is not nearly as easy, and in fact it’s a problem that has yet to be solved. That’s a shame, considering I was hoping to transcribe an audio sample into phones as part of my process and I somehow doubt I can do it without first getting a PhD.
This realization has led me to take a quick course on machine learning problem framing. It teaches the process of developing an hypothesis and developing a model to prove it, as well as resisting the temptation to shoehorn a problem into machine learning when a heuristic solution is as good or better. I did manage to find examples of using machine learning to solve some of the problems I wanted to (dating back 20+ years, even), but unfortunately they were each limited in scope and would be difficult to use to make a cohesive app. My goal isn’t a Frankenstein project.
In an effort to dig deeper, I’ve also done an introduction to Pandas tutorial, and started on a Tensorflow tutorial. These are surface-level in and of themselves, but help in understanding the higher-level frameworks. My hope is that understanding the basics will allow me to create a model that has an exciting application. In the meantime, I can implement the prerequisite features in software: audio recording and signal processing.
I’m going to dive into more specific problems as the semester goes on and I get more comfortable with the topics. I’ve already come up with a list of ideas and find myself wanting to write posts on small things like Android project flavors and build configuration. Posting thoughts and lessons in a public place has been great for accountability, and my goal is to know these technologies inside and out 4 months from now.
From the blog CS@Worcester – Inquiries and Queries by James Young and used with permission of the author. All other rights reserved by the author.