Towards automated audio to score transcription — Audio-to-score alignment

This post is the first in a series describing the process of making a tool for transcribing guitar music. This is clearly a task for ML, and so we need a dataset for training/validation. In the data preparation stage, obtaining scores aligned with music is incredibly challenging. To the best of my knowledge, the approach I present here is original.

The Challenge

We need a dataset of onsets in audio format labelled with the notes that begin at the onset.

Since such a dataset doesn’t exist, we must make it ourselves. Yes, we could use synthesised audio, but real-world data should result in better generalisation.


Now, having acquired hundreds of fingerstyle pieces from various artists in both audio and score format, how do we match these two together? This problem is an active area of research, and in its more general form can also be found in speech recognition systems.
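For context, the usual baseline for this kind of alignment problem is dynamic time warping (DTW). The sketch below is not the approach this series goes on to describe; it is a minimal, pure-Python DTW over two toy 1-D sequences, and the name `dtw_path` and the absolute-difference distance are assumptions made for illustration.

```python
def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Align two non-empty sequences with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    matching a[i] to b[j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Walk back from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path
```

In a real pipeline the sequences would be feature frames from the audio and the score rather than plain numbers, and the quadratic cost table quickly becomes the bottleneck for long pieces.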

Tetris Cube and n-dimensional Tetris

Recently I attempted to solve the Tetris Cube I found in a box under my bed. It’s just like normal Tetris, but extended to 3D in a 4x4x4 cube. How hard can it be to fit 12 pieces into this cube? Well, by my rough estimates it’d take on the order of trillions of (very) naive brute-force attempts to find just one solution. Tetris in 2D is already NP-complete; this is definitely not any easier.


OS Life

Operating Systems, the most feared undergraduate CS class at UT … and I blindly walked into it as an exchange student. Not just that, but I’m taking the renowned [unofficial] honours class as well. Don’t get me wrong, OS is fun

HackMIT - Wifree

HackMIT was awesome … You may be asking how/why I went to HackMIT in the first place. Long story short, a friend was determined to go (he’d formed a team already) but he didn’t get invited; I, on the other hand, got lucky.

Travelling ... and now?

What have I been doing in the past month? Hopping around cities in the US. It was awesome! Every city was incredibly different.

Now classes are about to start at UT. I hope taking OS, Algorithms and Real Analysis at the same time isn’t as bad as people tell me …

Unearthed Hackathon 2017

This was my first hackathon. I walked in with a friend, we formed a team with a few other guys we met on the evening, and yeah, somehow we won!

Update [26 April]: Charlie has made an awesome portfolio of the project

Collatz Conjecture


Recently I was reminded of the Collatz Conjecture during a lecture.
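As a quick refresher, the conjecture says that repeatedly halving even numbers and mapping odd numbers n to 3n + 1 always ends up at 1. A minimal sketch for playing with it (the function name `collatz_steps` is just my choice for illustration):

```python
def collatz_steps(n: int) -> int:
    """Count how many applications of the Collatz map it takes to reach 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        # Even numbers are halved; odd numbers go to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(collatz_steps(27))  # 27 is a famously long-running small starting value
```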

WSL Windows Interoperability

You have to give it to Microsoft, they’re doing some pretty impressive stuff these days. Last night I got the latest Creators Update, and took WSL for another spin. It now ships with Ubuntu 16.04 and a tonne of improvements. We can now launch Windows applications from bash!

Maths on the web (Mathjax, MathML etc)

LaTeX is amazing. Mathjax allows LaTeX equations to be rendered properly depending on the browser and device but using Javascript (i.e. isn’t static). So when Google AMP (Accelerated Mobile Pages) doesn’t allow custom Javascript to be used, what do I do? I look to static solutions, and the closest alternative I find is MathML–not exactly a solution when Firefox has been the only browser with native support for years, yet MathML is part of HTML5. So while my website might not satisfy Google AMP criteria, I think I’m sticking with Mathjax (which works seamlessly with Kramdown) for now.

Setting up the new site with AWS, Jekyll, and WSL

Over a year ago, for some reason I created my blog on Blogger. Even at the time, I knew I’d eventually want to get my own domain, host, and build my website on something better. Well recently I’ve finally done it and it was surprisingly easy.

An Arduino IDE Alternative

It seems like it was only yesterday when I first downloaded Arduino 1.0.1, when I first discovered the magical world of embedded programming. I clicked on the upload button … “serial port not found.”

HM-11 Breakout Update

The boards arrived yesterday, surprisingly quick for OSHPark.

HM11 Board Diagram

Laser rangefinder

After adding the laser, I quickly made a test program that sends the estimated range of the object via serial to the computer. The sensing algorithm on the CMUcam5 only detects the laser up to a distance of around 80cm. What’s worse?

Laser rangefinder (camera implementation)

I decided to begin making a laser rangefinder today using a camera implementation. This method is quite simple compared to the complex circuitry and ICs needed for a TOF (time of flight) implementation: light is very fast, so TOF requires ridiculously precise and accurate timing.
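The usual geometry behind a camera rangefinder is simple triangulation. The sketch below assumes the laser is mounted parallel to the camera’s optical axis at a known offset, so the dot’s angle from the image centre shrinks as the target moves away. The function name and the calibration parameters (`rad_per_pixel`, `rad_offset`) are assumptions for illustration, not values from my setup.

```python
import math


def range_from_pixel(pixel_offset, baseline_m, rad_per_pixel, rad_offset=0.0):
    """Estimate distance to the laser dot by triangulation.

    pixel_offset  -- pixels between the image centre and the laser dot
    baseline_m    -- distance between laser and camera axis (assumed parallel)
    rad_per_pixel -- calibration gain converting pixels to radians
    rad_offset    -- calibration offset in radians
    """
    theta = pixel_offset * rad_per_pixel + rad_offset
    if theta <= 0:
        raise ValueError("dot is at or beyond the optical axis; range undefined")
    return baseline_m / math.tan(theta)
```

Because distance goes as 1 / tan(theta), a single pixel of error matters far more at long range than at short range, which is one reason this method runs out of steam quickly.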

HM-11 BLE Breakout

So today I finally decided to design a PCB for the HM-11 Bluetooth module. At $10 apiece (and as low as $2 in large quantities), I had bought the HM-11 for the ballbot. However, I decided that I’d get the ballbot working first with the RN42 (which I later replaced with a HC-06).

The new control system [with video of robot balancing]

So previously, the ballbot would attempt to stay at the same spot. Sure it worked, but it was very difficult to implement remote control. My only attempt at remote control involved sending the target coordinates over bluetooth to the ballbot. This meant that I was only really controlling the target position of the robot, not the target velocity.

Sanyo projector teardown

So a few weeks ago I managed to get my hands on a Sanyo PLC-WXU700A for free. I tried getting it to work but the warning light would light up and the projector would just shut down. I was pretty sure it was fixable but I didn’t really need a projector anyway so I left it in my room for disassembly in the future.

Well, today was the day.

Deriving the derivative

So I’ve finally managed to get the robot balancing well enough to be confident to push it around on tiles and not be afraid of it falling. The derivative term of the position controller seems to be helping a lot.

But obtaining the derivative was not fun. Over the past day or so, I’ve found out that the positional output calculated from the encoders was horrendous, causing the robot to vibrate weirdly. Since then I’ve had a look at the output, and am now using a complementary filter.

Bluetooth. You get what you pay for--usually.

So I’ve been using the RN42 bluetooth module (sold as the bluesmirf silver breakout from SparkFun), and was experiencing issues with the module crashing whenever the data rate got moderately high. I was sending around 50 bytes of data from the module to the computer at around 200 Hz. The RN42 simply couldn’t cope and would stop transmitting data, though remain connected, after around two seconds. I even increased the baud rate up to the maximum of 230400 BPS and still did not see major improvements.
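A quick back-of-the-envelope check on those numbers, assuming standard 8N1 serial framing (10 bits on the wire per byte, which is my assumption here) and ignoring any Bluetooth protocol overhead:

```python
def uart_utilisation(payload_bytes, rate_hz, baud, bits_per_byte=10):
    """Return (required bits per second, fraction of the baud rate used).

    bits_per_byte=10 assumes 8N1 framing: 1 start bit + 8 data bits + 1 stop bit.
    """
    required_bps = payload_bytes * rate_hz * bits_per_byte
    return required_bps, required_bps / baud


required, fraction = uart_utilisation(payload_bytes=50, rate_hz=200, baud=230400)
print(f"{required} bit/s needed, {fraction:.0%} of the serial link")
```

On paper the serial line is less than half full at 230400 baud, which fits with raising the baud rate not making much difference.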

The new ball: an old ball

After more hours of tuning I decided that one of the biggest problems of the Ballbot was the hard and bouncy ball. So now, I’ve moved to using an old and deflated basketball that is rounder and dampens the vibrations of the wheels. Preliminary results are promising.


Ballbot Balancing -- but not very well

Merry Christmas everyone!

Finally - the hardware is almost complete

So the 18650 battery holders finally arrived yesterday. By drilling a few holes into the acrylic frame and battery holders, the holders could be securely screwed onto the robot.


Motors singing in harmony - Pachelbel's canon

Technically I did this last Saturday, but I was too lazy to upload the video with my ridiculously slow internet.

Anyway, here it is: the ballbot playing Pachelbel’s canon (AKA Canon in D) with its motors by changing the PWM frequency for the motor controllers! Only the first dozen or so bars are played since this is one of my earlier revisions of the program.
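The trick relies on mapping each note to a PWM frequency. A minimal sketch of that mapping using standard equal temperament (A4 = 440 Hz); the MIDI note numbers and octave for the bass line are chosen for illustration and are not taken from my actual program:

```python
def note_frequency(midi_note, a4_hz=440.0):
    """Equal-temperament frequency in Hz for a MIDI note number (69 = A4)."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)


# The canon's ground bass, D A B F# G D G A, in an arbitrarily chosen octave.
GROUND_BASS = [50, 45, 47, 42, 43, 38, 43, 45]

for note in GROUND_BASS:
    print(f"MIDI {note}: set PWM to {note_frequency(note):.1f} Hz")
```

Each frequency would then be written to whatever timer or PWM peripheral drives the motor controller for the duration of the note.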

Omnidirectional control

Balancing the robot

For any three-wheeled robot, the velocity of each wheel follows from the angle at which it is mounted, the x and y components of the robot’s velocity (relative to the robot), the radius of the circle made by the wheels, and the robot’s angular velocity. This is implemented in my omnidrive library, which is used by the PiBot and also the ballbot.
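As a rough sketch of that relationship (not the omnidrive library itself), here is the standard inverse kinematics for omni wheels under one common convention: each wheel sits at a given angle around the robot, measured from the robot’s x axis, and drives tangentially with counter-clockwise as positive. The function name and sign convention are assumptions and may differ from the library’s.

```python
import math


def wheel_speeds(vx, vy, omega, wheel_angles_rad, radius):
    """Rim speed of each omni wheel for a desired robot-frame velocity.

    vx, vy           -- robot velocity components in the robot's own frame
    omega            -- robot angular velocity in rad/s (counter-clockwise positive)
    wheel_angles_rad -- mounting angle of each wheel around the robot
    radius           -- radius of the circle the wheels sit on
    """
    # The tangent direction at angle a is (-sin a, cos a); project the robot
    # velocity onto it and add the contribution from rotation.
    return [
        -math.sin(a) * vx + math.cos(a) * vy + radius * omega
        for a in wheel_angles_rad
    ]


angles = [math.radians(d) for d in (90, 210, 330)]  # evenly spaced wheels
print(wheel_speeds(vx=0.2, vy=0.0, omega=0.0, wheel_angles_rad=angles, radius=0.1))
```

With evenly spaced wheels, a pure translation makes the wheel speeds sum to zero, and a pure rotation makes them all equal, which is a handy sanity check.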

MPU-9250 Troubles

According to the InvenSense website:

The MPU-9250 is a System in Package (SiP) that combines two chips: the MPU-6500, which contains a 3-axis gyroscope, a 3-axis accelerometer, and an onboard Digital Motion Processor™ (DMP™) capable of processing complex MotionFusion algorithms; and the AK8963, the market leading 3-axis digital compass.

Ballbot - Soldering the PCBs

So finally the PCBs arrived yesterday afternoon and I immediately got to work this morning. Not having access to a reflow oven or hot air station certainly wasn’t helpful when soldering the 0603 surface mount components, especially WITHOUT TWEEZERS!

Retirement of the PiBot


One of the PiBots with a spare PCB on the left

For those that don’t know, my team represented Australia in the Robocup Junior International competition in Hefei, China, in July. It’s basically a robotics soccer competition with a small ball emitting pulsed IR. We placed 8th individually and 1st in the superteam competition, winning the finals 1:0 thanks to our camera vision.

The Ballbot Chassis

Just some pictures from assembling the chassis:


Without the 3D-printed supports, the acrylic mount (connected to 4 screws) is unlikely to sustain the torque applied from the motor.

Ballbot PCB - Part 2

So with a few modifications to the PCB design, here’s what I’m sending out for production.


Ballbot PCB - Part 1

So after spending a long time over the past two days designing the PCB of the ballbot, I realise I’m missing something.


I’m missing wireless communication! I’m intending to add an NRF24L01+ as well as a bluetooth module, but as you can see, I’ll probably need a separate board. It’s already very hard to find space to add a few extra pins.

Team PI's code is now open source!

If you don’t know, I’ve been programming soccer playing robots for three years now to compete in Robocup open soccer. I’ve also been heavily involved in the PCB and hardware design of the robots.

Our website has moved to a new address. Since competing in China, I’ve tried making various posts on the team website, but it seems there are issues with linkage to resources from the old site. As I’m not the site maintainer I have no clue how to fix it.

Anyway, all our code is now open source. Some of the libraries may be very useful.

Main programs Libraries

Ballbot Part 1

The ballbot is my latest project. The name was first coined by CMU in the early 2000s when they made the first “ballbot.” To this day, the CMU ballbot remains one of the best, if not the best, ballbot alongside the Rezero (designed by a few undergraduate students at ETH Zürich) and BallIP (Tohoku Gakuin Uni).

Electrodermal Activity

So I first came across electrodermal activity when the Embrace watch was launched on Indiegogo. Essentially, by monitoring the resistance across the skin of a person, it is able to reliably detect seizures. Yes, the majority of seizure related deaths have causes that are unfortunately unknown and “just happen” (SUDEP), but even the slightest probability of someone alone having a seizure at the wrong place at the wrong time (such as a flight of stairs) can be extremely worrying for families. Various products on the market have attempted to use accelerometers/gyros to detect seizures and message loved ones, but they often give false detections and they require the epileptic to have clonic seizures (i.e. jerking).

Getting drag

When I heard that my maths assignment would be, in part, about calculating the coefficient of air resistance of an object with and without a parachute (i.e. comparing the two), I couldn’t help but extend the assignment a bit further.

Tada!

Robocup Junior 2015 Best Shot

So instead of uploading gigabytes of videos, here’s one of the best goals we scored in the competition.

In case you didn’t get that, here’s a side-on view that starts a little bit earlier.

See how the robot turned towards the goal? That’s all thanks to the CMUCAM5 camera on the robot tracking the yellow goal.

Getting "yaw" bearing

If there’s one thing that’s popular these days, it’s quadcopters, and one of the obstacles in making a quadcopter is making it balance by obtaining roll and pitch. A simple Google search will flood you with information on simple complementary, Kalman, Mahony and Madgwick filters. But what happens when you want yaw? After a bit of searching online and reading up on complementary filters (which were good enough for my application of an omnidirectional four wheeled robot), I couldn’t find a single source pointing to how to fuse gyro and magnetometer data together, and so I endeavored to approach the task from scratch. Currently, I’m using a LSM9DS0 but this should apply to all 9DOF IMUs.

Perhaps the best explanation of the complementary filter can be found here [Update: the link doesn’t work but should still be accessible via the Wayback Machine], which is a must read.
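To give a flavour of the idea, here is a minimal sketch of a generic complementary filter for yaw: integrate the gyro for short-term accuracy, then nudge the result towards the magnetometer heading to cancel drift. It is a textbook-style illustration rather than my exact implementation, and the names `wrap_angle`, `fuse_yaw` and the blend factor `alpha` are assumptions.

```python
import math


def wrap_angle(a):
    """Wrap an angle in radians into the range [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def fuse_yaw(yaw_prev, gyro_z, mag_yaw, dt, alpha=0.98):
    """One complementary-filter step for yaw, all angles in radians.

    yaw_prev -- previous fused yaw estimate
    gyro_z   -- yaw rate from the gyro in rad/s
    mag_yaw  -- absolute heading computed from the magnetometer
    dt       -- time since the previous step in seconds
    alpha    -- weight on the gyro prediction (closer to 1 trusts the gyro more)
    """
    predicted = yaw_prev + gyro_z * dt
    # Take the error along the shortest way around the circle, otherwise the
    # estimate swings the long way round near the +/- pi boundary.
    error = wrap_angle(mag_yaw - predicted)
    return wrap_angle(predicted + (1 - alpha) * error)
```

The wrap-around handling is the main thing that separates yaw from roll and pitch here: a naive weighted average of 179° and -179° gives 0°, which is exactly the wrong answer.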