
I used CreateML for this. The software is great and easy to use, but data collection can take forever. This method applies to any project that collects time-series motion data, so feel free to read on even if you aren’t using the same software.
When I first started the data collection process, my plan was to use os_log to output the data from my Apple Watch, process that data, and be done. Easy!
It turns out I needed a much larger set of data than I imagined, and to rub salt in the wound, the pandemic hit and due to an underlying health condition I was confined to my house.
That meant that I couldn’t pass off my Apple Watch to some friends, have them complete samples for me and sit back and relax. I was going to have to do all of them myself, and I really didn’t want to spend weeks spinning my arm around in circles. Yeah, I know the results will be skewed because the model will be trained to recognise my specific movements, but desperate times call for desperate measures, and with deadlines slowly but steadily creeping up on me, these were indeed desperate times.
I needed a way of getting many samples in a short time, so I put on my Watch and analysed where most of my time was lost during the collection.
Most of my time was spent stopping and starting the data collection between samples. So I decided to try getting a load of samples in a row to cut that down.
This was a step in the right direction, but it just wasn’t good enough. Samples were completed either too fast or too slow, which increased my processing time. I was still losing time making sure that all the samples fit correctly into the prediction windows in CreateML.
That wouldn’t do. I needed to refine the idea.
So I went back to the drawing board; I liked the idea of collecting large sample sets in one session, but wanted to minimise the processing of the data to cut out as much time as possible.
Now anybody who knows me knows that I’m a musician, and I’ve spent way too much time using a metronome over the years.
Metronomes are a drummer’s best friend (sometimes) and when you’re recording music you get well acquainted with them. For those who don’t know what a metronome is: it clicks at a constant BPM (beats per minute), and there’s one built into Google for you to use.
So I was going to use a metronome to keep my speed at a constant rate, but what about the processing time?
I searched the App Store for an app I could use on my Watch that would provide me with samples in CSV form, and where I could set my own sample rate.
SensorLog to the rescue!
This meant that I could collect the data at the sample rate I wanted. From the sample rate (samples per second, in Hz) and the metronome’s beats per minute, I could work out how many rows of my CSV output correspond to one sample (one recorded action). The clicks of the metronome then made sure that I hit my cues for starting and completing an action every time.
So, for example, if my action takes 1 second to complete, I can set my metronome to 60 BPM, which equals one click per second (or to 120 BPM if you want more intermediary clicks). I then set my sample rate to whatever Hz I want to feed into CreateML.
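As a rough sketch of that arithmetic, here is how the rows-per-action figure falls out of the sample rate and the BPM. The 50 Hz rate and the function names are purely illustrative assumptions, not anything defined by SensorLog or CreateML.

```python
def rows_per_beat(sample_rate_hz: float, bpm: float) -> float:
    """Number of CSV rows recorded between two metronome clicks."""
    seconds_per_beat = 60.0 / bpm
    return sample_rate_hz * seconds_per_beat


def rows_per_action(sample_rate_hz: float, bpm: float, beats_per_action: int = 1) -> int:
    """Number of CSV rows that one action spans, rounded to whole rows."""
    return round(rows_per_beat(sample_rate_hz, bpm) * beats_per_action)


# Assumed example: 50 Hz sample rate, 1-second action.
print(rows_per_action(50, 60))                        # 60 BPM, action lasts 1 click -> 50
print(rows_per_action(50, 120, beats_per_action=2))   # 120 BPM, action lasts 2 clicks -> 50
```

Either metronome setting gives the same number of rows per action; the faster one just gives you an extra click in the middle to pace yourself against.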
I put the metronome in 4/4, and count along with the clicks like this:
1… 2… 3… 4… 1… 2… etc.
I start an action on each 1, in time with the click of the metronome, and aim to finish it as I count 2. I then start the next action on 3 and finish it on the 4th and final click.
So basically: start your action on odd numbers, and end it in time with the even numbers.
At 60 BPM, that means 30 samples for every minute you collect data!
You will need to change this to suit your samples and the time it takes to complete them, but it will allow you to keep your samples uniform, which means minimal processing!
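To show why uniform timing keeps the processing minimal, here is a small sketch of cutting one long recording into per-action windows. It assumes the recording starts exactly on a count of 1, that each action fills one beat and is followed by one beat of rest, and that the rows have already been read from the CSV into a list. The function name and the 50 Hz figure are hypothetical, and real recordings will drift a little, so treat this as a starting point.

```python
def split_into_actions(rows, rows_per_beat, beats_per_action=1,
                       beats_per_cycle=2, start_row=0):
    """Cut a long recording into one fixed-size window per action.

    Each cycle is `beats_per_cycle` beats long: the action fills the first
    `beats_per_action` beats and the remainder of the cycle is rest.
    `rows_per_beat` must be a whole number of rows.
    """
    cycle_len = rows_per_beat * beats_per_cycle
    action_len = rows_per_beat * beats_per_action
    windows = []
    # Step one full cycle at a time, keeping only the rows where the action happens.
    for start in range(start_row, len(rows) - action_len + 1, cycle_len):
        windows.append(rows[start:start + action_len])
    return windows


# Assumed example: one minute recorded at 50 Hz with the metronome at 60 BPM.
rows = list(range(50 * 60))          # stand-in for 3000 CSV rows
windows = split_into_actions(rows, rows_per_beat=50)
print(len(windows))                  # 30 actions in one minute
print(len(windows[0]))               # 50 rows per action
```

If your recording has a lead-in before the first count of 1, `start_row` lets you skip those rows before the slicing begins.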
If, like me, you’re collecting the samples only from yourself, try to vary how you do the action. It’s tough to avoid falling into a rhythm with the metronome clicking away, but try to make some actions more pronounced and some more subtle. Complete some slightly faster than others, and some slightly slower. This will aid the accuracy of your model during training.