Embedded & Hardware, ML Research, Python, Signal Processing, Embedded Systems, Machine Learning, MicroPython, Swift

Skiing IMU Data Analysis

Building a custom skiing motion sensor and analysis pipeline from scratch

2026

Ski IMU: Custom Wearable Sensor for Alpine Skiing Technique

A hardware-to-model pipeline for capturing and analysing alpine skiing motion: from a hand-built IMU logger on the boot, through a labelling app on the phone, to a planned transformer-based model for technique analysis.

Personal project - work in progress, expected to run long-term.

⚠️ Status: hardware, firmware and labelling app are working. Data is being collected on real ski days. The machine-learning side has not started yet - I am intentionally waiting until I have enough real data to understand what it actually looks like before designing the model.


Motivation

I have been skiing for a long time and have always been curious how much of technique sits in body movement that you cannot see from the outside. At the same time, most of my engineering work to date has been pure software, and I wanted a project that pushed me through the full stack: hardware selection, firmware, data engineering, labelling, modelling, and visualization, without the shortcut of a phone or smartwatch SDK doing the sensor work for me.

There is also a more honest motivation: I wanted to feel for myself how hard real-world ML data collection actually is. Clean benchmark datasets gave me a misleading picture of what "data" means in practice. Vibration, calibration drift, timestamp skew, missing labels, weather: none of that exists in a Kaggle download. Building this end-to-end has been a deliberate exercise in earning the data the hard way.

The long-term goal is a low-cost, open hardware reference for alpine skiing motion capture, paired with a transformer-based model that can label runs, detect technique inefficiencies, and eventually give actionable feedback. Existing commercial systems are closed and expensive; the bet is that a CHF 20.- microcontroller plus a CHF 5.- IMU is enough to produce useful signal once the analysis is right.


Approach

The project follows an end-to-end pipeline, intentionally separated into independent layers so I can iterate on each one without rebuilding the others.

Hardware

The current rig sits on the cuff of the ski boot. The IMU is mounted there rather than on the boot shell so that the orientation signal reflects the lower-leg angle, which is the part of the body that actually drives ski edging in carved turns.

Component | Role
Waveshare RP2350-Zero | Microcontroller; runs the logger firmware
DFRobot BNO055 (I2C) | 9-axis IMU with on-chip sensor fusion (Euler, quaternions, gravity vector, linear acceleration)
WeMos D1 Mini micro-SD shield | On-device storage of session CSVs
USB-C power bank | Power supply - easy to swap mid-day
3D-printed enclosure (Onshape, 66×54×32 mm) | Boot-cuff mount

All parts sourced from Bastelgarage.ch; the enclosure is printed locally.

Firmware

Written in MicroPython on the RP2350. The logger reads from the BNO055 at 50 Hz and streams CSV directly to the SD card. Each row carries everything I might later need without revisiting the hardware: the sensor's on-chip Euler angles and quaternions (the fused orientation that BNO055 ships with), raw and gravity-compensated acceleration, angular velocity, and the four per-subsystem calibration scores.
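A fixed-rate logging loop of this kind can be sketched in plain Python so that it runs off-device. This is a minimal illustration under stated assumptions, not the actual firmware: the `read_sample` callable, the reduced column set, and the injected clock functions are all hypothetical. On the RP2350 the clock and sleep would presumably come from MicroPython's `time.ticks_ms()` and `time.sleep_ms()`.

```python
import io

PERIOD_MS = 20  # 50 Hz -> one sample every 20 ms

# Hypothetical column subset; the real log carries many more fields.
COLUMNS = ("timestamp_ms", "euler_roll", "gyro_z", "calib_sys")


def format_row(sample):
    """Render one sample dict as a CSV line in COLUMNS order."""
    return ",".join(str(sample[name]) for name in COLUMNS) + "\n"


def log_samples(read_sample, out, now_ms, sleep_ms, n_samples):
    """Write a header plus n_samples rows to `out` at a fixed period.

    Deadlines advance by PERIOD_MS from the start time rather than from
    "now", so one slow write delays a single row without shifting the
    schedule of every later row.
    """
    out.write(",".join(COLUMNS) + "\n")
    deadline = now_ms()
    for _ in range(n_samples):
        wait = deadline - now_ms()
        if wait > 0:
            sleep_ms(wait)
        sample = read_sample()
        sample["timestamp_ms"] = deadline
        out.write(format_row(sample))
        deadline += PERIOD_MS


class FakeClock:
    """Stand-in clock for running the loop off-device (assumption)."""

    def __init__(self):
        self.t = 0

    def now_ms(self):
        return self.t

    def sleep_ms(self, ms):
        self.t += ms


clock = FakeClock()
buffer = io.StringIO()
log_samples(
    read_sample=lambda: {"euler_roll": -12.4, "gyro_z": 0.08, "calib_sys": 3},
    out=buffer,
    now_ms=clock.now_ms,
    sleep_ms=clock.sleep_ms,
    n_samples=3,
)
print(buffer.getvalue())
```

Injecting the clock and the sensor reader keeps the scheduling logic testable on a laptop, independent of the I2C and SD drivers.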

Calibration scores in particular are recorded as columns rather than thrown away. When the model later struggles with a section of data, I want to be able to ask whether the IMU was actually calibrated at that moment, or whether the magnetometer had drifted.
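As an illustration of how those columns could be used later, here is a small sketch that flags rows with weak calibration. The BNO055 reports each calibration status on a 0 to 3 scale, with 3 meaning fully calibrated; the `min_score=2` threshold and the function name are assumptions chosen for the example.

```python
CALIB_COLUMNS = ("calib_sys", "calib_gyro", "calib_accel", "calib_mag")


def poorly_calibrated(row, min_score=2):
    """Return the names of the calibration columns below min_score."""
    return [name for name in CALIB_COLUMNS if int(row[name]) < min_score]


# Two hypothetical rows: one healthy, one with system and magnetometer drift.
rows = [
    {"timestamp_ms": 0, "calib_sys": 3, "calib_gyro": 3, "calib_accel": 3, "calib_mag": 2},
    {"timestamp_ms": 20, "calib_sys": 1, "calib_gyro": 3, "calib_accel": 3, "calib_mag": 0},
]
flagged = {row["timestamp_ms"]: poorly_calibrated(row) for row in rows}
print(flagged)  # {0: [], 20: ['calib_sys', 'calib_mag']}
```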

Labelling app

A SwiftUI app on iPhone 17 (Swift 6, the new @Observable macro) for annotating runs after a ski day. Each session can be timestamped, tagged with an activity type (carving, free skiing, traversing, lift, standing) and a subjective quality score, then exported to CSV or JSON to be joined back against the IMU log on timestamp. Built and tested on-device via Xcode.

This part of the system exists because there is no public dataset of labelled boot-cuff IMU data for alpine skiing. I have to bootstrap my own. The app is deliberately minimal: the goal is to make labelling fast enough that I actually do it on the gondola back up, not three weeks later from memory.
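The timestamp join between exported labels and IMU rows could look roughly like the sketch below. The label schema (`start_ms`, `end_ms`, `activity`, `quality`) is hypothetical, and the code assumes that labelled intervals do not overlap and that phone and logger timestamps have already been brought onto a common clock.

```python
from bisect import bisect_right

# Hypothetical label export: one record per annotated interval.
labels = [
    {"start_ms": 60_000, "end_ms": 95_000, "activity": "carving", "quality": 4},
    {"start_ms": 0, "end_ms": 60_000, "activity": "lift", "quality": None},
]


def build_lookup(labels):
    """Sort intervals by start time and return (starts, ordered_labels)."""
    ordered = sorted(labels, key=lambda lab: lab["start_ms"])
    return [lab["start_ms"] for lab in ordered], ordered


def label_for(timestamp_ms, starts, ordered):
    """Return the label whose [start_ms, end_ms) contains the timestamp, else None."""
    i = bisect_right(starts, timestamp_ms) - 1
    if i >= 0 and timestamp_ms < ordered[i]["end_ms"]:
        return ordered[i]
    return None


starts, ordered = build_lookup(labels)
for ts in (59_980, 60_000, 100_000):
    match = label_for(ts, starts, ordered)
    print(ts, match["activity"] if match else None)
```

Binary search keeps the lookup cheap even for a full day of 50 Hz rows, and unlabelled stretches simply come back as `None` rather than being forced into a class.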

Data sample

The CSV layout looks roughly like this (a synthetic placeholder; I will replace it with a real sample once I have a clean session):

timestamp_ms,euler_heading,euler_roll,euler_pitch,quat_w,quat_x,quat_y,quat_z,accel_x,accel_y,accel_z,lin_accel_x,lin_accel_y,lin_accel_z,gyro_x,gyro_y,gyro_z,gravity_x,gravity_y,gravity_z,calib_sys,calib_gyro,calib_accel,calib_mag
1734268800000,182.31,-12.40,3.75,0.9912,-0.1015,0.0832,0.0211,0.41,-0.18,9.71,0.05,-0.04,0.12,1.21,-0.34,0.08,0.40,-0.20,9.69,3,3,3,2
1734268800020,182.45,-12.62,3.81,0.9908,-0.1037,0.0838,0.0213,0.43,-0.22,9.73,0.07,-0.08,0.14,1.35,-0.41,0.10,0.41,-0.23,9.71,3,3,3,2
1734268800040,182.58,-12.81,3.84,0.9904,-0.1059,0.0843,0.0215,0.48,-0.26,9.75,0.13,-0.13,0.16,1.49,-0.46,0.11,0.43,-0.26,9.72,3,3,3,2
1734268800060,182.76,-13.11,3.92,0.9899,-0.1083,0.0851,0.0218,0.55,-0.30,9.79,0.20,-0.18,0.20,1.62,-0.52,0.14,0.45,-0.30,9.74,3,3,3,2
1734268800080,182.91,-13.40,4.01,0.9893,-0.1108,0.0859,0.0221,0.62,-0.35,9.84,0.27,-0.23,0.25,1.74,-0.58,0.17,0.46,-0.34,9.76,3,3,3,2
1734268800100,183.10,-13.65,4.11,0.9887,-0.1131,0.0867,0.0224,0.71,-0.41,9.90,0.36,-0.29,0.31,1.85,-0.63,0.20,0.48,-0.37,9.78,3,3,3,2

The shape that interests me sits in euler_roll (lower-leg edge angle), gyro_z (rate of turn), and lin_accel_y (lateral force as I cross the fall line). Those are the columns I expect a model to lean on most. The model design will not happen until I have several full ski days of recorded data, as I want to see what the signal actually looks like under real skiing conditions before committing to an architecture.
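For reference, a log in this layout can be read with the standard library alone. The sketch below embeds the header and the first two rows of the placeholder sample, converts the fields to numbers, and pulls out the three columns named above; the function name and the int/float split are assumptions.

```python
import csv
import io

SAMPLE = """timestamp_ms,euler_heading,euler_roll,euler_pitch,quat_w,quat_x,quat_y,quat_z,accel_x,accel_y,accel_z,lin_accel_x,lin_accel_y,lin_accel_z,gyro_x,gyro_y,gyro_z,gravity_x,gravity_y,gravity_z,calib_sys,calib_gyro,calib_accel,calib_mag
1734268800000,182.31,-12.40,3.75,0.9912,-0.1015,0.0832,0.0211,0.41,-0.18,9.71,0.05,-0.04,0.12,1.21,-0.34,0.08,0.40,-0.20,9.69,3,3,3,2
1734268800020,182.45,-12.62,3.81,0.9908,-0.1037,0.0838,0.0213,0.43,-0.22,9.73,0.07,-0.08,0.14,1.35,-0.41,0.10,0.41,-0.23,9.71,3,3,3,2
"""

# Columns kept as integers; everything else is parsed as float.
INT_COLUMNS = {"timestamp_ms", "calib_sys", "calib_gyro", "calib_accel", "calib_mag"}


def load_rows(text_stream):
    """Parse a session CSV into a list of dicts with numeric values."""
    rows = []
    for raw in csv.DictReader(text_stream):
        rows.append(
            {key: int(val) if key in INT_COLUMNS else float(val) for key, val in raw.items()}
        )
    return rows


rows = load_rows(io.StringIO(SAMPLE))
signal = [(r["euler_roll"], r["gyro_z"], r["lin_accel_y"]) for r in rows]
print(signal)  # [(-12.4, 0.08, -0.04), (-12.62, 0.1, -0.08)]
```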

Why this hardware, why this mounting point

There are three obvious mounting choices for a skiing IMU: the helmet, the boot shell, and the boot cuff. The helmet picks up too much torso motion to be a clean lower-body signal. The boot shell is too rigidly coupled to the ski, which transmits a lot of high-frequency vibration that swamps the slower technique signal I care about. The cuff sits between the lower leg and the boot shell and captures the angle that produces edging, which is the variable I most want to model. The trade-off is that it is mechanically harder to mount cleanly.

Why a transformer (eventually)

Most published work on IMU-based motion classification uses CNN-LSTM stacks. That's a defensible baseline, but I want to try a transformer-based encoder because (a) it's the architecture I worked with for my MSc thesis on multimodal sleep EEG, (b) the long-context attention seems well-suited to turn-level structure where the relevant span is seconds rather than tens of milliseconds, and (c) I am curious whether a simpler tokenization of the IMU stream into short windows can be made to work. This is a research bet, not a settled design.
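To make point (c) concrete, one simple way to tokenize an IMU stream is to cut it into overlapping fixed-length windows and flatten each window into one vector. The sketch below is only an illustration of that idea; the window length (50 samples, one second at 50 Hz), the stride, and the three-channel input are assumptions, not a chosen design.

```python
def make_windows(samples, window_len=50, stride=25):
    """Split a list of per-sample feature vectors into overlapping windows.

    A trailing remainder shorter than window_len is dropped.
    """
    if window_len <= 0 or stride <= 0:
        raise ValueError("window_len and stride must be positive")
    last_start = len(samples) - window_len
    return [samples[s:s + window_len] for s in range(0, last_start + 1, stride)]


def flatten(window):
    """Concatenate the samples of one window into a single token vector."""
    return [value for sample in window for value in sample]


# 200 fake samples (4 s at 50 Hz) with three channels each.
samples = [[float(i), 0.0, 0.0] for i in range(200)]
windows = make_windows(samples)
tokens = [flatten(w) for w in windows]
print(len(tokens), len(tokens[0]))  # 7 150
```

With these numbers a ten-second turn sequence becomes roughly twenty tokens, which is the scale at which attention over whole turns becomes cheap.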


Current Work in Progress

Onshape enclosure CAD model

The current state of the project:

  • Hardware: working prototype mounted on the ski boot.
  • Firmware: stable 50 Hz CSV logging to SD card, with calibration-score columns and per-session timestamping.
  • Labelling app: built, on-device, exports CSV/JSON for join-back against the IMU log.
  • Data collection: starting next ski season.
  • Models: not started, deliberately waiting until the data is real and I know what I'm looking at.

Problem I'm Struggling With

The single hardest lesson so far is that labels are more expensive than models. Building the labelling app, designing a per-run schema that I can actually fill in on a chairlift, and remembering to annotate before the memory fades: all of it has been more work than I expected, and none of it produces a single line of code that does anything visible. It is also the part that, if I skip it or do it badly, will silently sabotage everything downstream.

The secondary struggle is hardware fragility: jumper-wire connections that fail under cold and vibration, SPI timing issues on the SD card, occasional calibration loss when the magnetometer sees the lift's metal structure. Each of those breaks the logging pipeline in a way that is only visible after the ski day ends.
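Failures like these could be surfaced sooner with a quick check on the log's timestamps as soon as the SD card is read. The sketch below looks for gaps in a nominally 20 ms stream and estimates how many samples were lost; the tolerance value and the function name are assumptions.

```python
EXPECTED_PERIOD_MS = 20  # 50 Hz logging


def find_gaps(timestamps_ms, tolerance_ms=5):
    """Return (previous_ts, next_ts, missing_samples) for every oversized gap."""
    gaps = []
    for prev, curr in zip(timestamps_ms, timestamps_ms[1:]):
        delta = curr - prev
        if delta > EXPECTED_PERIOD_MS + tolerance_ms:
            missing = round(delta / EXPECTED_PERIOD_MS) - 1
            gaps.append((prev, curr, missing))
    return gaps


# Hypothetical log where the logger stalled for 100 ms after t = 40.
timestamps = [0, 20, 40, 140, 160]
print(find_gaps(timestamps))  # [(40, 140, 4)]
```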


Future Work

Hardware. Move from breadboard-and-jumpers to a small PCB. Waterproof enclosure. Integrated LiPo with charge management instead of an external USB-C bank. Possibly a second IMU on the opposite boot for symmetry analysis.

Signal processing. Drift correction on the long-running orientation estimate. Automatic turn segmentation from gyro_z zero-crossings before any ML enters the picture. A simple feature-extraction baseline (turn rate, edge-angle range, lateral G peaks per turn) to compare against the eventual deep model.
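A rough sketch of what zero-crossing segmentation and a per-turn feature baseline might look like follows. The deadband around zero (a simple guard against noise-induced crossings), its value, and the feature names are assumptions; the deadband is expressed in whatever unit the logged `gyro_z` column uses.

```python
def zero_crossings(signal, deadband=0.2):
    """Indices where the signal changes sign, ignoring values inside the deadband."""
    crossings = []
    last_sign = 0
    for i, value in enumerate(signal):
        if abs(value) < deadband:
            continue  # too close to zero to trust the sign
        sign = 1 if value > 0 else -1
        if last_sign != 0 and sign != last_sign:
            crossings.append(i)
        last_sign = sign
    return crossings


def turn_features(gyro_z, euler_roll, deadband=0.2):
    """Treat each span between consecutive crossings as one turn and summarise it."""
    bounds = zero_crossings(gyro_z, deadband)
    features = []
    for start, end in zip(bounds, bounds[1:]):
        roll = euler_roll[start:end]
        features.append({
            "start": start,
            "end": end,
            "edge_angle_range": max(roll) - min(roll),
            "peak_turn_rate": max(abs(g) for g in gyro_z[start:end]),
        })
    return features


# Tiny synthetic example: right, left, right, then the start of a left turn.
gyro_z = [1.0, 0.5, 0.1, -0.6, -1.2, -0.4, 0.05, 0.7, 1.1, 0.3, -0.5]
euler_roll = [5, 3, 0, -4, -12, -6, -1, 4, 11, 6, 0]
print(zero_crossings(gyro_z))  # [3, 7, 10]
print(turn_features(gyro_z, euler_roll))
```

Only complete turns (bounded by a crossing on both sides) are summarised, so the partial turns at the start and end of a run are dropped rather than distorting the per-turn statistics.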

Modelling. A self-supervised pre-training stage on unlabelled data, followed by a small supervised head for technique classification on the labelled subset. Possibly anomaly detection on quality scores to surface "interesting" sections of a run automatically.

Visualization. A run-replay dashboard that lets me scrub through a session with the IMU signal and the labels overlaid. Side-by-side comparison of two runs to look at technique consistency.

Mobile integration. Live BLE streaming from the logger to the phone, so labelling can happen during the run rather than after.


Why this is a featured project

This is the project I am committing real time to. It sits at the intersection of everything I want to be doing (hardware, embedded firmware, signal processing, sensor fusion, transformer-based modelling, sports analytics), and it is mine end to end. Most of the other things on this site are completed, frozen, and behind glass. This one is alive, and the most interesting parts of it have not been built yet.