What activity recognition algorithms are used in wearables?
The most important factors in activity recognition are often said to be:
→ Window size (e.g., 1 second vs. 3 seconds)
→ Features (mean, STD, FFT, peaks, etc.)
Is this actually the case?
There’s a reason people tell you to double-check your window and feature set before changing your model in activity recognition.
1. Window Size: “Absolutely” More Important Than You Think
It really does make a big difference.
* 1-second window
– Fast response
– But walking, running, and cycling can all look similar
* 3-5-second window
– Patterns begin to emerge
– Stability increases significantly
* Too long
– Transitions (e.g., walking → stopping) get missed
* Common trade-off in the field:
– 2-3 second window
– 50% overlap
→ I’ve seen F1 scores jump by several percentage points just by changing this (a quick windowing sketch is below).
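To make the window/overlap idea concrete, here’s a minimal sketch in Python (numpy only). The 50 Hz sampling rate, 3-second window, and 50% overlap are example values I’m assuming, not a universal recommendation:

```python
import numpy as np

def sliding_windows(signal, fs=50, win_sec=3.0, overlap=0.5):
    """Split a (n_samples, n_channels) signal into fixed-length windows.

    fs      : sampling rate in Hz (assumed 50 Hz here)
    win_sec : window length in seconds (e.g., 2-3 s)
    overlap : fraction of overlap between consecutive windows (0.5 = 50%)
    """
    win = int(win_sec * fs)                    # samples per window
    step = max(1, int(win * (1.0 - overlap)))  # hop size in samples
    windows = [signal[start:start + win]
               for start in range(0, len(signal) - win + 1, step)]
    return np.stack(windows)

# Example: 60 s of fake 3-axis accelerometer data at 50 Hz
acc = np.random.randn(60 * 50, 3)
w = sliding_windows(acc, fs=50, win_sec=3.0, overlap=0.5)
print(w.shape)  # (39, 150, 3): 3-second windows with 50% overlap
```

The same function also covers the “3-second window, update every second” pattern discussed in the comments below: just set the hop to 1 second (overlap ≈ 0.67).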
2. Features: “The language that teaches the model.”
More than half of model performance depends on “how good the features are.”
Classic combinations are usually:
* Time domain
– mean, std, RMS, zero-crossing, signal magnitude
* Frequency domain
– FFT energy, dominant frequency, spectral entropy
* Event-based
– peak count, peak interval variance
* Important points:
– More doesn’t automatically mean better
– A handful of features that capture different things beats a pile of redundant ones
→ With RF in particular, it often feels like “the features are the model” (a small feature-extraction sketch follows this list)
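As a rough illustration, here’s a hedged sketch of such a feature set in Python (numpy + scipy); the exact features, normalizations, and the peak-distance threshold are my own example choices, not from any particular paper or product:

```python
import numpy as np
from scipy.signal import find_peaks

def window_features(window, fs=50):
    """Features for one window of shape (n_samples, 3) of accelerometer data."""
    mag = np.linalg.norm(window, axis=1)               # signal magnitude per sample
    feats = {
        # time domain
        "mean": mag.mean(),
        "std": mag.std(),
        "rms": np.sqrt(np.mean(mag ** 2)),
        "zero_cross": int(np.sum(np.diff(np.sign(mag - mag.mean())) != 0)),
    }
    # frequency domain
    spec = np.abs(np.fft.rfft(mag - mag.mean())) ** 2
    freqs = np.fft.rfftfreq(len(mag), d=1.0 / fs)
    p = spec / (spec.sum() + 1e-12)                     # normalized power spectrum
    feats["fft_energy"] = spec.sum()
    feats["dominant_freq"] = freqs[np.argmax(spec)]
    feats["spectral_entropy"] = float(-np.sum(p * np.log2(p + 1e-12)))
    # event-based
    peaks, _ = find_peaks(mag, distance=fs // 4)        # at most ~4 peaks per second
    feats["peak_count"] = len(peaks)
    feats["peak_interval_var"] = float(np.var(np.diff(peaks))) if len(peaks) > 2 else 0.0
    return feats
```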
3. FFT vs. just statistics
This question comes up often, lol
* Walking/running distinction → FFT is really helpful
* Static activity (standing/sitting) → Statistical features are sufficient
* Cycling:
– FFT + autocorrelation increases the odds of success (see the sketch after this list)
→ FFT isn’t a “solution” so much as a card you play when the classes are hard to tell apart.
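For the FFT + autocorrelation point, one common trick is to score how periodic a window is and use that as an extra feature. A minimal sketch, assuming 50 Hz data and an illustrative cadence range of 0.5-4 Hz:

```python
import numpy as np

def periodicity_score(mag, fs=50, min_hz=0.5, max_hz=4.0):
    """Strength of the strongest autocorrelation peak in a plausible cadence range.

    mag : 1-D signal magnitude of one window
    Rhythmic motion (walking, running, cycling) scores high; static postures near zero.
    """
    x = mag - mag.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # autocorrelation for lags >= 0
    ac /= ac[0] + 1e-12                                # normalize so lag 0 == 1
    lo, hi = int(fs / max_hz), int(fs / min_hz)        # lag range for 0.5-4 Hz
    seg = ac[lo:min(hi, len(ac))]
    return float(seg.max()) if seg.size else 0.0
```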
4. Deep learning runs into the same issues
“With DL you don’t have to engineer features” → only half right.
– Window size still matters
– Sampling rate also has a significant impact on performance
– If you look closely, a CNN’s first-layer filters end up doing something like an automatic FFT (a tiny sketch is below)
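To illustrate that DL shifts these choices rather than removing them, here’s a tiny 1-D CNN sketch in PyTorch. The architecture is purely illustrative, and note that window length and sampling rate still dictate the input shape:

```python
import torch
import torch.nn as nn

class TinyHARNet(nn.Module):
    """Minimal 1-D CNN over raw accelerometer windows (channels = sensor axes)."""
    def __init__(self, n_channels=3, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=9, padding=4),  # learned "filter bank"
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # global pooling -> one summary vector per window
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):              # x: (batch, n_channels, window_length)
        return self.classifier(self.features(x).squeeze(-1))

# A 3 s window at 50 Hz is 150 samples; change either and the input shape changes too.
model = TinyHARNet()
out = model(torch.randn(8, 3, 150))
print(out.shape)  # torch.Size([8, 4])
```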
5. Conclusion in a nutshell
* Yes, the window + features are the most important
* The model comes next
* If performance isn’t good:
– Try changing the window first
– Reexamine the meaning of the features
– Then consider the model
This order is also good for your mental health.
Thank you, David Mun.
I totally agree with this post, lol. But I have a question.
You mentioned a 2-3 second window. Is that realistic, even for real-time processing?
Wouldn’t that result in too much delay in response?
Good point. In real-time, that’s the first thing everyone thinks about.
Usually, we compromise by setting the window to 3 seconds and updating the results every second.
We either use a 50% overlap or a sliding window, opting for “a slightly slower but much more stable” result.
For things like walking/running, even a 1-second delay doesn’t significantly affect the UX.
Oh… So you added all the features from the beginning and then reduced them based on their importance?
Or did you start with a minimal set?
I always started with a minimal set. Otherwise, debugging would be a nightmare.
The cleanest approach was to build a baseline with something like mean/std/RMS + step frequency,
and then add FFT or peak features only when performance hit a ceiling. With RF, checking feature importance also helps a lot (small sketch below).
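A minimal sketch of that workflow with scikit-learn; the data and feature names here are placeholders, not real sensor recordings:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder data: X is (n_windows, n_features), y is the activity label per window.
feature_names = ["mean", "std", "rms", "step_freq"]   # the minimal baseline set
X = np.random.randn(500, len(feature_names))
y = np.random.randint(0, 4, size=500)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X, y)

# Rank features by RF importance before deciding what to add next (FFT, peaks, ...)
for name, imp in sorted(zip(feature_names, rf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name:10s} {imp:.3f}")
```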
One last question… If I use DL, will I have to worry about these things less?
Honestly, it doesn’t really reduce them at all, lol.
You still have to worry about windowing, sampling, and preprocessing,
and on top of that you can no longer easily see why something isn’t working.
That’s why I always recommend the classic approach first, then DL.