I'm thinking about developing a wearable. What tips do you have for preprocessing collected data?
What are your thoughts on the following?
– Is a Butterworth filter often used to remove gravity (g)?
– Is error correction essential?
– Is it appropriate to create separate models for each sensor location (wrist/waist/ankle)?
These questions usually lead to lengthy comments, lol.
To summarize briefly: “A lot of things are theoretically correct, but in practice the compromises matter more.”
1. Is the Butterworth filter often used for gravity removal?
→ Yes, it’s used quite often. It’s practically the de facto standard.
The usual approach is to cut low frequencies (cutoff roughly 0.3~0.5 Hz) to isolate the gravity component, then subtract it from the raw signal.
It’s preferred in the field because of its simple implementation and stability.
Pros:
– Intuitive parameters
– No tuning hell
Cons:
– Phase delay (but mostly negligible for activity recognition)
* If you need to go fast without Kalman/sensor fusion, a Butterworth low-pass + high-pass combination is sufficient for practical use.
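To make the low-pass route concrete, here is a minimal sketch using SciPy. The cutoff (0.5 Hz) and filter order are illustrative, and the synthetic signal is made up; using `filtfilt` (zero-phase filtering) also sidesteps the phase-delay downside mentioned above.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def remove_gravity(acc, fs, cutoff=0.5, order=4):
    """Estimate gravity with a low-pass Butterworth filter and
    subtract it to get linear (body) acceleration.

    acc    : (n_samples, 3) accelerometer signal in m/s^2
    fs     : sampling rate in Hz
    cutoff : low-pass cutoff in Hz (0.3-0.5 Hz is typical)
    """
    b, a = butter(order, cutoff / (fs / 2), btype="low")
    gravity = filtfilt(b, a, acc, axis=0)  # zero-phase: no phase delay
    return acc - gravity, gravity

# synthetic check: constant gravity plus a 3 Hz body motion
fs = 50
t = np.arange(0, 10, 1 / fs)
g = np.array([0.0, 0.0, 9.81])
body = 2.0 * np.sin(2 * np.pi * 3.0 * t)[:, None]
acc = g + body
linear, gravity = remove_gravity(acc, fs)
```

After the split, `gravity` hovers around the constant 9.81 m/s² on the up axis while `linear` keeps only the 3 Hz motion.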
2. Is error correction (calibration) essential?
→ It depends on the sensor quality and application.
Advanced IMUs:
– Often already internally calibrated
– Many teams skip this for activity recognition
Low-cost sensors / long-term use:
– Temperature drift, bias buildup → not negligible
A realistic approach:
– One-time offset correction during production
– No complete error modeling at runtime
– Instead, robust feature design
* “Deep calibration” is usually required for orientation
* Often unnecessary for simple activity recognition
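As a sketch of what the “one-time offset correction” step might look like: capture a short static recording with a known axis pointing up, estimate the per-axis bias, and subtract it at runtime. The recording setup and numbers below are synthetic assumptions, not a real calibration protocol.

```python
import numpy as np

GRAVITY = 9.81  # m/s^2

def estimate_offset(static_acc, axis_up=2):
    """One-time offset (bias) estimate from a short static recording.

    static_acc : (n, 3) samples captured with the device at rest,
                 axis `axis_up` pointing against gravity.
    Returns a per-axis bias to subtract at runtime.
    """
    bias = static_acc.mean(axis=0)
    bias[axis_up] -= GRAVITY  # the up axis should read +g, not 0
    return bias

# hypothetical factory step: 2 s of rest data at 50 Hz, small true bias
rng = np.random.default_rng(0)
static = rng.normal([0.02, -0.05, 9.91], 0.01, size=(100, 3))
bias = estimate_offset(static)
corrected = static - bias
```

After correction the static recording averages out to (0, 0, g), which is all most activity-recognition pipelines need from calibration.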
3. Is it appropriate to create separate models for each sensor location?
→ To be honest: Yes, almost everyone does.
Wrist, waist, and ankle placements:
– produce completely different signal shapes
– show completely different feature distributions, even for the same activity
Many attempts to “cover with a single model”
– result in poor performance
– and a debugging nightmare
Practically common architectures:
– Separate models by location are OK
– Or feed location in as an input feature for gating
– At least separate threshold/feature sets
* Papers favor the unified model,
* but in reality, separate models are much more stable.
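A toy sketch of the “separate models, shared preprocessing” pattern: one feature extractor for every placement, with a location-keyed dispatch to per-location models. The `ThresholdModel` class and its threshold values are made-up placeholders for real classifiers.

```python
import numpy as np

def extract_features(window):
    # shared pipeline: same features regardless of sensor location
    return np.array([window.mean(), window.std(), np.abs(np.diff(window)).mean()])

class ThresholdModel:
    """Toy stand-in for a location-specific classifier."""
    def __init__(self, active_threshold):
        self.active_threshold = active_threshold

    def predict(self, features):
        # features[1] is the window's standard deviation
        return "active" if features[1] > self.active_threshold else "idle"

# separate model (here: just a separate threshold) per placement
models = {
    "wrist": ThresholdModel(active_threshold=0.8),
    "waist": ThresholdModel(active_threshold=0.4),
    "ankle": ThresholdModel(active_threshold=0.6),
}

def classify(window, location):
    return models[location].predict(extract_features(window))
```

The same window can land in different classes depending on placement, which is exactly why a single undifferentiated model tends to struggle.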
4. One-line summary of practical applications
– Butterworth: OK, just use it
– Error correction: No, not absolutely necessary (case-by-case)
– Sensor location: OK, separate if possible
And one really important note: In preprocessing, “consistency” has a greater impact on performance than “perfection.”
David Mun, thank you so much for answering every question.
I really enjoyed the summary. I have one more question.
When removing gravity with Butterworth, what low-pass cutoff do you usually set?
This is always a concern, as activity intensities vary from person to person.
We’ve all experienced that at least once, lol.
In my experience, starting between 0.3 and 0.5 Hz and not deviating too much has been the most stable approach.
Per-person tuning is endless, so instead I fix the cutoff and adjust the window length and feature filtering, which cuts down most of the tedium.
In activity recognition, “consistent separation” is more important than “perfect separation.”
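The “fix the cutoff, tune the window instead” idea can be sketched as a plain sliding-window function; the window length and overlap values here are just example settings, not recommendations.

```python
import numpy as np

def sliding_windows(signal, fs, win_s=2.0, overlap=0.5):
    """Cut a 1-D signal into overlapping windows.

    The filter cutoff stays fixed upstream; win_s and overlap are
    the knobs that actually get tuned per application.
    """
    win = int(win_s * fs)
    step = int(win * (1 - overlap))
    n = (len(signal) - win) // step + 1
    return np.stack([signal[i * step : i * step + win] for i in range(n)])

x = np.arange(500)  # 10 s of samples at 50 Hz
w = sliding_windows(x, fs=50, win_s=2.0, overlap=0.5)
# 2 s windows (100 samples) with 50% overlap -> a new window every 50 samples
```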
Ah… I understand.
Then, wouldn’t it be more difficult to manage models for each sensor location if they were completely separate?
Yes, it does increase the management overhead.
So, we usually separate the models, but share the preprocessing pipeline as much as possible. Alternatively, you can tag only the location and set different thresholds.
Ideally an integrated model would be better, but in practice separate models are easier to live with.