Demo notebook: Analysing tracker data

Introduction

Fitness trackers are a rich source of longitudinal data captured at high frequency. These data can include step counts, heart rate, calorie expenditure, or sleep time. This notebook explains how we can use niimpy to extract some basic statistics and features from step count data.

Read data

[1]:
import niimpy
import pandas as pd
import niimpy.preprocessing.tracker as tracker
from niimpy import config
import warnings
warnings.filterwarnings("ignore")
[2]:
data = pd.read_csv(config.STEP_SUMMARY_PATH, index_col=0)
# Convert the index to datetime
data.index = pd.to_datetime(data.index)
data.shape
[2]:
(73, 4)
[3]:
data.head()
[3]:
                         user        date          time  steps
2021-07-01 00:00:00  wiam9xme  2021-07-01  00:00:00.000      0
2021-07-01 01:00:00  wiam9xme  2021-07-01  01:00:00.000      0
2021-07-01 02:00:00  wiam9xme  2021-07-01  02:00:00.000      0
2021-07-01 03:00:00  wiam9xme  2021-07-01  03:00:00.000      0
2021-07-01 04:00:00  wiam9xme  2021-07-01  04:00:00.000      0
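
As a side note, reading the CSV and converting the index can usually be combined into one call through the parse_dates argument of pandas.read_csv. The following is a minimal sketch on a hypothetical inline CSV that only mimics the layout shown above; the step values are made up for illustration and are not taken from the niimpy sample file.

```python
import io

import pandas as pd

# Hypothetical CSV text mimicking the column layout shown above.
csv_text = """,user,date,time,steps
2021-07-01 00:00:00,wiam9xme,2021-07-01,00:00:00.000,0
2021-07-01 01:00:00,wiam9xme,2021-07-01,01:00:00.000,120
2021-07-01 02:00:00,wiam9xme,2021-07-01,02:00:00.000,45
"""

# index_col=0 uses the first column as the index; parse_dates=True asks
# pandas to parse that index as datetimes while reading.
toy = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# A DatetimeIndex is what makes time-based grouping and slicing possible.
is_datetime_index = isinstance(toy.index, pd.DatetimeIndex)
print(is_datetime_index, toy.shape)
```

Either approach should yield the same datetime index; the explicit pd.to_datetime call used in this notebook makes the conversion step easier to see.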

Getting basic statistics

Using niimpy we can extract a user’s step count statistics within a time window. The statistics include:

  • median: median number of steps taken within the time range

  • mean: average number of steps taken within the time range

  • standard deviation: standard deviation of the number of steps

  • max: maximum number of steps taken within a day during the time range

  • min: minimum number of steps taken within a day during the time range

[4]:
tracker.step_summary(data, {'value_col': 'steps'})
[4]:
       user  median_sum_step  avg_sum_step  std_sum_step  min_sum_step  max_sum_step
0  wiam9xme           6480.0   8437.383562   3352.347745          5616         13025
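
To make the idea of summarising daily step totals concrete, here is a rough pandas sketch. It is not niimpy's implementation, and niimpy may aggregate differently, so it is not meant to reproduce the numbers above. The data frame, the user ID u1, and the helper column day are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical step records for one user over three days (not niimpy data).
toy = pd.DataFrame(
    {
        "user": ["u1"] * 6,
        "steps": [100, 200, 300, 400, 500, 600],
    },
    index=pd.to_datetime(
        [
            "2021-07-01 08:00", "2021-07-01 18:00",
            "2021-07-02 08:00", "2021-07-02 18:00",
            "2021-07-03 08:00", "2021-07-03 18:00",
        ]
    ),
)

# Step 1: sum the steps per user and calendar day.
toy["day"] = toy.index.date
daily = toy.groupby(["user", "day"])["steps"].sum()

# Step 2: summarise the daily totals per user.
summary = daily.groupby(level="user").agg(["median", "mean", "std", "min", "max"])
print(summary)
```

Computing the daily totals first and summarising them second keeps the two aggregation levels separate, which makes it easy to swap in other statistics.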

Feature extraction

Assuming that the step counts come in at hourly resolution, we can compute the distribution of the daily step count across the hours of the day. The daily distribution is helpful if, for example, we want to see at which hours a user is most active.

[5]:
f = tracker.tracker_daily_step_distribution
step_distribution = tracker.extract_features_tracker(data, features={f: {}})
step_distribution
{<function tracker_daily_step_distribution at 0x7f206b5a19e0>: {}} {}
[5]:
         user        date                 time  steps  month  day  daily_sum  hour  daily_distribution
  0  wiam9xme  2021-07-01  2021-07-01 00:00:00      0      7    1       5616     0            0.000000
  1  wiam9xme  2021-07-01  2021-07-01 01:00:00      0      7    1       5616     1            0.000000
  2  wiam9xme  2021-07-01  2021-07-01 02:00:00      0      7    1       5616     2            0.000000
  3  wiam9xme  2021-07-01  2021-07-01 03:00:00      0      7    1       5616     3            0.000000
  4  wiam9xme  2021-07-01  2021-07-01 04:00:00      0      7    1       5616     4            0.000000
...       ...         ...                  ...    ...    ...  ...        ...   ...                 ...
 67  wiam9xme  2021-07-03  2021-07-03 19:00:00    302      7    3      12002    19            0.025162
 68  wiam9xme  2021-07-03  2021-07-03 20:00:00     12      7    3      12002    20            0.001000
 69  wiam9xme  2021-07-03  2021-07-03 21:00:00    354      7    3      12002    21            0.029495
 70  wiam9xme  2021-07-03  2021-07-03 22:00:00      0      7    3      12002    22            0.000000
 71  wiam9xme  2021-07-03  2021-07-03 23:00:00      0      7    3      12002    23            0.000000

72 rows × 9 columns
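
In the output above, daily_distribution appears to be each hour's step count divided by that day's total (for example, 302 / 12002 ≈ 0.025162). The sketch below reproduces that calculation with plain pandas and then looks up the hour with the largest average share. It is an illustration under that assumption, not niimpy's own code. The toy data and the user ID u1 are hypothetical.

```python
import pandas as pd

# Hypothetical hourly step counts for one user over two days.
toy = pd.DataFrame(
    {
        "user": ["u1"] * 4,
        "steps": [100, 300, 50, 150],
    },
    index=pd.to_datetime(
        [
            "2021-07-01 08:00", "2021-07-01 18:00",
            "2021-07-02 08:00", "2021-07-02 18:00",
        ]
    ),
)
toy["day"] = toy.index.date
toy["hour"] = toy.index.hour

# Total steps per user and day, broadcast back onto every hourly row.
toy["daily_sum"] = toy.groupby(["user", "day"])["steps"].transform("sum")

# Share of the day's steps taken in each hour.
# Note: a day with zero total steps would produce NaN here (0 / 0).
toy["daily_distribution"] = toy["steps"] / toy["daily_sum"]

# Average share per hour across days, then the most active hour.
mean_share_by_hour = toy.groupby("hour")["daily_distribution"].mean()
peak_hour = mean_share_by_hour.idxmax()
print(mean_share_by_hour)
print("Most active hour:", peak_hour)
```

Averaging the shares rather than the raw counts gives every day equal weight, so one unusually active day does not dominate the picture.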