Demo notebook: Analysing battery data

Read data

[1]:
import pandas as pd
import niimpy
import niimpy.preprocessing.battery as battery
from niimpy import config
import warnings
warnings.filterwarnings("ignore")
[2]:
data = niimpy.read_csv(config.MULTIUSER_AWARE_BATTERY_PATH, tz='Europe/Helsinki')
data.shape
[2]:
(505, 8)

Introduction

In this notebook, we extract battery data collected with the Aware platform and infer users’ behavioral patterns from their interaction with the phone. The notebook covers the following functions:

  • niimpy.preprocessing.battery.shutdown_info: returns the timestamps at which the device was shut down or rebooted

  • niimpy.preprocessing.battery.battery_occurrences: returns the number of battery samples within a time range

  • niimpy.preprocessing.battery.battery_gaps: returns the time gaps between consecutive battery samples

[3]:
data.head()
[3]:
user device time battery_level battery_status battery_health battery_adaptor datetime
2020-01-09 02:20:02.924999936+02:00 jd9INuQ5BBlW 3p83yASkOb_B 1.578529e+09 74 3 2 0 2020-01-09 02:20:02.924999936+02:00
2020-01-09 02:21:30.405999872+02:00 jd9INuQ5BBlW 3p83yASkOb_B 1.578529e+09 73 3 2 0 2020-01-09 02:21:30.405999872+02:00
2020-01-09 02:24:12.805999872+02:00 jd9INuQ5BBlW 3p83yASkOb_B 1.578529e+09 72 3 2 0 2020-01-09 02:24:12.805999872+02:00
2020-01-09 02:35:38.561000192+02:00 jd9INuQ5BBlW 3p83yASkOb_B 1.578530e+09 72 2 2 0 2020-01-09 02:35:38.561000192+02:00
2020-01-09 02:35:38.953000192+02:00 jd9INuQ5BBlW 3p83yASkOb_B 1.578530e+09 72 2 2 2 2020-01-09 02:35:38.953000192+02:00

Feature extraction

Niimpy expects the data to be sorted by timestamp in ascending order. We start by sorting the index to make sure the data is compatible.

[4]:
data = data.sort_index()
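
As a quick sanity check, the sort order of a datetime index can be verified with plain pandas. The sketch below uses a small made-up dataframe (the timestamps and levels are illustrative, not taken from the dataset) and assumes only that the index is a timezone-aware DatetimeIndex, as in the data above.

```python
import pandas as pd

# Hypothetical, deliberately unsorted samples for illustration only.
idx = pd.to_datetime(
    ["2020-01-09 02:24:12", "2020-01-09 02:20:02", "2020-01-09 02:21:30"]
).tz_localize("Europe/Helsinki")
unsorted = pd.DataFrame({"battery_level": [72, 74, 73]}, index=idx)

# is_monotonic_increasing is False when any timestamp precedes its predecessor.
was_sorted = unsorted.index.is_monotonic_increasing

# sort_index returns a new dataframe ordered by timestamp.
sorted_df = unsorted.sort_index()
print(was_sorted, sorted_df.index.is_monotonic_increasing)
```

Checking `is_monotonic_increasing` before calling a feature function is cheap and catches unsorted input early.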

Next, we will use Niimpy to extract features from the data. These are useful for inspecting the data and can be part of a full analysis workflow.

Using the battery_occurrences function, we can count the number of battery samples in every 10-minute window. This function requires the index to be sorted.

[5]:
battery.battery_occurrences(data, {"resample_args": {"rule": "10T"}})
[5]:
occurrences
user
iGyXetHE3S8u 2019-08-05 14:00:00+03:00 2
2019-08-05 14:10:00+03:00 0
2019-08-05 14:20:00+03:00 0
2019-08-05 14:30:00+03:00 1
2019-08-05 14:40:00+03:00 0
... ... ...
jd9INuQ5BBlW 2020-01-09 22:50:00+02:00 0
2020-01-09 23:00:00+02:00 1
2020-01-09 23:10:00+02:00 1
2020-01-09 23:20:00+02:00 1
2020-01-09 23:30:00+02:00 2

626 rows × 1 columns

The above dataframe gives the battery sample counts for all users. You can also get the counts for an individual user by passing a filtered dataframe, here through extract_features_battery.

[6]:
# Select the feature function and keep only one user's rows
f = battery.battery_occurrences
data_filtered = data.query('user == "jd9INuQ5BBlW"')
individual_occurrences = battery.extract_features_battery(data_filtered, feature_functions={f: {"resample_args": {"rule": "10T"}}})
individual_occurrences.head()
<function battery_occurrences at 0x7fd699bf72e0> {'resample_args': {'rule': '10T'}}
[6]:
occurrences
user
jd9INuQ5BBlW 2020-01-09 02:00:00+02:00 3
2020-01-09 02:10:00+02:00 1
2020-01-09 02:20:00+02:00 5
2020-01-09 02:30:00+02:00 16
2020-01-09 02:40:00+02:00 14
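
To make the counting concrete, the same kind of per-user, per-window count can be sketched in plain pandas. This is an illustration on made-up data, not niimpy's actual implementation: it assumes a dataframe with a `user` column and a datetime index, and groups by user before resampling into 10-minute windows.

```python
import pandas as pd

# Hypothetical samples for two users (values are illustrative only).
idx = pd.to_datetime(
    [
        "2020-01-09 02:01:00",
        "2020-01-09 02:03:00",
        "2020-01-09 02:05:00",
        "2020-01-09 02:12:00",
        "2020-01-09 02:25:00",
    ]
).tz_localize("Europe/Helsinki")
samples = pd.DataFrame(
    {"user": ["a", "b", "a", "b", "a"], "battery_level": [74, 90, 73, 89, 72]},
    index=idx,
)

# Count samples per user in each 10-minute window; empty windows count as 0.
counts = (
    samples.groupby("user")["battery_level"]
    .resample("10min")
    .count()
    .to_frame("occurrences")
)
print(counts)
```

The result is indexed by user and window start, which mirrors the layout of the output tables above.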

Next, you can extract the gaps between two consecutive battery samples with the battery_gaps function.

[7]:
# An empty configuration dictionary uses the function's default settings
gaps = battery.battery_gaps(data, {})
gaps
[7]:
battery_gap
user
iGyXetHE3S8u 2019-08-05 14:00:00+03:00 0 days 00:01:18.600000
2019-08-05 14:30:00+03:00 0 days 00:27:18.396000
2019-08-05 15:00:00+03:00 0 days 00:51:11.997000192
2019-08-05 15:30:00+03:00 NaT
2019-08-05 16:00:00+03:00 0 days 00:59:23.522999808
... ... ...
jd9INuQ5BBlW 2020-01-09 21:30:00+02:00 0 days 00:05:41.859499968
2020-01-09 22:00:00+02:00 0 days 00:14:10.238500096
2020-01-09 22:30:00+02:00 0 days 00:21:09.899999744
2020-01-09 23:00:00+02:00 0 days 00:13:20.001333418
2020-01-09 23:30:00+02:00 0 days 00:08:26.416999936

210 rows × 1 columns
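
The output above has one row per 30-minute bin. One plausible reading, which is an assumption here rather than something the notebook states, is that each value summarises the gaps between consecutive samples that fall in that bin. A minimal pandas sketch of that idea for a single user, using made-up timestamps and averaging the gaps in seconds:

```python
import pandas as pd

# Hypothetical single-user samples (illustrative only).
idx = pd.to_datetime(
    [
        "2020-01-09 02:00:00",
        "2020-01-09 02:10:00",
        "2020-01-09 02:40:00",
        "2020-01-09 02:50:00",
    ]
).tz_localize("Europe/Helsinki")
samples = pd.DataFrame({"battery_level": [74, 73, 72, 71]}, index=idx)

# Time since the previous sample, in seconds (NaN for the first sample).
gap_seconds = samples.index.to_series().diff().dt.total_seconds()

# Mean gap per 30-minute bin; bins without any gap value come out as NaN.
mean_gap = gap_seconds.resample("30min").mean()
print(mean_gap)
```

Working in seconds keeps the aggregation numeric; the values can be converted back with `pd.to_timedelta(mean_gap, unit="s")` if a timedelta column is preferred.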

Knowing when the phone was shut down is essential if we want to infer the usage behaviour of the subjects. This can be done by calling the shutdown_info function, which returns the timestamps at which the phone was shut down or rebooted (for example, rows where battery_status = -1).

[8]:
shutdown = battery.shutdown_info(data, feature_functions={'battery_column_name': 'battery_status'})
shutdown
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_306302/2493873374.py in ?()
----> 1 shutdown = battery.shutdown_info(data, feature_functions={'battery_column_name': 'battery_status'})
      2 shutdown

~/src/niimpy/niimpy/preprocessing/battery.py in ?(df, feature_functions)
     29
     30     df[col_name] = pd.to_numeric(df[col_name]) #convert to numeric in case it is not
     31
     32     shutdown = df[df[col_name].between(-3, 0, inclusive="neither")]
---> 33     return shutdown[col_name].to_dataframe()

~/miniconda3/envs/niimpy/lib/python3.11/site-packages/pandas/core/generic.py in ?(self, name)
   5985             and name not in self._accessors
   5986             and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5987         ):
   5988             return self[name]
-> 5989         return object.__getattribute__(self, name)

AttributeError: 'Series' object has no attribute 'to_dataframe'
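
The traceback above shows the call failing inside niimpy at `shutdown[col_name].to_dataframe()`. A pandas Series has a `to_frame` method but no `to_dataframe`, which explains the AttributeError. Until the installed niimpy version handles this, the filter visible in the traceback (status values strictly between -3 and 0) can be reproduced directly in pandas. The sketch below runs on a small made-up dataframe; the column name `battery_status` follows the notebook, and everything else is illustrative.

```python
import pandas as pd

# Hypothetical status log; -1 and -2 fall strictly between -3 and 0.
idx = pd.date_range("2020-01-09 02:00", periods=5, freq="10min", tz="Europe/Helsinki")
sample = pd.DataFrame({"battery_status": ["3", "2", "-1", "-2", "3"]}, index=idx)

# Convert to numeric in case the column was read as strings.
status = pd.to_numeric(sample["battery_status"])

# Keep rows with status strictly between -3 and 0, then return a dataframe.
shutdown_events = status[status.between(-3, 0, inclusive="neither")].to_frame()
print(shutdown_events)
```

With the notebook's data, the same three lines could be applied to `data["battery_status"]` in place of the made-up column.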

Extracting features with the extract_features_battery call

We have seen above how to extract individual battery features using niimpy. Often we need more than one feature, and extracting them one by one is inconvenient. niimpy provides the extract_features_battery function, which extracts all requested features and combines them into a single dataframe. The extractable features must start with the prefix battery_.

[ ]:
# Start by selecting the feature functions
f0 = niimpy.preprocessing.battery.battery_occurrences
f1 = niimpy.preprocessing.battery.battery_gaps
f2 = niimpy.preprocessing.battery.battery_charge_discharge

# extract_features_battery takes a feature_functions parameter:
# a dictionary where each key is a feature function and each value
# is a dictionary of arguments passed to that function.
# Note: earlier cells pass the rule as {"resample_args": {"rule": ...}};
# with the top-level 'rule' key used here, the output below is in 30-minute bins.
features = battery.extract_features_battery(
    data,
    feature_functions={
        f0: {'rule': "10min"},
        f1: {},
        f2: {},
    },
)
features.head()
<function battery_occurrences at 0x7f15ba5bb2e0> {'rule': '10min'}
<function battery_gaps at 0x7f15ba5bb380> {}
<function battery_charge_discharge at 0x7f15ba5bb420> {}
occurrences battery_gap bdelta charge/discharge
user
iGyXetHE3S8u 2019-08-05 14:00:00+03:00 2 0 days 00:01:18.600000 -0.5 -0.006361
2019-08-05 14:30:00+03:00 1 0 days 00:27:18.396000 -1.0 -0.000610
2019-08-05 15:00:00+03:00 1 0 days 00:51:11.997000192 -1.0 -0.000326
2019-08-05 15:30:00+03:00 0 NaT NaN NaN
2019-08-05 16:00:00+03:00 1 0 days 00:59:23.522999808 -1.0 -0.000281
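
The pattern of mapping feature functions to their argument dictionaries and merging the results can be sketched in a few lines of pandas. The names below (`count_samples`, `mean_level`, `extract_features`) are hypothetical helpers written for this illustration; they are not niimpy functions, and niimpy's own implementation may differ.

```python
import pandas as pd

def count_samples(df, config):
    # Hypothetical feature: number of samples per resampling window.
    rule = config.get("rule", "30min")
    return df["battery_level"].resample(rule).count().to_frame("occurrences")

def mean_level(df, config):
    # Hypothetical feature: mean battery level per resampling window.
    rule = config.get("rule", "30min")
    return df["battery_level"].resample(rule).mean().to_frame("mean_level")

def extract_features(df, feature_functions):
    # Call each function with its own config, then join the columns on the index.
    frames = [func(df, config) for func, config in feature_functions.items()]
    return pd.concat(frames, axis=1)

# Made-up single-user samples, one every 10 minutes.
idx = pd.date_range("2020-01-09 02:00", periods=6, freq="10min", tz="Europe/Helsinki")
samples = pd.DataFrame({"battery_level": [74, 73, 72, 72, 71, 70]}, index=idx)

features_sketch = extract_features(
    samples, {count_samples: {"rule": "30min"}, mean_level: {}}
)
print(features_sketch)
```

Because every feature here is resampled to the same windows, the column-wise concatenation lines up row by row. If the features used different window sizes, the combined frame would contain missing values where the windows do not match.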