niimpy.reading.database module

Read data from sqlite3 databases.

Direct use of this module is mostly deprecated.

Read data from sqlite3 databases, either directly into pandas.DataFrame objects (Database.raw(), among other functions) or through Database objects. A Database object does not load data immediately; instead it provides methods to load data on demand, optionally filtering and preprocessing at the loading stage. This can save memory and processing time, but is much more complex.

This module is now mostly unused: read.read_sqlite is used instead. It wraps the .raw() method and reads all data into memory.

Database format

When reading data, a table name must be specified (which allows multiple datasets to be put in one file). Table column names map to dataframe column names, with some standard processing applied (for example, the ‘time’ column is converted to the index).
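As a rough illustration of this layout, the sketch below builds a small in-memory SQLite database with one table and reads it back using only the standard library. The table name `example_table`, the `user` and `value` columns, and the assumption that `time` holds Unix seconds are illustrative choices, not something this page specifies. niimpy itself returns a pandas DataFrame; a plain dictionary keyed by timestamp stands in for it here.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical table and column names; only 'time' is mentioned in the text above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_table (time REAL, user TEXT, value REAL)")
conn.executemany(
    "INSERT INTO example_table VALUES (?, ?, ?)",
    [(0.0, "u1", 1.5), (3600.0, "u1", 2.5), (7200.0, "u2", 4.0)],
)

# Select one table by name; column names come from the cursor description.
cursor = conn.execute("SELECT * FROM example_table ORDER BY time")
columns = [col[0] for col in cursor.description]

# Assumption: 'time' is Unix seconds. Use it as the key (the "index").
records = {}
for row in cursor:
    record = dict(zip(columns, row))
    timestamp = datetime.fromtimestamp(record.pop("time"), tz=timezone.utc)
    records[timestamp] = record
```

Each remaining column keeps its table name, mirroring the column-name mapping described above.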

Quick usage

db = database.open(FILE_NAME, tz=TZ)
df = db.raw(TABLE_NAME, user=database.ALL)

Recommended usage:

df = niimpy.read_sqlite(FILE_NAME, TABLE_NAME, tz=TZ)

See also

niimpy.reading.read_*: currently recommended functions to access all types of data, including databases.

class niimpy.reading.database.ALL[source]

Bases: object

Sentinel value for all users

class niimpy.reading.database.Data1(db, tz=None)[source]

Bases: object

Database wrapper for niimpy data.

This opens a database and provides methods to do common operations.

Methods

count(*args, **kwargs)

Return the number of rows

execute(*args, **kwargs)

Execute raw SQL code.

exists(*args, **kwargs)

Returns True if any data exists

first(table, user[, start, end, offset, ...])

Return earliest data point.

get_survey_score(table, user, survey[, ...])

Get the survey results, summing scores.

last(*args, **kwargs)

Return the latest timestamp.

raw(table, user[, limit, offset, start, end])

Read all data in a table and return it as a DataFrame.

tables()

List all tables that are inside of this database.

user_table_counts()

Return table of number of data points per user, per table.

users([table])

Return set of all users in all tables

validate_username(user)

Validate a username, for single/multiuser database and so on.

hourly

occurrence

timestamps

count(*args, **kwargs)[source]

Return the number of rows

See first() for more information.

execute(*args, **kwargs)[source]

Execute raw SQL code.

This simply proxies all arguments to self.conn.execute() as a convenience shortcut.

exists(*args, **kwargs)[source]

Returns True if any data exists

Follows the same syntax as .first(), .last(), and .count(), but the limit argument is not used.

first(table, user, start=None, end=None, offset=None, _aggregate='min', _limit=None)[source]

Return earliest data point.

Return None if there is no data.
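The `_aggregate='min'` default suggests that first() and the related last(), count() and exists() methods share one query that differs only in the SQL aggregate. The sketch below shows that idea with the standard library; it is not niimpy's implementation. The helper name `aggregate_time`, the `steps` table, and the inclusive treatment of `start` and `end` are all assumptions.

```python
import sqlite3


def aggregate_time(conn, table, user=None, start=None, end=None, aggregate="min"):
    """Hypothetical helper: apply an SQL aggregate to the 'time' column."""
    if aggregate not in {"min", "max", "count"}:
        raise ValueError(f"unsupported aggregate: {aggregate}")
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    where, params = [], []
    if user is not None:
        where.append("user = ?")
        params.append(user)
    if start is not None:
        where.append("time >= ?")  # assumption: bounds are inclusive
        params.append(start)
    if end is not None:
        where.append("time <= ?")
        params.append(end)
    sql = f"SELECT {aggregate}(time) FROM {quoted}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return conn.execute(sql, params).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE steps (time REAL, user TEXT)")
conn.executemany(
    "INSERT INTO steps VALUES (?, ?)",
    [(10.0, "u1"), (20.0, "u1"), (5.0, "u2")],
)
earliest = aggregate_time(conn, "steps", user="u1")
missing = aggregate_time(conn, "steps", user="nobody")
```

SQLite's `min()` over zero rows yields NULL, which Python receives as `None`, matching the "no data" behaviour described above.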

get_survey_score(table, user, survey, limit=None, start=None, end=None)[source]

Get the survey results, summing scores.

survey: The survey prefix in the ‘id’ column, e.g. ‘PHQ9’. An ‘_’ is appended.
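One way the prefix matching and summing could look is sketched below. This is an assumption-laden illustration rather than niimpy's code: the helper `sum_survey_answers`, the `surveys` table, and the `answer` column are hypothetical, and the real method may group or filter results differently.

```python
import sqlite3


def sum_survey_answers(conn, table, user, survey):
    """Hypothetical helper: sum answers whose id starts with '<survey>_'."""
    prefix = survey + "_"
    quoted = '"' + table.replace('"', '""') + '"'
    # substr() is used instead of LIKE because '_' is a wildcard in LIKE.
    sql = (
        f"SELECT sum(answer) FROM {quoted} "
        "WHERE user = ? AND substr(id, 1, ?) = ?"
    )
    return conn.execute(sql, (user, len(prefix), prefix)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (time REAL, user TEXT, id TEXT, answer INTEGER)")
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?, ?)",
    [
        (1.0, "u1", "PHQ9_1", 2),
        (1.0, "u1", "PHQ9_2", 3),
        (1.0, "u1", "GAD7_1", 1),
        (1.0, "u2", "PHQ9_1", 4),
    ],
)
score = sum_survey_answers(conn, "surveys", "u1", "PHQ9")
```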

hourly(table, user, columns=[], limit=None, offset=None, start=None, end=None)[source]

last(*args, **kwargs)[source]

Return the latest timestamp.

See first() for more information.

occurrence(table, user, bin_width=720, limit=None, offset=None, start=None, end=None)[source]

raw(table, user, limit=None, offset=None, start=None, end=None)[source]

Read all data in a table and return it as a DataFrame.

This reads all data (subject to several possible filters) and returns it as a DataFrame.

tables()[source]

List all tables that are inside of this database.

Returns a set.
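For orientation, listing the tables of an SQLite file usually comes down to a query against the built-in `sqlite_master` catalogue. The sketch below shows that approach; whether tables() does exactly this is an assumption, and the helper name `list_tables` and the example table names are hypothetical.

```python
import sqlite3


def list_tables(conn):
    """Hypothetical helper: return the names of all tables as a set."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {name for (name,) in rows}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE battery (time REAL, user TEXT)")
conn.execute("CREATE TABLE screen (time REAL, user TEXT)")
table_names = list_tables(conn)
```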

timestamps(table, user, limit=None, offset=None, start=None, end=None)[source]

user_table_counts()[source]

Return table of number of data points per user, per table.

Returns a DataFrame with one row per table and one column per user; each value is the number of data points for that user in that table.
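The shape of that result can be sketched without pandas as a nested dictionary, `counts[table][user]`. The helper `count_rows_per_user` and the example tables are hypothetical, and the sketch assumes every table has a `user` column.

```python
import sqlite3


def count_rows_per_user(conn, tables):
    """Hypothetical helper: map table name -> {user: number of rows}."""
    counts = {}
    for table in tables:
        quoted = '"' + table.replace('"', '""') + '"'
        rows = conn.execute(f"SELECT user, count(*) FROM {quoted} GROUP BY user")
        counts[table] = dict(rows)  # each row is a (user, count) pair
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE battery (time REAL, user TEXT)")
conn.execute("CREATE TABLE screen (time REAL, user TEXT)")
conn.executemany("INSERT INTO battery VALUES (?, ?)", [(1.0, "u1"), (2.0, "u1"), (3.0, "u2")])
conn.executemany("INSERT INTO screen VALUES (?, ?)", [(1.0, "u2")])
counts = count_rows_per_user(conn, ["battery", "screen"])
```

A user with no rows in a table is simply absent from that table's inner dictionary; a DataFrame would show a missing value in the same cell.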

users(table=None)[source]

Return set of all users in all tables

validate_username(user)[source]

Validate a username, for single/multiuser database and so on.

This function considers if the database is single or multi-user, and ensures a valid username or ALL.

It returns a valid username, so it can be used as a wrapper that handles future special cases, e.g.:

user = db.validate_username(user)

niimpy.reading.database.open(db, tz=None)[source]

Open a database and return a Data1 object

class niimpy.reading.database.sqlite3_stdev[source]

Bases: object

Sqlite sample standard deviation function in pure Python.

With conn.create_aggregate("stdev", 1, sqlite3_stdev), this adds a stdev function to sqlite.

Edge cases:

  • Empty input returns nan (unlike the C function, which returns zero)

  • Ignores nan input values and does not count them (unlike numpy, which returns nan)

  • Ignores non-numeric types (no conversion)
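The step/finalize protocol and the edge cases above can be sketched as follows. This is written from the description only and is not niimpy's source: the class name `SampleStdev` and the `samples` table are hypothetical, and returning nan for a single value is an extra assumption, since the list above does not cover that case.

```python
import math
import sqlite3


class SampleStdev:
    """Hypothetical stand-in for sqlite3_stdev, written from the description above."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Ignore non-numeric inputs without converting them (bool is excluded too).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return
        # Ignore nan inputs so they do not affect the result or the count.
        if math.isnan(value):
            return
        self.values.append(float(value))

    def finalize(self):
        n = len(self.values)
        # Empty input gives nan; treating a single value the same way is an assumption.
        if n < 2:
            return float("nan")
        mean = sum(self.values) / n
        variance = sum((v - mean) ** 2 for v in self.values) / (n - 1)
        return math.sqrt(variance)


conn = sqlite3.connect(":memory:")
conn.create_aggregate("stdev", 1, SampleStdev)
conn.execute("CREATE TABLE samples (x)")
conn.executemany(
    "INSERT INTO samples VALUES (?)",
    [(2,), (4,), (4,), (4,), (5,), (5,), (7,), (9,), ("text",), (None,)],
)
(result,) = conn.execute("SELECT stdev(x) FROM samples").fetchone()
```

The text and NULL rows are skipped by `step`, so only the eight numeric values contribute to `result`.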

Methods

finalize

step

finalize()[source]

step(value)[source]