niimpy.exploration.eda.missingness module

This module is a rewrite based on the missingno package. The original source is available at https://github.com/ResidentMario/missingno

niimpy.exploration.eda.missingness.bar(df, columns=None, title='Data frequency', xaxis_title='', yaxis_title='', sampling_freq=None, sampling_method='mean')

Display a bar chart visualization of the nullity of the given DataFrame.

Parameters
df: pandas DataFrame

DataFrame to plot

columns: list, optional

Columns of the input DataFrame to check for missingness. If None, all columns are used.

title: str

Figure’s title

xaxis_title: str, optional

x-axis label

yaxis_title: str, optional

y-axis label

sampling_freq: str, optional

Frequency at which to resample the data. Requires the DataFrame to have a datetime-like index. Possible values: ‘H’ (hourly), ‘T’ (minutely)

sampling_method: str, optional

Resampling method. Possible values: ‘sum’, ‘mean’. Default value is ‘mean’.

Returns
fig: Plotly figure.
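To make the sampling_freq and sampling_method parameters more concrete, here is a minimal pandas-only sketch of the kind of quantity a nullity bar chart summarizes: the share of non-null values per column, overall and per hour. The toy data, the column names (battery, steps), and the helper plot_bar are hypothetical, and the aggregation shown is one plausible reading of the parameters, not necessarily what niimpy computes internally.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 8 samples at 15-minute spacing, two sensor columns.
index = pd.date_range("2024-01-01 00:00", periods=8, freq=pd.Timedelta(minutes=15))
df = pd.DataFrame(
    {
        "battery": [90.0, np.nan, 88.0, 87.0, np.nan, np.nan, 84.0, 83.0],
        "steps": [10.0, 12.0, np.nan, 9.0, 11.0, 8.0, 7.0, np.nan],
    },
    index=index,
)

# 1 where a value is present, 0 where it is missing.
present = df.notna().astype(int)

# Fraction of non-null values per column over the whole frame.
completeness = present.mean()

# The same indicator aggregated per hour, as a mean and as a sum.
hourly_mean = present.resample(pd.Timedelta(hours=1)).mean()
hourly_sum = present.resample(pd.Timedelta(hours=1)).sum()


def plot_bar(frame):
    """Call bar() with the signature documented above (requires niimpy)."""
    from niimpy.exploration.eda import missingness

    return missingness.bar(
        frame,
        columns=["battery", "steps"],
        sampling_freq="H",
        sampling_method="mean",
    )
```

Calling plot_bar(df) should return the Plotly figure described under Returns; the helper is left uncalled here so that the sketch runs without niimpy installed.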
niimpy.exploration.eda.missingness.bar_count(df, columns=None, title='Data frequency', xaxis_title='', yaxis_title='', sampling_freq='H')

Display a bar chart visualization of the nullity of the given DataFrame.

Parameters
df: pandas DataFrame

DataFrame to plot

columns: list, optional

Columns of the input DataFrame to check for missingness. If None, all columns are used.

title: str

Figure’s title

xaxis_title: str, optional

x-axis label

yaxis_title: str, optional

y-axis label

sampling_freq: str, optional

Frequency at which to resample the data. Requires the DataFrame to have a datetime-like index. Possible values: ‘H’ (hourly), ‘T’ (minutely)

Returns
fig: Plotly figure.
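Missingness in event-style data often shows up as absent rows rather than NaN values. The sketch below, using plain pandas, counts observations per hour so that empty hours appear as zeros. The event log, the screen column, and the helper plot_bar_count are hypothetical; this illustrates the idea of a per-period count and is not taken from niimpy's implementation.

```python
import pandas as pd

# Hypothetical event log: a row exists only when the sensor reported.
timestamps = pd.to_datetime(
    [
        "2024-01-01 00:05",
        "2024-01-01 00:40",
        "2024-01-01 02:10",
        "2024-01-01 02:15",
        "2024-01-01 02:50",
    ]
)
events = pd.DataFrame({"screen": [1, 0, 1, 0, 1]}, index=timestamps)

# Number of non-null observations in each one-hour bin.
hourly_counts = events.resample(pd.Timedelta(hours=1)).count()

# Hours in which nothing was recorded at all.
empty_hours = hourly_counts.index[hourly_counts["screen"] == 0]


def plot_bar_count(frame):
    """Call bar_count() with the signature documented above (requires niimpy)."""
    from niimpy.exploration.eda import missingness

    return missingness.bar_count(frame, columns=["screen"], sampling_freq="H")
```

Here the hour starting at 01:00 contains no rows, so it receives a count of zero even though the original frame has no NaN anywhere.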
niimpy.exploration.eda.missingness.heatmap(df, height=800, width=800, title='', xaxis_title='', yaxis_title='')

Return a Plotly heatmap visualization of the nullity correlation of the DataFrame.

Parameters
df: pandas DataFrame

DataFrame to plot

width: int

Figure’s width

height: int

Figure’s height

Returns
fig: Plotly figure.
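Nullity correlation measures how strongly the presence or absence of one column tracks that of another. As a rough illustration, the sketch below follows the approach used in missingno (correlating the boolean null masks after dropping columns whose mask never changes); whether niimpy applies exactly the same steps is an assumption. The data, column names, and the helper plot_heatmap are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0, np.nan, 5.0, 6.0],
        # "b" is missing exactly when "a" is missing.
        "b": [1.0, np.nan, 3.0, np.nan, 5.0, 6.0],
        # "c" is present exactly when "a" is missing.
        "c": [np.nan, 2.0, np.nan, 4.0, np.nan, np.nan],
        # "d" is never missing, so its null mask has no variance.
        "d": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    }
)

nullity = df.isnull()

# Columns that are always present (or always missing) carry no correlation signal.
varying = nullity.loc[:, nullity.nunique() > 1]

# Pearson correlation between the 0/1 null masks.
nullity_corr = varying.astype(float).corr()


def plot_heatmap(frame):
    """Call heatmap() with the signature documented above (requires niimpy)."""
    from niimpy.exploration.eda import missingness

    return missingness.heatmap(frame, height=600, width=600)
```

A value near 1 means two columns tend to be missing together, and a value near -1 means one tends to be present when the other is missing.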
niimpy.exploration.eda.missingness.matrix(df, height=500, title='Data frequency', xaxis_title='', yaxis_title='', sampling_freq=None, sampling_method='mean')

Return a matrix visualization of the nullity of the data. For now, this function assumes that the DataFrame has a datetime index.

Parameters
df: pandas DataFrame

DataFrame to plot

columns: list, optional

Columns of the input DataFrame to check for missingness. If None, all columns are used.

title: str

Figure’s title

xaxis_title: str, optional

x-axis label

yaxis_title: str, optional

y-axis label

sampling_freq: str, optional

Frequency at which to resample the data. Requires the DataFrame to have a datetime-like index. Possible values: ‘H’ (hourly), ‘T’ (minutely)

sampling_method: str, optional

Resampling method. Possible values: ‘sum’, ‘mean’. Default value is ‘mean’.

Returns
fig: Plotly figure.
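Because matrix() assumes a datetime index, data that stores time in an ordinary column needs converting first. The following sketch shows one way to do that with pandas and then builds the 0/1 presence table that a nullity matrix conceptually displays. The time column holding epoch seconds, the other column names, and the helper plot_matrix are hypothetical assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw export with timestamps stored as seconds since the epoch.
raw = pd.DataFrame(
    {
        "time": [1704067200, 1704070800, 1704074400],
        "battery": [90.0, np.nan, 88.0],
        "steps": [np.nan, 12.0, 9.0],
    }
)

# Promote the timestamps to a DatetimeIndex and drop the original column.
df = raw.set_index(pd.to_datetime(raw["time"], unit="s")).drop(columns="time")

# One row per timestamp, one column per variable: 1 = present, 0 = missing.
nullity_matrix = df.notna().astype(int)


def plot_matrix(frame):
    """Call matrix() with the signature documented above (requires niimpy)."""
    from niimpy.exploration.eda import missingness

    if not isinstance(frame.index, pd.DatetimeIndex):
        raise TypeError("matrix() is documented as expecting a datetime index")
    return missingness.matrix(
        frame, height=500, sampling_freq="H", sampling_method="mean"
    )
```

The explicit index check in plot_matrix is a defensive choice of this sketch, not something the documented function is known to do.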