API reference¶

fairness.data¶

Data loading utilities used by the fairness toolkit.

This module loads tabular datasets into a pandas DataFrame and preserves row order and indices, so that downstream steps can guarantee alignment between:

model predictions (y_pred)
true labels (y_test)
protected attributes used to construct intersectional groups

Dataset-specific logic (e.g., mapping target labels, binning ages, cleaning special missing-value encodings such as '?') should live in small adapter functions.

Typical usage

from fairness.data import load_csv, load_features_and_target
df = load_csv("data/heart.csv")
X, y = load_features_and_target(df, target_col="HeartDisease")

`load_csv` ¶

Load a CSV file into a pandas DataFrame.

The CSV may be provided either as a local file path or as a URL (e.g. an HTTP(S) link to a raw CSV file).

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	Path or URL to the CSV file.	required
`index_col`	`Optional[Union[int, str]]`	Column to use as the row index (passed to pandas.read_csv). If None, pandas uses a default integer index.	`None`
`na_values`	`Optional[Union[str, Sequence[str]]]`	Additional strings to recognise as NA/NaN.	`None`

Returns:

Type	Description
`DataFrame`	The dataset as a DataFrame.

Raises:

Type	Description
`FileNotFoundError`	If a local file path does not exist.
`ValueError`	If the loaded CSV is empty.
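
The behaviour described above can be approximated with plain pandas. The following is a minimal sketch, not the toolkit's implementation: `load_csv_sketch` is a hypothetical name, and the in-memory CSV stands in for a real file. It shows how `na_values` turns a special missing-value encoding such as '?' into NaN and how an empty result can be rejected.

```python
import io

import pandas as pd


def load_csv_sketch(path, index_col=None, na_values=None):
    """Hypothetical stand-in for load_csv: read a CSV and reject empty results."""
    # pandas.read_csv accepts local paths, URLs, and file-like objects.
    df = pd.read_csv(path, index_col=index_col, na_values=na_values)
    if df.empty:
        raise ValueError("The loaded CSV is empty.")
    return df


# In-memory CSV so the example runs without a file on disk.
raw = io.StringIO("Age,Sex,Cholesterol\n40,M,289\n49,F,?\n")
df = load_csv_sketch(raw, na_values="?")
```

With `index_col=None`, pandas assigns the default integer index, so the first data row is row 0.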

`load_features_and_target` ¶

Split a DataFrame into features X and target y.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Full dataset containing both features and target.	required
`target_col`	`str`	Name of the target column.	required
`drop_cols`	`Sequence[str]`	Additional columns to drop from X (e.g., derived protected attributes used only for fairness analysis such as 'age_group').	`()`

Returns:

Type	Description
`tuple[DataFrame, Series]`	A pair (X, y): X is a DataFrame of features, y is a Series of labels.

Raises:

Type	Description
`ValueError`	If target_col is not in df, or if resulting X is empty.
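
As a rough illustration of the split, the sketch below drops the target and any extra columns from X and returns the target as y. `split_features_target` is a hypothetical name, and treating "X is empty" as "no feature columns remain" is an assumption about the real check.

```python
import pandas as pd


def split_features_target(df, target_col, drop_cols=()):
    """Hypothetical stand-in for load_features_and_target."""
    if target_col not in df.columns:
        raise ValueError(f"Target column {target_col!r} not found.")
    X = df.drop(columns=[target_col, *drop_cols])
    # Assumption: "empty X" means that no feature columns are left.
    if X.shape[1] == 0:
        raise ValueError("No feature columns left after dropping.")
    return X, df[target_col]


df = pd.DataFrame(
    {"Age": [40, 61], "age_group": ["young", "older"], "HeartDisease": [0, 1]},
    index=[10, 11],
)
X, y = split_features_target(df, "HeartDisease", drop_cols=("age_group",))
```

X and y share the index of the input, which is what keeps predictions, labels, and protected attributes aligned later on.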

`load_heart_csv` ¶

Load the Heart Disease CSV used in the tutorial.

This is a thin wrapper around load_csv() that also validates that the expected target column is present.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	Path to heart.csv.	required
`target_col`	`str`	Expected target column name (used for validation).	`'HeartDisease'`

Returns:

Type	Description
`DataFrame`	Loaded dataset.

Raises:

Type	Description
`ValueError`	If the expected target column is missing.

`validate_columns` ¶

Validate that required columns exist in the DataFrame.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Input DataFrame.	required
`required`	`Iterable[str]`	Column names that must be present.	required

Raises:

Type	Description
`ValueError`	If any required column is missing.

fairness.preprocess¶

Preprocessing utilities for tabular datasets used in fairness analysis.

This module includes:

- feature engineering (e.g., binning age into age_group)
- converting raw tabular data into numeric features suitable for ML
- producing reproducible train/test splits while preserving indices

Design notes

The toolkit is model-agnostic: these functions do not require sklearn pipelines, but they produce outputs compatible with sklearn and similar libraries.
Protected attributes may be used for fairness analysis even if they are excluded from model training. Derived protected attributes (e.g. age_group) are excluded from model inputs.

Typical usage

from fairness.data import load_csv
from fairness.preprocess import add_age_group, preprocess_tabular, make_train_test_split
df = load_csv("data/heart.csv")
df = add_age_group(df)
df_model = preprocess_tabular(df)
split = make_train_test_split(df_model, target_col="HeartDisease", drop_cols=("age_group",))

`SplitData` `dataclass` ¶

Container for a reproducible train/test split.

Attributes:

Name	Type	Description
`X_train, X_test`		Feature matrices for training and testing.
`y_train, y_test`		Target vectors for training and testing.

`add_age_group` ¶

Add a categorical age-group column derived from a continuous age column.

This is useful for fairness analysis because continuous protected attributes (like age) create too many groups; binning yields interpretable groups.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Input dataset.	required
`age_col`	`str`	Name of the column containing numeric ages.	`'Age'`
`new_col`	`str`	Name of the derived categorical column to create.	`'age_group'`
`bins`	`Sequence[float]`	Bin edges passed to pandas.cut.	`(0, 55, 120)`
`labels`	`Sequence[str]`	Labels assigned to the bins.	`('young', 'older')`

Returns:

Type	Description
`DataFrame`	Copy of df with the new categorical column added.

Raises:

Type	Description
`ValueError`	If age_col is missing or binning produces missing values.
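
To make the default bin edges concrete, here is a minimal sketch built directly on pandas.cut. `add_age_group_sketch` is a hypothetical name and the real function may differ in detail. The point to notice is that pandas.cut uses right-closed intervals by default, so with edges (0, 55, 120) an age of exactly 55 falls into the first bin.

```python
import pandas as pd


def add_age_group_sketch(df, age_col="Age", new_col="age_group",
                         bins=(0, 55, 120), labels=("young", "older")):
    """Hypothetical stand-in for add_age_group, built on pandas.cut."""
    if age_col not in df.columns:
        raise ValueError(f"Column {age_col!r} not found.")
    out = df.copy()
    # pandas.cut uses right-closed intervals by default: (0, 55] and (55, 120].
    out[new_col] = pd.cut(out[age_col], bins=list(bins), labels=list(labels))
    if out[new_col].isna().any():
        raise ValueError("Some ages fall outside the bin edges.")
    return out


df = pd.DataFrame({"Age": [40, 55, 56, 77]})
grouped = add_age_group_sketch(df)
```

Because the function works on a copy, the input DataFrame is left without the new column.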

`apply_transforms` ¶

Apply a sequence of DataFrame -> DataFrame transforms in order.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Input dataset.	required
`transforms`	`Sequence[Callable[[DataFrame], DataFrame]]`	Sequence of callables each returning a modified DataFrame.	required

Returns:

Type	Description
`DataFrame`	Transformed DataFrame.
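
Applying transforms in order amounts to a left fold over the sequence. The sketch below shows one way to express this; `apply_transforms_sketch`, `drop_missing`, and `add_is_senior` are hypothetical names used only for illustration.

```python
from functools import reduce

import pandas as pd


def apply_transforms_sketch(df, transforms):
    """Hypothetical stand-in: feed each transform the previous result."""
    return reduce(lambda acc, fn: fn(acc), transforms, df)


def drop_missing(df):
    # dropna keeps the original index labels of the surviving rows.
    return df.dropna()


def add_is_senior(df):
    # assign returns a new DataFrame, leaving the input untouched.
    return df.assign(is_senior=df["Age"] > 55)


df = pd.DataFrame({"Age": [40.0, None, 70.0]})
result = apply_transforms_sketch(df, [drop_missing, add_is_senior])
```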

`make_train_test_split` ¶

Create a reproducible train/test split for modelling.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Preprocessed dataset containing features and target.	required
`target_col`	`str`	Name of the target column.	required
`drop_cols`	`Sequence[str]`	Additional columns to exclude from X (e.g. derived protected attributes).	`()`
`test_size`	`float`	Fraction of rows assigned to the test set.	`0.3`
`random_state`	`int`	Random seed for reproducibility.	`42`
`stratify`	`bool`	If True, stratify split by the target to preserve class balance.	`True`

Returns:

Type	Description
`SplitData`	Container holding X_train, X_test, y_train, y_test.

Raises:

Type	Description
`ValueError`	If target_col is missing or df is empty.
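
The sketch below illustrates what a reproducible, stratified, index-preserving split looks like using pandas alone. It is not the toolkit's implementation (which may well delegate to another library), and `stratified_split_sketch` is a hypothetical name. The same fraction of rows is sampled from each target class, and the original index labels survive in both parts.

```python
import pandas as pd


def stratified_split_sketch(df, target_col, test_size=0.3, random_state=42):
    """Hypothetical index-preserving stratified split (not the toolkit's code)."""
    # Sample the same fraction from every target class for the test set.
    test = df.groupby(target_col).sample(frac=test_size, random_state=random_state)
    # Everything that was not sampled becomes the training set.
    train = df.drop(index=test.index)
    return train, test


df = pd.DataFrame(
    {"Age": range(30, 50), "HeartDisease": [0, 1] * 10},
    index=range(100, 120),
)
train, test = stratified_split_sketch(df, "HeartDisease")
```

Since the test rows keep their original labels, `df.loc[test.index]` recovers any columns (such as protected attributes) that were dropped before modelling.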

`map_binary_column` ¶

Map values of a binary/categorical column to new values (e.g., 'M'/'F' -> 1/0).

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Input dataset.	required
`col`	`str`	Column name to map.	required
`mapping`	`Mapping[object, object]`	Dictionary defining how to map values.	required
`strict`	`bool`	If True, raise a ValueError when the column contains values that are not keys of mapping. If False, leave such values as-is.	`True`

Returns:

Type	Description
`DataFrame`	Copy of df with mapped column.

Raises:

Type	Description
`ValueError`	If strict=True and unmapped values are found.
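
The difference between the strict and non-strict modes can be sketched as follows. `map_column_sketch` is a hypothetical name, and the way unmapped values are detected here is an assumption rather than the toolkit's code.

```python
import pandas as pd


def map_column_sketch(df, col, mapping, strict=True):
    """Hypothetical stand-in for map_binary_column."""
    out = df.copy()
    unmapped = [v for v in out[col].unique() if v not in mapping]
    if strict and unmapped:
        raise ValueError(f"Unmapped values in {col!r}: {unmapped}")
    # Values without a mapping entry pass through unchanged.
    out[col] = out[col].map(lambda v: mapping.get(v, v))
    return out


df = pd.DataFrame({"Sex": ["M", "F", "M"]})
mapped = map_column_sketch(df, "Sex", {"M": 1, "F": 0})
lenient = map_column_sketch(df, "Sex", {"M": 1}, strict=False)
```

With `strict=False` and a mapping that only covers 'M', the 'F' entries stay as they are.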

`preprocess_tabular` ¶

Convert a tabular DataFrame into numeric ML-ready features.

Performs one-hot encoding for categorical columns (object/category) and leaves numeric columns unchanged.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Input dataset.	required
`drop_cols`	`Sequence[str]`	Columns to drop prior to encoding.	`()`
`one_hot`	`bool`	Whether to one-hot encode categorical columns.	`True`
`drop_first`	`bool`	If one_hot=True, drop the first level for each categorical variable to avoid perfect multicollinearity in logistic regression models.	`True`

Returns:

Type	Description
`DataFrame`	A numeric DataFrame compatible with scikit-learn.
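
The effect of `drop_first` is easiest to see on a small example. The snippet below uses pandas.get_dummies directly, on the assumption that preprocess_tabular behaves similarly; the toolkit's actual encoding step is not shown in this reference.

```python
import pandas as pd

df = pd.DataFrame({"Age": [40, 61, 58], "Sex": ["M", "F", "M"]})

# get_dummies encodes string/categorical columns and passes numeric ones through.
# drop_first=True removes the first level ("F"), leaving a single Sex_M column.
encoded = pd.get_dummies(df, drop_first=True)
```

A two-level column therefore becomes one indicator column instead of two perfectly collinear ones.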

fairness.groups¶

Minimal utilities for constructing intersectional group labels and producing an evaluation DataFrame aligned with model predictions and true labels.

Primary output: a tidy DataFrame with columns:

- subject_label (intersectional group label per individual)
- y_pred (model prediction)
- y_true (true label)

`make_eval_df` ¶

Build an evaluation DataFrame for group-based metric functions.

The result is the handoff format for metrics such as accuracy_diff, which take three aligned lists:

subject_labels = eval_df[label_col].tolist()
predictions    = eval_df["y_pred"].tolist()
true_statuses  = eval_df["y_true"].tolist()

Parameters:

Name	Type	Description	Default
`df_test`	`DataFrame`	Test-set DataFrame in the SAME row order as y_pred and y_true (typically df.loc[split.X_test.index]).	required
`protected`	`Sequence[str]`	Protected columns used to define intersectional groups.	required
`y_pred`	`Sequence`	Model predictions aligned to df_test rows.	required
`y_true`	`Sequence`	True labels aligned to df_test rows.	required
`label_col`	`str`	Name of the intersectional label column.	`'subject_label'`

Returns:

Type	Description
`DataFrame`	Columns: the label column (named by label_col, subject_label by default), y_pred, and y_true. The index of df_test is preserved.
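
The alignment requirement can be illustrated with a small hand-built example. This is a sketch of the output shape only; the label construction shown here is a hypothetical shortcut, not the toolkit's code.

```python
import pandas as pd

# Test rows keep their original index labels (here 7, 3, 9) after the split.
df_test = pd.DataFrame(
    {"Sex": [1, 0, 1], "age_group": ["older", "young", "young"]},
    index=[7, 3, 9],
)
y_pred = [1, 0, 0]
y_true = [1, 0, 1]

# Hypothetical label construction; any per-row label list works here.
labels = [
    f"Sex={sex}|age_group={group}"
    for sex, group in zip(df_test["Sex"], df_test["age_group"])
]

eval_df = pd.DataFrame(
    {"subject_label": labels, "y_pred": y_pred, "y_true": y_true},
    index=df_test.index,  # reuse the test index so rows stay traceable
)
```

Because y_pred and y_true are matched to rows by position, they must be in the same order as df_test before the frame is built.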

`make_intersectional_labels` ¶

Create an intersectional group label for each row of df.

Example: Sex=1|age_group=older

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	DataFrame containing protected columns.	required
`protected`	`Sequence[str]`	Column names to intersect (order defines label format).	required
`sep`	`str`	Separator placed between the per-column entries of a label.	`'\|'`
`kv_sep`	`str`	Separator placed between a column name and its value within an entry.	`'\|'`
`missing`	`str`	Placeholder for missing values.	`'NA'`

Returns:

Type	Description
`list[str]`	One label per row, aligned with df.
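
A minimal sketch of the label format follows. `intersectional_labels_sketch` is a hypothetical name, and the separator defaults used here ('|' between entries, '=' between name and value) are assumptions chosen to reproduce the example label Sex=1|age_group=older.

```python
import pandas as pd


def intersectional_labels_sketch(df, protected, sep="|", kv_sep="=", missing="NA"):
    """Hypothetical stand-in for make_intersectional_labels."""
    columns = [df[col].tolist() for col in protected]
    labels = []
    for values in zip(*columns):
        # One "name<kv_sep>value" entry per protected column, in the given order.
        parts = [
            f"{col}{kv_sep}{missing if pd.isna(val) else val}"
            for col, val in zip(protected, values)
        ]
        labels.append(sep.join(parts))
    return labels


df = pd.DataFrame({"Sex": [1, 0], "age_group": ["older", None]})
labels = intersectional_labels_sketch(df, ["Sex", "age_group"])
```

The order of `protected` fixes the order of entries in every label, so the same list should be used wherever labels are compared.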

fairness.adapters¶

`make_subject_labels_dict` ¶

Build the dict-of-lists format expected by intersect_* functions.

Parameters:

Name	Type	Description	Default
`df_test`	`DataFrame`	Test-set DataFrame containing the protected columns.	required
`protected_cols`	`list[str]`	Names of the protected columns to include, e.g. ["Sex", "age_group"].	required

Returns:

Type	Description
`dict[str, list]`	{col: list_of_values_aligned_rowwise_with_eval_df}

`unpack_eval_df` ¶

Convert eval_df into the list inputs expected by group_* metric functions.

Expects eval_df columns:

- subject_label (str)
- y_pred (0/1)
- y_true (0/1)

Returns:

Name	Type	Description
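`subject_labels`	`list[str]`	Intersectional group label for each row of eval_df.
`predictions`	`list[int]`	Model predictions (0/1), taken from y_pred.
`true_statuses`	`list[int]`	True labels (0/1), taken from y_true.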

fairness.metrics¶

`all_intersect_accs` ¶

Calculate accuracies for all possible intersectional groups.

Computes accuracy for every combination of categories in the dataset (e.g., all age-group-gender combinations).

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`dict`	Dictionary mapping intersectional group names (formatted as "label1 + label2 + ...") to their respective accuracies.
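
The input and output shapes can be illustrated with a pure-Python sketch. `all_intersect_accs_sketch` is a hypothetical re-implementation under stated assumptions: combinations are enumerated as the Cartesian product of the labels observed in each category, and a combination with no observations maps to NaN.

```python
from itertools import product


def all_intersect_accs_sketch(subject_labels_dict, predictions, true_statuses):
    """Hypothetical stand-in: accuracy for every combination of category labels."""
    names = list(subject_labels_dict)
    levels = [sorted(set(subject_labels_dict[name])) for name in names]
    results = {}
    for combo in product(*levels):
        # Rows belonging to this combination of labels.
        rows = [
            i for i in range(len(predictions))
            if all(subject_labels_dict[n][i] == v for n, v in zip(names, combo))
        ]
        correct = sum(predictions[i] == true_statuses[i] for i in rows)
        key = " + ".join(str(v) for v in combo)
        # Assumption: an unobserved combination is reported as NaN.
        results[key] = correct / len(rows) if rows else float("nan")
    return results


subject_labels_dict = {
    "age": ["young", "young", "older", "older"],
    "sex": ["F", "M", "F", "F"],
}
accs = all_intersect_accs_sketch(
    subject_labels_dict,
    predictions=[True, False, True, False],
    true_statuses=[True, True, True, True],
)
```

Here the "older + M" combination never occurs in the data, so its entry is NaN rather than a number.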

`all_intersect_fdrs` ¶

Calculate false discovery rates for all possible intersectional groups.

Computes false discovery rate for every combination of categories in the dataset (e.g., all age-group-gender combinations).

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`dict`	Dictionary mapping intersectional group names (as strings with ' + ' separating categories) to their false discovery rates.

`all_intersect_fnrs` ¶

Calculate false negative rates for all possible intersectional groups.

Computes false negative rate for every combination of categories in the dataset (e.g., all age-group-gender combinations).

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`dict`	Dictionary mapping intersectional group names (as strings with ' + ' separating categories) to their false negative rates.

`all_intersect_fors` ¶

Calculate false omission rates for all possible intersectional groups.

Computes false omission rate for every combination of categories in the dataset (e.g., all age-group-gender combinations).

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`dict`	Dictionary mapping intersectional group names (as strings with ' + ' separating categories) to their false omission rates.

`all_intersect_fprs` ¶

Calculate false positive rates for all possible intersectional groups.

Computes false positive rate for every combination of categories in the dataset (e.g., all age-group-gender combinations).

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`dict`	Dictionary mapping intersectional group names (as strings with ' + ' separating categories) to their false positive rates.

`group_acc` ¶

Find the accuracy of a group with a specific label.

Parameters:

Name	Type	Description	Default
`group_label`	`str or int`	The label of the group for which the accuracy of the model should be evaluated.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The accuracy of the model in the specified group. Returns np.nan if the group has no observations.
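
As a worked illustration, the sketch below computes per-group accuracy in pure Python. `group_acc_sketch` is a hypothetical name, and it assumes subject_labels is a plain list aligned with predictions (the form returned by unpack_eval_df), which may differ from the container the real function accepts.

```python
def group_acc_sketch(group_label, subject_labels, predictions, true_statuses):
    """Hypothetical stand-in: accuracy restricted to one group's rows."""
    hits = [
        pred == truth
        for label, pred, truth in zip(subject_labels, predictions, true_statuses)
        if label == group_label
    ]
    # An empty group has no defined accuracy.
    return sum(hits) / len(hits) if hits else float("nan")


subject_labels = ["A", "A", "B", "A"]
predictions = [True, False, True, True]
true_statuses = [True, True, False, True]

acc_a = group_acc_sketch("A", subject_labels, predictions, true_statuses)
acc_b = group_acc_sketch("B", subject_labels, predictions, true_statuses)
acc_c = group_acc_sketch("C", subject_labels, predictions, true_statuses)
```

Group A has two correct predictions out of three, group B has none out of one, and group C has no rows at all.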

`group_acc_diff` ¶

Calculate the absolute difference in accuracy between two groups.

Parameters:

Name	Type	Description	Default
`group_a_label`	`str or int`	The label of the first group.	required
`group_b_label`	`str or int`	The label of the second group.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The absolute difference in accuracy between the two groups. Returns np.nan if either group has no observations.

`group_acc_ratio` ¶

Calculate the ratio of accuracies between two groups.

Computes the maximum of the two possible ratios (group A / group B and group B / group A) to ensure the ratio is always >= 1.

Parameters:

Name	Type	Description	Default
`group_a_label`	`str or int`	The label of the first group.	required
`group_b_label`	`str or int`	The label of the second group.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required
`natural_log`	`bool`	If True, return the natural logarithm of the ratio.	`True`

Returns:

Type	Description
`float`	The (log) ratio of accuracies between the two groups. Returns np.nan if either group has no observations or if either accuracy is 0.
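
Taking the larger of the two ratios makes the result independent of which group is called A. The sketch below shows the arithmetic on two already computed rates; `symmetric_ratio` is a hypothetical helper, not a function of the toolkit.

```python
import math


def symmetric_ratio(rate_a, rate_b, natural_log=True):
    """Hypothetical helper: max(a/b, b/a), optionally on the log scale."""
    if math.isnan(rate_a) or math.isnan(rate_b) or rate_a == 0 or rate_b == 0:
        return float("nan")
    ratio = max(rate_a / rate_b, rate_b / rate_a)
    return math.log(ratio) if natural_log else ratio


raw = symmetric_ratio(0.75, 0.5, natural_log=False)
logged = symmetric_ratio(0.5, 0.75)
equal = symmetric_ratio(0.8, 0.8)
undefined = symmetric_ratio(0.0, 0.5)
```

Since the raw ratio is always at least 1, its natural logarithm is always at least 0, and two groups with identical rates give exactly 0 on the log scale.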

`group_fdr` ¶

Find the false discovery rate of a group with a specific label.

Parameters:

Name	Type	Description	Default
`group_label`	`str or int`	The label of the group for which the false discovery rate of the model should be evaluated.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The false discovery rate of the model in the specified group. Returns np.nan if the group has no observations.
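
For reference, the four error rates used throughout this module have the standard confusion-matrix definitions: FDR = FP / (FP + TP), FNR = FN / (FN + TP), FOR = FN / (FN + TN), and FPR = FP / (FP + TN). The sketch below computes them for one group's rows; `confusion_rates` is a hypothetical helper, and returning NaN for a zero denominator is an assumption.

```python
def confusion_rates(predictions, true_statuses):
    """Hypothetical helper: FDR, FNR, FOR and FPR from boolean lists."""
    pairs = list(zip(predictions, true_statuses))
    tp = sum(p and t for p, t in pairs)
    fp = sum(p and not t for p, t in pairs)
    fn = sum(not p and t for p, t in pairs)
    tn = sum(not p and not t for p, t in pairs)

    def safe_div(num, den):
        # Assumption: an undefined rate (zero denominator) is reported as NaN.
        return num / den if den else float("nan")

    return {
        "fdr": safe_div(fp, fp + tp),  # false discoveries among predicted positives
        "fnr": safe_div(fn, fn + tp),  # misses among actual positives
        "for": safe_div(fn, fn + tn),  # misses among predicted negatives
        "fpr": safe_div(fp, fp + tn),  # false alarms among actual negatives
    }


rates = confusion_rates(
    predictions=[True, True, True, False, False, False, False, True],
    true_statuses=[True, True, False, False, False, True, False, False],
)
```

The example data contains 2 true positives, 2 false positives, 1 false negative, and 3 true negatives.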

`group_fdr_diff` ¶

Calculate the absolute difference in false discovery rate between two groups.

Parameters:

Name	Type	Description	Default
`group_a_label`	`str or int`	The label of the first group.	required
`group_b_label`	`str or int`	The label of the second group.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The absolute difference in false discovery rate between the two groups. Returns np.nan if either group has no observations.

`group_fdr_ratio` ¶

Calculate the ratio of false discovery rates between two groups.

Computes the maximum of the two possible ratios (group A / group B and group B / group A) to ensure the ratio is always >= 1.

Parameters:

Name	Type	Description	Default
`group_a_label`	`str or int`	The label of the first group.	required
`group_b_label`	`str or int`	The label of the second group.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required
`natural_log`	`bool`	If True, return the natural logarithm of the ratio.	`True`

Returns:

Type	Description
`float`	The (log) ratio of false discovery rates between the two groups. Returns np.nan if either group has no observations or if either false discovery rate is 0.

`group_fnr` ¶

Find the false negative rate of a group with a specific label.

Parameters:

Name	Type	Description	Default
`group_label`	`str or int`	The label of the group for which the false negative rate of the model should be evaluated.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The false negative rate of the model in the specified group. Returns np.nan if the group has no observations.

`group_fnr_diff` ¶

Calculate the absolute difference in false negative rate between two groups.

Parameters:

Name	Type	Description	Default
`group_a_label`	`str or int`	The label of the first group.	required
`group_b_label`	`str or int`	The label of the second group.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The absolute difference in false negative rate between the two groups. Returns np.nan if either group has no observations.

`group_fnr_ratio` ¶

Calculate the ratio of false negative rates between two groups.

Computes the maximum of the two possible ratios (group A / group B and group B / group A) to ensure the ratio is always >= 1.

Parameters:

Name	Type	Description	Default
`group_a_label`	`str or int`	The label of the first group.	required
`group_b_label`	`str or int`	The label of the second group.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required
`natural_log`	`bool`	If True, return the natural logarithm of the ratio.	`True`

Returns:

Type	Description
`float`	The (log) ratio of false negative rates between the two groups. Returns np.nan if either group has no observations or if either false negative rate is 0.

`group_for` ¶

Find the false omission rate of a group with a specific label.

Parameters:

Name	Type	Description	Default
`group_label`	`str or int`	The label of the group for which the false omission rate of the model should be evaluated.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The false omission rate of the model in the specified group. Returns np.nan if the group has no observations.

`group_for_diff` ¶

Calculate the absolute difference in false omission rate between two groups.

Parameters:

Name	Type	Description	Default
`group_a_label`	`str or int`	The label of the first group.	required
`group_b_label`	`str or int`	The label of the second group.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The absolute difference in false omission rate between the two groups. Returns np.nan if either group has no observations.

`group_for_ratio` ¶

Calculate the ratio of false omission rates between two groups.

Computes the maximum of the two possible ratios (group A / group B and group B / group A) to ensure the ratio is always >= 1.

Parameters:

Name	Type	Description	Default
`group_a_label`	`str or int`	The label of the first group.	required
`group_b_label`	`str or int`	The label of the second group.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required
`natural_log`	`bool`	If True, return the natural logarithm of the ratio.	`True`

Returns:

Type	Description
`float`	The (log) ratio of false omission rates between the two groups. Returns np.nan if either group has no observations or if either false omission rate is 0.

`group_fpr` ¶

Find the false positive rate of a group with a specific label.

Parameters:

Name	Type	Description	Default
`group_label`	`str or int`	The label of the group for which the false positive rate of the model should be evaluated.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The false positive rate of the model in the specified group. Returns np.nan if the group has no observations.

`group_fpr_diff` ¶

Calculate the absolute difference in false positive rate between two groups.

Parameters:

Name	Type	Description	Default
`group_a_label`	`str or int`	The label of the first group.	required
`group_b_label`	`str or int`	The label of the second group.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The absolute difference in false positive rate between the two groups. Returns np.nan if either group has no observations.

`group_fpr_ratio` ¶

Calculate the ratio of false positive rates between two groups.

Computes the maximum of the two possible ratios (group A / group B and group B / group A) to ensure the ratio is always >= 1.

Parameters:

Name	Type	Description	Default
`group_a_label`	`str or int`	The label of the first group.	required
`group_b_label`	`str or int`	The label of the second group.	required
`subject_labels`	`dict`	A dictionary containing subject labels for every observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required
`natural_log`	`bool`	If True, return the natural logarithm of the ratio.	`True`

Returns:

Type	Description
`float`	The (log) ratio of false positive rates between the two groups. Returns np.nan if either group has no observations or if either false positive rate is 0.

`intersect_acc` ¶

Calculate accuracy for an intersectional group.

An intersectional group is defined by membership in specific categories across multiple dimensions (e.g., specific age category and specific gender).

Parameters:

Name	Type	Description	Default
`group_labels_dict`	`dict`	Dictionary mapping category names to specific group labels that define the intersectional group (e.g., {'age': 'Older', 'gender': 'Female'}).	required
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The accuracy of the model in the specified intersectional group. Returns np.nan if the group has no observations.
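
Selecting an intersectional group means keeping only the rows that match every entry of group_labels_dict. The following pure-Python sketch shows that selection; `intersect_acc_sketch` is a hypothetical re-implementation, not the toolkit's code.

```python
def intersect_acc_sketch(group_labels_dict, subject_labels_dict, predictions, true_statuses):
    """Hypothetical stand-in: accuracy for rows matching every requested label."""
    rows = [
        i for i in range(len(predictions))
        if all(subject_labels_dict[name][i] == label
               for name, label in group_labels_dict.items())
    ]
    if not rows:
        return float("nan")
    return sum(predictions[i] == true_statuses[i] for i in rows) / len(rows)


subject_labels_dict = {
    "age": ["Older", "Older", "Younger", "Older"],
    "gender": ["Female", "Male", "Female", "Female"],
}
acc = intersect_acc_sketch(
    {"age": "Older", "gender": "Female"},
    subject_labels_dict,
    predictions=[True, True, False, False],
    true_statuses=[True, False, False, True],
)
```

Only the first and last observations are both 'Older' and 'Female'; one of the two is predicted correctly.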

`intersect_fdr` ¶

Calculate false discovery rate for an intersectional group.

An intersectional group is defined by membership in specific categories across multiple dimensions (e.g., specific age category and specific gender).

Parameters:

Name	Type	Description	Default
`group_labels_dict`	`dict`	Dictionary mapping category names to specific group labels that define the intersectional group (e.g., {'age': 'Older', 'gender': 'Female'}).	required
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The false discovery rate of the model in the specified intersectional group. Returns np.nan if the group has no observations.

`intersect_fnr` ¶

Calculate false negative rate for an intersectional group.

An intersectional group is defined by membership in specific categories across multiple dimensions (e.g., specific age category and specific gender).

Parameters:

Name	Type	Description	Default
`group_labels_dict`	`dict`	Dictionary mapping category names to specific group labels that define the intersectional group (e.g., {'age': 'Older', 'gender': 'Female'}).	required
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The false negative rate of the model in the specified intersectional group. Returns np.nan if the group has no observations.

`intersect_for` ¶

Calculate false omission rate for an intersectional group.

An intersectional group is defined by membership in specific categories across multiple dimensions (e.g., specific age category and specific gender).

Parameters:

Name	Type	Description	Default
`group_labels_dict`	`dict`	Dictionary mapping category names to specific group labels that define the intersectional group (e.g., {'age': 'Older', 'gender': 'Female'}).	required
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The false omission rate of the model in the specified intersectional group. Returns np.nan if the group has no observations.

`intersect_fpr` ¶

Calculate false positive rate for an intersectional group.

An intersectional group is defined by membership in specific categories across multiple dimensions (e.g., specific age category and specific gender).

Parameters:

Name	Type	Description	Default
`group_labels_dict`	`dict`	Dictionary mapping category names to specific group labels that define the intersectional group (e.g., {'age': 'Older', 'gender': 'Female'}).	required
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The false positive rate of the model in the specified intersectional group. Returns np.nan if the group has no observations.

`max_intersect_acc_diff` ¶

Calculate the maximum difference in accuracy across intersectional groups.

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The maximum difference between any two intersectional group accuracies. Returns np.nan if any group has no observations.
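
The largest difference between any two group accuracies is simply the highest accuracy minus the lowest. The sketch below applies this to a dictionary shaped like the output of all_intersect_accs; `max_gap` is a hypothetical helper.

```python
def max_gap(group_accuracies):
    """Hypothetical helper: largest pairwise gap, i.e. max minus min."""
    values = list(group_accuracies.values())
    # Assumption: any NaN entry (an empty group) makes the result NaN.
    if any(v != v for v in values):
        return float("nan")
    return max(values) - min(values)


accs = {"older + F": 0.5, "young + F": 1.0, "young + M": 0.75}
gap = max_gap(accs)
gap_with_empty = max_gap({**accs, "older + M": float("nan")})
```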

`max_intersect_acc_ratio` ¶

Calculate the maximum ratio of accuracies across intersectional groups.

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required
`natural_log`	`bool`	If True, return the natural logarithm of the ratio. Default is True.	`True`

Returns:

Type	Description
`float`	The (log) ratio of the maximum to minimum accuracy across all intersectional groups. Returns np.nan if any group has no observations or if any accuracy is 0.
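
The effect of `natural_log` and the NaN rules can be illustrated with a small helper that works on per-group accuracies that have already been computed. The helper name and its list input are assumptions made for this sketch, and `math.nan` stands in for `np.nan`.

```python
import math


def max_ratio_sketch(rates, natural_log=True):
    """Ratio of the largest to the smallest per-group rate, optionally logged."""
    # NaN marks an empty group; a zero rate would make the ratio undefined.
    if any(math.isnan(r) or r == 0 for r in rates):
        return math.nan
    ratio = max(rates) / min(rates)
    return math.log(ratio) if natural_log else ratio


accs = [0.9, 0.75, 0.6]
ratio = max_ratio_sketch(accs, natural_log=False)  # roughly 1.5
log_ratio = max_ratio_sketch(accs)                 # natural log of the same ratio
```

On the log scale a value of 0 corresponds to identical accuracies in every group, which can make results easier to compare across metrics.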

`max_intersect_fdr_diff` ¶

Calculate the maximum difference in false discovery rate across all intersectional groups.

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The difference between the maximum and minimum false discovery rate across all intersectional groups. Returns np.nan if any group has no observations.
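
For readers less familiar with the false discovery rate, the sketch below uses the standard definition FDR = FP / (FP + TP) and takes the spread across two hypothetical groups. The function name, the group names, and the NaN result for a group with no positive predictions are assumptions of this example.

```python
import math


def fdr_sketch(predictions, true_statuses):
    """False discovery rate: share of positive predictions that are wrong."""
    fp = sum(1 for p, t in zip(predictions, true_statuses) if p and not t)
    tp = sum(1 for p, t in zip(predictions, true_statuses) if p and t)
    # Assumption: no positive predictions means the rate is undefined.
    return fp / (fp + tp) if (fp + tp) else math.nan


# Hypothetical per-group data: (predictions, true statuses).
groups = {
    "Older|Female": ([True, True, False], [True, False, False]),
    "Younger|Male": ([True, True, True, True], [True, True, True, False]),
}
fdrs = {name: fdr_sketch(p, t) for name, (p, t) in groups.items()}
fdr_diff = max(fdrs.values()) - min(fdrs.values())
```

The two groups have false discovery rates of 0.5 and 0.25, giving a difference of 0.25.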

`max_intersect_fdr_ratio` ¶

Calculate the ratio of the maximum to minimum false discovery rate across all intersectional groups.

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required
`natural_log`	`bool`	If True, return the natural logarithm of the ratio. Default is True.	`True`

Returns:

Type	Description
`float`	The (log) ratio of the maximum to minimum false discovery rate across all intersectional groups. Returns np.nan if any group has no observations or if any false discovery rate is 0.

`max_intersect_fnr_ratio` ¶

Calculate the ratio of the maximum to minimum false negative rate across all intersectional groups.

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required
`natural_log`	`bool`	If True, return the natural logarithm of the ratio. Default is True.	`True`

Returns:

Type	Description
`float`	The (log) ratio of the maximum to minimum false negative rate across all intersectional groups. Returns np.nan if any group has no observations or if any false negative rate is 0.

`max_intersect_for_diff` ¶

Calculate the maximum difference in false omission rate across all intersectional groups.

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The difference between the maximum and minimum false omission rate across all intersectional groups. Returns np.nan if any group has no observations.
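
For reference, the quantity can be written out using the standard definition of the false omission rate, where g ranges over the intersectional groups and FN and TN are the false negative and true negative counts within each group. The subscript notation is introduced here for illustration only.

```latex
\[
\mathrm{FOR}_g = \frac{\mathrm{FN}_g}{\mathrm{FN}_g + \mathrm{TN}_g},
\qquad
\Delta_{\mathrm{FOR}} = \max_g \mathrm{FOR}_g - \min_g \mathrm{FOR}_g
\]
```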

`max_intersect_for_ratio` ¶

Calculate the ratio of the maximum to minimum false omission rate across all intersectional groups.

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required
`natural_log`	`bool`	If True, return the natural logarithm of the ratio. Default is True.	`True`

Returns:

Type	Description
`float`	The (log) ratio of the maximum to minimum false omission rate across all intersectional groups. Returns np.nan if any group has no observations or if any false omission rate is 0.

`max_intersect_fpr_diff` ¶

Calculate the maximum difference in false positive rate across all intersectional groups.

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required

Returns:

Type	Description
`float`	The difference between the maximum and minimum false positive rate across all intersectional groups. Returns np.nan if any group has no observations.

`max_intersect_fpr_ratio` ¶

Calculate the ratio of the maximum to minimum false positive rate across all intersectional groups.

Parameters:

Name	Type	Description	Default
`subject_labels_dict`	`dict`	Dictionary mapping category names to lists of labels for each observation in the evaluation dataset.	required
`predictions`	`list[bool]`	A list of predicted diagnoses for each observation in the evaluation dataset.	required
`true_statuses`	`list[bool]`	A list of true diagnoses for each observation in the evaluation dataset.	required
`natural_log`	`bool`	If True, return the natural logarithm of the ratio. Default is True.	`True`

Returns:

Type	Description
`float`	The (log) ratio of the maximum to minimum false positive rate across all intersectional groups. Returns np.nan if any group has no observations or if any false positive rate is 0.

fairness.single_metrics¶

`calculate_AOD` ¶

Compute the Average Odds Difference (AOD) between demographic groups.

Average Odds Difference measures the average difference in both True Positive Rates (TPR) and False Positive Rates (FPR) between the underprivileged and privileged groups. It captures disparities in model performance for both positive and negative outcomes.

Parameters:

Name	Type	Description	Default
`y_test`	`array-like of shape (n_samples,)`	Ground-truth binary labels. Expected values: 0 (negative outcome) or 1 (positive outcome).	required
`y_pred`	`array-like of shape (n_samples,)`	Predicted binary labels from a classifier. Expected values: 0 (negative outcome) or 1 (positive outcome).	required
`group_labels`	`array-like of shape (n_samples,)`	Protected attribute labels. Each entry corresponds to the same-indexed sample in y_test and y_pred.	required
`privileged_label`	`str`	The label within group_labels considered to be the privileged group (e.g. 'Male' for sex, 'Older' for age). All other labels are treated as unprivileged.	required

Returns:

Name	Type	Description
`AOD`	`float`	Average Odds Difference, defined as: `AOD = 0.5 × [ (FPR_underprivileged − FPR_privileged) + (TPR_underprivileged − TPR_privileged) ]` Values closer to 0 indicate better fairness.
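
The formula above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the library's code: the names `aod_sketch` and `_tpr_fpr` are hypothetical, and each group is assumed to contain both positive and negative ground-truth labels so that neither rate divides by zero.

```python
def _tpr_fpr(y_test, y_pred):
    """Return (TPR, FPR) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_test, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_test, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_test, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_test, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


def aod_sketch(y_test, y_pred, group_labels, privileged_label):
    rows = list(zip(y_test, y_pred, group_labels))
    priv = [(t, p) for t, p, g in rows if g == privileged_label]
    unpriv = [(t, p) for t, p, g in rows if g != privileged_label]
    tpr_p, fpr_p = _tpr_fpr(*zip(*priv))
    tpr_u, fpr_u = _tpr_fpr(*zip(*unpriv))
    return 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p))


y_test = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["Male"] * 4 + ["Female"] * 4
aod = aod_sketch(y_test, y_pred, groups, privileged_label="Male")
```

In this toy data the privileged group has TPR 1.0 and FPR 0.5, the other group has TPR 0.5 and FPR 0.0, so the result is -0.5.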

`calculate_DI` ¶

Compute Disparate Impact (DI) between demographic groups.

Disparate Impact measures the ratio of positive prediction rates between the underprivileged and privileged groups. It evaluates whether one group receives favorable outcomes less frequently than another, regardless of ground-truth labels.

Parameters:

Name	Type	Description	Default
`y_pred`	`array-like of shape (n_samples,)`	Predicted binary labels from a classifier. Expected values: 0 (negative outcome) or 1 (positive outcome).	required
`group_labels`	`array-like of shape (n_samples,)`	Protected attribute labels. Each entry corresponds to the same-indexed sample in y_pred.	required
`privileged_label`	`str`	The label within group_labels considered to be the privileged group (e.g. 'Male' for sex, 'Older' for age). All other labels are treated as unprivileged.	required

Returns:

Name	Type	Description
`DI`	`float`	Disparate Impact, defined as: `DI = P(ŷ = 1 | underprivileged) / P(ŷ = 1 | privileged)` where P(ŷ = 1 | group) is the positive prediction rate for the specified group.
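
A minimal sketch of the ratio, using only predictions and group labels, is given below. The function name `di_sketch` is hypothetical, and the example assumes both groups are non-empty and that the privileged group receives at least one positive prediction.

```python
def di_sketch(y_pred, group_labels, privileged_label):
    priv = [p for p, g in zip(y_pred, group_labels) if g == privileged_label]
    unpriv = [p for p, g in zip(y_pred, group_labels) if g != privileged_label]
    # Positive prediction rate per group (labels are 0/1, so the mean is the rate).
    rate_priv = sum(priv) / len(priv)
    rate_unpriv = sum(unpriv) / len(unpriv)
    return rate_unpriv / rate_priv


y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["Male"] * 4 + ["Female"] * 4
di = di_sketch(y_pred, groups, privileged_label="Male")
```

The positive prediction rates are 0.75 and 0.25, so the ratio is about one third. A value of 1 would mean both groups receive positive predictions at the same rate.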

`calculate_EOD` ¶

Compute the Equal Opportunity Difference (EOD) between demographic groups.

Equal Opportunity Difference measures the absolute difference in True Positive Rates (TPR) between the underprivileged and privileged groups. A lower EOD indicates fairer performance with respect to correctly identifying positive cases across groups.

Parameters:

Name	Type	Description	Default
`y_test`	`array-like of shape (n_samples,)`	Ground-truth binary labels. Expected values: 0 (negative outcome) or 1 (positive outcome).	required
`y_pred`	`array-like of shape (n_samples,)`	Predicted binary labels from a classifier. Expected values: 0 (negative outcome) or 1 (positive outcome).	required
`group_labels`	`array-like of shape (n_samples,)`	Protected attribute labels. Each entry corresponds to the same-indexed sample in y_test and y_pred.	required
`privileged_label`	`str`	The label within group_labels considered to be the privileged group (e.g. 'Male' for sex, 'Older' for age). All other labels are treated as unprivileged.	required

Returns:

Name	Type	Description
`EOD`	`float`	Equal Opportunity Difference, defined as: `EOD = |TPR_underprivileged − TPR_privileged|` Values closer to 0 indicate better fairness.

Notes

EOD focuses exclusively on the positive class (y = 1).
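
As a worked example with hypothetical counts, suppose the underprivileged group has 20 true positive cases of which 12 are predicted positive, and the privileged group has 20 of which 16 are predicted positive:

```latex
\[
\begin{aligned}
\mathrm{TPR}_{\text{underprivileged}} &= \tfrac{12}{20} = 0.60, \qquad
\mathrm{TPR}_{\text{privileged}} = \tfrac{16}{20} = 0.80, \\
\mathrm{EOD} &= \lvert 0.60 - 0.80 \rvert = 0.20
\end{aligned}
\]
```

Negative cases (y = 0) do not enter the calculation at any point.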

`calculate_TPR_TNR_FPR_FNR` ¶

Compute classification rate metrics derived from the confusion matrix.

Notes

Counts must be non-negative integers.
Label 1 is assumed to be the positive outcome.
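
The four rates follow the standard confusion-matrix definitions, sketched below. The function name, the argument order (borrowed from the name of calculate_TP_FN_FP_TN), and the NaN result for a zero denominator are all assumptions of this example, not documented behaviour.

```python
import math


def rates_from_counts_sketch(tp, fn, fp, tn):
    def safe_div(num, den):
        # Assumption: an undefined rate is reported as NaN.
        return num / den if den else math.nan

    tpr = safe_div(tp, tp + fn)  # true positive rate (sensitivity)
    tnr = safe_div(tn, tn + fp)  # true negative rate (specificity)
    fpr = safe_div(fp, fp + tn)  # false positive rate
    fnr = safe_div(fn, fn + tp)  # false negative rate
    return tpr, tnr, fpr, fnr


tpr, tnr, fpr, fnr = rates_from_counts_sketch(tp=30, fn=10, fp=5, tn=55)
```

By construction TPR and FNR sum to 1, as do TNR and FPR, which is a quick sanity check on any implementation.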

`calculate_TP_FN_FP_TN` ¶

Compute the confusion-matrix components: True Positives (TP), False Negatives (FN), False Positives (FP), and True Negatives (TN).

Notes

Binary classification is assumed.
Label 1 denotes the positive outcome.
Label 0 denotes the negative outcome.
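
Under the label conventions in the notes above, the counting can be sketched as follows. The function name, parameter names, and the order of the returned tuple are assumptions chosen to match the function's name.

```python
def confusion_counts_sketch(y_test, y_pred):
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_test, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # positive case correctly predicted
        elif truth == 1 and pred == 0:
            fn += 1  # positive case missed
        elif truth == 0 and pred == 1:
            fp += 1  # negative case flagged as positive
        else:
            tn += 1  # negative case correctly predicted
    return tp, fn, fp, tn


counts = confusion_counts_sketch([1, 1, 0, 0, 1], [1, 0, 1, 0, 1])
```

For these five samples the counts are 2 true positives and 1 each of false negatives, false positives, and true negatives.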

`group_to_binary` ¶

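Adapts the single-attribute fairness functions to the intersectional ones by encoding group labels as binary privileged/unprivileged values.

Parameters:

Name	Type	Description	Default
`labels`	`list`	List of group labels (e.g. 'Male', 'Female').	required
`privileged_label`		Label considered privileged.	required

Returns:

Type	Description
`ndarray`	NumPy array (1 = privileged, 0 = unprivileged).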
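
The encoding itself is simple enough to sketch in one line. The version below is illustrative only: it uses a hypothetical name and returns a plain list rather than a NumPy array to stay dependency-free.

```python
def group_to_binary_sketch(labels, privileged_label):
    # 1 marks the privileged group, 0 marks every other label.
    return [1 if label == privileged_label else 0 for label in labels]


encoded = group_to_binary_sketch(["Male", "Female", "Female", "Male"], "Male")
```

With 'Male' as the privileged label, the four samples are encoded as 1, 0, 0, 1.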

fairness.visualisation¶

Visualization helpers for fairness metrics.

This module contains lightweight plotting utilities that sit on top of the fairness.metrics and fairness.single_metrics APIs. The functions do not implement metric formulas themselves; they visualize values produced by those metric functions from group labels, predictions, and ground-truth labels.

The typical workflow is:

1) Prepare evaluation inputs (see fairness.groups.make_eval_df and fairness.adapters).
2) Compute or select a metric function from fairness.metrics or fairness.single_metrics.
3) Use the plotting helpers here to visualize metric values across groups.

All plotting helpers return a Matplotlib Figure so callers can further customize or save the plots as needed.

`plot_group_metric` ¶

Plot a group-level metric computed with fairness.metrics (group_*).

This function expects a metric that takes a single group label, a list of subject labels, predictions, and true labels, and returns a scalar value for that group (e.g., group_acc, group_fnr, group_fpr).

Parameters:

Name	Type	Description	Default
`metric_fn`	`callable`	A function from `fairness.metrics` with signature: (group_label, subject_labels, predictions, true_statuses) -> float.	required
`subject_labels`	`Iterable`	Group label for each sample (e.g., intersectional labels).	required
`predictions`	`Iterable`	Predicted labels aligned with `subject_labels`.	required
`true_statuses`	`Iterable`	Ground-truth labels aligned with `subject_labels`.	required
`groups`	`Sequence or None`	Subset/ordering of groups to plot. If None, all unique labels are used.	`None`
`title`	`str or None`	Plot title. Defaults to the metric function name.	`None`
`rotation`	`int`	Rotation angle for x tick labels.	`45`
`figsize`	`tuple[float, float] or None`	Figure size in inches. If None, a default size is chosen.	`None`
`sort`	`bool`	If True, sort bars by metric value (NaNs placed at the end).	`False`

Returns:

Type	Description
`Figure`	The created Matplotlib figure.

Raises:

Type	Description
`ValueError`	If inputs do not share the same length.
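
The `sort` option places NaN values at the end, which a naive sort does not guarantee because every comparison against NaN is false. A minimal sketch of one way to get that ordering is shown below; the helper name and the descending default are assumptions, since the sort direction is not stated in this reference.

```python
import math


def sort_groups_sketch(values_by_group, descending=True):
    """Order (group, value) pairs by value, keeping NaN entries last."""
    finite = [(g, v) for g, v in values_by_group.items() if not math.isnan(v)]
    missing = [(g, v) for g, v in values_by_group.items() if math.isnan(v)]
    finite.sort(key=lambda item: item[1], reverse=descending)
    return finite + missing


ordered = sort_groups_sketch({"A": 0.2, "B": math.nan, "C": 0.9})
order = [group for group, _ in ordered]
```

Splitting the finite and missing values before sorting keeps groups with undefined metrics visible in the plot without disturbing the order of the rest.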

`plot_group_metric_from_eval_df` ¶

Convenience wrapper for an eval_df produced by fairness.groups.make_eval_df.

Parameters:

Name	Type	Description	Default
`metric_fn`	`callable`	A `fairness.metrics` group_* function.	required
`eval_df`	`DataFrame`	DataFrame with columns `label_col`, `y_pred`, and `y_true`.	required
`label_col`	`str`	Column name for group labels (default "subject_label").	`'subject_label'`
`title`	`str or None`	Plot title.	`None`
`rotation`	`int`	Rotation angle for x tick labels.	`45`
`figsize`	`tuple[float, float] or None`	Figure size in inches.	`None`
`sort`	`bool`	If True, sort bars by metric value (NaNs placed at the end).	`False`

Returns:

Type	Description
`Figure`	The created Matplotlib figure.

Raises:

Type	Description
`ValueError`	If required columns are missing from eval_df.

`plot_intersectional_metric` ¶

Plot an all_intersect_* metric from fairness.metrics (dict -> bar plot).

Functions such as all_intersect_accs, all_intersect_fprs, etc. return a dictionary mapping intersectional group labels to metric values. This helper converts that dictionary into a horizontal bar plot.

Parameters:

Name	Type	Description	Default
`metric_fn`	`callable`	An `all_intersect_*` function with signature: (subject_labels_dict, predictions, true_statuses) -> dict.	required
`subject_labels_dict`	`Mapping[str, Sequence]`	Mapping from protected attribute name to labels per sample.	required
`predictions`	`Iterable`	Predicted labels aligned with `subject_labels_dict` values.	required
`true_statuses`	`Iterable`	Ground-truth labels aligned with `subject_labels_dict` values.	required
`title`	`str or None`	Plot title. Defaults to the metric function name.	`None`
`rotation`	`int`	Rotation angle for tick labels.	`0`
`figsize`	`tuple[float, float] or None`	Figure size in inches.	`None`
`sort`	`bool`	If True, sort bars by metric value (NaNs placed at the end).	`True`

Returns:

Type	Description
`Figure`	The created Matplotlib figure.

Raises:

Type	Description
`ValueError`	If predictions and true_statuses lengths differ.
`TypeError`	If metric_fn does not return a dictionary.

`plot_pairwise_group_metric` ¶

Plot pairwise group metrics (group_diff, group_ratio).

Pairwise metric functions compare two groups at a time and return a scalar (e.g., difference or ratio of accuracies).

Parameters:

Name	Type	Description	Default
`metric_fn`	`callable`	A function from `fairness.metrics` with signature: (group_a, group_b, subject_labels, predictions, true_statuses) -> float.	required
`subject_labels`	`Iterable`	Group label for each sample.	required
`predictions`	`Iterable`	Predicted labels aligned with `subject_labels`.	required
`true_statuses`	`Iterable`	Ground-truth labels aligned with `subject_labels`.	required
`group_pairs`	`Sequence[tuple] or None`	Explicit list of (group_a, group_b) pairs to plot. If None, all pairwise combinations of unique groups are used.	`None`
`title`	`str or None`	Plot title. Defaults to the metric function name.	`None`
`rotation`	`int`	Rotation angle for x tick labels (used for vertical plots only).	`45`
`figsize`	`tuple[float, float] or None`	Figure size in inches.	`None`
`sort`	`bool`	If True, sort bars by metric value (NaNs placed at the end).	`True`

Returns:

Type	Description
`Figure`	The created Matplotlib figure.

Raises:

Type	Description
`ValueError`	If no group pairs are provided or generated.
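
When `group_pairs` is None, the pairs are generated from the unique group labels. The sketch below shows one plausible way to do this with unordered pairs in first-seen order; whether the library uses unordered or ordered pairs is not stated here, and the helper name is hypothetical.

```python
from itertools import combinations


def default_group_pairs_sketch(subject_labels):
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique_groups = list(dict.fromkeys(subject_labels))
    pairs = list(combinations(unique_groups, 2))
    if not pairs:
        # Mirrors the documented ValueError when nothing can be compared.
        raise ValueError("No group pairs to plot.")
    return pairs


labels = ["Older|Female", "Younger|Male", "Older|Female", "Older|Male"]
pairs = default_group_pairs_sketch(labels)
```

Three unique groups produce three pairs. The count grows quadratically with the number of groups, so passing an explicit `group_pairs` list may be preferable when many groups are present.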

`plot_scalar_metrics` ¶

Plot one or more scalar metrics (e.g., max_intersect_* outputs).

Parameters:

Name	Type	Description	Default
`metrics`	`Mapping[str, float]`	Mapping from metric name to scalar value.	required
`title`	`str or None`	Plot title.	`None`
`rotation`	`int`	Rotation angle for x tick labels.	`0`
`figsize`	`tuple[float, float] or None`	Figure size in inches.	`None`

Returns:

Type	Description
`Figure`	The created Matplotlib figure.

`plot_single_metrics` ¶

Plot single-attribute fairness metrics from fairness.single_metrics.

This helper computes and visualizes metrics such as EOD, AOD, and DI for a single protected attribute with a specified privileged group. Note that DI uses only predictions, while EOD and AOD require y_test.

Parameters:

Name	Type	Description	Default
`y_test`	`Iterable`	Ground-truth binary labels (0/1).	required
`y_pred`	`Iterable`	Predicted binary labels (0/1).	required
`group_labels`	`Iterable`	Protected attribute labels aligned to y_test/y_pred.	required
`privileged_label`	`object`	Label treated as the privileged group.	required
`metrics`	`Sequence[str] or None`	Subset of {"EOD", "AOD", "DI"} to compute. Defaults to all.	`None`
`title`	`str or None`	Plot title.	`None`
`rotation`	`int`	Rotation angle for x tick labels.	`0`
`figsize`	`tuple[float, float] or None`	Figure size in inches.	`None`

Returns:

Type	Description
`Figure`	The created Matplotlib figure.

Raises:

Type	Description
`ValueError`	If an unknown metric name is requested.
