`slickml.metrics._classification`#

Module Contents#

Classes#

BinaryClassificationMetrics

BinaryClassificationMetrics calculates binary classification metrics in one place.

class slickml.metrics._classification.BinaryClassificationMetrics[source]#

BinaryClassificationMetrics calculates binary classification metrics in one place.

Binary metrics are computed using three methods for choosing the threshold that binarizes the predicted probabilities. The threshold methods are:

Youden Index [youden-j-index].

Maximizing Precision-Recall.

Maximizing Sensitivity-Specificity.
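As a rough illustration of the first method: Youden's J statistic is sensitivity + specificity - 1, which equals TPR - FPR, and the selected threshold is the candidate that maximizes it. The sketch below is plain Python and is not slickml's implementation. `youden_threshold` is a hypothetical helper, the candidate thresholds are assumed to be the distinct predicted scores, and both classes are assumed to be present in `y_true`.

```python
from typing import List, Tuple


def youden_threshold(y_true: List[int], y_score: List[float]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing J = TPR - FPR over the distinct scores.

    Hypothetical helper for illustration; assumes y_true has both classes.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best_thr, best_j = float("nan"), float("-inf")
    # Scan candidates from the highest score down; ties keep the higher threshold.
    for thr in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= thr and y == 0)
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_thr, best_j = thr, j
    return best_thr, best_j


y_true = [1, 1, 0, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
thr, j = youden_threshold(y_true, y_score)  # thr == 0.4, j == 0.75
```

At the 0.4 threshold all four positives are recovered (TPR = 1.0) at the cost of one false positive out of four negatives (FPR = 0.25), which gives J = 0.75.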

Parameters:

y_true (Union[List[int], np.ndarray, pd.Series]) – List of ground truth values such as [0, 1] for binary problems
y_pred_proba (Union[List[float], np.ndarray, pd.Series]) – List of predicted probabilities for the positive class (class=1) in binary problems or y_pred_proba[:, 1] in scikit-learn API
threshold (float, optional) – Inclusive threshold used to binarize y_pred_proba into y_pred: any value that satisfies y_pred_proba >= threshold is assigned to class=1 (positive class). Note that ">=" is used instead of ">". By default 0.5
average_method (str, optional) – Method to calculate the average of any metric. Possible values are "micro", "macro", "weighted", "binary", by default “binary”
precision_digits (int, optional) – The number of precision digits to format the scores dataframe, by default 3
display_df (bool, optional) – Whether to display the formatted scores’ dataframe, by default True

plot(figsize=(12, 12), save_path=None, display_plot=False, return_fig=False)[source]#: Plots classification metrics

get_metrics(dtype='dataframe')[source]#: Returns calculated classification metrics

y_pred_#

Predicted class based on the threshold. The threshold inclusively binarizes y_pred_proba into y_pred: any value that satisfies y_pred_proba >= threshold is assigned to class=1 (positive class). Note that ">=" is used instead of ">"

Type:: np.ndarray
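The inclusive comparison matters for scores that land exactly on the threshold. A minimal sketch of the rule described above, using a hypothetical `binarize` helper rather than any slickml function:

```python
from typing import List


def binarize(y_pred_proba: List[float], threshold: float = 0.5) -> List[int]:
    """Map probabilities to class labels using an inclusive (>=) threshold."""
    return [1 if p >= threshold else 0 for p in y_pred_proba]


# A score exactly equal to the threshold (0.5) is assigned to the positive class.
labels = binarize([0.95, 0.5, 0.49, 0.1])  # [1, 1, 0, 0]
```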

accuracy_#

Accuracy based on the initial threshold value with a possible value between 0.0 and 1.0

Type:: float

balanced_accuracy_#

Balanced accuracy based on the initial threshold value considering the prevalence of the classes with a possible value between 0.0 and 1.0

Type:: float
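To see why balanced accuracy is reported alongside accuracy, the sketch below uses the standard binary definition: the mean of sensitivity and specificity. The helper name is hypothetical, and the code assumes both classes occur in `y_true`.

```python
from typing import List, Tuple


def accuracy_and_balanced_accuracy(
    y_true: List[int], y_pred: List[int]
) -> Tuple[float, float]:
    """Return (accuracy, balanced accuracy) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return accuracy, (sensitivity + specificity) / 2.0


# Imbalanced data: a model that always predicts the negative class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
acc, bal_acc = accuracy_and_balanced_accuracy(y_true, y_pred)  # 0.8, 0.5
```

Plain accuracy looks acceptable here only because negatives dominate; balanced accuracy drops to 0.5 because no positive is ever detected.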

fpr_list_#

List of calculated false-positive-rates based on roc_thresholds_

Type:: np.ndarray

tpr_list_#

List of calculated true-positive-rates based on roc_thresholds_

Type:: np.ndarray

roc_thresholds_#

List of threshold values used to calculate fpr_list_ and tpr_list_

Type:: np.ndarray

auc_roc_#

Area under ROC curve with a possible value between 0.0 and 1.0

Type:: float
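One common way to obtain an area under the ROC curve from false-positive and true-positive rate lists is the trapezoidal rule. The sketch below illustrates that idea only; whether slickml computes `auc_roc_` this way is not stated here, and `trapezoid_auc` is a hypothetical helper. The ROC points were worked out by hand for the data used in the Examples section.

```python
from typing import Sequence


def trapezoid_auc(x: Sequence[float], y: Sequence[float]) -> float:
    """Area under the piecewise-linear curve through (x[i], y[i]).

    Assumes x is sorted in non-decreasing order.
    """
    area = 0.0
    for i in range(1, len(x)):
        area += (x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2.0
    return area


# Hand-derived ROC points for y_true=[1, 1, 0, 0] and
# y_pred_proba=[0.95, 0.3, 0.1, 0.9].
fpr_list = [0.0, 0.0, 0.5, 0.5, 1.0]
tpr_list = [0.0, 0.5, 0.5, 1.0, 1.0]
auc_roc = trapezoid_auc(fpr_list, tpr_list)  # 0.75
```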

precision_list_#

List of calculated precision based on pr_thresholds_

Type:: np.ndarray

recall_list_#

List of calculated recall based on pr_thresholds_

Type:: np.ndarray

pr_thresholds_#

List of precision-recall threshold values used to calculate precision_list_ and recall_list_

Type:: np.ndarray

auc_pr_#

Area under Precision-Recall curve with a possible value between 0.0 and 1.0

Type:: float

precision_#

Precision based on the threshold value with a possible value between 0.0 and 1.0

Type:: float

recall_#

Recall based on the threshold value with a possible value between 0.0 and 1.0

Type:: float

f1_#

F1-score based on the threshold value (beta=1.0) with a possible value between 0.0 and 1.0

Type:: float

f2_#

F2-score based on the threshold value (beta=2.0) with a possible value between 0.0 and 1.0

Type:: float

f05_#

F(1/2)-score based on the threshold value (beta=0.5) with a possible value between 0.0 and 1.0

Type:: float
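The three F-scores above differ only in beta, which weights recall relative to precision in the standard formula F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). A small sketch with a hypothetical `fbeta` helper, returning 0.0 when the denominator vanishes as a convention chosen for this example:

```python
def fbeta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score from precision and recall; 0.0 if both are zero."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0.0:
        return 0.0
    return (1.0 + b2) * precision * recall / denom


precision, recall = 0.5, 1.0
f1 = fbeta(precision, recall, beta=1.0)   # 2/3
f2 = fbeta(precision, recall, beta=2.0)   # 5/6, favors recall
f05 = fbeta(precision, recall, beta=0.5)  # 5/9, favors precision
```

Because recall exceeds precision in this example, the recall-weighted F2 is the largest of the three and the precision-weighted F(1/2) is the smallest.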

average_precision_#

Average precision based on the threshold value and class prevalence with a possible value between 0.0 and 1.0

Type:: float

tn_#

True negative counts based on the threshold value

Type:: np.int64

fp_#

False positive counts based on the threshold value

Type:: np.int64

fn_#

False negative counts based on the threshold value

Type:: np.int64

tp_#

True positive counts based on the threshold value

Type:: np.int64

threat_score_#

Threat score based on the threshold value with a possible value between 0.0 and 1.0

Type:: float
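The threat score (also known as the critical success index) is TP / (TP + FN + FP), so it ignores true negatives. The sketch below derives the four confusion counts and the threat score for the Examples data binarized at 0.5; both helpers are hypothetical and only illustrate the definitions.

```python
from typing import List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary 0/1 labels."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp


def threat_score(tp: int, fp: int, fn: int) -> float:
    """TP / (TP + FN + FP); 0.0 when there are no positives of any kind."""
    denom = tp + fn + fp
    return tp / denom if denom else 0.0


# y_pred_proba=[0.95, 0.3, 0.1, 0.9] binarized with threshold 0.5 (inclusive).
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 1]
tn, fp, fn, tp = confusion_counts(y_true, y_pred)  # (1, 1, 1, 1)
ts = threat_score(tp, fp, fn)  # 1/3
```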

youden_index_#

Index of the calculated Youden index threshold

Type:: np.int64

youden_threshold_#

Threshold calculated based on Youden Index with a possible value between 0.0 and 1.0

Type:: float

sens_spec_threshold_#

Threshold calculated based on maximized sensitivity-specificity with a possible value between 0.0 and 1.0

Type:: float

prec_rec_threshold_#

Threshold calculated based on maximized precision-recall with a possible value between 0.0 and 1.0

Type:: float

thresholds_dict_#

Calculated thresholds based on different algorithms, including the Youden Index (youden_threshold_), maximizing the area under the sensitivity-specificity curve (sens_spec_threshold_), and maximizing the area under the precision-recall curve (prec_rec_threshold_)

Type:: Dict[str, float]

metrics_dict_#

Rounded metrics based on the number of precision digits

Type:: Dict[str, float]

metrics_df_#

Pandas DataFrame of all calculated metrics with threshold set as index

Type:: pd.DataFrame

average_methods_#

List of all possible average methods

Type:: List[str]

plotting_dict_#

Plotting properties

Type:: Dict[str, Any]

References

[youden-j-index]

https://en.wikipedia.org/wiki/Youden%27s_J_statistic

Examples

>>> from slickml.metrics import BinaryClassificationMetrics
>>> cm = BinaryClassificationMetrics(
...     y_true=[1, 1, 0, 0],
...     y_pred_proba=[0.95, 0.3, 0.1, 0.9]
... )
>>> f = cm.plot(return_fig=True)
>>> m = cm.get_metrics()

average_method :Optional[str] = binary#

display_df :Optional[bool] = True#

precision_digits :Optional[int] = 3#

threshold :Optional[float] = 0.5#

y_pred_proba :Union[List[float], numpy.ndarray, pandas.Series]#

y_true :Union[List[int], numpy.ndarray, pandas.Series]#

__post_init__() → None[source]#: Post instantiation validations and assignments.

get_metrics(dtype: Optional[str] = 'dataframe') → Union[pandas.DataFrame, Dict[str, Optional[float]]][source]#

Returns calculated metrics with desired dtypes.

Currently, available output types are “dataframe” and “dict”.

Parameters:: dtype (str, optional) – Results dtype, by default “dataframe”
Returns:: Union[pd.DataFrame, Dict[str, Optional[float]]]

plot(figsize: Optional[Tuple[float, float]] = (12, 12), save_path: Optional[str] = None, display_plot: Optional[bool] = False, return_fig: Optional[bool] = False) → Optional[matplotlib.figure.Figure][source]#

Plots classification metrics.

Parameters:

figsize (Tuple[float, float], optional) – Figure size, by default (12, 12)
save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None
display_plot (bool, optional) – Whether to show the plot, by default False
return_fig (bool, optional) – Whether to return figure object, by default False

Returns:

Optional[matplotlib.figure.Figure]
