slickml.metrics#

Package Contents#

Classes#

BinaryClassificationMetrics

BinaryClassificationMetrics calculates binary classification metrics in one place.

RegressionMetrics

Regression Metrics is a wrapper to calculate all the regression metrics in one place.

class slickml.metrics.BinaryClassificationMetrics[source]#

BinaryClassificationMetrics calculates binary classification metrics in one place.

Binary metrics are computed based on three methods for calculating the thresholds used to binarize the prediction probabilities. The threshold computations include (see the sketch after this list):

  1. Youden Index [youden-j-index].

  2. Maximizing Precision-Recall.

  3. Maximizing Sensitivity-Specificity.
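For orientation, the snippet below is a minimal, hypothetical sketch of how such thresholds can be derived with scikit-learn's roc_curve and precision_recall_curve. The Youden rule (argmax of tpr - fpr) is standard; the specific maximization criteria shown for (2) and (3) are common choices and may differ from the ones implemented internally, so treat this as illustrative only.

>>> import numpy as np
>>> from sklearn.metrics import precision_recall_curve, roc_curve
>>> y_true = np.array([1, 1, 0, 0])
>>> y_pred_proba = np.array([0.95, 0.3, 0.1, 0.9])
>>> fpr, tpr, roc_thresholds = roc_curve(y_true, y_pred_proba)
>>> # 1) Youden's J statistic: J = sensitivity + specificity - 1 = tpr - fpr
>>> youden_threshold = roc_thresholds[np.argmax(tpr - fpr)]
>>> # 2) One common precision-recall criterion: threshold where the two balance
>>> precision, recall, pr_thresholds = precision_recall_curve(y_true, y_pred_proba)
>>> prec_rec_threshold = pr_thresholds[np.argmin(np.abs(precision[:-1] - recall[:-1]))]
>>> # 3) One common sensitivity-specificity criterion: maximize their product
>>> sens_spec_threshold = roc_thresholds[np.argmax(tpr * (1 - fpr))]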

Parameters:
  • y_true (Union[List[int], np.ndarray, pd.Series]) – List of ground truth values such as [0, 1] for binary problems

  • y_pred_proba (Union[List[float], np.ndarray, pd.Series]) – List of predicted probabilities for the positive class (class=1) in binary problems or y_pred_proba[:, 1] in scikit-learn API

  • threshold (float, optional) – Inclusive threshold value used to binarize y_pred_proba into y_pred, where any value that satisfies y_pred_proba >= threshold will be set to class=1 (positive class). Note that ">=" is used instead of ">" (see the sketch after this list), by default 0.5

  • average_method (str, optional) – Method to calculate the average of any metric. Possible values are "micro", "macro", "weighted", "binary", by default "binary"

  • precision_digits (int, optional) – The number of precision digits to format the scores dataframe, by default 3

  • display_df (bool, optional) – Whether to display the formatted scores’ dataframe, by default True
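As a quick illustration of the inclusive binarization rule described under threshold (plain NumPy, independent of this class):

>>> import numpy as np
>>> y_pred_proba = np.array([0.2, 0.5, 0.7])
>>> (y_pred_proba >= 0.5).astype(int)  # 0.5 itself maps to the positive class
array([0, 1, 1])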

plot(figsize=(12, 12), save_path=None, display_plot=False, return_fig=False)[source]#

Plots classification metrics.

get_metrics(dtype='dataframe')[source]#

Returns calculated classification metrics.

y_pred_#

Predicted class based on the threshold. The threshold value inclusively binarizes y_pred_proba into y_pred, where any value that satisfies y_pred_proba >= threshold will be set to class=1 (positive class). Note that ">=" is used instead of ">"

Type:

np.ndarray

accuracy_#

Accuracy based on the initial threshold value with a possible value between 0.0 and 1.0

Type:

float

balanced_accuracy_#

Balanced accuracy based on the initial threshold value considering the prevalence of the classes with a possible value between 0.0 and 1.0

Type:

float

fpr_list_#

List of calculated false-positive-rates based on roc_thresholds_

Type:

np.ndarray

tpr_list_#

List of calculated true-positive-rates based on roc_thresholds_

Type:

np.ndarray

roc_thresholds_#

List of threshold values used to calculate fpr_list_ and tpr_list_

Type:

np.ndarray

auc_roc_#

Area under ROC curve with a possible value between 0.0 and 1.0

Type:

float

precision_list_#

List of calculated precision based on pr_thresholds_

Type:

np.ndarray

recall_list_#

List of calculated recall based on pr_thresholds_

Type:

np.ndarray

pr_thresholds_#

List of precision-recall threshold values used to calculate precision_list_ and recall_list_

Type:

np.ndarray

auc_pr_#

Area under Precision-Recall curve with a possible value between 0.0 and 1.0

Type:

float

precision_#

Precision based on the threshold value with a possible value between 0.0 and 1.0

Type:

float

recall_#

Recall based on the threshold value with a possible value between 0.0 and 1.0

Type:

float

f1_#

F1-score based on the threshold value (beta=1.0) with a possible value between 0.0 and 1.0

Type:

float

f2_#

F2-score based on the threshold value (beta=2.0) with a possible value between 0.0 and 1.0

Type:

float

f05_#

F(1/2)-score based on the threshold value (beta=0.5) with a possible value between 0.0 and 1.0

Type:

float

average_precision_#

Average precision based on the threshold value and class prevalence with a possible value between 0.0 and 1.0

Type:

float

tn_#

True negative counts based on the threshold value

Type:

np.int64

fp_#

False positive counts based on the threshold value

Type:

np.int64

fn_#

False negative counts based on the threshold value

Type:

np.int64

tp_#

True positive counts based on the threshold value

Type:

np.int64

threat_score_#

Threat score based on the threshold value with a possible value between 0.0 and 1.0

Type:

float

youden_index_#

Index of the calculated Youden threshold in roc_thresholds_

Type:

np.int64

youden_threshold_#

Threshold calculated based on Youden Index with a possible value between 0.0 and 1.0

Type:

float

sens_spec_threshold_#

Threshold calculated based on maximized sensitivity-specificity with a possible value between 0.0 and 1.0

Type:

float

prec_rec_threshold_#

Threshold calculated based on maximized precision-recall with a possible value between 0.0 and 1.0

Type:

float

thresholds_dict_#

Calculated thresholds based on different algorithms: the Youden Index (youden_threshold_), maximizing the area under the sensitivity-specificity curve (sens_spec_threshold_), and maximizing the area under the precision-recall curve (prec_rec_threshold_)

Type:

Dict[str, float]

metrics_dict_#

Rounded metrics based on the number of precision digits

Type:

Dict[str, float]

metrics_df_#

Pandas DataFrame of all calculated metrics with threshold set as index

Type:

pd.DataFrame

average_methods_#

List of all possible average methods

Type:

List[str]

plotting_dict_#

Plotting properties

Type:

Dict[str, Any]

References

[youden-j-index]

Youden, W. J. (1950). Index for rating diagnostic tests. Cancer, 3(1), 32-35.

Examples

>>> from slickml.metrics import BinaryClassificationMetrics
>>> cm = BinaryClassificationMetrics(
...     y_true=[1, 1, 0, 0],
...     y_pred_proba=[0.95, 0.3, 0.1, 0.9]
... )
>>> f = cm.plot()
>>> m = cm.get_metrics()
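All fitted attributes (trailing underscore) documented above are then available on the instance; for example (values depend on the input data):

>>> cm.accuracy_  # doctest: +SKIP
>>> cm.thresholds_dict_  # doctest: +SKIP
>>> (cm.tn_, cm.fp_, cm.fn_, cm.tp_)  # doctest: +SKIP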
average_method: Optional[str] = 'binary'#
display_df: Optional[bool] = True#
precision_digits: Optional[int] = 3#
threshold: Optional[float] = 0.5#
y_pred_proba: Union[List[float], numpy.ndarray, pandas.Series]#
y_true: Union[List[int], numpy.ndarray, pandas.Series]#
__post_init__() → None[source]#

Post instantiation validations and assignments.

get_metrics(dtype: Optional[str] = 'dataframe') → Union[pandas.DataFrame, Dict[str, Optional[float]]][source]#

Returns calculated metrics with desired dtypes.

Currently, available output types are “dataframe” and “dict”.

Parameters:

dtype (str, optional) – Results dtype, by default “dataframe”

Returns:

Union[pd.DataFrame, Dict[str, Optional[float]]]
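For example, continuing the class example above, the dictionary form can be requested explicitly instead of the default dataframe:

>>> metrics_df = cm.get_metrics(dtype="dataframe")
>>> metrics_dict = cm.get_metrics(dtype="dict")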

plot(figsize: Optional[Tuple[float, float]] = (12, 12), save_path: Optional[str] = None, display_plot: Optional[bool] = False, return_fig: Optional[bool] = False) → Optional[matplotlib.figure.Figure][source]#

Plots classification metrics.

Parameters:
  • figsize (Tuple[float, float], optional) – Figure size, by default (12, 12)

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default False

  • return_fig (bool, optional) – Whether to return figure object, by default False

Returns:

Figure, optional
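For example, to display the figure, save it to disk, and keep a handle to it (the file name here is just a placeholder):

>>> fig = cm.plot(
...     figsize=(12, 12),
...     save_path="metrics.png",
...     display_plot=True,
...     return_fig=True,
... )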

class slickml.metrics.RegressionMetrics[source]#

Regression Metrics is a wrapper to calculate all the regression metrics in one place.

Notes

In the case of multioutput regression, the calculation method can be chosen from "raw_values", "uniform_average", and "variance_weighted", as sketched below.
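These three options mirror scikit-learn's multioutput semantics. Assuming they are passed through to scorers such as sklearn.metrics.r2_score (an assumption about the internals), their effect can be sketched directly:

>>> from sklearn.metrics import r2_score
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> r2_score(y_true, y_pred, multioutput="raw_values")  # one score per output
array([0.96543779, 0.90816327])
>>> r2_score(y_true, y_pred, multioutput="uniform_average")  # plain mean of the above
0.9368005266622779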

Parameters:
  • y_true (Union[List[float], np.ndarray, pd.Series]) – Ground truth target (response) values

  • y_pred (Union[List[float], np.ndarray, pd.Series]) – Predicted target (response) values

  • multioutput (str, optional) – Method to calculate the metric for multioutput targets, where possible values are "raw_values", "uniform_average", and "variance_weighted". "raw_values" returns a full set of scores in case of multioutput input, "uniform_average" averages the scores of all outputs with uniform weight, and "variance_weighted" averages the scores of all outputs weighted by the variance of each individual output, by default "uniform_average"

  • precision_digits (int, optional) – The number of precision digits to format the scores dataframe, by default 3

  • display_df (bool, optional) – Whether to display the formatted scores’ dataframe, by default True

plot(figsize=(12, 16), save_path=None, display_plot=False, return_fig=False)[source]#

Plots regression metrics.

get_metrics(dtype='dataframe')[source]#

Returns calculated regression metrics.

y_residual_#

Residual values (errors) calculated as (y_true - y_pred)

Type:

np.ndarray

y_residual_normsq_#

Square root of the absolute value of y_residual_

Type:

np.ndarray

r2_#

\(R^2\) score (coefficient of determination) with a best possible value of 1.0 (the score can be negative for poorly performing models)

Type:

float

ev_#

Explained variance score with a best possible value of 1.0

Type:

float

mae_#

Mean absolute error

Type:

float

mse_#

Mean squared error

Type:

float

msle_#

Mean squared log error

Type:

float

mape_#

Mean absolute percentage error

Type:

float

auc_rec_#

Area under REC curve with a possible value between 0.0 and 1.0

Type:

float

deviation_#

Deviation (error tolerance) values arranged to plot the REC curve

Type:

np.ndarray

accuracy_#

Accuracy calculated at each deviation value, used to plot the REC curve (see the sketch below)

Type:

np.ndarray
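A hypothetical sketch of how an REC curve's deviation/accuracy pairs can be formed, following Bi & Bennett's definition cited in the References (the library's internal tolerance grid and normalization may differ):

>>> import numpy as np
>>> y_true = np.array([3.0, -0.5, 2.0, 7.0])
>>> y_pred = np.array([2.5, 0.0, 2.0, 8.0])
>>> errors = np.abs(y_true - y_pred)
>>> deviation = np.linspace(0.0, errors.max(), num=50)
>>> # accuracy at tolerance d = fraction of samples with absolute error <= d
>>> accuracy = np.array([(errors <= d).mean() for d in deviation])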

y_ratio_#

Ratio of y_pred/y_true

Type:

np.ndarray

mean_y_ratio_#

Mean value of y_pred/y_true ratio

Type:

float

std_y_ratio_#

Standard deviation value of y_pred/y_true ratio

Type:

float

cv_y_ratio_#

Coefficient of variation calculated as std_y_ratio/mean_y_ratio

Type:

float
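Illustratively, the three ratio statistics above follow from elementwise division; this sketch assumes the population standard deviation (ddof=0), since the exact ddof used internally is not specified here:

>>> import numpy as np
>>> y_true = np.array([3.0, 2.0, 7.0])
>>> y_pred = np.array([2.5, 2.0, 8.0])
>>> y_ratio = y_pred / y_true  # y_ratio_
>>> cv_y_ratio = y_ratio.std() / y_ratio.mean()  # cv_y_ratio_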

metrics_dict_#

Rounded metrics based on the number of precision digits

Type:

Dict[str, Optional[float]]

metrics_df_#

Pandas DataFrame of all calculated metrics

Type:

pd.DataFrame

plotting_dict_#

Plotting properties

Type:

Dict[str, Any]

References

[Tahmassebi-et-al]

Tahmassebi, A., Gandomi, A. H., & Meyer-Baese, A. (2018, July). A Pareto front based evolutionary model for airfoil self-noise prediction. In 2018 IEEE Congress on Evolutionary Computation (CEC) (pp. 1-8). IEEE. https://www.amirhessam.com/assets/pdf/projects/cec-airfoil2018.pdf

[rec-curve]

Bi, J., & Bennett, K. P. (2003). Regression error characteristic curves. In Proceedings of the 20th international conference on machine learning (ICML-03) (pp. 43-50). https://www.aaai.org/Papers/ICML/2003/ICML03-009.pdf

Examples

>>> from slickml.metrics import RegressionMetrics
>>> rm = RegressionMetrics(
...     y_true=[3, -0.5, 2, 7],
...     y_pred=[2.5, 0.0, 2, 8]
... )
>>> m = rm.get_metrics()
>>> rm.plot()
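As with the classification metrics, the fitted attributes are then readable directly from the instance (values depend on the input data):

>>> rm.r2_  # doctest: +SKIP
>>> rm.mae_  # doctest: +SKIP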
display_df: Optional[bool] = True#
multioutput: Optional[str] = 'uniform_average'#
precision_digits: Optional[int] = 3#
y_pred: Union[List[float], numpy.ndarray, pandas.Series]#
y_true: Union[List[float], numpy.ndarray, pandas.Series]#
__post_init__() → None[source]#

Post instantiation validations and assignments.

get_metrics(dtype: Optional[str] = 'dataframe') → Union[pandas.DataFrame, Dict[str, Optional[float]]][source]#

Returns calculated metrics with desired dtypes.

Currently, available output types are "dataframe" and "dict".

Parameters:

dtype (str, optional) – Results dtype, by default “dataframe”

Returns:

Union[pd.DataFrame, Dict[str, Optional[float]]]
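For example, to get the metrics back as a dictionary instead of the default dataframe:

>>> metrics_dict = rm.get_metrics(dtype="dict")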

plot(figsize: Optional[Tuple[float, float]] = (12, 16), save_path: Optional[str] = None, display_plot: Optional[bool] = False, return_fig: Optional[bool] = False) → Optional[matplotlib.figure.Figure][source]#

Plots regression metrics.

Parameters:
  • figsize (Tuple[float, float], optional) – Figure size, by default (12, 16)

  • save_path (str, optional) – The full or relative path to save the plot including the image format such as “myplot.png” or “../../myplot.pdf”, by default None

  • display_plot (bool, optional) – Whether to show the plot, by default False

  • return_fig (bool, optional) – Whether to return figure object, by default False

Returns:

Figure, optional
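For example, to display the figure, save it to disk, and keep a handle to it (the file name here is just a placeholder):

>>> fig = rm.plot(
...     figsize=(12, 16),
...     save_path="reg_metrics.pdf",
...     display_plot=True,
...     return_fig=True,
... )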