slickml.optimization
Package Contents#
Classes#
XGBoostBayesianOptimizer | XGBoost Hyper-Parameters Tuner using Bayesian Optimization.
XGBoostHyperOptimizer | XGBoost Hyper-Parameters Tuner using HyperOpt Optimization.
- class slickml.optimization.XGBoostBayesianOptimizer[source]#
Bases:
slickml.base.BaseXGBoostEstimator
XGBoost Hyper-Parameters Tuner using Bayesian Optimization.
This is a wrapper that uses the Bayesian Optimization algorithm [bayesian-optimization] to tune the hyper-parameters of XGBoost [xgboost-api] iteratively, using xgboost.cv() with n-fold cross-validation. It can be used to find an optimized set of hyper-parameters for both classification and regression tasks.
Notes
The optimizer objective is always to maximize the target values. Therefore, when using a metric such as logloss, error, mae, rmse, or rmsle, the negative value of the metric is maximized. One major pitfall of the current implementation is the way hyper-parameters are sampled from params_bounds: the sampled values are continuous, so an integer-valued hyper-parameter cannot be sampled directly. Therefore, in some cases (e.g. max_depth) the sampled value must be cast to an integer, which is mathematically wrong (i.e. f(1.1) != f(1)).
- Parameters:
n_iter (int, optional) – Number of iteration rounds for hyper-parameters tuning after initialization, by default 10
n_init_iter (int, optional) – Number of initial iterations to initialize the optimizer, by default 5
n_splits (int, optional) – Number of folds for cross-validation, by default 4
metrics (str, optional) – Metric to be tracked at cross-validation fitting time, depending on the task (classification vs regression), with possible values of “auc”, “aucpr”, “error”, “logloss”, “rmse”, “rmsle”, “mae”. Note that this is different from the eval_metric that needs to be passed to the params dict, by default “auc”
objective (str, optional) – Objective function, depending on whether the task is classification or regression. Possible objectives are "binary:logistic" for classification and "reg:logistic", "reg:squarederror", and "reg:squaredlogerror" for regression, by default “binary:logistic”
acquisition_criterion (str, optional) – Acquisition criterion method, with possible options of "ei" (Expected Improvement), "ucb" (Upper Confidence Bounds), and "poi" (Probability Of Improvement), by default “ei”
params_bounds (Dict[str, Tuple[Union[int, float], Union[int, float]]], optional) – Set of hyper-parameter boundaries for Bayesian Optimization, where all fields are required, by default {“max_depth” : (2, 7), “learning_rate” : (0, 1), “min_child_weight” : (1, 20), “colsample_bytree”: (0.1, 1.0), “subsample” : (0.1, 1), “gamma” : (0, 1), “reg_alpha” : (0, 1), “reg_lambda” : (0, 1)}
num_boost_round (int, optional) – Number of boosting rounds to fit a model, by default 200
early_stopping_rounds (int, optional) – The criterion to abort the xgboost.cv() phase early if the test metric has not improved, by default 20
random_state (int, optional) – Random seed number, by default 1367
stratified (bool, optional) – Whether to use stratification of the targets (only available for classification tasks) when running xgboost.cv() to find the best number of boosting rounds at each fold of each iteration, by default True
shuffle (bool, optional) – Whether to shuffle the data so that stratified folds can be built in xgboost.cv(), by default True
sparse_matrix (bool, optional) – Whether to convert the input features to a sparse matrix in CSR format. This would increase the speed of feature selection for relatively large/sparse datasets; conversely, it would be a sub-optimal choice for a dense feature matrix. Additionally, this parameter cannot be used along with scale_mean=True, since standardizing the feature matrix to have a mean value of zero would turn it into a dense matrix. Therefore, the API does not allow this combination, by default False
scale_mean (bool, optional) – Whether to standardize the feature matrix to have a mean value of zero per feature (center the features before scaling). As laid out in sparse_matrix, scale_mean=False is required when using sparse_matrix=True, since centering the feature matrix would decrease the sparsity and defeat the purpose of using a sparse matrix. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used; otherwise it is None, by default False
scale_std (bool, optional) – Whether to scale the feature matrix to have unit variance (or equivalently, unit standard deviation) per feature. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used; otherwise it is None, by default False
importance_type (str, optional) – Importance type of xgboost.train(), with possible values "weight", "gain", "total_gain", "cover", "total_cover", by default “total_gain”
verbose (bool, optional) – Whether to show the Bayesian Optimization progress at each iteration, by default True
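The requirement that every field of params_bounds be present can be checked before tuning starts. The following is a minimal sketch under stated assumptions, not SlickML's own validation: the helper validate_params_bounds and the constant REQUIRED_FIELDS are hypothetical names, and the field names are copied from the documented default above.

```python
from typing import Dict, Tuple, Union

Number = Union[int, float]

# Field names copied from the documented default of params_bounds.
REQUIRED_FIELDS = (
    "max_depth",
    "learning_rate",
    "min_child_weight",
    "colsample_bytree",
    "subsample",
    "gamma",
    "reg_alpha",
    "reg_lambda",
)


def validate_params_bounds(bounds: Dict[str, Tuple[Number, Number]]) -> None:
    """Hypothetical check: every field is present and lower <= upper."""
    missing = [name for name in REQUIRED_FIELDS if name not in bounds]
    if missing:
        raise KeyError(f"params_bounds is missing required fields: {missing}")
    for name, (low, high) in bounds.items():
        if low > high:
            raise ValueError(f"{name}: lower bound {low} exceeds upper bound {high}")


# The documented default boundaries pass the check.
default_bounds = {
    "max_depth": (2, 7),
    "learning_rate": (0, 1),
    "min_child_weight": (1, 20),
    "colsample_bytree": (0.1, 1.0),
    "subsample": (0.1, 1),
    "gamma": (0, 1),
    "reg_alpha": (0, 1),
    "reg_lambda": (0, 1),
}
validate_params_bounds(default_bounds)
```

A partial dictionary such as {"max_depth": (2, 7)} would raise KeyError in this sketch, which is one way to surface the "all fields are required" rule early.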
- optimizer_#
Returns the fitted Bayesian Optimization object
- results_#
Returns all the optimization results including target and params
- best_params_#
Returns the tuned hyper-parameters as a dictionary
- best_results_#
Returns the results based on the best (tuned) hyper-parameters
References
- __slots__ = []#
- acquisition_criterion :Optional[str] = ei#
- early_stopping_rounds :Optional[int] = 20#
- importance_type :Optional[str] = total_gain#
- metrics :Optional[str] = auc#
- n_init_iter :Optional[int] = 5#
- n_iter :Optional[int] = 10#
- n_splits :Optional[int] = 4#
- num_boost_round :Optional[int] = 200#
- objective :Optional[str] = binary:logistic#
- params :Optional[Dict[str, Union[str, float, int]]]#
- params_bounds :Optional[Dict[str, Tuple[Union[int, float], Union[int, float]]]]#
- random_state :Optional[int] = 1367#
- scale_mean :Optional[bool] = False#
- scale_std :Optional[bool] = False#
- shuffle :Optional[bool] = True#
- sparse_matrix :Optional[bool] = False#
- stratified :Optional[bool] = True#
- verbose :Optional[bool] = True#
- __getstate__()#
- __repr__(N_CHAR_MAX=700)#
Return repr(self).
- __setstate__(state)#
- fit(X: Union[pandas.DataFrame, numpy.ndarray], y: Union[List[float], numpy.ndarray, pandas.Series]) None [source]#
Fits the main hyper-parameter tuning algorithm.
Notes
At each iteration, one set of parameters is sampled from params_bounds and evaluated based on the cross-validation results. The Bayesian optimizer always maximizes the objective. Therefore, care is needed when self.metrics is a metric that is supposed to be minimized, e.g. error; for those, (-1) * metric is maximized. One major pitfall of the current implementation is the way hyper-parameters are sampled from params_bounds: the sampled values are continuous, so an integer-valued hyper-parameter cannot be sampled directly. Therefore, in some cases (e.g. max_depth) the sampled value must be cast to an integer, which is mathematically wrong (i.e. f(1.1) != f(1)).
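The two conventions described in these notes, negating metrics that should be minimized and casting integer-valued hyper-parameters, can be sketched in plain Python. This is an illustration only, not SlickML's internal code: the helper names and the INTEGER_PARAMS set are hypothetical, and truncating with int() is an assumption about how the cast is done.

```python
from typing import Dict, Union

# Metrics where lower is better; names taken from the class notes.
MINIMIZED_METRICS = {"logloss", "error", "mae", "rmse", "rmsle"}

# Assumption: hyper-parameters that must be integers before reaching XGBoost.
INTEGER_PARAMS = {"max_depth"}


def to_maximization_target(metric: str, value: float) -> float:
    """Flip the sign of lower-is-better metrics so a maximizer can use them."""
    return -value if metric in MINIMIZED_METRICS else value


def cast_sampled_params(
    sampled: Dict[str, float],
) -> Dict[str, Union[int, float]]:
    """Truncate integer-valued hyper-parameters sampled as floats."""
    return {
        name: int(value) if name in INTEGER_PARAMS else value
        for name, value in sampled.items()
    }


# A continuous sample as a maximizer might propose it (values are made up).
sampled = {"max_depth": 4.7, "learning_rate": 0.13}
params = cast_sampled_params(sampled)

# A made-up cross-validated logloss turned into a target to maximize.
target = to_maximization_target("logloss", 0.42)
```

Because 4.7 and 4.0 map to the same max_depth here, the optimizer sees a flat objective between integers, which is the f(1.1) != f(1) pitfall the notes warn about.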
- Parameters:
X (Union[pd.DataFrame, np.ndarray]) – Input data for training (features)
y (Union[List[float], np.ndarray, pd.Series]) – Input ground truth for training (targets)
- Returns:
None
- get_best_params() Dict[str, Union[str, float, int]] [source]#
Returns the tuned results of the optimization as the best set of hyper-parameters.
- Returns:
Dict[str, Union[str, float, int]]
- get_best_results() pandas.DataFrame [source]#
Returns the performance of the best (tuned) set of hyper-parameters.
- Returns:
pd.DataFrame
- get_optimizer() bayes_opt.BayesianOptimization [source]#
Returns the Bayesian Optimization object.
- Returns:
BayesianOptimization
- get_params(deep=True)#
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params (dict) – Parameter names mapped to their values.
- get_params_bounds() Optional[Dict[str, Tuple[Union[int, float], Union[int, float]]]] [source]#
Returns the hyper-parameters boundaries for the tuning process.
- Returns:
Dict[str, Tuple[Union[int, float], Union[int, float]]]
- get_results() pandas.DataFrame [source]#
Returns the hyper-parameter optimization results.
- Returns:
pd.DataFrame
- set_params(**params)#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self (estimator instance) – Estimator instance.
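The <component>__<parameter> naming convention used by set_params can be illustrated with a small, self-contained sketch. It is a simplified illustration of the convention, not the estimator's actual implementation: the helper split_nested_params and the component name "clf" are hypothetical.

```python
from typing import Any, Dict, Tuple


def split_nested_params(
    params: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Dict[str, Any]]]:
    """Separate plain parameters from <component>__<parameter> ones."""
    own: Dict[str, Any] = {}
    nested: Dict[str, Dict[str, Any]] = {}
    for key, value in params.items():
        # Split on the first double underscore only.
        component, separator, sub_key = key.partition("__")
        if separator:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"n_iter": 20, "clf__max_depth": 3, "clf__gamma": 0.1}
)
```

In this sketch, own holds the parameters of the outer object, while nested groups the remaining ones by component so that each component can be updated separately.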
- class slickml.optimization.XGBoostHyperOptimizer[source]#
Bases:
slickml.base.BaseXGBoostEstimator
XGBoost Hyper-Parameters Tuner using HyperOpt Optimization.
This is a wrapper that uses HyperOpt [hyperopt], a Python library for serial and parallel optimization over search spaces that may include real-valued, discrete, and conditional dimensions, to tune the hyper-parameters of XGBoost [xgboost-api] iteratively, using xgboost.cv() with n-fold cross-validation. It can be used to find an optimized set of hyper-parameters for both classification and regression tasks.
Notes
The optimizer objective is always to minimize the target values. Therefore, when using a metric such as auc or aucpr, the negative value of the metric is minimized.
- Parameters:
n_iter (int, optional) – Maximum number of iteration rounds for hyper-parameter tuning before convergence, by default 100
n_splits (int, optional) – Number of folds for cross-validation, by default 4
metrics (str, optional) – Metric to be tracked at cross-validation fitting time, depending on the task (classification vs regression), with possible values of “auc”, “aucpr”, “error”, “logloss”, “rmse”, “rmsle”, “mae”. Note that this is different from the eval_metric that needs to be passed to the params dict, by default “auc”
objective (str, optional) – Objective function, depending on whether the task is classification or regression. Possible objectives are "binary:logistic" for classification and "reg:logistic", "reg:squarederror", and "reg:squaredlogerror" for regression, by default “binary:logistic”
params_bounds (Dict[str, Any], optional) – Set of hyper-parameter boundaries for HyperOpt, defined using hyperopt.hp and hyperopt.pyll_utils, by default {“max_depth” : (2, 7), “learning_rate” : (0, 1), “min_child_weight” : (1, 20), “colsample_bytree”: (0.1, 1.0), “subsample” : (0.1, 1), “gamma” : (0, 1), “reg_alpha” : (0, 1), “reg_lambda” : (0, 1)}
num_boost_round (int, optional) – Number of boosting rounds to fit a model, by default 200
early_stopping_rounds (int, optional) – The criterion to abort the xgboost.cv() phase early if the test metric has not improved, by default 20
random_state (int, optional) – Random seed number, by default 1367
stratified (bool, optional) – Whether to use stratification of the targets (only available for classification tasks) when running xgboost.cv() to find the best number of boosting rounds at each fold of each iteration, by default True
shuffle (bool, optional) – Whether to shuffle the data so that stratified folds can be built in xgboost.cv(), by default True
sparse_matrix (bool, optional) – Whether to convert the input features to a sparse matrix in CSR format. This would increase the speed of feature selection for relatively large/sparse datasets; conversely, it would be a sub-optimal choice for a dense feature matrix. Additionally, this parameter cannot be used along with scale_mean=True, since standardizing the feature matrix to have a mean value of zero would turn it into a dense matrix. Therefore, the API does not allow this combination, by default False
scale_mean (bool, optional) – Whether to standardize the feature matrix to have a mean value of zero per feature (center the features before scaling). As laid out in sparse_matrix, scale_mean=False is required when using sparse_matrix=True, since centering the feature matrix would decrease the sparsity and defeat the purpose of using a sparse matrix. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used; otherwise it is None, by default False
scale_std (bool, optional) – Whether to scale the feature matrix to have unit variance (or equivalently, unit standard deviation) per feature. The StandardScaler object can be accessed via cls.scaler_ if scale_mean or scale_std is used; otherwise it is None, by default False
importance_type (str, optional) – Importance type of xgboost.train(), with possible values "weight", "gain", "total_gain", "cover", "total_cover", by default “total_gain”
verbose (bool, optional) – Whether to show the HyperOpt Optimization progress at each iteration, by default True
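The reason sparse_matrix and scale_mean cannot be combined can be seen on a single feature column. The sketch below uses plain Python lists with made-up values rather than a real CSR matrix; the helper names are hypothetical.

```python
from typing import List


def center(column: List[float]) -> List[float]:
    """Subtract the column mean from every entry."""
    mean = sum(column) / len(column)
    return [value - mean for value in column]


def count_nonzero(column: List[float]) -> int:
    """Number of entries a sparse format would have to store."""
    return sum(1 for value in column if value != 0.0)


# A mostly-zero feature column (illustrative values only).
sparse_column = [0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 3.0]
centered_column = center(sparse_column)

nonzero_before = count_nonzero(sparse_column)
nonzero_after = count_nonzero(centered_column)
```

After centering, every former zero becomes the negative of the column mean, so the column no longer has any zeros to skip and a sparse representation brings no benefit.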
- best_params_#
Returns the tuned hyper-parameters as a dictionary
- results_#
Returns all the optimization trials as results
References
- __slots__ = []#
- early_stopping_rounds :Optional[int] = 20#
- importance_type :Optional[str] = total_gain#
- metrics :Optional[str] = auc#
- n_iter :Optional[int] = 100#
- n_splits :Optional[int] = 4#
- num_boost_round :Optional[int] = 200#
- objective :Optional[str] = binary:logistic#
- params :Optional[Dict[str, Union[str, float, int]]]#
- params_bounds :Optional[Dict[str, Any]]#
- random_state :Optional[int] = 1367#
- scale_mean :Optional[bool] = False#
- scale_std :Optional[bool] = False#
- shuffle :Optional[bool] = True#
- sparse_matrix :Optional[bool] = False#
- stratified :Optional[bool] = True#
- verbose :Optional[bool] = True#
- __getstate__()#
- __repr__(N_CHAR_MAX=700)#
Return repr(self).
- __setstate__(state)#
- fit(X: Union[pandas.DataFrame, numpy.ndarray], y: Union[List[float], numpy.ndarray, pandas.Series]) None [source]#
Fits the main hyper-parameter tuning algorithm.
Notes
At each iteration, one set of parameters is sampled from params_bounds and evaluated based on the cross-validation results. The HyperOpt optimizer always minimizes the objective. Therefore, care is needed when self.metrics is a metric that is supposed to be maximized, e.g. auc; for those, (-1) * metric is minimized.
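The sign convention for a minimizing optimizer can be sketched as follows. This is an illustration under stated assumptions, not SlickML's objective function: the helper to_loss is hypothetical, the scores are made up, and the {"loss": ..., "status": "ok"} dictionary is intended to mirror the result format that HyperOpt objective functions conventionally return.

```python
from typing import Any, Dict, List, Tuple

# Metrics where higher is better; names taken from the class notes.
MAXIMIZED_METRICS = {"auc", "aucpr"}


def to_loss(metric: str, cv_score: float) -> Dict[str, Any]:
    """Turn a cross-validated score into a loss that a minimizer can use."""
    loss = -cv_score if metric in MAXIMIZED_METRICS else cv_score
    return {"loss": loss, "status": "ok"}


# Made-up cross-validated AUC values from three hypothetical trials.
trial_scores: List[Tuple[str, float]] = [
    ("auc", 0.81),
    ("auc", 0.87),
    ("auc", 0.84),
]
results = [to_loss(metric, score) for metric, score in trial_scores]

# The trial with the smallest loss is the one with the largest AUC.
best = min(results, key=lambda result: result["loss"])
```

Metrics that are already lower-is-better, such as rmse, pass through unchanged in this sketch.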
- Parameters:
X (Union[pd.DataFrame, np.ndarray]) – Input data for training (features)
y (Union[List[float], np.ndarray, pd.Series]) – Input ground truth for training (targets)
- Returns:
None
- get_best_params() Dict[str, Union[str, float, int]] [source]#
Returns the tuned results of the optimization as the best set of hyper-parameters.
- Returns:
Dict[str, Union[str, float, int]]
- get_params(deep=True)#
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params (dict) – Parameter names mapped to their values.
- get_params_bounds() Optional[Dict[str, Any]] [source]#
Returns the hyper-parameters boundaries for the tuning process.
- Returns:
Dict[str, Any]
- get_results() List[Dict[str, Any]] [source]#
Returns the results of all trials.
- Returns:
List[Dict[str, Any]]
- get_trials() hyperopt.Trials [source]#
Returns the Trials object passed to the optimizer.
- Returns:
hyperopt.Trials
- set_params(**params)#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self (estimator instance) – Estimator instance.