Black-Box-Model
The black-box-model wrapper supports two use cases:
using pre-trained PyTorch or TensorFlow models, or
plugging in a user-specified model by inheriting from the abstract class.
Example implementations for both use cases can be found in the Examples section.
Interface
- class models.api.mlmodel.MLModel(data)
Abstract class for implementing a custom black-box model for a given dataset, including its encoding and scaling processing.
- Parameters
- data: Data
Dataset inherited from Data-wrapper
- Returns
- None
- Attributes
backend
Describes the type of backend which is used for the classifier.
data
Contains the data.api.Data dataset.
feature_input_order
Saves the required order of features as list.
raw_model
Contains the raw ML model built on its framework
Methods
predict:
One-dimensional prediction of ml model for an output interval of [0, 1].
predict_proba:
Two-dimensional probability prediction of ml model.
- abstract property backend
Describes the type of backend which is used for the classifier.
E.g., tensorflow, pytorch, sklearn, xgboost
- Returns
- str
- property data: carla.data.api.Data
Contains the data.api.Data dataset.
- Returns
- carla.data.Data
- Return type
Data
- abstract property feature_input_order
Saves the required order of features as a list.
Prevents confusion about the correct order of input features during evaluation.
- Returns
- list of str
- get_mutable_mask()
Get the mask of mutable features.
For example, with mutable feature “income” and immutable feature “age”, the mask is [True, False] for feature_input_order [“income”, “age”].
This mask can then be used to index data so that only the (im)mutable columns are selected.
- Returns
- mutable_mask: np.array(bool)
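The masking idea can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name mutable_mask and its immutables argument are assumptions made for the example.

```python
import numpy as np


def mutable_mask(feature_input_order, immutables):
    """Hypothetical helper: True where a feature may be changed."""
    immutable_set = set(immutables)
    return np.array([name not in immutable_set for name in feature_input_order])


feature_input_order = ["income", "age"]
mask = mutable_mask(feature_input_order, immutables=["age"])

# Index a two-dimensional array of instances with the mask.
row = np.array([[52000.0, 41.0]])
mutable_part = row[:, mask]      # columns that may be changed
immutable_part = row[:, ~mask]   # columns that must stay fixed
```

Negating the mask with ~ selects the immutable columns, which is why a single boolean array is enough for both cases.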
- get_ordered_features(x)
Restores the correct input feature order for the ML model. This also drops any columns that are not in the feature order, so the target column and possibly other features (e.g. categorical ones) are removed.
Only works for encoded data.
- Parameters
- x: pd.DataFrame
Data we want to order
- Returns
- output: pd.DataFrame
Whole DataFrame with ordered features
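A rough sketch of the reordering step, assuming plain pandas column selection. The function order_features and the column names are hypothetical and only illustrate the behaviour described above.

```python
import pandas as pd


def order_features(x, feature_input_order):
    """Hypothetical helper: keep only the model's input columns, in model order."""
    missing = [c for c in feature_input_order if c not in x.columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    # Selecting with a list of names both reorders and drops the other columns.
    return x[feature_input_order]


encoded = pd.DataFrame(
    {
        "target": [0, 1],
        "age": [0.2, 0.7],
        "income": [0.5, 0.1],
        "sex_raw": ["f", "m"],  # a column the model does not consume
    }
)
ordered = order_features(encoded, ["income", "age"])
```

Here the target column and the unused column are dropped, and the remaining columns follow the order given by the feature list.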
- abstract predict(x)
One-dimensional prediction of the ML model for an output interval of [0, 1].
The input must always be two-dimensional (e.g., (1, m) or (n, m)).
- Parameters
- x: np.ndarray or pd.DataFrame
Tabular data of shape N x M (N number of instances, M number of features)
- Returns
- iterable object
ML model prediction for interval [0, 1] with shape N x 1
- abstract predict_proba(x)
Two-dimensional probability prediction of the ML model.
The input must always be two-dimensional (e.g., (1, m) or (n, m)).
- Parameters
- x: np.ndarray or pd.DataFrame
Tabular data of shape N x M (N number of instances, M number of features)
- Returns
- iterable object
ML model prediction with shape N x 2
- abstract property raw_model
Contains the raw ML model built on its framework
- Returns
- object
Classifier, depending on used framework
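To make the contract of the abstract members concrete, the following sketch implements them on a stand-in base class. BlackBoxModel and TinyLogisticModel are hypothetical names and do not come from the library; a real custom model would inherit from MLModel and pass it a dataset. The backend string "numpy" is likewise only illustrative.

```python
from abc import ABC, abstractmethod

import numpy as np


class BlackBoxModel(ABC):
    """Hypothetical stand-in mirroring the abstract members listed above."""

    @property
    @abstractmethod
    def backend(self): ...

    @property
    @abstractmethod
    def feature_input_order(self): ...

    @property
    @abstractmethod
    def raw_model(self): ...

    @abstractmethod
    def predict(self, x): ...

    @abstractmethod
    def predict_proba(self, x): ...


class TinyLogisticModel(BlackBoxModel):
    """Fixed-weight logistic model used only to show the expected shapes."""

    def __init__(self, weights, bias, feature_names):
        self._weights = np.asarray(weights, dtype=float)
        self._bias = float(bias)
        self._features = list(feature_names)

    @property
    def backend(self):
        return "numpy"  # illustrative value, not one of the documented backends

    @property
    def feature_input_order(self):
        return self._features

    @property
    def raw_model(self):
        return (self._weights, self._bias)

    def predict(self, x):
        x = np.asarray(x, dtype=float)
        if x.ndim != 2:
            raise ValueError("input must be two-dimensional, e.g. (1, m) or (n, m)")
        logits = x @ self._weights + self._bias
        # Probability of the positive class in [0, 1], shape N x 1.
        return (1.0 / (1.0 + np.exp(-logits))).reshape(-1, 1)

    def predict_proba(self, x):
        positive = self.predict(x)
        # Columns: probability of class 0, probability of class 1; shape N x 2.
        return np.hstack([1.0 - positive, positive])


model = TinyLogisticModel([1.0, -1.0], 0.0, ["income", "age"])
x = np.array([[0.0, 0.0], [2.0, 1.0]])
scores = model.predict(x)
probabilities = model.predict_proba(x)
```

Deriving predict_proba from predict keeps the two outputs consistent, so each row of the N x 2 result sums to one.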
Catalog
- class models.catalog.catalog.MLModelCatalog(data, model_type, backend, cache=True, models_home=None, load_online=True, **kws)
Use pretrained classifier.
- Parameters
- data: data.catalog.DataCatalog
Correct dataset for the ML model.
- model_type: {‘ann’, ‘linear’, ‘forest’}
The model architecture: Artificial Neural Network, Logistic Regression, and Random Forest, respectively.
- backend: {‘tensorflow’, ‘pytorch’, ‘sklearn’, ‘xgboost’}
Specifies the framework used. Tensorflow and PyTorch only support ‘ann’ and ‘linear’. Sklearn and Xgboost only support ‘forest’.
- cache: boolean, default: True
If True, try to load from the local cache first, and save to the cache if a download is required.
- models_home: string, optional
The directory in which to cache data; see get_models_home().
- kws: keys and values, optional
Additional keyword arguments are passed through to the read model function.
- load_online: bool, default: True
If True, a pretrained model is loaded. If False, a model is trained.
- Returns
- None
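The backend and model_type constraints stated in the parameter list can be written down as a small lookup. This is only a sketch of the documented rule; check_combination and SUPPORTED_MODEL_TYPES are hypothetical names, and the catalog's own validation may differ.

```python
# Supported model types per backend, taken from the parameter description above.
SUPPORTED_MODEL_TYPES = {
    "tensorflow": {"ann", "linear"},
    "pytorch": {"ann", "linear"},
    "sklearn": {"forest"},
    "xgboost": {"forest"},
}


def check_combination(model_type, backend):
    """Hypothetical helper: raise ValueError for an unsupported pairing."""
    if backend not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"unknown backend: {backend!r}")
    if model_type not in SUPPORTED_MODEL_TYPES[backend]:
        raise ValueError(f"backend {backend!r} does not support model type {model_type!r}")
    return True


ann_on_pytorch = check_combination("ann", "pytorch")
forest_on_sklearn = check_combination("forest", "sklearn")
```

Checking the pairing before constructing a model gives an early, readable error for combinations such as ‘forest’ with ‘tensorflow’.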
- Attributes
backend
Describes the type of backend which is used for the ml model.
feature_input_order
Saves the required order of features as a list.
model_type
Describes the model type
raw_model
Returns the raw ML model built on its framework
tree_iterator
A method needed specifically for tree methods.
Methods
predict:
One-dimensional prediction of ml model for an output interval of [0, 1].
predict_proba:
Two-dimensional probability prediction of ml model
- property backend: str
Describes the type of backend which is used for the ML model.
E.g., tensorflow, pytorch, sklearn, …
- Returns
- backend: str
Used framework
- Return type
str
- property feature_input_order: List[str]
Saves the required order of features as a list.
Prevents confusion about the correct order of input features during evaluation.
- Returns
- ordered_features: list of str
Correct order of input features for the ML model
- Return type
List[str]
- property model_type: str
Describes the model type.
E.g., ann, linear
- Returns
- model_type: str
Model type
- Return type
str
- predict(x)
One-dimensional prediction of the ML model for an output interval of [0, 1].
The input must always be two-dimensional (e.g., (1, m) or (n, m)).
- Parameters
- x: np.ndarray, pd.DataFrame, or backend specific (tensorflow or pytorch tensor)
Tabular data of shape N x M (N number of instances, M number of features)
- Returns
- output: np.ndarray, or backend specific (tensorflow or pytorch tensor)
ML model prediction for interval [0, 1] with shape N x 1
- Return type
Union[ndarray, DataFrame, Tensor, Tensor]
- predict_proba(x)
Two-dimensional probability prediction of the ML model.
The input must always be two-dimensional (e.g., (1, m) or (n, m)).
- Parameters
- x: np.ndarray, pd.DataFrame, or backend specific (tensorflow or pytorch tensor)
Tabular data of shape N x M (N number of instances, M number of features)
- Returns
- output: np.ndarray, or backend specific (tensorflow or pytorch tensor)
ML model prediction with shape N x 2
- Return type
Union[ndarray, DataFrame, Tensor, Tensor]
- property raw_model: Any
Returns the raw ML model built on its framework.
- Returns
- ml_model: tensorflow, pytorch, or sklearn model type
Loaded model
- Return type
Any
- train(learning_rate=None, epochs=None, batch_size=None, force_train=False, hidden_size=[18, 9, 3], n_estimators=5, max_depth=5)
- Parameters
- learning_rate: float
Learning rate for the training.
- epochs: int
Number of epochs to train for.
- batch_size: int
Number of samples in each batch
- force_train: bool
Force training, even if model already exists in cache.
- hidden_size: list[int]
hidden_size[i] contains the number of nodes in hidden layer i.
- n_estimators: int
Number of estimators in forest.
- max_depth: int
Max depth of trees in the forest.
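One way to read the hidden_size parameter is as the list of hidden-layer widths of a fully connected network. The sketch below turns the default [18, 9, 3] into (inputs, outputs) pairs per layer. The helper layer_shapes, the input width of 13, and the output width of 2 are assumptions made for illustration; the actual network construction is not shown in this reference.

```python
def layer_shapes(input_dim, hidden_size, output_dim=2):
    """Hypothetical helper: (inputs, outputs) of each dense layer in order."""
    sizes = [input_dim, *hidden_size, output_dim]
    # Pair each layer width with the next one to get the weight-matrix shapes.
    return list(zip(sizes[:-1], sizes[1:]))


# Default hidden_size from the train() signature; 13 input features is assumed.
shapes = layer_shapes(input_dim=13, hidden_size=[18, 9, 3])
```

Under these assumptions the network has three hidden layers with 18, 9, and 3 nodes, followed by a two-node output layer.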
- property tree_iterator
A method needed specifically for tree methods. This method should return a list of individual trees that make up the forest.