Contents

Black-Box-Model
- Interface
- Catalog

Black-Box-Model

The black-box-model wrapper supports two options:

- use pre-trained PyTorch or TensorFlow models, or
- define a user-specified model by inheriting from the abstract class.

Example implementations for both use-cases can be found in our section Examples.

Interface

class models.api.mlmodel.MLModel(data)

Abstract class for implementing a custom black-box model for a given dataset, including its encoding and scaling processing.

Parameters

data: Data: Dataset inherited from Data-wrapper

Returns

None

Attributes

backend: Describes the type of backend which is used for the classifier.
data: Contains the data.api.Data dataset.
feature_input_order: Saves the required order of features as list.
raw_model: Contains the raw ML model built on its framework

Methods

predict:	One-dimensional prediction of ml model for an output interval of [0, 1].
predict_proba:	Two-dimensional probability prediction of ml model.

abstract property backend

Describes the type of backend which is used for the classifier.

E.g., tensorflow, pytorch, sklearn, xgboost

Returns

str

property data: carla.data.api.Data

Contains the data.api.Data dataset.

Returns

carla.data.Data

Return type: Data

abstract property feature_input_order

Saves the required order of features as list.

Prevents confusion about correct order of input features in evaluation

Returns

list of str

get_mutable_mask()

Get mask of mutable features.

For example, with the mutable feature “income” and the immutable feature “age”, the mask would be [True, False] for feature_input_order [“income”, “age”].

This mask can then be used to index data to only get the columns that are (im)mutable.

Returns

mutable_mask: np.array(bool)
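
The masking idea described above can be sketched as follows. This is only an illustration under stated assumptions: the immutable features are assumed to be given as a plain list of names, the variables `immutables` and `row` are hypothetical, and Python lists stand in for the NumPy boolean array that the method returns.

```python
# Hypothetical inputs mirroring the "income"/"age" example above.
feature_input_order = ["income", "age"]
immutables = ["age"]  # assumed to be provided as a list of feature names

# True where the feature may be changed, False where it is fixed.
mutable_mask = [name not in immutables for name in feature_input_order]

# Use the mask to split one encoded instance into its two parts.
row = [52000.0, 41.0]
mutable_values = [v for v, keep in zip(row, mutable_mask) if keep]
immutable_values = [v for v, keep in zip(row, mutable_mask) if not keep]

print(mutable_mask)  # [True, False]
```

With a NumPy array `x` of shape N x M, a boolean NumPy mask of length M selects the same columns through `x[:, mask]`.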

get_ordered_features(x)

Restores the correct input feature order for the ML model. This also drops every column that is not in the feature order, i.e. the target column and possibly other features, e.g. categorical ones.

Only works for encoded data

Parameters

x: pd.DataFrame: Data we want to order

Returns

output: pd.DataFrame: Whole DataFrame with ordered features
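
The reordering and dropping behaviour can be illustrated with a dependency-free sketch. The helper `order_columns` and the column-dictionary representation are assumptions made for this example and are not part of the library.

```python
def order_columns(table, feature_input_order):
    """Return only the columns named in feature_input_order, in that order.

    `table` is a hypothetical stand-in for an encoded DataFrame:
    a dict mapping column names to lists of values.
    """
    missing = [name for name in feature_input_order if name not in table]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    return {name: table[name] for name in feature_input_order}


# Encoded data with a target column and features in the wrong order.
encoded = {
    "age": [41, 23],
    "target": [1, 0],
    "income": [52000.0, 18000.0],
}

ordered = order_columns(encoded, ["income", "age"])
print(list(ordered))  # ['income', 'age']
```

For a pandas DataFrame `x`, the equivalent column selection is `x[feature_input_order]`.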

abstract predict(x)

One-dimensional prediction of ml model for an output interval of [0, 1].

Shape of input dimension has to be always two-dimensional (e.g., (1, m), (n, m))

Parameters

x: np.ndarray or pd.DataFrame: Tabular data of shape N x M (N number of instances, M number of features)

Returns

iterable object: ML model prediction for interval [0, 1] with shape N x 1

abstract predict_proba(x)

Two-dimensional probability prediction of ml model.

Shape of input dimension has to be always two-dimensional (e.g., (1, m), (n, m))

Parameters

x: np.ndarray or pd.DataFrame: Tabular data of shape N x M (N number of instances, M number of features)

Returns

iterable object: ML model prediction with shape N x 2

abstract property raw_model

Contains the raw ML model built on its framework

Returns

object: Classifier, depending on used framework
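
The abstract members listed in this section can be sketched as a subclassing pattern. To keep the snippet runnable without the library, it uses a stand-in base class: `BlackBoxSketch` and `ToyLogisticModel` are hypothetical names, the backend string "custom" is a placeholder, and the [negative class, positive class] column order in `predict_proba` is an assumption. In practice one would inherit from MLModel and pass the dataset to its constructor.

```python
import math
from abc import ABC, abstractmethod


class BlackBoxSketch(ABC):
    """Hypothetical stand-in that mirrors the documented abstract members."""

    @property
    @abstractmethod
    def feature_input_order(self): ...

    @property
    @abstractmethod
    def backend(self): ...

    @property
    @abstractmethod
    def raw_model(self): ...

    @abstractmethod
    def predict(self, x): ...

    @abstractmethod
    def predict_proba(self, x): ...


class ToyLogisticModel(BlackBoxSketch):
    """Toy logistic model on plain Python lists (illustration only)."""

    def __init__(self, weights, bias):
        self._weights = weights  # dict: feature name -> coefficient
        self._bias = bias

    @property
    def feature_input_order(self):
        return list(self._weights)

    @property
    def backend(self):
        return "custom"  # placeholder value, not a documented backend

    @property
    def raw_model(self):
        return (self._weights, self._bias)

    def predict(self, x):
        # x has shape N x M; the result has shape N x 1 with values in [0, 1].
        order = self.feature_input_order
        out = []
        for row in x:
            z = self._bias + sum(self._weights[n] * v for n, v in zip(order, row))
            out.append([1.0 / (1.0 + math.exp(-z))])
        return out

    def predict_proba(self, x):
        # Shape N x 2; assumed column order: [negative class, positive class].
        return [[1.0 - p[0], p[0]] for p in self.predict(x)]


model = ToyLogisticModel({"income": 0.5, "age": -0.25}, bias=0.0)
x = [[2.0, 4.0], [1.0, 0.0]]  # always two-dimensional, even for one instance
probs = model.predict(x)
probas = model.predict_proba(x)
```

A single instance is passed as `[[2.0, 4.0]]`, i.e. with shape (1, m), to satisfy the two-dimensional input requirement.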

Catalog

class models.catalog.catalog.MLModelCatalog(data, model_type, backend, cache=True, models_home=None, load_online=True, **kws)

Use pretrained classifier.

Parameters

data: data.catalog.DataCatalog: Correct dataset for ML model.
model_type: {‘ann’, ‘linear’, ‘forest’}: The model architecture: Artificial Neural Network, Logistic Regression, or Random Forest, respectively.
backend: {‘tensorflow’, ‘pytorch’, ‘sklearn’, ‘xgboost’}: Specifies the framework used. Tensorflow and PyTorch only support ‘ann’ and ‘linear’. Sklearn and Xgboost only support ‘forest’.
cache: boolean, default: True: If True, try to load from the local cache first, and save to the cache if a download is required.
models_home: string, optional: The directory in which to cache data; see get_models_home().
kws: keys and values, optional: Additional keyword arguments are passed through to the read model function.
load_online: bool, default: True: If True, a pretrained model is loaded. If False, a model is trained.

Returns

None
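
The supported combinations of `backend` and `model_type` stated above can be written down as a small lookup. This is a sketch only: `SUPPORTED` and `check_combination` are hypothetical names introduced for illustration, and how the library itself validates these arguments is not described in this section.

```python
# Combinations as stated in the parameter descriptions above.
SUPPORTED = {
    "tensorflow": {"ann", "linear"},
    "pytorch": {"ann", "linear"},
    "sklearn": {"forest"},
    "xgboost": {"forest"},
}


def check_combination(model_type, backend):
    """Return True if the pair is supported, otherwise raise ValueError."""
    if backend not in SUPPORTED:
        raise ValueError(f"unknown backend: {backend!r}")
    if model_type not in SUPPORTED[backend]:
        allowed = sorted(SUPPORTED[backend])
        raise ValueError(
            f"backend {backend!r} does not support {model_type!r}; "
            f"choose one of {allowed}"
        )
    return True


check_combination("ann", "pytorch")      # supported
check_combination("forest", "sklearn")   # supported
```

Checking the pair before constructing the catalog model would give an early, readable error for a combination such as ‘forest’ with ‘tensorflow’.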

Attributes

backend: Describes the type of backend which is used for the ml model.
feature_input_order: Saves the required order of feature as list.
model_type: Describes the model type
raw_model: Returns the raw ML model built on its framework
tree_iterator: A method needed specifically for tree methods.

Methods

predict:	One-dimensional prediction of ml model for an output interval of [0, 1].
predict_proba:	Two-dimensional probability prediction of ml model

property backend: str

Describes the type of backend which is used for the ml model.

E.g., tensorflow, pytorch, sklearn, …

Returns

backend: str: Used framework

Return type: str

property feature_input_order: List[str]

Saves the required order of feature as list.

Prevents confusion about correct order of input features in evaluation

Returns

ordered_features: list of str: Correct order of input features for ML model

Return type: List[str]

property model_type: str

Describes the model type

E.g., ann, linear

Returns

model_type: str: Model type

Return type: str

predict(x)

One-dimensional prediction of ml model for an output interval of [0, 1]

Shape of input dimension has to be always two-dimensional (e.g., (1, m), (n, m))

Parameters

x: np.ndarray, pd.DataFrame, or backend specific (tensorflow or pytorch tensor): Tabular data of shape N x M (N number of instances, M number of features)

Returns

output: np.ndarray, or backend specific (tensorflow or pytorch tensor): ML model prediction for interval [0, 1] with shape N x 1

Return type: Union[ndarray, DataFrame, Tensor, Tensor]

predict_proba(x)

Two-dimensional probability prediction of ml model

Shape of input dimension has to be always two-dimensional (e.g., (1, m), (n, m))

Parameters

x: np.ndarray, pd.DataFrame, or backend specific (tensorflow or pytorch tensor): Tabular data of shape N x M (N number of instances, M number of features)

Returns

output: np.ndarray, or backend specific (tensorflow or pytorch tensor): ML model prediction with shape N x 2

Return type: Union[ndarray, DataFrame, Tensor, Tensor]

property raw_model: Any

Returns the raw ML model built on its framework

Returns

ml_model: tensorflow, pytorch, sklearn model type: Loaded model

Return type: Any

train(learning_rate=None, epochs=None, batch_size=None, force_train=False, hidden_size=[18, 9, 3], n_estimators=5, max_depth=5)

Parameters

learning_rate: float: Learning rate for the training.
epochs: int: Number of epochs to train for.
batch_size: int: Number of samples in each batch
force_train: bool: Force training, even if model already exists in cache.
hidden_size: list[int]: hidden_size[i] contains the number of nodes in layer [i]
n_estimators: int: Number of estimators in forest.
max_depth: int: Max depth of trees in the forest.
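
To make the meaning of `hidden_size` concrete, the following sketch derives the weight-matrix shapes of a fully connected network from a list of layer widths. The helper `layer_shapes`, the input dimension of 13, and the output dimension of 2 are assumptions chosen for illustration; the architecture actually built by the library may differ.

```python
def layer_shapes(input_dim, hidden_size, output_dim):
    """Pair up consecutive layer widths as (fan_in, fan_out) tuples."""
    sizes = [input_dim, *hidden_size, output_dim]
    return list(zip(sizes[:-1], sizes[1:]))


# hidden_size[i] is the number of nodes in hidden layer i.
shapes = layer_shapes(input_dim=13, hidden_size=[18, 9, 3], output_dim=2)
print(shapes)  # [(13, 18), (18, 9), (9, 3), (3, 2)]
```

With the default `hidden_size=[18, 9, 3]`, the network in this sketch has three hidden layers of 18, 9, and 3 nodes.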

property tree_iterator

A method needed specifically for tree methods. This method should return a list of individual trees that make up the forest.