Black-Box-Model

This black-box-model wrapper makes it possible to either

  • use pre-trained PyTorch or TensorFlow models or

  • use user-specified models by inheriting from the abstract class.

Example implementations for both use-cases can be found in our section Examples.

Interface

class models.api.mlmodel.MLModel(data, scaling_method='MinMax', encoding_method='OneHot')

Abstract class for implementing a custom black-box model for a given dataset, including encoding and scaling of the data.

Parameters
data: Data

Dataset inherited from Data-wrapper

scaling_method: str, default: MinMax

Type of used sklearn scaler. Can be set with property setter to any sklearn scaler.

encoding_method: str, default: OneHot

Type of one-hot encoding, one of [OneHot, OneHot_drop_binary]. The OneHot_drop_binary variant additionally drops one column for binary features. Can be set with property setter to any sklearn encoder.

Returns
None
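The difference between the two encoding options can be illustrated without sklearn. The following is a minimal sketch, assuming that OneHot_drop_binary keeps a single indicator column for features with exactly two categories; the helper `one_hot` and the sample rows are hypothetical and not part of the library.

```python
def one_hot(rows, feature, drop_binary=False):
    """Encode one categorical feature as 0/1 indicator columns."""
    categories = sorted({row[feature] for row in rows})
    # With drop_binary, a two-category feature keeps only its second category
    # (assumption: the first category is the one that gets dropped).
    if drop_binary and len(categories) == 2:
        categories = categories[1:]
    return [
        {f"{feature}_{cat}": int(row[feature] == cat) for cat in categories}
        for row in rows
    ]


rows = [{"sex": "Male"}, {"sex": "Female"}, {"sex": "Male"}]
full = one_hot(rows, "sex")                       # two columns per row
reduced = one_hot(rows, "sex", drop_binary=True)  # one column per row
```

For a binary feature the dropped column carries no extra information, since it is always one minus the remaining column.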
Attributes
backend

Describes the type of backend which is used for the classifier.

data

Contains the data.api.Data dataset.

encoder

Contains a fitted sklearn encoder.

feature_input_order

Stores the required order of features as a list.

raw_model

Contains the raw ml model built on its framework

scaler

Contains a fitted sklearn scaler.

Methods

predict:

One-dimensional prediction of ml model for an output interval of [0, 1].

predict_proba:

Two-dimensional probability prediction of ml model

abstract property backend

Describes the type of backend which is used for the classifier.

E.g., tensorflow, pytorch, sklearn, …

Returns
str
property data: carla.data.api.Data

Contains the data.api.Data dataset.

Returns
carla.data.Data
Return type

Data

property encoder: sklearn.base.BaseEstimator

Contains a fitted sklearn encoder.

Returns
sklearn.base.BaseEstimator
Return type

BaseEstimator

abstract property feature_input_order

Stores the required order of features as a list.

Prevents confusion about the correct order of input features during evaluation.

Returns
list of str
abstract predict(x)

One-dimensional prediction of ml model for an output interval of [0, 1].

The input always has to be two-dimensional (e.g., (1, m), (n, m)).

Parameters
x: np.ndarray or pd.DataFrame

Tabular data of shape N x M (N number of instances, M number of features)

Returns
iterable object

ML model prediction in the interval [0, 1] with shape N x 1

abstract predict_proba(x)

Two-dimensional probability prediction of ml model

The input always has to be two-dimensional (e.g., (1, m), (n, m)).

Parameters
x: np.ndarray or pd.DataFrame

Tabular data of shape N x M (N number of instances, M number of features)

Returns
iterable object

Ml model prediction with shape N x 2
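The shape conventions of predict and predict_proba can be sketched with plain nested lists. This is only an illustration: the helpers `as_2d` and `to_two_columns` are hypothetical, and the column order [P(class 0), P(class 1)] is an assumption that the text above does not state.

```python
def as_2d(x):
    """Wrap a single instance [v1, ..., vM] into shape (1, M); copy N x M input."""
    if x and not isinstance(x[0], (list, tuple)):
        return [list(x)]
    return [list(row) for row in x]


def to_two_columns(scores):
    """Turn N x 1 scores in [0, 1] into N x 2 rows [1 - p, p].

    Assumption: the second column holds the positive-class probability.
    """
    return [[1.0 - row[0], row[0]] for row in scores]


single = as_2d([0.2, 0.7, 1.0])                    # shape (1, 3)
batch = as_2d([[0.2, 0.7, 1.0], [0.5, 0.1, 0.0]])  # shape (2, 3)
proba = to_two_columns([[0.25], [0.9]])            # shape (2, 2)
```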

abstract property raw_model

Contains the raw ml model built on its framework

Returns
object

Classifier, depending on used framework

property scaler: sklearn.base.BaseEstimator

Contains a fitted sklearn scaler.

Returns
sklearn.base.BaseEstimator
Return type

BaseEstimator
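A user-specified model has to provide the abstract members listed above. The sketch below shows the pattern with a stand-in abstract class so that it runs without the library; `ModelInterface` and `LogisticModel` are hypothetical names, the backend label is a placeholder, and the real abstract class additionally takes `data`, `scaling_method` and `encoding_method` in its constructor, which this stand-in omits.

```python
import math
from abc import ABC, abstractmethod


class ModelInterface(ABC):
    """Stand-in mirroring the documented abstract members (not the library class)."""

    @property
    @abstractmethod
    def feature_input_order(self): ...

    @property
    @abstractmethod
    def backend(self): ...

    @property
    @abstractmethod
    def raw_model(self): ...

    @abstractmethod
    def predict(self, x): ...

    @abstractmethod
    def predict_proba(self, x): ...


class LogisticModel(ModelInterface):
    """Hypothetical user-specified model: a fixed logistic regression."""

    def __init__(self, weights, bias):
        self._weights = weights  # dict: feature name -> coefficient
        self._bias = bias

    @property
    def feature_input_order(self):
        return list(self._weights)

    @property
    def backend(self):
        return "custom"  # placeholder label, not a documented backend value

    @property
    def raw_model(self):
        return {"weights": self._weights, "bias": self._bias}

    def predict(self, x):
        # x is N x M with columns in feature_input_order; output is N x 1.
        coefs = [self._weights[name] for name in self.feature_input_order]
        out = []
        for row in x:
            z = self._bias + sum(c * v for c, v in zip(coefs, row))
            out.append([1.0 / (1.0 + math.exp(-z))])
        return out

    def predict_proba(self, x):
        # Output is N x 2, assumed here to be [P(class 0), P(class 1)] per row.
        return [[1.0 - p[0], p[0]] for p in self.predict(x)]


model = LogisticModel({"age": 0.5, "income": -0.25}, bias=0.0)
scores = model.predict([[2.0, 4.0], [1.0, 0.0]])
```

Deriving `feature_input_order` from the weight dictionary keeps the column order and the coefficients in one place, which is one way to avoid the ordering confusion the property is meant to prevent.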

Catalog

class models.catalog.catalog.MLModelCatalog(data, model_type, backend='tensorflow', cache=True, models_home=None, use_pipeline=False, load_pretrained=True, **kws)

Use pretrained classifier.

Parameters
data: data.catalog.DataCatalog

Correct dataset for ML model.

model_type: {‘ann’, ‘linear’}

Architecture.

backend: {‘tensorflow’, ‘pytorch’}

Specifies the used framework.

cache: boolean, default: True

If True, try to load from the local cache first, and save to the cache if a download is required.

models_home: string, optional

The directory in which to cache data; see get_models_home().

kws: keys and values, optional

Additional keyword arguments are passed through to the read model function.

use_pipeline: bool, default: False

If True, the model uses a pipeline before predict and predict_proba to preprocess the input data.

load_pretrained: bool, default: True

If True, a pretrained model is loaded. If False, a model is trained.

Returns
None
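The behaviour described for `cache` and `models_home` follows a common load-or-download pattern. The sketch below illustrates that pattern only; `load_model_bytes`, `fake_fetch` and the file name are hypothetical and do not reflect the library's actual loading code.

```python
import tempfile
from pathlib import Path


def load_model_bytes(name, models_home, fetch, cache=True):
    """Return the model file content, preferring the local cache when enabled.

    `fetch` is a hypothetical callable that downloads the file content.
    """
    path = Path(models_home) / name
    if cache and path.exists():
        return path.read_bytes()  # cache hit: no download needed
    content = fetch(name)
    if cache:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(content)  # save so that the next call hits the cache
    return content


calls = []


def fake_fetch(name):
    # Stands in for a download and records every call.
    calls.append(name)
    return b"model-weights"


with tempfile.TemporaryDirectory() as home:
    first = load_model_bytes("ann.bin", home, fake_fetch)
    second = load_model_bytes("ann.bin", home, fake_fetch)
```

In this run the second call is served from the cache directory, so `fake_fetch` is invoked only once.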
Attributes
backend

Describes the type of backend which is used for the ml model.

feature_input_order

Stores the required order of features as a list.

model_type

Describes the model type

pipeline

Returns the transformation steps applied to the input before predictions.

raw_model

Returns the raw ml model built on its framework

use_pipeline

Returns whether the ML model uses the pipeline for predictions

Methods

predict:

One-dimensional prediction of ml model for an output interval of [0, 1].

predict_proba:

Two-dimensional probability prediction of ml model

get_pipeline_element:

Returns a specific element of the pipeline

perform_pipeline:

Transforms input for prediction into correct form.

property backend: str

Describes the type of backend which is used for the ml model.

E.g., tensorflow, pytorch, sklearn, …

Returns
backend: str

Used framework

Return type

str

property feature_input_order: List[str]

Stores the required order of features as a list.

Prevents confusion about the correct order of input features during evaluation.

Returns
ordered_features: list of str

Correct order of input features for ml model

Return type

List[str]

get_pipeline_element(key)

Returns a specific element of the pipeline

Parameters
key: str

Element of the pipeline we want to return

Returns
Pipeline element
Return type

Callable

property model_type: str

Describes the model type

E.g., ann, linear

Returns
model_type: str

Model type

Return type

str

perform_pipeline(df)

Transforms input for prediction into the correct form. Only possible for DataFrames that have not been preprocessed yet.

Recommended for keeping encodings, normalization and input order correct.

Parameters
df: pd.DataFrame

Contains unnormalized and unencoded data.

Returns
output: pd.DataFrame

Prediction input in correct order, normalized and encoded

Return type

DataFrame

property pipeline: List[Tuple[str, Callable]]

Returns the transformation steps applied to the input before predictions.

Returns
pipeline: list

List of (name, transform) tuples that are chained in the order in which they are performed.

Return type

List[Tuple[str, Callable]]
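The pipeline structure, a list of (name, transform) tuples applied in order, can be sketched as follows. The sketch works on a single dictionary row instead of a DataFrame to stay dependency-free; the step names, the scaling bounds and the helpers `get_element` and `perform` are assumptions made for illustration.

```python
def scale_age(row):
    # Min-max scaling with assumed bounds 0 and 100.
    return {**row, "age": row["age"] / 100.0}


def encode_sex(row):
    # Replace the categorical column with one indicator column.
    out = {k: v for k, v in row.items() if k != "sex"}
    out["sex_Male"] = int(row["sex"] == "Male")
    return out


def order_features(row):
    # Arrange columns in the order the model expects.
    return {name: row[name] for name in ("age", "sex_Male")}


pipeline = [("scaler", scale_age), ("encoder", encode_sex), ("order", order_features)]


def get_element(pipeline, key):
    """Look up one transform by its name."""
    return dict(pipeline)[key]


def perform(pipeline, row):
    """Apply every transform in list order."""
    for _name, transform in pipeline:
        row = transform(row)
    return row


result = perform(pipeline, {"sex": "Male", "age": 40})
```

Keeping the steps as named tuples means that a single step, such as the encoder, can be retrieved and reused on its own, which mirrors the purpose of get_pipeline_element.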

predict(x)

One-dimensional prediction of ml model for an output interval of [0, 1]

The input always has to be two-dimensional (e.g., (1, m), (n, m)).

Parameters
x: np.ndarray, pd.DataFrame, or backend specific (tensorflow or pytorch tensor)

Tabular data of shape N x M (N number of instances, M number of features)

Returns
output: np.ndarray, or backend specific (tensorflow or pytorch tensor)

ML model prediction in the interval [0, 1] with shape N x 1

Return type

Union[ndarray, DataFrame, Tensor, Tensor]

predict_proba(x)

Two-dimensional probability prediction of ml model

The input always has to be two-dimensional (e.g., (1, m), (n, m)).

Parameters
x: np.ndarray, pd.DataFrame, or backend specific (tensorflow or pytorch tensor)

Tabular data of shape N x M (N number of instances, M number of features)

Returns
output: np.ndarray, or backend specific (tensorflow or pytorch tensor)

ML model prediction with shape N x 2

Return type

Union[ndarray, DataFrame, Tensor, Tensor]

property raw_model: Any

Returns the raw ml model built on its framework

Returns
ml_model: tensorflow, pytorch, sklearn model type

Loaded model

Return type

Any

train(learning_rate, epochs, batch_size)
Parameters
learning_rate: float

Learning rate for the training.

epochs: int

Number of epochs to train for.

batch_size: int

Number of samples in each batch
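How the three training arguments typically interact can be sketched with mini-batch gradient descent on a small logistic regression. This is a pure-Python stand-in and not the library's training routine; the function body, the `seed` argument and the toy data are assumptions.

```python
import math
import random


def train(rows, labels, learning_rate, epochs, batch_size, seed=0):
    """Fit logistic-regression weights with mini-batch gradient descent."""
    rng = random.Random(seed)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    indices = list(range(len(rows)))
    for _ in range(epochs):  # one epoch is one pass over the data
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = [0.0] * len(weights)
            grad_b = 0.0
            for i in batch:
                z = bias + sum(w * v for w, v in zip(weights, rows[i]))
                error = 1.0 / (1.0 + math.exp(-z)) - labels[i]
                grad_w = [g + error * v for g, v in zip(grad_w, rows[i])]
                grad_b += error
            # Step against the batch-averaged gradient, scaled by the learning rate.
            weights = [
                w - learning_rate * g / len(batch) for w, g in zip(weights, grad_w)
            ]
            bias -= learning_rate * grad_b / len(batch)
    return weights, bias


rows = [[0.0], [0.2], [0.8], [1.0]]
labels = [0, 0, 1, 1]
weights, bias = train(rows, labels, learning_rate=0.5, epochs=200, batch_size=2)
```

A smaller batch_size gives more parameter updates per epoch, while learning_rate controls how far each update moves the parameters.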

property use_pipeline: bool

Returns whether the ML model uses the pipeline for predictions

Returns
bool
Return type

bool