
API Reference

predict(deployment, data_frame, max_explanations=0, threshold_high=None, threshold_low=None, time_series_type=TimeSeriesType.FORECAST, forecast_point=None, predictions_start_date=None, predictions_end_date=None, prediction_endpoint=None, timeout=600)

Get predictions using the DataRobot Prediction API.

Parameters:

Name Type Description Default
deployment Union[dr.Deployment, str, None]

DataRobot deployment to use when computing predictions. The deployment can also be specified by deployment id, or omitted when prediction_endpoint is set, e.g. when using Portable Prediction Server.

If dr.Deployment, the prediction server and deployment id will be taken from the deployment. If str, the argument is expected to be the deployment id. If None, no deployment id is used. This can be used for Portable Prediction Server single-model mode.

required
data_frame pd.DataFrame

Input data.

required
max_explanations Union[int, str]

Number of prediction explanations to compute. If 0, prediction explanations are disabled. If "all", all explanations will be computed. The "all" option is only available for SHAP explanations.

0
threshold_high Optional[float]

Only compute prediction explanations for predictions above this threshold. If None, the default value will be used.

None
threshold_low Optional[float]

Only compute prediction explanations for predictions below this threshold. If None, the default value will be used.

None
time_series_type TimeSeriesType

Type of time series predictions to compute. If TimeSeriesType.FORECAST, predictions will be computed for a single forecast point specified by forecast_point. If TimeSeriesType.HISTORICAL, predictions will be computed for the range of timestamps specified by predictions_start_date and predictions_end_date.

TimeSeriesType.FORECAST
forecast_point Optional[datetime.datetime]

Forecast point to use for time series forecast point predictions. If None, the forecast point is detected automatically. If not None and time_series_type is not TimeSeriesType.FORECAST, ValueError is raised.

None
predictions_start_date Optional[datetime.datetime]

Start date in range for historical predictions. Inclusive. If None, predictions will start from the earliest date in the input that has enough history. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised.

None
predictions_end_date Optional[datetime.datetime]

End date in range for historical predictions. Exclusive. If None, predictions will end on the last date in the input. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised.

None
prediction_endpoint Optional[str]

Specific prediction endpoint to use. This overrides any prediction server found in deployment. If None, prediction endpoint found in deployment will be used.

None
timeout int

Request timeout in seconds.

600

Returns:

Type Description
pd.DataFrame

Prediction output.
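The ValueError rules described for forecast_point, predictions_start_date and predictions_end_date can be sketched as below. This is a minimal illustration of the documented behavior, not the library's implementation: the TimeSeriesType enum is a local stand-in with assumed values, and check_time_series_args is a hypothetical helper.

```python
import datetime
import enum
from typing import Optional


class TimeSeriesType(enum.Enum):
    """Local stand-in for the library's TimeSeriesType (member values are assumed)."""

    FORECAST = 1
    HISTORICAL = 2


def check_time_series_args(
    time_series_type: TimeSeriesType,
    forecast_point: Optional[datetime.datetime] = None,
    predictions_start_date: Optional[datetime.datetime] = None,
    predictions_end_date: Optional[datetime.datetime] = None,
) -> None:
    """Hypothetical helper mirroring the documented ValueError rules."""
    if forecast_point is not None and time_series_type is not TimeSeriesType.FORECAST:
        raise ValueError("forecast_point requires TimeSeriesType.FORECAST")
    has_range = predictions_start_date is not None or predictions_end_date is not None
    if has_range and time_series_type is not TimeSeriesType.HISTORICAL:
        raise ValueError("a prediction date range requires TimeSeriesType.HISTORICAL")


# A forecast point combined with the default FORECAST type is accepted.
check_time_series_args(
    TimeSeriesType.FORECAST,
    forecast_point=datetime.datetime(2023, 1, 1),
)

# A date range combined with the FORECAST type is rejected.
try:
    check_time_series_args(
        TimeSeriesType.FORECAST,
        predictions_start_date=datetime.datetime(2023, 1, 1),
    )
except ValueError as exc:
    print(f"rejected: {exc}")
```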

predict_unstructured(deployment, data, content_type='text/plain', accept=None, timeout=600)

Get predictions for an unstructured model deployment.

Parameters:

Name Type Description Default
deployment dr.Deployment

Deployment used to compute predictions.

required
data Any

Data to send to the endpoint. This can be text, bytes or a file-like object. Anything that the python requests library accepts as data can be used.

required
content_type str

The content type for the data.

'text/plain'
accept Optional[str]

The mimetypes supported for the return value. If None, any mimetype is supported.

None
timeout int

Request timeout in seconds.

600

Returns:

Type Description
bytes

The response content.
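Because data accepts anything the requests library accepts as a request body, text, bytes and file-like objects are all valid payloads. The sketch below shows one plausible way content_type and accept could translate into HTTP headers. build_headers is a hypothetical helper, and the exact headers the library sends are an assumption.

```python
import io
from typing import Dict, Optional


def build_headers(content_type: str = "text/plain", accept: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: map the documented arguments to HTTP headers."""
    headers = {"Content-Type": content_type}
    # Assumption: leaving out the Accept header lets the server return any mimetype.
    if accept is not None:
        headers["Accept"] = accept
    return headers


# Three payload shapes that the requests library accepts as a request body.
text_payload = "some text"
bytes_payload = b"\x00\x01\x02"
file_payload = io.BytesIO(b"file contents")

print(build_headers())
print(build_headers("application/octet-stream", accept="application/json"))
```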

BaseScoringCodeModel

Bases: ABC

class_labels: Optional[Sequence[str]] property

Get the class labels for the model.

Returns:

Type Description
Optional[Sequence[str]]

List of class labels if model is a classification model, else None.

date_column: Optional[str] property

Get the date column for a Time Series model.

Returns:

Type Description
Optional[str]

Name of date column if model has one, else None.

date_format: Optional[str] property

Get the date format for a Time Series model.

Returns:

Type Description
Optional[str]

Date format in the syntax expected by datetime.strftime(), or None if the model is not a time series model.

feature_derivation_window: Optional[Tuple[int, int]] property

Get the feature derivation window for a Time Series model.

Returns:

Type Description
Optional[Tuple[int, int]]

Feature derivation window as (begin, end) if model has this, else None.

features: Dict[str, type] property

Get feature names and types for the model.

Returns:

Type Description
OrderedDict[str, type]

Dictionary mapping feature name to feature type, where feature type is either str or float. The ordering of features is the same as it was during model training.
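A mapping of this shape can be used to order and cast raw input values before scoring. The following is a minimal sketch under stated assumptions: coerce_row is a hypothetical helper, the feature names are made up, and the features mapping is built by hand rather than read from a model.

```python
from collections import OrderedDict
from typing import Any, Dict, List


def coerce_row(row: Dict[str, Any], features: "OrderedDict[str, type]") -> List[Any]:
    """Hypothetical helper: order and cast one input row to match the features mapping."""
    values = []
    for name, feature_type in features.items():
        raw = row.get(name)
        # Missing values stay None instead of being cast.
        values.append(None if raw is None else feature_type(raw))
    return values


# Example mapping in the documented shape (feature type is either str or float).
features = OrderedDict([("age", float), ("city", str)])

print(coerce_row({"city": "Oslo", "age": "42"}, features))  # prints [42.0, 'Oslo']
```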

forecast_window: Optional[Tuple[int, int]] property

Get the forecast window for a Time Series model.

Returns:

Type Description
Optional[Tuple[int, int]]

Forecast window as (begin, end) if model has this, else None.

model_id: str property

Get the model id.

Returns:

Type Description
str

The model id.

model_info: Optional[Dict[str, str]] property

Get model metadata.

Returns:

Type Description
Optional[Dict[str, str]]

Dictionary with metadata if model has any, else None.

model_type: ModelType property

Get the model type.

Returns:

Type Description
ModelType

One of: ModelType.CLASSIFICATION, ModelType.REGRESSION, ModelType.TIME_SERIES

series_id_column: Optional[str] property

Get the name of the series id column for a Time Series model.

Returns:

Type Description
Optional[str]

Name of the series id column if model has one, else None.

time_step: Optional[Tuple[int, str]] property

Get the time step for a Time Series model.

Returns:

Type Description
Optional[Tuple[int, str]]

Time step as (quantity, time unit) if model has this, else None. Example: (3, "DAYS")
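A (quantity, time unit) tuple such as (3, "DAYS") can be turned into a datetime.timedelta for date arithmetic. The sketch below assumes that unit names are upper-case plurals like the documented example; the set of supported units and the helper name time_step_to_timedelta are assumptions, not part of the library.

```python
import datetime
from typing import Tuple

# Assumption: unit names follow the "DAYS" pattern shown in the example above.
_SUPPORTED_UNITS = {"SECONDS", "MINUTES", "HOURS", "DAYS", "WEEKS"}


def time_step_to_timedelta(time_step: Tuple[int, str]) -> datetime.timedelta:
    """Hypothetical helper: convert a (quantity, unit) time step to a timedelta."""
    quantity, unit = time_step
    if unit not in _SUPPORTED_UNITS:
        # Calendar units such as months have no fixed length and need other handling.
        raise ValueError(f"Unsupported time unit: {unit}")
    return datetime.timedelta(**{unit.lower(): quantity})


print(time_step_to_timedelta((3, "DAYS")))  # prints "3 days, 0:00:00"
```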

ModelType

Bases: enum.Enum

CLASSIFICATION = 'IClassificationPredictor' class-attribute instance-attribute

Classification predictor

REGRESSION = 'IRegressionPredictor' class-attribute instance-attribute

Regression predictor

TIME_SERIES = 'ITimeSeriesRegressionPredictor' class-attribute instance-attribute

Time Series predictor
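Since each member carries a string value, a member can be looked up by that value and then used for branching. The sketch below redefines the enum locally with the documented values so that it runs without the library installed; needs_date_arguments is a hypothetical helper.

```python
import enum


class ModelType(enum.Enum):
    """Local copy of the documented enum, so the example runs standalone."""

    CLASSIFICATION = "IClassificationPredictor"
    REGRESSION = "IRegressionPredictor"
    TIME_SERIES = "ITimeSeriesRegressionPredictor"


def needs_date_arguments(model_type: ModelType) -> bool:
    """Hypothetical helper: only time series models use forecast or date range arguments."""
    return model_type is ModelType.TIME_SERIES


# Enum members can be looked up by their string value.
model_type = ModelType("IRegressionPredictor")
print(model_type.name, needs_date_arguments(model_type))
```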

ScoringCodeModel

Bases: BaseScoringCodeModel

__init__(jar_path=None, _json_path=None, _classpath=None, _factory_class=None)

Constructor for ScoringCodeModel

Parameters:

Name Type Description Default
jar_path Optional[str]

Path to a jar file.

None
_json_path Optional[str]

For internal usage

None
_classpath Optional[List[str]]

For internal usage

None

predict(data_frame, max_explanations=0, threshold_high=None, threshold_low=None, time_series_type=TimeSeriesType.FORECAST, forecast_point=None, predictions_start_date=None, predictions_end_date=None, prediction_intervals_length=None)

Get predictions from Scoring Code model.

Parameters:

Name Type Description Default
data_frame pd.DataFrame

Input data.

required
max_explanations int

Number of prediction explanations to compute. If 0, prediction explanations are disabled.

0
threshold_high Optional[float]

Only compute prediction explanations for predictions above this threshold. If None, the default value will be used.

None
threshold_low Optional[float]

Only compute prediction explanations for predictions below this threshold. If None, the default value will be used.

None
time_series_type TimeSeriesType

Type of time series predictions to compute. If TimeSeriesType.FORECAST, predictions will be computed for a single forecast point specified by forecast_point. If TimeSeriesType.HISTORICAL, predictions will be computed for the range of timestamps specified by predictions_start_date and predictions_end_date.

TimeSeriesType.FORECAST
forecast_point Optional[datetime.datetime]

Forecast point to use for time series forecast point predictions. If None, the forecast point is detected automatically. If not None and time_series_type is not TimeSeriesType.FORECAST, ValueError is raised.

None
predictions_start_date Optional[datetime.datetime]

Start date in range for historical predictions. Inclusive. If None, predictions will start from the earliest date in the input that has enough history. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised.

None
predictions_end_date Optional[datetime.datetime]

End date in range for historical predictions. Exclusive. If None, predictions will end on the last date in the input. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised.

None
prediction_intervals_length Optional[int]

The percentile to use for the size of prediction intervals. Has to be an integer between 0 and 100 (inclusive). If None, prediction intervals will not be computed.

None

Returns:

Type Description
pd.DataFrame

Prediction output.
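The effect of threshold_high and threshold_low on which rows receive explanations can be sketched as follows. This is an illustration only: rows_to_explain is a hypothetical helper, and it assumes that a row qualifies when it is above threshold_high or below threshold_low, and that all rows qualify when neither threshold is given. The library's actual default values are not stated in this reference.

```python
from typing import List, Optional


def rows_to_explain(
    predictions: List[float],
    threshold_high: Optional[float] = None,
    threshold_low: Optional[float] = None,
) -> List[int]:
    """Hypothetical helper: indices of predictions that would get explanations."""
    if threshold_high is None and threshold_low is None:
        # Assumption: with no thresholds, every row qualifies.
        return list(range(len(predictions)))
    selected = []
    for index, value in enumerate(predictions):
        above = threshold_high is not None and value > threshold_high
        below = threshold_low is not None and value < threshold_low
        if above or below:
            selected.append(index)
    return selected


print(rows_to_explain([0.05, 0.5, 0.95], threshold_high=0.9, threshold_low=0.1))  # prints [0, 2]
```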

cli(model, input_csv, output_csv, forecast_point, predictions_start_date, predictions_end_date, with_explanations, prediction_intervals_length)

Command Line Interface main function.

Parameters:

Name Type Description Default
model str
required
input_csv TextIOWrapper
required
output_csv TextIOWrapper
required
forecast_point Optional[str]
required
predictions_start_date Optional[str]
required
predictions_end_date Optional[str]
required
with_explanations bool
required
prediction_intervals_length int
required

SparkScoringCodeModel

Bases: BaseScoringCodeModel

__init__(jar_path=None, allow_models_in_classpath=False)

Create a new instance of SparkScoringCodeModel

Parameters:

Name Type Description Default
jar_path Optional[str]

The path to a Scoring Code jar file to load. If None, the Scoring Code jar will be loaded from the classpath.

None

allow_models_in_classpath bool

Having models in the classpath while loading a model from the filesystem using the jar_path argument can lead to unexpected behavior, so this is not allowed by default but can be forced using allow_models_in_classpath. If True, models already present in the classpath will be ignored. If False, a ValueError will be raised if models are detected in the classpath.

False
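The rule for allow_models_in_classpath can be sketched as a small check. This is not the library's code: check_classpath is a hypothetical helper, and the list of detected models is passed in directly instead of being read from a real Java classpath.

```python
from typing import List, Optional


def check_classpath(
    jar_path: Optional[str],
    models_in_classpath: List[str],
    allow_models_in_classpath: bool = False,
) -> None:
    """Hypothetical helper mirroring the documented rule."""
    loading_from_file = jar_path is not None
    if loading_from_file and models_in_classpath and not allow_models_in_classpath:
        raise ValueError(
            "Models detected in the classpath; pass allow_models_in_classpath=True to ignore them"
        )


# Loading from the classpath only (jar_path=None) is always accepted.
check_classpath(None, ["model-a"])

# Loading from a file while models are in the classpath must be forced explicitly.
check_classpath("model.jar", ["model-a"], allow_models_in_classpath=True)
```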

predict(data_frame, time_series_type=TimeSeriesType.FORECAST, forecast_point=None, predictions_start_date=None, predictions_end_date=None)

Get predictions from the Scoring Code Spark model.

Parameters:

Name Type Description Default
data_frame Union[DataFrame, pd.DataFrame]

Input data.

required
time_series_type TimeSeriesType

Type of time series predictions to compute. If TimeSeriesType.FORECAST, predictions will be computed for a single forecast point specified by forecast_point. If TimeSeriesType.HISTORICAL, predictions will be computed for the range of timestamps specified by predictions_start_date and predictions_end_date.

TimeSeriesType.FORECAST
forecast_point Optional[datetime.datetime]

Forecast point to use for time series forecast point predictions. If None, the forecast point is detected automatically. If not None and time_series_type is not TimeSeriesType.FORECAST, ValueError is raised.

None
predictions_start_date Optional[datetime.datetime]

Start date in range for historical predictions. Inclusive. If None, predictions will start from the earliest date in the input that has enough history. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised.

None
predictions_end_date Optional[datetime.datetime]

End date in range for historical predictions. Exclusive. If None, predictions will end on the last date in the input. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised.

None

Returns:

Type Description
pyspark.sql.DataFrame

Prediction output.
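For historical predictions, predictions_start_date is inclusive and predictions_end_date is exclusive, so the two form a half-open range. The sketch below illustrates only that boundary behavior. select_timestamps is a hypothetical helper, and it ignores the "enough history" requirement that applies when the start date is None.

```python
import datetime
from typing import List, Optional


def select_timestamps(
    timestamps: List[datetime.datetime],
    start: Optional[datetime.datetime] = None,
    end: Optional[datetime.datetime] = None,
) -> List[datetime.datetime]:
    """Hypothetical helper: keep timestamps in the half-open range [start, end)."""
    return [
        ts
        for ts in timestamps
        if (start is None or ts >= start) and (end is None or ts < end)
    ]


days = [datetime.datetime(2023, 1, day) for day in range(1, 6)]

# January 2 is included (inclusive start); January 4 is excluded (exclusive end).
selected = select_timestamps(
    days,
    start=datetime.datetime(2023, 1, 2),
    end=datetime.datetime(2023, 1, 4),
)
print([ts.day for ts in selected])  # prints [2, 3]
```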