API Reference
predict(deployment, data_frame, max_explanations=0, threshold_high=None, threshold_low=None, time_series_type=TimeSeriesType.FORECAST, forecast_point=None, predictions_start_date=None, predictions_end_date=None, prediction_endpoint=None, timeout=600)
Get predictions using the DataRobot Prediction API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| deployment | Union[dr.Deployment, str, None] | DataRobot deployment to use when computing predictions. The deployment can also be given as a deployment id, or omitted when prediction_endpoint is set, e.g. when using Portable Prediction Server. If dr.Deployment, the prediction server and deployment id are taken from the deployment. If str, the argument is expected to be the deployment id. If None, no deployment id is used; this can be used for Portable Prediction Server single-model mode. | required |
| data_frame | pd.DataFrame | Input data. | required |
| max_explanations | Union[int, str] | Number of prediction explanations to compute. If 0, prediction explanations are disabled. If "all", all explanations will be computed. This is only available for SHAP. | 0 |
| threshold_high | Optional[float] | Only compute prediction explanations for predictions above this threshold. If None, the default value will be used. | None |
| threshold_low | Optional[float] | Only compute prediction explanations for predictions below this threshold. If None, the default value will be used. | None |
| time_series_type | TimeSeriesType | Type of time series predictions to compute. If TimeSeriesType.FORECAST, predictions will be computed for a single forecast point specified by forecast_point. If TimeSeriesType.HISTORICAL, predictions will be computed for the range of timestamps specified by predictions_start_date and predictions_end_date. | TimeSeriesType.FORECAST |
| forecast_point | Optional[datetime.datetime] | Forecast point to use for time series forecast point predictions. If None, the forecast point is detected automatically. If not None and time_series_type is not TimeSeriesType.FORECAST, ValueError is raised. | None |
| predictions_start_date | Optional[datetime.datetime] | Start date in range for historical predictions. Inclusive. If None, predictions will start from the earliest date in the input that has enough history. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised. | None |
| predictions_end_date | Optional[datetime.datetime] | End date in range for historical predictions. Exclusive. If None, predictions will end on the last date in the input. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised. | None |
| prediction_endpoint | Optional[str] | Specific prediction endpoint to use. This overrides any prediction server found in deployment. If None, the prediction endpoint found in deployment will be used. | None |
| timeout | int | Request timeout in seconds. | 600 |
Returns:
| Type | Description |
|---|---|
| pd.DataFrame | Prediction output. |
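The ValueError rules for forecast_point, predictions_start_date and predictions_end_date can be sketched in plain Python. This is only an illustration of the documented argument combinations, not the library's own code: the TimeSeriesType enum below is a local stand-in with assumed member values, and validate_time_series_args is a hypothetical helper.

```python
import datetime
import enum
from typing import Optional


class TimeSeriesType(enum.Enum):
    """Local stand-in for the library's enum; the member values are assumed."""

    FORECAST = 1
    HISTORICAL = 2


def validate_time_series_args(
    time_series_type: TimeSeriesType,
    forecast_point: Optional[datetime.datetime] = None,
    predictions_start_date: Optional[datetime.datetime] = None,
    predictions_end_date: Optional[datetime.datetime] = None,
) -> None:
    """Raise ValueError for the combinations the parameter table rules out."""
    # forecast_point is only meaningful for forecast point predictions.
    if forecast_point is not None and time_series_type is not TimeSeriesType.FORECAST:
        raise ValueError("forecast_point requires TimeSeriesType.FORECAST")
    # The date range is only meaningful for historical predictions.
    has_range = predictions_start_date is not None or predictions_end_date is not None
    if has_range and time_series_type is not TimeSeriesType.HISTORICAL:
        raise ValueError("predictions_start_date and predictions_end_date require TimeSeriesType.HISTORICAL")


# A forecast point combined with HISTORICAL is rejected.
try:
    validate_time_series_args(
        TimeSeriesType.HISTORICAL,
        forecast_point=datetime.datetime(2024, 1, 1),
    )
    rejected = False
except ValueError:
    rejected = True
```

Leaving all three date arguments as None is valid for either time series type.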
predict_unstructured(deployment, data, content_type='text/plain', accept=None, timeout=600)
Get predictions for an unstructured model deployment.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| deployment | dr.Deployment | Deployment used to compute predictions. | required |
| data | Any | Data to send to the endpoint. This can be text, bytes or a file-like object. Anything that the Python requests library accepts as data can be used. | required |
| content_type | str | The content type for the data. | 'text/plain' |
| accept | Optional[str] | The mimetypes supported for the return value. If None, any mimetype is supported. | None |
| timeout | int | Request timeout in seconds. | 600 |
Returns:
| Type | Description |
|---|---|
| bytes | The response content. |
BaseScoringCodeModel
Bases: ABC
class_labels: Optional[Sequence[str]]
property
Get the class labels for the model.
Returns:
| Type | Description |
|---|---|
| Optional[Sequence[str]] | List of class labels if model is a classification model, else None. |
date_column: Optional[str]
property
Get the date column for a Time Series model.
Returns:
| Type | Description |
|---|---|
| Optional[str] | Name of date column if model has one, else None. |
date_format: Optional[str]
property
Get the date format for a Time Series model.
Returns:
| Type | Description |
|---|---|
| Optional[str] | Date format in the syntax expected by datetime.strftime(), or None if the model is not a time series model. |
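Because the format uses datetime.strftime() syntax, it can be passed directly to the standard library for formatting and parsing timestamps. A minimal sketch, assuming a hypothetical format string; a real model reports its own value through date_format.

```python
import datetime

# Hypothetical value; a real model reports its own format via `date_format`.
date_format = "%Y-%m-%d %H:%M:%S"

timestamp = datetime.datetime(2024, 3, 1, 12, 30, 0)

# Format a timestamp the way the model expects it in the date column.
as_text = timestamp.strftime(date_format)

# The same format string parses the text back into a datetime.
round_tripped = datetime.datetime.strptime(as_text, date_format)
```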
feature_derivation_window: Optional[Tuple[int, int]]
property
Get the feature derivation window for a Time Series model.
Returns:
| Type | Description |
|---|---|
| Optional[Tuple[int, int]] | Feature derivation window as (begin, end) if model has this, else None. |
features: Dict[str, type]
property
Get feature names and types for the model.
Returns:
| Type | Description |
|---|---|
| OrderedDict[str, type] | Dictionary mapping feature name to feature type, where feature type is either str or float. The ordering of features is the same as it was during model training. |
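Since each feature type is the Python type str or float, the mapping can be used to cast raw input values and to restore the training column order. The sketch below uses a hypothetical feature mapping and a hypothetical helper, coerce_row; neither comes from the library.

```python
from collections import OrderedDict
from typing import Any, Dict

# Hypothetical feature mapping; a real model exposes its own via `features`.
features: "OrderedDict[str, type]" = OrderedDict(
    [("age", float), ("city", str), ("income", float)]
)


def coerce_row(row: Dict[str, Any], spec: "OrderedDict[str, type]") -> "OrderedDict[str, Any]":
    """Return the row in training order with each value cast to its declared type."""
    return OrderedDict((name, kind(row[name])) for name, kind in spec.items())


# Input arrives in arbitrary order and with loosely typed values.
row = {"income": "52000", "city": "Oslo", "age": 41}
coerced = coerce_row(row, features)
```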
forecast_window: Optional[Tuple[int, int]]
property
Get the forecast window for a Time Series model.
Returns:
| Type | Description |
|---|---|
| Optional[Tuple[int, int]] | Forecast window as (begin, end) if model has this, else None. |
model_id: str
property
Get the model id.
Returns:
| Type | Description |
|---|---|
| str | The model id. |
model_info: Optional[Dict[str, str]]
property
Get model metadata.
Returns:
| Type | Description |
|---|---|
| Optional[Dict[str, str]] | Dictionary with metadata if model has any, else None. |
model_type: ModelType
property
Get the model type.
Returns:
| Type | Description |
|---|---|
| ModelType | One of: ModelType.CLASSIFICATION, ModelType.REGRESSION, ModelType.TIME_SERIES. |
series_id_column: Optional[str]
property
Get the name of the series id column for a Time Series model.
Returns:
| Type | Description |
|---|---|
| Optional[str] | Name of the series id column if model has one, else None. |
time_step: Optional[Tuple[int, str]]
property
Get the time step for a Time Series model.
Returns:
| Type | Description |
|---|---|
| Optional[Tuple[int, str]] | Time step as (quantity, time unit) if model has this, else None. Example: (3, "DAYS") |
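A (quantity, time unit) tuple such as (3, "DAYS") can be turned into a datetime.timedelta for date arithmetic. This is a sketch under assumptions: only "DAYS" appears in the reference, so the other unit names in the mapping are guesses, and time_step_to_timedelta is a hypothetical helper.

```python
import datetime
from typing import Tuple

# Assumed unit names; only "DAYS" appears in the reference above.
_UNIT_TO_KWARG = {
    "MINUTES": "minutes",
    "HOURS": "hours",
    "DAYS": "days",
    "WEEKS": "weeks",
}


def time_step_to_timedelta(time_step: Tuple[int, str]) -> datetime.timedelta:
    """Convert a (quantity, time unit) tuple into a timedelta."""
    quantity, unit = time_step
    if unit not in _UNIT_TO_KWARG:
        # Calendar units such as months have no fixed length and are not handled here.
        raise ValueError(f"Unsupported time unit: {unit}")
    return datetime.timedelta(**{_UNIT_TO_KWARG[unit]: quantity})


step = time_step_to_timedelta((3, "DAYS"))
next_timestamp = datetime.datetime(2024, 1, 1) + step
```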
ModelType
Bases: enum.Enum
CLASSIFICATION = 'IClassificationPredictor'
class-attribute
instance-attribute
Classification predictor
REGRESSION = 'IRegressionPredictor'
class-attribute
instance-attribute
Regression predictor
TIME_SERIES = 'ITimeSeriesRegressionPredictor'
class-attribute
instance-attribute
Time Series predictor
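Because ModelType is a standard enum.Enum, a member can be looked up from its string value and compared by identity. The class below is a local copy of the three members listed above, written out so the snippet runs without the library installed.

```python
import enum


class ModelType(enum.Enum):
    """Local copy of the members listed above, for illustration only."""

    CLASSIFICATION = "IClassificationPredictor"
    REGRESSION = "IRegressionPredictor"
    TIME_SERIES = "ITimeSeriesRegressionPredictor"


# Look up a member from its string value.
model_type = ModelType("IRegressionPredictor")

# Members are singletons, so identity comparison is the idiomatic check.
is_time_series = model_type is ModelType.TIME_SERIES
```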
ScoringCodeModel
Bases: BaseScoringCodeModel
__init__(jar_path=None, _json_path=None, _classpath=None, _factory_class=None)
Constructor for ScoringCodeModel
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| jar_path | Optional[str] | Path to a jar file. | None |
| _json_path | Optional[str] | For internal usage. | None |
| _classpath | Optional[List[str]] | For internal usage. | None |
predict(data_frame, max_explanations=0, threshold_high=None, threshold_low=None, time_series_type=TimeSeriesType.FORECAST, forecast_point=None, predictions_start_date=None, predictions_end_date=None, prediction_intervals_length=None)
Get predictions from Scoring Code model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_frame | pd.DataFrame | Input data. | required |
| max_explanations | int | Number of prediction explanations to compute. If 0, prediction explanations are disabled. | 0 |
| threshold_high | Optional[float] | Only compute prediction explanations for predictions above this threshold. If None, the default value will be used. | None |
| threshold_low | Optional[float] | Only compute prediction explanations for predictions below this threshold. If None, the default value will be used. | None |
| time_series_type | TimeSeriesType | Type of time series predictions to compute. If TimeSeriesType.FORECAST, predictions will be computed for a single forecast point specified by forecast_point. If TimeSeriesType.HISTORICAL, predictions will be computed for the range of timestamps specified by predictions_start_date and predictions_end_date. | TimeSeriesType.FORECAST |
| forecast_point | Optional[datetime.datetime] | Forecast point to use for time series forecast point predictions. If None, the forecast point is detected automatically. If not None and time_series_type is not TimeSeriesType.FORECAST, ValueError is raised. | None |
| predictions_start_date | Optional[datetime.datetime] | Start date in range for historical predictions. Inclusive. If None, predictions will start from the earliest date in the input that has enough history. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised. | None |
| predictions_end_date | Optional[datetime.datetime] | End date in range for historical predictions. Exclusive. If None, predictions will end on the last date in the input. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised. | None |
| prediction_intervals_length | Optional[int] | The percentile to use for the size of prediction intervals. Must be an integer between 0 and 100 (inclusive). If None, prediction intervals will not be computed. | None |
Returns:
| Type | Description |
|---|---|
| pd.DataFrame | Prediction output. |
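The constraint on prediction_intervals_length (an integer between 0 and 100 inclusive, or None to skip intervals) can be checked before calling predict. The helper below is hypothetical and only mirrors the rule stated in the parameter table; the exceptions the library itself raises for bad values are not documented here.

```python
from typing import Optional


def check_prediction_intervals_length(value: Optional[int]) -> Optional[int]:
    """Return the value unchanged if it satisfies the documented constraint."""
    if value is None:
        # None means prediction intervals are not computed.
        return None
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("prediction_intervals_length must be an int or None")
    if not 0 <= value <= 100:
        raise ValueError("prediction_intervals_length must be between 0 and 100 (inclusive)")
    return value


checked = check_prediction_intervals_length(80)
```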
cli(model, input_csv, output_csv, forecast_point, predictions_start_date, predictions_end_date, with_explanations, prediction_intervals_length)
Command Line Interface main function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | str | | required |
| input_csv | TextIOWrapper | | required |
| output_csv | TextIOWrapper | | required |
| forecast_point | Optional[str] | | required |
| predictions_start_date | Optional[str] | | required |
| predictions_end_date | Optional[str] | | required |
| with_explanations | bool | | required |
| prediction_intervals_length | int | | required |
SparkScoringCodeModel
Bases: BaseScoringCodeModel
__init__(jar_path=None, allow_models_in_classpath=False)
Create a new instance of SparkScoringCodeModel
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| jar_path | Optional[str] | The path to a Scoring Code jar file to load. If None, the Scoring Code jar will be loaded from the classpath. | None |
| allow_models_in_classpath | bool | Having models in the classpath while loading a model from the filesystem using the jar_path argument can lead to unexpected behavior, so this is not allowed by default but can be forced using allow_models_in_classpath. If True, models already present in the classpath will be ignored. If False, a ValueError will be raised if models are detected in the classpath. | False |
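The interaction between jar_path and allow_models_in_classpath can be sketched as a small guard. How the library detects models in the classpath is not described here, so the check_classpath_models helper and its models_in_classpath argument are hypothetical stand-ins for that detection step.

```python
from typing import Optional, Sequence


def check_classpath_models(
    jar_path: Optional[str],
    models_in_classpath: Sequence[str],
    allow_models_in_classpath: bool = False,
) -> None:
    """Raise ValueError when a jar is loaded by path while other models sit in the classpath."""
    if jar_path is None:
        # Loading from the classpath is the intended behavior in this case.
        return
    if models_in_classpath and not allow_models_in_classpath:
        raise ValueError(
            "Models detected in classpath; pass allow_models_in_classpath=True to ignore them"
        )


# With the flag set, models already in the classpath are ignored.
check_classpath_models("model.jar", ["other-model.jar"], allow_models_in_classpath=True)
```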
predict(data_frame, time_series_type=TimeSeriesType.FORECAST, forecast_point=None, predictions_start_date=None, predictions_end_date=None)
Get predictions from the Scoring Code Spark model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_frame | Union[DataFrame, pd.DataFrame] | Input data. | required |
| time_series_type | TimeSeriesType | Type of time series predictions to compute. If TimeSeriesType.FORECAST, predictions will be computed for a single forecast point specified by forecast_point. If TimeSeriesType.HISTORICAL, predictions will be computed for the range of timestamps specified by predictions_start_date and predictions_end_date. | TimeSeriesType.FORECAST |
| forecast_point | Optional[datetime.datetime] | Forecast point to use for time series forecast point predictions. If None, the forecast point is detected automatically. If not None and time_series_type is not TimeSeriesType.FORECAST, ValueError is raised. | None |
| predictions_start_date | Optional[datetime.datetime] | Start date in range for historical predictions. Inclusive. If None, predictions will start from the earliest date in the input that has enough history. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised. | None |
| predictions_end_date | Optional[datetime.datetime] | End date in range for historical predictions. Exclusive. If None, predictions will end on the last date in the input. If not None and time_series_type is not TimeSeriesType.HISTORICAL, ValueError is raised. | None |
Returns:
| Type | Description |
|---|---|
| pyspark.sql.DataFrame | Prediction output. |
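The historical range is half-open: predictions_start_date is inclusive and predictions_end_date is exclusive. The sketch below shows which timestamps such a range selects. It treats None as an open bound, which simplifies the documented behavior (a None start actually depends on how much history the input contains), and in_historical_range is a hypothetical helper.

```python
import datetime
from typing import Optional


def in_historical_range(
    timestamp: datetime.datetime,
    start: Optional[datetime.datetime],
    end: Optional[datetime.datetime],
) -> bool:
    """Return True if timestamp falls in the half-open range [start, end)."""
    after_start = start is None or timestamp >= start  # start is inclusive
    before_end = end is None or timestamp < end  # end is exclusive
    return after_start and before_end


# Five daily timestamps, 2024-01-01 through 2024-01-05.
timestamps = [datetime.datetime(2024, 1, day) for day in range(1, 6)]

selected = [
    ts
    for ts in timestamps
    if in_historical_range(ts, datetime.datetime(2024, 1, 2), datetime.datetime(2024, 1, 4))
]
```

Here selected holds 2024-01-02 and 2024-01-03; 2024-01-04 is left out because the end date is exclusive.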