Deployment Predictions

Deployment predictions use the DataRobot Python client, datarobot, to handle authentication and make requests to the DataRobot API. The package is installed automatically as a dependency of datarobot-predict.

The following example shows how to authenticate using an API token. For other authentication examples and usage information, see the DataRobot API quickstart. More information about the DataRobot Prediction API can be found in the DataRobot docs.

To compute deployment predictions, instantiate a datarobot.Deployment and pass it to the predict function:

import datarobot as dr
from datarobot_predict.deployment import predict

dr.Client(
    endpoint='https://app.datarobot.com/api/v2',
    token='NjE3ZjA3Mzk0MmY0MDFmZGFiYjQ0MztergsgsQwOk9G'
)

deployment = dr.Deployment.get(deployment_id="635f85940ca2497cb57a54b8")
# input_df is a pandas DataFrame containing the features the deployment expects.
result_df, response_headers = predict(deployment, input_df)

The predict function returns a PredictionResult, a named tuple with the properties dataframe and response_headers. The dataframe property contains the prediction results, while response_headers makes it possible to retrieve metadata such as the ID of the model associated with the deployment:

result = predict(deployment, input_df)
predictions = result.dataframe
model_id = result.response_headers["X-DataRobot-Model-Id"]

Prediction Explanations

To compute explanations, set max_explanations to a positive value:

df_with_explanations, _ = predict(
    deployment,
    df,
    max_explanations=3
)

If max_explanations is 0 and explanation_algorithm is set to shap, all explanations will be computed:

df_with_explanations, _ = predict(
    deployment,
    df,
    explanation_algorithm="shap"
)

Time Series

Forecast point predictions are returned by default if no other arguments are provided for a time series model. The forecast point is auto-detected, or it can be specified using the forecast_point parameter:

import datetime

result_df, _ = predict(
    deployment,
    df,
    forecast_point=datetime.datetime(1958, 6, 1)
)

To compute historical predictions, set time_series_type accordingly:

import datetime

from datarobot_predict import TimeSeriesType

result_df, _ = predict(
    deployment,
    df,
    time_series_type=TimeSeriesType.HISTORICAL,
    predictions_start_date=datetime.datetime(2020, 1, 1),
    predictions_end_date=datetime.datetime(2022, 6, 1),
)

Unstructured models

Predictions can be computed for unstructured custom models using the predict_unstructured function. The following example shows how to make a request to an unstructured deployment with a binary file as the argument:

from datarobot_predict.deployment import predict_unstructured

with open("image.jpg", "rb") as file:
    content, response_headers = predict_unstructured(
        deployment,
        data=file,
        content_type="image/jpeg"
    )

It is also possible to use DataFrame input/output for custom models that use CSV input/output. Simply pass a DataFrame as the data argument:

result_df, response_headers = predict_unstructured(
    deployment,
    data=input_df
)

Using a custom prediction endpoint

It is possible to force a specific prediction endpoint for the predict function using the prediction_endpoint parameter. This is useful in an enterprise installation with multiple prediction servers, or together with the Portable Prediction Server (PPS).

This example shows how to make a request to a PPS running in single-model mode:

result_df, _ = predict(
    deployment=None,
    data_frame=df,
    prediction_endpoint="http://localhost:8080"
)

The deployment parameter is set to None since there is no deployment ID associated with PPS single-model mode.

For PPS multi-model mode, deployment should be set to the subdirectory where the model package is located:

result_df, _ = predict(
    deployment="my_model",
    data_frame=df,
    prediction_endpoint="http://localhost:8080"
)

For more information about PPS, see the DataRobot docs.

Limitations

  • Requests will time out after 600 seconds.
  • The input DataFrame, converted to CSV, must be less than 50MB or an exception will be raised.

For large and long-running requests, consider using the Batch Prediction API.
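If a borderline input must go through the real-time API anyway, the payload can be measured and split client side before calling predict. The sketch below is illustrative and not part of datarobot-predict: the csv_size and split_for_limit helpers and the MAX_CSV_BYTES constant are assumed names, and the split is approximate because the CSV header is repeated in every chunk.

```python
import io
import math

import pandas as pd

MAX_CSV_BYTES = 50 * 1024 * 1024  # documented 50MB limit on the CSV payload


def csv_size(df: pd.DataFrame) -> int:
    """Size in bytes of the DataFrame serialized to CSV."""
    buf = io.StringIO()
    df.to_csv(buf, index=False)
    return len(buf.getvalue().encode("utf-8"))


def split_for_limit(df: pd.DataFrame, limit: int = MAX_CSV_BYTES) -> list:
    """Split df into row chunks whose CSV payloads stay under the limit.

    The row count per chunk is estimated proportionally from the total CSV
    size, so a chunk can land slightly over the limit when rows vary in width;
    leave some headroom if that matters.
    """
    n_chunks = max(1, math.ceil(csv_size(df) / limit))
    chunk_rows = math.ceil(len(df) / n_chunks)
    return [df.iloc[i : i + chunk_rows] for i in range(0, len(df), chunk_rows)]
```

Each chunk can then be passed to predict separately and the resulting dataframes combined with pd.concat.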