Deploy#

Deploy any model to ML Ops

External models#

For external “custom” models, deploy() will:

  1. Dynamically generate a Dockerfile that builds a custom environment for serving the provided model, including:

    • Python interpreter version

    • Dependencies and associated versions

    This is accomplished by introspecting the environment in which deploy() is called; a sketch of this introspection appears after this list.

  2. Serialize the model and any provided custom hooks, generating an appropriate custom.py file.

    See also

    The DataRobot User Models (DRUM) project has additional documentation on available custom hooks that can be specified.

  3. Transmit the configuration and serialized files to DataRobot, creating a new:

    • Custom model environment

    • Custom model environment version

    • Custom model

    • Custom model version

  4. Deploy the custom model to ML Ops.

    External models presently supported by drx.deploy()

    • Scikit-learn pipelines and estimators

    • Huggingface transformers (via ONNX export)

    • Extending deploy() to additional estimator types is expected to be straightforward; support can be added upon request
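
To make steps 1 and 2 above concrete, the sketch below illustrates the kind of environment introspection and model serialization involved. It is illustrative only, not the library's actual implementation:

import pickle
import sys
from importlib import metadata

def introspect_and_serialize(model, path="model.pkl"):
    # Step 1: capture the calling environment for the generated Dockerfile
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    requirements = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
    )
    # Step 2: serialize the model for the generated custom.py to load at serving time
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return python_version, requirements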

DataRobot models#

For drx-trained models, drx.deploy() is equivalent to calling deploy() on the model directly.
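
For illustration, the two idioms below produce the same deployment. This is a minimal sketch; drx.AutoMLModel is used as a representative drx-trained model, so verify the class name against your drx version:

import datarobotx as drx

model = drx.AutoMLModel()
model.fit(train, target='readmitted')  # `train`: a pandas DataFrame of training data

deployment = model.deploy()     # deploy directly from the model
deployment = drx.deploy(model)  # equivalent call via drx.deploy()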

Usage#

Example 1: Simple sklearn pipeline#

Deploy#

import datarobotx as drx

deployment = drx.deploy(pipe,
                        target='readmitted',
                        classes=['True', 'False'])
Supporting example pipeline fitting code
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OrdinalEncoder
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np

df = pd.read_csv('https://s3.amazonaws.com/datarobot_public_datasets/10k_diabetes.csv')
train, test_df = train_test_split(df, test_size=0.2)
y = train["readmitted"]
X = train.drop("readmitted", axis=1)
test_df = test_df.drop("readmitted", axis=1)
cat_cols = [
    "discharge_disposition_id",
    "medical_specialty",
    "admission_source_id",
    "diag_3",
    "admission_type_id",
    "age",
    "weight",
    "payer_code",
    "diag_2",
    "insulin",
    "diag_1",
    "change",
    "race",
    "diabetesMed",
    "pioglitazone",
    "glipizide",
    "A1Cresult",
    "glyburide",
    "repaglinide",
    "glyburide.metformin",
    "chlorpropamide",
    "acarbose",
    "glimepiride",
    "nateglinide",
    "max_glu_serum",
    "rosiglitazone",
    "gender",
    "metformin",
]
num_cols = [
    "number_diagnoses",
    "num_medications",
    "time_in_hospital",
    "number_inpatient",
    "num_procedures",
    "number_outpatient",
    "num_lab_procedures",
    "number_emergency",
]
rf_classifier = RandomForestClassifier(
    n_estimators=10, random_state=np.random.RandomState(42)
)
imputer = SimpleImputer(strategy="most_frequent")
categorical_transformer = Pipeline(
    steps=[
        ("imputer", imputer),
        (
            "encoder",
            OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
        ),
    ]
)
preprocessor = ColumnTransformer(
    transformers=[
        ("num", imputer, num_cols),
        ("cat", categorical_transformer, cat_cols),
    ]
)
pipe = Pipeline(steps=[("preprocessor", preprocessor), ("classifier", rf_classifier)])
pipe.fit(X, y)

Predict#

test_df = pd.read_csv('https://s3.amazonaws.com/datarobot_public_datasets/10k_diabetes_20.csv')
predictions = deployment.predict(test_df)
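
An existing deployment can also be re-attached by id for scoring later, per the Deployment class in the API reference below (a sketch; the id shown is hypothetical):

import datarobotx as drx

# Re-attach to a previously created deployment (hypothetical id)
deployment = drx.Deployment(deployment_id='62a1b2c3d4e5f6')
predictions = deployment.predict(test_df)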

Example 2: Huggingface question answering#

Load from pretrained#

from transformers import AutoTokenizer, TFBertForQuestionAnswering

FOUNDATION_MODEL = "bert-large-uncased-whole-word-masking-finetuned-squad"
tokenizer = AutoTokenizer.from_pretrained(FOUNDATION_MODEL)
model = TFBertForQuestionAnswering.from_pretrained(FOUNDATION_MODEL)
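
deploy() performs the ONNX conversion of the transformer automatically. For reference, a comparable standalone export could be produced along these lines with Hugging Face's optimum package (an illustrative assumption; this step is not required by drx):

from optimum.onnxruntime import ORTModelForQuestionAnswering

# Export the same checkpoint to ONNX outside of drx (illustrative only)
ort_model = ORTModelForQuestionAnswering.from_pretrained(FOUNDATION_MODEL, export=True)
ort_model.save_pretrained("model_dir")
tokenizer.save_pretrained("model_dir")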

Deploy with custom hooks#

hf_deployment = drx.deploy(tokenizer,
                           model,
                           feature='question-answering',
                           hooks={'load_model': load_model,
                                  'score_unstructured': score_unstructured})
Supporting example custom hooks for loading and scoring with the transformer
import io
import json
import os
import time

import numpy as np
import onnxruntime as ort
import pandas as pd
from transformers import AutoTokenizer

model_load_duration = 0

def load_model(input_dir):
    global model_load_duration
    onnx_path = os.path.join(input_dir, "model.onnx")
    start = time.time()
    tokenizer = AutoTokenizer.from_pretrained(input_dir)
    sess = ort.InferenceSession(onnx_path)
    model_load_duration = time.time() - start
    log_for_drum(f"load_model - Loading ONNX BERT took: {model_load_duration}")
    return sess, tokenizer


def log_for_drum(msg):
    # Write directly to stdout so the message surfaces in the DRUM logs
    os.write(1, f"\n{msg}\n".encode("UTF-8"))


def _get_answer_in_text(output, input_ids, idx, tokenizer):
    # Decode the span between the highest-scoring start and end logits
    answer_start = np.argmax(output[0], axis=1)[idx]
    answer_end = (np.argmax(output[1], axis=1) + 1)[idx]
    answer = tokenizer.convert_tokens_to_string(
        tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end])
    )
    return answer


def score_unstructured(model, data, query, **kwargs):
    global model_load_duration
    sess, tokenizer = model

    # Assume batch input is sent with mimetype "text/csv";
    # treat as single prediction input if no mimetype is set
    is_batch = kwargs.get("mimetype") == "text/csv"

    if is_batch:
        input_pd = pd.read_csv(io.StringIO(data), sep="|")
        input_pairs = list(zip(input_pd["context"], input_pd["question"]))

        start = time.time()
        inputs = tokenizer.batch_encode_plus(
            input_pairs, add_special_tokens=True, padding=True, return_tensors="np"
        )
        input_ids = inputs["input_ids"]
        output = sess.run(["start_logits", "end_logits"], input_feed=dict(inputs))
        responses = []
        for i, row in input_pd.iterrows():
            answer = _get_answer_in_text(output, input_ids[i], i, tokenizer)
            response = {
                "context": row["context"],
                "question": row["question"],
                "answer": answer,
            }
            responses.append(response)
        pred_duration = time.time() - start
        to_return = json.dumps(
            {
                "predictions": responses,
                "model_load_duration": model_load_duration,
                "pred_duration": pred_duration,
            }
        )
    else:
        data_dict = json.loads(data)
        context, question = data_dict["context"], data_dict["question"]
        start = time.time()
        inputs = tokenizer(
            question,
            context,
            add_special_tokens=True,
            padding=True,
            return_tensors="np",
        )
        input_ids = inputs["input_ids"][0]
        output = sess.run(["start_logits", "end_logits"], input_feed=dict(inputs))
        answer = _get_answer_in_text(output, input_ids, 0, tokenizer)
        pred_duration = time.time() - start
        to_return = json.dumps(
            {
                "context": context,
                "question": question,
                "answer": answer,
                "model_load_duration": model_load_duration,
                "pred_duration": pred_duration,
            }
        )

    model_load_duration = 0  # reset this variable for subsequent calls
    return to_return
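
The hooks can be exercised locally before deploying, in the same way DRUM will invoke them (a sketch; it assumes ./model_dir already contains the exported model.onnx alongside the tokenizer files):

model = load_model("./model_dir")
payload = json.dumps(
    {"context": "The sky is blue.", "question": "What color is the sky?"}
)
print(score_unstructured(model, payload, None, mimetype="application/json"))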

Predict#

input = {
    "context": "Healthcare tasks (e.g., patient care via disease treatment) and "
    + "biomedical research (e.g., scientific discovery of new therapies) require "
    + "expert knowledge that is limited and expensive. Foundation models present "
    + "clear opportunities in these domains due to the abundance of data across "
    + "many modalities (e.g., images, text, molecules) to train foundation models, "
    + "as well as the value of improved sample efficiency in adaptation due to the "
    + "cost of expert time and knowledge. Further, foundation models may allow for "
    + "improved interface design (§2.5: interaction) for both healthcare providers "
    + "and patients to interact with AI systems, and their generative capabilities "
    + "suggest potential for open-ended research problems like drug discovery. "
    + "Simultaneously, they come with clear risks (e.g., exacerbating historical "
    + "biases in medical datasets and trials). To responsibly unlock this potential "
    + "requires engaging deeply with the sociotechnical matters of data sources and "
    + "privacy as well as model interpretability and explainability, alongside "
    + "effective regulation of the use of foundation models for both healthcare and "
    + "biomedicine.",
    "question": "Where can we use foundation models?",
}
prediction = hf_deployment.predict_unstructured(input)
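
Because score_unstructured serializes its response with json.dumps, the answer can be recovered from the returned payload. Depending on the client version, predict_unstructured may hand back the raw JSON string or an already-parsed dict, so both cases are handled below:

import json

result = json.loads(prediction) if isinstance(prediction, str) else prediction
print(result["answer"])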

API Reference#

deploy(model, *args[, target, classes, ...])

Deploy a model to ML Ops

Deployment([deployment_id])

DataRobot ML Ops deployment