Scikit-Learn Tutorial

In this guide, we will train and deploy a simple Scikit-Learn classifier. In particular, we show:

  • How to load the model from the file system in your Ray Serve deployment

  • How to parse the JSON request and evaluate it with the sklearn model

Please see the Key Concepts page for more general information about Ray Serve.

Ray Serve is framework agnostic. You can use any version of sklearn.

pip install scikit-learn

Let’s import Ray Serve and some other helpers.

from ray import serve

import pickle
import json
import numpy as np
import requests
import os
import tempfile

from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import mean_squared_error

We will train a gradient boosting classifier with the iris dataset.

# Load data
iris_dataset = load_iris()
data, target, target_names = (
    iris_dataset["data"],
    iris_dataset["target"],
    iris_dataset["target_names"],
)

# Instantiate model
model = GradientBoostingClassifier()

# Training and validation split (shuffle features and labels together
# with a single permutation so they stay aligned)
indices = np.random.permutation(len(data))
data, target = data[indices], target[indices]
train_x, train_y = data[:100], target[:100]
val_x, val_y = data[100:], target[100:]

# Train and evaluate the model
model.fit(train_x, train_y)
print("MSE:", mean_squared_error(model.predict(val_x), val_y))
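Note that mean_squared_error treats the integer class labels as numbers; for a classifier, plain accuracy is often easier to read alongside it. A small pure-Python helper sketch (`preds` and `truth` are illustrative stand-ins for `model.predict(val_x)` and `val_y`):

```python
def accuracy(preds, truth):
    # Fraction of predicted labels that match the true labels.
    correct = sum(p == t for p, t in zip(preds, truth))
    return correct / len(truth)

print(accuracy([0, 1, 2, 2], [0, 1, 1, 2]))  # 0.75
```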

# Save the model and label to file
MODEL_PATH = os.path.join(tempfile.gettempdir(), "iris_model_gradient_boosting_classifier.pkl")
LABEL_PATH = os.path.join(tempfile.gettempdir(), "iris_labels.json")

with open(MODEL_PATH, "wb") as f:
    pickle.dump(model, f)
with open(LABEL_PATH, "w") as f:
    json.dump(target_names.tolist(), f)
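The same save/load pattern can be sanity-checked before deployment with any picklable object; a minimal stdlib-only sketch (the dict below is an illustrative stand-in for the fitted estimator, and the file names are hypothetical):

```python
import json
import os
import pickle
import tempfile

# Stand-in "model": any picklable object round-trips the same way
# as a fitted sklearn estimator.
model = {"coef": [0.1, 0.2]}
labels = ["setosa", "versicolor", "virginica"]

model_path = os.path.join(tempfile.gettempdir(), "demo_model.pkl")
label_path = os.path.join(tempfile.gettempdir(), "demo_labels.json")

# Save both artifacts to disk.
with open(model_path, "wb") as f:
    pickle.dump(model, f)
with open(label_path, "w") as f:
    json.dump(labels, f)

# Load them back and confirm nothing was lost in the round trip.
with open(model_path, "rb") as f:
    restored = pickle.load(f)
with open(label_path) as f:
    restored_labels = json.load(f)

print(restored == model and restored_labels == labels)  # True
```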

Services are just defined as normal classes with __init__ and __call__ methods. The __call__ method will be invoked per request.

@serve.deployment(route_prefix="/regressor")
class BoostingModel:
    def __init__(self):
        with open(MODEL_PATH, "rb") as f:
            self.model = pickle.load(f)
        with open(LABEL_PATH) as f:
            self.label_list = json.load(f)

    async def __call__(self, starlette_request):
        payload = await starlette_request.json()
        print("Worker: received starlette request with data", payload)

        input_vector = [
            payload["sepal length"],
            payload["sepal width"],
            payload["petal length"],
            payload["petal width"],
        ]
        prediction = self.model.predict([input_vector])[0]
        human_name = self.label_list[prediction]
        return {"result": human_name}

Now that we've defined our service, let's deploy the model to Ray Serve. The @serve.deployment decorator above exposes the deployment over the HTTP route /regressor.

serve.start()
BoostingModel.deploy()

Let’s query it!

sample_request_input = {
    "sepal length": 1.2,
    "sepal width": 1.0,
    "petal length": 1.1,
    "petal width": 0.9,
}
response = requests.get("http://localhost:8000/regressor", json=sample_request_input)
print(response.text)
# Result:
# {
#  "result": "versicolor"
# }
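On the server side, the deployment turns that JSON payload into the four-element feature vector in the order the model expects. A stdlib-only sketch of just the parsing step, with values copied from the sample request above:

```python
import json

# Raw request body as it arrives over HTTP.
raw = '{"sepal length": 1.2, "sepal width": 1.0, "petal length": 1.1, "petal width": 0.9}'
payload = json.loads(raw)

# Order matters: it must match the feature order used during training.
input_vector = [
    payload["sepal length"],
    payload["sepal width"],
    payload["petal length"],
    payload["petal width"],
]
print(input_vector)  # [1.2, 1.0, 1.1, 0.9]
```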