Image semantic search and classification#
This tutorial implements an image semantic search application that uses batch inference, distributed training, and online serving at scale.
- 01-Batch-Inference.ipynb: ingest and preprocess data at scale using Ray Data to generate embeddings for an image dataset of different dog breeds and store them (see the sketches after this list).
- 02-Distributed-Training.ipynb: reprocess the same data to train an image classifier using Ray Train, saving model artifacts to a model registry (MLOps).
- 03-Online-Serving.ipynb: serve a semantic search app, using Ray Serve, that uses model predictions to filter and retrieve the most relevant images for input queries.

Finally, create production batch Jobs for offline workloads such as embedding generation and model training, and production online Services that can scale.
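To make the first step concrete, here is a minimal sketch of embedding generation with Ray Data. The CLIP checkpoint, the S3 paths, and the batch settings are illustrative assumptions, not the tutorial's exact configuration:

```python
import ray
import torch
from transformers import CLIPModel, CLIPProcessor


class Embedder:
    """Stateful worker: loads the model once, then embeds image batches."""

    def __init__(self):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(self.device)
        self.processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
        self.model.eval()

    def __call__(self, batch):
        inputs = self.processor(images=list(batch["image"]), return_tensors="pt").to(self.device)
        with torch.no_grad():
            features = self.model.get_image_features(**inputs)
        batch["embedding"] = features.cpu().numpy()
        return batch


# Read images in parallel, embed them on a pool of workers, and persist the
# results. The bucket paths below are hypothetical placeholders.
ds = ray.data.read_images("s3://my-bucket/dog-breeds/", mode="RGB", include_paths=True)
ds = ds.map_batches(Embedder, batch_size=64, num_gpus=1, concurrency=2)
ds = ds.drop_columns(["image"])  # keep only paths and embeddings
ds.write_parquet("s3://my-bucket/embeddings/")
```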
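Similarly, the distributed-training step boils down to a Ray Train TorchTrainer running a standard PyTorch loop on multiple workers. The sketch below trains a hypothetical linear classifier head over 512-dimensional embeddings with placeholder data; the notebook's actual model, dataset wiring, and MLflow registry calls are omitted:

```python
import torch
import torch.nn as nn
import ray.train
import ray.train.torch
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer


def train_loop_per_worker(config):
    device = ray.train.torch.get_device()
    # Hypothetical classifier head over precomputed 512-dim embeddings.
    model = nn.Linear(512, config["num_classes"])
    model = ray.train.torch.prepare_model(model)  # wraps in DDP, moves to device
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(config["num_epochs"]):
        # Placeholder batch; the real loop would iterate over a dataset shard.
        x = torch.randn(32, 512, device=device)
        y = torch.randint(0, config["num_classes"], (32,), device=device)
        loss = loss_fn(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        ray.train.report({"epoch": epoch, "loss": loss.item()})


trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"num_classes": 10, "num_epochs": 3},
    scaling_config=ScalingConfig(num_workers=2, use_gpu=False),  # use_gpu=True on a GPU cluster
)
result = trainer.fit()
```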

Development#
The application is developed on Anyscale Workspaces, which enables development without worrying about infrastructure—just like working on a laptop. Workspaces come with:
- Development tools: Spin up a remote session from your local IDE (Cursor, VS Code, etc.) and start coding, using the same tools you love but with the power of Anyscale's compute.
- Dependencies: Continue to install dependencies using familiar tools like pip. Anyscale propagates all dependencies to your cluster.

```
pip install -q "matplotlib==3.10.0" "torch==2.5.1" "transformers==4.47.1" "scikit-learn==1.6.0" "mlflow==2.19.0" "ipywidgets"
```
- Compute: Leverage any reserved instance capacity and spot instances from any compute provider of your choice by deploying Anyscale into your account. Alternatively, use the Anyscale cloud for a fully serverless experience. Under the hood, Anyscale spins up and efficiently manages a cluster.
- Debugging: Leverage a distributed debugger to get the same VS Code-like debugging experience.
Learn more about Anyscale Workspaces in the official documentation.

Note: Run the entire tutorial for free on Anyscale—all dependencies come pre-installed, and compute autoscales automatically. To run it elsewhere, install the dependencies from the containerfile and provision the appropriate GPU resources.
Production#
Seamlessly integrate with your existing CI/CD pipelines by leveraging the Anyscale CLI or SDK to deploy highly available services and run reliable batch jobs. Developing in an environment nearly identical to production—a multi-node cluster—drastically accelerates the dev-to-prod transition. This tutorial also introduces proprietary RayTurbo features that optimize workloads for performance, fault tolerance, scale, and observability.
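For example, the semantic search endpoint from the online-serving notebook is the kind of Ray Serve application you would deploy as a highly available Service. This is a minimal sketch: the CLIP checkpoint, the .npy artifact paths, and the cosine-similarity retrieval are illustrative stand-ins, not necessarily the tutorial's exact implementation:

```python
import numpy as np
from fastapi import FastAPI
from ray import serve
from transformers import CLIPModel, CLIPProcessor

app = FastAPI()


@serve.deployment(num_replicas=2)
@serve.ingress(app)
class SemanticSearch:
    def __init__(self):
        self.model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
        self.processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
        # Hypothetical artifacts produced by the batch-inference step.
        self.embeddings = np.load("/mnt/shared/embeddings.npy")
        self.embeddings /= np.linalg.norm(self.embeddings, axis=1, keepdims=True)
        self.uris = np.load("/mnt/shared/uris.npy", allow_pickle=True)

    @app.get("/search")
    def search(self, query: str, k: int = 5) -> list:
        # Embed the text query and return the k most similar image URIs.
        inputs = self.processor(text=[query], return_tensors="pt", padding=True)
        q = self.model.get_text_features(**inputs).detach().numpy()[0]
        q /= np.linalg.norm(q)
        scores = self.embeddings @ q
        return [str(self.uris[i]) for i in np.argsort(-scores)[:k]]


search_app = SemanticSearch.bind()
# Local test: serve.run(search_app), then GET http://localhost:8000/search?query=corgi
```

The same application graph can then be deployed as an Anyscale Service from the CLI or SDK.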
No infrastructure headaches#
Abstract away infrastructure from your ML/AI developers so they can focus on core ML development. You can also manage compute resources and costs through enterprise governance, observability, and admin capabilities: set resource quotas, prioritize workloads, and gain visibility into utilization across your entire compute fleet. Users running on a Kubernetes cloud (EKS, GKE, etc.) can still access the proprietary RayTurbo optimizations demonstrated in this tutorial by deploying the Anyscale Kubernetes Operator.