1.x to 2.x API Migration Guide#

This section covers what to consider or change in your application when migrating from Ray versions 1.x to 2.x.

What has been changed?#

In Ray Serve 2.0, we released a new deployment API. The 1.x deployment API can still be used, but it will be deprecated in a future version.

Migrating the 1.x Deployment#

Migrating handle passing between deployments#

In the 1.x deployment API, we usually pass the handle of one deployment into another to chain deployments.



import ray
from ray import serve
from ray.serve.handle import RayServeSyncHandle


@serve.deployment
class Model:
    def forward(self, input) -> str:
        # do some inference work
        return "done"


@serve.deployment
class Preprocess:
    def __init__(self, model_handle: RayServeSyncHandle):
        self.model_handle = model_handle

    async def __call__(self, input):
        # do some preprocessing work on the input
        return await self.model_handle.forward.remote(input)


Model.deploy()
model_handle = Model.get_handle()

Preprocess.deploy(model_handle)
preprocess_handle = Preprocess.get_handle()
ray.get(preprocess_handle.remote(1))

With the 2.0 deployment API, you can update the code above as follows:

import ray
from ray import serve
from ray.serve.handle import RayServeDeploymentHandle


@serve.deployment
class Model:
    def forward(self, input) -> str:
        # do some inference work
        return "done"


@serve.deployment
class Preprocess:
    def __init__(self, model_handle: RayServeDeploymentHandle):
        self.model_handle = model_handle

    async def __call__(self, input):
        # do some preprocessing work on the input
        ref = await self.model_handle.forward.remote(input)
        result = await ref
        return result


handle = serve.run(Preprocess.bind(Model.bind()))
ray.get(handle.remote(1))

Note

  • get_handle can be replaced by bind() to achieve the same functionality.

  • serve.run returns a handle to the entry point deployment of your whole deployment chain.
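The two awaits in the 2.0 Preprocess example can be surprising at first: awaiting handle.forward.remote(...) yields a reference, and awaiting that reference yields the result. Below is a toy pure-asyncio sketch of that pattern; MockDeploymentHandle and its methods are made up for illustration and are not Ray's implementation.

```python
import asyncio


class MockDeploymentHandle:
    """Toy stand-in for a 2.0 deployment handle (illustration only, not Ray's API)."""

    async def _execute(self, value):
        # Pretend this runs remotely inside the downstream deployment.
        return f"done: {value}"

    def remote(self, value):
        async def schedule():
            # First await: the call is scheduled and a future-like "ref" comes back.
            return asyncio.ensure_future(self._execute(value))

        return schedule()


async def call_chain():
    handle = MockDeploymentHandle()
    ref = await handle.remote(1)  # await #1: obtain the reference
    result = await ref            # await #2: obtain the actual result
    return result


print(asyncio.run(call_chain()))  # done: 1
```

The same shape appears in the Preprocess deployment above: one await to get the ref, a second await to get the value.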

Migrating a single deployment to the new deployment API#

In the 1.x deployment API, deploying a single model usually looks like this:

from ray import serve


@serve.deployment
class Model:
    def __call__(self, input: int):
        # some inference work
        return


Model.deploy()
handle = Model.get_handle()
handle.remote(1)

With the 2.0 deployment API, you can update the code above as follows:

from ray import serve


@serve.deployment
class Model:
    def __call__(self, input: int):
        # some inference work
        return


handle = serve.run(Model.bind())
handle.remote(1)

Migrating multiple deployments to the new deployment API#

When you have multiple deployments, typical code with the 1.x API looks like this:

from ray import serve


@serve.deployment
class Model:
    def forward(self, input: int):
        # some inference work
        return


@serve.deployment
class Model2:
    def forward(self, input: int):
        # some inference work
        return


Model.deploy()
Model2.deploy()
handle = Model.get_handle()
handle.forward.remote(1)

handle2 = Model2.get_handle()
handle2.forward.remote(1)

With the 2.0 deployment API, you can update the code above as follows:

import requests
from ray import serve
from ray.serve.deployment_graph import InputNode
from ray.serve.drivers import DAGDriver


@serve.deployment
class Model:
    def forward(self, input: int):
        # some inference work
        return


@serve.deployment
class Model2:
    def forward(self, input: int):
        # some inference work
        return


with InputNode() as dag_input:
    model = Model.bind()
    model2 = Model2.bind()
    d = DAGDriver.bind(
        {
            "/model1": model.forward.bind(dag_input),
            "/model2": model2.forward.bind(dag_input),
        }
    )
handle = serve.run(d)
handle.predict_with_route.remote("/model1", 1)
handle.predict_with_route.remote("/model2", 1)

resp = requests.get("http://localhost:8000/model1", data="1")
resp = requests.get("http://localhost:8000/model2", data="1")

Note

  • The predict method is defined inside the DAGDriver class as an entry point to fulfill requests.

  • Similarly, the predict_with_route method is defined inside the DAGDriver class as an entry point to fulfill requests; it accepts a route path as its first argument to select which model to use.

  • DAGDriver is a special class that handles multiple entry points for different deployments.

  • DAGDriver.bind can accept a dictionary; each key is the route path of the corresponding entry point.

  • In the example, you can also use an HTTP request to fulfill your request. Different models are bound to different route paths based on the user inputs; e.g. http://localhost:8000/model1 and http://localhost:8000/model2.
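Conceptually, the DAGDriver with a dictionary acts like a small dispatch table keyed by route path. Below is a toy pure-Python sketch of that routing idea; ToyDriver and its names are made up for illustration and are not Ray's implementation.

```python
class ToyDriver:
    """Toy dispatcher mirroring the DAGDriver routing idea (illustration only)."""

    def __init__(self, routes):
        # Maps a route path to the callable that serves it,
        # like the dict passed to DAGDriver.bind above.
        self.routes = routes

    def predict_with_route(self, route_path, *args):
        # Select the model by route path, then forward the arguments.
        return self.routes[route_path](*args)


driver = ToyDriver(
    {
        "/model1": lambda x: f"model1 saw {x}",
        "/model2": lambda x: f"model2 saw {x}",
    }
)

print(driver.predict_with_route("/model1", 1))  # model1 saw 1
print(driver.predict_with_route("/model2", 1))  # model2 saw 1
```

In the real system, HTTP requests to the matching route prefix are dispatched the same way, with the bound deployments standing in for the lambdas.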

Migrate deployments with route prefixes#

Sometimes, you have a customized route prefix for each deployment:

import requests
from ray import serve
from starlette.requests import Request


@serve.deployment(route_prefix="/my_model1")
class Model:
    def __call__(self, req: Request) -> str:
        # some inference work
        return "done"


Model.deploy()
resp = requests.get("http://localhost:8000/my_model1", data="321")

With the 2.0 deployment API, you can update the code above as follows:

import requests
from ray import serve
from ray.serve.drivers import DAGDriver
from starlette.requests import Request


@serve.deployment
class Model:
    def __call__(self, req: Request) -> str:
        # some inference work
        return "done"


d = DAGDriver.options(route_prefix="/my_model1").bind(Model.bind())
handle = serve.run(d)
resp = requests.get("http://localhost:8000/my_model1", data="321")

Or if you have multiple deployments and want to customize the HTTP route prefix for each model, you can use the following code:

import requests
from ray import serve
from ray.serve.drivers import DAGDriver
from starlette.requests import Request


@serve.deployment
class Model:
    def __call__(self, req: Request) -> str:
        # some inference work
        return "done"


@serve.deployment
class Model2:
    def __call__(self, req: Request) -> str:
        # some inference work
        return "done"


d = DAGDriver.bind({"/my_model1": Model.bind(), "/my_model2": Model2.bind()})
handle = serve.run(d)
resp = requests.get("http://localhost:8000/my_model1", data="321")
resp = requests.get("http://localhost:8000/my_model2", data="321")