class ray.train.huggingface.HuggingFacePredictor(pipeline: Optional[transformers.pipelines.base.Pipeline] = None, preprocessor: Optional[Preprocessor] = None, use_gpu: bool = False)

Bases: ray.train.predictor.Predictor

A predictor for HuggingFace Transformers PyTorch models.

This predictor uses Transformers Pipelines for inference.

  • pipeline – The Transformers pipeline to use for inference.

  • preprocessor – A preprocessor used to transform data batches prior to prediction.

  • use_gpu – If True, the model is moved to the GPU on instantiation and prediction runs on the GPU.

PublicAPI (alpha): This API is in alpha and may change before becoming stable.

classmethod from_checkpoint(checkpoint: ray.air.checkpoint.Checkpoint, *, pipeline_cls: Optional[Type[transformers.pipelines.base.Pipeline]] = None, **pipeline_kwargs) -> ray.train.huggingface.huggingface_predictor.HuggingFacePredictor

Instantiate the predictor from a Checkpoint.

The checkpoint is expected to be a result of HuggingFaceTrainer.

  • checkpoint – The checkpoint to load the model, tokenizer and preprocessor from. It is expected to be from the result of a HuggingFaceTrainer run.

  • pipeline_cls – A transformers.pipelines.Pipeline class to use. If not specified, the pipeline abstraction wrapper is used.

  • **pipeline_kwargs – Any kwargs to pass to the pipeline initialization. If pipeline_cls is None, this must contain the ‘task’ argument. Cannot contain ‘model’. Can be used to override the tokenizer with ‘tokenizer’. If use_gpu is True, ‘device’ will be set to 0 by default.
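The rules for pipeline_kwargs listed above can be sketched in plain Python. This is only an illustration of the documented constraints, not Ray's implementation: the helper name resolve_pipeline_kwargs and its use_gpu parameter are hypothetical and exist only for this example.

```python
from typing import Any, Dict, Optional


def resolve_pipeline_kwargs(
    pipeline_cls: Optional[type] = None,
    use_gpu: bool = False,
    **pipeline_kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical helper (not part of Ray) that applies the documented rules."""
    # Without an explicit pipeline class, the task must be named so that a
    # matching pipeline can be selected.
    if pipeline_cls is None and "task" not in pipeline_kwargs:
        raise ValueError("'task' is required when pipeline_cls is not given")
    # The model is loaded from the checkpoint, so it cannot be passed in.
    if "model" in pipeline_kwargs:
        raise ValueError("'model' cannot be passed; it comes from the checkpoint")
    resolved = dict(pipeline_kwargs)
    # On GPU, default to device 0 unless the caller chose a device explicitly.
    if use_gpu:
        resolved.setdefault("device", 0)
    return resolved
```

Under these assumptions, resolve_pipeline_kwargs(use_gpu=True, task="text-generation") yields the task plus device 0, while an explicit device=1 is left untouched.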

predict(data: Union[numpy.ndarray, pandas.DataFrame, Dict[str, numpy.ndarray]], feature_columns: Optional[Union[List[str], List[int]]] = None, **predict_kwargs) -> Union[numpy.ndarray, pandas.DataFrame, Dict[str, numpy.ndarray]]

Run inference on data batch.

The data is converted into a list (unless pipeline is a TableQuestionAnsweringPipeline) and passed to the pipeline object.

  • data – A batch of input data: a pandas DataFrame, a numpy array, or a dict of numpy arrays.

  • feature_columns – The names or indices of the columns in the data to use as features to predict on. If None, use all columns.

  • **predict_kwargs – Additional kwargs to pass to the pipeline object. If use_gpu is True, ‘device’ will be set to 0 by default.
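To make the role of feature_columns concrete, the following sketch shows one way a DataFrame batch could be narrowed to the selected columns and turned into a list before being handed to a pipeline. It is an assumption-laden illustration, not Ray's code: the helper to_pipeline_inputs is hypothetical, integer entries are assumed to be positional indices, and unwrapping a single column into a flat list is likewise an assumption.

```python
from typing import List, Optional, Union

import pandas as pd


def to_pipeline_inputs(
    data: pd.DataFrame,
    feature_columns: Optional[Union[List[str], List[int]]] = None,
) -> list:
    """Hypothetical helper (not part of Ray): select columns, return a list."""
    if feature_columns:
        if all(isinstance(col, int) for col in feature_columns):
            # Assumption: integers are treated as positional column indices.
            data = data.iloc[:, feature_columns]
        else:
            data = data[feature_columns]
    # Assumption: a single column becomes a flat list of values,
    # several columns become a list of rows.
    if len(data.columns) == 1:
        return data.iloc[:, 0].tolist()
    return data.values.tolist()


prompts = pd.DataFrame(
    {"sentences": ["Complete me", "And me"], "ids": [1, 2]}
)
inputs = to_pipeline_inputs(prompts, feature_columns=["sentences"])
```

Here inputs holds only the text prompts, which is the shape a text-generation pipeline accepts; with feature_columns left as None, every column would be included.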


>>> import pandas as pd
>>> from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
>>> from transformers.pipelines import pipeline
>>> from ray.train.huggingface import HuggingFacePredictor
>>> model_checkpoint = "gpt2"
>>> tokenizer_checkpoint = "sgugger/gpt2-like-tokenizer"
>>> tokenizer = AutoTokenizer.from_pretrained(tokenizer_checkpoint)
>>> model_config = AutoConfig.from_pretrained(model_checkpoint)
>>> model = AutoModelForCausalLM.from_config(model_config)
>>> predictor = HuggingFacePredictor(
...     pipeline=pipeline(
...         task="text-generation", model=model, tokenizer=tokenizer
...     )
... )
>>> prompts = pd.DataFrame(
...     ["Complete me", "And me", "Please complete"], columns=["sentences"]
... )
>>> predictions = predictor.predict(prompts)

Returns the prediction result.