ray.data.datasource.MongoDatasource
- class ray.data.datasource.MongoDatasource(*args, **kwds)
Bases: ray.data.datasource.datasource.Datasource
Datasource for reading from and writing to MongoDB.
Examples
>>> import ray
>>> import pyarrow as pa
>>> from ray.data.datasource import MongoDatasource
>>> from pymongoarrow.api import Schema
>>> ds = ray.data.read_datasource(
...     MongoDatasource(),
...     uri="mongodb://username:[email protected]:27017/?authSource=admin",  # noqa: E501
...     database="my_db",
...     collection="my_collection",
...     schema=Schema({"col1": pa.string(), "col2": pa.int64()}),
... )
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
- create_reader(**kwargs) -> ray.data.datasource.datasource.Reader
Return a Reader for the given read arguments.
The reader object will be responsible for querying the read metadata, and generating the actual read tasks to retrieve the data blocks upon request.
- Parameters
read_args – Additional keyword arguments to pass to the datasource implementation; these are the **kwargs in the signature above.
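The split between planning reads and executing them can be illustrated without Ray or MongoDB. The following is a minimal sketch only: ToyReader and DOCS are hypothetical stand-ins, not part of Ray, and the method name get_read_tasks is borrowed from Ray's Reader interface purely for illustration. It shows a reader that returns deferred read tasks, each of which produces one block when called.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a MongoDB collection: a list of documents.
DOCS: List[Dict] = [{"_id": i, "col1": f"v{i}"} for i in range(10)]


class ToyReader:
    """Illustrative reader: plans read tasks, but reads nothing until a task is called."""

    def __init__(self, docs: List[Dict]) -> None:
        self._docs = docs

    def get_read_tasks(self, parallelism: int) -> List[Callable[[], List[Dict]]]:
        n = len(self._docs)
        if n == 0:
            return []
        # Never create more tasks than there are documents.
        num_tasks = max(1, min(parallelism, n))
        chunk = -(-n // num_tasks)  # ceiling division
        tasks = []
        for start in range(0, n, chunk):
            end = min(start + chunk, n)
            # Bind start/end as default arguments so each task keeps its own range.
            tasks.append(lambda s=start, e=end: self._docs[s:e])
        return tasks


reader = ToyReader(DOCS)
tasks = reader.get_read_tasks(parallelism=3)
# Data is only materialized here, one block per task.
blocks = [task() for task in tasks]
```

In Ray itself the tasks would be scheduled as remote tasks rather than called in a loop; the sketch keeps everything in-process so the control flow stays visible.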
- do_write(blocks: List[ray.types.ObjectRef[Union[List[ray.data.block.T], pyarrow.Table, pandas.DataFrame, bytes]]], metadata: List[ray.data.block.BlockMetadata], ray_remote_args: Optional[Dict[str, Any]], uri: str, database: str, collection: str) -> List[ray.types.ObjectRef[Any]]
Launch Ray tasks for writing blocks out to the datasource.
- Parameters
blocks – List of data block references. It is recommended that one write task be generated per block.
metadata – List of block metadata.
ray_remote_args – Kwargs passed to ray.remote in the write tasks.
write_args – Additional keyword arguments to pass to the datasource implementation. In the signature above these are uri, database, and collection.
- Returns
A list of the output of the write tasks.
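The "one write task per block" recommendation can be sketched in plain Python. This is an illustration under stated assumptions, not Ray's implementation: toy_do_write, write_block, and FAKE_COLLECTION are hypothetical names, a thread pool stands in for Ray remote tasks, and an in-memory list stands in for a MongoDB collection.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Hypothetical stand-in for a MongoDB collection.
FAKE_COLLECTION: List[Dict] = []
_LOCK = threading.Lock()


def write_block(block: List[Dict]) -> str:
    """Write one block of documents and report a status for that block."""
    with _LOCK:  # guard the shared list against concurrent writers
        FAKE_COLLECTION.extend(block)
    return "ok"


def toy_do_write(blocks: List[List[Dict]]) -> List[str]:
    """Launch one write task per block and collect each task's output in order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(write_block, block) for block in blocks]
        return [future.result() for future in futures]


blocks = [[{"x": 1}, {"x": 2}], [{"x": 3}]]
results = toy_do_write(blocks)
```

The returned list has one entry per block, mirroring the "list of the output of the write tasks" described above. The order in which documents land in the fake collection depends on thread scheduling, so callers should not rely on it.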