Mars on Ray
Mars is a tensor-based unified framework for large-scale data computation that scales NumPy, pandas, and scikit-learn. Mars on Ray lets you scale Mars programs across a Ray cluster.
Note
This API is experimental in Mars. If you encounter any bugs, please file an issue on GitHub.
Getting started
It’s easy to run Mars jobs on a Ray cluster. Import new_session with from mars.session import new_session and run new_session(backend='ray').as_default(); this creates a Mars session for Ray and makes it the default session, so all subsequent Mars tasks are submitted to Ray. Extra keyword arguments given when creating the Mars session are passed to ray.init(), for example: new_session(backend='ray', address=<address>, num_cpus=<num_cpus>).
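from mars.session import new_session
import mars.dataframe as md
import mars.tensor as mt
# Create a Mars session backed by Ray and make it the default session.
ray_session = new_session(backend='ray').as_default()
# Build a 100x4 random tensor, split into chunks with chunk_size=30.
t = mt.random.rand(100, 4, chunk_size=30)
# Wrap the tensor in a Mars DataFrame with columns a, b, c and d.
df = md.DataFrame(t, columns=list('abcd'))
# execute() runs the computation on Ray; print the summary statistics.
print(df.describe().execute())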
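The argument forwarding described above can be sketched as a small wrapper. This is only an illustration, not part of the Mars or Ray API: build_ray_session_kwargs and start_ray_session are hypothetical helper names, and the only forwarded options assumed here are address and num_cpus, the two shown in the text.

```python
from typing import Any, Dict, Optional


def build_ray_session_kwargs(
    address: Optional[str] = None,
    num_cpus: Optional[int] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for new_session (hypothetical helper).

    Options left as None are omitted, so ray.init() falls back to its
    own defaults for them.
    """
    kwargs: Dict[str, Any] = {"backend": "ray"}
    if address is not None:
        kwargs["address"] = address
    if num_cpus is not None:
        kwargs["num_cpus"] = num_cpus
    return kwargs


def start_ray_session(**options: Any):
    """Create a Mars session on Ray and make it the default (hypothetical helper)."""
    # Imported lazily so this module can be loaded where Mars is not installed.
    from mars.session import new_session

    return new_session(**build_ray_session_kwargs(**options)).as_default()


# Inspect the arguments that would be handed to new_session.
print(build_ray_session_kwargs(num_cpus=4))
```

Calling start_ray_session(num_cpus=4) would then be equivalent to new_session(backend='ray', num_cpus=4).as_default(), provided Mars and Ray are installed.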
Warning
The Mars remote API is not available for now.