KubeRay metrics references
controller-runtime metrics
KubeRay exposes metrics provided by kubernetes-sigs/controller-runtime, including information about reconciliation, work queues, and more, to help users operate the KubeRay operator in production environments.
For more details about the default metrics provided by kubernetes-sigs/controller-runtime, see Default Exported Metrics References.
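To see which metric families a running operator actually exposes, one option is to reduce the metrics output to a sorted list of distinct metric names. The sketch below is illustrative only: `list_metric_names` is a hypothetical helper, not part of KubeRay or controller-runtime, and it assumes the input is in the Prometheus text exposition format.

```shell
# Hypothetical helper: print the distinct metric names found in
# Prometheus text-format input on stdin.
list_metric_names() {
  # Drop comment lines ("# HELP", "# TYPE") and blank lines, then cut each
  # sample at the first "{" or space so only the metric name remains.
  grep -v -e '^#' -e '^$' \
    | sed -e 's/[{ ].*$//' \
    | LC_ALL=C sort -u
}

# Usage, assuming the operator's metrics endpoint is reachable locally
# (for example through the port-forward shown in the next section):
# curl -s localhost:8080/metrics | list_metric_names
```

Comparing the resulting names against the Default Exported Metrics References is a quick way to confirm which reconciliation and work queue metrics are present.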
KubeRay custom metrics
Starting with KubeRay 1.4.0, KubeRay provides metrics for its custom resources to help users better understand Ray clusters and Ray applications.
You can view these metrics by following the instructions below:
# Forward a local port to the KubeRay operator service.
kubectl port-forward service/kuberay-operator 8080
# View the metrics.
curl localhost:8080/metrics
# You should see metrics like the following if a RayCluster already exists:
# kuberay_cluster_info{name="raycluster-kuberay",namespace="default",owner_kind="None"} 1
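The endpoint returns the controller-runtime metrics and the KubeRay custom metrics together. As a minimal sketch, the custom metrics can be isolated by their name prefix. This assumes that every custom metric name starts with `kuberay_`, as in the sample line above; `filter_kuberay_metrics` is a hypothetical helper name.

```shell
# Hypothetical helper: keep only samples whose metric name starts with
# "kuberay_" (assumed prefix, taken from the sample output above).
# "# HELP" and "# TYPE" comment lines do not match and are dropped.
filter_kuberay_metrics() {
  grep '^kuberay_'
}

# Usage, assuming the port-forward above is still running:
# curl -s localhost:8080/metrics | filter_kuberay_metrics
```

Note that `grep` exits with a non-zero status when nothing matches, which is what you would see if no KubeRay custom resources exist yet.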
RayCluster metrics
| Metric name | Type | Description | Labels |
|---|---|---|---|
|  | Gauge | Metadata information about RayCluster custom resources. |  |
|  | Gauge | Indicates whether the RayCluster is provisioned. See RayClusterProvisioned for more information. |  |
|  | Gauge | The time, in seconds, when a RayCluster’s |  |
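Because the metadata metric emits one sample per RayCluster, its samples can be aggregated with standard text tools. The following sketch counts RayClusters per namespace. It assumes the `kuberay_cluster_info` name and the `namespace` label exactly as they appear in the sample output earlier on this page; `count_rayclusters_by_namespace` is a hypothetical helper name.

```shell
# Hypothetical helper: count RayClusters per namespace from Prometheus
# text-format input on stdin, printing "<namespace> <count>" lines.
count_rayclusters_by_namespace() {
  # Select the kuberay_cluster_info samples, extract the value of the
  # namespace label, then count how often each namespace occurs.
  grep '^kuberay_cluster_info{' \
    | sed -n 's/.*namespace="\([^"]*\)".*/\1/p' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2, $1 }'
}

# Usage, assuming the operator's metrics endpoint is port-forwarded locally:
# curl -s localhost:8080/metrics | count_rayclusters_by_namespace
```

In a monitoring setup the same aggregation would more typically be done in the metrics backend; the shell version is only meant for quick inspection.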
RayService metrics
| Metric name | Type | Description | Labels |
|---|---|---|---|
|  | Gauge | Metadata information about RayService custom resources. |  |
|  | Gauge | Describes whether the RayService is ready. Ready means users can send requests to the underlying cluster and the number of serve endpoints is greater than 0. See RayServiceReady for more information. |  |
|  | Gauge | Describes whether the RayService is performing a zero-downtime upgrade. See UpgradeInProgress for more information. |  |
RayJob metrics
| Metric name | Type | Description | Labels |
|---|---|---|---|
|  | Gauge | Metadata information about RayJob custom resources. |  |
|  | Gauge | The RayJob’s current deployment status. |  |
|  | Gauge | Duration of the RayJob CR’s JobDeploymentStatus transition from |  |