Using Keras & TensorFlow with Tune#

Prerequisites#
pip install "ray[tune]" tensorflow==2.18.0 filelock
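To check that the packages from the command above are visible in the current environment, a small sketch using only the standard library can help. The helper name `installed_versions` is hypothetical and not part of Ray or TensorFlow; it reports `None` for anything that is not installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("ray", "tensorflow", "filelock")):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


print(installed_versions())
```

If `tensorflow` shows a version other than the one pinned above, the example may still run, but the output could differ.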
Example#
import os

from filelock import FileLock
from tensorflow.keras.datasets import mnist

from ray import tune
from ray.tune.schedulers import AsyncHyperBandScheduler
from ray.air.integrations.keras import ReportCheckpointCallback


def train_mnist(config):
    # https://github.com/tensorflow/tensorflow/issues/32159
    import tensorflow as tf

    batch_size = 128
    num_classes = 10
    epochs = 12

    # Hold a file lock while loading MNIST so that concurrent trials
    # do not download the dataset at the same time.
    with FileLock(os.path.expanduser("~/.data.lock")):
        (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    model = tf.keras.models.Sequential(
        [
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(config["hidden"], activation="relu"),
            tf.keras.layers.Dropout(0.2),
            tf.keras.layers.Dense(num_classes, activation="softmax"),
        ]
    )

    model.compile(
        loss="sparse_categorical_crossentropy",
        optimizer=tf.keras.optimizers.SGD(
            learning_rate=config["learning_rate"], momentum=config["momentum"]
        ),
        metrics=["accuracy"],
    )

    model.fit(
        x_train,
        y_train,
        batch_size=batch_size,
        epochs=epochs,
        verbose=0,
        validation_data=(x_test, y_test),
        # Report the Keras training metric "accuracy" to Tune as "accuracy".
        callbacks=[ReportCheckpointCallback(metrics={"accuracy": "accuracy"})],
    )


def tune_mnist():
    # Each trial reports 12 iterations (one per epoch), which is below
    # grace_period=20, so the scheduler does not stop any trial early here.
    sched = AsyncHyperBandScheduler(
        time_attr="training_iteration", max_t=400, grace_period=20
    )

    tuner = tune.Tuner(
        tune.with_resources(train_mnist, resources={"cpu": 2, "gpu": 0}),
        tune_config=tune.TuneConfig(
            metric="accuracy",
            mode="max",
            scheduler=sched,
            num_samples=10,
        ),
        run_config=tune.RunConfig(
            name="exp",
            stop={"accuracy": 0.99},
        ),
        param_space={
            "threads": 2,
            "learning_rate": tune.uniform(0.001, 0.1),
            "momentum": tune.uniform(0.1, 0.9),
            "hidden": tune.randint(32, 512),
        },
    )
    results = tuner.fit()
    return results


results = tune_mnist()
best = results.get_best_result()
print(f"Best hyperparameters found were: {best.config} | Accuracy: {best.metrics['accuracy']}")
Tune Status
Current time: 2025-02-13 15:22:41
Running for: 00:00:41.76
Memory: 21.4/36.0 GiB
System Info
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 320.000: None | Iter 80.000: None | Iter 20.000: None
Logical resource usage: 2.0/12 CPUs, 0/0 GPUs
Trial Status
Trial name | status | loc | hidden | learning_rate | momentum | iter | total time (s) | accuracy |
---|---|---|---|---|---|---|---|---|
train_mnist_533a2_00000 | TERMINATED | 127.0.0.1:36365 | 371 | 0.0799367 | 0.588387 | 12 | 20.8515 | 0.984583 |
train_mnist_533a2_00001 | TERMINATED | 127.0.0.1:36364 | 266 | 0.0457424 | 0.22303 | 12 | 19.5277 | 0.96495 |
train_mnist_533a2_00002 | TERMINATED | 127.0.0.1:36368 | 157 | 0.0190286 | 0.537132 | 12 | 16.6606 | 0.95385 |
train_mnist_533a2_00003 | TERMINATED | 127.0.0.1:36363 | 451 | 0.0433488 | 0.18925 | 12 | 22.0514 | 0.966283 |
train_mnist_533a2_00004 | TERMINATED | 127.0.0.1:36367 | 276 | 0.0336728 | 0.430171 | 12 | 20.0884 | 0.964767 |
train_mnist_533a2_00005 | TERMINATED | 127.0.0.1:36366 | 208 | 0.071015 | 0.419166 | 12 | 17.933 | 0.976083 |
train_mnist_533a2_00006 | TERMINATED | 127.0.0.1:36475 | 312 | 0.00692959 | 0.714595 | 12 | 13.058 | 0.944017 |
train_mnist_533a2_00007 | TERMINATED | 127.0.0.1:36479 | 169 | 0.0694114 | 0.664904 | 12 | 10.7991 | 0.9803 |
train_mnist_533a2_00008 | TERMINATED | 127.0.0.1:36486 | 389 | 0.0370836 | 0.665592 | 12 | 14.018 | 0.977833 |
train_mnist_533a2_00009 | TERMINATED | 127.0.0.1:36487 | 389 | 0.0676138 | 0.52372 | 12 | 14.0043 | 0.981833 |
2025-02-13 15:22:41,843 INFO tune.py:1009 -- Wrote the latest version of all result files and experiment state to '/Users/rdecal/ray_results/exp' in 0.0048s.
2025-02-13 15:22:41,846 INFO tune.py:1041 -- Total run time: 41.77 seconds (41.75 seconds for the tuning loop).
Best hyperparameters found were: {'threads': 2, 'learning_rate': 0.07993666231835218, 'momentum': 0.5883866709655042, 'hidden': 371} | Accuracy: 0.98458331823349
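The `Iter 320.000`, `Iter 80.000`, and `Iter 20.000` entries in the status output above are the scheduler's rungs, the iteration counts at which it compares trials. The sketch below reproduces them under two assumptions: that rungs sit at `grace_period * reduction_factor**k` up to `max_t`, and that `reduction_factor` is left at a default of 4. The helper `asha_rungs` is hypothetical and not part of Ray.

```python
def asha_rungs(grace_period, max_t, reduction_factor=4):
    """Return the iteration counts at which trials are compared (assumed rung rule)."""
    rungs = []
    milestone = grace_period
    while milestone <= max_t:
        rungs.append(milestone)
        milestone *= reduction_factor
    return rungs


# Values taken from the AsyncHyperBandScheduler call in the example.
rungs = asha_rungs(grace_period=20, max_t=400)
print(rungs)  # [20, 80, 320]

# Each trial in the example reports one iteration per epoch.
epochs = 12
reached = [r for r in rungs if r <= epochs]
print(reached)  # []
```

Because no trial reaches the first rung at iteration 20, every rung shows `None` and `num_stopped` stays at 0. Raising `epochs` or lowering `grace_period` would likely let the scheduler prune weaker trials.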
Exact values vary from run to run, but the final line of the example's output should look something like:
Best hyperparameters found were: {'threads': 2, 'learning_rate': 0.07607440973606909, 'momentum': 0.7715363277240616, 'hidden': 452} | Accuracy: 0.98458331823349
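The best result is chosen with `metric="accuracy"` and `mode="max"` from the `TuneConfig`, applied to each trial's last reported value. As a rough illustration in plain Python (not Ray's implementation), the same selection can be made over the rows of the trial table above. The helper `best_trial` and the tuple layout are assumptions made for this sketch.

```python
# (trial suffix, hidden, learning_rate, momentum, last reported accuracy),
# copied from the trial status table in the output above.
trials = [
    ("00000", 371, 0.0799367, 0.588387, 0.984583),
    ("00001", 266, 0.0457424, 0.22303, 0.96495),
    ("00002", 157, 0.0190286, 0.537132, 0.95385),
    ("00003", 451, 0.0433488, 0.18925, 0.966283),
    ("00004", 276, 0.0336728, 0.430171, 0.964767),
    ("00005", 208, 0.071015, 0.419166, 0.976083),
    ("00006", 312, 0.00692959, 0.714595, 0.944017),
    ("00007", 169, 0.0694114, 0.664904, 0.9803),
    ("00008", 389, 0.0370836, 0.665592, 0.977833),
    ("00009", 389, 0.0676138, 0.52372, 0.981833),
]


def best_trial(rows, mode="max"):
    """Pick the row with the highest (or lowest) value in the last column."""
    pick = max if mode == "max" else min
    return pick(rows, key=lambda row: row[-1])


best = best_trial(trials)
print(best)  # ('00000', 371, 0.0799367, 0.588387, 0.984583)
```

The selected row matches the `hidden` value of 371 and the accuracy of about 0.9846 printed at the end of the logged run.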
More Keras and TensorFlow Examples#
Memory NN Example: Example of training a Memory NN on bAbI with Keras using PBT.
TensorFlow MNIST Example: Converts the Advanced TF2.0 MNIST example to use Tune with the Trainable. This uses tf.function. Original code from TensorFlow: https://www.tensorflow.org/tutorials/quickstart/advanced
Keras Cifar10 Example: A contributed example of tuning a Keras model on CIFAR10 with the PopulationBasedTraining scheduler.