migrate retinanet example to keras 3 #2187
base: master
Conversation
Summary of Changes

Hello @Shi-pra-19, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request focuses on modernizing the RetinaNet example by migrating its codebase to Keras 3. The changes ensure that the example leverages the latest Keras API, enhancing compatibility and maintainability. This involved a comprehensive update of TensorFlow-dependent operations to their Keras backend counterparts, refining data handling, and adapting model training components like loss functions and checkpointing to the Keras 3 standard.
Code Review

This is a great first contribution, thank you for taking the time to migrate this example to Keras 3! The changes are mostly correct and follow the Keras 3 migration guide well. I've found a few issues, including a critical one in the loss function implementation that would prevent the model from training correctly. I've also pointed out a regression in functionality regarding loading pre-trained weights, and some places where the migration to keras.ops is incomplete. Please take a look at my comments. Once these are addressed, this PR should be in good shape. Welcome to the repository!
cross_entropy = keras.ops.binary_crossentropy(
    y_true, y_pred
)
The y_pred tensors are logits from the model's head, but keras.ops.binary_crossentropy expects probabilities by default. You need to pass from_logits=True to ensure the loss is calculated correctly. Without this, the model will not train correctly.
Suggested change:

- cross_entropy = keras.ops.binary_crossentropy(
-     y_true, y_pred
- )
+ cross_entropy = keras.ops.binary_crossentropy(
+     y_true, y_pred, from_logits=True
+ )
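To see why from_logits matters, here is a minimal numpy sketch (not the Keras implementation itself) that mirrors the semantics of binary cross-entropy with and without from_logits. Treating raw logits as probabilities silently clips them into [0, 1] and produces a meaningless loss:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(y_true, y_pred, from_logits=False, eps=1e-7):
    # Mirrors binary cross-entropy semantics: with from_logits=True the
    # inputs are raw scores and are squashed through a sigmoid first.
    if from_logits:
        y_pred = sigmoid(y_pred)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1.0, 0.0, 1.0])
logits = np.array([2.0, -1.0, 0.5])  # raw scores from a model head

wrong = bce(y_true, logits)                    # logits misread as probabilities
right = bce(y_true, logits, from_logits=True)  # logits squashed first
```

Passing from_logits=True is numerically equivalent to applying the sigmoid yourself before the loss, but letting the loss do it is also more numerically stable in real frameworks.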
examples/vision/retinanet.py
Outdated
positive_mask = keras.ops.cast(tf.greater(y_true[:, :, 4], -1.0), dtype="float32")
ignore_mask = keras.ops.cast(keras.ops.equal(y_true[:, :, 4], -2.0), dtype="float32")
You've started migrating to keras.ops here, but there are still some tf ops being used inside the cast. For backend agnosticity, you should consistently use keras.ops. Specifically, tf.greater should be keras.ops.greater and tf.equal should be keras.ops.equal.
Suggested change:

- positive_mask = keras.ops.cast(tf.greater(y_true[:, :, 4], -1.0), dtype="float32")
+ positive_mask = keras.ops.cast(keras.ops.greater(y_true[:, :, 4], -1.0), dtype="float32")
  ignore_mask = keras.ops.cast(keras.ops.equal(y_true[:, :, 4], -2.0), dtype="float32")
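For intuition, here is a small numpy sketch of the mask logic itself (a sketch, not the example's code): column 4 of y_true encodes the anchor match state, where a class id >= 0 marks a positive anchor, -1.0 marks background, and -2.0 marks anchors to ignore:

```python
import numpy as np

# One batch of four anchors; each value is the match state from column 4
# of y_true: class id >= 0 (positive), -1.0 (background), -2.0 (ignored).
labels = np.array([[3.0, -1.0, -2.0, 0.0]])

# Positive anchors have a label strictly greater than -1.0.
positive_mask = (labels > -1.0).astype("float32")
# Ignored anchors are exactly -2.0.
ignore_mask = (labels == -2.0).astype("float32")
```

The keras.ops.greater / keras.ops.equal calls compute exactly these comparisons, but do so on whatever backend tensor type is in use, which is why mixing in tf.greater breaks backend agnosticity.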
def get_latest_weights(model_dir):
    weight_files = glob.glob(os.path.join(model_dir, "*.weights.h5"))

    if not weight_files:
        raise FileNotFoundError(f"No weight files found in {model_dir}")

    latest_weight_file = max(weight_files, key=os.path.getmtime)

    return latest_weight_file


model.load_weights(get_latest_weights(model_dir))
The previous implementation allowed loading pre-trained weights from the downloaded dataset. This new implementation removes that capability and requires the user to train the model first to generate weight files before running inference. This is a regression in the example's functionality.

The downloaded weights are in TensorFlow checkpoint format, which your new get_latest_weights function cannot handle, as it only looks for .weights.h5 files.

To maintain the ability to use pre-trained weights, you could check if trained weights exist and fall back to the pre-trained ones if not. Since the example still contains other TensorFlow-specific code (e.g., tf.image.*), it's acceptable to use tf.train.latest_checkpoint for now for the TF backend.
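One possible shape for that fallback, sketched without the actual TensorFlow call (the function name resolve_weights and the pretrained_ckpt parameter are illustrative, not from the PR):

```python
import glob
import os

def resolve_weights(model_dir, pretrained_ckpt=None):
    """Prefer locally trained .weights.h5 files; otherwise fall back to
    pre-trained weights. `pretrained_ckpt` stands in for the path that
    tf.train.latest_checkpoint would return on the TF backend."""
    weight_files = glob.glob(os.path.join(model_dir, "*.weights.h5"))
    if weight_files:
        # Most recently modified trained weights win.
        return max(weight_files, key=os.path.getmtime)
    if pretrained_ckpt is not None:
        # No trained weights yet: use the downloaded pre-trained checkpoint.
        return pretrained_ckpt
    raise FileNotFoundError(f"No weight files found in {model_dir}")
```

This keeps the example runnable for users who skip training, while still preferring freshly trained weights when they exist.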
examples/vision/retinanet.py
Outdated
self.num_classes = num_classes

- prior_probability = tf.constant_initializer(-np.log((1 - 0.01) / 0.01))
+ prior_probability = prior_probability = keras.initializers.Constant(-np.log((1 - 0.01) / 0.01))
The new line contains a duplicated assignment (prior_probability = prior_probability = ...); it should be a single assignment. keras.initializers.Constant is otherwise the correct Keras 3 replacement for tf.constant_initializer.
model = RetinaNet(num_classes, resnet50_backbone)

- optimizer = tf.keras.optimizers.legacy.SGD(learning_rate=learning_rate_fn, momentum=0.9)
+ optimizer = keras.optimizers.SGD(learning_rate=learning_rate_fn, momentum=0.9)
keras.optimizers.SGD is the correct Keras 3 replacement for the legacy tf.keras.optimizers.legacy.SGD, and it behaves consistently across backends. This change looks good.
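For reference, a minimal numpy sketch of the update rule that SGD with (non-Nesterov) momentum applies; the function name here is illustrative, not a Keras API:

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    # Standard momentum update:
    #   v <- momentum * v - lr * grad
    #   w <- w + v
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# Three steps on f(w) = w**2, whose gradient is 2*w.
w = np.array([1.0])
v = np.zeros_like(w)
for _ in range(3):
    grad = 2.0 * w
    w, v = sgd_momentum_step(w, grad, v)
```

The velocity term is what distinguishes this from plain gradient descent: past gradients keep contributing, smoothing the trajectory on noisy detection losses like RetinaNet's.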
Fixed the following:

This commit updates the RetinaNet example to be compatible with Keras 3. This migration ensures that the example works seamlessly with the latest version of Keras.

This is my first PR to the repo, and I'm excited to contribute!