
Conversation

Shi-pra-19

This commit updates the RetinaNet example to be compatible with Keras 3. The following changes were made:

  • Refactored the code to align with Keras 3's API.
  • Adjusted imports, model initialization, training, and callbacks as required by the new version of Keras.

This migration ensures that the example works seamlessly with the latest version of Keras.

This is my first PR to the repo, and I’m excited to contribute!


Summary of Changes

Hello @Shi-pra-19, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on modernizing the RetinaNet example by migrating its codebase to Keras 3. The changes ensure that the example leverages the latest Keras API, enhancing compatibility and maintainability. This involved a comprehensive update of TensorFlow-dependent operations to their Keras backend counterparts, refining data handling, and adapting model training components like loss functions and checkpointing to the Keras 3 standard.

Highlights

  • Keras 3 Migration: The RetinaNet example has been updated to be fully compatible with Keras 3, ensuring seamless operation with the latest version of the framework.
  • API Alignment: Code refactoring was performed to align with Keras 3's API, specifically by replacing TensorFlow-specific operations (e.g., tf.stack, tf.concat, tf.cast) with their Keras backend equivalents (e.g., keras.ops.stack, keras.ops.concatenate, keras.ops.cast); see the sketch after this list.
  • Updated Data Handling and Model Checkpointing: Adjustments were made to data loading using keras.utils.get_file with named arguments, and model checkpointing now explicitly saves weights with a .weights.h5 extension. A new utility function get_latest_weights was introduced to robustly load the most recent model weights.
  • Loss Function Updates: The custom loss functions (RetinaNetBoxLoss, RetinaNetClassificationLoss, RetinaNetLoss) were updated to inherit from keras.losses.Loss and utilize Keras operations for their calculations, including a change in RetinaNetLoss's reduction strategy.
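
A quick sketch of the replacement pattern described above. The tensor shapes and the checkpoint filepath here are illustrative, not taken from the example:

```python
import keras

# tf.* ops replaced by their backend-agnostic keras.ops equivalents:
a = keras.ops.ones((2, 3))
b = keras.ops.zeros((2, 3))
stacked = keras.ops.stack([a, b], axis=0)       # was tf.stack
joined = keras.ops.concatenate([a, b], axis=0)  # was tf.concat
casted = keras.ops.cast(joined, "int32")        # was tf.cast

# Keras 3 requires the ".weights.h5" suffix when saving weights only;
# this filepath pattern is illustrative, not the example's exact path.
checkpoint_cb = keras.callbacks.ModelCheckpoint(
    filepath="retinanet/weights_epoch_{epoch}.weights.h5",
    save_weights_only=True,
)
```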


gemini-code-assist bot left a comment


Code Review

This is a great first contribution, thank you for taking the time to migrate this example to Keras 3! The changes are mostly correct and follow the Keras 3 migration guide well. I've found a few issues, including a critical one in the loss function implementation that would prevent the model from training correctly. I've also pointed out a regression in functionality regarding loading pre-trained weights and some places where the migration to keras.ops is incomplete. Please take a look at my comments. Once these are addressed, this PR should be in good shape. Welcome to the repository!

Comment on lines 785 to 787
cross_entropy = keras.ops.binary_crossentropy(
    y_true, y_pred
)

critical

The y_pred tensors are logits from the model's head, but keras.ops.binary_crossentropy expects probabilities by default. You need to pass from_logits=True to ensure the loss is calculated correctly. Without this, the model will not train correctly.

Suggested change
cross_entropy = keras.ops.binary_crossentropy(
    y_true, y_pred
)
cross_entropy = keras.ops.binary_crossentropy(
    y_true, y_pred, from_logits=True
)
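
As a quick sanity check (a minimal sketch with illustrative values, not code from the example), the probability path and the logits path agree only when the sigmoid is applied on one side:

```python
import numpy as np
import keras

# Illustrative labels and raw head outputs:
y_true = np.array([[1.0, 0.0, 1.0]])
logits = np.array([[2.0, -1.0, 0.5]])

loss_from_probs = keras.ops.binary_crossentropy(y_true, keras.ops.sigmoid(logits))
loss_from_logits = keras.ops.binary_crossentropy(y_true, logits, from_logits=True)
# The two agree (up to numerical precision); passing raw logits without
# from_logits=True silently computes the loss against the wrong quantity.
```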

Comment on lines 814 to 815
positive_mask = keras.ops.cast(tf.greater(y_true[:, :, 4], -1.0), dtype="float32")
ignore_mask = keras.ops.cast(keras.ops.equal(y_true[:, :, 4], -2.0), dtype="float32")

high

You've started migrating to keras.ops here, but a tf op is still used inside the first cast. To keep the code backend-agnostic, use keras.ops consistently: tf.greater should be keras.ops.greater (the second line already uses keras.ops.equal).

Suggested change
positive_mask = keras.ops.cast(tf.greater(y_true[:, :, 4], -1.0), dtype="float32")
ignore_mask = keras.ops.cast(keras.ops.equal(y_true[:, :, 4], -2.0), dtype="float32")
positive_mask = keras.ops.cast(keras.ops.greater(y_true[:, :, 4], -1.0), dtype="float32")
ignore_mask = keras.ops.cast(keras.ops.equal(y_true[:, :, 4], -2.0), dtype="float32")
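
One way to catch any remaining tf.* calls is to run the example under a non-TensorFlow backend, where leftover tf ops fail on backend-native tensors while keras.ops dispatches correctly. A minimal sketch, assuming the JAX backend is installed and using illustrative label values:

```python
import os
os.environ["KERAS_BACKEND"] = "jax"  # must be set before the first keras import

import keras

# keras.ops dispatches to the active backend, so the migrated masks run unchanged:
y_true_cls = keras.ops.convert_to_tensor([[0.0, -1.0, -2.0]])
positive_mask = keras.ops.cast(keras.ops.greater(y_true_cls, -1.0), dtype="float32")
ignore_mask = keras.ops.cast(keras.ops.equal(y_true_cls, -2.0), dtype="float32")
```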

Comment on lines +944 to +955
def get_latest_weights(model_dir):
    weight_files = glob.glob(os.path.join(model_dir, "*.weights.h5"))

    if not weight_files:
        raise FileNotFoundError(f"No weight files found in {model_dir}")

    latest_weight_file = max(weight_files, key=os.path.getmtime)

    return latest_weight_file


model.load_weights(get_latest_weights(model_dir))

high

The previous implementation allowed loading pre-trained weights from the downloaded dataset. This new implementation removes that capability and requires the user to train the model first to generate weight files before running inference. This is a regression in the example's functionality.

The downloaded weights are in TensorFlow checkpoint format, which your new get_latest_weights function cannot handle as it only looks for .weights.h5 files.

To maintain the ability to use pre-trained weights, you could check if trained weights exist and fall back to the pre-trained ones if not. Since the example still contains other TensorFlow-specific code (e.g., tf.image.*), it's acceptable to use tf.train.latest_checkpoint for now for the TF backend.
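
For reference, the pattern the previous implementation used looked roughly like this (a hedged reconstruction, with model_dir and model as defined in the example). Note it relies on tf.keras behavior and only works with the TensorFlow backend, since Keras 3's load_weights does not read TF-format checkpoints:

```python
import tensorflow as tf

# Pre-migration pattern: resolve the newest TF-format checkpoint in model_dir,
# whether freshly trained or from the downloaded pre-trained weights.
latest_checkpoint = tf.train.latest_checkpoint(model_dir)
model.load_weights(latest_checkpoint)
```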

self.num_classes = num_classes

prior_probability = tf.constant_initializer(-np.log((1 - 0.01) / 0.01))
prior_probability = prior_probability = keras.initializers.Constant(-np.log((1 - 0.01) / 0.01))

medium

There is a typo in this assignment: the target prior_probability is written twice, leaving a redundant chained assignment.

Suggested change
prior_probability = prior_probability = keras.initializers.Constant(-np.log((1 - 0.01) / 0.01))
prior_probability = keras.initializers.Constant(-np.log((1 - 0.01) / 0.01))
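
For context (not part of the diff), this constant is the standard RetinaNet focal-loss prior: it initializes the classification head's bias so the initial predicted foreground probability is roughly 0.01. A quick check:

```python
import numpy as np

pi = 0.01  # target initial foreground probability (the focal-loss prior)
bias = -np.log((1 - pi) / pi)
print(1.0 / (1.0 + np.exp(-bias)))  # sigmoid(bias) -> 0.01
```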

model = RetinaNet(num_classes, resnet50_backbone)

optimizer = tf.keras.optimizers.legacy.SGD(learning_rate=learning_rate_fn, momentum=0.9)
optimizer = keras.optimizers.SGD(learning_rate=learning_rate_fn, momentum=0.9)

medium

While keras.optimizers.SGD is correct, for Keras 3 it is recommended to use the functional optimizer keras.optimizers.sgd for better performance and consistency across backends.

Suggested change
optimizer = keras.optimizers.SGD(learning_rate=learning_rate_fn, momentum=0.9)
optimizer = keras.optimizers.sgd(learning_rate=learning_rate_fn, momentum=0.9)

@Shi-pra-19 (Author)

Fixed the following:

  • Corrected the prior_probability typo.
  • Added from_logits=True to the classification loss.
  • Kept keras.optimizers.SGD instead of keras.optimizers.sgd, which raises an AttributeError.
  • Pre-trained weights are not loaded because the downloaded checkpoints are in TensorFlow checkpoint format, which Keras 3 cannot read.
