[NLP] New example: IMDB Sentiment Analysis with TextVectorization and LSTM #2215

base: master

Conversation
Summary of Changes (Gemini Code Assist): This pull request introduces a comprehensive new example for Natural Language Processing within the repository. Its primary goal is to showcase a modern and portable approach to binary sentiment classification on the IMDB dataset.
Code Review
This pull request adds a valuable end-to-end example for sentiment analysis using TextVectorization and an LSTM model. The code is well-structured and demonstrates modern Keras practices. However, I've found a critical issue in the data loading logic that prevents the script from running correctly. I've also included a few suggestions to improve code consistency and cleanup, which will enhance the quality of the example.
examples/nlp/imdb_lstm_sentiment.py (Outdated)

```python
dataset_path = keras.utils.get_file(
    "aclImdb_v1.tar.gz", data_url, untar=True, cache_dir=".", cache_subdir=""
)
main_dir = os.path.join(dataset_path, "aclImdb")
```
The path to the main data directory, main_dir, is constructed incorrectly. The dataset_path variable holds the path to the downloaded archive file (e.g., ./aclImdb_v1.tar.gz), but the aclImdb directory is extracted into the parent directory of the archive, not within the archive file path itself. This will cause the script to fail with a FileNotFoundError. The path should be constructed using the directory part of dataset_path.
Suggested change:

```python
main_dir = os.path.join(os.path.dirname(dataset_path), "aclImdb")
```
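To see why the directory part is needed, here is a minimal stdlib sketch (the paths are illustrative; `get_file` with `untar=True` extracts next to the archive):

```python
import os

# get_file returns the path to the downloaded archive itself,
# e.g. "./aclImdb_v1.tar.gz".
dataset_path = "./aclImdb_v1.tar.gz"

# Joining the archive path with "aclImdb" yields a path *under* the archive
# file, which does not exist on disk and raises FileNotFoundError later:
wrong = os.path.join(dataset_path, "aclImdb")
print(wrong)  # ./aclImdb_v1.tar.gz/aclImdb

# The extracted directory lives next to the archive, so join with its parent:
right = os.path.join(os.path.dirname(dataset_path), "aclImdb")
print(right)  # ./aclImdb
```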
examples/nlp/imdb_lstm_sentiment.py (Outdated)

```python
import pandas as pd
from keras import layers
from keras.models import Sequential
from keras.layers import TextVectorization  # Modern Keras text preprocessing
```
examples/nlp/imdb_lstm_sentiment.py (Outdated)

```python
embedding_dim = 128  # Size of the output vector for each word

# 1. Create the TextVectorization layer
vectorize_layer = TextVectorization(
```
For consistency with how other layers like layers.Embedding are used in this file, TextVectorization should be accessed via the imported layers module. This change, combined with removing the redundant import on line 17, makes the code more consistent and readable.
Suggested change:

```python
vectorize_layer = layers.TextVectorization(
```

examples/nlp/imdb_lstm_sentiment.py (Outdated)

```python
def predict_sentiment(review):
    """Predicts sentiment for a raw text review using the end-to-end model."""
    # The model accepts a list/array of raw strings directly
    prediction = model.predict([review])

    # Sigmoid output is a probability
    sentiment = "positive" if prediction[0][0] > 0.5 else "negative"
    probability = prediction[0][0]
    return sentiment, probability


# Examples
print("\n### Predicting Values")
examples = [
    "This movie was fantastic. I loved it.",
    "This movie was not that good",
    "Great movie but could have added a better action scene",
    "Mid movie"
]

for review in examples:
    sentiment, prob = predict_sentiment(review)
    print(f"Review: '{review[:30]}...' -> Sentiment: {sentiment} ({prob:.2f})")
```
The current implementation calls model.predict() inside a loop for each example review. This is inefficient as it doesn't take advantage of batch processing. It's better practice to pass the entire list of examples to model.predict() once, which will perform inference on all of them in a single batch. This is more performant and demonstrates a more typical inference workflow.
Suggested change:

```python
def predict_sentiment(review):
    """Predicts sentiment for a raw text review using the end-to-end model."""
    # The model accepts a list/array of raw strings directly
    prediction = model.predict([review])
    # Sigmoid output is a probability
    probability = prediction[0][0]
    sentiment = "positive" if probability > 0.5 else "negative"
    return sentiment, probability


# Examples
print("\n### Predicting Values")
examples = [
    "This movie was fantastic. I loved it.",
    "This movie was not that good",
    "Great movie but could have added a better action scene",
    "Mid movie"
]

# Predict on the batch of examples for efficiency
predictions = model.predict(examples)
for review, prediction in zip(examples, predictions):
    prob = prediction[0]
    sentiment = "positive" if prob > 0.5 else "negative"
    print(f"Review: '{review[:30]}...' -> Sentiment: {sentiment} ({prob:.2f})")
```
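The batched pattern can be illustrated without Keras at all. In this sketch, `StubModel` and its fixed scores are hypothetical stand-ins mimicking the `[[p], [p], ...]` shape `model.predict()` returns for a batch:

```python
# Hypothetical stand-in for the trained Keras model: predict() takes a list of
# raw strings and returns one sigmoid-style probability per input.
class StubModel:
    def predict(self, reviews):
        # Fixed illustrative scores, one per review in the batch.
        scores = [0.91, 0.12, 0.67, 0.40]
        return [[s] for s in scores[: len(reviews)]]


model = StubModel()
examples = [
    "This movie was fantastic. I loved it.",
    "This movie was not that good",
    "Great movie but could have added a better action scene",
    "Mid movie",
]

# One predict() call covers the whole batch instead of one call per review.
predictions = model.predict(examples)
for review, prediction in zip(examples, predictions):
    prob = prediction[0]
    sentiment = "positive" if prob > 0.5 else "negative"
    print(f"Review: '{review[:30]}...' -> Sentiment: {sentiment} ({prob:.2f})")
```

With a real Keras model the batched call also lets the backend vectorize inference across the whole batch, which is where the performance win comes from.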
examples/nlp/imdb_lstm_sentiment.py (Outdated)

```python
# Clean up the downloaded directory
if os.path.exists(main_dir):
    shutil.rmtree(main_dir)
```
The cleanup logic only removes the extracted dataset directory. The downloaded archive file (aclImdb_v1.tar.gz) remains in the working directory. For a cleaner example that doesn't leave artifacts, the script should also remove this downloaded file.
Suggested change:

```python
# Clean up the downloaded directory and archive
if os.path.exists(main_dir):
    shutil.rmtree(main_dir)
if os.path.exists(dataset_path):
    os.remove(dataset_path)
```
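The two-step cleanup can be exercised safely in a temporary directory. This sketch uses illustrative stand-ins for the example's archive and extracted directory:

```python
import os
import shutil
import tempfile

# Illustrative stand-ins for the dataset artifacts, created inside a
# temporary directory so the sketch is safe to run anywhere.
workdir = tempfile.mkdtemp()
dataset_path = os.path.join(workdir, "aclImdb_v1.tar.gz")  # downloaded archive
main_dir = os.path.join(workdir, "aclImdb")                # extracted directory

open(dataset_path, "w").close()
os.makedirs(os.path.join(main_dir, "train"))

# Remove both the extracted directory tree and the archive file.
if os.path.exists(main_dir):
    shutil.rmtree(main_dir)
if os.path.exists(dataset_path):
    os.remove(dataset_path)

print(os.listdir(workdir))  # []
os.rmdir(workdir)
```

`shutil.rmtree` handles the directory tree, while the plain file needs `os.remove`; the `os.path.exists` guards make the cleanup idempotent.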
I've now pushed the code again. All automated and preliminary review comments have been addressed, including:
- Fixing the data path construction (`os.path.dirname`).
- Using `layers.TextVectorization` for consistent layer access.
- Optimizing prediction with batch inference.
All CI checks should now be passing. Ready for your review!
Adds a new end-to-end NLP example demonstrating binary sentiment classification on the IMDB dataset using a Sequential model with an LSTM layer.
This submission addresses previous review comments by: