Implement JS entity extraction from text #999
Aarchi-07
commented
Oct 16, 2025
- Converted Python entity extraction code to JavaScript.
- Added functions to extract phone numbers, URLs, and directions info.
- Integrated prompts with AI model to return structured lists of entities.
Summary of Changes
Hello @Aarchi-07, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a new JavaScript-based entity extraction module, porting functionality previously available in Python. It uses the Gemini AI model to identify and structure entities such as phone numbers, URLs, and transportation details from unstructured text.
Code Review
This pull request successfully converts the Python entity extraction notebook into a JavaScript version, providing examples for extracting various types of information like street names, phone numbers, and URLs. The code is clear and follows the structure of the original notebook. My main feedback concerns a fundamental JavaScript best practice: all variables are currently created as implicit globals. This can lead to bugs and makes the code harder to maintain. I've added several suggestions to declare all variables with `const` to properly scope them. Applying these changes will significantly improve the code quality and align it with standard JavaScript practices.
module = await import("https://esm.sh/@google/genai@1.4.0");
GoogleGenAI = module.GoogleGenAI;
ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

MODEL_ID = "gemini-2.5-flash" // ["gemini-2.5-flash-lite", "gemini-2.5-flash", "gemini-2.5-pro"]
In JavaScript, it's a best practice to always declare variables with `const`, `let`, or `var` to avoid creating implicit globals. This improves code clarity and prevents potential naming conflicts. For the SDK import, you can use object destructuring for a more concise syntax.
const { GoogleGenAI } = await import("https://esm.sh/@google/genai@1.4.0");
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const MODEL_ID = "gemini-2.5-flash"; // ["gemini-2.5-flash-lite", "gemini-2.5-flash", "gemini-2.5-pro"]
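As a minimal sketch of why the declarations matter (the function and variable names here are hypothetical and not part of the PR): in strict mode, which ES modules use by default, assigning to an undeclared identifier throws a `ReferenceError` instead of quietly creating a global.

```javascript
// Hypothetical demo, not part of the PR: shows what happens to an
// undeclared assignment under strict mode (the default in ES modules).
function assignUndeclared() {
  "use strict";
  try {
    undeclaredCounter = 1; // no const/let/var in front of the name
    return "assigned";
  } catch (err) {
    return err.name;
  }
}

const outcome = assignUndeclared();
console.log(outcome); // prints "ReferenceError"
```

In non-strict code the same assignment would succeed and leave `undeclaredCounter` on the global object, which is the behavior the review is warning about.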
*/

// [CODE STARTS]
directions = `
directionsPrompt = `
From the given text, extract the following entities and return a list of them.
Entities to extract: street name, form of transport.
Text: ${directions}
Street = []
Transport = []
`;

response = await ai.models.generateContent({
model: MODEL_ID,
contents: [directionsPrompt],
});
The variables `directionsPrompt` and `response` are being created as implicit globals. They should be declared with `const` to properly scope them to this code block. This is a fundamental best practice in JavaScript to avoid side effects and improve maintainability.
const directionsPrompt = `
From the given text, extract the following entities and return a list of them.
Entities to extract: street name, form of transport.
Text: ${directions}
Street = []
Transport = []
`;
const response = await ai.models.generateContent({
model: MODEL_ID,
contents: [directionsPrompt],
});
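The extraction prompts in this PR share the same opening lines, so they could be assembled by a small helper. The sketch below is one possible shape; `buildEntityPrompt` and the sample text are hypothetical and do not appear in the PR.

```javascript
// Hypothetical helper, not part of the PR: builds the shared part of the
// extraction prompt from a list of entity names and the source text.
function buildEntityPrompt(entities, text) {
  return [
    "From the given text, extract the following entities and return a list of them.",
    `Entities to extract: ${entities.join(", ")}.`,
    `Text: ${text}`,
  ].join("\n");
}

// Placeholder text, assumed for illustration only.
const sampleDirections = "Take the bus down Main Street.";
const samplePrompt = buildEntityPrompt(
  ["street name", "form of transport"],
  sampleDirections
);
console.log(samplePrompt);
```

The resulting string can be passed as `contents: [samplePrompt]` in the same `generateContent` call used in the suggestion above; any output-format hint (such as the `Street = []` lines) would still need to be appended by the caller.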
directionsListPrompt = `
From the given text, extract the following entities and
return a list of them.
Entities to extract: street name, form of transport.
Text: ${directions}
Return your answer as two lists:
Street = [street names]
Transport = [forms of transport]
`;

response = await ai.models.generateContent({
model: MODEL_ID,
contents: [directionsListPrompt],
});
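The variables `directionsListPrompt` and `response` are created as implicit globals. Please declare them with `const` to scope them correctly. Re-declaring `response` is fine here only if each snippet runs in its own scope (for example, its own block or notebook cell), because `const` is block-scoped; a second `const response` in the same scope is a `SyntaxError`.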
const directionsListPrompt = `
From the given text, extract the following entities and
return a list of them.
Entities to extract: street name, form of transport.
Text: ${directions}
Return your answer as two lists:
Street = [street names]
Transport = [forms of transport]
`;
const response = await ai.models.generateContent({
model: MODEL_ID,
contents: [directionsListPrompt],
});
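The scoping point can be checked with a small sketch. The function names and string values are hypothetical; the only claim is the standard behavior of `const`: the same name may be declared again in a separate block, but not twice in one scope.

```javascript
// Hypothetical demo, not part of the PR.
// Each pair of braces is its own block, so `response` can be declared twice.
function runSnippets() {
  const results = [];
  {
    const response = "first";
    results.push(response);
  }
  {
    const response = "second";
    results.push(response);
  }
  return results;
}

// Declaring the same const twice in ONE scope fails at parse time.
// new Function() is used only so the error can be caught and inspected.
function redeclareInSameScope() {
  try {
    new Function("const response = 1; const response = 2;");
    return "parsed";
  } catch (err) {
    return err.name;
  }
}

const scoped = runSnippets();
const sameScope = redeclareInSameScope();
console.log(scoped, sameScope);
```

If all snippets end up in a single file or cell, one option is distinct names such as `directionsResponse` and `directionsListResponse`; those names are suggestions only.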
*/

// [CODE STARTS]
customerServiceEmail = `
phonePrompt = `
From the given text, extract the following entities and return a list of them.
Entities to extract: phone numbers.
Text: ${customerServiceEmail}
Return your answer in a list:
`;

response = await ai.models.generateContent({
model: MODEL_ID,
contents: [phonePrompt],
});
The `phonePrompt` and `response` variables are created as implicit globals. They should be declared with `const` to follow JavaScript best practices and avoid polluting the global namespace.
const phonePrompt = `
From the given text, extract the following entities and return a list of them.
Entities to extract: phone numbers.
Text: ${customerServiceEmail}
Return your answer in a list:
`;
const response = await ai.models.generateContent({
model: MODEL_ID,
contents: [phonePrompt],
});
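Because the prompt asks for "a list", the reply is free-form text rather than a JavaScript array. A rough sketch of turning such a reply into an array follows; `parseListReply` and the sample reply are hypothetical, and the one-item-per-line format is an assumption about the model's output, not something the PR guarantees.

```javascript
// Hypothetical post-processing helper, not part of the PR. It assumes the
// model replies with one item per line, optionally prefixed by "-", "*",
// or "1." style markers; real replies may be formatted differently.
function parseListReply(replyText) {
  return replyText
    .split("\n")
    .map((line) => line.replace(/^\s*(?:[-*]|\d+\.)\s+/, "").trim())
    .filter((line) => line.length > 0);
}

// Made-up reply text for illustration only.
const sampleReply = "- 555-0100\n- 555-0199\n";
const phoneNumbers = parseListReply(sampleReply);
console.log(phoneNumbers); // prints [ '555-0100', '555-0199' ]
```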
*/

// [CODE STARTS]
urlText = `
urlPrompt = `
From the given text, extract the following entities and return a list of them.
Entities to extract: URLs.
Text: ${urlText}
Do not duplicate entities.
Return your answer in a markdown format:
`;

response = await ai.models.generateContent({
model: MODEL_ID,
contents: [urlPrompt],
});
The `urlPrompt` and `response` variables are created as implicit globals. They should be declared with `const` to ensure they are block-scoped and to prevent potential issues with global variable conflicts.
const urlPrompt = `
From the given text, extract the following entities and return a list of them.
Entities to extract: URLs.
Text: ${urlText}
Do not duplicate entities.
Return your answer in a markdown format:
`;
const response = await ai.models.generateContent({
model: MODEL_ID,
contents: [urlPrompt],
});
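The prompt asks the model not to duplicate entities, but that instruction is not enforced by anything in the code. A possible local safeguard is sketched below; `uniqueUrls`, the regular expression, and the sample reply are all hypothetical and are not part of the PR.

```javascript
// Hypothetical safeguard, not part of the PR: pulls http(s) URLs out of a
// reply and removes duplicates locally in case the model repeats an entry.
// The pattern is deliberately simple and stops at whitespace, ")", ">", "]".
function uniqueUrls(replyText) {
  const matches = replyText.match(/https?:\/\/[^\s)>\]]+/g) ?? [];
  return [...new Set(matches)]; // a Set keeps first-seen order
}

// Made-up markdown reply for illustration only.
const sampleUrlReply = [
  "- https://example.com/docs",
  "- https://example.com/docs",
  "- [home](https://example.com)",
].join("\n");

const urls = uniqueUrls(sampleUrlReply);
console.log(urls); // prints [ 'https://example.com/docs', 'https://example.com' ]
```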
Hello @Aarchi-07, thank you for your contribution! I believe it would be best to port the notebooks from the
Make sure to also create a README file for that newly created folder, although I would suggest waiting for Giom to approve the creation of a new folder. cc @Giom-V
Hello @andycandy, thanks for your suggestion! All right, I'll wait for the approval.