Fix insertion #93
base: main
… keys in manifest_2020.yaml.
…thout creating anything yet)
…ntity resolver services for cleaner code
Summary of Changes
Hello @zaeema-n, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly overhauls the data management system by flattening the 2020 data file structure and centralizing its logical organization within a YAML manifest. It introduces a robust ingestion framework capable of parsing this new manifest, resolving governmental entities against a database, and laying the groundwork for automated data insertion. This foundational change aims to improve data accessibility and streamline future data processing workflows.
Code Review
This pull request introduces a significant refactoring by moving from a hierarchical file structure to a flat one, managed by a new YAML manifest. It also adds the corresponding Python scripts to parse this new structure and resolve entities against the database. The new code is well-structured, using async practices and separating concerns into different services. However, I've identified a critical resource leak related to an unclosed aiohttp.ClientSession and a high-severity performance issue in the entity resolution logic that could lead to an excessive number of API calls. I've also included several medium-severity suggestions to improve code clarity, maintainability, and project structure. Addressing these points will greatly improve the robustness and quality of the new ingestion process.
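A minimal sketch of the two main fixes suggested above, assuming a resolver that owns its HTTP session: the session is closed on exit through an async context manager, and lookups are cached so each entity name triggers at most one API call. `EntityResolver`, `FakeSession`, `fetch_entity`, and `resolve_all` are hypothetical names for illustration, not the PR's actual code.

```python
import asyncio


class FakeSession:
    """Stand-in for an HTTP client session (hypothetical, for illustration)."""

    def __init__(self):
        self.closed = False
        self.calls = 0

    async def fetch_entity(self, name):
        # Pretend to call the entity API; count calls to show the cache works.
        self.calls += 1
        await asyncio.sleep(0)
        return {"name": name, "id": f"id-{name}"}

    async def close(self):
        self.closed = True


class EntityResolver:
    """Owns a session, closes it on exit, and caches resolved entities."""

    def __init__(self, session):
        self._session = session
        self._cache = {}

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Always release the session, even if resolution raised.
        await self._session.close()

    async def resolve(self, name):
        # Only hit the API the first time a name is seen.
        if name not in self._cache:
            self._cache[name] = await self._session.fetch_entity(name)
        return self._cache[name]


async def resolve_all(names, session):
    # The session is closed as soon as this block exits.
    async with EntityResolver(session) as resolver:
        return [await resolver.resolve(n) for n in names]
```

With aiohttp, an `aiohttp.ClientSession` would take the place of the stand-in session, and `fetch_entity` would wrap a real request; the lifecycle and caching pattern stays the same.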
ChanukaUOJ left a comment
Added some comments! Looks good to me!
Why does this separate environment.yml file exist?
/gemini review
Code Review
This pull request flattens the 2020 hierarchical file structure into a flat one and creates a YAML file to store the hierarchies. It also adds code to traverse the file structure and correctly select the minister and department from the database. The changes include modifications to .gitignore, removal of a JSON file, addition of several new JSON files, renaming of some JSON files, addition of a YAML file (manifest_2020.yaml), modification of environment.yml, and addition of several Python files related to ingestion and exception handling. The review focuses on identifying potential issues related to code correctness and maintainability, particularly in the newly added Python files.
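The traversal described above can be sketched as follows, assuming the manifest maps each minister to its departments and each department to a list of flat file names. The keys (`ministers`, `departments`, `files`) and the sample names are hypothetical; the real schema of manifest_2020.yaml is not shown in this thread. The manifest is given as an already-parsed dict, as a YAML loader would return it.

```python
def iter_manifest(manifest):
    """Yield (minister, department, file_name) for every file in the manifest."""
    for minister in manifest.get("ministers", []):
        for department in minister.get("departments", []):
            for file_name in department.get("files", []):
                yield minister["name"], department["name"], file_name


# Hypothetical manifest content, as a YAML loader might return it.
SAMPLE_MANIFEST = {
    "ministers": [
        {
            "name": "Minister A",
            "departments": [
                {"name": "Department X", "files": ["x_1.json", "x_2.json"]},
                {"name": "Department Y", "files": ["y_1.json"]},
            ],
        }
    ]
}

rows = list(iter_manifest(SAMPLE_MANIFEST))
```

Keeping the hierarchy in the manifest means the files themselves can live in one flat directory, while each yielded tuple still carries the minister and department needed to look the entities up in the database.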
This PR:
Note that this PR does not yet insert any categories or datasets.