INTPYTHON-501 INTPYTHON-784 Create Graph Visualizations for GraphRAG #235
base: main
This reverts commit 82ec17a.
…just typing of langchain-mongodb
…and cleaned up closing clients (langchain-ai#232)

[Issue Key](https://jira.mongodb.org/browse/{ISSUE_KEY})

This is a small bugfix. When we switched from including OPENAI_API_KEY in CI secrets to using AZURE_OPENAI_ENDPOINTS, we missed this one.

- Tests are run in github and evergreen. A passing patch build will be included.

- [ na] Did you update the changelog (if necessary)?
- [X] Is the intention of the code captured in relevant tests?
- [na] If there are new TODOs, has a related JIRA ticket been created?
- [X] Has a MongoDB Employee run [the patch build of this PR](https://github.com/mongodb-labs/ai-ml-pipeline-testing?tab=readme-ov-file#running-a-patch-build-of-a-given-pr)?
- [x] Does the title of the PR reference a JIRA Ticket?
- [x] Do you fully understand the implementation? (Would you be comfortable explaining how this code works to someone else?)
- [x] Have you checked for spelling & grammar errors?
- [x] Is all relevant documentation (README or docstring) updated?

The author had thought of updating MongoDBGraphStore to be a context manager. All this would do is close the client. We decided to leave it with a close method only.
…ere, and cleared output of cells.
This PR replaces #227, which I will close. That one confused the git history.
libs/langchain-mongodb/tests/integration_tests/test_graphrag.py
Looks like the lock file needs updating according to pre-commit.

Now it appears that we're seeing failures come in from INTPYTHON-798 as the retrievers now live in langchain-classic (I believe). I'm surprised that this problem isn't more widespread.
LGTM!
    if TYPE_CHECKING:
        try:
            from typing import TypeAlias  # type:ignore[attr-defined]  # Python 3.10+
In a separate ticket: should we be bumping our minimum Python version to 3.10 to align with PyMongo?
    # First pass: Add all nodes with their attributes
    nx_graph = nx.DiGraph(**nx_opts)
    for doc in self.collection.find({}):
Can we have this find call project out the fields we're accessing? Or will the data always be shaped such that only _id, type, and attributes will be present?
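A minimal sketch of what this suggestion could look like, not the code in this PR. It assumes the graph-building loop only reads `_id`, `type`, `attributes`, and a `relationships` field (the last name is an assumption about the document schema). `FakeCollection` is a hypothetical in-memory stand-in so the sketch runs without a database; with a real pymongo collection the same `find(filter, projection)` call shape applies. The `embedding` field is likewise a made-up example of data the loop never needs.

```python
from typing import Any, Dict, Iterable, Iterator, Optional, Tuple

# Fields the graph-building loop reads. `relationships` is an assumed field name.
PROJECTION = {"_id": 1, "type": 1, "attributes": 1, "relationships": 1}


class FakeCollection:
    """Hypothetical stand-in for a pymongo Collection.

    Supports only inclusion projections and ignores the filter argument,
    which is enough for the `find({}, projection)` call shown here.
    """

    def __init__(self, docs: Iterable[Dict[str, Any]]) -> None:
        self._docs = list(docs)

    def find(
        self,
        filter: Optional[Dict[str, Any]] = None,
        projection: Optional[Dict[str, int]] = None,
    ) -> Iterator[Dict[str, Any]]:
        for doc in self._docs:
            if projection is None:
                yield dict(doc)
            else:
                # Keep only the fields marked for inclusion.
                yield {k: v for k, v in doc.items() if projection.get(k)}


def iter_nodes(collection: Any) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (node_id, attrs) pairs, fetching only the projected fields."""
    for doc in collection.find({}, PROJECTION):
        attrs = dict(doc.get("attributes", {}))
        attrs["type"] = doc.get("type")
        yield doc["_id"], attrs


collection = FakeCollection(
    [
        {"_id": "Alice", "type": "Person",
         "attributes": {"role": ["Engineer"]}, "embedding": [0.1, 0.2]},
        {"_id": "ACME", "type": "Organization",
         "attributes": {}, "embedding": [0.3, 0.4]},
    ]
)
nodes = dict(iter_nodes(collection))
```

If the stored documents only ever contain the projected fields, the projection changes nothing; it mainly guards against extra fields being pulled over the wire.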
        nx_graph.add_node(node_id, **node_attrs, **json_opts)

    # Second pass: Add edges based on relationships
    for doc in self.collection.find({}):
Do we expect the data being queried to be large enough that we can't store the retrieved documents in memory to avoid a second find query here?
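One way the single-query alternative could look, as a sketch rather than the PR's implementation. It materializes the cursor once and runs both passes over the in-memory list. Plain dicts and tuples stand in for the `nx.DiGraph`, and the `relationships` layout (`target_ids` and `types` as parallel lists) is an assumption about the schema. Skipping edges whose target is not a known node is a choice made for this sketch only.

```python
from typing import Any, Dict, Iterable, List, Tuple

Nodes = Dict[str, Dict[str, Any]]
Edges = List[Tuple[str, str, str]]


def build_graph(docs_iter: Iterable[Dict[str, Any]]) -> Tuple[Nodes, Edges]:
    """Build nodes and edges from one pass over the data source."""
    # Materialize once: this is the only place the source is consumed.
    docs = list(docs_iter)

    # First pass: add all nodes with their attributes.
    nodes: Nodes = {}
    for doc in docs:
        nodes[doc["_id"]] = {"type": doc.get("type"), **doc.get("attributes", {})}

    # Second pass: add edges, now that every node is known.
    edges: Edges = []
    for doc in docs:
        rel = doc.get("relationships", {})  # assumed field name and layout
        targets = rel.get("target_ids", [])
        types = rel.get("types", [])
        for target, rel_type in zip(targets, types):
            if target in nodes:  # sketch-only choice: drop dangling edges
                edges.append((doc["_id"], target, rel_type))
    return nodes, edges


docs = [
    {"_id": "Alice", "type": "Person", "attributes": {"role": ["Engineer"]},
     "relationships": {"target_ids": ["ACME", "Nowhere"],
                       "types": ["works_at", "lives_in"]}},
    {"_id": "ACME", "type": "Organization", "attributes": {}},
]
# A one-shot iterator mimics a cursor that can only be consumed once.
nodes, edges = build_graph(iter(docs))
```

The trade-off is the one the question raises: one round trip instead of two, at the cost of holding every retrieved document in memory at once.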
    Both of these libraries are incredibly feature rich.
    We encourage those interested in visualization and/or graph analysis
    to dig deeper into their documentation.
    The customization options are truly stunning.
This feels out-of-place for a docstring.
Issue Key
Summary
This PR adds graphical visualization of our MongoDBGraphStore, used for Knowledge Graphs and Graph RAG. It also adds an example notebook, and improvements to the prompting.
Changes in this PR
Added view and to_networkx methods to MongoDBGraphStore.
Test Plan
Unit tests were added for the new methods. Typing tests were run. The notebook was demonstrated numerous times to the maintainers and stakeholders.
Checklist
Checklist for Author
Checklist for Reviewer {@primary_reviewer}
Focus Areas for Reviewer (optional)