Pull request overview
This PR refactors CENSO’s molecule/geometry data model toward typed atom/coordinate structures with new constructors and file-update helpers, while also improving parallel OpenMP balancing and adding a periodic table mapping to support the new constructors.
Changes:
- Refactor `GeometryData`/`MoleculeData` to use `Atom` objects and typed coordinate tuples; add `from_xyz`/`from_mol` constructors and deprecate legacy string-list inputs.
- Replace legacy geometry update methods (`fromxyz`/`fromcoord`) with `update_from_xyz_file`/`update_from_coord_file` (keeping deprecated wrappers) and update processors to use the new APIs.
- Improve load-balancing logic in `set_omp` and add a `PSE` atomic-number→symbol mapping in `params.py`.
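As a rough illustration of the data model these bullets describe, the sketch below shows an atom type with a fixed-length coordinate tuple, built either from an xyz-style line or from an atomic number. It is not CENSO's code: the fields of `Atom`, the helper names `atom_from_xyz_line` and `atom_from_number`, and the three-entry `PSE` excerpt are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical excerpt of an atomic-number -> symbol mapping; the real PSE in
# params.py presumably covers the whole periodic table.
PSE = {1: "H", 6: "C", 8: "O"}


@dataclass(frozen=True)
class Atom:
    element: str
    xyz: tuple[float, float, float]  # fixed length, unlike list[float]


def atom_from_xyz_line(line: str) -> Atom:
    """Parse one 'El x y z' line, validating the number of fields."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 'El x y z', got: {line!r}")
    x, y, z = (float(v) for v in fields[1:])  # cast strings to floats
    return Atom(fields[0], (x, y, z))


def atom_from_number(number: int, xyz: tuple[float, float, float]) -> Atom:
    """Build an atom from its atomic number via the PSE lookup."""
    return Atom(PSE[number], xyz)
```

With these definitions, `atom_from_xyz_line("O 0.0 0.0 0.1173")` and `atom_from_number(8, (0.0, 0.0, 0.1173))` produce equal `Atom` values, while a line with a missing coordinate raises `ValueError` instead of yielding a short list.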
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| test/unit/test_parallel.py | Updates unit tests to use MoleculeData.from_xyz after constructor refactor. |
| src/censo/processing/tm_processor.py | Switches to update_from_coord_file for reading optimized geometries. |
| src/censo/processing/qm_processor.py | Makes QmProc explicitly abstract via ABCMeta. |
| src/censo/processing/processor.py | Makes GenericProc an ABC so abstract methods are enforced. |
| src/censo/processing/orca_processor.py | Switches to update_from_xyz_file/update_from_coord_file for geometry updates. |
| src/censo/processing/job.py | Modernizes type annotation for mo_guess using ` |
| src/censo/params.py | Adds PSE mapping (atomic number → element symbol). |
| src/censo/parallel.py | Refines set_omp balancing logic to reduce idle cores and handle edge cases. |
| src/censo/molecules.py | Major data model refactor: Atom, typed coordinates, new constructors, and deprecated wrappers. |
| src/censo/ensemble.py | Updates conformer creation to use MoleculeData.from_xyz. |
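The two processor rows rely on standard `abc` behaviour: a class that derives from `ABC` (or uses `ABCMeta` as its metaclass) cannot be instantiated while any `@abstractmethod` is left unimplemented. A minimal sketch, using hypothetical class and method names (`BaseProc`, `run`, `EchoProc`) rather than CENSO's actual ones:

```python
from abc import ABC, abstractmethod


class BaseProc(ABC):  # hypothetical stand-in for a processor base class
    @abstractmethod
    def run(self, job: str) -> str:
        """Subclasses must implement this."""


class EchoProc(BaseProc):  # hypothetical concrete processor
    def run(self, job: str) -> str:
        return f"ran {job}"


def can_instantiate(cls: type) -> bool:
    """Return True if cls() succeeds, False if abstract methods block it."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Here `can_instantiate(BaseProc)` is `False` and `can_instantiate(EchoProc)` is `True`. Without the `ABC` base, the `@abstractmethod` decorator alone would not prevent instantiation, which is presumably the gap these two changes close.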
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This pull request introduces substantial improvements to the handling of molecular geometry and data structures, with a strong emphasis on type safety, extensibility, and backwards compatibility. The changes refactor the way geometries and molecules are represented and instantiated, add new utility methods for conversion between formats, and enhance documentation for clarity. Additionally, there are improvements to parallel job assignment logic and the addition of a periodic table mapping. Below are the most important changes grouped by theme.
Geometry and Molecule Data Refactoring
- Refactored `GeometryData` and `MoleculeData` to use explicit types (`Atom`, tuples for coordinates) and added constructors (`from_xyz`, `from_mol`) for safer and more flexible instantiation from xyz lines or from atomic numbers and coordinates. Deprecated the old list-of-strings interfaces, which now emit warnings but remain available for backwards compatibility.
- Updated the conversion methods (`toorca`, `toxyz`, `tocoord`, etc.) to provide clearer documentation and ensure proper handling of coordinate formats and units. Added new methods for updating geometry from files and deprecated the old ones with warnings.
Type Safety and Backwards Compatibility
- Changed `Atom.xyz` from a `list[float]` to a `tuple[float, float, float]` for better type safety and consistency. Added casting and validation logic to ensure correct parsing of input lines.
Utility and Infrastructure
- Added a periodic table mapping (`PSE`) to `params.py` for converting atomic numbers to element symbols, supporting the new geometry constructors and improving code clarity.
- Improved `set_omp` to handle edge cases more robustly, minimizing idle cores and ensuring efficient resource allocation.
Minor Enhancements
- Updated `ensemble.py` to use the new `MoleculeData.from_xyz` constructor, aligning with the refactored data model.
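The deprecation approach mentioned under Geometry and Molecule Data Refactoring (old entry points kept as thin wrappers that warn) is commonly written as shown below. This is only a sketch under assumptions: the method names `fromxyz` and `update_from_xyz_file` follow the PR text, but the `Geometry` class, its `atoms` attribute, the parsing details, and the warning message are invented for illustration.

```python
import warnings
from pathlib import Path


class Geometry:  # hypothetical, simplified stand-in for GeometryData
    def __init__(self) -> None:
        self.atoms: list[tuple[str, tuple[float, float, float]]] = []

    def update_from_xyz_file(self, path: str) -> None:
        """Replace the geometry with the atoms listed in an xyz file."""
        lines = Path(path).read_text().splitlines()
        natoms = int(lines[0])
        atoms = []
        for line in lines[2 : 2 + natoms]:  # skip count and comment lines
            el, x, y, z = line.split()
            atoms.append((el, (float(x), float(y), float(z))))
        self.atoms = atoms

    def fromxyz(self, path: str) -> None:
        """Deprecated alias kept for backwards compatibility."""
        warnings.warn(
            "fromxyz is deprecated; use update_from_xyz_file",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller
        )
        self.update_from_xyz_file(path)
```

Existing callers of `fromxyz` keep working and see a `DeprecationWarning` pointing at their own call site, while new code calls `update_from_xyz_file` directly.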