Update file handling POSIX error #719

Open
jaclark5 wants to merge 2 commits into OpenFreeEnergy:main from jaclark5:posix_issue
@jaclark5 jaclark5 commented Jan 28, 2026

When I pass a pathlib PosixPath object to:

protein = ProteinComponent.from_pdb_file(protein_path_obj)

I get the error below. This PR is meant to resolve it and to provide actionable feedback if an unsupported type is passed:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[22], line 2
      1 # Load protein
----> 2 protein = ProteinComponent.from_pdb_file(system.protein)
      3 print(f"Loaded protein: {protein}")
      4 print(f"  Number of atoms: {protein.to_rdkit().GetNumAtoms()}")

File ~/bin/gufe/gufe/components/proteincomponent.py:178, in ProteinComponent.from_pdb_file(cls, pdb_file, name)
    161 @classmethod
    162 def from_pdb_file(cls, pdb_file: PathLike | TextIO, name: str = ""):
    163     """
    164     Create ``ProteinComponent`` from PDB-formatted file.
    165 
   (...)    176         the deserialized molecule
    177     """
--> 178     openmm_PDBFile = PDBFile(pdb_file)
    179     return cls._from_openmmPDBFile(openmm_PDBFile=openmm_PDBFile, name=name)

File ~/bin/gufe/gufe/vendor/pdb_file/pdbfile.py:161, in PDBFile.__init__(self, file, extraParticleIdentifier)
    159     inputfile = open(file)
    160     own_handle = True
--> 161 pdb = PdbStructure(
    162     inputfile,
    163     load_all_models=True,
    164     extraParticleIdentifier=extraParticleIdentifier,
    165 )
    166 if own_handle:
    167     inputfile.close()

File ~/bin/gufe/gufe/vendor/pdb_file/pdbstructure.py:154, in PdbStructure.__init__(self, input_stream, load_all_models, extraParticleIdentifier)
    152 self.modified_residues = []
    153 # read file
--> 154 self._load(input_stream)

File ~/bin/gufe/gufe/vendor/pdb_file/pdbstructure.py:160, in PdbStructure._load(self, input_stream)
    158 self._reset_residue_numbers()
    159 # Read one line at a time
--> 160 for pdb_line in input_stream:
    161     if not isinstance(pdb_line, str):
    162         pdb_line = pdb_line.decode("utf-8")

TypeError: 'PosixPath' object is not iterable

Without this change, I have to work around it by converting the path to a string:

protein = ProteinComponent.from_pdb_file(str(protein_path_obj))

This works, but it's kind of clunky.
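
For illustration, the input handling described above (accept a path object, and give actionable feedback for anything unsupported) could look roughly like the sketch below. The traceback suggests the path object reached the line-by-line reader as if it were an open stream. `open_text_input` is a hypothetical helper written for this example; it is not the code in this PR or in gufe's vendored parser.

```python
import io
import os
from contextlib import contextmanager


@contextmanager
def open_text_input(source):
    """Yield a line-iterable text stream for a path or an already-open file.

    Hypothetical helper for illustration only; not part of gufe.
    """
    if isinstance(source, (str, os.PathLike)):
        # os.fspath() converts pathlib.Path (or any PathLike) to a plain path
        with open(os.fspath(source)) as handle:
            yield handle
    elif hasattr(source, "read"):
        # The caller owns this handle, so it is not closed here
        yield source
    else:
        raise TypeError(
            "Expected a str, os.PathLike, or file-like object, "
            f"got {type(source).__name__}"
        )
```

With a helper like this, a `str`, a `pathlib.Path`, and an `io.StringIO` would all be usable as `with open_text_input(x) as stream: ...`, while something like an `int` would fail early with a message naming the offending type.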

Checklist

  • Added a news entry

Developers certificate of origin

@jaclark5 jaclark5 changed the title from "Update file handling os POSIX" to "Update file handling POSIX error" on Jan 28, 2026
@jaclark5
Author

pre-commit.ci autofix

@github-actions

No API break detected ✅

Member

@IAlibay IAlibay left a comment

I'm not sure we should be doing this in the vendored code rather than in the component. I'll cc @atravitz here for thoughts.
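
As a rough sketch of the component-side alternative, the path could be converted to a string before it is handed to the vendored parser, leaving the vendored code untouched. `normalize_pdb_argument` is a hypothetical name used only for this example; it does not exist in gufe.

```python
import os
from pathlib import Path


def normalize_pdb_argument(pdb_file):
    """Return a plain path for path-like input; pass anything else through.

    Hypothetical helper for illustration only; not part of gufe.
    """
    if isinstance(pdb_file, os.PathLike):
        # pathlib.Path objects implement os.PathLike, so they become str here
        return os.fspath(pdb_file)
    # str paths and open file handles are returned unchanged
    return pdb_file


print(normalize_pdb_argument(Path("protein.pdb")))  # protein.pdb
```

The trade-off is that every entry point that accepts a file would need the same call, whereas a change in the parser covers all callers at once.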

@jaclark5
Author

I guess Chem.SDMolSupplier requires paths to be strings anyway, and that's pretty standard in the openfe workflow. I'm mostly doing this because in OpenFF we generally use POSIX paths, and that's what I'm implementing in my openfe-benchmarks methods... but if it's cleaner to use strings only for openfe (like RDKit), I can scrap this PR and switch to string-only openfe-benchmarks outputs.

@jaclark5
Author

Although I think you supported POSIX paths once upon a time, because one of the old notebooks in openfe-benchmarks read:

protein = gufe.ProteinComponent.from_pdb_file(system_dir.joinpath("protein.pdb").as_posix())
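
One detail that may matter when reading that old notebook line: `as_posix()` returns a plain `str`, so that call would have handed a string, not a path object, to `from_pdb_file`. A minimal check, with a made-up directory name standing in for `system_dir`:

```python
from pathlib import PurePosixPath

# Hypothetical directory, standing in for system_dir in the notebook
system_dir = PurePosixPath("systems/example")
pdb_path = system_dir.joinpath("protein.pdb")

# as_posix() gives the path as a forward-slash string
as_text = pdb_path.as_posix()

print(type(pdb_path).__name__)  # PurePosixPath
print(type(as_text).__name__)   # str
print(as_text)                  # systems/example/protein.pdb
```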

@atravitz
Contributor

We're reevaluating our PDB parser vendoring in the next week or so. @jaclark5, can we follow up on this then, and can you use the str representation until then?

@jaclark5
Author

@atravitz absolutely. I'm not waiting on this, but it's something I noticed so I offered a fix to make the workflow sleeker.
