Conversation
… Also created the encodings.yaml
…ling error. changed e2e debug test to xfail, will update after formatting issue resolved.
…llow custom aspin types
…aframe_e2e.py compatible with the inclusion of the yaml file
…n and removed xfail. added helper function test for load_encodings_yaml().
…into formatting_issue_prototype
…gex to include .h5 for first fmt. added new pytest fixture for spin data with "m"
…rrent filename. Removed _tmp_test from test_paraframe.py file
…and a shorter version in the yaml file. Began working on setting up hallmark to accommodate the temporary paths in the testing.
…aml file, to ensure that duplicate fmts with duplicate sections are not made.
…orary paths for each test function.
…d. Modified the core file to take into account no encodings specified. Updated the tests by adding encoding = True where .parse is called.
…e same location as the data being paraframed. changed yaml name to .hallmark from CK's recommendation.
…the temporary data directory for each testing function. The temporary yaml file now only requires the most fundamental fmt strings as are currently displayed in the .hallmark.yaml file instead of the entire temporary path. All tests pass.
hfoote
left a comment
Let's fix the few things I mentioned in the comments, then get this merged. After we merge this with next, we should be able to use the Repo's functionality to handle the path to the yaml file (look for a .hallmark.yaml file in the data directory by default).
    _user_yaml_path = None


    def set_rel_yaml_path(path):
        global _user_yaml_path
Add an attribute of ParaFrame that stores _user_yaml_path when a ParaFrame is instantiated. That way, the user can set the path with this function between creating different ParaFrames.
    "outputs": [],
    "source": [
    "pf = ParaFrame.parse(\"data/a_{a:d}/b_{b:d}.txt\")"
    "pf = ParaFrame.parse(\"/a_{a:d}/b_{b:d}.txt\")"
A leading / usually indicates the root dir in unix, so it would be better to make this look like a relative path.
    # a user wants to create a paraframe
    fmt = str(create_temp_data / "a_{a:d}/b_{b:d}.txt")
    pf = ParaFrame.parse(fmt)
    fmt = str("/a_{a:d}/b_{b:d}.txt")
Leading / again - let's make this a relative path if possible.
Summary
We separated file acquisition from the original parse function to make the code more modular. We added a YAML configuration file that collects the format strings a user may need to modify in one document. An ‘encoding’ key in the YAML file lets the user specify the desired format substitution, and new code in the core file carries out these substitutions so that the ParaFrame object reflects the modifications.
Details
We split the parse function into two parts: glob search, which searches for files in the specified directory and returns the pattern used for parsing, and parse, which parses the matched files and places the results in the ParaFrame object. We added the YAML file, which lets the user specify the directory to work on and the directory path that they want to modify. In this file, the user can also specify how the modification is applied, generally through a regex substitution. Alternatively, the user can pass the format string directly into the function if the path requires no modification. In the core.py file, we added code to the ‘parse’ function that carries out the desired regex substitution so that the ParaFrame object contains the modified characters. The functions that interact with the YAML file and perform the regex substitution are located in the helperfunctions.py file.
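As a rough illustration of the kind of regex substitution described here, the sketch below applies per-column rules to parsed values. The `encodings` dictionary stands in for what might be read from the YAML file, and both its layout and the `apply_encodings` helper are assumptions for illustration, not the code in helperfunctions.py.

```python
import re

# Hypothetical encodings, standing in for values loaded from the YAML file:
# column name -> {regex pattern: replacement}
encodings = {"aspin": {r"^m": "-", r"^\+": ""}}


def apply_encodings(row, encodings):
    """Return a copy of `row` with the regex rules applied to each listed column."""
    out = dict(row)
    for column, rules in encodings.items():
        if column not in out:
            continue  # no rule applies to a column the row does not have
        value = str(out[column])
        for pattern, replacement in rules.items():
            value = re.sub(pattern, replacement, value)
        out[column] = value
    return out


# Values as they might appear in directory names such as a_m0.5 or a_+0.94.
rows = [{"aspin": "m0.5"}, {"aspin": "+0.94"}, {"aspin": "0"}]
decoded = [apply_encodings(r, encodings) for r in rows]
print(decoded)
```

Keeping the rules in a plain mapping means a value with no matching pattern passes through unchanged, which mirrors the option of using a format string without any modification.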
Testing
We added several tests that create temporary directories whose format strings contain special cases requiring regex substitution, namely ‘+’ and ‘m’ in the directory path. We also create a temporary YAML file in the temporary directory and manually write into it the path that the test functions need, so that each test is set up the same way.
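A setup of this kind might look roughly like the sketch below. The helper name `make_temp_data`, the directory names, and the YAML keys are all hypothetical and do not reproduce the actual test code. The helper only needs a directory, so it could equally be called from a pytest fixture that receives `tmp_path`.

```python
import tempfile
from pathlib import Path

# Hypothetical YAML contents; the real .hallmark.yaml layout may differ.
YAML_TEXT = r"""fmt: 'a_{a}/b_{b:d}.txt'
encodings:
  a:
    '^m': '-'
    '^\+': ''
"""


def make_temp_data(root):
    """Create data files with 'm' and '+' in their paths, plus a YAML file."""
    root = Path(root)
    for a in ("m0.5", "+0.5"):
        subdir = root / f"a_{a}"
        subdir.mkdir(parents=True)
        for b in (10, 20):
            (subdir / f"b_{b}.txt").write_text("")
    yaml_path = root / ".hallmark.yaml"
    yaml_path.write_text(YAML_TEXT)
    return yaml_path


# Each test gets its own directory, which is removed when the block exits.
with tempfile.TemporaryDirectory() as tmp:
    yaml_path = make_temp_data(tmp)
    yaml_exists = yaml_path.exists()
    created = sorted(
        p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*.txt")
    )

print(created)
```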
Follow-on
Future modifications include adding example use cases for the 'encodings' regex substitution to the demo notebook. The next iteration of hallmark will develop a command-line interface to give users more flexible access to the package. We will also explore further whether one or multiple YAML files better serve different user stories.