@i-be-snek
Collaborator

This PR is meant to do two things:

(1) format any left-over files using the pre-commit hook (done automatically)
(2) improve the README, especially after a large number of changes were made to the pipeline

README.md Outdated

#### (Step 1) Raw output
Choose the raw file contains the text you need to process, please use the clear raw file name to indicate your experiment, this name will be used as the output file, the api env you want to use, the decription of the experiment, the prompt category, and the batch file location you want to store the batch file (this is not mandatory, but it's good to check if you create correct batch file)
Choose the raw file that contains the text you need to process. Please use clear raw file names to indicate your experiment. This name will be used as the output file, the api env you want to use, the decription of the experiment, the prompt category, and the batch file location you want to store the batch file (this is not mandatory, but it's good to check if you create correct batch file)
Collaborator Author

@liniiiiii

I don't understand this sentence:

This name will be used as the output file, the api env you want to use, the decription of the experiment, the prompt category, and the batch file location you want to store the batch file (this is not mandatory, but it's good to check if you create correct batch file)

Is it suggesting that the experiment name and description and category will all be the name of the output file?

Maybe adding a pseudo example (or a real example) could help.
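
To make the request concrete, here is a rough sketch of the kind of example I have in mind. The flag names are taken from the `--help` output of `Database/Prompts/run_prompts.py` quoted later in this thread; every value (file name, directories, env file, description) is a hypothetical placeholder, and `build_command` is just an illustrative helper, not something in the repo.

```python
import shlex

# Hypothetical experiment settings: every value is a placeholder.
# Only the flag names come from the script's --help output.
settings = {
    "--filename": "my_experiment_raw.json",
    "--raw_dir": "path/to/raw_dir",
    "--batch_dir": "path/to/batch_dir",  # optional, useful for inspecting the batch file
    "--api_env": "path/to/.env",
    "--description": "short description of the experiment",
    "--prompt_category": "all",
}


def build_command(settings: dict[str, str]) -> list[str]:
    """Turn the settings into an argv list for run_prompts.py."""
    command = ["poetry", "run", "python3", "Database/Prompts/run_prompts.py"]
    for flag, value in settings.items():
        command += [flag, value]
    return command


# Print a shell-safe version of the command that can be copied into a terminal.
print(shlex.join(build_command(settings)))
```

Something like this would show that the file name, API env, description, and prompt category are separate arguments rather than all being part of the output file name.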

Collaborator

Thanks, I will do that. Where can I edit it? In the same branch?

Collaborator Author

You can edit the same branch. I think for READMEs you can even safely edit directly on the GitHub website :D

@i-be-snek added the `documentation` label (Improvements or additions to documentation) on Oct 2, 2024
@liniiiiii
Collaborator

Please keep this PR open for a while. I will check the other READMEs I edited later. Thanks!

```shell
from Database.Prompts.prompts import V_3 as target_prompts
```
##### Step 1: Experiment Settings
Collaborator Author

Wow, this looks great! Thanks :D

One thing that could help the reader is to say that these are the params to pass into `run_prompts.py`.

Suggested change
##### Step 1: Experiment Settings
##### Step 1: Experiment Settings
Here is what you need to begin an experiment run with `Database/Prompts/run_prompts.py`:

4. **Prompt Category**: Indicate the prompt category, such as "all".

5. **Batch File Location** (Optional): Specify where to store the batch file. This helps verify the batch file's creation.

Collaborator Author

Could we add something like this:

# check the args and flags
poetry run python3 Database/Prompts/run_prompts.py --help

Collaborator Author

Output:

wikimpacts-py3.11➜  Wikimpacts git:(drop-l1-missing-all-impacts) ✗ poetry run python3 Database/Prompts/run_prompts.py --help

usage: run_prompts.py [-h] [-f FILENAME] [-r RAW_DIR] [-b BATCH_DIR] [-m MODEL_NAME] [-t MAX_TOKENS] [-e API_ENV] [-d DESCRIPTION] [-p PROMPT_CATEGORY]

options:
  -h, --help            show this help message and exit
  -f FILENAME, --filename FILENAME
                        The name of the json file in the <Wikipedia articles> directory
  -r RAW_DIR, --raw_dir RAW_DIR
                        The directory containing Wikipedia json files to be run
  -b BATCH_DIR, --batch_dir BATCH_DIR
                        The directory where the batch file will land (as .jsonl)
  -m MODEL_NAME, --model_name MODEL_NAME
                        The model version applied in the experiment, like gpt-4o-mini.
  -t MAX_TOKENS, --max_tokens MAX_TOKENS
                        The max tokens of the model selected
  -e API_ENV, --api_env API_ENV
                        The env file that contains the API keys.
  -d DESCRIPTION, --description DESCRIPTION
                        The description of the experiment
  -p PROMPT_CATEGORY, --prompt_category PROMPT_CATEGORY
                        The prompt category of the experiment, can only choose from impact, basic, and all
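
For readers who want to see how this interface maps onto code, below is a minimal `argparse` sketch reconstructed only from the help text above. It is not the actual contents of `run_prompts.py`: the `type=int` on `--max_tokens` and the `choices` restriction on `--prompt_category` are my assumptions (the real help output shows a plain `PROMPT_CATEGORY` metavar, so the script may validate the value differently or not at all), and no defaults are set because the help text does not show any.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser matching the help output above (not the real script)."""
    parser = argparse.ArgumentParser(prog="run_prompts.py")
    parser.add_argument("-f", "--filename", type=str,
                        help="The name of the json file in the <Wikipedia articles> directory")
    parser.add_argument("-r", "--raw_dir", type=str,
                        help="The directory containing Wikipedia json files to be run")
    parser.add_argument("-b", "--batch_dir", type=str,
                        help="The directory where the batch file will land (as .jsonl)")
    parser.add_argument("-m", "--model_name", type=str,
                        help="The model version applied in the experiment, like gpt-4o-mini.")
    # Assumption: max tokens is parsed as an integer.
    parser.add_argument("-t", "--max_tokens", type=int,
                        help="The max tokens of the model selected")
    parser.add_argument("-e", "--api_env", type=str,
                        help="The env file that contains the API keys.")
    parser.add_argument("-d", "--description", type=str,
                        help="The description of the experiment")
    # Assumption: restricting values with `choices`; the real script may not do this.
    parser.add_argument("-p", "--prompt_category", choices=["impact", "basic", "all"],
                        help="The prompt category of the experiment")
    return parser


if __name__ == "__main__":
    # Parse an explicit, hypothetical argument list instead of sys.argv.
    args = build_parser().parse_args(
        ["-f", "my_experiment_raw.json", "-m", "gpt-4o-mini", "-p", "all"]
    )
    print(args.filename, args.model_name, args.prompt_category)
```

With `choices` in place, an unsupported category such as `-p other` makes `argparse` exit with an error before any batch file is created, which is one way the "can only choose from impact, basic, and all" constraint could be enforced.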

Collaborator

Thanks for the suggestion, I will check them out after I fix the visualization!

```

## Quickstart
## Development
Collaborator Author

As per the suggestion from @koffiworou, I've moved the dev doc section further to the top so that users can make sure they have all the basics and dependencies set up before developing.

4 participants