A Python sandbox project to explore and experiment with text generation using Large Language Models (LLMs).
The repository is structured as follows:
pytorch-project/
├── data/ # Directory to store chat JSON files
├── src/
│ ├── __init__.py # Shared code
│ ├── *.py # Entrypoints for model demonstration
├── requirements*.txt # Python dependencies
├── setup-venv.sh # Setup script for the .venv
- Python Version: Ensure Python version 3.10.x or 3.11.x is installed. (Python versions >= 3.13 are currently not compatible with PyTorch.)
- A performant GPU with CUDA support is recommended, but you can fall back to the CPU.
- Clone this repository:
  $ git clone <repository-url>
- Set up a virtual environment and install dependencies:
  $ bash setup-venv.sh
- Create a .env file with Hugging Face credentials.
This project uses the transformers Python package to download and run pre-trained models. To ensure smooth functionality:
- Obtain an access token from your Hugging Face account.
- Create a .env file in the root directory with the following content:
  HF_TOKEN=<your access token>
  HF_HOME=.cache/
Ensure that:
- You have requested gated model access where required (e.g., for models like LLaMA).
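How the demo scripts read the .env file is not described here, so the following is only a minimal, standard-library sketch of one way to do it. The helper names `load_env_file` and `apply_env` are hypothetical and not part of this repository. `HF_TOKEN` and `HF_HOME` are the environment variables read by the Hugging Face libraries, so they would most likely need to be exported before transformers is imported.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values  # a missing file simply yields no values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def apply_env(values: dict[str, str]) -> None:
    """Export parsed values without overriding variables that are already set."""
    for key, value in values.items():
        os.environ.setdefault(key, value)


if __name__ == "__main__":
    apply_env(load_env_file(".env"))
    print("HF_TOKEN set:", "HF_TOKEN" in os.environ)
```

A dedicated package such as python-dotenv covers more of the .env syntax (for example `export` prefixes and variable expansion) and may be the better choice if it is already among the dependencies.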
Once all dependencies are set up, you can run any of the demo scripts located in the src directory:
$ python src/<script-name>.py
For example:
$ python src/recurrentgemma.py
Note: The file __init__.py contains shared code and is not meant to be executed directly.
The data/ directory contains .json5 files, which store structured chat-like message interactions. The following schema is used for these files:
[
{
"role": "system" | "user" | "assistant",
"content": "message text"
},
...
]
- Roles:
  - system: Defines the model's behavior or initial instructions. Not every model supports system roles (e.g., src/gemma-*.py and src/recurrentgemma.py do not use these entries).
  - user: Represents user-supplied input for the model.
  - assistant: Represents the model-generated output in response to the user.
- Content: The content field contains the text for each message, whether it's input or output.
- A chat session may optionally begin with a system role.
- The first non-system message must have the role user.
- Messages should alternate between the user and assistant roles.
- To use a demo, the final entry in the JSON file must be a user message, which serves as the input request for the model.
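The ordering rules above can be checked before a chat file is handed to a model. The sketch below is one possible validator. The function name `validate_chat` is hypothetical, and the function works on an already-parsed list of messages, since how the scripts load the .json5 files is not specified here.

```python
def validate_chat(messages: list[dict]) -> list[str]:
    """Return a list of rule violations; an empty list means the chat is usable."""
    errors: list[str] = []
    allowed_roles = {"system", "user", "assistant"}

    # Schema check: every entry needs a known role and string content.
    for i, msg in enumerate(messages):
        if msg.get("role") not in allowed_roles:
            errors.append(f"entry {i}: unknown role {msg.get('role')!r}")
        if not isinstance(msg.get("content"), str):
            errors.append(f"entry {i}: content must be a string")
    if errors:
        return errors

    # A system message is only allowed as the very first entry.
    for i, msg in enumerate(messages[1:], start=1):
        if msg["role"] == "system":
            errors.append(f"entry {i}: system message must come first")

    turns = [m["role"] for m in messages if m["role"] != "system"]
    if not turns:
        return errors + ["chat contains no user message"]
    if turns[0] != "user":
        errors.append("first non-system message must have the role user")
    # Consecutive turns must switch between user and assistant.
    for prev, cur in zip(turns, turns[1:]):
        if prev == cur:
            errors.append(f"roles must alternate, found two {cur} messages in a row")
    if turns[-1] != "user":
        errors.append("final entry must be a user message")
    return errors


if __name__ == "__main__":
    chat = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi, how can I help?"},
        {"role": "user", "content": "Tell me a joke."},
    ]
    print(validate_chat(chat) or "chat is valid")
```

Returning all violations at once, rather than raising on the first one, makes it easier to fix a hand-edited chat file in a single pass.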
Models in this project support chat templates, which automatically convert the chat messages into the prompt format each model expects.
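To illustrate what a chat template does, here is a toy renderer that flattens a message list into a single prompt string. The `<|role|>` marker syntax and the name `render_toy_template` are invented for this sketch and do not match any real model. In transformers, the actual template ships with each model's tokenizer and is applied with `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`.

```python
def render_toy_template(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Flatten chat messages into one prompt string using made-up role markers."""
    parts = [f"<|{msg['role']}|>\n{msg['content']}\n" for msg in messages]
    if add_generation_prompt:
        # An open assistant marker tells the model that it should answer next.
        parts.append("<|assistant|>\n")
    return "".join(parts)


if __name__ == "__main__":
    chat = [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi, how can I help?"},
        {"role": "user", "content": "Tell me a joke."},
    ]
    print(render_toy_template(chat))
```

The trailing assistant marker is why the final entry of a chat file has to be a user message: the model continues the text from that open marker.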
- Model Support:
  - Models behind gated access (e.g., LLaMA) require prior approval via the Hugging Face model repository.
  - Refer to the specific demo script for details on model compatibility and features.
- Errors: For Torch not compiled with ... or similar issues, ensure PyTorch is installed for your system configuration using this guide.
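Errors of this kind typically appear when a script requests a CUDA device on a PyTorch build without CUDA support. A small guard like the one below, offered as a sketch rather than as the way the demo scripts select their device, avoids that by checking `torch.cuda.is_available()` first. The helper name `pick_device` is hypothetical, and returning "cpu" when PyTorch is not installed is a choice made only for this example.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-enabled PyTorch build sees a GPU, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed at all; report the CPU fallback.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


if __name__ == "__main__":
    print("Using device:", pick_device())
```

If this prints cpu on a machine with an NVIDIA GPU, the installed PyTorch build is probably a CPU-only one and needs to be reinstalled for the matching CUDA version.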