This repository contains tools and frameworks for developing and training AI-agents for genetic engineering workflows.
The ModelClient provides an interface to an LLM of choice (currently only from openai or a local model via vllm).
State Management: The system uses a SessionState object to manage the context (like the currently selected library, custom UCF files, etc.) across multiple tool calls within a single user design request..
Several tools will be used to design and optimize genetic circuits. These tools are be wrapped in an integration layer which will be used by the agent. Listed below are some of the key tools that will be used:
- Cello - includes library management tools for adding/editing parts in a library files and a tool for kicking off Cello's design algorithm
- SynBioHub - includes tools to search and download parts from SynBioHub
- Part Optimization - ProD for promoter design, RBS Calculator for RBS design
- Scientific utilities - Count GC in a sequence, run sequence similarity, literature search
A Workflow defines:
- A natural language
promptdescribing a task - A
get_metricsfunctions for computing the metrics based on the generated chat history and session state. This can be used for computing arewardorscorefor various forms of RL via policy-optimization. - A
check_finishedfunction for optionally terminating the chat stream upon some condition - An optional
GRADING_RUBRIC- instructions for assigning a grade/reward to the rolled our workflow
Workflows can be run which coordinates a multi-turn conversations with the ModelClient. Upon finishing, get_metrics() can be used.
ArtAdapter
- An
ArtAdapterobject wraps theWorkflowand exposes theasyncrolloutfunction. This allows multiple runs to execute in parallel and can be dropped intoarttraining scheme to fine-tune an LLM via GRPO. - The optional
GRADING_RUBRICcan be used in conjunction with LLM judges to compute reward (e.g. ART's RULER)
Examples of prompts can be found in examples_and_prompts. These range from simple to complex to aspirational and are designed to guide the implementation of this project and outline the vision of more sophisticated systems.
The workflows provide a convenient means of "rolling" out scenarios and comparing outcomes. Their primary intention is to easily enable RL-training on the various implemented tools.
A basic streamlit UI allows for interactive chat sessions with an LLM agent (openai and models served via vllm supported)
Example scripts are provided in the examples directory.
- git clone the repo
cd geneforgevirtualenv venv --python=3.12# create a virtual env -- Important: Use python version <3.13source venv/bin/activate# activate the environmentpip install -r requirements.txt# install project requirementsmkdir ext_repos# create a directory to hold external reposcd ext_repos# cd into the created repogit clone https://github.com/CIDARLAB/Cello-UCF.git# clone cello libsgit clone https://github.com/CIDARLAB/Cello-v2-1-Core.git# clone cellogit clone https://github.com/barricklab/promoter-calculator.git# clone promoter calculatorpip install -r ext_repos/Cello-v2-1-Core/requirements.txt# install cello requirementspip install -r ext_repos/Cello-UCF/requirements.txt# install Cello-UCF requirementspip install -r ext_repos/promoter-calculator/requirements.txt# install promoter-calculator requirementscd ..# cd back into project root (geneforge directory)mkdir logs# create the logs foldertouch .env# create file.envin geneforge folder (project root) to hold environment variables- Add the following keys to
.env:
OPENAI_API_KEY={Your open ai api key} # this or deepseek api key required to use LLMs
DEEPSEEK_API_KEY={Your deepseek api key (if using deepseek)} # not required
DEEPSEEK_BASE_URL=https://api.deepseek.com
PROMOTER_CALCULATOR_PATH=ext_repos/promoter-calculator/promoter-calculator
CELLO_UCF_ROOT=ext_repos/Cello-UCF
CELLO_ROOT=ext_repos/Cello-v2-1-Core
- For using a debugger such as in VS Code or Cursor, set the PYTHONPATH in the config to the project root:
"env": {
"PYTHONPATH": "${workspaceFolder}"
},
- Test the setup by running
python src/examples/agent/design_simple_circuit.pyfromgeneforgedirectory.
Tools/Core
RL/Fine-tuning
- openai
- art
- GRPO
Verilog Generation
- CodeV: Empowering LLMs for Verilog Generation through Multi-Level Summarization Paper
- RTLCoder: Fully Open-Source and Efficient LLM-Assisted RTL Code Generation Technique Paper Model
Other