Gene Forge: Automated Genetic Circuit Design and Optimization

This repository contains tools and frameworks for developing and training AI-agents for genetic engineering workflows.

ModelClient

The ModelClient provides an interface to an LLM of choice (currently only from openai or a local model via vllm).

State Management: The system uses a SessionState object to manage the context (like the currently selected library, custom UCF files, etc.) across multiple tool calls within a single user design request..

Tool Use

Several tools will be used to design and optimize genetic circuits. These tools are be wrapped in an integration layer which will be used by the agent. Listed below are some of the key tools that will be used:

Cello - includes library management tools for adding/editing parts in a library files and a tool for kicking off Cello's design algorithm
SynBioHub - includes tools to search and download parts from SynBioHub
Part Optimization - ProD for promoter design, RBS Calculator for RBS design
Scientific utilities - Count GC in a sequence, run sequence similarity, literature search

Workflows

A Workflow defines:

A natural language prompt describing a task
A get_metrics functions for computing the metrics based on the generated chat history and session state. This can be used for computing a reward or score for various forms of RL via policy-optimization.
A check_finished function for optionally terminating the chat stream upon some condition
An optional GRADING_RUBRIC - instructions for assigning a grade/reward to the rolled our workflow

Workflows can be run which coordinates a multi-turn conversations with the ModelClient. Upon finishing, get_metrics() can be used.

ArtAdapter

An ArtAdapter object wraps the Workflow and exposes the async rollout function. This allows multiple runs to execute in parallel and can be dropped into art training scheme to fine-tune an LLM via GRPO.
The optional GRADING_RUBRIC can be used in conjunction with LLM judges to compute reward (e.g. ART's RULER)

Examples and use cases:

Examples of prompts can be found in examples_and_prompts. These range from simple to complex to aspirational and are designed to guide the implementation of this project and outline the vision of more sophisticated systems.

Fine-tuning

The workflows provide a convenient means of "rolling" out scenarios and comparing outcomes. Their primary intention is to easily enable RL-training on the various implemented tools.

UI

A basic streamlit UI allows for interactive chat sessions with an LLM agent (openai and models served via vllm supported)

Examples

Example scripts are provided in the examples directory.

Minimal Setup steps

git clone the repo
cd geneforge
virtualenv venv --python=3.12 # create a virtual env -- Important: Use python version <3.13
source venv/bin/activate # activate the environment
pip install -r requirements.txt # install project requirements
mkdir ext_repos # create a directory to hold external repos
cd ext_repos # cd into the created repo
git clone https://github.com/CIDARLAB/Cello-UCF.git # clone cello libs
git clone https://github.com/CIDARLAB/Cello-v2-1-Core.git # clone cello
git clone https://github.com/barricklab/promoter-calculator.git # clone promoter calculator
pip install -r ext_repos/Cello-v2-1-Core/requirements.txt # install cello requirements
pip install -r ext_repos/Cello-UCF/requirements.txt # install Cello-UCF requirements
pip install -r ext_repos/promoter-calculator/requirements.txt # install promoter-calculator requirements
cd .. # cd back into project root (geneforge directory)
mkdir logs # create the logs folder
touch .env # create file .env in geneforge folder (project root) to hold environment variables
Add the following keys to .env:

OPENAI_API_KEY={Your open ai api key}   # this or deepseek api key required to use LLMs
DEEPSEEK_API_KEY={Your deepseek api key (if using deepseek)} # not required
DEEPSEEK_BASE_URL=https://api.deepseek.com

PROMOTER_CALCULATOR_PATH=ext_repos/promoter-calculator/promoter-calculator
CELLO_UCF_ROOT=ext_repos/Cello-UCF
CELLO_ROOT=ext_repos/Cello-v2-1-Core

For using a debugger such as in VS Code or Cursor, set the PYTHONPATH in the config to the project root:

      "env": {
            "PYTHONPATH": "${workspaceFolder}"
      },

Test the setup by running python src/examples/agent/design_simple_circuit.py from geneforge directory.

Notable References/Links

Tools/Core

Cello Github
Cello Libs Github
Promoter Calculator Github

RL/Fine-tuning

openai
art
GRPO

Verilog Generation

CodeV: Empowering LLMs for Verilog Generation through Multi-Level Summarization Paper
RTLCoder: Fully Open-Source and Efficient LLM-Assisted RTL Code Generation Technique Paper Model

Other

Anatomical Compiler Paper
CLASSIC: Ultra-high throughput mapping of genetic design space Paper

Name		Name	Last commit message	Last commit date
Latest commit History 156 Commits
docs		docs
ext_repos		ext_repos
notebooks		notebooks
prompts		prompts
src		src
.gitignore		.gitignore
.gitmodules		.gitmodules
README.md		README.md
requirements-fe.txt		requirements-fe.txt
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Gene Forge: Automated Genetic Circuit Design and Optimization

ModelClient

Tool Use

Workflows

Examples and use cases:

Fine-tuning

UI

Examples

Minimal Setup steps

Notable References/Links

About

Uh oh!

Releases

Packages

Uh oh!

Languages

jordanlgraves/geneforge

Folders and files

Latest commit

History

Repository files navigation

Gene Forge: Automated Genetic Circuit Design and Optimization

ModelClient

Tool Use

Workflows

Examples and use cases:

Fine-tuning

UI

Examples

Minimal Setup steps

Notable References/Links

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages