Merged
49 changes: 49 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,49 @@
name: CI

on:
  push:
    branches: ["main", "claude/**"]
  pull_request:
    branches: ["main"]

jobs:
  lint:
    name: Lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install ruff
        run: pip install ruff
      - name: Run ruff
        run: ruff check src/

  test:
    name: Test (Python ${{ matrix.python-version }})
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install system dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y libbz2-dev liblzma-dev libcurl4-openssl-dev
      - name: Install package and dev dependencies
        run: pip install -e ".[dev]"
        # rpy2 (the [r] extra) requires R in PATH; omitted in CI.
        # Tests that depend on rpy2 are skipped automatically via pytest.importorskip.
      - name: Run tests
        run: pytest --cov=src --cov-report=xml -v
      - name: Upload coverage
        uses: codecov/codecov-action@v4
        if: matrix.python-version == '3.12'
        with:
          file: coverage.xml
          fail_ci_if_error: false
11 changes: 11 additions & 0 deletions .gitignore
@@ -1,2 +1,13 @@
.DS*
*.pyc
__pycache__/
*.egg-info/
.eggs/
dist/
build/
.pytest_cache/
htmlcov/
.coverage
coverage.xml
.ruff_cache/
*.bak
19 changes: 19 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,19 @@
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-toml
      - id: check-added-large-files
        args: ["--maxkb=1000"]
      - id: debug-statements
      - id: check-merge-conflict
45 changes: 45 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,45 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added
- `pyproject.toml` with modern setuptools packaging configuration
- `requirements.txt` with pinned dependency ranges
- `tests/` directory with smoke tests for `qpcr` and `seqlib` modules
- GitHub Actions CI workflow for linting and testing
- `.pre-commit-config.yaml` with ruff and pre-commit-hooks
- `CHANGELOG.md` and `CONTRIBUTING.md`
- `ruff`, `pytest`, and `black` configuration in `pyproject.toml`
- `__version__` attribute to both `qpcr` and `seqlib` packages
- `__all__` export list to `seqlib/__init__.py`

### Changed
- Upgraded entire codebase from Python 2 to Python 3.12
- Replaced `seqlib/__init__.py` SHRiMP pipeline stub with proper package docstring and exports
- Expanded `qpcr/__init__.py` to expose all submodules (`abi`, `MinerMethod`, `qpcrAnalysis`, `util`)
- Removed dead `rasmus` library imports from `seqlib/util.py` (they were already failing silently)
- Wrapped legacy `pygr` imports in `genomelib.py` and `pygrlib.py` with `try/except ImportError`
- Replaced `import sequencelib` with relative import in `genomelib.py`

### Deprecated
- `seqlib.genomelib` — requires the unmaintained `pygr` library; use `pysam` or `pybedtools` instead
- `seqlib.pygrlib` — experimental scratch file depending on `pygr`; not suitable for production use

## [0.2.0] — Python 3.12 upgrade

### Changed
- Full Python 2 → Python 3.12 migration across all modules
- Updated `print` statements to `print()` functions
- Modernised `dict.keys()`/`values()`/`items()` usage
- Fixed exception syntax (`except X as e`)
- Updated `urllib`/`urllib2` imports for Python 3
- Fixed integer division and string handling throughout

## [0.1.0] — Initial release

- Personal compbio utility library for sequence analysis and qPCR
94 changes: 94 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,94 @@
# Contributing to biolib

## Development Setup

1. Clone the repository:

```bash
git clone https://github.com/gofflab/biolib.git
cd biolib
```

2. Create a virtual environment and install in editable mode with dev dependencies:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
```

3. Install pre-commit hooks:

```bash
pip install pre-commit
pre-commit install
```

## Running Tests

```bash
pytest
```

With coverage report:

```bash
pytest --cov=src --cov-report=html
open htmlcov/index.html
```

## Code Style

This project uses [ruff](https://docs.astral.sh/ruff/) for linting and formatting.

Check for issues:

```bash
ruff check src/
```

Auto-fix issues:

```bash
ruff check --fix src/
```

Format code:

```bash
ruff format src/
```

## Branch Naming

- Features: `feature/<short-description>`
- Bug fixes: `fix/<short-description>`
- Automated branches: `claude/<description>-<id>`

## Commit Messages

Use clear, imperative commit messages:

- `Add GTFlib support for GFF3 format`
- `Fix off-by-one error in intervallib.overlap()`
- `Upgrade seqlib to Python 3.12`

## Adding a New Module

1. Create the module in `src/seqlib/` or `src/qpcr/`
2. Add it to `__all__` in the corresponding `__init__.py`
3. Add smoke tests in `tests/test_seqlib.py` or `tests/test_qpcr.py`
4. Document it in `README.md` module table
5. Note the addition in `CHANGELOG.md` under `[Unreleased]`
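
Steps 2 and 3 above can be sketched as a small smoke test. This is a minimal sketch, not the project's actual test code: `assert_module_exported` is a hypothetical helper, and `mynewmodule` in the trailing comment is a placeholder name.

```python
import importlib


def assert_module_exported(package_name, module_name):
    """Check that a submodule is listed in the package's __all__ and imports cleanly.

    Hypothetical helper for illustration; not part of biolib.
    """
    package = importlib.import_module(package_name)
    exported = getattr(package, "__all__", [])
    assert module_name in exported, f"{module_name} missing from {package_name}.__all__"
    # Importing the submodule surfaces syntax errors and missing dependencies early.
    module = importlib.import_module(f"{package_name}.{module_name}")
    assert module.__name__ == f"{package_name}.{module_name}"
    return module


# In tests/test_seqlib.py this might be used as (placeholder module name):
#
# def test_mynewmodule_exported():
#     assert_module_exported("seqlib", "mynewmodule")
```

A test like this only verifies that the module imports and is exported; behavioural tests for the new functions would still be needed.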

## Dependency Notes

- **pygr**: Legacy genome database library — unmaintained and Python 2 only.
`seqlib.genomelib` and `seqlib.pygrlib` depend on it and are non-functional
in Python 3. Do not add new code using `pygr`.

- **rasmus**: Legacy utility library — not Python 3 compatible.
All `rasmus` references have been replaced with local implementations or removed.

- **rpy2**: Optional dependency for R integration. Required by `qpcr.qpcrAnalysis`
for ddCt analysis. Not required for pure-Python functionality.
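
One common way to keep an optional dependency such as `rpy2` from breaking imports is a guarded import. The sketch below is illustrative only and makes no claim about how biolib implements it: `optional_import`, `HAS_RPY2`, and `require_rpy2` are hypothetical names.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical module-level guard: importing this file never fails,
# even when rpy2 (and R) are absent.
rpy2 = optional_import("rpy2")
HAS_RPY2 = rpy2 is not None


def require_rpy2():
    """Raise a clear error at call time instead of at import time."""
    if not HAS_RPY2:
        raise ImportError("rpy2 is required for this function but is not installed")
    return rpy2
```

With this pattern, pure-Python functionality stays importable, and only the functions that call `require_rpy2()` fail when the dependency is missing.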
172 changes: 169 additions & 3 deletions README.md
@@ -1,4 +1,170 @@
-biolib
-======
+# biolib

-Python library of my own personal compbio utils
+Personal computational biology utility library for sequence analysis and qPCR data
+processing, built for Python 3.12+.

## Installation

```bash
pip install -e ".[dev]"
```

### Requirements

- Python >= 3.12
- numpy >= 1.26
- scipy >= 1.12
- pysam >= 0.22
- rpy2 >= 3.5 (optional; needed only for R-based qPCR analysis and enrichment functions)

## Modules

### `seqlib` — Sequence Analysis Utilities

A broad collection of bioinformatics tools for next-generation sequencing analysis.

| Module | Description |
|-------------------------|--------------------------------------------------|
| `seqlib.stats` | Statistical functions for genomic data |
| `seqlib.util` | General-purpose utility functions |
| `seqlib.seqlib` | Core sequence manipulation |
| `seqlib.seqstats` | Sequence-level statistics |
| `seqlib.intervallib` | Genomic interval operations |
| `seqlib.mySam` | SAM/BAM file handling |
| `seqlib.GTFlib` | GTF/GFF annotation parsing |
| `seqlib.algorithms` | Common bioinformatics algorithms |
| `seqlib.prob` | Probability distributions |
| `seqlib.JensenShannon` | Jensen-Shannon divergence |
| `seqlib.Alignment` | Sequence alignment utilities |
| `seqlib.Chip` | ChIP-seq analysis tools |
| `seqlib.clustering` | Clustering algorithms |
| `seqlib.converters` | Format conversion utilities |
| `seqlib.bowtie` | Bowtie aligner wrappers |
| `seqlib.bwa` | BWA aligner wrappers |
| `seqlib.LSFlib` | LSF cluster job submission |
| `seqlib.QCtools` | Quality control tools |
| `seqlib.RIPDiff` | RIP-seq differential analysis |
| `seqlib.continuousData` | Continuous data representation and operations |
| `seqlib.blockIt` | Block-based data iteration |
| `seqlib.misc` | Miscellaneous helper functions |

### `qpcr` — qPCR Analysis

Tools for quantitative PCR data processing and analysis.

| Module | Description |
|----------------------|----------------------------------------------|
| `qpcr.abi` | ABI instrument file parsing |
| `qpcr.qpcrAnalysis` | ddCt analysis and qPCR workflows |
| `qpcr.MinerMethod` | Miner method for PCR efficiency estimation |
| `qpcr.util` | Utility functions for qPCR data |

## Usage Examples

### Parse a GTF annotation file

```python
from seqlib import GTFlib

gtf = GTFlib.GTFReader("annotation.gtf")
for gene in gtf:
    print(gene.gene_id, gene.chrom, gene.start, gene.end)
```

### Compute Jensen-Shannon divergence

```python
from seqlib.JensenShannon import JS_divergence

p = [0.25, 0.25, 0.25, 0.25]
q = [0.50, 0.50, 0.00, 0.00]
divergence = JS_divergence(p, q)
print(divergence)
```
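
For readers unfamiliar with the quantity, the following is a self-contained sketch of the Jensen-Shannon divergence written from its textbook definition. It is not the code in `seqlib.JensenShannon`; the function names `kl_divergence` and `js_divergence` are chosen for illustration, and the choice of base-2 logarithms (so the result lies between 0 and 1) is an assumption that may differ from the library's.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence of p and q to their midpoint."""
    # Wherever p_i or q_i is positive, m_i is positive too, so no division by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


p = [0.25, 0.25, 0.25, 0.25]
q = [0.50, 0.50, 0.00, 0.00]
print(js_divergence(p, q))
```

Identical distributions give 0 and distributions with disjoint support give 1 under this convention.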

### Work with genomic intervals

```python
from seqlib import intervallib

interval = intervallib.Interval("chr1", 1000, 2000, strand="+")
print(interval.length())
```

### Load ABI qPCR results

```python
from qpcr import abi

data = abi.parseABIResults("results.txt", "cycleData.txt")
```

### Run ddCt qPCR analysis

```python
from qpcr import qpcrAnalysis

results = qpcrAnalysis.ddCtAnalysis(
    data_file="results.txt",
    endogenous_control="GapDH",
    reference_sample="control"
)
```
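
As background, the arithmetic behind a ddCt calculation can be sketched in a few lines. This follows the standard 2^(-ddCt) formulation and assumes perfect doubling per cycle; it is not the implementation in `qpcr.qpcrAnalysis`, and the function names and Ct values below are made up for illustration.

```python
def delta_delta_ct(ct_target_sample, ct_control_sample,
                   ct_target_reference, ct_control_reference):
    """Return ddCt for a target gene.

    "control" is the endogenous control gene (e.g. GapDH);
    "reference" is the reference sample (e.g. an untreated control).
    """
    d_ct_sample = ct_target_sample - ct_control_sample
    d_ct_reference = ct_target_reference - ct_control_reference
    return d_ct_sample - d_ct_reference


def fold_change(ddct):
    """Relative expression, assuming 100% amplification efficiency."""
    return 2.0 ** (-ddct)


# Illustrative Ct values only (not real data).
ddct = delta_delta_ct(24.0, 18.0, 26.0, 18.0)
print(ddct, fold_change(ddct))
```

A negative ddCt means the target crosses threshold earlier in the sample than in the reference, which corresponds to higher relative expression.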

## Development

### Setup

```bash
git clone https://github.com/gofflab/biolib.git
cd biolib
pip install -e ".[dev]"
```

### Running Tests

```bash
pytest
```

With coverage:

```bash
pytest --cov=src --cov-report=html
```

### Linting and Formatting

```bash
# Check for issues
ruff check src/

# Auto-fix issues
ruff check --fix src/

# Format code
ruff format src/
```

### Pre-commit Hooks

```bash
pip install pre-commit
pre-commit install
```

## Project Structure

```
biolib/
├── src/
│   ├── qpcr/          # qPCR analysis modules
│   └── seqlib/        # Sequence analysis modules
├── tests/             # Test suite
├── pyproject.toml     # Package configuration
└── requirements.txt   # Pinned dependencies
```

## License

MIT