61 changes: 47 additions & 14 deletions .github/workflows/ci.yml
@@ -1,15 +1,21 @@
name: CI
name: SD Savior Pipeline

on:
push:
branches: ["**"]
pull_request:
push:
branches:
- main

permissions:
contents: read

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
validate:
name: Validate (lint, types, tests)
runs-on: ubuntu-latest
strategy:
fail-fast: false
@@ -24,6 +30,7 @@ jobs:
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
cache: pip

- name: Install dependencies
run: |
@@ -36,14 +43,14 @@
- name: Type check
run: mypy src

- name: Tests with coverage gate
- name: Tests
run: pytest -q

semver:
name: Release
name: Release (semantic-release)
runs-on: ubuntu-latest
needs: [validate]
if: github.ref == 'refs/heads/main'
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
permissions:
contents: write
env:
@@ -52,6 +59,7 @@
released: ${{ steps.release.outputs.released }}
tag: ${{ steps.release.outputs.tag }}
commit_sha: ${{ steps.release.outputs.commit_sha }}

steps:
- name: Checkout (full history for tags)
uses: actions/checkout@v4
@@ -68,12 +76,14 @@
git_committer_email: "41898282+github-actions[bot]@users.noreply.github.com"

docs-build:
name: Build docs
needs: [validate]
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
needs: validate
runs-on: ubuntu-latest
permissions:
pages: write
id-token: write

steps:
- name: Checkout
uses: actions/checkout@v4
@@ -82,6 +92,7 @@
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: pip

- name: Install dependencies
run: |
@@ -100,29 +111,51 @@
path: site

docs-deploy:
name: Deploy docs
needs: [docs-build]
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
needs: docs-build
runs-on: ubuntu-latest
permissions:
pages: write
id-token: write
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}

steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4

pypi-publish:
needs: semver
name: Upload release to PyPI
needs: [semver]
if: github.event_name == 'push' && github.ref == 'refs/heads/main' && needs.semver.outputs.released == 'true'
runs-on: ubuntu-latest
environment:
name: pypi
url: https://pypi.org/p/sdsavior
permissions:
id-token: write

steps:
- name: Publish package distributions to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
- name: Checkout release tag
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ needs.semver.outputs.tag }}

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: pip

- name: Build tooling
run: python -m pip install -U build twine

- name: Build package
run: python -m build

- name: Upload package to PyPI
run: python -m twine upload dist/*
env:
TWINE_USERNAME: __token__
TWINE_PASSWORD: ${{ secrets.PYPI_TOKEN }}
26 changes: 26 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,32 @@

<!-- version list -->

## v1.0.2 (2026-03-04)

### Bug Fixes

- **ci**: Pypy with twine
([`f4bd2c2`](https://github.com/well-it-wasnt-me/SDSavior/commit/f4bd2c2833b42478fb15be6ac2b63a7a17ee7250))

### Chores

- Line to long
([`fd9e506`](https://github.com/well-it-wasnt-me/SDSavior/commit/fd9e5069ff4e5c79ca96b8e2ab320de1d3c36bd9))

### Testing

- **coverage**: Extended test coverage
([`3f0b76a`](https://github.com/well-it-wasnt-me/SDSavior/commit/3f0b76a922c79593ce877fbf91df6e27dd98bff5))


## v1.0.1 (2026-03-04)

### Bug Fixes

- Harden open/recovery edge cases
([`f55a6f2`](https://github.com/well-it-wasnt-me/SDSavior/commit/f55a6f2a742f43790f0f8ad866a2d12b88963cc8))


## v1.0.0 (2026-02-28)

- Initial Release
7 changes: 5 additions & 2 deletions docs/api.md
@@ -14,11 +14,13 @@ Dataclass storing persisted pointer state:

### Constructor

`SDSavior(data_path, meta_path, capacity_bytes, *, fsync_data=False, fsync_meta=True, json_dumps_kwargs=None, recover_scan_limit_bytes=None)`
`SDSavior(data_path, meta_path, capacity_bytes, *, fsync_data=False, fsync_meta=True, json_dumps_kwargs=None, recover_scan_limit_bytes=None, coalesce_max_records=None, coalesce_max_seconds=None)`

- `capacity_bytes` must be a multiple of 8 and at least 16 KiB.
- `json_dumps_kwargs` is copied internally.
- `recover_scan_limit_bytes` can cap recovery scanning.
- `coalesce_max_records` (optional): flush pending records after this count.
- `coalesce_max_seconds` (optional): flush pending records after this many seconds.
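
The capacity constraints above can be checked up front. A minimal sketch, where `validate_capacity` is a hypothetical helper invented for illustration (the real constructor performs its own validation):

```python
def validate_capacity(capacity_bytes: int) -> None:
    """Illustrative check of the documented capacity_bytes rules.

    Hypothetical helper, not part of the library's API.
    """
    if capacity_bytes % 8 != 0:
        raise ValueError("capacity_bytes must be a multiple of 8")
    if capacity_bytes < 16 * 1024:
        raise ValueError("capacity_bytes must be at least 16 KiB")

validate_capacity(8 * 1024 * 1024)  # 8 MiB: multiple of 8 and >= 16 KiB
```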

### Lifecycle

@@ -29,7 +31,8 @@ Dataclass storing persisted pointer state:
### Data Operations

- `append(obj) -> int`: append JSON object and return assigned sequence.
- `iter_records(from_seq=None)`: iterate `(seq, ts_ns, obj)` from tail to head.
- `iter_records(from_seq=None, skip_corrupt=False)`: iterate `(seq, ts_ns, obj)` from tail to head. With `skip_corrupt=True`, corrupt records are skipped and scanning continues instead of stopping at the first corruption; the scan covers up to the full ring capacity, so corrupt regions of any size are handled.
- `_last_iter_skipped`: after `iter_records(skip_corrupt=True)`, contains the count of corrupt regions that were skipped.
- `export_jsonl(out_path, from_seq=None)`: write records to JSONL file.

### Internal Mechanics (for whoever wishes to contribute)
53 changes: 53 additions & 0 deletions docs/usage.md
@@ -35,6 +35,20 @@ for seq, ts_ns, obj in rb.iter_records(from_seq=200):
...
```

## Skip Corrupt Records

By default, iteration stops at the first corrupt record. Use `skip_corrupt=True` to skip over corrupt regions and yield all valid records:

```python
for seq, ts_ns, obj in rb.iter_records(skip_corrupt=True):
print(seq, obj)

# Check how many corrupt regions were skipped
print(f"Skipped {rb._last_iter_skipped} corrupt region(s)")
```

This is useful for data recovery when partial corruption has occurred.
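
The skip-and-continue policy can be sketched over a plain list, with `None` standing in for a corrupt region. The `scan` helper below is illustrative only; the real implementation walks raw ring-buffer bytes, but the policy is the same: skip bad spans, keep yielding valid records, and count each contiguous corrupt region once.

```python
def scan(records):
    """Yield valid records, counting contiguous corrupt regions once each."""
    valid, skipped = [], 0
    in_corrupt = False
    for rec in records:
        if rec is None:           # corrupt entry
            if not in_corrupt:
                skipped += 1      # a new corrupt region begins
            in_corrupt = True
        else:
            in_corrupt = False
            valid.append(rec)
    return valid, skipped

# Two adjacent corrupt entries form a single skipped region.
valid, skipped = scan([{"seq": 1}, None, None, {"seq": 4}])
```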

## Export JSONL

```python
@@ -49,3 +63,42 @@ with SDSavior("data.ring", "data.meta", 8 * 1024 * 1024) as rb:
- `recover_scan_limit_bytes=None` (default): scan up to capacity during recovery.

Use `fsync_data=True` when stronger durability is required and throughput tradeoffs are acceptable.

## Write Coalescing

Buffer multiple writes and flush them in batches to reduce I/O overhead:

```python
with SDSavior(
"data.ring", "data.meta", 8 * 1024 * 1024,
coalesce_max_records=100,
coalesce_max_seconds=1.0,
) as rb:
for i in range(1000):
rb.append({"sample": i})
# Remaining pending records are flushed on close
```

- `coalesce_max_records`: flush after this many buffered records.
- `coalesce_max_seconds`: flush after this many seconds since last flush.
- `flush()`: manually trigger a flush of pending records.

**Trade-offs**: Coalescing increases memory use (pending records are held in memory) and widens the data loss window (unflushed records are lost on crash). Use when write throughput matters more than per-record durability.
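
The flush policy can be sketched as a plain in-memory buffer. `CoalescingBuffer` is invented here for illustration, not the library's implementation; it only shows the two triggers (record count and elapsed time) and the final flush on close.

```python
import time

class CoalescingBuffer:
    """Sketch of the coalescing policy: flush on count or elapsed time."""

    def __init__(self, flush, max_records=100, max_seconds=1.0):
        self.flush_fn = flush
        self.max_records = max_records
        self.max_seconds = max_seconds
        self.pending = []
        self.last_flush = time.monotonic()

    def append(self, obj):
        self.pending.append(obj)
        due_count = len(self.pending) >= self.max_records
        due_time = time.monotonic() - self.last_flush >= self.max_seconds
        if due_count or due_time:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_fn(self.pending)  # one physical write per batch
            self.pending = []
        self.last_flush = time.monotonic()

batches = []
buf = CoalescingBuffer(batches.append, max_records=3, max_seconds=60.0)
for i in range(7):
    buf.append(i)
buf.flush()  # flush the remainder, as close() would
```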

### Write Statistics

```python
stats = rb.write_stats
print(f"Logical appends: {stats['logical_appends']}")
print(f"Physical flushes: {stats['physical_flushes']}")
print(f"Bytes written: {stats['bytes_written']}")
```

### Pending Records in Iteration

By default, `iter_records()` only yields durable on-disk records. To include unflushed pending records:

```python
for seq, ts_ns, obj in rb.iter_records(include_pending=True):
print(seq, obj)
```
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "sdsavior"
version = "0.1.0"
version = "1.0.1"
description = "Crash-recoverable memory-mapped ring buffer for JSON records (SD-card friendly-ish)"
readme = "README.md"
requires-python = ">=3.11"
@@ -51,3 +51,4 @@ select = ["E", "F", "I", "UP", "B"]
[tool.pytest.ini_options]
addopts = "--cov=src/sdsavior --cov-report=term-missing --cov-fail-under=90"
testpaths = ["tests"]
pythonpath = ["src"]