Pipelines for Non-Target Screening (PINTS) – Core SQL schema for transparent and standardized storage of non-target screening (NTS) results.
This repository provides the universal SQL layer for NTS feature data, based on open standards and ontologies. It can be consumed from Python, R, Julia, or C++ through thin wrapper packages.
- Core schema: minimal mandatory entities for NTS (samples, runs, features).
- Ontology integration: PSI-MS terms and UO units mapped to each field.
- Integrity guardrails: foreign keys and data-quality checks keep datasets tidy.
- Extensible: optional plugins (e.g. intra-run components, DQS metrics).
- Portable: pure SQL, independent of programming language.
- Versioned: Semantic Versioning (
VERSIONfile,schema_infotable).
pints-core-sql/
├── sql/
│ ├── core/ # Minimal mandatory schema
│ └── seeds/ # Ontology + unit seeds
├── tests/ # Smoke test SQLs
├── docs/ # Contribution guidelines
├── VERSION # Schema version (SemVer)
├── LICENSE # MIT License
├── NOTICE # Third-party notices
└── CITATION.cff # Citation metadata
-- Create or open a database
.open pints.duckdb
-- Load the core schema
.read sql/core/pints_core_v1.sql
-- Seed ontology + units
.read sql/seeds/pints_core_seed_v1.sql
-- Optional: run smoke test
.read tests/smoke.sqlimport duckdb, pathlib
con = duckdb.connect("pints.duckdb")
# Load schema + seed
con.execute(pathlib.Path("sql/core/pints_core_v1.sql").read_text())
con.execute(pathlib.Path("sql/seeds/pints_core_seed_v1.sql").read_text())
print(con.execute("SHOW TABLES").fetchdf())
con.close()| Table | Purpose | Notable constraints |
|---|---|---|
samples |
Logical specimens (sample, blank, QC, standard) | sample_type whitelist |
runs |
Individual acquisitions tied to a sample | run_id unique, index on sample_id |
features |
Peak-level observations | Positive mz, non-negative rt/area, run + chromatogram indexes |
prop_dictionary |
Registry of semantic property keys | Namespace pattern check, uncertainty requires base key |
algo_properties |
Algorithm outputs with provenance | prop_value or value_text required, FK to dictionary and units |
schema_info |
Schema version manifest | Default timestamps, non-empty name |
More detail is available in docs/SCHEMA.md.
Additional functionality (e.g. intra-run components, aligned features, algorithm-specific metrics) is shipped as plugins in sql/plugins/
Wrappers in Python, R, Julia, and C++ will provide convenience APIs to load plugins.
- Load the schema:
.read sql/core/pints_core_v1.sql - Seed demo content (optional but recommended for tests):
.read sql/seeds/pints_core_seed_v1.sql - Run the smoke suite:
.read tests/smoke.sql
The smoke queries confirm that version metadata, ontology dictionaries, and demo features are in place.
docs/SCHEMA.md: Deep dive into every table, view, constraint, and index.docs/OPERATIONS.md: How to provision databases, reseed data, run tests, and manage migrations.docs/CONTRIBUTING.md: Workflow, style, and review expectations for contributors.
PINTS Core SQL is released under the MIT License. Additional notices: see NOTICE.
If you use this project in scientific work, please cite it as described in CITATION.cff. GitHub will also display the citation automatically.