PINTS Core SQL

Pipelines for Non-Target Screening (PINTS) – Core SQL schema for transparent and standardized storage of non-target screening (NTS) results.

This repository provides the universal SQL layer for NTS feature data, based on open standards and ontologies. It can be consumed from Python, R, Julia, or C++ through thin wrapper packages.


📌 Features

  • Core schema: minimal mandatory entities for NTS (samples, runs, features).
  • Ontology integration: PSI-MS terms and UO units mapped to each field.
  • Integrity guardrails: foreign keys and data-quality checks keep datasets tidy (see the sketch after this list).
  • Extensible: optional plugins (e.g. intra-run components, DQS metrics).
  • Portable: pure SQL, independent of programming language.
  • Versioned: Semantic Versioning (VERSION file, schema_info table).
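
The guardrails are plain SQL constraints. The DDL below is an illustrative sketch only: column names and types are assumptions, and the shipped definitions live in sql/core/pints_core_v1.sql.

-- Illustrative sketch; column names and types are assumptions, see sql/core/pints_core_v1.sql
CREATE TABLE samples (
    sample_id   INTEGER PRIMARY KEY,
    sample_type TEXT NOT NULL
        CHECK (sample_type IN ('sample', 'blank', 'QC', 'standard'))   -- sample_type whitelist
);

CREATE TABLE runs (
    run_id    INTEGER PRIMARY KEY,                                     -- unique per acquisition
    sample_id INTEGER NOT NULL REFERENCES samples (sample_id)          -- every run is tied to a sample
);

CREATE TABLE features (
    feature_id INTEGER PRIMARY KEY,
    run_id     INTEGER NOT NULL REFERENCES runs (run_id),
    mz         DOUBLE CHECK (mz > 0),                                  -- positive m/z
    rt         DOUBLE CHECK (rt >= 0),                                 -- non-negative retention time
    area       DOUBLE CHECK (area >= 0)                                -- non-negative peak area
);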

📂 Repository structure

pints-core-sql/
├── sql/
│   ├── core/        # Minimal mandatory schema
│   └── seeds/       # Ontology + unit seeds
├── tests/           # Smoke test SQLs
├── docs/            # Contribution guidelines
├── VERSION          # Schema version (SemVer)
├── LICENSE          # MIT License
├── NOTICE           # Third-party notices
└── CITATION.cff     # Citation metadata

🚀 Quickstart

Using DuckDB CLI

-- Create or open a database
.open pints.duckdb

-- Load the core schema
.read sql/core/pints_core_v1.sql

-- Seed ontology + units
.read sql/seeds/pints_core_seed_v1.sql

-- Optional: run smoke test
.read tests/smoke.sql

Using Python

import duckdb
import pathlib

con = duckdb.connect("pints.duckdb")

# Load schema + seed
con.execute(pathlib.Path("sql/core/pints_core_v1.sql").read_text())
con.execute(pathlib.Path("sql/seeds/pints_core_seed_v1.sql").read_text())

print(con.execute("SHOW TABLES").fetchdf())
con.close()

🧱 Schema at a glance

Table           | Purpose                                         | Notable constraints
----------------|-------------------------------------------------|--------------------
samples         | Logical specimens (sample, blank, QC, standard) | sample_type whitelist
runs            | Individual acquisitions tied to a sample        | run_id unique, index on sample_id
features        | Peak-level observations                         | Positive mz, non-negative rt/area, run and chromatogram indexes
prop_dictionary | Registry of semantic property keys              | Namespace pattern check, uncertainty requires base key
algo_properties | Algorithm outputs with provenance               | prop_value or value_text required, FKs to dictionary and units
schema_info     | Schema version manifest                         | Default timestamps, non-empty name

More detail is available in docs/SCHEMA.md.
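
For orientation, a minimal round trip through the three mandatory tables could look like the following. Column names beyond those listed above are assumptions, so treat this as a sketch and consult docs/SCHEMA.md for the real definitions.

-- Hypothetical column names; docs/SCHEMA.md documents the actual DDL
INSERT INTO samples  (sample_id, sample_type)            VALUES (1, 'sample');
INSERT INTO runs     (run_id, sample_id)                 VALUES (10, 1);
INSERT INTO features (feature_id, run_id, mz, rt, area)  VALUES (100, 10, 301.1410, 512.3, 20500.0);

-- Features per run, joined back to the owning sample
SELECT s.sample_type, r.run_id, COUNT(*) AS n_features
FROM features f
JOIN runs    r ON r.run_id    = f.run_id
JOIN samples s ON s.sample_id = r.sample_id
GROUP BY s.sample_type, r.run_id;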

🔌 Extensions

Additional functionality (e.g. intra-run components, aligned features, algorithm-specific metrics) is shipped as plugins in sql/plugins/.

Wrappers in Python, R, Julia, and C++ will provide convenience APIs to load plugins.
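
Until those wrappers land, a plugin can be applied the same way as the core schema, by reading the plugin file into an existing database. The file name below is hypothetical; check sql/plugins/ for the files that actually ship.

-- Load an optional plugin on top of the core schema (hypothetical file name)
.open pints.duckdb
.read sql/plugins/components_v1.sql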

🧪 Testing

  1. Load the schema: .read sql/core/pints_core_v1.sql
  2. Seed demo content (optional but recommended for tests): .read sql/seeds/pints_core_seed_v1.sql
  3. Run the smoke suite: .read tests/smoke.sql

The smoke queries confirm that version metadata, ontology dictionaries, and demo features are in place.
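
For quick spot checks outside the smoke suite, queries along these lines cover the same ground (column selection is illustrative; tests/smoke.sql remains authoritative):

-- Ad-hoc sanity checks after loading schema + seeds
SELECT * FROM schema_info;                            -- version metadata present?
SELECT COUNT(*) AS n_terms    FROM prop_dictionary;   -- ontology dictionary seeded?
SELECT COUNT(*) AS n_features FROM features;          -- demo features loaded?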

📖 Documentation

  • docs/SCHEMA.md: Deep dive into every table, view, constraint, and index.
  • docs/OPERATIONS.md: How to provision databases, reseed data, run tests, and manage migrations.
  • docs/CONTRIBUTING.md: Workflow, style, and review expectations for contributors.

📜 License & Citation

PINTS Core SQL is released under the MIT License; third-party notices are listed in NOTICE.

If you use this project in scientific work, please cite it as described in CITATION.cff. GitHub will also display the citation automatically.
