30 commits
b0f1bdd
chores: adding import.report to gitignore
njmmatthieu Dec 8, 2025
32371ea
fix: changing the name of the neo4j database.
njmmatthieu Dec 10, 2025
41727c1
fix: change single_nucleotide_variants to short_mutations
njmmatthieu Dec 10, 2025
dbb1402
fix: change functional_outcome to gene_status
njmmatthieu Dec 10, 2025
cf13bec
fix: change edge_label and name for gene_ontology adapters
njmmatthieu Dec 10, 2025
c23a1f4
test echo in leaking data
njmmatthieu Dec 15, 2025
5fff0f7
faxe patient id
njmmatthieu Dec 15, 2025
6f2b657
fix: change copy_number_alterations to copy_number_amplifications and…
njmmatthieu Dec 17, 2025
62a9d98
feat: adding gene_summary property to gene node
njmmatthieu Dec 17, 2025
c5bec4a
fix: cleaning adapter for local annotation of copy number amplifications
njmmatthieu Dec 17, 2025
e1b86d6
fix: remove alteration affects gene edge
njmmatthieu Dec 18, 2025
644e3bf
fix: fix make.sf for MacOS
njmmatthieu Dec 18, 2025
1f0e2ff
fix: change meta property name to data_source
njmmatthieu Dec 18, 2025
6a28c69
feat(validation): validator for short mutations external
njmmatthieu Jan 5, 2026
b10cdb1
fix: validator for short mutations external
njmmatthieu Jan 5, 2026
078ecaa
fix: validator for short mutations local
njmmatthieu Jan 6, 2026
be1df8c
fix: validator for cna external
njmmatthieu Jan 6, 2026
ce3035c
fix: validators for copy number amplification
njmmatthieu Jan 9, 2026
7f47c74
feat: pretty hooks with pre-commit
njmmatthieu Jan 9, 2026
77c3c53
fix: poetry and uv dependencies and add poetry.lock to .gitignore
njmmatthieu Jan 9, 2026
1cb936a
adding build to .gitignore
njmmatthieu Jan 9, 2026
c843f6e
change ontoweaver version to 1.0.0
njmmatthieu Jan 9, 2026
259246f
fix: comming the import of adapters __init__.py to have a true first …
njmmatthieu Jan 9, 2026
c56f2ea
tmp: Adding newest version of ontoweaver from Marko's forked repository
njmmatthieu Jan 12, 2026
ba045be
fix: removed tail ontologies to correct the hierarchy types for gene
njmmatthieu Jan 12, 2026
d55b564
fix: change outcome to phenomenon to correct the hierarchy types for …
njmmatthieu Jan 12, 2026
b522517
style: shortened pre-commit hooks
njmmatthieu Jan 12, 2026
410f590
style: removing lists when there is a single entry in adapters
njmmatthieu Jan 12, 2026
557f9d9
fix: dashes instead of underscores for long flags
njmmatthieu Jan 14, 2026
c9b3898
feat: new released version of ontoweaver 1.2.0
njmmatthieu Jan 14, 2026
14 changes: 11 additions & 3 deletions .gitignore
@@ -1,10 +1,18 @@
-.venv
 .vscode
 biocypher-*
 *__pycache__*
 neo4j.pass
 backup
-data
 *.DS_Store
 notebooks
-tmp.sh
+tmp.sh
+
+import.report
+
+data
+
+.venv
+poetry.lock
+uv.lock
+*.egg-info
+build
33 changes: 33 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,33 @@
+repos:
+  - repo: local
+    hooks:
+      - id: patient_id
+        name: Check for patient ids starting with "D", "H" or "M".
+        always_run: true
+        args: [--multiline]
+        entry: '[DHM][0-9]{3}'
+        language: pygrep
+      # - id: M-code-patient_id
+      #   name: Check for patient ids starting with an "M".
+      #   always_run: true
+      #   args: [--multiline]
+      #   entry: 'M[0-9]{3}'
+      #   language: pygrep
+      # - id: D-code-patient_id
+      #   name: Check for patient ids starting with a "D".
+      #   always_run: true
+      #   args: [--multiline]
+      #   entry: 'D[0-9]{3}'
+      #   language: pygrep
+      - id: OC-code-patient_id
+        name: Check for patient ids starting with "OC".
+        always_run: true
+        args: [--multiline]
+        entry: 'OC[0-9]{3}'
+        language: pygrep
+      - id: EOC-patient_code
+        name: Check for EOC patient codes.
+        always_run: true
+        args: [--multiline]
+        entry: 'EOC[0-9]{3}'
+        language: pygrep
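
The `pygrep` hooks above fail whenever the regular expression given in `entry` matches the content of a checked file. As a rough illustration of what each pattern catches, here is a minimal Python sketch. The `PATTERNS` table copies the three active `entry` values; the `triggered_hooks` helper and the sample strings are made up for this example and are not part of the repository.

```python
import re

# Patterns copied from the `entry` fields of the three active hooks.
PATTERNS = {
    "patient_id": r"[DHM][0-9]{3}",
    "OC-code-patient_id": r"OC[0-9]{3}",
    "EOC-patient_code": r"EOC[0-9]{3}",
}


def triggered_hooks(text: str) -> list[str]:
    """Return the ids of the hooks whose pattern occurs anywhere in `text`."""
    return [
        hook_id
        for hook_id, pattern in PATTERNS.items()
        if re.search(pattern, text)
    ]


if __name__ == "__main__":
    # Invented sample strings, for illustration only.
    samples = [
        "patient H042 relapsed",
        "sample EOC123",
        "codec H264",
        "no identifiers here",
    ]
    for sample in samples:
        print(f"{sample!r}: {triggered_hooks(sample)}")
```

Under these assumptions, a string such as `EOC123` also trips the `OC` pattern, because `OC[0-9]{3}` matches inside it, and `[DHM][0-9]{3}` flags unrelated tokens such as `H264`, so occasional false positives are to be expected.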
10 changes: 10 additions & 0 deletions README.md
@@ -29,6 +29,16 @@ shell` inside the project directory.

 If you have a problem with the poetry install command, it may be that the 'poetry lock' command has not been run after modifying the dependencies in '$ONCODASHKB_HOME/pyproject.toml'. Try running 'poetry lock' to fix the issue.

+### Requirements for preventing the publication of patient IDs
+
+Please install the [pre-commit](https://pre-commit.com/) hooks before committing or pushing anything new:
+
+```
+pre-commit install
+pre-commit install --hook-type pre-push
+pre-commit install --hook-type commit-msg
+```
+
 ### Database

 Theoretically, any graph database supported by BioCypher may be used.
18 changes: 9 additions & 9 deletions config/biocypher_config.yaml
@@ -8,15 +8,15 @@ biocypher:
     url: https://github.com/biolink/biolink-model/raw/v3.2.1/biolink-model.owl.ttl
     root_node: entity

-  tail_ontologies:
-    so_variant:
-      url: http://purl.obolibrary.org/obo/so.owl
-      head_join_node: sequence variant
-      tail_join_node: sequence_variant
-    so_feature:
-      url: http://purl.obolibrary.org/obo/so.owl
-      head_join_node: attribute
-      tail_join_node: sequence_feature
+  # tail_ontologies:
+  #   so_variant:
+  #     url: http://purl.obolibrary.org/obo/so.owl
+  #     head_join_node: sequence variant
+  #     tail_join_node: sequence_variant
+  #   so_feature:
+  #     url: http://purl.obolibrary.org/obo/so.owl
+  #     head_join_node: attribute
+  #     tail_join_node: sequence_feature
   #   go_biological_process:
   #     url: http://purl.obolibrary.org/obo/go.owl
   #     head_join_node: biological process
17 changes: 5 additions & 12 deletions config/schema_config.yaml
@@ -11,14 +11,15 @@ alteration:
   properties:
     citation_PM_ids: str
     consequence: str
+    homogenous: str
     level_of_evidence: str
     mutation_effect_description: str
+    name: str
     oncogenic: str
     reference_genome: str
     tumor_type: str
     tumor_type_summary: str
     variant_summary: str
-    name: str

 disease:
   represented_as: node
@@ -33,7 +34,7 @@ drug:
     name: str

 gene status:
-  is_a: outcome
+  is_a: phenomenon
   represented_as: node
   label_in_input: gene_status
   properties:
@@ -43,6 +44,7 @@ gene:
   represented_as: node
   label_in_input: gene
   properties:
+    gene_summary: str
     name: str

 # protein:
@@ -112,7 +114,7 @@ sample carries alteration:
 # A gene is linked to its gene status (gain or loss of function),
 # which are represented as nodes, so as to allow a causal path
 # to go through alteration -> gene status -> transcript activity.
-# Hence, outcomes have at least two instances:
+# Hence, phenomena have at least two instances:
 # - Gene:GoF, and
 # - Gene:LoF.

@@ -155,15 +157,6 @@ patient has chronic illness:

 ### AFFECTS

-alteration affects gene:
-  is_a: affects
-  represented_as: edge
-  label_in_input: alteration_affects_gene
-  source: alteration
-  taregt: gene
-  properties:
-    name: str
-
 gene status affects gene:
   is_a: affects
   represented_as: edge
6 changes: 0 additions & 6 deletions hooks/install_hooks.sh

This file was deleted.

30 changes: 0 additions & 30 deletions hooks/leaking-data

This file was deleted.

30 changes: 0 additions & 30 deletions hooks/pre-commit

This file was deleted.

64 changes: 0 additions & 64 deletions hooks/pre-push

This file was deleted.

31 changes: 8 additions & 23 deletions make_macos.sh
@@ -1,12 +1,5 @@
 #!/usr/bin/bash

-# When using Neo4j installed on system (like Ubuntu's packaged version),
-# the current directory must be writable by user "neo4j",
-# and all parent directories must be executable by "other".
-# Every interaction with the database must be done by user "neo4j",
-# and the import will try to write reports in the current directory.
-# NEO_USER="sudo -u neo4j"
-
 # Exit on error.
 set -e
 set -o pipefail
@@ -22,9 +15,6 @@ if [[ "$2" == "debug" ]] ; then
     weave_args="-v DEBUG"
 fi

-# export JAVA_HOME="/usr/lib/jvm/java-21-openjdk-amd64"
-
-
 echo "Activate virtual environment..." >&2
 source $(poetry env info --path)/bin/activate

@@ -43,16 +33,12 @@ fi
 echo "Weave data..." >&2

 cmd="python3 ${py_args} $(pwd)/weave.py --verbose INFO \
-    --clinical $data_dir/DECIDER/$data_version/clinical_export.csv \
-    --single_nucleotide_variants $data_dir/DECIDER/$data_version/snv_external.csv \
-    --copy_number_alterations $data_dir/DECIDER/$data_version/cna_external.csv \
-    --gene_ontology_genes $data_dir/DECIDER/$data_version/OncoKB_gene_symbols.conf \
-    --oncokb $data_dir/DECIDER/$data_version/treatments.csv \
-    --gene_ontology $data_dir/GO/goa_human.gaf.gz \
-    --gene_ontology_owl $data_dir/GO/go.owl \
-    --gene_ontology_reverse
+    --clinical $data_dir/DECIDER/$data_version/clinical_export.csv \
+    --short-mutations-local $data_dir/DECIDER/$data_version/snv_local.csv \
+    --short-mutations-external $data_dir/DECIDER/$data_version/snv_external.csv \
+    --copy-number-amplifications-local $data_dir/DECIDER/$data_version/cna_local.csv \
+    --copy-number-amplifications-external $data_dir/DECIDER/$data_version/cna_external.csv \
     ${weave_args}" # \
-    #--small_molecules $data_dir/omnipath_networks/omnipath_webservice_interactions__small_molecule_interactions_filtered.tsv.gz"

 echo "Weaving command:" >&2
 echo "$cmd" >&2
@@ -63,15 +49,14 @@ $cmd > tmp.sh
 if [[ "$2" != "debug" ]] ; then
     echo "Run import script..." >&2
     chmod a+x $(cat tmp.sh)
-    $(cat tmp.sh) | tee /dev/tty | ${NEO_USER} sh
+    sh $(cat tmp.sh)

     echo "Restart Neo4j..." >&2
     $server start
     sleep 5

     echo "Send a test query..." >&2
-    sudo -u neo4j cypher-shell --username neo4j --database oncodash --password $(cat neo4j.pass) "MATCH (p:Patient) RETURN p LIMIT 20;"
+    cypher-shell --username neo4j --database oncodash --password $(cat neo4j.pass) "MATCH (p:Patient) RETURN p LIMIT 20;"
 fi

-echo "Done" >&2
-
+echo "Done" >&2
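
The renamed long flags in the weaving command use dashes rather than underscores. The argument parser of `weave.py` is not shown in this diff; assuming it is built on Python's standard `argparse` module, the sketch below shows that dashed option names are still read back through underscore attribute names. `build_parser` and the file names passed to it are hypothetical.

```python
import argparse

# Long flags as they appear in the updated weaving command.
FLAGS = (
    "--clinical",
    "--short-mutations-local",
    "--short-mutations-external",
    "--copy-number-amplifications-local",
    "--copy-number-amplifications-external",
)


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one optional file argument per flag (hypothetical)."""
    parser = argparse.ArgumentParser(prog="weave.py")
    for flag in FLAGS:
        # argparse derives the attribute name from the long flag,
        # replacing inner dashes with underscores.
        parser.add_argument(flag, metavar="FILE", default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--short-mutations-local", "snv_local.csv"]
    )
    print(args.short_mutations_local)
    print(args.copy_number_amplifications_external)
```

With this convention the command line follows the usual dashed style while the Python code keeps valid identifiers, so only the flag strings need to change, not the attribute names used downstream.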
5 changes: 2 additions & 3 deletions oncodashkb/adapters/clinical.yaml
@@ -1,11 +1,10 @@
 row:
   map:
-    columns:
-      - cohort_code
+    column: cohort_code
     to_subject: patient
 transformers:
   - map:
-      columns: survival
+      column: survival
       to_property: survival
       for_object: patient
   - map: