Add methbat map for filtering and annotation#127

Draft
inemesb wants to merge 3 commits into master from add-methbat-map

inemesb (Contributor) commented Feb 13, 2026

Description

This PR adds a mapping file that can be used to filter and annotate the methbat output by region, independently of the cpg_label column. The resulting output file should be better suited for implementation in Scout.
Since cpg_label would no longer be used for parsing, I have also added more information from the NanoImprint repo to this column in both the regions file and the methbat background file.
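
As a rough illustration of region-based filtering and annotation, the sketch below matches methbat-style rows to map entries on (chrom, start, end) and appends the three map columns named in the follow-up comment. The input layout (only chrom, start, end, and cpg_label), the helper names `read_tsv` and `annotate_by_region`, and all example values are assumptions for illustration, not the pipeline's actual implementation.

```python
import csv
import io

MAP_COLUMNS = ("hgnc_symbol", "hgnc_id", "region_type")


def read_tsv(text):
    """Parse tab-separated text with a header line into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def annotate_by_region(methbat_rows, map_rows):
    """Keep rows whose (chrom, start, end) is in the map; append the map columns."""
    lookup = {}
    for entry in map_rows:
        key = (entry["chrom"], entry["start"], entry["end"])
        lookup.setdefault(key, []).append(entry)

    annotated = []
    for row in methbat_rows:
        key = (row["chrom"], row["start"], row["end"])
        # The same window can be listed for more than one gene,
        # so emit one output row per matching map entry.
        for entry in lookup.get(key, []):
            annotated.append({**row, **{col: entry[col] for col in MAP_COLUMNS}})
    return annotated


# Toy values only: coordinates, labels and IDs are placeholders, not real regions.
methbat_rows = read_tsv(
    "chrom\tstart\tend\tcpg_label\n"
    "chrT\t100\t200\ttoy_label_1\n"
    "chrT\t300\t400\ttoy_label_2\n"
)
map_rows = read_tsv(
    "chrom\tstart\tend\thgnc_symbol\thgnc_id\tregion_type\n"
    "chrT\t100\t200\tGENE_A\t1\tpromoter\n"
)
annotated = annotate_by_region(methbat_rows, map_rows)
```

Because the match is on coordinates only, the content of cpg_label is carried through untouched and never parsed.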

Added

  • methbat mapping file for filtering and annotation

Changed

  • cpg_label in regions file to contain more information from nanoimprint
  • cpg_label in background file to contain more information from nanoimprint

How to prepare for test

  • SSH to the relevant server (depending on type of change)
  • Use stage: us
  • Paxa the environment: paxa
  • Install on stage (example for Hasta):
    bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_[TOOL] -t [TOOL] -b [THIS-BRANCH-NAME] -a

How to test

  • Do ...

Expected test outcome

  • Check that ...
  • Take a screenshot and attach or copy/paste the output.

Review

  • Tests executed by
  • "Merge and deploy" approved by
    Thanks for filling in who performed the code review and the test!

This version is a

  • MAJOR - when you make incompatible API changes
  • MINOR - when you add functionality in a backwards compatible manner
  • PATCH - when you make backwards compatible bug fixes or documentation/instructions

Implementation Plan

  • Document in ...
  • Deploy this branch on ...
  • Inform to ...

inemesb (Contributor, Author) commented Feb 13, 2026

I will add some more information on how the mapping file was created to this PR draft. Hopefully it will be useful for later documentation.

ANNOTATION NOTES

  • The mapping file is based on the original nallo/region/grch38_methbat_promoters_nanoimprint.tsv, with three columns added to each region: hgnc_symbol, hgnc_id, and region_type (promoter or nanoimprint).
  • hgnc_symbol was taken from the gene of each promoter region or from the nanoimprint gene annotations, the corresponding HGNC IDs were then looked up, and region_type was set according to the source of each region.
  • The nanoimprint regions GNAS-NESP, GNASA/B, and GNASXL were all annotated with the same symbol, GNAS, and the same ID, 4392.
  • The two MEG3 regions from nanoimprint were likewise given the same symbol and ID, 14575.
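
The collapsing of several nanoimprint regions onto one gene could be expressed as a small lookup table. This is a minimal sketch only: the symbol and ID are the ones stated in the notes above, while the exact label strings used as keys and the function name `annotate_nanoimprint` are assumptions.

```python
# Symbol and ID are taken from the annotation notes.
# The key strings are assumed to match the nanoimprint labels as written there.
NANOIMPRINT_LABEL_TO_HGNC = {
    "GNAS-NESP": ("GNAS", "4392"),
    "GNASA/B": ("GNAS", "4392"),
    "GNASXL": ("GNAS", "4392"),
}


def annotate_nanoimprint(label):
    """Return the map columns for a nanoimprint label, or None if it is not listed."""
    hit = NANOIMPRINT_LABEL_TO_HGNC.get(label)
    if hit is None:
        return None
    symbol, hgnc_id = hit
    return {"hgnc_symbol": symbol, "hgnc_id": hgnc_id, "region_type": "nanoimprint"}
```

Returning None for unlisted labels keeps regions without a known gene visible to the caller instead of silently assigning them a symbol.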

FILTERING NOTES

  • All rows for which no HGNC ID was found were removed from the map.
  • For genes present in both the nanoimprint and promoter regions, the promoter rows were removed, because nanoimprint regions were prioritized for having more evidence of ASM (allele-specific methylation). The exception is the IGF2 promoter window, which was kept because the nanoimprint region actually lies between IGF2 and H19.
  • This filtering removed the following 143 regions, which are present in nallo/region/grch38_methbat_promoters_nanoimprint.tsv but not in the map:
chrom start end hgnc_symbol hgnc_id region_type
chr6 144008260 144009259 PLAGL1 9046 promoter
chr7 50782897 50783896 GRB10 4564 promoter
chr7 130491085 130492084 MEST 7028 promoter
chr15 23647868 23648867 MAGEL2 6814 promoter
chr15 24953987 24954986 SNURF 11171 promoter
chr15 24953987 24954986 SNRPN 11164 promoter
chr19 56840727 56841726 PEG3 8826 promoter
chr20 58890421 58891420 GNAS 4392 promoter
chr1 17011758 17012757 ENSG00000288636 NA promoter
chr1 21546404 21547403 ENSG00000289715 NA promoter
chr1 23094377 23095376 ENSG00000293565 NA promoter
chr1 30576826 30577825 ENSG00000289710 NA promoter
chr1 31919629 31920628 ENSG00000288678 NA promoter
chr1 41628817 41629816 ENSG00000284895 NA promoter
chr1 53440019 53441018 ENSG00000293253 NA promoter
chr1 89578592 89579591 ENSG00000288629 NA promoter
chr1 147610590 147611589 ENSG00000288626 NA promoter
chr1 155637605 155638604 ENSG00000287839 NA promoter
chr1 174022299 174023298 ENSG00000293562 NA promoter
chr1 179881316 179882315 ENSG00000293564 NA promoter
chr1 203801094 203802093 ENSG00000288644 NA promoter
chr2 111120153 111121152 ENSG00000293584 NA promoter
chr2 202376112 202377111 ENSG00000289490 NA promoter
chr2 236507477 236508476 IQCA1 NA promoter
chr3 10248042 10249041 ENSG00000289763 NA promoter
chr3 40311086 40312085 ENSG00000291315 NA promoter
chr3 88058217 88059216 ENSG00000288654 NA promoter
chr3 101685827 101686826 ENSG00000293268 NA promoter
chr3 195909562 195910561 ENSG00000289747 NA promoter
chr4 6070163 6071162 ENSG00000284684 NA promoter
chr4 41143100 41144099 ENSG00000289761 NA promoter
chr5 21484012 21485011 ENSG00000233974 NA promoter
chr5 34189156 34190155 ENSG00000215156 NA promoter
chr5 69631080 69632079 ENSG00000250138 NA promoter
chr5 69903086 69904085 ENSG00000251158 NA promoter
chr5 70486719 70487718 ENSG00000293691 NA promoter
chr5 70507196 70508195 ENSG00000293689 NA promoter
chr5 99525362 99526361 ENSG00000206356 NA promoter
chr5 100387975 100388974 ENSG00000273957 NA promoter
chr5 140563828 140564827 ENSG00000293600 NA promoter
chr5 176083963 176084962 ENSG00000289731 NA promoter
chr5 179521483 179522482 ENSG00000251545 NA promoter
chr6 11093194 11094193 ENSG00000293642 NA promoter
chr6 16328432 16329431 ENSG00000288708 NA promoter
chr6 30933523 30934522 ENSG00000310558 NA promoter
chr6 31621774 31622773 ENSG00000291302 NA promoter
chr6 31622874 31623873 ENSG00000289282 NA promoter
chr6 53041386 53042385 ENSG00000288614 NA promoter
chr6 53064602 53065601 ENSG00000288646 NA promoter
chr6 63571480 63572479 ENSG00000285976 NA promoter
chr6 68634890 68635889 ENSG00000288712 NA promoter
chr6 133952304 133953303 ENSG00000288529 NA promoter
chr7 38273637 38274636 TARP NA promoter
chr7 100478171 100479170 ENSG00000289760 NA promoter
chr7 105039848 105040847 ENSG00000289360 NA promoter
chr7 105039858 105040857 ENSG00000288914 NA promoter
chr7 112449487 112450486 ENSG00000288634 NA promoter
chr7 150404676 150405675 ENSG00000284691 NA promoter
chr7 151060928 151061927 ENSG00000288608 NA promoter
chr8 32646202 32647201 ENSG00000286131 NA promoter
chr8 102863861 102864860 ENSG00000289653 NA promoter
chr8 109538726 109539725 ENSG00000289767 NA promoter
chr9 34666046 34667045 ENSG00000187186 NA promoter
chr9 87955214 87956213 ENSG00000283205 NA promoter
chr9 130712626 130713625 ENSG00000288570 NA promoter
chr9 131429129 131430128 ENSG00000291303 NA promoter
chr9 131431759 131432758 ENSG00000288841 NA promoter
chr9 135613183 135614182 ENSG00000236543 NA promoter
chr9 137216494 137217493 ENSG00000284976 NA promoter
chr10 66925308 66926307 ENSG00000289325 NA promoter
chr11 804420 805419 ENSG00000293685 NA promoter
chr11 2149604 2150603 ENSG00000284779 NA promoter
chr11 64240095 64241094 ENSG00000286264 NA promoter
chr12 13103485 13104484 ENSG00000289766 NA promoter
chr12 13980896 13981895 ENSG00000293563 NA promoter
chr12 31748391 31749390 ENSG00000300510 NA promoter
chr12 53240900 53241899 ENSG00000283536 NA promoter
chr12 57520481 57521480 ENSG00000285133 NA promoter
chr12 111841237 111842236 ENSG00000292259 NA promoter
chr12 113221095 113222094 IQCD NA promoter
chr12 120533697 120534696 ENSG00000288623 NA promoter
chr12 122226565 122227564 ENSG00000284934 NA promoter
chr13 63745741 63746740 ENSG00000237378 NA promoter
chr13 63833054 63834053 ENSG00000285566 NA promoter
chr13 77326989 77327988 ENSG00000288716 NA promoter
chr14 61320954 61321953 ENSG00000310561 NA promoter
chr14 92627967 92628966 ENSG00000293569 NA promoter
chr14 103333237 103334236 ENSG00000291313 NA promoter
chr15 50686718 50687717 ENSG00000288645 NA promoter
chr15 52044846 52045845 ENSG00000289025 NA promoter
chr15 81001142 81002141 ENSG00000288625 NA promoter
chr16 3003132 3004131 ENSG00000289281 NA promoter
chr16 11527248 11528247 ENSG00000188897 NA promoter
chr16 30052151 30053150 ENSG00000285043 NA promoter
chr16 30610352 30611351 ENSG00000289491 NA promoter
chr16 89418293 89419292 ENSG00000288715 NA promoter
chr17 6640317 6641316 ENSG00000282936 NA promoter
chr17 76569649 76570648 ENSG00000284526 NA promoter
chr17 80544075 80545074 ENSG00000289764 NA promoter
chr18 21111140 21112139 ENSG00000293575 NA promoter
chr19 20357163 20358162 ENSG00000293570 NA promoter
chr19 39831957 39832956 ENSG00000291312 NA promoter
chr19 52690497 52691496 ENSG00000269825 NA promoter
chr20 10434223 10435222 ENSG00000285508 NA promoter
chr20 10434223 10435222 ENSG00000285723 NA promoter
chr20 32190361 32191360 ENSG00000293164 NA promoter
chr21 34072576 34073575 ENSG00000293606 NA promoter
chr21 43938919 43939918 ENSG00000288593 NA promoter
chr21 46325288 46326287 ENSG00000286224 NA promoter
chr22 39503231 39504230 MIURF NA promoter
chrX 10576956 10577955 ENSG00000291314 NA promoter
chrX 23782278 23783277 ENSG00000288706 NA promoter
chrX 40833445 40834444 ENSG00000300293 NA promoter
chrX 45729995 45730994 ENSG00000310562 NA promoter
chrX 63754665 63755664 ENSG00000288661 NA promoter
chrX 120878925 120879924 ENSG00000278646 NA promoter
chrX 135252012 135253011 ENSG00000293661 NA promoter
chrX 135306259 135307258 ENSG00000293662 NA promoter
chrX 135392208 135393207 ENSG00000293663 NA promoter
chrX 149415507 149416506 ENSG00000287585 NA promoter
GL000009.2 58377 59376 ENSG00000278704 NA promoter
GL000194.1 115019 116018 ENSG00000277400 NA promoter
GL000195.1 49165 50164 ENSG00000276256 NA promoter
GL000213.1 139656 140655 ENSG00000277630 NA promoter
GL000218.1 54894 55893 ENSG00000278384 NA promoter
GL000219.1 83312 84311 ENSG00000273748 NA promoter
KI270711.1 29627 30626 ENSG00000271254 NA promoter
KI270713.1 32529 33528 ENSG00000277475 NA promoter
KI270713.1 34407 35406 ENSG00000268674 NA promoter
KI270721.1 1585 2584 ENSG00000276345 NA promoter
KI270726.1 25241 26240 ENSG00000277856 NA promoter
KI270726.1 40444 41443 ENSG00000275063 NA promoter
KI270727.1 167568 168567 ENSG00000276760 NA promoter
KI270727.1 371322 372321 ENSG00000275249 NA promoter
KI270727.1 385278 386277 ENSG00000274792 NA promoter
KI270728.1 16234 17233 ENSG00000274175 NA promoter
KI270728.1 936468 937467 ENSG00000275869 NA promoter
KI270728.1 1147869 1148868 ENSG00000273554 NA promoter
KI270728.1 1269984 1270983 ENSG00000277836 NA promoter
KI270731.1 13002 14001 ENSG00000278633 NA promoter
KI270734.1 71411 72410 ENSG00000276017 NA promoter
KI270734.1 130494 131493 ENSG00000278817 NA promoter
KI270734.1 161853 162852 ENSG00000277196 NA promoter
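
The two filtering rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the script actually used: the function name `filter_map` is hypothetical, a missing ID is assumed to appear as `NA` or an empty string, and the rows on the placeholder chromosome `chrT` (including `TOY_ID`) are made up. The two real rows are copied from the removed-regions list.

```python
MISSING_ID = {"", "NA"}  # assumed encodings of "no HGNC ID found"
COLUMNS = ("chrom", "start", "end", "hgnc_symbol", "hgnc_id", "region_type")


def filter_map(rows, keep_promoters=("IGF2",)):
    """Split rows into (kept, removed) following the two filtering rules."""
    nanoimprint_symbols = {
        r["hgnc_symbol"] for r in rows if r["region_type"] == "nanoimprint"
    }
    kept, removed = [], []
    for r in rows:
        # Rule 1: drop rows without an HGNC ID.
        no_id = r["hgnc_id"] in MISSING_ID
        # Rule 2: drop promoter rows for genes that also have a nanoimprint
        # region, except for the explicitly kept promoters (IGF2).
        redundant = (
            r["region_type"] == "promoter"
            and r["hgnc_symbol"] in nanoimprint_symbols
            and r["hgnc_symbol"] not in keep_promoters
        )
        (removed if no_id or redundant else kept).append(r)
    return kept, removed


raw_rows = [
    ("chr20", "58890421", "58891420", "GNAS", "4392", "promoter"),  # from the list
    ("chr1", "17011758", "17012757", "ENSG00000288636", "NA", "promoter"),  # from the list
    ("chrT", "1", "2", "GNAS", "4392", "nanoimprint"),  # toy coordinates
    ("chrT", "3", "4", "IGF2", "TOY_ID", "promoter"),  # toy coordinates and ID
    ("chrT", "5", "6", "IGF2", "TOY_ID", "nanoimprint"),  # toy coordinates and ID
]
rows = [dict(zip(COLUMNS, values)) for values in raw_rows]
kept, removed = filter_map(rows)
```

Returning the removed rows alongside the kept ones makes it straightforward to produce a listing like the one above for review.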
