Summary

A simple script that lists the transcripts nearest to a reference position, given a chromosome, txStart, and txEnd, by querying the MySQL database behind UCSC's public Genome Browser.

Based on code from: http://genomewiki.ucsc.edu/index.php/Finding_nearby_genes
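
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names `build_queries` and `fetch_nearby` are hypothetical, and taking the gene symbol from the `name2` column of `refGene` is an assumption (the original script may obtain it differently, for example through a join). The host `genome-mysql.soe.ucsc.edu` and the `genome` user are UCSC's public MySQL access.

```python
import re


def build_queries(table="refGene", limit=10):
    """Return (upstream_sql, downstream_sql) with %s placeholders.

    Hypothetical helper: upstream means txEnd is below the reference
    start, downstream means txStart is above the reference end.
    """
    # Table names cannot be bound as parameters, so validate them instead.
    if not re.fullmatch(r"[A-Za-z0-9_]+", table):
        raise ValueError(f"unexpected table name: {table!r}")
    limit = int(limit)
    # Assumption: name2 holds the gene symbol in refGene-style tables.
    cols = "chrom, txStart, txEnd, strand, name, name2 AS geneSymbol"
    upstream = (
        f"SELECT {cols} FROM {table} "
        f"WHERE chrom = %s AND txEnd < %s "
        f"ORDER BY txEnd DESC LIMIT {limit}"
    )
    downstream = (
        f"SELECT {cols} FROM {table} "
        f"WHERE chrom = %s AND txStart > %s "
        f"ORDER BY txStart ASC LIMIT {limit}"
    )
    return upstream, downstream


def fetch_nearby(db, chrom, tx_start, tx_end, table="refGene", limit=10):
    """Run both queries against the UCSC public MySQL server."""
    import pymysql  # imported lazily so build_queries works without it

    upstream_sql, downstream_sql = build_queries(table, limit)
    conn = pymysql.connect(
        host="genome-mysql.soe.ucsc.edu", user="genome", database=db
    )
    try:
        with conn.cursor() as cur:
            cur.execute(upstream_sql, (chrom, tx_start))
            upstream = cur.fetchall()
            cur.execute(downstream_sql, (chrom, tx_end))
            downstream = cur.fetchall()
    finally:
        conn.close()
    return upstream, downstream
```

Calling `fetch_nearby("hg19", "chr1", 991973, 991973)` would need network access to the UCSC server; `build_queries` can be inspected offline.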

Requirements

  • Python 3.x
  • pymysql
  • BeautifulTable

Configuration

Edit the variables between the # CONFIGURE BELOW and # END CONFIG comments in the script.
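
As an illustration only, a configuration block of this kind might look like the sketch below. The variable names (`DB`, `TABLE`, `CHROM`, `TX_START`, `TX_END`, `LIMIT`) and the `describe` helper are hypothetical and may not match the script; the values are the ones that appear in the example output.

```python
# CONFIGURE BELOW
DB = "hg19"          # genome assembly (database name)
TABLE = "refGene"    # gene set to search
CHROM = "chr1"       # chromosome of the reference position
TX_START = 991973    # start of the reference range
TX_END = 991973      # end of the reference range
LIMIT = 10           # number of transcripts to list in each direction
# END CONFIG


def describe(direction):
    """Build a header line in the style of the example output."""
    return (
        f"closest {LIMIT} {direction} transcripts from "
        f"{CHROM}:{TX_START}-{TX_END} in {DB} for {TABLE} set"
    )


print(describe("upstream"))
print(describe("downstream"))
```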

Example output

closest 10 upstream transcripts from chr1:991973-991973 in hg19 for refGene set
Note: for reverse - strand items, txEnd is the 5' end, the transcription start site
+-------+---------+--------+--------+-----------------------------+------------+
| chrom | txStart | txEnd  | strand |            name             | geneSymbol |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 955502  | 991499 |   +    |          NM_198576          |    AGRN    |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 948876  | 949919 |   +    |          NM_005101          |   ISG15    |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 934341  | 935552 |   -    |          NM_021170          |    HES4    |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 934343  | 935552 |   -    |        NM_001142467         |    HES4    |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 901876  | 910484 |   +    |          NM_032129          |  PLEKHN1   |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 901876  | 910484 |   +    |          NM_032129          |  PLEKHN1   |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 901876  | 910484 |   +    |        NM_001160184         |  PLEKHN1   |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 895966  | 901099 |   +    |          NM_198317          |   KLHL17   |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 879582  | 894679 |   -    |          NM_015658          |   NOC2L    |
+-------+---------+--------+--------+-----------------------------+------------+
| chr1  | 879582  | 894679 |   -    |          NM_015658          |   NOC2L    |
+-------+---------+--------+--------+-----------------------------+------------+


closest 10 downstream transcripts from chr1:991973-991973 in hg19 for refGene set
Note: for reverse - strand items, txStart is the 3' end, NOT the transcription start site
+-------+---------+---------+--------+----------------------------+------------+
| chrom | txStart |  txEnd  | strand |            name            | geneSymbol |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1007125 | 1009687 |   -    |        NM_001205252        |   RNF223   |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1007125 | 1009687 |   -    |        NM_001205252        |   RNF223   |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1017197 | 1051736 |   -    |         NM_017891          |  C1orf159  |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1017197 | 1051736 |   -    |         NM_017891          |  C1orf159  |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1017197 | 1051736 |   -    |         NM_017891          |  C1orf159  |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1072396 | 1079434 |   +    |         NR_038869          | LOC254099  |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1102483 | 1102578 |   +    |         NR_029639          |  MIR200B   |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1103242 | 1103332 |   +    |         NR_029834          |  MIR200A   |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1104384 | 1104467 |   +    |         NR_029957          |   MIR429   |
+-------+---------+---------+--------+----------------------------+------------+
| chr1  | 1109285 | 1133313 |   +    |        NM_001130045        |   TTLL10   |
+-------+---------+---------+--------+----------------------------+------------+
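
The two strand notes in the output above amount to a simple rule: on the + strand the transcription start site is txStart, and on the - strand it is txEnd. A small sketch of that rule, using a hypothetical helper name that is not part of the script:

```python
def transcription_start_site(tx_start, tx_end, strand):
    """Return the coordinate of the 5' end of a transcript.

    txStart is always the lower coordinate, so for reverse-strand
    transcripts the 5' end is txEnd, not txStart.
    """
    if strand == "+":
        return tx_start
    if strand == "-":
        return tx_end
    raise ValueError(f"unknown strand: {strand!r}")


# Rows taken from the example output above.
print(transcription_start_site(955502, 991499, "+"))  # AGRN, + strand
print(transcription_start_site(934341, 935552, "-"))  # HES4, - strand
```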