Lab journal for courses at the Bioinformatics Institute.
Entries are organized as PROJECT_NAME/YYYY_MM_DD.md.
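
As a small illustration of the naming convention above, the following sketch builds the path of a journal entry from a project name and a date. The function name `entry_path` and the project directory `ecoli_outbreak` are hypothetical and only serve the example.

```python
from datetime import date
from pathlib import Path


def entry_path(project_name: str, day: date) -> Path:
    """Return PROJECT_NAME/YYYY_MM_DD.md for the given project and day."""
    # date objects accept strftime codes in format specs
    return Path(project_name) / f"{day:%Y_%m_%d}.md"


# "ecoli_outbreak" is a made-up directory name used only for illustration
print(entry_path("ecoli_outbreak", date(2011, 4, 1)).as_posix())
```

Zero-padded months and days keep the entries in chronological order when the directory is listed alphabetically.
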
- Causes of the E. coli outbreak in Germany in April 2011.
- Mutations that probably cause ampicillin resistance in E. coli.
The goals are to analyze sequencing data from an ampicillin-resistant strain of E. coli to locate the mutations responsible for its resistance, to research the mutated genes to identify the mechanism of resistance in each case, and to recommend alternative antibiotics a doctor could use to treat each strain.
- Get data
- Inspect raw sequencing data manually
- Inspect raw sequencing data with FastQC
- Filter the reads
- Align sequences to the reference
- Variant calling
- Variant effect prediction
- Writing lab report
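
The manual inspection step in the list above can be approximated with a few lines of Python. This is a minimal sketch, assuming uncompressed four-line FASTQ records and Phred+33 quality encoding; the function names `parse_fastq` and `fastq_summary` are introduced here for illustration and are not part of any tool used in the project.

```python
from typing import Dict, Iterable, Iterator, Tuple

PHRED_OFFSET = 33  # assumption: Sanger / Illumina 1.8+ quality encoding


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    record = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line and not record:
            continue  # skip blank lines between records
        record.append(line)
        if len(record) == 4:
            header, seq, sep, qual = record
            if not header.startswith("@") or not sep.startswith("+"):
                raise ValueError(f"malformed FASTQ record: {header!r}")
            if len(seq) != len(qual):
                raise ValueError(f"sequence/quality length mismatch: {header!r}")
            yield header[1:], seq, qual
            record = []
    if record:
        raise ValueError("truncated FASTQ record at end of input")


def fastq_summary(lines: Iterable[str]) -> Dict[str, float]:
    """Count reads and bases and average the per-base Phred quality."""
    reads = bases = quality_sum = 0
    for _, seq, qual in parse_fastq(lines):
        reads += 1
        bases += len(seq)
        quality_sum += sum(ord(c) - PHRED_OFFSET for c in qual)
    return {
        "reads": reads,
        "bases": bases,
        "mean_length": bases / reads if reads else 0.0,
        "mean_quality": quality_sum / bases if bases else 0.0,
    }


# Tiny in-memory example; a real run would pass an open file handle instead
example = ["@r1\n", "ACGT\n", "+\n", "IIII\n", "@r2\n", "AC\n", "+\n", "!!\n"]
print(fastq_summary(example))
```

Comparing the read count and mean quality obtained this way with the FastQC report is a quick sanity check before filtering.
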
The goal is to reproduce part of the E. coli O104:H4 Genome Analysis on the TY2482 sequencing data and answer the following questions:
- What is the genomic sequence of E. coli X?
- What strain of E. coli is E. coli X most similar to? (Where did it come from?)
- What are the genes that E. coli X contains?
- Which of these genes make E. coli X distinct?
- How did E. coli X evolve to obtain these genes?
- How did E. coli X become pathogenic?
- Explore the dataset with FastQC
- K-mer profiling and genome size estimation
- Assembling the E. coli X genome from paired reads
- Analyze the effect of read correction
- Analyze the impact of reads with large insert size
- Genome annotation
- Finding the closest relative of E. coli X
- Finding the genetic cause of HUS
- Tracing the source of toxin genes in E. coli X
- Antibiotic resistance detection
- Writing lab report
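
The k-mer profiling and genome size estimation step in the list above can be sketched in plain Python. This is a simplified illustration, not the tool used in the project: it does not merge reverse-complement k-mers, it keeps all counts in memory, and it uses the common rough estimate of genome size as the total number of non-error k-mers divided by the peak k-mer depth. The function names and the `min_depth` cutoff are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every k-mer occurrence across all reads."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_histogram(counts: Mapping[str, int]) -> Counter:
    """Map each depth to the number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def estimate_genome_size(hist: Mapping[int, int], min_depth: int = 2) -> float:
    """Estimate genome size as total solid k-mers divided by the peak depth.

    K-mers below min_depth are treated as likely sequencing errors
    (an assumption of this sketch) and are ignored.
    """
    solid = {depth: n for depth, n in hist.items() if depth >= min_depth}
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # Peak = depth shared by the most distinct k-mers (ties go to higher depth)
    peak = max(solid, key=lambda depth: (solid[depth], depth))
    total = sum(depth * n for depth, n in solid.items())
    return total / peak


# Toy example: a 10-base "genome" sequenced five times without errors
reads = ["ACGTTGCAAT"] * 5
hist = kmer_histogram(count_kmers(reads, k=3))
print(estimate_genome_size(hist))
```

On real data the estimate depends on the choice of k and on where the error cutoff is placed, so it is worth inspecting the histogram before trusting the number.
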