Open
Commits
71 commits
3b64501
Add Feature, GenericFeature, CDSFeature, ClusterFeature classes
Manikumar1998 Jun 10, 2017
26d5e94
Add from_biopython(), _modified_features in Record
Manikumar1998 Jun 11, 2017
036553a
Replace from_genbank() with from_file() in Record, Update test_record.py
Manikumar1998 Jun 11, 2017
236afba
Refined class methods
Manikumar1998 Jun 14, 2017
e3bb9e5
Add add_feature() in Record class, modify get_cluster_number, Refined
Manikumar1998 Jun 16, 2017
1d420ce
Strict type check for cutoff & extension, Clusters and their numbers …
Manikumar1998 Jun 16, 2017
100db17
Update test_record.py
Manikumar1998 Jun 16, 2017
524dec5
Add test_cluster.py, modify add_feature() and test_record.py
Manikumar1998 Jun 17, 2017
9052879
Solved: Format of writing to files
Manikumar1998 Jun 18, 2017
3046eca
Update record.py, test_cluster.py, test_record.py. Add new test files…
Manikumar1998 Jun 18, 2017
2c549b3
Updated testfiles path tests/data/ -> data/
Manikumar1998 Jun 19, 2017
5bd4486
SOLVED: complement(location); Add sort_feature() and cmp_feature_loca…
Manikumar1998 Jun 20, 2017
976d300
Update new SeqFeature location
Manikumar1998 Jun 21, 2017
0db0c67
Update FeatureLocation(), Delete logging(list), Update sort_features()
Manikumar1998 Jun 21, 2017
d21aebe
Remove comments
Manikumar1998 Jun 21, 2017
2b2e57d
Add name and record property in Record
Manikumar1998 Jun 25, 2017
f3577bc
Remove record property, add translation in CDSFeature, add setter for…
Manikumar1998 Jun 27, 2017
510d138
Add setter for id
Manikumar1998 Jun 27, 2017
7748c6f
Add setters and getters for generics and clusters lists
Manikumar1998 Jun 28, 2017
3c17965
Add setters for things in Record
Manikumar1998 Jul 1, 2017
0e9b1b7
Add setter and getter for location in Feature class.
Manikumar1998 Jul 1, 2017
301f2e4
Modify location setter
Manikumar1998 Jul 2, 2017
c20f78a
Modify parsing 'note' in ClusterFeature()
Manikumar1998 Jul 2, 2017
0dd1078
Update 'note' qualifier in ClusterFeature()
Manikumar1998 Jul 2, 2017
762b6c2
Modify set_location() to take location values instead of lists
Manikumar1998 Jul 3, 2017
95eb608
Update set_location() to accept lists as CompoundLocation()
Manikumar1998 Jul 5, 2017
8d3e80e
Add structure, probability in ClusterFeature and EC_number in CDSFeature
Manikumar1998 Jul 6, 2017
d79bf23
Avoid None data in SeqRecord
Manikumar1998 Jul 6, 2017
3344e1b
Add group_cluster_cds() to Record
Manikumar1998 Jul 7, 2017
616e8e2
Replace set_location() with location property. Add find_cluster_pos()…
Manikumar1998 Jul 11, 2017
7b55371
Update record to integrate with antiSMASH
Manikumar1998 Aug 14, 2017
25e6292
Modify tests to use unittest
Manikumar1998 Aug 14, 2017
10063aa
Add tests for CDS_motif, aSDomain, PFAM_domain
Manikumar1998 Aug 14, 2017
c7b4b80
Modify test_add_feature() to test CDS_motif(), aSDomain() and PFAM_do…
Manikumar1998 Aug 15, 2017
6f0d3dd
Add test_cluster_cds_links()
Manikumar1998 Aug 15, 2017
c8ede0d
Replace __str__() with str()
Manikumar1998 Aug 15, 2017
b7829ba
Add unittest module in test_cluster.py
Manikumar1998 Aug 15, 2017
c09c00f
Add test_ClusterFeature_members() to check the members of ClusterFeat…
Manikumar1998 Aug 15, 2017
c63285c
Bugs Fix in record.py
Manikumar1998 Aug 15, 2017
d551955
Add test_cds.py to tests
Manikumar1998 Aug 15, 2017
269b542
Add test_cds_motif.py
Manikumar1998 Aug 15, 2017
827eafa
Add test_domains.py to tests
Manikumar1998 Aug 15, 2017
7fa4917
Add test_generic.py to tests
Manikumar1998 Aug 15, 2017
5dd0a2f
Bug fix: Replace aSDomain_id member with asDomain_id
Manikumar1998 Aug 15, 2017
8c16c15
Modify secmet tests
Manikumar1998 Aug 15, 2017
e2c8c5d
Update docstrings, uniformise the code
Manikumar1998 Aug 15, 2017
cb17cf9
Bug fix: self.translation -> self.name
Manikumar1998 Aug 16, 2017
0d3ab35
Add more tests to test_generic.py
Manikumar1998 Aug 16, 2017
b9a6c01
Refine secmet
Manikumar1998 Aug 16, 2017
4c08f64
Add more test to test_record.py
Manikumar1998 Aug 16, 2017
215c7be
Add more tests to test_cds.py
Manikumar1998 Aug 16, 2017
f31ab82
Add more tests to test_cds_motif.py
Manikumar1998 Aug 16, 2017
df98d56
Add more tests to test_cluster.py
Manikumar1998 Aug 16, 2017
a6fb2a5
Add more tests to test_domains.py
Manikumar1998 Aug 16, 2017
a9c936f
Add _map_sec_met_list_to_SecMetQualifier() and SecMetResult()
Manikumar1998 Aug 18, 2017
493ff04
test_cds: Modify tests to test sec_met and SecMetResult()
Manikumar1998 Aug 18, 2017
de626d3
Modify tests to use looping value for asserting
Manikumar1998 Aug 18, 2017
6150240
add_qualifier(): Check for the qualifier values for int or float and …
Manikumar1998 Aug 19, 2017
7bd9442
Replace binary searches with python-inbuilt bisect_left method
Manikumar1998 Aug 20, 2017
5d746c5
from_biopython(): Remove updating cluster and cds links(There is no m…
Manikumar1998 Aug 20, 2017
01799f1
Add code to update cluster and cds links for testing purpose
Manikumar1998 Aug 20, 2017
360052a
Replace set_*() methods with erase_*() methods
Manikumar1998 Aug 21, 2017
4e74f1d
Replace mutable lists to immutable tuples for returning features
Manikumar1998 Aug 21, 2017
09143d2
Add SubCDSFeature super class for CDS_motifFeature, aSDomain, PFAM_do…
Manikumar1998 Aug 21, 2017
da69795
Update secmet tests
Manikumar1998 Aug 21, 2017
384595f
Refine secmet, Avoid explicit checking for None
Manikumar1998 Aug 21, 2017
5978267
Refine members initialization in ClusterFeature
Manikumar1998 Aug 22, 2017
b9d196d
SecMetResult: Check for valid float bitscore and evalue before assigning
Manikumar1998 Aug 22, 2017
3151f17
Modify tests for failure cases using assertRaises() of unittest
Manikumar1998 Aug 22, 2017
647fc12
Use a copy of feature qualifiers instead of original qualifiers dict
Manikumar1998 Aug 22, 2017
32656a7
Refine secmet
Manikumar1998 Aug 22, 2017
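Commit 7bd9442 above swaps hand-written binary searches for the standard library's `bisect_left`. The `secmet/record.py` diff is not rendered on this page, so the following is only a minimal sketch of the idea, assuming features are kept sorted by start coordinate; `find_insert_index` and `cluster_starts` are hypothetical names, not taken from the PR.

```python
from bisect import bisect_left


def find_insert_index(starts, new_start):
    """Return the index at which new_start can be inserted so that
    the sorted list `starts` stays sorted (hypothetical helper)."""
    return bisect_left(starts, new_start)


# Hypothetical start coordinates of already-stored clusters, in sorted order.
cluster_starts = [100, 500, 900]

# Insert a new cluster start without re-sorting the whole list.
idx = find_insert_index(cluster_starts, 600)
cluster_starts.insert(idx, 600)
```

`bisect_left` returns the leftmost valid position, so a new start equal to an existing one lands before it.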
1,076 changes: 1,053 additions & 23 deletions secmet/record.py

Large diffs are not rendered by default.

630 changes: 630 additions & 0 deletions tests/data/HM219853.1.final.gbk

Large diffs are not rendered by default.

539 changes: 539 additions & 0 deletions tests/data/HM219853.1.final.minimal.gbk

Large diffs are not rendered by default.

6,887 changes: 6,887 additions & 0 deletions tests/data/Y16952.3.final.gbk

Large diffs are not rendered by default.

2,270 changes: 2,270 additions & 0 deletions tests/data/Y16952.3.final.minimal.gbk

Large diffs are not rendered by default.

2,033 changes: 2,033 additions & 0 deletions tests/data/balh.embl

Large diffs are not rendered by default.

157 changes: 157 additions & 0 deletions tests/test_cds.py
@@ -0,0 +1,157 @@
from os import path
import unittest
from Bio import SeqIO
from Bio.SeqFeature import FeatureLocation, SeqFeature
from secmet.record import Record, CDSFeature, SecMetQualifier, SecMetResult

filename = 'Y16952.3.final.gbk'
filetype = 'genbank'

class FakeResult(object):
    """A FakeResult to test SecMetResult"""
    def __init__(self):
        """Initialise members with fake values"""
        self.query_id = 'fake_id'
        self.evalue = '10000'
        self.bitscore = '10000'

class TestCDSFeature(unittest.TestCase):
    def get_testfile(self):
        """File path for testing"""
        return path.join(path.dirname(__file__), 'data', filename)

    def BioFeature(self):
        """Build a Biopython SeqFeature carrying fake CDS qualifiers"""
        biofeature = SeqFeature(location=FeatureLocation(10, 100))
        # Each qualifier key appears once; a repeated dict key would silently override the earlier value
        biofeature.qualifiers = {
            'locus_tag': ['fake_locus_tag'], 'translation': ['fake_translation'],
            'gene': ['fake_gene'], 'product': ['fake_product'], 'protein_id': ['fake_protein_id'],
            'transl_table': ['fake_transl_table'], 'source': ['fake_source'], 'db_xref': ['fake_db_xref'],
            'EC_number': ['fake_EC_number'], 'note': ['fake_notes'], 'aSProdPred': ['fake_aSProdPred'],
            'aSASF_note': ['fake_aSASF_note'], 'aSASF_choice': ['fake_aSASF_choice'],
            'aSASF_scaffold': ['fake_aSASF_scaffold'], 'aSASF_prediction': ['fake_aSASF_prediction'],
            'sec_met_predictions': ['fake_sec_met_predictions'], 'unknown_qualifier': ['fake_qualifier'],
        }
        return biofeature

    def test_CDSFeature_members(self):
        """Test the members of CDSFeature"""
        testfile = self.get_testfile()
        rec = Record.from_file(testfile)
        bp_rec = SeqIO.read(testfile, filetype)
        bp_cdss = [i for i in bp_rec.features if i.type == 'CDS']
        mod_cdss = rec.get_CDSs()
        self.assertEqual(len(bp_cdss), len(mod_cdss))
        # Separate out the qualifiers that are stored in list form
        qualifiers_as_list = ['note', 'aSProdPred', 'aSASF_choice', 'aSASF_prediction', 'db_xref',
                              'aSASF_note', 'aSASF_scaffold', 'sec_met_predictions', 'EC_number']
        for bp_cds, mod_cds in zip(bp_cdss, mod_cdss):
            for key, value in bp_cds.qualifiers.items():
                if value:
                    if key not in qualifiers_as_list:
                        if key != 'sec_met':  # antiSMASH erases all sec_met qualifiers anyway
                            if not hasattr(mod_cds, key):
                                if key not in mod_cds._qualifiers:
                                    raise AttributeError('%s is not a member of CDSFeature' % key)
                                else:
                                    self.assertEqual(value, mod_cds._qualifiers[key])
                            else:
                                self.assertEqual(str(value[0]), str(getattr(mod_cds, key)))
                        else:
                            self.assertEqual(value, mod_cds.sec_met.as_list())
                    else:
                        if key == 'note':
                            # 'note' is exposed as 'notes' in secmet
                            self.assertEqual(value, mod_cds.notes)
                        else:
                            self.assertEqual(value, getattr(mod_cds, key))

    def test_BioFeature_to_CDSFeature(self):
        """Check that a Biopython SeqFeature converts to a CDSFeature and back"""
        biofeature = self.BioFeature()
        cds_feature = CDSFeature(feature=biofeature)
        self.assertEqual(str(cds_feature.location), str(FeatureLocation(10, 100)))
        self.assertEqual(cds_feature.type, 'CDS')
        self.assertEqual(cds_feature.locus_tag, 'fake_locus_tag')
        self.assertEqual(cds_feature.translation, 'fake_translation')
        self.assertEqual(cds_feature.gene, 'fake_gene')
        self.assertEqual(cds_feature.product, 'fake_product')
        self.assertEqual(cds_feature.protein_id, 'fake_protein_id')
        self.assertEqual(cds_feature.transl_table, 'fake_transl_table')
        self.assertEqual(cds_feature.source, 'fake_source')
        self.assertEqual(cds_feature.EC_number, ['fake_EC_number'])
        self.assertEqual(cds_feature.notes, ['fake_notes'])
        self.assertEqual(cds_feature.db_xref, ['fake_db_xref'])
        self.assertEqual(cds_feature.aSProdPred, ['fake_aSProdPred'])
        self.assertEqual(cds_feature.aSASF_note, ['fake_aSASF_note'])
        self.assertEqual(cds_feature.aSASF_scaffold, ['fake_aSASF_scaffold'])
        self.assertEqual(cds_feature.aSASF_choice, ['fake_aSASF_choice'])
        self.assertEqual(cds_feature.aSASF_prediction, ['fake_aSASF_prediction'])
        self.assertEqual(cds_feature.sec_met_predictions, ['fake_sec_met_predictions'])
        self.assertEqual(cds_feature._qualifiers['unknown_qualifier'], ['fake_qualifier'])
        self.assertEqual(repr(cds_feature), repr(cds_feature.to_biopython()[0]))
        self.assertIsInstance(cds_feature.to_biopython()[0], SeqFeature)
        self.assertIsInstance(cds_feature.sec_met, SecMetQualifier)

    def test_SecMetQualifier(self):
        """Test SecMetQualifier"""
        cds = CDSFeature(FeatureLocation(1, 10))
        self.assertEqual(None, cds.sec_met.clustertype)
        self.assertEqual(None, cds.sec_met.domains)
        self.assertEqual(None, cds.sec_met.kind)
        self.assertEqual(0, len(cds.sec_met))
        self.assertEqual([], cds.sec_met)
        self.assertEqual([], cds.sec_met.nrpspks)
        self.assertEqual([], cds.sec_met.asf_predictions)
        self.assertEqual([], cds.sec_met.as_list())

        cds.sec_met.clustertype = "FAKE"
        cds.sec_met.domains = ["FAKE_DOMAIN1", "FAKE_DOMAIN2"]
        cds.sec_met.kind = "FAKE"
        cds.sec_met.nrpspks = ["FAKE_NRPS/PKS Domain: "]
        cds.sec_met.asf_predictions = ['FAKE_ASF_predictions: ']
        self.assertEqual("FAKE", cds.sec_met.clustertype)
        self.assertEqual(["FAKE_DOMAIN1", "FAKE_DOMAIN2"], cds.sec_met.domains)
        self.assertEqual("FAKE", cds.sec_met.kind)
        self.assertEqual(5, len(cds.sec_met))
        expected_sec_met = ['Type: FAKE', 'Domains detected: FAKE_DOMAIN1; FAKE_DOMAIN2', 'Kind: FAKE',
                            'FAKE_NRPS/PKS Domain: ', 'FAKE_ASF_predictions: ']
        self.assertEqual(expected_sec_met, cds.sec_met.as_list())
        self.assertEqual(["FAKE_NRPS/PKS Domain: "], cds.sec_met.nrpspks)
        self.assertEqual(['FAKE_ASF_predictions: '], cds.sec_met.asf_predictions)
        self.assertEqual(str(cds.sec_met), repr(cds.sec_met))
        with self.assertRaises(TypeError):
            # sec_met should be an instance of SecMetQualifier
            cds.sec_met = []
            cds.to_biopython()

        # Test the failure cases in SecMetQualifier
        with self.assertRaises(TypeError):
            # clustertype should be a string
            SecMetQualifier(clustertype=1)
        with self.assertRaises(TypeError):
            # domains should be a list
            SecMetQualifier(domains='invalid_domains_type')
        with self.assertRaises(TypeError):
            # kind should be a string
            SecMetQualifier(kind=1)

    def test_SecMetResult(self):
        """Test the SecMetResult class"""
        empty_result = SecMetResult()
        self.assertEqual(None, empty_result.query_id)
        self.assertEqual(None, empty_result.evalue)
        self.assertEqual(None, empty_result.bitscore)
        self.assertEqual(None, empty_result.nseeds)
        result = SecMetResult(FakeResult(), "fake_seeds")
        self.assertEqual('fake_id', result.query_id)
        self.assertEqual(10000.0, result.evalue)
        self.assertEqual(10000.0, result.bitscore)
        self.assertEqual('fake_seeds', result.nseeds)

        expected = "fake_id (E-value: 10000.0, bitscore: 10000.0, seeds: fake_seeds)"
        # assertEqual treats a third positional argument as the failure message,
        # so repr() and str() are each checked in their own assertion
        self.assertEqual(expected, repr(result))
        self.assertEqual(expected, str(result))
        # Test the failure cases in SecMetResult
        result = SecMetResult()
        with self.assertRaises(ValueError):
            # evalue should be a float
            result.evalue = 'invalid_evalue'
        with self.assertRaises(ValueError):
            # bitscore should be a float
            result.bitscore = 'invalid_bitscore'
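The failure cases at the end of test_SecMetResult expect a `ValueError` when a non-numeric string is assigned to `evalue` or `bitscore` (see also commit b9d196d). The actual implementation lives in `secmet/record.py`, which is not rendered here; the block below is only a sketch of one way such a check could work, using a property setter that converts through `float()`. `ResultSketch` is a hypothetical class, not the PR's `SecMetResult`.

```python
class ResultSketch(object):
    """Hypothetical stand-in showing float validation in a property setter."""

    def __init__(self, evalue=None):
        self._evalue = None
        if evalue is not None:
            # Route through the setter so construction is validated too
            self.evalue = evalue

    @property
    def evalue(self):
        return self._evalue

    @evalue.setter
    def evalue(self, value):
        # float() raises ValueError for strings such as 'invalid_evalue',
        # so the invalid value is never stored
        self._evalue = float(value)


result = ResultSketch('10000')
try:
    result.evalue = 'invalid_evalue'
    rejected = False
except ValueError:
    rejected = True
```

Because the conversion happens before the assignment to `_evalue`, a rejected value leaves the previously stored number untouched.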
87 changes: 87 additions & 0 deletions tests/test_cds_motif.py
@@ -0,0 +1,87 @@
from os import path
import unittest
from Bio import SeqIO
from Bio.SeqFeature import SeqFeature, FeatureLocation
from secmet.record import Record, CDS_motifFeature

filename = 'Y16952.3.final.gbk'
filetype = 'genbank'

class TestCDS_motifFeature(unittest.TestCase):
    def get_testfile(self):
        """File path for testing"""
        return path.join(path.dirname(__file__), 'data', filename)

    def BioFeature(self):
        """Build a Biopython SeqFeature carrying fake CDS_motif qualifiers"""
        biofeature = SeqFeature(location=FeatureLocation(10, 100))
        # Each qualifier key appears once; a repeated dict key would silently override the earlier value
        biofeature.qualifiers = {
            'locus_tag': ['fake_locus_tag'], 'translation': ['fake_translation'], 'aSTool': ['fake_aSTool'],
            'motif': ['fake_motif'], 'asDomain_id': ['fake_asDomain_id'], 'detection': ['fake_detection'],
            'database': ['fake_database'], 'label': ['fake_label'], 'unknown_qualifier': ['fake_qualifier'],
            'note': ['fake_notes'], 'aSProdPred': ['fake_aSProdPred'],
            'aSASF_note': ['fake_aSASF_note'], 'aSASF_choice': ['fake_aSASF_choice'],
            'aSASF_scaffold': ['fake_aSASF_scaffold'], 'aSASF_prediction': ['fake_aSASF_prediction'],
        }
        return biofeature

    def test_CDS_motifFeature_members(self):
        """Check if all the qualifiers are properly stored in CDS_motifFeature"""
        testfile = self.get_testfile()
        rec = Record.from_file(testfile)
        bp_rec = SeqIO.read(testfile, filetype)
        bp_cds_motifs = [i for i in bp_rec.features if i.type == 'CDS_motif']
        mod_cds_motifs = rec.get_CDS_motifs()
        # Separate out the qualifiers that are stored in list form
        qualifiers_as_list = ['note', 'aSASF_choice', 'aSASF_note', 'aSASF_scaffold',
                              'aSASF_prediction', 'aSProdPred']
        for bp_motif, mod_motif in zip(bp_cds_motifs, mod_cds_motifs):
            for key, value in bp_motif.qualifiers.items():
                if value:
                    if key not in qualifiers_as_list:
                        if not hasattr(mod_motif, key):
                            raise AttributeError("%s is not a member of CDS_motifFeature" % key)
                        # score and evalue are numbers
                        if key in ['score', 'evalue']:
                            self.assertEqual(float(value[0]), float(getattr(mod_motif, key)))
                        else:
                            self.assertEqual(str(value[0]), str(getattr(mod_motif, key)))
                    else:
                        if key == 'note':
                            # 'note' is exposed as 'notes' in secmet
                            self.assertEqual(value, mod_motif.notes)
                        else:
                            self.assertEqual(value, getattr(mod_motif, key))
        cdsmotif = CDS_motifFeature(FeatureLocation(1, 10))
        # score and evalue should be numbers
        with self.assertRaises(ValueError):
            cdsmotif.score = '-a50'
        with self.assertRaises(ValueError):
            cdsmotif.evalue = 'a5.50E-08'

        # Adding valid qualifiers and values should not raise an error
        try:
            cdsmotif.score = '-50'
            cdsmotif.evalue = '5.50E-08'
        except Exception:
            # A bare except would also swallow KeyboardInterrupt and SystemExit
            raise RuntimeError('Secmet unable to add valid qualifiers')

    def test_BioFeature_to_CDS_motifFeature(self):
        """Check that a Biopython SeqFeature converts to a CDS_motifFeature and back"""
        biofeature = self.BioFeature()
        cdsmotif_feature = CDS_motifFeature(feature=biofeature)
        self.assertEqual(str(cdsmotif_feature.location), str(FeatureLocation(10, 100)))
        self.assertEqual(cdsmotif_feature.type, 'CDS_motif')
        self.assertEqual(cdsmotif_feature.locus_tag, 'fake_locus_tag')
        self.assertEqual(cdsmotif_feature.translation, 'fake_translation')
        self.assertEqual(cdsmotif_feature.label, 'fake_label')
        self.assertEqual(cdsmotif_feature.aSTool, 'fake_aSTool')
        self.assertEqual(cdsmotif_feature.detection, 'fake_detection')
        self.assertEqual(cdsmotif_feature.database, 'fake_database')
        self.assertEqual(cdsmotif_feature.asDomain_id, 'fake_asDomain_id')
        self.assertEqual(cdsmotif_feature.motif, 'fake_motif')
        self.assertEqual(cdsmotif_feature.notes, ['fake_notes'])
        self.assertEqual(cdsmotif_feature.aSProdPred, ['fake_aSProdPred'])
        self.assertEqual(cdsmotif_feature.aSASF_note, ['fake_aSASF_note'])
        self.assertEqual(cdsmotif_feature.aSASF_scaffold, ['fake_aSASF_scaffold'])
        self.assertEqual(cdsmotif_feature.aSASF_choice, ['fake_aSASF_choice'])
        self.assertEqual(cdsmotif_feature.aSASF_prediction, ['fake_aSASF_prediction'])
        self.assertEqual(cdsmotif_feature._qualifiers['unknown_qualifier'], ['fake_qualifier'])
        self.assertEqual(repr(cdsmotif_feature), repr(cdsmotif_feature.to_biopython()[0]))
        self.assertIsInstance(cdsmotif_feature.to_biopython()[0], SeqFeature)