Skip to content

Sourcery refactored main branch#55

Open
sourcery-ai[bot] wants to merge 1 commit intomainfrom
sourcery/main
Open

Sourcery refactored main branch#55
sourcery-ai[bot] wants to merge 1 commit intomainfrom
sourcery/main

Conversation

@sourcery-ai
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot commented Dec 4, 2023

Branch main refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin:

Review changes via command line

To manually merge these changes, make sure you're on the main branch, then run:

git fetch origin sourcery/main
git merge --ff-only FETCH_HEAD
git reset HEAD^

Help us improve this pull request!

@sourcery-ai sourcery-ai Bot requested a review from ben-silke December 4, 2023 22:58
Comment on lines -39 to +43
directory = "testing/data/"
location = "output/genemark/"

gff_ext = ".genemark.txt"

gff_file = directory + location + id + "/" + id + gff_ext
gff_file = f"testing/data/{location}{id}/{id}{gff_ext}"
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function run_file refactored with the following changes:

This removes the following comments ( why? ):

# rows.append(metadata)

Comment on lines -144 to +134
output = []
output.append(["id", "correct", "total", "gene_count"])
output = [["id", "correct", "total", "gene_count"]]
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function run_files refactored with the following changes:

Comment on lines -164 to +159
if not type(files) == list:
if type(files) != list:
files = [files]

all_file_count = len(files)
print(files)

output = []
output.append(["id", "correct", "total", "gene_count"])
output = [["id", "correct", "total", "gene_count"]]
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function run refactored with the following changes:

Comment on lines -39 to +43
directory = "testing/data/"
location = "output/genemark/"

gff_ext = ".genemark.txt"

gff_file = directory + location + id + "/" + id + gff_ext
gff_file = f"testing/data/{location}{id}/{id}{gff_ext}"
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function run_file refactored with the following changes:

This removes the following comments ( why? ):

# rows.append(metadata)

Comment on lines -145 to +133
output = []
output.append(["id", "correct", "total", "gene_count"])
output = [["id", "correct", "total", "gene_count"]]
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function run_files refactored with the following changes:

Comment on lines -44 to +45
for ref in GeneDatabaseReference.objects.filter(gene=gene):
references.append(ref)

for ref in gene.cds.database_set:
references.append(ref)

references.extend(iter(GeneDatabaseReference.objects.filter(gene=gene)))
references.extend(iter(gene.cds.database_set))
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function get_gene refactored with the following changes:

Comment on lines -62 to +58
if not response.status_code == 200:
if response.status_code != 200:
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function get_interpro_info refactored with the following changes:

  • Simplify logical expression using De Morgan identities (de-morgan)

Comment on lines -71 to +72
base = "https://www.orthodb.org/search?"
query = f"query={id}&"
tags = "ncbi=1"
response = requests.get(base + query + tags)
if not response.status_code == 200:
return []
if results := response.json().get("data", []):
return results
else:
response = requests.get(f"https://www.orthodb.org/search?{query}{tags}")
if response.status_code != 200:
return []
return results if (results := response.json().get("data", [])) else []
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function get_ortho_db_set refactored with the following changes:

print("________________FAILS________________________")
print(fails)
file = output + "gene_population_failures.csv"
file = f"{output}gene_population_failures.csv"
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function main refactored with the following changes:

parser = MinimalGenbankParser(f)
for p in parser:
genes.append(p)
genes.extend(iter(parser))
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function parse_file refactored with the following changes:

Comment on lines -94 to +93
genes = parse_file(file)
if genes:
if genes := parse_file(file):
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function run refactored with the following changes:

Comment on lines -9 to +11
type = "search"
query = f"?query={query}"

response = requests.get(ORTHO_DB_ENDPOINT + type + query)
response = requests.get(f"{ORTHO_DB_ENDPOINT}search{query}")
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function get_orthoDB_id refactored with the following changes:

Comment on lines -22 to +23
type = "orthologs"
query = f"?id={id}"

response = requests.get(ORTHO_DB_ENDPOINT + type + query)
response = requests.get(f"{ORTHO_DB_ENDPOINT}orthologs{query}")
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function get_orthologs_from_orthoDB refactored with the following changes:

Comment on lines -80 to +86
if self.kegg_id:
return self.kegg_id

else:
if not self.kegg_id:
self.kegg_link()
if not self.kegg_id:
raise ValueError(
f"Database reference does not have an associated link in kegg db."
)
return self.kegg_id
if not self.kegg_id:
raise ValueError(
"Database reference does not have an associated link in kegg db."
)
return self.kegg_id
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function Gene.get_kegg_id refactored with the following changes:

Comment on lines -53 to -60
if not "NC_000964" in a:
if "NC_000964" not in a:
accs.append(a)
print(f"{i} -- {a=}")
print(f"{i} -- {accs=}")

raise ValueError(accs)
for ref in high_counts.keys():
get_sequence(ref)
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function get_references refactored with the following changes:

Comment on lines -207 to -211
i = 0
while i < len(db_string):
for i in range(0, len(db_string), 2):
database = db_string[i]
location = db_string[i + 1]
i += 2
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function create_database_objects refactored with the following changes:

Comment on lines -325 to +324
if y == "y" or y == "Y":
print(f"please enter a description for the {object_name}")
description = input()
print(f"are you happy with {description=} (y/n)")
y = input()
if y != "y" or y != "Y":
description = None
return name, description
else:
if y not in ["y", "Y"]:
return user_input_name_and_description(object_name)
print(f"please enter a description for the {object_name}")
description = input()
print(f"are you happy with {description=} (y/n)")
y = input()
return name, None
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function user_input_name_and_description refactored with the following changes:

for p in parser:
genes.append(p)

genes.extend(iter(parser))
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 20-22 refactored with the following changes:

Comment on lines -103 to +104
result = {}
type_, data = cogent.block_consolidator(lines)
result["type"] = type_
result = {"type": type_}
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function GenbankParser.parse_feature refactored with the following changes:

Comment on lines -50 to +60
rows = []
correct = 0
rows.append(['Gene name (genemark)', 'Annotated Gene Name', 'Start', 'End', 'Equal', 'Raw Location'])
rows = [
[
'Gene name (genemark)',
'Annotated Gene Name',
'Start',
'End',
'Equal',
'Raw Location',
]
]
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 50-52 refactored with the following changes:

Copy link
Copy Markdown
Author

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Type: Refactoring

Summary of PR: This PR introduces a series of refactoring changes aimed at improving code readability and conciseness. It includes the use of f-strings for string formatting, list comprehensions, and the walrus operator for inline assignments. The changes are spread across multiple files and functions within the codebase.

General PR suggestions

  • Verify that the Python version in the deployment environment supports the walrus operator (Python 3.8+), as it is used in several places in the refactored code.
  • Review the changes to ensure that the refactoring does not alter the intended logic, especially in cases where the original code's behavior may have been modified inadvertently.
  • Consider the readability of the code, particularly when using newer Python syntax like the walrus operator, which may not be familiar to all developers. Splitting dense lines could improve clarity.
  • Ensure that the refactoring does not introduce potential bugs, such as index errors when iterating over lists with steps or when making assumptions about list contents.
  • Revisit any changes that may have introduced regressions or altered the behavior of the code, such as the handling of self.kegg_id in the Gene model, and correct them to maintain the original functionality.

Your trial expires on December 18, 2023. Please email tim@sourcery.ai to continue using Sourcery ✨


m = annotated_genes.filter(Q(first_base=first) | Q(last_base=last))
if m:
if m := annotated_genes.filter(
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (llm): Using the walrus operator (:=) inside the if statement is a Python 3.8+ feature. Ensure that the deployment environment supports this Python version. If compatibility with earlier versions is required, consider reverting to the previous approach.

Suggested change
if m := annotated_genes.filter(
m = annotated_genes.filter(Q(first_base=first) | Q(last_base=last))
if m:

])

all_rows.extend(rows)
all_rows = [['correct', success, 'total', total], *rows]
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (llm): While using unpacking to extend lists is concise, it may not be immediately clear to all readers. Consider if the readability could be improved by using the more explicit all_rows.extend(rows) after initializing all_rows with the summary row.

Suggested change
all_rows = [['correct', success, 'total', total], *rows]
all_rows = [['correct', success, 'total', total]]
all_rows.extend(rows)

databases = []
i = 0
while i < len(db_string):
for i in range(0, len(db_string), 2):
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (llm): The use of a range with a step of 2 is a good way to iterate over pairs in the list. However, ensure that the list always contains an even number of elements to avoid an IndexError on location = db_string[i + 1].

response = requests.get(f"https://www.orthodb.org/search?{query}{tags}")
if response.status_code != 200:
return []
return results if (results := response.json().get("data", [])) else []
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (llm): The inline assignment and conditional expression make this line quite dense. Consider splitting it for clarity, especially since the walrus operator is a relatively new addition to Python and may not be familiar to all developers.

return self.kegg_id

else:
if not self.kegg_id:
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (llm): The refactored code has changed the logic. Previously, if self.kegg_id was not set, self.kegg_link() was called before checking self.kegg_id again. Now, if self.kegg_id is not set, it raises a ValueError immediately without attempting to set it via self.kegg_link(). This could lead to unintended behavior.

Suggested change
if not self.kegg_id:
if not self.kegg_id:
self.kegg_link()
if not self.kegg_id:
raise ValueError("Database reference does not have an associated link in kegg db.")
return self.kegg_id

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants