Sourcery refactored main branch by sourcery-ai[bot] · Pull Request #55 · ben-silke/biol3209

sourcery-ai · 2023-12-04T22:58:25Z

Branch main refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin:

Review changes via command line

To manually merge these changes, make sure you're on the main branch, then run:

git fetch origin sourcery/main
git merge --ff-only FETCH_HEAD
git reset HEAD^

Help us improve this pull request!

sourcery-ai · 2023-12-04T22:58:27Z

-    directory = "testing/data/"
    location = "output/genemark/"

    gff_ext = ".genemark.txt"

-    gff_file = directory + location + id + "/" + id + gff_ext
+    gff_file = f"testing/data/{location}{id}/{id}{gff_ext}"


Function run_file refactored with the following changes:

Move assignment closer to its usage within a block (move-assign-in-block)

Merge append into list declaration [×2] (merge-list-append)

Use named expression to simplify assignment and conditional (use-named-expression)

Merge extend into list declaration (merge-list-extend)

Use f-string instead of string concatenation [×5] (use-fstring-for-concatenation)

Inline variable that is only used once (inline-variable)

This removes the following comments ( why? ):

# rows.append(metadata)

sourcery-ai · 2023-12-04T22:58:27Z

-    output = []
-    output.append(["id", "correct", "total", "gene_count"])
+    output = [["id", "correct", "total", "gene_count"]]


Function run_files refactored with the following changes:

Merge append into list declaration (merge-list-append)

sourcery-ai · 2023-12-04T22:58:27Z

-    if not type(files) == list:
+    if type(files) != list:
        files = [files]

    all_file_count = len(files)
    print(files)

-    output = []
-    output.append(["id", "correct", "total", "gene_count"])
+    output = [["id", "correct", "total", "gene_count"]]


Function run refactored with the following changes:

Simplify logical expression using De Morgan identities (de-morgan)

Merge append into list declaration (merge-list-append)

sourcery-ai · 2023-12-04T22:58:27Z

-    directory = "testing/data/"
    location = "output/genemark/"

    gff_ext = ".genemark.txt"

-    gff_file = directory + location + id + "/" + id + gff_ext
+    gff_file = f"testing/data/{location}{id}/{id}{gff_ext}"


Function run_file refactored with the following changes:

Move assignment closer to its usage within a block (move-assign-in-block)

Merge append into list declaration [×2] (merge-list-append)

Use named expression to simplify assignment and conditional (use-named-expression)

Merge extend into list declaration (merge-list-extend)

Use f-string instead of string concatenation [×5] (use-fstring-for-concatenation)

Inline variable that is only used once (inline-variable)

This removes the following comments ( why? ):

# rows.append(metadata)

sourcery-ai · 2023-12-04T22:58:27Z

-    output = []
-    output.append(["id", "correct", "total", "gene_count"])
+    output = [["id", "correct", "total", "gene_count"]]


Function run_files refactored with the following changes:

Merge append into list declaration (merge-list-append)

sourcery-ai · 2023-12-04T22:58:28Z

-        for ref in GeneDatabaseReference.objects.filter(gene=gene):
-            references.append(ref)
-
-        for ref in gene.cds.database_set:
-            references.append(ref)
-
+        references.extend(iter(GeneDatabaseReference.objects.filter(gene=gene)))
+        references.extend(iter(gene.cds.database_set))


Function get_gene refactored with the following changes:

Replace a for append loop with list extend [×2] (for-append-to-extend)

Simplify generator expression [×2] (simplify-generator)

sourcery-ai · 2023-12-04T22:58:29Z

-    if not response.status_code == 200:
+    if response.status_code != 200:


Function get_interpro_info refactored with the following changes:

Simplify logical expression using De Morgan identities (de-morgan)

sourcery-ai · 2023-12-04T22:58:29Z

-    base = "https://www.orthodb.org/search?"
    query = f"query={id}&"
    tags = "ncbi=1"
-    response = requests.get(base + query + tags)
-    if not response.status_code == 200:
-        return []
-    if results := response.json().get("data", []):
-        return results
-    else:
+    response = requests.get(f"https://www.orthodb.org/search?{query}{tags}")
+    if response.status_code != 200:
        return []
+    return results if (results := response.json().get("data", [])) else []


Function get_ortho_db_set refactored with the following changes:

Move assignment closer to its usage within a block (move-assign-in-block)

Simplify logical expression using De Morgan identities (de-morgan)

Replace if statement with if expression (assign-if-exp)

Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)

Inline variable that is only used once (inline-variable)

sourcery-ai · 2023-12-04T22:58:29Z

    print("________________FAILS________________________")
    print(fails)
-    file = output + "gene_population_failures.csv"
+    file = f"{output}gene_population_failures.csv"


Function main refactored with the following changes:

Use f-string instead of string concatenation (use-fstring-for-concatenation)

sourcery-ai · 2023-12-04T22:58:29Z

            parser = MinimalGenbankParser(f)
-            for p in parser:
-                genes.append(p)
+            genes.extend(iter(parser))


Function parse_file refactored with the following changes:

Replace a for append loop with list extend (for-append-to-extend)

Simplify generator expression (simplify-generator)

sourcery-ai · 2023-12-04T22:58:31Z

-                genes = parse_file(file)
-                if genes:
+                if genes := parse_file(file):


Function run refactored with the following changes:

Use named expression to simplify assignment and conditional (use-named-expression)

sourcery-ai · 2023-12-04T22:58:31Z

-    type = "search"
    query = f"?query={query}"

-    response = requests.get(ORTHO_DB_ENDPOINT + type + query)
+    response = requests.get(f"{ORTHO_DB_ENDPOINT}search{query}")


Function get_orthoDB_id refactored with the following changes:

Move assignment closer to its usage within a block (move-assign-in-block)

Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)

Inline variable that is only used once (inline-variable)

sourcery-ai · 2023-12-04T22:58:32Z

-    type = "orthologs"
    query = f"?id={id}"

-    response = requests.get(ORTHO_DB_ENDPOINT + type + query)
+    response = requests.get(f"{ORTHO_DB_ENDPOINT}orthologs{query}")


Function get_orthologs_from_orthoDB refactored with the following changes:

Move assignment closer to its usage within a block (move-assign-in-block)

Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)

Inline variable that is only used once (inline-variable)

sourcery-ai · 2023-12-04T22:58:32Z

-        if self.kegg_id:
-            return self.kegg_id
-
-        else:
+        if not self.kegg_id:
            self.kegg_link()
-            if not self.kegg_id:
-                raise ValueError(
-                    f"Database reference does not have an associated link in kegg db."
-                )
-            return self.kegg_id
+        if not self.kegg_id:
+            raise ValueError(
+                "Database reference does not have an associated link in kegg db."
+            )
+        return self.kegg_id


Function Gene.get_kegg_id refactored with the following changes:

Swap if/else to remove empty if body (remove-pass-body)

Hoist repeated code outside conditional statement (hoist-statement-from-if)

Hoist conditional out of nested conditional (hoist-if-from-if)

Replace f-string with no interpolated values with string (remove-redundant-fstring)

sourcery-ai · 2023-12-04T22:58:32Z

-            if not "NC_000964" in a:
+            if "NC_000964" not in a:
                accs.append(a)
                print(f"{i} -- {a=}")
        print(f"{i} -- {accs=}")

    raise ValueError(accs)
-    for ref in high_counts.keys():
-        get_sequence(ref)


Function get_references refactored with the following changes:

Remove unreachable code (remove-unreachable-code)

Simplify logical expression using De Morgan identities (de-morgan)

sourcery-ai · 2023-12-04T22:58:32Z

-            i = 0
-            while i < len(db_string):
+            for i in range(0, len(db_string), 2):
                database = db_string[i]
                location = db_string[i + 1]
-                i += 2


Function create_database_objects refactored with the following changes:

Replace while with for (while-to-for)

Use named expression to simplify assignment and conditional (use-named-expression)

sourcery-ai · 2023-12-04T22:58:32Z

-    if y == "y" or y == "Y":
-        print(f"please enter a description for the {object_name}")
-        description = input()
-        print(f"are you happy with {description=} (y/n)")
-        y = input()
-        if y != "y" or y != "Y":
-            description = None
-        return name, description
-    else:
+    if y not in ["y", "Y"]:
        return user_input_name_and_description(object_name)
+    print(f"please enter a description for the {object_name}")
+    description = input()
+    print(f"are you happy with {description=} (y/n)")
+    y = input()
+    return name, None


Function user_input_name_and_description refactored with the following changes:

Swap if/else branches (swap-if-else-branches)

Remove unnecessary else after guard condition (remove-unnecessary-else)

Replace multiple comparisons of same variable with in operator (merge-comparisons)

Remove redundant conditional (remove-redundant-if)

Inline variable that is only used once (inline-variable)

sourcery-ai · 2023-12-04T22:58:32Z

-        for p in parser:
-            genes.append(p)
-
+        genes.extend(iter(parser))


Lines 20-22 refactored with the following changes:

Replace a for append loop with list extend (for-append-to-extend)

Simplify generator expression (simplify-generator)

sourcery-ai · 2023-12-04T22:58:33Z

-        result = {}
        type_, data = cogent.block_consolidator(lines)
-        result["type"] = type_
+        result = {"type": type_}


Function GenbankParser.parse_feature refactored with the following changes:

Move assignment closer to its usage within a block (move-assign-in-block)

Replace a conditional string slice with a call to str.removesuffix or str.removeprefix, where applicable (use-string-remove-affix)

Merge dictionary assignment with declaration (merge-dict-assign)

sourcery-ai · 2023-12-04T22:58:33Z

-rows = []
 correct = 0
-rows.append(['Gene name (genemark)', 'Annotated Gene Name', 'Start', 'End', 'Equal', 'Raw Location'])
+rows = [
+    [
+        'Gene name (genemark)',
+        'Annotated Gene Name',
+        'Start',
+        'End',
+        'Equal',
+        'Raw Location',
+    ]
+]


Lines 50-52 refactored with the following changes:

Move assignment closer to its usage within a block (move-assign-in-block)

Merge append into list declaration (merge-list-append)

sourcery-ai

PR Type: Refactoring

Summary of PR: This PR introduces a series of refactoring changes aimed at improving code readability and conciseness. It includes the use of f-strings for string formatting, list comprehensions, and the walrus operator for inline assignments. The changes are spread across multiple files and functions within the codebase.

General PR suggestions

Verify that the Python version in the deployment environment supports the walrus operator (Python 3.8+), as it is used in several places in the refactored code.
Review the changes to ensure that the refactoring does not alter the intended logic, especially in cases where the original code's behavior may have been modified inadvertently.
Consider the readability of the code, particularly when using newer Python syntax like the walrus operator, which may not be familiar to all developers. Splitting dense lines could improve clarity.
Ensure that the refactoring does not introduce potential bugs, such as index errors when iterating over lists with steps or when making assumptions about list contents.
Revisit any changes that may have introduced regressions or altered the behavior of the code, such as the handling of self.kegg_id in the Gene model, and correct them to maintain the original functionality.

Your trial expires on December 18, 2023. Please email tim@sourcery.ai to continue using Sourcery ✨

sourcery-ai · 2023-12-04T23:01:16Z


-        m = annotated_genes.filter(Q(first_base=first) | Q(last_base=last))
-        if m:
+        if m := annotated_genes.filter(


suggestion (llm): Using the walrus operator (:=) inside the if statement is a Python 3.8+ feature. Ensure that the deployment environment supports this Python version. If compatibility with earlier versions is required, consider reverting to the previous approach.

Suggested change

if m := annotated_genes.filter(

m = annotated_genes.filter(Q(first_base=first) | Q(last_base=last))

if m:

sourcery-ai · 2023-12-04T23:01:16Z

-    ])
-
-    all_rows.extend(rows)
+    all_rows = [['correct', success, 'total', total], *rows]


suggestion (llm): While using unpacking to extend lists is concise, it may not be immediately clear to all readers. Consider if the readability could be improved by using the more explicit all_rows.extend(rows) after initializing all_rows with the summary row.

Suggested change

all_rows = [['correct', success, 'total', total], *rows]

all_rows = [['correct', success, 'total', total]]

all_rows.extend(rows)

sourcery-ai · 2023-12-04T23:01:16Z

            databases = []
-            i = 0
-            while i < len(db_string):
+            for i in range(0, len(db_string), 2):


suggestion (llm): The use of a range with a step of 2 is a good way to iterate over pairs in the list. However, ensure that the list always contains an even number of elements to avoid an IndexError on location = db_string[i + 1].

sourcery-ai · 2023-12-04T23:01:16Z

+    response = requests.get(f"https://www.orthodb.org/search?{query}{tags}")
+    if response.status_code != 200:
        return []
+    return results if (results := response.json().get("data", [])) else []


suggestion (llm): The inline assignment and conditional expression make this line quite dense. Consider splitting it for clarity, especially since the walrus operator is a relatively new addition to Python and may not be familiar to all developers.

sourcery-ai · 2023-12-04T23:01:17Z

-            return self.kegg_id
-
-        else:
+        if not self.kegg_id:


issue (llm): The refactored code has changed the logic. Previously, if self.kegg_id was not set, self.kegg_link() was called before checking self.kegg_id again. Now, if self.kegg_id is not set, it raises a ValueError immediately without attempting to set it via self.kegg_link(). This could lead to unintended behavior.

Suggested change

if not self.kegg_id:

if not self.kegg_id:

self.kegg_link()

if not self.kegg_id:

raise ValueError("Database reference does not have an associated link in kegg db.")

return self.kegg_id

'Refactored by Sourcery'

1ad019b

sourcery-ai Bot requested a review from ben-silke December 4, 2023 22:58

sourcery-ai Bot commented Dec 4, 2023

View reviewed changes

		if not response.status_code == 200:
		if response.status_code != 200:

	if m := annotated_genes.filter(
	m = annotated_genes.filter(Q(first_base=first) \| Q(last_base=last))
	if m:

	all_rows = [['correct', success, 'total', total], *rows]
	all_rows = [['correct', success, 'total', total]]
	all_rows.extend(rows)

Conversation

sourcery-ai Bot commented Dec 4, 2023

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot Dec 4, 2023

Choose a reason for hiding this comment