186 changes: 156 additions & 30 deletions config_file_generator.py
@@ -2,33 +2,58 @@
import argparse
import requests
import json
import re
from pprint import pprint
import yaml
import xml.etree.ElementTree as ET

Check failure on line 6 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
The Python documentation recommends using `defusedxml` instead of `xml` because the native Python `xml` library is vulnerable to XML External Entity (XXE) attacks.

Check warning on line 6 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
Using xml.etree.ElementTree to parse untrusted XML data is known to be vulnerable to XML attacks. Replace xml.etree.ElementTree with the equivalent defusedxml package, or make sure defusedxml.defuse_stdlib() is called.

Check warning on line 6 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
Using xml.etree.ElementTree to parse untrusted XML data is known to be vulnerable to XML attacks. Replace xml.etree.ElementTree with the equivalent defusedxml package, or make sure defusedxml.defuse_stdlib() is called. (B405)

Check notice on line 6 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
camelcase 'xml.etree.ElementTree' imported as acronym 'ET' (N817)

ℹ️ Codacy found a minor CodeStyle issue: camelcase 'xml.etree.ElementTree' imported as acronym 'ET' (N817)

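The issue reported by the Prospector linter comes from the pep8-naming check N817, which flags a CamelCase name (here ElementTree) imported under an all-caps acronym alias (ET). The alias itself is not camel case; the check objects to the mismatch between the imported name and its alias.

One option is a lowercase alias, although pep8-naming may then report N813 (camelcase imported as lowercase) instead, and every ET. reference in the file would need the same rename. Because ET is the alias used throughout the Python documentation, suppressing N817 for this import is a common alternative. The lowercase variant would be: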

Suggested change
import xml.etree.ElementTree as ET
import xml.etree.ElementTree as et

This comment was generated by an experimental AI tool.

⚠️ Codacy found a medium Security issue: Using xml.etree.ElementTree to parse untrusted XML data is known to be vulnerable to XML attacks. Replace xml.etree.ElementTree with the equivalent defusedxml package, or make sure defusedxml.defuse_stdlib() is called. (B405)

The issue identified by the Prospector linter relates to the use of the xml.etree.ElementTree module for parsing XML data. This module is known to be vulnerable to various XML attacks, such as XML External Entity (XXE) attacks, when processing untrusted XML input. To mitigate this risk, it is recommended to use the defusedxml package, which provides a safer alternative for parsing XML by disabling potentially dangerous features.

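To address this security concern, you can import the defusedxml counterpart of the module. Import the module itself rather than its ElementTree class, so that the ET alias keeps its module-level functions. Note that defusedxml.ElementTree appears to provide only parsing helpers such as parse and fromstring, so the ET.Element and ET.SubElement calls in generateFileForCheckstyle would still need xml.etree.ElementTree:

Suggested change
import xml.etree.ElementTree as ET
import defusedxml.ElementTree as ET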

This comment was generated by an experimental AI tool.

Codacy found a critical Security issue: The Python documentation recommends using defusedxml instead of xml because the native Python xml library is vulnerable to XML External Entity (XXE) attacks.

The issue identified by the Semgrep linter pertains to the use of the xml.etree.ElementTree module, which is part of the standard Python library for parsing XML. This library is known to be vulnerable to XML External Entity (XXE) attacks, where an attacker can exploit the XML parser to access sensitive files on the server or perform other malicious actions. To mitigate this risk, it is recommended to use the defusedxml library, which provides a safer alternative for parsing XML and is specifically designed to prevent such vulnerabilities.

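To fix the issue, you should route XML parsing through the defusedxml library. Import the module itself rather than its ElementTree class, so that the ET alias keeps its module-level functions. Note that defusedxml.ElementTree appears to provide only parsing helpers, so code that builds XML with ET.Element or ET.SubElement, as generateFileForCheckstyle does, would still need xml.etree.ElementTree. Here’s the suggested code change:

Suggested change
import xml.etree.ElementTree as ET
import defusedxml.ElementTree as ET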

This comment was generated by an experimental AI tool.

⚠️ Codacy found a medium Security issue: Using xml.etree.ElementTree to parse untrusted XML data is known to be vulnerable to XML attacks. Replace xml.etree.ElementTree with the equivalent defusedxml package, or make sure defusedxml.defuse_stdlib() is called.

The issue identified by the Bandit linter is related to the use of the xml.etree.ElementTree module for parsing XML data. This module is vulnerable to various XML attacks, such as XML External Entity (XXE) attacks, when parsing untrusted XML input. To mitigate this vulnerability, it is recommended to use the defusedxml package, which is designed to safely parse XML and protect against these types of attacks.

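To fix the issue, you can replace the import statement for xml.etree.ElementTree with the defusedxml.ElementTree module. This swap only covers parsing: defusedxml.ElementTree appears to expose parse and fromstring but not Element or SubElement, which generateFileForCheckstyle uses to build its output. Here’s the single line change: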

Suggested change
import xml.etree.ElementTree as ET
import defusedxml.ElementTree as ET

This comment was generated by an experimental AI tool.

from xml.dom import minidom

Check failure on line 7 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
The Python documentation recommends using `defusedxml` instead of `xml` because the native Python `xml` library is vulnerable to XML External Entity (XXE) attacks.

Check warning on line 7 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
Using minidom to parse untrusted XML data is known to be vulnerable to XML attacks. Replace minidom with the equivalent defusedxml package, or make sure defusedxml.defuse_stdlib() is called.

Check warning on line 7 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
Using minidom to parse untrusted XML data is known to be vulnerable to XML attacks. Replace minidom with the equivalent defusedxml package, or make sure defusedxml.defuse_stdlib() is called. (B408)

⚠️ Codacy found a medium Security issue: Using minidom to parse untrusted XML data is known to be vulnerable to XML attacks. Replace minidom with the equivalent defusedxml package, or make sure defusedxml.defuse_stdlib() is called. (B408)

The issue identified by the Prospector linter is related to the use of the minidom module from the standard library for parsing XML data. The minidom parser is vulnerable to various XML attacks, such as XML External Entity (XXE) attacks, when processing untrusted XML input. This can lead to security vulnerabilities, including unauthorized access to files or services.

To mitigate this risk, the recommended approach is to use the defusedxml package, which is designed to provide secure XML parsing by preventing these types of attacks. The defusedxml package includes a function defuse_stdlib() that can be called to make the standard XML parsers safe to use.

To address this, we can import minidom from defusedxml instead of from the standard library. The script calls minidom.parseString(...).toprettyxml(...), and defusedxml.minidom offers the same parseString entry point, so the existing call site keeps working. Here’s the single line change to fix the issue:

Suggested change
from xml.dom import minidom
from defusedxml import minidom

This comment was generated by an experimental AI tool.

Codacy found a critical Security issue: The Python documentation recommends using defusedxml instead of xml because the native Python xml library is vulnerable to XML External Entity (XXE) attacks.

The issue identified by the Semgrep linter pertains to the potential vulnerability of the native Python xml library to XML External Entity (XXE) attacks. These attacks can occur when an XML parser processes external entities, allowing an attacker to access sensitive files or perform other malicious actions. The recommendation is to use the defusedxml library, which is designed to mitigate these vulnerabilities.

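To fix the issue, you should replace the import statement for the xml.dom.minidom module with the equivalent import from the defusedxml library. With the import below, the minidom.parseString(rough_string) call in generateFileForCheckstyle must also become parseString(rough_string). Here’s the suggested code change: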

Suggested change
from xml.dom import minidom
from defusedxml.minidom import parseString

This comment was generated by an experimental AI tool.

⚠️ Codacy found a medium Security issue: Using minidom to parse untrusted XML data is known to be vulnerable to XML attacks. Replace minidom with the equivalent defusedxml package, or make sure defusedxml.defuse_stdlib() is called.

The issue identified by the Bandit linter is related to the use of the minidom module from the standard library for parsing XML. This module is vulnerable to various XML attacks, such as XML External Entity (XXE) attacks, when processing untrusted XML data. To mitigate this risk, it is recommended to use the defusedxml package, which provides a safe way to handle XML parsing.

To fix the issue, you can replace the import of minidom with its defusedxml counterpart. defusedxml.ElementTree would not be a substitute here, because the script relies on minidom.parseString and the toprettyxml method of the returned document.

Here's the suggested single line change:

Suggested change
from xml.dom import minidom
from defusedxml import minidom

This comment was generated by an experimental AI tool.
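
The drop-in minidom swap discussed in the comments above can be sketched as follows. This is a minimal sketch, assuming the third-party defusedxml package is installed; the fallback import exists only so the sketch still runs without it, and the prettify helper name is hypothetical rather than part of the PR.

```python
import xml.etree.ElementTree as ET  # used only to build the tree, never to parse input

try:
    # Assumption: defusedxml is installed (pip install defusedxml).
    from defusedxml import minidom
except ImportError:
    # Fallback so this sketch still runs; this is not a hardened parser.
    from xml.dom import minidom


def prettify(root):
    """Return an indented XML string for an ElementTree element (hypothetical helper)."""
    rough_string = ET.tostring(root, encoding="unicode")
    return minidom.parseString(rough_string).toprettyxml(indent="  ")


checker = ET.Element("module", name="Checker")
ET.SubElement(checker, "module", name="TreeWalker")
print(prettify(checker))
```

Keeping xml.etree.ElementTree for construction and using defusedxml only at the parse step leaves the rest of generateFileForCheckstyle unchanged.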

import sys

def get_repositories(baseurl, provider, organization, token):
    headers = {
        'Accept': 'application/json',
        'api-token': token
    }
    url = f'{baseurl}/api/v3/organizations/{provider}/{organization}/repositories'
    r = requests.get(url, headers=headers)

Check warning on line 16 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
Requests call without timeout

Check warning on line 16 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
The application was found using the `requests` module without configuring a timeout value for connections.

⚠️ Codacy found a medium Security issue: The application was found using the requests module without configuring a timeout value for connections.

The issue identified by the Semgrep linter is that the requests.get() call does not specify a timeout value. Not configuring a timeout can lead to situations where the application hangs indefinitely if the server does not respond, which can be a significant security and stability risk.

To address this issue, you should specify a timeout value in the requests.get() call. A common practice is to set a reasonable timeout (e.g., 5 seconds), which allows the request to fail gracefully if the server is unresponsive.

Here is the code suggestion to fix the issue:

Suggested change
r = requests.get(url, headers=headers)
r = requests.get(url, headers=headers, timeout=5)

This comment was generated by an experimental AI tool.

⚠️ Codacy found a medium Security issue: Requests call without timeout

The issue identified by the Bandit linter is that the requests.get call does not specify a timeout. Without a timeout, the request could hang indefinitely if the server does not respond, leading to potential denial of service or unresponsive behavior in your application. It's best practice to always set a timeout for network requests to ensure that your application can handle such situations gracefully.

To fix this issue, you can add a timeout parameter to the requests.get call. Here’s the code suggestion to implement this:

Suggested change
r = requests.get(url, headers=headers)
r = requests.get(url, headers=headers, timeout=10)

In this example, a timeout of 10 seconds is specified, but you can adjust this value as needed based on your application's requirements.


This comment was generated by an experimental AI tool.
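
Both suggestions above come down to passing a timeout on every call. A minimal sketch of one way to centralize that, assuming a hypothetical fetch_json helper (not part of the PR) and illustrative timeout values; the session parameter is an assumption added so the helper can be exercised without network access:

```python
def fetch_json(url, headers, session=None, timeout=(3.05, 10)):
    """GET `url` and return the decoded JSON body (hypothetical helper).

    `timeout` is a (connect, read) pair in seconds; the values are illustrative.
    """
    if session is None:
        # Third-party dependency, imported lazily so a caller can inject its own session.
        import requests

        session = requests.Session()
    response = session.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()
    return response.json()
```

When the timeout expires, requests raises requests.exceptions.Timeout, a subclass of RequestException, so the except requests.exceptions.RequestException blocks in main() would already handle it. The requests.get calls in list_tools and getCodePatternsForTool in this diff also omit a timeout and could go through the same helper.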

    r.raise_for_status()
    repositories = r.json()
    if len(repositories['data']) == 0:
        raise Exception(f'No repositories found for org {organization}')
    return repositories['data']

def select_repository(repositories):
    print("\nAvailable repositories:")
    for i, repo in enumerate(repositories, 1):
        print(f"{i}. {repo['name']}")

Check notice on line 27 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
Trailing whitespace

ℹ️ Codacy found a minor CodeStyle issue: Trailing whitespace

The issue identified by Pylint is a "Trailing whitespace" problem, which means that there is unnecessary whitespace (spaces or tabs) at the end of a line. This can lead to inconsistent formatting and is generally discouraged in Python code as it can affect readability and maintainability.

To fix the issue, remove the trailing whitespace from the flagged line. The annotation points at line 27, the whitespace-only line that follows the for loop in select_repository, so the fix is to delete the stray spaces and leave that line empty. No code line needs to change.

This comment was generated by an experimental AI tool.
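
For files with many such lines, the cleanup can be scripted. A minimal sketch, assuming LF line endings; the strip_trailing_whitespace name is hypothetical:

```python
def strip_trailing_whitespace(source):
    """Return `source` with spaces and tabs removed from the end of every line."""
    return "\n".join(line.rstrip(" \t") for line in source.split("\n"))


cleaned = strip_trailing_whitespace("for i in x:  \n    \nprint(i)\n")
print(repr(cleaned))
```

Most editors and formatters can apply the same cleanup on save, which avoids the check being triggered again.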

    while True:
        try:
            choice = int(input("\nEnter the number of the repository you want to use: "))
            if 1 <= choice <= len(repositories):
                return repositories[choice - 1]
            else:
                print("Invalid selection. Please try again.")
        except ValueError:
            print("Please enter a valid number.")

#TODO: only returns the first one
def getCodingStandards(baseurl, provider, organization, token):
def list_tools(baseurl, token):
    headers = {
        'Accept': 'application/json',
        'api-token': token
    }
    url = f'{baseurl}/api/v3/organizations/{provider}/{organization}/coding-standards'
    url = f'{baseurl}/api/v3/tools'
    r = requests.get(url, headers=headers)
    codingStandards = json.loads(r.text)
    if len(codingStandards['data']) == 0:
        raise Exception(f'No Coding Standards for org {organization}')
    return codingStandards['data'][0]
    r.raise_for_status()
    return r.json()

def getCodePatternsForTool(baseurl, provider, organization,codingStandardId, toolUuid, token):
def getCodePatternsForTool(baseurl, provider, organization, repository, toolUuid, token):
    headers = {
        'Accept': 'application/json',
        'api-token': token
    }
    url = f'{baseurl}/api/v3/organizations/{provider}/{organization}/coding-standards/{codingStandardId}/tools/{toolUuid}/patterns?limit=1000'
    url = f'{baseurl}/api/v3/analysis/organizations/{provider}/{organization}/repositories/{repository}/tools/{toolUuid}/patterns'
    r = requests.get(url, headers=headers)
    patterns = json.loads(r.text)["data"]
    return patterns
    r.raise_for_status()
    return r.json()

def generateFileForPMD(patterns):
    rules = []
@@ -57,29 +82,130 @@
</description>
{rules}
</ruleset>'''.format(rules=''.join(rules))
    f = open("ruleset.xml", "a")
    f.write(document)
    f.close()
    with open("pmd_ruleset.xml", "w") as f:
        f.write(document)
    print("PMD configuration has been saved to pmd_ruleset.xml")

def generateFileForSemgrep(patterns):
    rules = []
    for p in patterns:
        if p['enabled']:
            pattern = p['patternDefinition']
            rule = {
                "id": f"Semgrep_codacy.{pattern['id']}",
                "pattern": pattern.get('pattern', ''),
                "message": pattern.get('description', ''),
                "severity": p.get('severity', 'WARNING').lower(),
                "languages": pattern.get('languages', []),
            }
            if p["parameters"]:
                rule["parameters"] = p["parameters"]
            rules.append(rule)

    config = {"rules": rules}
    with open("semgrep_config.yaml", "w") as f:
        yaml.dump(config, f, default_flow_style=False)
    print("Semgrep configuration has been saved to semgrep_config.yaml")

def generateFileForCheckstyle(patterns):
    root = ET.Element("module", name="Checker")
    tree_walker = ET.SubElement(root, "module", name="TreeWalker")

    for p in patterns:
        if p['enabled']:
            pattern = p['patternDefinition']
            module = ET.SubElement(tree_walker, "module", name=pattern['id'].replace('Checkstyle_', ''))
            ET.SubElement(module, "property", name="severity", value=p.get('severity', 'warning'))
            for param in p["parameters"]:
                ET.SubElement(module, "property", name=param["name"], value=str(param["value"]))

    xml_declaration = '<?xml version="1.0"?>'
    doctype = '<!DOCTYPE module PUBLIC "-//Checkstyle//DTD Checkstyle Configuration 1.3//EN" "https://checkstyle.org/dtds/configuration_1_3.dtd">'

    rough_string = ET.tostring(root, 'unicode')
    reparsed = minidom.parseString(rough_string)

Check warning on line 126 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
The application was found using the `xml.dom.minidom` package for processing XML. Python's default XML processors suffer from various XML parsing vulnerabilities and care must be taken when handling XML data.

Check warning on line 126 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
Using xml.dom.minidom.parseString to parse untrusted XML data is known to be vulnerable to XML attacks. Replace xml.dom.minidom.parseString with its defusedxml equivalent function or make sure defusedxml.defuse_stdlib() is called

Check warning on line 126 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
Using xml.dom.minidom.parseString to parse untrusted XML data is known to be vulnerable to XML attacks. Replace xml.dom.minidom.parseString with its defusedxml equivalent function or make sure defusedxml.defuse_stdlib() is called (B318)

⚠️ Codacy found a medium Security issue: Using xml.dom.minidom.parseString to parse untrusted XML data is known to be vulnerable to XML attacks. Replace xml.dom.minidom.parseString with its defusedxml equivalent function or make sure defusedxml.defuse_stdlib() is called (B318)

The issue identified by the Prospector linter is related to the use of xml.dom.minidom.parseString, which can be vulnerable to XML External Entity (XXE) attacks when parsing untrusted XML data. This vulnerability can allow an attacker to read arbitrary files on the server or perform other malicious actions by including external entities in the XML input. To mitigate this risk, it is recommended to use the defusedxml library, which provides a safer alternative for parsing XML.

To fix the issue, you can replace the minidom.parseString call with defusedxml.minidom.parseString, which is designed to handle untrusted XML data safely. This requires adding import defusedxml.minidom at the top of the file.

Here is the suggested code change:

Suggested change
reparsed = minidom.parseString(rough_string)
reparsed = defusedxml.minidom.parseString(rough_string)

This comment was generated by an experimental AI tool.

⚠️ Codacy found a medium Security issue: The application was found using the xml.dom.minidom package for processing XML. Python's default XML processors suffer from various XML parsing vulnerabilities and care must be taken when handling XML data.

The issue identified by the Semgrep linter relates to the use of the xml.dom.minidom module for parsing XML. This module is known to be vulnerable to various XML parsing attacks, such as XML External Entity (XXE) attacks, which can lead to security risks if the XML data being processed comes from untrusted sources. The warning suggests that care must be taken when handling XML data, particularly when using default XML processors like minidom.

To address this security concern, route the parse through a hardened parser. Switching to lxml would not be a drop-in fix here: etree.fromstring returns an element that has no toprettyxml method, so the next line of generateFileForCheckstyle would fail, and lxml typically needs explicit parser settings before it refuses external entities. The defusedxml wrapper keeps the existing minidom API.

Here’s the suggested change to replace minidom.parseString with its defusedxml equivalent:

Suggested change
reparsed = minidom.parseString(rough_string)
reparsed = defusedxml.minidom.parseString(rough_string)

This change assumes you have added import defusedxml.minidom at the beginning of your script.


This comment was generated by an experimental AI tool.

⚠️ Codacy found a medium Security issue: Using xml.dom.minidom.parseString to parse untrusted XML data is known to be vulnerable to XML attacks. Replace xml.dom.minidom.parseString with its defusedxml equivalent function or make sure defusedxml.defuse_stdlib() is called

The issue identified by the Bandit linter pertains to the use of xml.dom.minidom.parseString, which is vulnerable to various XML attacks, including XML External Entity (XXE) attacks. When parsing untrusted XML data, this vulnerability can be exploited to read arbitrary files or perform other malicious actions.

To mitigate this risk, it is recommended to use the defusedxml library, which provides a safer alternative to the standard XML parsing libraries by disabling potentially dangerous features.

To fix the issue, you can replace minidom.parseString with defusedxml.minidom.parseString, after adding import defusedxml.minidom at the top of the file. This change routes the parse through a parser that rejects the dangerous XML constructs.

Here's the single line change needed:

Suggested change
reparsed = minidom.parseString(rough_string)
reparsed = defusedxml.minidom.parseString(rough_string)

This comment was generated by an experimental AI tool.
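
Since the re-parse in generateFileForCheckstyle exists only to pretty-print XML that the script has just built, one alternative worth considering is to skip the parse altogether. A minimal sketch, assuming Python 3.9 or newer (where xml.etree.ElementTree.indent is available); the build_checker helper and the EmptyBlock module name are illustrative and not taken from the PR:

```python
import xml.etree.ElementTree as ET


def build_checker():
    """Build a tiny Checkstyle-like tree (illustrative content only)."""
    root = ET.Element("module", name="Checker")
    tree_walker = ET.SubElement(root, "module", name="TreeWalker")
    ET.SubElement(tree_walker, "module", name="EmptyBlock")
    return root


root = build_checker()
ET.indent(root, space="  ")  # in-place indentation, no parsing involved (Python 3.9+)
pretty_xml = ET.tostring(root, encoding="unicode")
print(pretty_xml)
```

With encoding="unicode", tostring emits no XML declaration, so the step that strips the first line of the minidom output would no longer be needed. Whether the static analyzers then stop flagging the import lines depends on their rules, since no XML parser is invoked but xml.etree.ElementTree is still imported.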

    pretty_xml = reparsed.toprettyxml(indent=" ")

    # Remove the XML declaration from the pretty-printed XML
    pretty_xml_lines = pretty_xml.splitlines(True)
    pretty_xml_without_declaration = ''.join(pretty_xml_lines[1:])

    with open("checkstyle_config.xml", "w", encoding="utf-8") as f:
        f.write(f"{xml_declaration}{doctype}\n{pretty_xml_without_declaration}")

    print("Checkstyle configuration has been saved to checkstyle_config.xml")

def main():
    print('Welcome to Codacy Config File Generator - A temporary solution')
    print('!!!!!! CURRENTLY ONLY WORKS FOR PMD !!!!!!')
    print('Welcome to Codacy Config File Generator')
    parser = argparse.ArgumentParser(description='Codacy Engine Helper')
    parser.add_argument('--token', dest='token', default=None,
    parser.add_argument('--token', dest='token', required=True,
                        help='the api-token to be used on the REST API')
    parser.add_argument('--provider', dest='provider',
                        default=None, help='git provider')
    parser.add_argument('--organization', dest='organization',
                        default=None, help='organization id')
    parser.add_argument('--tooluuid', dest='toolUuid',
                        default=None, help='Tool Uuid')
    parser.add_argument('--provider', dest='provider', required=True,
                        help='git provider')
    parser.add_argument('--organization', dest='organization', required=True,
                        help='organization id')
    parser.add_argument('--baseurl', dest='baseurl', default='https://app.codacy.com',
                        help='codacy server address (ignore if cloud)')
    args = parser.parse_args()
    cs = getCodingStandards(args.baseurl, args.provider, args.organization, args.token)
    patterns = getCodePatternsForTool(args.baseurl, args.provider, args.organization, cs["id"], args.toolUuid, args.token)

    #TODO: decide what's the tool, currently only pmd
    generateFileForPMD(patterns)
main()
    try:
        repositories = get_repositories(args.baseurl, args.provider, args.organization, args.token)
        selected_repo = select_repository(repositories)
        print(f"Selected repository: {selected_repo['name']}")

        tools = list_tools(args.baseurl, args.token)
        tool_uuids = {tool['name'].lower(): tool['uuid'] for tool in tools['data']}

    except requests.exceptions.RequestException as e:
        print(f"Error fetching data: {e}")
        sys.exit(1)
    except Exception as e:
        print(f"Unexpected error: {e}")
        sys.exit(1)

    print("\nSelect an option:")
    print("1. Generate PMD Config")
    print("2. Generate Semgrep Config")
    print("3. Generate Checkstyle Config")
    print("4. Generate all config files")
    print("5. Exit")

    choice = input("Enter your choice (1-5): ")

    try:
        if choice == '1':
            patterns = getCodePatternsForTool(args.baseurl, args.provider, args.organization, selected_repo['name'], tool_uuids['pmd'], args.token)
            generateFileForPMD(patterns['data'])
        elif choice == '2':
            patterns = getCodePatternsForTool(args.baseurl, args.provider, args.organization, selected_repo['name'], tool_uuids['semgrep'], args.token)
            generateFileForSemgrep(patterns['data'])
        elif choice == '3':
            patterns = getCodePatternsForTool(args.baseurl, args.provider, args.organization, selected_repo['name'], tool_uuids['checkstyle'], args.token)
            generateFileForCheckstyle(patterns['data'])
        elif choice == '4':
            for tool, uuid in tool_uuids.items():
                if tool in ['pmd', 'semgrep', 'checkstyle']:
                    patterns = getCodePatternsForTool(args.baseurl, args.provider, args.organization, selected_repo['name'], uuid, args.token)
                    if tool == "pmd":
                        generateFileForPMD(patterns['data'])
                    elif tool == "semgrep":
                        generateFileForSemgrep(patterns['data'])
                    elif tool == "checkstyle":
                        generateFileForCheckstyle(patterns['data'])
        elif choice == '5':
            print("Exiting the program.")
            sys.exit(0)
        else:
            print("Invalid choice. Exiting the program.")
            sys.exit(1)
    except requests.exceptions.RequestException as e:
        print(f"Error fetching patterns: {e}")
        sys.exit(1)
    except Exception as e:
        print(f"Unexpected error: {e}")
        sys.exit(1)

    print("Config file(s) generated successfully. Exiting the program.")

if __name__ == "__main__":

Check notice on line 210 in config_file_generator.py (Codacy Production / Codacy Static Code Analysis):
expected 2 blank lines after class or function definition, found 1 (E305)

ℹ️ Codacy found a minor CodeStyle issue: expected 2 blank lines after class or function definition, found 1 (E305)

The issue reported by the Prospector linter is related to the PEP 8 style guide, which recommends having two blank lines after the definition of a function or class. In this case, the line if __name__ == "__main__": is not preceded by the required two blank lines after the main() function definition, which results in the E305 error.

To fix this issue, you should add an additional blank line before the if __name__ == "__main__": line.

Here's the code suggestion to fix the issue:

    print("Config file(s) generated successfully. Exiting the program.")


if __name__ == "__main__":

This comment was generated by an experimental AI tool.

    main()