-
Notifications
You must be signed in to change notification settings - Fork 1
Open
Description
Description
The insert_values function in the pandas Jupyter tutorial (docs/integrate/pandas/tutorial-jupyter.md, lines 260-369) builds SQL INSERT statements using string concatenation without proper parameterization. This creates potential SQL injection vulnerabilities when users adapt the code for other data sources.
Problem
At line 289, ticker values are directly interpolated into the SQL string:
values_array.append("({},\'{}\'',{})".format(
closing_date, row_values['ticker'], close_value))If a ticker contained a single quote (e.g., O'REILLY), it would break the SQL statement and could allow injection attacks with untrusted data sources.
Suggested Solution
- Add a warning note about the security implications for production use
- Demonstrate the safer approach using parameterized queries with
cursor.executemany()
Example safe implementation:
def insert_values(table_name, data):
create_table(table_name)
# Prepare data as list of tuples
values = []
for row in range(len(data)):
row_values = data.iloc[row, :]
close_value = row_values['close_value'] if not math.isnan(row_values['close_value']) else -1
closing_date = row_values['closing_date']
values.append((closing_date, row_values['ticker'], close_value))
# Use parameterized query
insert_stmt = f'INSERT INTO "{table_name}" (closing_date, ticker, close_value) VALUES (%s, %s, %s)'
cursor.executemany(insert_stmt, values)
print(f"Inserted {len(data)} rows in CrateDB")References
- Original discussion: pandas: Starter and Jupyter tutorial. Guidelines for efficient ingest. #297 (comment)
- File:
docs/integrate/pandas/tutorial-jupyter.md
Requested by
Metadata
Metadata
Assignees
Labels
No labels