Skip to content

Fix Multiple Artifacts Processing Errors#1400

Open
michaelgrace72 wants to merge 5 commits intoabrignoni:mainfrom
michaelgrace72:fix/health
Open

Fix Multiple Artifacts Processing Errors#1400
michaelgrace72 wants to merge 5 commits intoabrignoni:mainfrom
michaelgrace72:fix/health

Conversation

@michaelgrace72
Copy link

Summary

This PR fixes 5 errors that caused artifact processing to fail during iOS data extraction. All fixes have been tested successfully with iOS 15 Public Image

Artifacts Fixed

  1. chromeLoginData - Duplicate Column Error
    The all_data_headers was adding 'Browser Name' even though it already exist in data_header.
    Before:
    all_data_headers = lava_data_headers + ['Browser Name']
    After:
    all_data_headers = lava_data_headers

  2. cloudkitParticipants - Type Error
    Code assumed participant['UserIdentity'] was always a dictionary, but it could be a string
    Fix: Added type checking and safe dictionary access

  3. cloudkitSharing - Type Error
    Same issue as cloudkitParticipants - participant data structure varied.

  4. FitnessWorkoutsLocationData - Missing Table
    Code queried associations table without checking if it exists.
    Fix: Added table existence check

  5. health - Index Out of Range
    Query returns varying column counts depending on database schema, but code assumed all indices exist.
    Fix: Added bound checking

Copilot AI review requested due to automatic review settings December 8, 2025 21:25
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes 5 critical errors that caused iOS data extraction artifact processing to fail, improving the robustness of the code through defensive programming techniques.

Summary: The changes add proper error handling, type checking, and bounds checking to prevent runtime errors during iOS artifact data extraction. All fixes address specific crash scenarios encountered with iOS 15 Public Image data.

Key Changes:

  • Added comprehensive bounds checking for database record access to prevent IndexError exceptions
  • Implemented type validation and safe dictionary access for CloudKit participant data structures
  • Fixed duplicate column header issue in Chrome login data processing
  • Added table existence verification before querying associations table

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
scripts/artifacts/health.py Added bounds checking for all 28 record indices to handle varying database schemas
scripts/artifacts/cloudkitSharing.py Implemented type checking and safe dictionary access for participant data to handle string/dict type variations
scripts/artifacts/cloudkitParticipants.py Added similar type validation and error handling for participant data structures
scripts/artifacts/chrome.py Removed duplicate 'Browser Name' column addition to fix column header error
scripts/artifacts/FitnessWorkoutsLocationData.py Added associations table existence check to prevent query errors

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 546 to 561
data_list.append(
(start_timestamp, end_timestamp, str(record[2]).title(), record[3],
record[4], record[5], record[6], record[7], record[8], record[9],
record[10], record[11], record[12], record[13], record[14],
record[15], celcius_temp, record[16], record[17], record[18],
record[19], record[20], record[21], record[22], device_model,
record[23], record[24], record[25], added_timestamp)
(start_timestamp, end_timestamp, str(record[2]).title() if len(record) > 2 else '',
record[3] if len(record) > 3 else '', record[4] if len(record) > 4 else '',
record[5] if len(record) > 5 else '', record[6] if len(record) > 6 else '',
record[7] if len(record) > 7 else '', record[8] if len(record) > 8 else '',
record[9] if len(record) > 9 else '', record[10] if len(record) > 10 else '',
record[11] if len(record) > 11 else '', record[12] if len(record) > 12 else '',
record[13] if len(record) > 13 else '', record[14] if len(record) > 14 else '',
record[15] if len(record) > 15 else '', celcius_temp,
record[16] if len(record) > 16 else '', record[17] if len(record) > 17 else '',
record[18] if len(record) > 18 else '', record[19] if len(record) > 19 else '',
record[20] if len(record) > 20 else '', record[21] if len(record) > 21 else '',
record[22] if len(record) > 22 else '', device_model,
record[23] if len(record) > 23 else '', record[24] if len(record) > 24 else '',
record[25] if len(record) > 25 else '', added_timestamp)
)
Copy link

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The tuple construction spans multiple lines without proper formatting. The opening parenthesis on line 546 starts the tuple, but line 547 begins the first element without proper indentation alignment. This makes the code harder to read and maintain.

Consider formatting it as:

data_list.append((
    start_timestamp, end_timestamp, str(record[2]).title() if len(record) > 2 else '',
    record[3] if len(record) > 3 else '', record[4] if len(record) > 4 else '',
    # ... rest of elements
    record[25] if len(record) > 25 else '', added_timestamp
))

Copilot uses AI. Check for mistakes.
Comment on lines +547 to +560
(start_timestamp, end_timestamp, str(record[2]).title() if len(record) > 2 else '',
record[3] if len(record) > 3 else '', record[4] if len(record) > 4 else '',
record[5] if len(record) > 5 else '', record[6] if len(record) > 6 else '',
record[7] if len(record) > 7 else '', record[8] if len(record) > 8 else '',
record[9] if len(record) > 9 else '', record[10] if len(record) > 10 else '',
record[11] if len(record) > 11 else '', record[12] if len(record) > 12 else '',
record[13] if len(record) > 13 else '', record[14] if len(record) > 14 else '',
record[15] if len(record) > 15 else '', celcius_temp,
record[16] if len(record) > 16 else '', record[17] if len(record) > 17 else '',
record[18] if len(record) > 18 else '', record[19] if len(record) > 19 else '',
record[20] if len(record) > 20 else '', record[21] if len(record) > 21 else '',
record[22] if len(record) > 22 else '', device_model,
record[23] if len(record) > 23 else '', record[24] if len(record) > 24 else '',
record[25] if len(record) > 25 else '', added_timestamp)
Copy link

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bounds checking logic is inconsistent with DRY (Don't Repeat Yourself) principle. Each index is checked individually with len(record) > N, resulting in significant code duplication across 26+ checks. This approach is error-prone and makes the code harder to maintain.

Consider refactoring to use a helper function or list comprehension:

def safe_get(record, index, default=''):
    return record[index] if len(record) > index else default

# Then use: safe_get(record, 3), safe_get(record, 4), etc.

Or pad the record to expected length:

# Pad record to expected length
padded_record = list(record) + [''] * (28 - len(record))

Copilot uses AI. Check for mistakes.
if record_id: # Only add valid entries
user_dictionary[record_id] = [record_id, email_address, phone_number, name_prefix, first_name, middle_name, last_name, name_suffix, nickname]
except (KeyError, TypeError, AttributeError) as e:
logfunc(f'Error processing participant data: {str(e)}')
Copy link

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error logging uses an f-string with str(e) which is redundant. When an exception is formatted in an f-string, it's automatically converted to a string.

Change:

logfunc(f'Error processing participant data: {str(e)}')

To:

logfunc(f'Error processing participant data: {e}')
Suggested change
logfunc(f'Error processing participant data: {str(e)}')
logfunc(f'Error processing participant data: {e}')

Copilot uses AI. Check for mistakes.
if record_id:
user_dictionary[record_id] = [record_id, email_address, phone_number, name_prefix, first_name, middle_name, last_name, name_suffix, nickname]
except (KeyError, TypeError, AttributeError) as e:
logfunc(f'Error processing participant data: {str(e)}')
Copy link

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error logging uses an f-string with str(e) which is redundant. When an exception is formatted in an f-string, it's automatically converted to a string.

Change:

logfunc(f'Error processing participant data: {str(e)}')

To:

logfunc(f'Error processing participant data: {e}')
Suggested change
logfunc(f'Error processing participant data: {str(e)}')
logfunc(f'Error processing participant data: {e}')

Copilot uses AI. Check for mistakes.
Comment on lines +65 to +89
if not isinstance(participant, dict) or 'UserIdentity' not in participant:
continue

# must be dict and have UserIdentity
user_identity = participant.get('UserIdentity', {})
if not isinstance(user_identity, dict):
continue

user_record_id = user_identity.get('UserRecordID', {})
if not isinstance(user_record_id, dict):
continue

record_id = user_record_id.get('RecordName', '')
lookup_info = user_identity.get('LookupInfo', {})
email_address = lookup_info.get('EmailAddress', '') if isinstance(lookup_info, dict) else ''
phone_number = lookup_info.get('PhoneNumber', '') if isinstance(lookup_info, dict) else ''

name_components = user_identity.get('NameComponents', {})
name_private = name_components.get('NS.nameComponentsPrivate', {}) if isinstance(name_components, dict) else {}
first_name = name_private.get('NS.givenName', '') if isinstance(name_private, dict) else ''
middle_name = name_private.get('NS.middleName', '') if isinstance(name_private, dict) else ''
last_name = name_private.get('NS.familyName', '') if isinstance(name_private, dict) else ''
name_prefix = name_private.get('NS.namePrefix', '') if isinstance(name_private, dict) else ''
name_suffix = name_private.get('NS.nameSuffix', '') if isinstance(name_private, dict) else ''
nickname = name_private.get('NS.nickname', '') if isinstance(name_private, dict) else ''
Copy link

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The nested type checks are repetitive and could be simplified. Each .get() call is followed by an isinstance() check with early return, which adds unnecessary verbosity.

Consider using a more Pythonic approach with try-except or simplifying the nested checks:

try:
    user_identity = participant.get('UserIdentity', {})
    record_id = user_identity.get('UserRecordID', {}).get('RecordName', '')
    lookup_info = user_identity.get('LookupInfo', {})
    # ... continue with safe gets
except (AttributeError, TypeError):
    continue

This is more maintainable and follows Python's "easier to ask for forgiveness than permission" (EAFP) principle.

Suggested change
if not isinstance(participant, dict) or 'UserIdentity' not in participant:
continue
# must be dict and have UserIdentity
user_identity = participant.get('UserIdentity', {})
if not isinstance(user_identity, dict):
continue
user_record_id = user_identity.get('UserRecordID', {})
if not isinstance(user_record_id, dict):
continue
record_id = user_record_id.get('RecordName', '')
lookup_info = user_identity.get('LookupInfo', {})
email_address = lookup_info.get('EmailAddress', '') if isinstance(lookup_info, dict) else ''
phone_number = lookup_info.get('PhoneNumber', '') if isinstance(lookup_info, dict) else ''
name_components = user_identity.get('NameComponents', {})
name_private = name_components.get('NS.nameComponentsPrivate', {}) if isinstance(name_components, dict) else {}
first_name = name_private.get('NS.givenName', '') if isinstance(name_private, dict) else ''
middle_name = name_private.get('NS.middleName', '') if isinstance(name_private, dict) else ''
last_name = name_private.get('NS.familyName', '') if isinstance(name_private, dict) else ''
name_prefix = name_private.get('NS.namePrefix', '') if isinstance(name_private, dict) else ''
name_suffix = name_private.get('NS.nameSuffix', '') if isinstance(name_private, dict) else ''
nickname = name_private.get('NS.nickname', '') if isinstance(name_private, dict) else ''
user_identity = participant.get('UserIdentity', {})
user_record_id = user_identity.get('UserRecordID', {})
record_id = user_record_id.get('RecordName', '')
lookup_info = user_identity.get('LookupInfo', {})
email_address = lookup_info.get('EmailAddress', '')
phone_number = lookup_info.get('PhoneNumber', '')
name_components = user_identity.get('NameComponents', {})
name_private = name_components.get('NS.nameComponentsPrivate', {})
first_name = name_private.get('NS.givenName', '')
middle_name = name_private.get('NS.middleName', '')
last_name = name_private.get('NS.familyName', '')
name_prefix = name_private.get('NS.namePrefix', '')
name_suffix = name_private.get('NS.nameSuffix', '')
nickname = name_private.get('NS.nickname', '')

Copilot uses AI. Check for mistakes.
cursor = db.cursor()

if does_table_exist_in_db(healthdb_secure, 'location_series_data') == True:
if does_table_exist_in_db(healthdb_secure, 'location_series_data') == True and does_table_exist_in_db(healthdb_secure, 'associations') == True:
Copy link

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The boolean comparison == True is redundant. In Python, boolean values can be used directly in conditional statements.

Change:

if does_table_exist_in_db(healthdb_secure, 'location_series_data') == True and does_table_exist_in_db(healthdb_secure, 'associations') == True:

To:

if does_table_exist_in_db(healthdb_secure, 'location_series_data') and does_table_exist_in_db(healthdb_secure, 'associations'):
Suggested change
if does_table_exist_in_db(healthdb_secure, 'location_series_data') == True and does_table_exist_in_db(healthdb_secure, 'associations') == True:
if does_table_exist_in_db(healthdb_secure, 'location_series_data') and does_table_exist_in_db(healthdb_secure, 'associations'):

Copilot uses AI. Check for mistakes.
if does_table_exist_in_db(healthdb_secure, 'location_series_data') == True:
if does_table_exist_in_db(healthdb_secure, 'location_series_data') == True and does_table_exist_in_db(healthdb_secure, 'associations') == True:

# Fitness Workouts Location Data Analysis
Copy link

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 33 is not properly indented. It should be indented to match the block it belongs to (inside the if statement).

Current:

if does_table_exist_in_db(...):

# Fitness Workouts Location Data Analysis

Should be:

if does_table_exist_in_db(...):
    # Fitness Workouts Location Data Analysis
Suggested change
# Fitness Workouts Location Data Analysis
# Fitness Workouts Location Data Analysis

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant