New files associated with this scanner are:
- scan_vehicle.flex: A pattern scanner to recognize vehicle-related identifiers. Currently limited to VINs, but could be expanded to include HINs, etc. Patterned after scan_accts.flex.
- scan_vin.cpp: A validator for Vehicle Identification Numbers (VINs). Validates rules for the World Manufacturer Identifier, Vehicle Descriptor Section (including check digit), and Vehicle Identifier Section to ensure capture of valid VINs. Structure adapted from scan_ccns2.cpp.
- scan_vin.h: Header for scan_vin.cpp
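For readers unfamiliar with the check digit rule mentioned above, here is a minimal sketch of the widely documented position-9 check digit scheme (the one required for North American VINs). It is not the code from scan_vin.cpp: the function names `vin_char_value`, `vin_check_digit`, and `vin_check_digit_ok` are hypothetical, and the sketch covers only the character set and check digit, not the WMI/VDS/VIS rules or the all-digit rejection that this PR's tests describe.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Transliterate one VIN character to its numeric value.
// Returns std::nullopt for characters never allowed in a VIN
// (I, O, Q, lowercase letters, punctuation).
std::optional<int> vin_char_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    switch (c) {
        case 'A': case 'J':           return 1;
        case 'B': case 'K': case 'S': return 2;
        case 'C': case 'L': case 'T': return 3;
        case 'D': case 'M': case 'U': return 4;
        case 'E': case 'N': case 'V': return 5;
        case 'F': case 'W':           return 6;
        case 'G': case 'P': case 'X': return 7;
        case 'H': case 'Y':           return 8;
        case 'R': case 'Z':           return 9;
        default:                      return std::nullopt;
    }
}

// Compute the expected check digit for a 17-character VIN.
// Position 9 (index 8) has weight 0, so its current value does not matter.
std::optional<char> vin_check_digit(std::string_view vin) {
    static constexpr std::array<int, 17> weights =
        {8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2};
    if (vin.size() != weights.size()) return std::nullopt;
    int sum = 0;
    for (std::size_t i = 0; i < vin.size(); ++i) {
        const auto value = vin_char_value(vin[i]);
        if (!value) return std::nullopt;  // disallowed character
        sum += *value * weights[i];
    }
    const int remainder = sum % 11;
    return remainder == 10 ? 'X' : static_cast<char>('0' + remainder);
}

// True when the VIN has 17 allowed characters and a matching check digit.
bool vin_check_digit_ok(std::string_view vin) {
    const auto expected = vin_check_digit(vin);
    return expected && vin[8] == *expected;
}

// Small self-check using a commonly cited example VIN.
void vin_self_check() {
    assert(vin_check_digit_ok("1M8GDM9AXKP042788"));
    assert(!vin_check_digit_ok("1M8GDM9A1KP042788"));  // wrong check digit
}
```

Returning `std::optional` keeps "malformed input" distinct from "well-formed but wrong check digit", which is a design choice of this sketch rather than something the PR states.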
Some initial test files are also included, using synthesized valid
and invalid VINs:
- src/tests/test_synthetic_vin.json
- src/tests/test_vin.txt
- src/tests/test_vin_doc.odt
Build and scanner registration files modified:
- src/Makefile.am
- src/bulk_extractor_scanners.h
Individual scanner functionality tests added to test_be1 for scan_vin
scanner to validate operation, consistent with other test structures.
Included tests:
- scan_vin_validation: general tests for valid and invalid VINs
- scan_vin1: test with valid VINs labeled in context
- scan_vin2: test with VINs in open text, no context
- scan_vin3: invalid VIN tests (disallowed characters, bad check digits, lowercase characters, all-digit VINs)
- scan_vin_json: scan valid/invalid VINs appearing in JSON file
- scan_vin_year_codes: test year codes
- scan_vin_context: test that VINs in hex dump context are disregarded
Verified via: make test_be && test_be "[scanners]"
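As background for the scan_vin_year_codes test, the sketch below decodes the model-year character at VIN position 10 using the standard 30-year code cycle. It is an illustration only: `kYearCodes` and `vin_model_years` are hypothetical names, not identifiers from this PR, and the scanner's own handling of year codes may differ. Because the cycle repeats, a code maps to more than one year, so the sketch returns the 1980 and 2010 cycle candidates instead of guessing.

```cpp
#include <cassert>
#include <optional>
#include <string_view>
#include <utility>

// Model-year codes in cycle order. I, O, Q, U, Z and 0 are not used,
// which leaves 30 codes: 'A' is 1980 or 2010, '9' is 2009 or 2039.
constexpr std::string_view kYearCodes = "ABCDEFGHJKLMNPRSTVWXY123456789";

// Returns the two candidate model years (1980 cycle, 2010 cycle) for a
// year code, or std::nullopt if the character is not a valid year code.
std::optional<std::pair<int, int>> vin_model_years(char code) {
    const auto idx = kYearCodes.find(code);
    if (idx == std::string_view::npos) return std::nullopt;
    const int offset = static_cast<int>(idx);
    return std::make_pair(1980 + offset, 2010 + offset);
}

// Small self-check: 'K' is the tenth code in the cycle.
void vin_year_self_check() {
    const auto years = vin_model_years('K');
    assert(years && years->first == 1989 && years->second == 2019);
    assert(!vin_model_years('U'));  // 'U' is not a year code
}
```

A caller would pass the tenth character of a candidate VIN, for example `vin_model_years(vin[9])`.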
Integration tests added to test_be2 for scan_vehicle to validate operation, consistent with existing test structures.
Included tests:
- test_vin: test with test_vin.txt from the tests/ directory
- test_vin_json: test with test_synthetic_vin.json from the tests/ directory
- test_vin_odt: test with the ODT doc in the tests/ directory
Verified via: make test_be && test_be "[phase1]"
Pull Request Overview
This PR adds a Vehicle Identification Number (VIN) scanner to bulk_extractor for forensic analysis. The implementation validates VINs according to ISO 3779 standards, including check digit verification and character validation.
Key changes:
- New VIN validation logic with check digit calculation and WMI/VDS/VIS validation
- Flex scanner for detecting VINs with context-aware filtering to reduce false positives
- Comprehensive test suite covering valid/invalid VINs, different formats, and edge cases
Reviewed Changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/scan_vin.cpp | Core VIN validation implementation with check digit and structural validation |
| src/scan_vin.h | Header file defining VIN validation functions |
| src/scan_vehicle.flex | Flex scanner for detecting VINs in data streams with context filtering |
| src/test_be1.cpp | Unit tests for VIN validation and scanner functionality |
| src/test_be2.cpp | Integration tests for VIN scanner with various file formats |
| src/tests/test_vin.txt | Test data with valid and invalid VINs |
| src/tests/test_synthetic_vin.json | Synthetic VIN test data |
| src/tests/test_vin_doc.odt | Binary ODT test file containing VINs |
| src/bulk_extractor_scanners.h | Registered vehicle scanner |
| src/Makefile.am | Build system updates for new scanner |
@kamwoods: Any idea why codecov didn't run here?
- Error suppression logic in scan_vehicle.flex
- Character check bug in scan_vin.cpp
- Misc comment typos
- Removed an unused pattern from scan_vehicle.flex
- Removed dead code blocks from a previous approach in scan_vin.cpp and scan_vin.h
Cleaned up some dead code blocks after a codecov run. Coverage on scan_accts.flex and scan_vin.cpp improved.
Minor modification of scan_vehicle.flex to stop generating extraneous vin_manufacturer.txt and vin_year.txt histograms. Brings the vehicle scan histogram generation behavior in line with other bulk_extractor scanners, generating a single vin_histogram.txt on each run.