fix: fingerprinting profiles #29

Merged
mclueppers merged 1 commit into main from fix/fingerprinting-profiles on Dec 24, 2025

Conversation

@mclueppers

Don't use form fields for fingerprint profiling.

Copilot AI left a comment

Pull request overview

This PR refactors fingerprint profiling to remove form field names from fingerprints and introduces a new header consistency validation system for improved bot detection. The changes enhance the WAF's ability to identify spoofed browsers and suspicious clients by analyzing HTTP header patterns.

  • Removed include_field_names parameter from fingerprint generation to avoid using form data in fingerprints
  • Added new header_consistency.lua module with User-Agent parsing and browser header validation
  • Introduced multiple new fingerprint profiles for monitoring bots, modern browsers, mobile apps, API clients, and suspicious patterns

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 16 comments.

| File | Description |
| --- | --- |
| openresty/lua/header_consistency.lua | New module for User-Agent parsing and header consistency validation using lua-resty-woothee |
| openresty/lua/fingerprint_profiles.lua | Removed form field fingerprinting, added 8 new built-in profiles, added request context classification functions |
| openresty/lua/waf_handler.lua | Integrated header consistency checks into request processing pipeline |
| openresty/lua/api_handlers/fingerprint_profiles.lua | Removed include_field_names from default profile configuration |
| openresty/Dockerfile | Added lua-resty-woothee dependency installation via luarocks |
| admin-ui/src/pages/security/FingerprintProfiles.tsx | Removed UI form field for include_field_names toggle |
| admin-ui/src/api/types.ts | Removed include_field_names from TypeScript interface |

matching = {
conditions = {
-- Chrome 80+ should have Sec-Fetch headers (introduced in Chrome 76)
{ header = "User-Agent", condition = "matches", pattern = "Chrome/([89][0-9]|1[0-2][0-9])" },

Copilot AI Dec 23, 2025

The regex pattern Chrome/([89][0-9]|1[0-2][0-9]) matches Chrome major versions 80-129 only, and Chrome has already moved past 129 (current releases are 130+), so up-to-date browsers will not match this condition. Consider a pattern like Chrome/([89][0-9]|1[0-9]{2}|[2-9][0-9]{2}), which covers 80 through 999, or otherwise raise the upper bound to a more future-proof range.

Suggested change
- { header = "User-Agent", condition = "matches", pattern = "Chrome/([89][0-9]|1[0-2][0-9])" },
+ { header = "User-Agent", condition = "matches", pattern = "Chrome/([89][0-9]|1[0-9]{2}|[2-9][0-9]{2})" },
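
One way to avoid maintaining a version range inside a regex at all is to extract the major version and compare it numerically. The following is only a sketch in plain Lua: the helper names chrome_major and expects_sec_fetch are hypothetical and not part of this PR, and the threshold of 80 is taken from the "Chrome 80+" comment in the profile.

```lua
-- Hypothetical helper (not part of the PR): extract Chrome's major version
-- from a User-Agent string using a plain Lua pattern.
local function chrome_major(user_agent)
    if type(user_agent) ~= "string" then
        return nil
    end
    local major = user_agent:match("Chrome/(%d+)")
    return major and tonumber(major) or nil
end

-- Assumed threshold, taken from the "Chrome 80+" comment in the profile.
local MIN_CHROME_WITH_SEC_FETCH = 80

-- True when the User-Agent claims a Chrome version at or above the threshold.
local function expects_sec_fetch(user_agent)
    local major = chrome_major(user_agent)
    return major ~= nil and major >= MIN_CHROME_WITH_SEC_FETCH
end

print(expects_sec_fetch("Mozilla/5.0 ... Chrome/79.0.0.0 Safari/537.36"))   --> false
print(expects_sec_fetch("Mozilla/5.0 ... Chrome/131.0.0.0 Safari/537.36"))  --> true
```

A numeric comparison has no upper bound to go stale, but it would need a condition type other than "matches", so whether it fits the existing profile format is an open question.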

Copilot uses AI. Check for mistakes.
-- Java libraries
"java|httpclient|okhttp|apache-httpclient|spring-resttemplate|restassured|" ..
-- PHP libraries
"guzzle|guzzlehttp|symfony.*http|" ..

Copilot AI Dec 23, 2025

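The pattern symfony.*http contains an unescaped dot followed by an asterisk, so it matches "symfony", then any run of characters, then "http". If the intent is to match a single separator, as in "symfony.http" or "symfony-http", use symfony[.-]http. If the intent is a literal dot followed by any characters and then "http", escape the dot: symfony\\..*http. The suggestion below takes a middle path: it requires one separator character (., / or -) and still allows arbitrary characters before "http". Note that it would no longer match a User-Agent in which a space separates the two words.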

Suggested change
- "guzzle|guzzlehttp|symfony.*http|" ..
+ "guzzle|guzzlehttp|symfony[./-].*http|" ..

Comment on lines +297 to +299
"jmeter|apache-jmeter|wrk|ab\\/|apachebench|bombardier|k6|locust|artillery|vegeta|" ..
-- API testing tools
"postman|insomnia|httpie|paw\\/|" ..

Copilot AI Dec 23, 2025

The pattern ab\/ escapes the forward slash, which is unnecessary here. The regex is passed as a Lua string, not written between / delimiters as in a JavaScript regex literal, so the forward slash has no special meaning. You can simplify this to ab/ for better readability.

Suggested change
- "jmeter|apache-jmeter|wrk|ab\\/|apachebench|bombardier|k6|locust|artillery|vegeta|" ..
- -- API testing tools
- "postman|insomnia|httpie|paw\\/|" ..
+ "jmeter|apache-jmeter|wrk|ab/|apachebench|bombardier|k6|locust|artillery|vegeta|" ..
+ -- API testing tools
+ "postman|insomnia|httpie|paw/|" ..

}

-- Expected header profiles per browser family
-- Based on actual browser behavior in 2024/2025

Copilot AI Dec 23, 2025

The comment states "Based on actual browser behavior in 2024/2025" but this creates a maintenance issue as the comment will quickly become outdated. Consider either removing the year references or updating to say "Based on current browser behavior" or "Based on modern browser behavior (as of 2024/2025)".

Suggested change
- -- Based on actual browser behavior in 2024/2025
+ -- Based on current browser behavior

Comment on lines +281 to +301
{ header = "User-Agent", condition = "matches", pattern = "(?i)(curl|wget|" ..
-- Python libraries
"python-requests|python-urllib|aiohttp|httpx|urllib3|requests-html|scrapy|beautifulsoup|" ..
-- JavaScript/Node libraries
"axios|node-fetch|superagent|got|undici|cheerio|" ..
-- Java libraries
"java|httpclient|okhttp|apache-httpclient|spring-resttemplate|restassured|" ..
-- PHP libraries
"guzzle|guzzlehttp|symfony.*http|" ..
-- Go libraries
"go-http-client|fasthttp|go-resty|" ..
-- Rust libraries
"reqwest|hyper-client|" ..
-- Ruby/Perl
"ruby|perl|libwww|mechanize|httparty|" ..
-- Load testing tools
"jmeter|apache-jmeter|wrk|ab\\/|apachebench|bombardier|k6|locust|artillery|vegeta|" ..
-- API testing tools
"postman|insomnia|httpie|paw\\/|" ..
-- Generic patterns
"http-client|httpclient)" },

Copilot AI Dec 23, 2025

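The User-Agent pattern is a single alternation with about 50 alternatives, some of them redundant (httpclient is listed twice, and guzzle already matches guzzlehttp). While functional, a list this long could impact regex matching performance and is harder to maintain. Consider breaking it into multiple, more specific profiles or conditions grouped by client family, or using a more optimized pattern structure. Note that splitting it into separate conditions, as suggested below, preserves behavior only if the conditions are combined with any-match (OR) semantics; if all conditions must match, no User-Agent would satisfy them all.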

Suggested change
- { header = "User-Agent", condition = "matches", pattern = "(?i)(curl|wget|" ..
- -- Python libraries
- "python-requests|python-urllib|aiohttp|httpx|urllib3|requests-html|scrapy|beautifulsoup|" ..
- -- JavaScript/Node libraries
- "axios|node-fetch|superagent|got|undici|cheerio|" ..
- -- Java libraries
- "java|httpclient|okhttp|apache-httpclient|spring-resttemplate|restassured|" ..
- -- PHP libraries
- "guzzle|guzzlehttp|symfony.*http|" ..
- -- Go libraries
- "go-http-client|fasthttp|go-resty|" ..
- -- Rust libraries
- "reqwest|hyper-client|" ..
- -- Ruby/Perl
- "ruby|perl|libwww|mechanize|httparty|" ..
- -- Load testing tools
- "jmeter|apache-jmeter|wrk|ab\\/|apachebench|bombardier|k6|locust|artillery|vegeta|" ..
- -- API testing tools
- "postman|insomnia|httpie|paw\\/|" ..
- -- Generic patterns
- "http-client|httpclient)" },
+ -- Core CLI tools
+ { header = "User-Agent", condition = "matches", pattern = "(?i)(curl|wget)" },
+ -- Python libraries
+ { header = "User-Agent", condition = "matches", pattern = "(?i)(python-requests|python-urllib|aiohttp|httpx|urllib3|requests-html|scrapy|beautifulsoup)" },
+ -- JavaScript/Node libraries
+ { header = "User-Agent", condition = "matches", pattern = "(?i)(axios|node-fetch|superagent|got|undici|cheerio)" },
+ -- Java libraries
+ { header = "User-Agent", condition = "matches", pattern = "(?i)(java|httpclient|okhttp|apache-httpclient|spring-resttemplate|restassured)" },
+ -- PHP and Go libraries
+ { header = "User-Agent", condition = "matches", pattern = "(?i)(guzzle|guzzlehttp|symfony.*http|go-http-client|fasthttp|go-resty)" },
+ -- Rust, Ruby, Perl and related HTTP libraries
+ { header = "User-Agent", condition = "matches", pattern = "(?i)(reqwest|hyper-client|ruby|perl|libwww|mechanize|httparty)" },
+ -- Load testing tools
+ { header = "User-Agent", condition = "matches", pattern = "(?i)(jmeter|apache-jmeter|wrk|ab\\/|apachebench|bombardier|k6|locust|artillery|vegeta)" },
+ -- API testing tools
+ { header = "User-Agent", condition = "matches", pattern = "(?i)(postman|insomnia|httpie|paw\\/)" },
+ -- Generic HTTP client patterns
+ { header = "User-Agent", condition = "matches", pattern = "(?i)(http-client|httpclient)" },
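
As an alternative to one long alternation, the same "any of these tokens" check can be written as a loop over plain substrings, which makes the any-match semantics explicit. This is a minimal sketch in plain Lua, not the PR's implementation: HTTP_CLIENT_TOKENS and find_client_token are hypothetical names, the token list is a shortened sample, and whether this is faster than a compiled regex would need measuring.

```lua
-- Hypothetical token list; a real one would mirror the profile's alternation.
local HTTP_CLIENT_TOKENS = {
    "curl", "wget", "python-requests", "aiohttp", "okhttp",
    "guzzle", "go-http-client", "reqwest", "postman", "ab/",
}

-- Returns the first token found in the User-Agent, or nil if none is present.
local function find_client_token(user_agent)
    if type(user_agent) ~= "string" then
        return nil
    end
    local ua = user_agent:lower()  -- case-insensitive comparison
    for _, token in ipairs(HTTP_CLIENT_TOKENS) do
        -- The fourth argument (plain = true) disables pattern matching,
        -- so "-" and "/" in tokens are treated literally.
        if ua:find(token, 1, true) then
            return token
        end
    end
    return nil
end

print(find_client_token("python-requests/2.32.3"))           --> python-requests
print(find_client_token("Mozilla/5.0 (X11; Linux x86_64)"))  --> nil
```

Because the tokens are plain substrings, entries such as symfony.*http that rely on regex syntax would still need a separate regex condition.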

-- Extract numeric version
local version = nil
if result.version and result.version ~= "UNKNOWN" then
version = tonumber(result.version:match("^(%d+)"))

Copilot AI Dec 23, 2025

The version extraction uses tonumber(result.version:match("^(%d+)")) which could fail if result.version is a non-string value. While the check result.version ~= "UNKNOWN" suggests it's a string, consider adding a type check or using tostring(result.version) before calling match to prevent potential runtime errors if the woothee library returns an unexpected value type.

Suggested change
- version = tonumber(result.version:match("^(%d+)"))
+ local version_str = tostring(result.version)
+ local major = version_str:match("^(%d+)")
+ if major then
+     version = tonumber(major)
+ end
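
The defensive parsing suggested above can also be wrapped in a small helper so that every caller gets the same behavior for strings, numbers, and unexpected types. This is a sketch in plain Lua; major_version is a hypothetical name, and the "UNKNOWN" input mirrors the sentinel checked in the code under review.

```lua
-- Hypothetical helper: coerce a parser-supplied version field to its major
-- version number, returning nil for anything without leading digits.
local function major_version(raw)
    if raw == nil then
        return nil
    end
    -- tostring() makes :match safe even if raw is a number or a table.
    local major = tostring(raw):match("^(%d+)")
    return major and tonumber(major) or nil
end

print(major_version("131.0.6778.86"))  --> 131
print(major_version("UNKNOWN"))        --> nil
print(major_version(17))               --> 17
print(major_version({}))               --> nil
```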

-- Mobile SDKs
"react.native|flutter|expo|cordova|ionic|capacitor|" ..
-- Native app patterns (app name followed by version)
"[a-z]+app\\/[0-9])" },

Copilot AI Dec 23, 2025

The pattern [a-z]+app\\/[0-9] looks as if it only matches lowercase app names, but the (?i) flag at the start of the pattern applies to the whole expression, so names such as "MyApp" or "TestApp" already match and no case change is needed. The escaped slash is unnecessary, so this can be written as [a-z]+app/[0-9].

Comment on lines +107 to +110
local function get_header(ngx_vars, header_name)
local var_name = "http_" .. header_name:lower():gsub("-", "_")
return ngx_vars[var_name]
end

Copilot AI Dec 23, 2025

The get_header function duplicates the logic of get_header_value function from fingerprint_profiles.lua. Both functions convert header names to nginx variable format identically. Consider extracting this to a shared utility module to avoid code duplication and ensure consistent behavior across modules.
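
A shared helper along these lines could serve both modules. This is only a sketch under assumptions: the table name header_utils and the function header_var_name are hypothetical, and the lookup table stands in for ngx.var so the example runs in plain Lua.

```lua
-- Hypothetical shared utility (names are illustrative, not from the PR).
local header_utils = {}

-- Convert an HTTP header name to the matching nginx variable name,
-- e.g. "User-Agent" -> "http_user_agent".
function header_utils.header_var_name(header_name)
    -- "%-" escapes the hyphen, which is a quantifier in Lua patterns.
    -- The parentheses drop gsub's second return value (the match count).
    return "http_" .. (header_name:lower():gsub("%-", "_"))
end

-- Look up a request header in a table of nginx variables (ngx.var in
-- OpenResty; any table with the same keys works for testing).
function header_utils.get_header(ngx_vars, header_name)
    return ngx_vars[header_utils.header_var_name(header_name)]
end

-- In a real module file the table would be handed back with
-- `return header_utils` and loaded via require().
local fake_vars = { http_user_agent = "curl/8.5.0" }
print(header_utils.get_header(fake_vars, "User-Agent"))  --> curl/8.5.0
```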

-- Load testing tools
"jmeter|apache-jmeter|wrk|ab\\/|apachebench|bombardier|k6|locust|artillery|vegeta|" ..
-- API testing tools
"postman|insomnia|httpie|paw\\/|" ..

Copilot AI Dec 23, 2025

The pattern paw\/ escapes the forward slash, which is unnecessary because the regex is passed as a Lua string rather than written as a /-delimited literal. You can simplify this to paw/ for better readability.

Suggested change
- "postman|insomnia|httpie|paw\\/|" ..
+ "postman|insomnia|httpie|paw/|" ..

opm get ledgetech/lua-resty-http && \
opm get anjia0532/lua-resty-maxminddb && \
opm get zmartzone/lua-resty-openidc && \
luarocks-5.1 install lua-resty-woothee && \

Copilot AI Dec 23, 2025

The new luarocks-5.1 install lua-resty-woothee step downloads and installs a third-party Lua module at build time without any pinning (version, commit hash, or checksum) or integrity verification. This exposes the image build to a supply-chain attack if the LuaRocks index or the package is compromised or replaced. Because the install runs during the Docker build, with access to the build context and potentially to secrets used by CI/CD or future extensions, an attacker controlling the fetched package could run arbitrary code, alter artifacts, or exfiltrate secrets. To mitigate this, pin lua-resty-woothee to a specific, vetted version or content hash and, where possible, verify its integrity (e.g., via a checksum or vendoring) rather than installing the latest mutable release.

mclueppers merged commit 7465997 into main on Dec 24, 2025
9 checks passed
mclueppers deleted the fix/fingerprinting-profiles branch on December 24, 2025 14:03