@bliuchak
Contributor

@bliuchak bliuchak commented Oct 8, 2025

This PR adds HTTPS server support to proxy-chain.

flowchart LR
    subgraph HTTPS["Desired State"]
        direction LR
        C2[Client] -->|Encrypted HTTPS| P2[HTTPS Proxy] -->|HTTP/HTTPS| T2[Target]
    end
    subgraph HTTP["Current State"]
        direction LR
        C1[Client] -->|Cleartext HTTP| P1[HTTP Proxy] -->|HTTP/HTTPS| T1[Target]
    end

Readiness Checklist

  • Add ability to create either HTTP or HTTPS proxy server
  • Implement TLS overhead bytes counting for HTTPS proxy
  • Add tests (see Tests section)
  • Document limitations (see Known Limitations section)

Tests

  • Certificate validation and TLS security - test/https_edge_cases.js:
    • Expired certificates (strict SSL vs ignore errors)
    • Hostname mismatch detection and SNI handling
    • Invalid certificate chain validation
    • Multi-stage certificate validation in proxy chains
    • TLS version negotiation (rejects TLS 1.0/1.1, accepts TLS 1.2/1.3)
    • Strong cipher suite support
    • HTTPS target certificate handling via CONNECT tunnels
  • TLS Overhead Statistics - test/tls_overhead_stats.js:
    • Accurate byte counting including TLS handshake and encryption overhead
    • Fallback behavior when _parent socket unavailable (Node.js API stability)
    • Failed TLS handshake exclusion from statistics
    • Connection lifecycle monitoring with monotonic byte count validation
    • Keep-alive vs separate connections (TLS overhead amortization)
    • TLS session resumption tracking (80% overhead reduction in TLS 1.3)
    • Concurrent stats queries thread-safety
    • TLS overhead bytes counting for WebSocket + HTTPS Proxy (ws, wss)
    • TLS overhead bytes counting for SOCKS upstream
    • TLS overhead bytes counting for HTTPS upstream (see Known Limitations section)
  • Integration Testing
    • HTTPS proxy with SOCKS upstream - test/socks.js
    • HTTPS proxy with custom DNS lookup - test/dns_lookup.js
    • Parametric testing across HTTP/HTTPS server types - test/server.js
    • Test cases for HTTPS with IPv6 (see Known Limitations section)
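
The byte-counting and fallback items above can be illustrated with a minimal sketch. It is not the PR's implementation: the TlsLikeSocket and TrafficStats types and the getTrafficStats function are hypothetical. It also assumes that the wrapped socket exposed as _parent (an undocumented Node.js internal) reports bytes on the wire, while the outer socket reports application payload.

```typescript
// Hypothetical types for illustration; they mirror only the fields this sketch reads.
interface ByteCounters {
    bytesRead: number;
    bytesWritten: number;
}

interface TlsLikeSocket extends ByteCounters {
    // Assumption: undocumented internal reference to the raw socket; may be missing.
    _parent?: ByteCounters | null;
}

interface TrafficStats {
    rx: number; // bytes received, including TLS overhead when it is known
    tx: number; // bytes sent, including TLS overhead when it is known
    tlsOverheadRx: number | null; // null when the raw socket is unavailable
    tlsOverheadTx: number | null;
}

function getTrafficStats(socket: TlsLikeSocket): TrafficStats {
    const raw = socket._parent;
    if (!raw) {
        // Fallback: only the outer counters exist, so the overhead is unknown.
        return {
            rx: socket.bytesRead,
            tx: socket.bytesWritten,
            tlsOverheadRx: null,
            tlsOverheadTx: null,
        };
    }
    return {
        rx: raw.bytesRead,
        tx: raw.bytesWritten,
        // Overhead = bytes on the wire minus application payload bytes.
        tlsOverheadRx: raw.bytesRead - socket.bytesRead,
        tlsOverheadTx: raw.bytesWritten - socket.bytesWritten,
    };
}

// Usage with made-up counters (not measurements):
const stats = getTrafficStats({
    bytesRead: 100,
    bytesWritten: 200,
    _parent: { bytesRead: 150, bytesWritten: 260 },
});
console.log(stats.tlsOverheadRx, stats.tlsOverheadTx); // prints "50 60"
```

Returning null in the fallback path keeps "overhead unknown" distinguishable from "zero overhead", which matters if the internal field disappears in a future Node.js release.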

TLS Overhead Analysis

Investigate the current state of TLS overhead in proxy-chain.

  • Investigate where TLS overhead can occur (in both the legacy and the HTTPS implementation)
    • Client -> Proxy TLS (HTTPS Main Proxy)
    • Proxy -> Upstream TLS (HTTPS upstream)
    • Client -> Target TLS (End-to-End via CONNECT); no explicit tracking needed since it's end-to-end
  • Verify TLS overhead tracked correctly for these cases

Places Where TLS Overhead Is Possible:

  • Client -> Proxy TLS (HTTPS Main Proxy):
    • direct.ts - CONNECT to HTTPS target, no upstream
    • chain.ts - CONNECT via HTTP/HTTPS upstream
    • chain_socks.ts - CONNECT via SOCKS upstream
    • forward.ts - GET/POST to HTTP target (with or without upstream)
    • forward_socks.ts - GET/POST via SOCKS upstream
    • custom_response.ts - Custom response
    • custom_connect.ts - Custom CONNECT handling
  • Proxy -> HTTPS Upstream TLS (for both HTTP and HTTPS Main Proxy):
    • chain.ts - CONNECT via HTTPS upstream
    • forward.ts - GET/POST/etc via HTTPS upstream

Bugs

Extra bugs found and fixed during work on this PR.

Known Limitations

  1. Can't test HTTPS with IPv6 scenarios because of got-scraping limitations.
    1. Scope: HTTPS with IPv6 only. It's implemented but not tested. HTTP with IPv6 is both implemented and tested.
    2. Impact: Only tests are affected.
    3. Proposal A: Replace the got-scraping library (a dev dependency used only for testing) with an alternative that supports both SOCKS and IPv6.
    4. Proposal B: Keep it as-is and fix this in a future patch.
  2. TLS overhead for HTTPS upstreams. Should be fixed separately as a legacy issue.
    1. Scope: We're missing the TLS overhead count for HTTPS upstreams. It's a legacy issue that exists in current master. I've implemented the TLS overhead count for HTTP upstreams in the current PR.
    2. Impact: LOW. proxy-chain fully works with HTTP upstreams, including routing traffic to HTTPS targets using CONNECT. Only HTTPS upstreams (rare usage) will be missing TLS overhead bytes in the final statistics.
    3. Proposal: Merge this PR with the TLS overhead count for HTTP upstreams (our main use case) and solve TLS overhead for HTTPS upstreams separately in the future.

- Fix a data race in the error handler
- Add a regression test that verifies the data race fix
- Add TLS defaults for better security
@github-actions github-actions bot added t-core-services Issues with this label are in the ownership of the core services team. tested Temporary label used only programatically for some analytics. labels Oct 8, 2025
@bliuchak bliuchak added the t-unblocking Issues with this label are in the ownership of the unblocking team. label Oct 8, 2025
@jirimoravcik
Member

Also fixed:

  • a datarace for error event when we might log same events
  • fix for usage statistics trackings

Could you please point me to the changes that are related to the fixes? Thanks. Also, what was wrong with the statistics?

@bliuchak
Contributor Author

bliuchak commented Oct 9, 2025

Also fixed:

  • a datarace for error event when we might log same events
  • fix for usage statistics trackings

Could you please point me to the changes that are related to the fixes? Thanks. Also, what was wrong with the statistics?

  1. Datarace
    1. Fix - e6adb19#diff-8a8ae07582c9d433ec8c2e5c4310ff8901e604f4965c5b90a49117ad46c47595R335
    2. Regression tests - https://github.com/apify/proxy-chain/pull/602/files#diff-d14cbfb50ed1cad7db5f4fef6a6076961b7cc9be980a3be06a70998f0eb8ebceR1456-R1599
  2. Statistics
    1. Fix - 313f535#diff-8a8ae07582c9d433ec8c2e5c4310ff8901e604f4965c5b90a49117ad46c47595R658-R659
    2. Regression tests - https://github.com/apify/proxy-chain/pull/602/files#diff-d14cbfb50ed1cad7db5f4fef6a6076961b7cc9be980a3be06a70998f0eb8ebceR830-R871

Also, what was wrong with the statistics?

I don't remember with 100% certainty, but a few tests failed for HTTPS scenarios. I believe there were some issues related to undefined values in the statistics.

Member

@jirimoravcik jirimoravcik left a comment

Looks good, had a few comments.
In addition to that, could you please:

  1. Bump the package version
  2. Describe all the new things in README.md (which serves as the primary user-facing documentation)
Thanks

@bliuchak
Contributor Author

@jirimoravcik @lewis-wow Guys, I've added the main logic for TLS overhead bytes. Please take a look 🙏

Gonna polish the tests in the meantime and push 'em ASAP.

Contributor

@lewis-wow lewis-wow left a comment

Nice! Have nothing to add.

Member

@jirimoravcik jirimoravcik left a comment

Found a few more points for discussion

Comment on lines +231 to +249
if (options.serverType === 'https') {
    if (!options.httpsOptions) {
        throw new Error('httpsOptions is required when serverType is "https"');
    }

    // Apply secure TLS defaults (user options can override)
    // This prevents users from accidentally configuring insecure TLS settings
    const secureDefaults: https.ServerOptions = {
        ...HTTPS_DEFAULTS,
        honorCipherOrder: true, // Server chooses cipher (prevents downgrade attacks)
        ...options.httpsOptions, // User options override defaults
    };

    this.server = https.createServer(secureDefaults);
    this.serverType = 'https';
} else {
    this.server = http.createServer();
    this.serverType = 'http';
}
Member

Maybe validate if options.serverType is one of http, https? It would make it consistent with the type. I'd also set it to http by default in the constructor parameter. That way you could just do this.serverType = options.serverType

Contributor Author

Good point. Gonna add this.
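
One way the suggestion above could look, as a sketch rather than the code that was eventually merged: default serverType to 'http', validate it against the allowed values, and keep the spread order from the snippet so that user options win over defaults. The resolveServerConfig function, the SketchOptions and ResolvedConfig types, and the contents of SKETCH_HTTPS_DEFAULTS are all hypothetical; the PR's real HTTPS_DEFAULTS are not shown in this thread.

```typescript
type ServerType = 'http' | 'https';

// Hypothetical option shape, reduced to the two fields discussed in the review.
interface SketchOptions {
    serverType?: ServerType;
    httpsOptions?: Record<string, unknown>;
}

interface ResolvedConfig {
    serverType: ServerType;
    httpsOptions?: Record<string, unknown>;
}

const SERVER_TYPES: readonly string[] = ['http', 'https'];

// Placeholder values; an assumption standing in for the PR's HTTPS_DEFAULTS.
const SKETCH_HTTPS_DEFAULTS: Record<string, unknown> = { minVersion: 'TLSv1.2' };

function resolveServerConfig(options: SketchOptions = {}): ResolvedConfig {
    // Default to 'http' so existing callers keep their behavior.
    const serverType = options.serverType ?? 'http';

    // Runtime validation covers plain JavaScript callers that bypass the type.
    if (!SERVER_TYPES.includes(serverType)) {
        throw new Error(`serverType must be "http" or "https", got "${String(serverType)}"`);
    }

    if (serverType === 'http') {
        return { serverType };
    }

    if (!options.httpsOptions) {
        throw new Error('httpsOptions is required when serverType is "https"');
    }

    return {
        serverType,
        // Later spreads win: defaults first, user options last.
        httpsOptions: {
            ...SKETCH_HTTPS_DEFAULTS,
            honorCipherOrder: true,
            ...options.httpsOptions,
        },
    };
}

// Usage: the user-supplied minVersion replaces the placeholder default.
const config = resolveServerConfig({
    serverType: 'https',
    httpsOptions: { minVersion: 'TLSv1.3' },
});
console.log(config.serverType, config.httpsOptions?.minVersion); // prints "https TLSv1.3"
```

With the default applied up front, the constructor can assign this.serverType from the resolved value instead of setting it in each branch.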

src/server.ts Outdated
socket.proxyChainErrorHandled = true;

// Log errors only if there are no user-provided error handlers
if (this.listenerCount('error') === 0) {
Member

There was === 1 before, why is it === 0 now?

Contributor Author

If I understand correctly, the previous condition this.listenerCount('error') === 1 (which counts server listeners, not socket listeners) makes the app log the error only when there is exactly one server handler. Right?

If I'm correct, then in that case the error will be handled by that one server handler. For us it might be more useful to log here when there are no server handlers at all: this.listenerCount('error') === 0.

What do you think?

Member

So it was broken before, right?

Contributor Author

Yes, it was broken before. But I was wrong in my assumptions.

The original code should check socket.listenerCount('error') === 1 instead of this (server), because socket and server error events aren't interconnected.

Also, we're attaching additional error listeners in our handlers (e.g. direct).

So, basically the flow should look like this:

  1. onConnection() - attaches our early handler - socket.listenerCount('error') === 1
  2. onRequest() / onConnect() - prepareRequestHandling
  3. direct() function is called - attaches the sourceSocket error handler
  4. Now socket.listenerCount('error') === 2

If an error occurs after step 3, the direct() handler will log it instead of our early handler.

If an error occurs between steps 1 and 2, our early handler will log it.

I'm gonna prepare a fix for this.
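
The listener-count flow described above can be reproduced with plain EventEmitter instances. This is a sketch only: the simulate function and the log strings are made up for illustration, and the two emitters merely stand in for the proxy server and a client socket.

```typescript
import { EventEmitter } from 'node:events';

// Returns the log lines produced when the socket emits an error.
function simulate(handlerAttached: boolean): string[] {
    const logged: string[] = [];

    // Stand-in for the proxy server with a user-provided error listener.
    // It is a separate emitter, so it never changes the socket's listener count.
    const server = new EventEmitter();
    server.on('error', () => {});

    // Stand-in for the client socket.
    const socket = new EventEmitter();

    // Step 1: early handler attached on connection.
    socket.on('error', (err: Error) => {
        // Log only while this is the sole listener on the socket.
        if (socket.listenerCount('error') === 1) {
            logged.push(`early handler: ${err.message}`);
        }
    });

    // Step 3: a request handler such as direct() attaches its own listener.
    if (handlerAttached) {
        socket.on('error', (err: Error) => {
            logged.push(`direct handler: ${err.message}`);
        });
    }

    socket.emit('error', new Error('ECONNRESET'));
    return logged;
}

console.log(simulate(false)); // [ 'early handler: ECONNRESET' ]
console.log(simulate(true)); // [ 'direct handler: ECONNRESET' ]
```

In both runs the error is logged exactly once, and the server's own listener plays no part in the decision.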

Contributor Author

@jirimoravcik fixed here e6ea53b


it('handles upstream HTTPS proxy with expired certificate', async () => {
    const expiredCert = loadCertificate('expired');
    const validCert = loadCertificate('valid');