Problem
Data Machine AI requests are still frequently hitting a 120s timeout on intelligence-chubes4, even after reducing prompt/data-packet/memory size.
Recent Flow 2 jobs all showed the same first-attempt failure shape:
```
Network error occurred while sending request to https://api.openai.com/v1/responses: cURL error 28: Connection timed out after 120000 milliseconds
```
Observed jobs:
- Job 639: first AI request timed out after 120s, retry succeeded.
- Job 640: first AI request timed out after 120s, retry reached a tool-path bug and skipped.
- Job 641: first AI request timed out after 120s, retry succeeded and wrote a wiki page.
- Job 642: first AI request timed out after 120s, retry succeeded and rejected the source.
Job 641 was only ~4.5k tokens, so this no longer looks primarily like a prompt/data-size problem; it looks like a transport-timeout-semantics problem.
Studio Context
Automattic/studio#3120 raises Studio's low-speed watchdog to 120s for AI TTFB. That behavior is already live on the local Studio site:
```
CURLOPT_CONNECTTIMEOUT = 30;
CURLOPT_LOW_SPEED_LIMIT = 1024;
CURLOPT_LOW_SPEED_TIME = 120;
```
However, Data Machine overrides AI request cURL options in RequestBuilder:
```php
curl_setopt( $handle, CURLOPT_CONNECTTIMEOUT, ceil( $connect_timeout ) );
curl_setopt( $handle, CURLOPT_LOW_SPEED_TIME, ceil( $request_timeout ) );
curl_setopt( $handle, CURLOPT_LOW_SPEED_LIMIT, 1 );
```
The local Data Machine setting is currently:
```
wp_ai_client_connect_timeout = 120
```
The exact 120000ms error strongly suggests Data Machine's connect timeout override is the timeout currently winning, not Studio's low-speed watchdog.
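The override order explains why Data Machine's value wins: later `curl_setopt()` calls on the same handle replace earlier ones, so if Data Machine's cURL hook runs after Studio's mu-plugin, its values take effect. A sketch of the resulting effective profile, under that ordering assumption (the 120 comes from the local `wp_ai_client_connect_timeout` setting shown above):

```php
// Sketch of the effective cURL options on the affected site, assuming
// Data Machine's hook fires after Studio's mu-plugin and overwrites it.
// Studio's mu-plugin first sets:
//   CURLOPT_CONNECTTIMEOUT = 30, CURLOPT_LOW_SPEED_LIMIT = 1024, CURLOPT_LOW_SPEED_TIME = 120
// Data Machine then overwrites on the same handle:
curl_setopt( $handle, CURLOPT_CONNECTTIMEOUT, 120 ); // ceil( wp_ai_client_connect_timeout ) — matches the 120000ms error
curl_setopt( $handle, CURLOPT_LOW_SPEED_TIME, 300 ); // ceil( $request_timeout )
curl_setopt( $handle, CURLOPT_LOW_SPEED_LIMIT, 1 );  // 1 byte/s floor — effectively a 300s stall watchdog
```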
Root Cause Hypothesis
Data Machine exposes/tunes the wrong timeout as an operator-visible setting.
Current effective model:
- request timeout: 300s, hardcoded but filterable
- connect timeout: operator-visible setting, currently 120s
- retry delay: 60s
A 120s connect timeout is too expensive for autonomous runs. If connection establishment or provider edge contact stalls, the attempt should fail fast and retry with a fresh connection. The model response itself should still have a long request timeout.
Desired Behavior
Separate timeout semantics clearly:
- connect timeout: short, e.g. 10-20s default
- request timeout: long, e.g. 300s default
- low-speed/TTFB watchdog: long enough for a non-streaming AI response
- retry delay: optionally shorter for transport/connect failures
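Until first-class settings land, the desired split can be sketched with Data Machine's existing timeout filters. The filter names are the ones this issue asks to preserve; the values shown are the proposed defaults, not the current ones, so treat this as illustrative operator tuning rather than shipped behavior:

```php
// Illustrative tuning via Data Machine's existing timeout filters.
// Values are the proposed defaults from this issue, not current defaults.
add_filter( 'datamachine_wp_ai_client_connect_timeout', function ( $seconds ) {
	return 15;  // short: a stalled connect should fail fast and retry on a fresh connection
} );
add_filter( 'datamachine_wp_ai_client_request_timeout', function ( $seconds ) {
	return 300; // long: preserve the full response budget for a connected model call
} );
```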
Acceptance Criteria
- Add first-class Data Machine settings for both:
  - `wp_ai_client_connect_timeout`
  - `wp_ai_client_request_timeout`
- Default connect timeout should be short enough for autonomous retries, likely 15s or 30s. Prefer 15s if tests/compatibility are okay.
- Request timeout should remain long, likely 300s.
- Preserve filters:
  - `datamachine_wp_ai_client_connect_timeout`
  - `datamachine_wp_ai_client_request_timeout`
- Log resolved AI transport profile before/around dispatch, including:
  - mode
  - provider
  - model
  - job_id / flow_step_id when available
  - resolved request timeout
  - resolved connect timeout
  - whether the RequestOptions class was used
  - whether the Data Machine cURL hook was installed
- On AI request failure logs, include the same resolved timeout profile so operators can tell which timeout likely fired.
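The resolved profile could be built once and attached to both the dispatch log and any failure log as a single structured array. The variable names and the plain `error_log()` sink below are placeholders, not existing Data Machine APIs:

```php
// Hypothetical structured payload for the resolved AI transport profile.
// Keys mirror the fields listed above; all variable names are illustrative.
$transport_profile = array(
	'mode'                 => $mode,
	'provider'             => $provider,
	'model'                => $model,
	'job_id'               => $job_id,        // null when not available
	'flow_step_id'         => $flow_step_id,  // null when not available
	'request_timeout'      => $resolved_request_timeout, // seconds
	'connect_timeout'      => $resolved_connect_timeout, // seconds
	'request_options_used' => $request_options_used,     // bool
	'curl_hook_installed'  => $curl_hook_installed,      // bool
);
// Logged before dispatch and re-attached on failure so operators can tell
// which timeout likely fired:
error_log( 'datamachine ai transport profile: ' . json_encode( $transport_profile ) );
```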
- Keep Studio's generic mu-plugin behavior independent; no Studio-specific workaround or endpoint discrimination.
- Add focused tests around timeout resolution/settings/log metadata where practical.
Notes
This should not reduce the model's thinking/response budget. The point is to fail fast only on connection establishment/transport stalls, while preserving a long total request timeout for connected non-streaming model calls.
This is part of making WordPress.com MGS wiki Flow 2 safe for set-and-forget operation. Processed-item semantics are now safer after #1815 (reject_source/defer_item), but the transport layer is still too slow/noisy to be boring.