igerber · Apr 24, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -13,6 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **`did_had_pretest_workflow(aggregate="event_study")`**: multi-period dispatch on balanced ≥3-period panels. Runs QUG at `F` + joint pre-trends Stute across earlier pre-periods + joint homogeneity-linearity Stute across post-periods. Step 2 closure requires ≥2 pre-periods; with only a single pre-period (the base `F-1`) `pretrends_joint=None` and the verdict flags the skip. Reuses the Phase 2b event-study panel validator (last-cohort auto-filter under staggered timing with `UserWarning`; `ValueError` when `first_treat_col=None` and the panel is staggered). The data-in wrappers `joint_pretrends_test` and `joint_homogeneity_test` also route through that same validator internally, so direct wrapper calls inherit the last-cohort filter and constant-post-dose invariant. `HADPretestReport` extended with `pretrends_joint`, `homogeneity_joint`, and `aggregate` fields; serialization methods (`summary`, `to_dict`, `to_dataframe`, `__repr__`) preserve the Phase 3 output bit-exactly on `aggregate="overall"` — no `aggregate` key, no header row, no schema drift — and only surface the new fields on `aggregate="event_study"`.
 - **`ChaisemartinDHaultfoeuille.by_path`** — per-path event-study disaggregation, mirroring R `did_multiplegt_dyn(..., by_path=k)`. Passing `by_path=k` (positive int) to the estimator reports separate `DID_{path,l}` + SE + inference for the top-k most common observed treatment paths in the window `[F_g-1, F_g-1+L_max]`, answering the practitioner question "is a single pulse enough, or do you need sustained exposure?" across paths like `(0,1,0,0)` vs `(0,1,1,0)` vs `(0,1,1,1)`. The per-path SE follows the joiners-only / leavers-only IF precedent (switcher-side contribution zeroed for non-path groups; control pool and cohort structure unchanged; plug-in SE with path-specific divisor). Requires `drop_larger_lower=False` (multi-switch groups are the object of interest) and `L_max >= 1`. Binary treatment only in this release; combinations with `controls`, `trends_linear`, `trends_nonparam`, `heterogeneity`, `design2`, `honest_did`, `survey_design`, and `n_bootstrap > 0` raise `NotImplementedError` and are deferred to follow-up PRs. Results expose `results.path_effects: Dict[Tuple[int, ...], Dict[str, Any]]` and `results.to_dataframe(level="by_path")`; the summary grows a "Treatment-Path Disaggregation" block. Ties in path frequency are broken lexicographically on the path tuple for deterministic ranking. Overflow (`by_path > n_observed_paths`) returns all observed paths with a `UserWarning`. See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path per-path event-study disaggregation)` for the full contract.
 - **R-parity for `ChaisemartinDHaultfoeuille.by_path`** against `DIDmultiplegtDYN 2.3.3`. Two new scenarios in `benchmarks/data/dcdh_dynr_golden_values.json` generated from `did_multiplegt_dyn(..., by_path=k)`: `mixed_single_switch_by_path` (2 paths, `by_path=2`) and `multi_path_reversible_by_path` (4 observed paths, `by_path=3`, via a new deterministic multi-path DGP pattern in the R generator). Per-path point estimates and per-path switcher counts match R exactly; per-path SE matches within the Phase 2 multi-horizon SE envelope (observed rtol ≤ 10.2% on the 2-path scenario, ≤ 4.2% on the 4-path scenario). Parity tests live at `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityByPath`, matching paths by tuple label via set-equality (robust to R's undocumented frequency-tie tiebreak) and cross-checking per-path switcher counts before SE comparison. **Deviation documented:** cross-path cohort sharing — our full-panel cohort-centered plug-in vs R's per-path re-run diverges materially when a `(D_{g,1}, F_g, S_g)` cohort spans multiple observed paths; the two coincide when every cohort is single-path. The parity scenarios are constructed to keep cohorts single-path (scenario 13 by design, scenario 14 via path-assignment-deterministic-on-F_g). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path...)` for the full write-up.
+- **`profile_panel()` utility + `llms-autonomous.txt` reference guide (agent-facing)** — new `diff_diff.profile_panel(df, *, unit, time, treatment, outcome)` returns a frozen `PanelProfile` dataclass of structural facts (panel balance, treatment-type classification — `"binary_absorbing"` / `"binary_non_absorbing"` / `"continuous"` / `"categorical"`, cohort structure, outcome characteristics, and a `tuple[Alert, ...]` of factual observations). `.to_dict()` returns a JSON-serializable view. Paired with a new bundled `"autonomous"` variant on `get_llm_guide()` — `get_llm_guide("autonomous")` returns a reference-shaped guide (distinct from the existing workflow-prose `"practitioner"` variant) with §1 audience disclaimer, §2 `PanelProfile` field reference, §3 embedded 17-estimator × 9-design-feature support matrix, §4 per-design-feature reasoning citing Baker et al. (2025) and Roth / Sant'Anna (2023), §5 post-fit validation index, §6 BR/DR schema reference, §7 citations, §8 intentional omissions. Both pieces are bundled inside the wheel (no GitHub / RTD dependency at runtime); `diff_diff/__init__.py` module docstring leads with an agent-entry block listing `profile_panel`, `get_llm_guide("autonomous")`, `get_llm_guide("practitioner")`, and `BusinessReport` so `help(diff_diff)` surfaces them. Descriptive, not opinionated — `profile_panel` alerts never recommend a specific estimator, and the guide enumerates trade-offs rather than dispatching. Exports: `profile_panel`, `PanelProfile`, `Alert` from top-level `diff_diff`.
 - **`target_parameter` block in BR/DR schemas (experimental; schema version bumped to 2.0)** — `BUSINESS_REPORT_SCHEMA_VERSION` and `DIAGNOSTIC_REPORT_SCHEMA_VERSION` bumped from `"1.0"` to `"2.0"` because the new `"no_scalar_by_design"` value on the `headline.status` / `headline_metric.status` enum (dCDH `trends_linear=True, L_max>=2` configuration) is a breaking change per the REPORTING.md stability policy. BusinessReport and DiagnosticReport now emit a top-level `target_parameter` block naming what the headline scalar actually represents for each of the 16 result classes. Closes BR/DR foundation gap #6 (target-parameter clarity). Fields: `name`, `definition`, `aggregation` (machine-readable dispatch tag), `headline_attribute` (raw result attribute), `reference` (citation pointer). BR's summary emits the short `name` right after the headline; DR's overall-interpretation paragraph does the same; both full reports carry a "## Target Parameter" section with the full definition. Per-estimator dispatch is sourced from REGISTRY.md and lives in the new `diff_diff/_reporting_helpers.py::describe_target_parameter`. A few branches read fit-time config (`EfficientDiDResults.pt_assumption`, `StackedDiDResults.clean_control`, `ChaisemartinDHaultfoeuilleResults.L_max` / `covariate_residuals` / `linear_trends_effects`); others emit a fixed tag (the fit-time `aggregate` kwarg on CS / Imputation / TwoStage / Wooldridge does not change the `overall_att` scalar — disambiguating horizon / group tables is tracked under gap #9). See `docs/methodology/REPORTING.md` "Target parameter" section.
 - SyntheticDiD coverage Monte Carlo calibration table added to `docs/methodology/REGISTRY.md` §SyntheticDiD — rejection rates at α ∈ {0.01, 0.05, 0.10} across `placebo` / `bootstrap` / `jackknife` on 3 representative DGPs (balanced / exchangeable, unbalanced, and Arkhangelsky et al. (2021) AER §6.3 non-exchangeable). Artifact at `benchmarks/data/sdid_coverage.json` (500 seeds × B=200), regenerable via `benchmarks/python/coverage_sdid.py`.
 

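The changelog entry above names `"binary_absorbing"` and `"binary_non_absorbing"` among the treatment-type labels. As a rough illustration of what that distinction means, here is a minimal standalone sketch. It is not `profile_panel`'s implementation: the function name `classify_binary_treatment`, the `(unit, time, treatment)` row layout, and the rule itself (absorbing means no unit ever switches from 1 back to 0) are assumptions made for illustration, and the continuous/categorical split is deliberately left out.

```python
from collections import defaultdict


def classify_binary_treatment(rows):
    """Hypothetical helper, not part of diff_diff.

    rows: iterable of (unit, time, treatment) tuples.
    Returns "binary_absorbing", "binary_non_absorbing", or None when the
    treatment takes values outside {0, 1}.
    """
    rows = list(rows)
    if not {d for _, _, d in rows} <= {0, 1}:
        return None  # non-binary treatment: not covered by this sketch

    # Collect each unit's treatment path.
    by_unit = defaultdict(list)
    for unit, time, d in rows:
        by_unit[unit].append((time, d))

    for path in by_unit.values():
        path.sort()  # order each unit's observations by time
        ds = [d for _, d in path]
        # A 1 -> 0 switch anywhere means treatment is not absorbing.
        if any(later < earlier for earlier, later in zip(ds, ds[1:])):
            return "binary_non_absorbing"
    return "binary_absorbing"
```

Under these assumptions, a panel where one unit follows the path `0, 1, 1` and another stays at `0, 0, 0` is classified as absorbing, while any unit with a path such as `0, 1, 0` makes the whole panel non-absorbing.
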
diff --git a/ROADMAP.md b/ROADMAP.md
@@ -137,15 +137,17 @@ Long-running program, framed as "building toward" rather than with discrete ship
 
 - Baker et al. (2025) 8-step workflow enforcement in `diff_diff/practitioner.py`.
 - `practitioner_next_steps()` context-aware guidance.
-- Runtime LLM guides via `get_llm_guide(...)` (`llms.txt`, `llms-full.txt`, `llms-practitioner.txt`), bundled in the wheel.
+- Runtime LLM guides via `get_llm_guide(...)` (`llms.txt`, `llms-full.txt`, `llms-practitioner.txt`, `llms-autonomous.txt`), bundled in the wheel.
+- `profile_panel(df, ...)` returns a `PanelProfile` dataclass of structural facts about the panel - factual, not opinionated. Pairs with the `"autonomous"` guide variant (reference-shaped: estimator-support matrix + per-design-feature reasoning) so agents describe the data then consult a bundled reference rather than calling a deterministic recommender.
+- Package docstring leads with a "For AI agents" entry block so `help(diff_diff)` surfaces the agent entry points automatically.
 - Silent-operation warnings so agents and humans see the same signals at the same time.
 
 **Next blocks toward the vision.**
 
-- **BusinessReport / DiagnosticReport** (in Shipping Next) - the output form the vision assumes.
+- **Post-hoc mismatch detection in BR/DR output** - surfaces structured warnings like "you fit TWFE on staggered data with 37% forbidden-comparison weights" when the profile and the fitted estimator disagree. Safety net, not a pre-emptive rules engine.
+- **Structured `sanity_checks` block in BR/DR** - machine-legible pass / warn / fail signals (pretrends, power, forbidden-comparisons, event-study cleanliness, placebo, sensitivity) so agents can dispatch on a stable schema rather than parsing prose.
 - **Context-aware `practitioner_next_steps()`** that substitutes actual column names - turns guidance into executable recommendations.
-- **AI-legible diagnostic surfaces** - once BusinessReport ships, a structured JSON counterpart that agents can parse without screen-scraping human text.
-- **Scenario-to-estimator selection guidance** - agent-facing extension of `docs/practitioner_decision_tree.rst` that returns a specific estimator choice plus rationale for a given scenario description.
+- **Unified `assess_*` verb** across estimator native-diagnostic methods for a single discoverable convention.
 - **End-to-end scenario walkthrough templates** - reusable orchestration recipes an agent can adapt from data ingest through business-ready output.
 
 ---

diff --git a/diff_diff/__init__.py b/diff_diff/__init__.py
@@ -4,14 +4,20 @@
 This library provides sklearn-like estimators for causal inference
 using the difference-in-differences methodology.
 
-For rigorous analysis, follow the 8-step practitioner workflow based
-on Baker et al. (2025). After estimation, call
-``practitioner_next_steps(results)`` for context-aware guidance on
-remaining diagnostic steps.
+For AI agents:
 
-AI agents: call ``diff_diff.get_llm_guide()`` for a complete API reference.
-Use ``get_llm_guide("practitioner")`` for the 8-step workflow or
-``get_llm_guide("full")`` for comprehensive documentation.
+    1. Describe your data:    ``diff_diff.profile_panel(df, unit=..., time=...,
+                              treatment=..., outcome=...)``
+    2. Consult the reference: ``diff_diff.get_llm_guide("autonomous")``
+                              (estimator-support matrix + reasoning)
+    3. Follow the workflow:   ``diff_diff.get_llm_guide("practitioner")``
+                              (Baker et al. (2025) 8-step recipe)
+    4. Report results:        ``diff_diff.BusinessReport(results)``
+                              (structured agent-legible output)
+
+For a comprehensive API reference call ``diff_diff.get_llm_guide("full")``;
+``practitioner_next_steps(results)`` returns context-aware guidance after
+any estimator's ``fit()``.
 """
 
 # Import backend detection from dedicated module (avoids circular imports)
@@ -244,6 +250,7 @@
     DiagnosticReportResults,
 )
 from diff_diff._guides_api import get_llm_guide
+from diff_diff.profile import Alert, PanelProfile, profile_panel
 from diff_diff.datasets import (
     clear_cache,
     list_datasets,
@@ -487,6 +494,10 @@
     "DiagnosticReport",
     "DiagnosticReportResults",
     "DIAGNOSTIC_REPORT_SCHEMA_VERSION",
+    # Panel profiling (agent-facing pre-fit describe utility)
+    "profile_panel",
+    "PanelProfile",
+    "Alert",
     # LLM guide accessor
     "get_llm_guide",
 ]
diff --git a/diff_diff/_guides_api.py b/diff_diff/_guides_api.py
@@ -1,4 +1,5 @@
 """Runtime accessor for bundled LLM guide files."""
+
 from __future__ import annotations
 
 from importlib.resources import files
@@ -7,6 +8,7 @@
     "concise": "llms.txt",
     "full": "llms-full.txt",
     "practitioner": "llms-practitioner.txt",
+    "autonomous": "llms-autonomous.txt",
 }
 
 
@@ -21,6 +23,10 @@ def get_llm_guide(variant: str = "concise") -> str:
         - ``"concise"`` -- compact API reference (llms.txt)
         - ``"full"`` -- complete API documentation (llms-full.txt)
         - ``"practitioner"`` -- 8-step practitioner workflow (llms-practitioner.txt)
+        - ``"autonomous"`` -- reference guide for AI-agent use: estimator-support
+          matrix, per-design-feature reasoning, post-fit validation index, and
+          BR/DR schema (llms-autonomous.txt). Pair with
+          :func:`diff_diff.profile_panel` for pre-fit data description.
 
     Returns
     -------
@@ -42,7 +48,5 @@ def get_llm_guide(variant: str = "concise") -> str:
         filename = _VARIANT_TO_FILE[variant]
     except (KeyError, TypeError):
         valid = ", ".join(repr(k) for k in _VARIANT_TO_FILE)
-        raise ValueError(
-            f"Unknown guide variant {variant!r}. Valid options: {valid}."
-        ) from None
+        raise ValueError(f"Unknown guide variant {variant!r}. Valid options: {valid}.") from None
     return files("diff_diff.guides").joinpath(filename).read_text(encoding="utf-8")
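The variant lookup touched by this hunk can be exercised in isolation. The sketch below copies the mapping and error handling from the diff into a standalone helper so it runs without the package installed; the name `resolve_guide_filename` is hypothetical, and the real `get_llm_guide` goes on to read the file through `importlib.resources`.

```python
# Mapping as it stands after this change (copied from the diff).
_VARIANT_TO_FILE = {
    "concise": "llms.txt",
    "full": "llms-full.txt",
    "practitioner": "llms-practitioner.txt",
    "autonomous": "llms-autonomous.txt",
}


def resolve_guide_filename(variant="concise"):
    """Hypothetical helper: return the bundled filename for a guide variant."""
    try:
        return _VARIANT_TO_FILE[variant]
    except (KeyError, TypeError):
        # KeyError: unknown name. TypeError: unhashable input such as a list.
        valid = ", ".join(repr(k) for k in _VARIANT_TO_FILE)
        raise ValueError(f"Unknown guide variant {variant!r}. Valid options: {valid}.") from None
```

Catching `TypeError` alongside `KeyError` means an unhashable argument produces the same `ValueError` as a misspelled variant, and `from None` suppresses the chained lookup traceback so callers see only the message listing the valid options.
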