diff --git a/CHANGELOG.md b/CHANGELOG.md index aaa2d43d..d7474d25 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,6 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **`did_had_pretest_workflow(aggregate="event_study")`**: multi-period dispatch on balanced ≥3-period panels. Runs QUG at `F` + joint pre-trends Stute across earlier pre-periods + joint homogeneity-linearity Stute across post-periods. Step 2 closure requires ≥2 pre-periods; with only a single pre-period (the base `F-1`) `pretrends_joint=None` and the verdict flags the skip. Reuses the Phase 2b event-study panel validator (last-cohort auto-filter under staggered timing with `UserWarning`; `ValueError` when `first_treat_col=None` and the panel is staggered). The data-in wrappers `joint_pretrends_test` and `joint_homogeneity_test` also route through that same validator internally, so direct wrapper calls inherit the last-cohort filter and constant-post-dose invariant. `HADPretestReport` extended with `pretrends_joint`, `homogeneity_joint`, and `aggregate` fields; serialization methods (`summary`, `to_dict`, `to_dataframe`, `__repr__`) preserve the Phase 3 output bit-exactly on `aggregate="overall"` — no `aggregate` key, no header row, no schema drift — and only surface the new fields on `aggregate="event_study"`. - **`ChaisemartinDHaultfoeuille.by_path`** — per-path event-study disaggregation, mirroring R `did_multiplegt_dyn(..., by_path=k)`. Passing `by_path=k` (positive int) to the estimator reports separate `DID_{path,l}` + SE + inference for the top-k most common observed treatment paths in the window `[F_g-1, F_g-1+L_max]`, answering the practitioner question "is a single pulse enough, or do you need sustained exposure?" across paths like `(0,1,0,0)` vs `(0,1,1,0)` vs `(0,1,1,1)`. The per-path SE follows the joiners-only / leavers-only IF precedent (switcher-side contribution zeroed for non-path groups; control pool and cohort structure unchanged; plug-in SE with path-specific divisor). Requires `drop_larger_lower=False` (multi-switch groups are the object of interest) and `L_max >= 1`. Binary treatment only in this release; combinations with `controls`, `trends_linear`, `trends_nonparam`, `heterogeneity`, `design2`, `honest_did`, `survey_design`, and `n_bootstrap > 0` raise `NotImplementedError` and are deferred to follow-up PRs. Results expose `results.path_effects: Dict[Tuple[int, ...], Dict[str, Any]]` and `results.to_dataframe(level="by_path")`; the summary grows a "Treatment-Path Disaggregation" block. Ties in path frequency are broken lexicographically on the path tuple for deterministic ranking. Overflow (`by_path > n_observed_paths`) returns all observed paths with a `UserWarning`. See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path per-path event-study disaggregation)` for the full contract. - **R-parity for `ChaisemartinDHaultfoeuille.by_path`** against `DIDmultiplegtDYN 2.3.3`. Two new scenarios in `benchmarks/data/dcdh_dynr_golden_values.json` generated from `did_multiplegt_dyn(..., by_path=k)`: `mixed_single_switch_by_path` (2 paths, `by_path=2`) and `multi_path_reversible_by_path` (4 observed paths, `by_path=3`, via a new deterministic multi-path DGP pattern in the R generator). Per-path point estimates and per-path switcher counts match R exactly; per-path SE matches within the Phase 2 multi-horizon SE envelope (observed rtol ≤ 10.2% on the 2-path scenario, ≤ 4.2% on the 4-path scenario). 
Parity tests live at `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityByPath`, matching paths by tuple label via set-equality (robust to R's undocumented frequency-tie tiebreak) and cross-checking per-path switcher counts before SE comparison. **Deviation documented:** cross-path cohort sharing — our full-panel cohort-centered plug-in vs R's per-path re-run diverges materially when a `(D_{g,1}, F_g, S_g)` cohort spans multiple observed paths; the two coincide when every cohort is single-path. The parity scenarios are constructed to keep cohorts single-path (scenario 13 by design, scenario 14 via path-assignment-deterministic-on-F_g). See `docs/methodology/REGISTRY.md` §ChaisemartinDHaultfoeuille `Note (Phase 3 by_path...)` for the full write-up. +- **`profile_panel()` utility + `llms-autonomous.txt` reference guide (agent-facing)** — new `diff_diff.profile_panel(df, *, unit, time, treatment, outcome)` returns a frozen `PanelProfile` dataclass of structural facts (panel balance, treatment-type classification — `"binary_absorbing"` / `"binary_non_absorbing"` / `"continuous"` / `"categorical"`, cohort structure, outcome characteristics, and a `tuple[Alert, ...]` of factual observations). `.to_dict()` returns a JSON-serializable view. Paired with a new bundled `"autonomous"` variant on `get_llm_guide()` — `get_llm_guide("autonomous")` returns a reference-shaped guide (distinct from the existing workflow-prose `"practitioner"` variant) with §1 audience disclaimer, §2 `PanelProfile` field reference, §3 embedded 17-estimator × 9-design-feature support matrix, §4 per-design-feature reasoning citing Baker et al. (2025) and Roth / Sant'Anna (2023), §5 post-fit validation index, §6 BR/DR schema reference, §7 citations, §8 intentional omissions. Both pieces are bundled inside the wheel (no GitHub / RTD dependency at runtime); `diff_diff/__init__.py` module docstring leads with an agent-entry block listing `profile_panel`, `get_llm_guide("autonomous")`, `get_llm_guide("practitioner")`, and `BusinessReport` so `help(diff_diff)` surfaces them. Descriptive, not opinionated — `profile_panel` alerts never recommend a specific estimator, and the guide enumerates trade-offs rather than dispatching. Exports: `profile_panel`, `PanelProfile`, `Alert` from top-level `diff_diff`. - **`target_parameter` block in BR/DR schemas (experimental; schema version bumped to 2.0)** — `BUSINESS_REPORT_SCHEMA_VERSION` and `DIAGNOSTIC_REPORT_SCHEMA_VERSION` bumped from `"1.0"` to `"2.0"` because the new `"no_scalar_by_design"` value on the `headline.status` / `headline_metric.status` enum (dCDH `trends_linear=True, L_max>=2` configuration) is a breaking change per the REPORTING.md stability policy. BusinessReport and DiagnosticReport now emit a top-level `target_parameter` block naming what the headline scalar actually represents for each of the 16 result classes. Closes BR/DR foundation gap #6 (target-parameter clarity). Fields: `name`, `definition`, `aggregation` (machine-readable dispatch tag), `headline_attribute` (raw result attribute), `reference` (citation pointer). BR's summary emits the short `name` right after the headline; DR's overall-interpretation paragraph does the same; both full reports carry a "## Target Parameter" section with the full definition. Per-estimator dispatch is sourced from REGISTRY.md and lives in the new `diff_diff/_reporting_helpers.py::describe_target_parameter`. 
A few branches read fit-time config (`EfficientDiDResults.pt_assumption`, `StackedDiDResults.clean_control`, `ChaisemartinDHaultfoeuilleResults.L_max` / `covariate_residuals` / `linear_trends_effects`); others emit a fixed tag (the fit-time `aggregate` kwarg on CS / Imputation / TwoStage / Wooldridge does not change the `overall_att` scalar — disambiguating horizon / group tables is tracked under gap #9). See `docs/methodology/REPORTING.md` "Target parameter" section. - SyntheticDiD coverage Monte Carlo calibration table added to `docs/methodology/REGISTRY.md` §SyntheticDiD — rejection rates at α ∈ {0.01, 0.05, 0.10} across `placebo` / `bootstrap` / `jackknife` on 3 representative DGPs (balanced / exchangeable, unbalanced, and Arkhangelsky et al. (2021) AER §6.3 non-exchangeable). Artifact at `benchmarks/data/sdid_coverage.json` (500 seeds × B=200), regenerable via `benchmarks/python/coverage_sdid.py`. diff --git a/ROADMAP.md b/ROADMAP.md index 65a4b119..fcc02f22 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -137,15 +137,17 @@ Long-running program, framed as "building toward" rather than with discrete ship - Baker et al. (2025) 8-step workflow enforcement in `diff_diff/practitioner.py`. - `practitioner_next_steps()` context-aware guidance. -- Runtime LLM guides via `get_llm_guide(...)` (`llms.txt`, `llms-full.txt`, `llms-practitioner.txt`), bundled in the wheel. +- Runtime LLM guides via `get_llm_guide(...)` (`llms.txt`, `llms-full.txt`, `llms-practitioner.txt`, `llms-autonomous.txt`), bundled in the wheel. +- `profile_panel(df, ...)` returns a `PanelProfile` dataclass of structural facts about the panel - factual, not opinionated. Pairs with the `"autonomous"` guide variant (reference-shaped: estimator-support matrix + per-design-feature reasoning) so agents describe the data and then consult a bundled reference rather than calling a deterministic recommender. +- Package docstring leads with a "For AI agents" entry block so `help(diff_diff)` surfaces the agent entry points automatically. - Silent-operation warnings so agents and humans see the same signals at the same time. **Next blocks toward the vision.** -- **BusinessReport / DiagnosticReport** (in Shipping Next) - the output form the vision assumes. +- **Post-hoc mismatch detection in BR/DR output** - surfaces structured warnings like "you fit TWFE on staggered data with 37% forbidden-comparison weights" when the profile and the fitted estimator disagree. Safety net, not a pre-emptive rules engine. +- **Structured `sanity_checks` block in BR/DR** - machine-legible pass / warn / fail signals (pretrends, power, forbidden-comparisons, event-study cleanliness, placebo, sensitivity) so agents can dispatch on a stable schema rather than parsing prose. - **Context-aware `practitioner_next_steps()`** that substitutes actual column names - turns guidance into executable recommendations. -- **AI-legible diagnostic surfaces** - once BusinessReport ships, a structured JSON counterpart that agents can parse without screen-scraping human text. -- **Scenario-to-estimator selection guidance** - agent-facing extension of `docs/practitioner_decision_tree.rst` that returns a specific estimator choice plus rationale for a given scenario description. +- **Unified `assess_*` verb** across estimator native-diagnostic methods for a single discoverable convention. - **End-to-end scenario walkthrough templates** - reusable orchestration recipes an agent can adapt from data ingest through business-ready output.
--- diff --git a/diff_diff/__init__.py b/diff_diff/__init__.py index 4a9b93c4..0247eecd 100644 --- a/diff_diff/__init__.py +++ b/diff_diff/__init__.py @@ -4,14 +4,20 @@ This library provides sklearn-like estimators for causal inference using the difference-in-differences methodology. -For rigorous analysis, follow the 8-step practitioner workflow based -on Baker et al. (2025). After estimation, call -``practitioner_next_steps(results)`` for context-aware guidance on -remaining diagnostic steps. +For AI agents: -AI agents: call ``diff_diff.get_llm_guide()`` for a complete API reference. -Use ``get_llm_guide("practitioner")`` for the 8-step workflow or -``get_llm_guide("full")`` for comprehensive documentation. + 1. Describe your data: ``diff_diff.profile_panel(df, unit=..., time=..., + treatment=..., outcome=...)`` + 2. Consult the reference: ``diff_diff.get_llm_guide("autonomous")`` + (estimator-support matrix + reasoning) + 3. Follow the workflow: ``diff_diff.get_llm_guide("practitioner")`` + (Baker et al. (2025) 8-step recipe) + 4. Report results: ``diff_diff.BusinessReport(results)`` + (structured agent-legible output) + +For a comprehensive API reference call ``diff_diff.get_llm_guide("full")``; +``practitioner_next_steps(results)`` returns context-aware guidance after +any estimator's ``fit()``. """ # Import backend detection from dedicated module (avoids circular imports) @@ -244,6 +250,7 @@ DiagnosticReportResults, ) from diff_diff._guides_api import get_llm_guide +from diff_diff.profile import Alert, PanelProfile, profile_panel from diff_diff.datasets import ( clear_cache, list_datasets, @@ -487,6 +494,10 @@ "DiagnosticReport", "DiagnosticReportResults", "DIAGNOSTIC_REPORT_SCHEMA_VERSION", + # Panel profiling (agent-facing pre-fit describe utility) + "profile_panel", + "PanelProfile", + "Alert", # LLM guide accessor "get_llm_guide", ] diff --git a/diff_diff/_guides_api.py b/diff_diff/_guides_api.py index 5a00ed77..503c74ba 100644 --- a/diff_diff/_guides_api.py +++ b/diff_diff/_guides_api.py @@ -1,4 +1,5 @@ """Runtime accessor for bundled LLM guide files.""" + from __future__ import annotations from importlib.resources import files @@ -7,6 +8,7 @@ "concise": "llms.txt", "full": "llms-full.txt", "practitioner": "llms-practitioner.txt", + "autonomous": "llms-autonomous.txt", } @@ -21,6 +23,10 @@ def get_llm_guide(variant: str = "concise") -> str: - ``"concise"`` -- compact API reference (llms.txt) - ``"full"`` -- complete API documentation (llms-full.txt) - ``"practitioner"`` -- 8-step practitioner workflow (llms-practitioner.txt) + - ``"autonomous"`` -- reference guide for AI-agent use: estimator-support + matrix, per-design-feature reasoning, post-fit validation index, and + BR/DR schema (llms-autonomous.txt). Pair with + :func:`diff_diff.profile_panel` for pre-fit data description. Returns ------- @@ -42,7 +48,5 @@ def get_llm_guide(variant: str = "concise") -> str: filename = _VARIANT_TO_FILE[variant] except (KeyError, TypeError): valid = ", ".join(repr(k) for k in _VARIANT_TO_FILE) - raise ValueError( - f"Unknown guide variant {variant!r}. Valid options: {valid}." - ) from None + raise ValueError(f"Unknown guide variant {variant!r}. 
Valid options: {valid}.") from None return files("diff_diff.guides").joinpath(filename).read_text(encoding="utf-8") diff --git a/diff_diff/guides/llms-autonomous.txt b/diff_diff/guides/llms-autonomous.txt new file mode 100644 index 00000000..b571911e --- /dev/null +++ b/diff_diff/guides/llms-autonomous.txt @@ -0,0 +1,844 @@ +# diff-diff: Autonomous-agent reference guide + +This guide is reference material for AI agents using diff-diff without +human-in-the-loop supervision. It catalogs the library's estimators, names +the design features each supports, explains how to read the +`profile_panel()` output, and points at post-fit validation utilities and +report schemas. + +It is a reference, not a decision tree. Multiple estimators usually fit a +given panel; choosing between them involves trade-offs the cited literature +discusses and that this guide does not pretend to resolve. + +**Pair this guide with:** +- `get_llm_guide("practitioner")` - the Baker et al. (2025) 8-step validation + workflow in workflow-prose form. +- `get_llm_guide("full")` - comprehensive API documentation for every public + function and class. +- `profile_panel(df, unit=..., time=..., treatment=..., outcome=...)` - the + pre-fit describe utility whose output fields this guide's sections §2 and + §4 reason about. + + +## Table of contents + +- §1. What this guide is (and is not) +- §2. PanelProfile field reference +- §3. Estimator-support matrix +- §4. Estimator-choice reasoning by design feature +- §5. Post-fit validation utilities +- §6. How to read BusinessReport / DiagnosticReport output +- §7. Glossary + citations +- §8. Intentional omissions + + +## §1. What this guide is (and is not) + +**What it is.** A reference you consult after running `profile_panel()` and +before calling any estimator's `fit()`. The matrix in §3 and the per-design- +feature discussions in §4 tell you which estimators are well-suited to the +panel shape reported by the profile; the post-fit index in §5 tells you +which diagnostics apply once you have a fitted result. + +**What it is not.** A deterministic recommender. No function in diff-diff +returns "pick estimator X." This guide does not either. When several +estimators fit a design, it enumerates them and names the trade-offs. The +agent is responsible for weighing those trade-offs (often with the cited +references in §7) and justifying the choice in the final write-up. + +**Why this shape.** A rules-engine recommender would lock in a policy that +ages poorly as new estimators land and as the applied-econometrics +literature evolves. Static reference material + descriptive profiling is +less brittle: when a new estimator is added it gets a row in §3 and a +paragraph in §4, without rewriting a dispatcher. + + +## §2. PanelProfile field reference + +`profile_panel(df, unit=..., time=..., treatment=..., outcome=...)` returns +a frozen `PanelProfile` dataclass. Call `.to_dict()` for a JSON-serializable +view. Every field below appears as a top-level key in that dict. + +### Panel structure + +- **`n_units: int`** - count of distinct values in the `unit` column. +- **`n_periods: int`** - count of distinct values in the `time` column. +- **`n_obs: int`** - total rows in the panel. +- **`is_balanced: bool`** - true iff every distinct `(unit, time)` cell + appears at least once in the panel (i.e. the unique `(unit, time)` + support equals `n_units * n_periods`). Duplicate rows do not affect + balance but are surfaced via the `duplicate_unit_time_rows` alert. 
+- **`observation_coverage: float`** - ratio of unique `(unit, time)` + keys to `n_units * n_periods`, always in `[0, 1]` (duplicates do not + inflate). A value below `0.70` also triggers the + `panel_highly_unbalanced` alert. + +### Treatment variation + +- **`treatment_type: str`** - classification of the treatment column. + Exactly one of: + - `"binary_absorbing"`: observed non-NaN values are a subset of + {0, 1} (one or two distinct values, covering all-zero and all-one + panels as valid degenerate cases) and each unit's treatment + sequence (ordered by `time`) is weakly monotone non-decreasing. + The canonical DiD setting. + - `"binary_non_absorbing"`: values a subset of {0, 1} with at least + two distinct values observed, where at least one unit switches + from 1 back to 0. Only `ChaisemartinDHaultfoeuille` handles this + natively; the other absorbing-only estimators would misapply. + - `"continuous"`: numeric with more than two distinct values, or a + two-valued numeric column whose values are not in {0, 1} (e.g., + a dose, a discrete-integer partial-adoption score). Use + `ContinuousDiD` or `HeterogeneousAdoptionDiD`. + - `"categorical"`: non-numeric dtype (object / category), or a + column that is entirely NaN. Often indicates a treatment arm. + Encode each arm as a binary indicator and fit separately, or + use a multi-treatment workflow outside the current estimator + suite. + + Bool-dtype treatment columns (`True` / `False`) are classified the + same way as numeric `{0, 1}`: the library's binary estimators + validate on value support rather than dtype, so `True` and `False` + behave like `1` and `0` for absorbing / non-absorbing classification. +- **`is_staggered: bool`** - true iff treatment is `binary_absorbing` and + at least two distinct first-treatment periods are observed. Drives the + choice between classic DiD/TWFE and staggered-robust estimators. +- **`n_cohorts: int`** - for `binary_absorbing`, the number of distinct + first-treatment periods (cohorts). Zero for other `treatment_type` + values. +- **`cohort_sizes: Mapping[Any, int]`** - map from first-treatment period + to cohort size (number of units adopting at that time). Empty for + non-absorbing / continuous / categorical treatments. +- **`has_never_treated: bool`** - at least one unit has `treatment == 0` + in every observed non-NaN row (applies to both binary and continuous + treatment columns; for continuous this flags zero-dose control units). + Required by `SyntheticDiD`, `SunAbraham`, `EfficientDiD` under both + `assumption="PT-All"` and `assumption="PT-Post"` (unless + `control_group="last_cohort"` is passed), and `ContinuousDiD` + (which requires `P(D=0) > 0` - Remark 3.1 lowest-dose-as-control + is not yet implemented). Preferred-but-optional by + `CallawaySantAnna` and `ChaisemartinDHaultfoeuille`. Always `False` + for `"categorical"`. +- **`has_always_treated: bool`** - at least one binary-treatment + unit has `treatment == 1` in every observed non-NaN row (no + pre-treatment information for that unit in the DiD sense). + Binary-only semantics: for `"continuous"` panels this field is + always `False` because pre-treatment periods are determined by the + `first_treat` column supplied to `ContinuousDiD.fit()`, not by + whether the dose is positive - a unit with a constant positive dose + can still have well-defined pre-treatment periods. Always `False` + for `"categorical"` too. 
+- **`treatment_varies_within_unit: bool`** - at least one unit has more + than one distinct non-NaN treatment value across its observed rows. + For binary panels this is normally `True` (pre vs. post the adoption + period), and for continuous panels this flags time-varying dose. + `ContinuousDiD.fit()` requires this to be `False` (dose must be + time-invariant per unit, per Callaway et al. 2024); a `True` value on + a continuous panel rules the estimator out. Always `False` for + `"categorical"`. + +### Timing + +- **`first_treatment_period: Optional[Any]`** - earliest first-treatment + period observed (for `binary_absorbing`); `None` otherwise. +- **`last_treatment_period: Optional[Any]`** - latest first-treatment + period observed; `None` otherwise. +- **`min_pre_periods: Optional[int]`** - across treated units, the + smallest number of observed pre-treatment periods (each treated + unit's observed `(unit, time)` support is counted independently, so + this reflects the least-supported treated unit on unbalanced panels). + Low values (< 3) fire the `short_pre_panel` alert and limit power + for parallel-trends tests. +- **`min_post_periods: Optional[int]`** - across treated units, the + smallest number of observed post-treatment periods; same per-unit + support semantics as above. Low values limit event-study dynamics. + +### Outcome + +- **`outcome_dtype: str`** - the pandas dtype name (e.g. `"float64"`, + `"int64"`, `"bool"`). +- **`outcome_is_binary: bool`** - outcome has exactly two distinct + non-NaN values, both in {0, 1}. For binary outcomes the linear + parallel-trends assumption is restrictive; consider the logit/log-odds + alternative in the Roth/Sant'Anna (2023) survey. +- **`outcome_has_zeros: bool`** - any non-NaN outcome equals zero. + Relevant for log-transform diagnostics. +- **`outcome_has_negatives: bool`** - any non-NaN outcome is negative. + Relevant for log-transform diagnostics. +- **`outcome_missing_fraction: float`** - share of rows where the + outcome column is NaN, in `[0, 1]`. +- **`outcome_summary: Mapping[str, float]`** - `{min, max, mean, std}` + computed with NaN-skipping; empty for non-numeric outcomes. + +### Alerts + +`alerts: tuple[Alert, ...]` is a list of factual observations. Each +`Alert` has `code`, `severity` (`"info"` or `"warn"`), `message`, and +`observed` (the numerical or boolean value that tripped the alert). + +The v1 alert catalogue is listed below. Alerts never name a specific +estimator. Severity `"warn"` means the observation is likely relevant to +estimator choice or to the interpretation of diagnostics; `"info"` means +it is descriptive context. 
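+A minimal usage sketch tying the §2 fields together (the column names
+`firm`, `year`, `treated`, and `revenue` are illustrative, not required
+names; the alert catalogue itself follows in the table below):
+
+```python
+import pandas as pd
+
+from diff_diff import profile_panel
+
+# Toy balanced two-unit, three-period panel.
+df = pd.DataFrame({
+    "firm":    [1, 1, 1, 2, 2, 2],
+    "year":    [2000, 2001, 2002, 2000, 2001, 2002],
+    "treated": [0, 1, 1, 0, 0, 0],
+    "revenue": [1.0, 1.4, 1.5, 0.9, 1.0, 1.1],
+})
+
+profile = profile_panel(
+    df, unit="firm", time="year", treatment="treated", outcome="revenue"
+)
+
+profile.treatment_type   # "binary_absorbing": values in {0, 1}, monotone per unit
+profile.is_balanced      # True: every (unit, time) cell is observed
+
+# Alerts are factual observations; surface the "warn"-severity ones.
+for alert in profile.alerts:
+    if alert.severity == "warn":
+        print(alert.code, "->", alert.message, "| observed:", alert.observed)
+
+payload = profile.to_dict()   # JSON-serializable view of every field above
+```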
+ +| Alert code | Severity | Fires when | +|---|---|---| +| `missing_id_rows_dropped` | warn | rows with NaN `unit` or `time` were dropped before computing structural facts | +| `duplicate_unit_time_rows` | warn | panel contains more than one row per (unit, time) | +| `min_cohort_size_below_10` | warn | smallest cohort has fewer than 10 units | +| `only_one_cohort` | info | all treated units adopt simultaneously | +| `short_pre_panel` | warn | `min_pre_periods < 3` | +| `short_post_panel` | info | `min_post_periods < 3` | +| `no_never_treated` | info | every unit is eventually treated | +| `has_always_treated_units` | info | some units are treated in every observed period | +| `all_units_treated_simultaneously` | info | single cohort and no never-treated group | +| `panel_highly_unbalanced` | warn | `observation_coverage < 0.70` | +| `only_two_periods` | info | `n_periods == 2` | +| `outcome_looks_binary_but_dtype_float` | info | outcome takes {0, 1} values but is stored as float | + + +## §3. Estimator-support matrix + +Rows are estimator classes exported from `diff_diff`. Columns are design +features derivable from `PanelProfile`. Cells: `✓` supported; `✗` not +supported / out of scope; `warn` supported but with documented caveats; +`partial` supported subject to restrictions discussed in §4. + +| Estimator | binary absorbing | staggered | continuous | triple-diff | never-treated required | covariate adjustment | few-treated (synthetic) | heterogeneous adoption | clustered SE | +|---|---|---|---|---|---|---|---|---|---| +| `DifferenceInDifferences` | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | +| `MultiPeriodDiD` | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | +| `TwoWayFixedEffects` | ✓ | warn | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | +| `CallawaySantAnna` | ✓ | ✓ | ✗ | ✗ | partial | ✓ | ✗ | ✗ | ✓ | +| `SunAbraham` | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ | ✓ | +| `ChaisemartinDHaultfoeuille` | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | +| `ImputationDiD` | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | +| `TwoStageDiD` | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | +| `StackedDiD` | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | +| `WooldridgeDiD` (ETWFE) | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ | +| `EfficientDiD` | ✓ | ✓ | ✗ | ✗ | partial | ✓ | ✗ | ✗ | ✓ | +| `SyntheticDiD` | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✓ | ✗ | partial | +| `TROP` | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | partial | +| `TripleDifference` | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ | +| `StaggeredTripleDifference` | ✓ | ✓ | ✗ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ | +| `ContinuousDiD` | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✗ | ✓ | +| `HeterogeneousAdoptionDiD` | ✗ | partial | partial | ✗ | ✗ | ✗ | ✗ | ✓ | warn | + +**Footnotes.** +- `TwoWayFixedEffects` + staggered: fits but mixes positive and negative + cohort-weights that violate the ATT interpretation; consult + `BaconDecomposition` to quantify. Prefer any staggered-robust + estimator (CS, SA, dCDH, Imputation, TwoStage, ETWFE) for a staggered + design. +- `CallawaySantAnna` + never-treated: the "never-treated" control group + is one option; "not-yet-treated" is the other. Pick via the + `control_group` argument. If `has_never_treated == False`, use + `control_group="not_yet_treated"`. +- `EfficientDiD` + never-treated: both `assumption="PT-All"` and + `assumption="PT-Post"` require actual never-treated units - PT-Post + is the weaker parallel-trends assumption but still uses never-treated + as the comparison group (REGISTRY.md `EfficientDiD` "Parallel Trends + -- two variants"). 
To admit an all-eventually-treated panel, pass + `control_group="last_cohort"` to reclassify the latest treatment + cohort as a pseudo-never-treated control and trim post-treatment + periods at/after its adoption. The `EfficientDiD.hausman_pretest` + classmethod picks between `PT-All` and `PT-Post` on panels that do + have never-treated units. +- `SyntheticDiD` + staggered: not supported. `fit()` raises + `ValueError` on within-unit treatment variation; SDiD requires block + treatment (all treated units adopt at the same time). For staggered + designs use a cohort-level fit loop externally or pick a + staggered-robust estimator above. +- `TROP` staggered support: treatment is an absorbing-state indicator, + so staggered adoption is handled via the D matrix. TROP `fit()` has + no covariate surface; its local method uses every unit untreated at + period `t` as the donor pool (not a never-treated-only set). +- `HeterogeneousAdoptionDiD` covariate adjustment: identification with + covariates (paper Appendix B.1, Equation 19) is deferred to future + work; `fit(covariates=...)` is not yet implemented. +- `HeterogeneousAdoptionDiD` clustered SE: `cluster=` is honored on the + mass-point / CR1 path; on the continuous nonparametric paths the + kwarg emits a `UserWarning` and is ignored (Phase 2a scope). Use + `bias_corrected_local_linear` directly for cluster-robust inference + on the nonparametric path. +- `HeterogeneousAdoptionDiD` continuous: supports partial-adoption + intensity as a continuous first-stage variable; not a pure + dose-response estimator - use `ContinuousDiD` for that. +- `HeterogeneousAdoptionDiD` staggered support is `partial`, not + general. Paper Appendix B.2 restricts staggered use to the + **last treatment cohort plus never-treated units**. With + `aggregate="event_study"` and a `first_treat_col` kwarg, + `fit()` auto-filters to `F_last = max(cohorts)` and emits a + `UserWarning` naming kept/dropped counts; earlier-cohort units + are dropped. Without `first_treat_col`, a multi-cohort panel + raises `ValueError`. For full staggered support that retains + every cohort, use `ChaisemartinDHaultfoeuille` instead. + +**Balanced-panel eligibility.** The following estimators require +exactly one observation per `(unit, time)` cell with every unit +observed in every period: `ContinuousDiD`, `EfficientDiD`, +`SyntheticDiD`, `HeterogeneousAdoptionDiD`, +`StaggeredTripleDifference`. Gate these on BOTH +`PanelProfile.is_balanced == True` AND the absence of the +`duplicate_unit_time_rows` alert (`is_balanced` is computed from the +unique-key support and stays `True` when duplicates exist; the +alert is the separate signal for duplicates). Treat both +conditions as hard gates: `EfficientDiD` and +`HeterogeneousAdoptionDiD` raise `ValueError` at `fit()` on +duplicate cells, and `ContinuousDiD`'s precompute path resolves +duplicates with last-row-wins (silent overwrite that can change +the estimand). If either condition fails, pre-process with +`diff_diff.prep.balance_panel()` and a +`drop_duplicates([unit, time])` pass, or pick a balance-tolerant +estimator from the remaining rows (CS/SA/dCDH/Imputation/TwoStage/ +Stacked/ETWFE all accept unbalanced input, with some caveats in +their own docs). + + +## §4. Estimator-choice reasoning by design feature + +Each subsection names a design feature and lists estimators applicable to +it with the most important trade-offs. Multiple paths are always +explicit; no subsection says "pick estimator X." 
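+As a concrete reading of the §3 balanced-panel eligibility paragraph, the
+sketch below gates on both facts it names. The helper name is illustrative
+and is not part of the library:
+
+```python
+from diff_diff import PanelProfile
+
+
+def balanced_panel_gate(profile: PanelProfile) -> bool:
+    """True only when both §3 conditions hold for one-row-per-cell estimators.
+
+    `is_balanced` alone is not enough: it is computed on the unique
+    (unit, time) support and stays True when duplicate rows exist, so the
+    `duplicate_unit_time_rows` alert is checked separately.
+    """
+    alert_codes = {alert.code for alert in profile.alerts}
+    return profile.is_balanced and "duplicate_unit_time_rows" not in alert_codes
+```
+
+If the gate fails, §3's remedy applies: pre-process with
+`diff_diff.prep.balance_panel()` plus a `drop_duplicates([unit, time])` pass,
+or stay on the balance-tolerant rows of the matrix.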
+ +### §4.1 Classic 2×2 DiD (binary absorbing, two periods, no staggering) + +When `treatment_type == "binary_absorbing"`, `n_periods == 2`, and +`is_staggered == False`, the classic Card-and-Krueger 2×2 design applies. +Most estimators in the library produce the same point estimate in this +case; the choice between them is mostly about output shape: + +- `DifferenceInDifferences` for a minimal results object. +- `TwoWayFixedEffects` if you want the equivalent two-way-FE regression + output (coefficient table, VCV, etc.). Identical to DiD in the 2×2 + case. +- `TripleDifference` if a second comparison dimension is available + (DDD) - see §4.6. + +### §4.2 Multi-period single-cohort (event-study without staggering) + +When `is_staggered == False` and `n_periods > 2`, event-study dynamics +can be estimated but cohort-mixing bias is moot: + +- `MultiPeriodDiD` - per-period effect, standard event-study plot. +- `TwoWayFixedEffects` with event-time dummies - similar output, no + forbidden comparisons because there is only one cohort. + +### §4.3 Staggered adoption (multi-cohort binary absorbing) + +When `is_staggered == True`, classic TWFE mixes positive- and +negative-weighted cohort comparisons (Goodman-Bacon 2021, +de Chaisemartin & d'Haultfoeuille 2020). Use one of the staggered-robust +estimators: + +- `CallawaySantAnna` - group-time ATTs aggregated to ES / overall / cohort + dimensions. Flexible control-group choice (never-treated vs. + not-yet-treated). Covariate adjustment via doubly-robust (DR), IPW, + or regression-adjustment (RA). +- `SunAbraham` - interaction-weighted estimator; closely tied to + two-way-FE output, computationally cheap, produces event-time + coefficients. Requires a never-treated cohort (`fit` raises a + `ValueError` when none exists). +- `ChaisemartinDHaultfoeuille` - DID_M / DID_l estimators robust to + non-absorbing / reversible treatment (see §4.5). Interference / + between-unit spillovers are not supported natively - SUTVA is + assumed like every other DiD estimator in the suite. +- `ImputationDiD` (Borusyak, Jaravel, Spiess) - imputation-based, + efficient under homoskedasticity, produces an imputation-based + residual at the observation level. +- `TwoStageDiD` (Gardner) - two-stage residualize-then-regress. +- `StackedDiD` - stacked event-study regressions, one subpanel per + cohort. Conservative interpretation. +- `WooldridgeDiD` (ETWFE) - extended-TWFE with cohort-by-time-by- + covariates interactions; heterogeneous covariate-by-cohort effects. +- `EfficientDiD` (Chen, Sant'Anna, Xie 2025) - asymptotically efficient + under either `PT-All` or `PT-Post`; use `EfficientDiD.hausman_pretest` + to pick. Requires a balanced panel (`PanelProfile.is_balanced == + True`); `fit()` raises `ValueError` on unbalanced input. + +Diagnostic: `bacon_decompose(df, ...)` shows the weight allocation of a +TWFE fit to 2×2 comparison types. Forbidden-comparison weight > 10% is a +strong signal that the TWFE estimate is biased. + +### §4.4 No never-treated group + +When `has_never_treated == False`: + +- `SyntheticDiD` requires a never-treated donor pool - not applicable. +- `TROP` does not require a strict never-treated partition: its donor + pool is every unit untreated at the current period `t` (via the + absorbing D matrix). When every unit is eventually treated TROP can + still fit, with the donor pool shrinking over time - check the + pre-treatment coverage of the factor-model fit in the results + diagnostics. 
+- `EfficientDiD` requires never-treated comparisons under both + `assumption="PT-All"` and `assumption="PT-Post"`. To admit an + all-treated panel, pass `control_group="last_cohort"` to use the + latest treatment cohort as a pseudo-never-treated control + (post-treatment periods at/after that cohort's adoption are + trimmed). Distinct from CallawaySantAnna's `not_yet_treated` + option. +- `ContinuousDiD` requires zero-dose control units (`P(D=0) > 0`). + Remark 3.1 of the paper (lowest-dose-as-control) is not yet + implemented; `fit()` raises `ValueError` when no `D=0` units exist. +- `CallawaySantAnna` - use `control_group="not_yet_treated"` to use + not-yet-treated units as the control pool. +- `ChaisemartinDHaultfoeuille` - constructs switchers vs. non-switchers + directly; no never-treated requirement. +- TWFE / `MultiPeriodDiD` / `ImputationDiD` / `TwoStageDiD` / + `StackedDiD` / `WooldridgeDiD` - use the last-treated or untreated- + until-late units as implicit controls; estimators do not error, but + consider whether the implicit control structure is what you want. + +### §4.5 Non-absorbing binary treatment (treatment switches back to 0) + +When `treatment_type == "binary_non_absorbing"`: + +- `ChaisemartinDHaultfoeuille` is the only estimator in the library + that treats this natively. Switcher / non-switcher comparisons are + its primitive object. +- Other estimators assume absorbing treatment and will produce + estimates whose interpretation is unclear. Do not use them without + a well-argued reason. + +### §4.6 Triple-difference design (DDD) + +When a second cross-cutting comparison axis exists (e.g., policy hits +some states and some demographic subgroups within states): + +- `TripleDifference` - classic two-period DDD. +- `StaggeredTripleDifference` - staggered DDD, robust to cohort-mixing. + +Triple-difference is not automatically detected by `profile_panel`; +it requires the caller to identify the third comparison axis. If a +`group` covariate in the panel drives differential exposure, DDD is +worth considering. + +### §4.7 Continuous / dose-response treatment + +When `treatment_type == "continuous"`: + +- `ContinuousDiD` (Callaway, Goodman-Bacon, Sant'Anna 2024) - + continuous / dose-response treatment. **Three eligibility + prerequisites**: (a) zero-dose control units must exist + (`P(D=0) > 0`) because Remark 3.1 (lowest-dose-as-control) is not + yet implemented, (b) dose must be time-invariant per unit (rule out + panels where `PanelProfile.treatment_varies_within_unit == True`), + and (c) the panel must be balanced (`PanelProfile.is_balanced == + True`). `fit()` raises `ValueError` in any of the three cases. Note that + staggered adoption IS supported natively (adoption timing is + expressed via the `first_treat` column, not via within-unit dose + variation). The estimator exposes several dose-indexed targets that + require different assumptions: `ATT(d|d)` (effect of dose `d` on + units that received `d`) and `ATT^{loc}` (binarized overall ATT) + are identified under Parallel Trends; `ATT(d)` (full dose-response + curve), `ACRT(d)` (marginal effect, i.e. the average causal + response), and `ACRT^{glob}` require the stronger Strong Parallel + Trends assumption. The BR headline scalar is the overall ATT; ACR + and dose-response tables are available in the result object. + Supports B-spline basis construction. +- `HeterogeneousAdoptionDiD` - partial-adoption intensity, with a + scalar first-stage adoption summary. Useful when adoption is + graded rather than binary. 
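+The three `ContinuousDiD` prerequisites above map directly onto
+`PanelProfile` fields. A descriptive sketch (the helper is illustrative,
+not a library function, and it reports facts rather than picking an
+estimator):
+
+```python
+from diff_diff import PanelProfile
+
+
+def continuous_did_prerequisites(profile: PanelProfile) -> dict:
+    """Report the §4.7 eligibility facts for a continuous-treatment panel."""
+    alert_codes = {alert.code for alert in profile.alerts}
+    return {
+        # (a) zero-dose control units exist (P(D=0) > 0); for continuous
+        #     treatment, `has_never_treated` flags exactly this.
+        "zero_dose_controls": profile.has_never_treated,
+        # (b) dose is time-invariant within each unit.
+        "time_invariant_dose": not profile.treatment_varies_within_unit,
+        # (c) balanced panel, with the §3 duplicate-row caveat (duplicates
+        #     are otherwise resolved last-row-wins, silently).
+        "balanced_no_duplicates": (
+            profile.is_balanced and "duplicate_unit_time_rows" not in alert_codes
+        ),
+    }
+```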
+ +### §4.8 Few treated units (one or a handful) + +When few treated units exist (not a separate `PanelProfile` field yet, +but derivable from `cohort_sizes` + `has_never_treated`): + +- `SyntheticDiD` - synthetic-control-meets-DiD. Requires never-treated + donors and sufficient pre-treatment periods (Arkhangelsky et al. 2021). + Block treatment only: all treated units must adopt at the same time. + Requires a balanced panel (`PanelProfile.is_balanced == True`); + `fit()` raises `ValueError` and points at `balance_panel()`. +- `TROP` - factor-model-based generalized synthetic control. Uses every + unit untreated at period `t` as the donor pool (via the absorbing-state + D matrix); supports staggered adoption and more complex factor + structures. No covariate-adjustment surface on `fit()`. + +Classical DiD estimators will still produce estimates, but inference is +unreliable with very small treated groups; cluster-robust SE relies on +the number of clusters, not the number of treated units. Bootstrap +methods in the library are preferred. + +### §4.9 Heterogeneous adoption intensity + +When adoption varies in strength across units (partial-adoption settings, +intensity of exposure differs): + +- `HeterogeneousAdoptionDiD` - requires a balanced panel + (`PanelProfile.is_balanced == True`; `fit()` raises `ValueError` + when any unit is missing a period). Targets a Weighted Average Slope (WAS) + on single-period Heterogeneous Adoption Designs where no genuinely + untreated group exists (paper Equation 2 / Theorem 1). The + `target_parameter` attribute on the results object is literally + `"WAS"` for Design 1' and `"WAS_d_lower"` for Design 1 with lower-dose + comparison under Assumption 6. `fit(aggregate="overall")` (Phase 2a) + returns a single scalar WAS; `fit(aggregate="event_study")` (Phase + 2b) returns per-event-time WAS estimates. `did_had_pretest_workflow()` + runs the paper's three-step TWFE-suitability battery: (1) QUG null + via `qug_test`, (2) Assumption 7 pre-trends via `stute_test` / + `stute_joint_pretest` (event-study path only; the two-period overall + path flags this step as deferred), and (3) linearity of + `E[ΔY | D_2]` via `stute_test` / `yatchew_hr_test`. Assumption 3 + (uniform continuity / no extensive-margin jump) is not testable; the + pre-test battery does not and cannot validate it. Not ATT-shaped; do + not relabel the headline as ATT in report text. + + **Staggered-timing scope is last-cohort-only (Appendix B.2).** + HAD's staggered support is the `partial` cell in §3: on a + multi-cohort panel passed to `aggregate="event_study"`, `fit()` + auto-filters to the last treatment cohort (`F_last = + max(cohorts)`) plus never-treated units and emits a + `UserWarning` naming kept/dropped counts; earlier treated + cohorts are dropped. The `first_treat_col` kwarg is + **required** for the auto-filter to activate; without it a + multi-cohort panel raises `ValueError` pointing the caller at + `ChaisemartinDHaultfoeuille` for full staggered support. The + resulting estimand is a **last-cohort-only WAS**, not a + multi-cohort average — report it as such. + +### §4.10 Repeated cross-sections (no panel structure) + +`profile_panel` assumes long-format panel data. When the same units are +not observed across time (true repeated cross-sections), only the +estimators whose documented contract explicitly admits RCS are +applicable. 
Do not route RCS data to any other estimator in the suite - +most of them are panel-only by construction and will either raise at +fit time or estimate under a misspecified identifying assumption. + +Explicit RCS support in this library: + +- `CallawaySantAnna(panel=False)` - repeated-cross-section mode per + REGISTRY.md §CallawaySantAnna; use this variant on RCS data. +- `TripleDifference` - DDD cross-sectional use cases are documented + in `docs/choosing_estimator.rst`; the two-period DDD estimator does + not require within-unit tracking when the third comparison axis + carries the identification. The staggered DDD variant is panel-only + and listed separately below. + +Explicitly rejected for RCS (panel-only): + +- `EfficientDiD` - REGISTRY notes "does not handle ... repeated + cross-sections." +- `HeterogeneousAdoptionDiD` - panel-only (requires a balanced panel + with per-unit adoption timing). +- `SyntheticDiD` - requires balanced panel with per-unit donor matching. +- `ContinuousDiD` - requires balanced panel with per-unit constant + dose. +- `StaggeredTripleDifference` - panel-only; `fit()` has no + `panel=False` mode and rejects duplicate / unbalanced + `(unit, time)` structure. For cross-sectional DDD data use + `TripleDifference` instead. + +Treat other estimators in this guide as panel-only unless their own +docs explicitly say otherwise. When routing, also: + +- Cluster SE on the unit proxy (state, region) rather than the + individual cross-section respondent. +- Confirm the treatment assignment is at the cluster level, not at + the individual-respondent level, before interpreting the estimate + as a group-time ATT. + + +## §5. Post-fit validation utilities + +After any `fit()`, the Baker et al. (2025) 8-step workflow recommends a +diagnostic sequence. The library exposes utilities covering each step. +Consult `get_llm_guide("practitioner")` for the workflow-prose form; this +section is the API-reference index. + +### Parallel-trends and pre-trends + +- `check_parallel_trends(df, ...)` - exported from `diff_diff`. + Regression-based visual-plus-numeric test on pre-treatment periods. + Returns a structured result with p-value and per-period coefficients. +- `check_parallel_trends_robust(df, ...)` - Roth (2022) power-adjusted + version; adds a "believable-magnitude" check against a power curve. +- `equivalence_test_trends(df, ...)` - Bilinski-Hatfield-style + equivalence test (alternative framing of the PT test). +- `compute_pretrends_power(results, ...)` - standalone power analysis + for the PT test; takes a fitted `MultiPeriodDiDResults` (or + compatible event-study results object), not raw DataFrame. Useful + when `min_pre_periods` is small. + +### Sensitivity / robustness + +- `compute_honest_did(results, ...)` - Rambachan-Roth (2023) honest DiD. + Quantifies the sensitivity of ATT to parallel-trends violations. + Outputs sensitivity bounds under smoothness restrictions. +- `compute_pretrends_power(results, ...)` - complementary tool for + power-aware pre-trends interpretation (same fitted-results-first + signature as above). + +### Placebo tests + +- `run_placebo_test(df, ...)` - generic placebo runner. +- `run_all_placebo_tests(df, ...)` - batch runner over predefined + placebos. +- `placebo_timing_test(df, ...)` - false placebo-treatment time. +- `placebo_group_test(df, ...)` - placebo treatment-group assignment. +- `permutation_test(df, ...)` - Fisher-style exact permutation. +- `leave_one_out_test(df, ...)` - refit dropping one unit at a time. 
+ +### Estimator-native diagnostics + +Some estimators expose diagnostics as methods on the result object: + +- `SyntheticDiDResults.in_time_placebo()` - placebo treatment applied + in a pre-treatment period. +- `SyntheticDiDResults.sensitivity_to_zeta_omega()` - regularization- + hyperparameter sensitivity. +- `SyntheticDiDResults.get_weight_concentration()` - donor-weight + concentration summary. +- `CallawaySantAnna.diagnose_propensity(df, ...)` - propensity-score + overlap check when using DR / IPW controls. +- `EfficientDiD.hausman_pretest(df, ...)` - chooses between `PT-All` and + `PT-Post` for `EfficientDiD`. +- `did_had_pretest_workflow(df, ...)` - bundled QUG / Stute / Yatchew- + Härdle pre-test battery for `HeterogeneousAdoptionDiD`. + +### Decomposition and weight auditing + +- `bacon_decompose(df, ...)` - Goodman-Bacon (2021) TWFE weight + decomposition. Returns a `BaconDecompositionResults` with the weight + on forbidden (later-vs-earlier) comparisons. Run before interpreting + any TWFE staggered fit. + +### Event-study plotting + +- `plot_event_study(results, ...)` +- `plot_group_effects(results, ...)` +- `plot_group_time_heatmap(results, ...)` +- `plot_staircase(results, ...)` +- `plot_honest_event_study(honest_results, ...)` - takes a + `HonestDiDResults` returned by `compute_honest_did`, not a fit + result directly. +- `plot_sensitivity(sensitivity_results, ...)` - takes a + `SensitivityResults` object (the result of honest-DiD sensitivity + analysis), not a fit result directly. +- `plot_synth_weights(results, ...)` +- `plot_dose_response(results, ...)` +- `plot_power_curve(...)` + +Event-study plots are also a diagnostic - pre-treatment coefficients +close to zero support parallel trends. + + +## §6. How to read BusinessReport / DiagnosticReport output + +`BusinessReport(results)` and `DiagnosticReport(results)` are experimental +in the 3.2 line. Their schema is versioned (`BUSINESS_REPORT_SCHEMA_VERSION` +and `DIAGNOSTIC_REPORT_SCHEMA_VERSION`, both `"2.0"` at time of writing) +and expected to evolve. Treat `.to_dict()` output as the agent-legible +contract; the prose renderers (`summary()`, `full_report()`) are derived +from it. + +### BusinessReport `to_dict()` schema (v2.0) + +Top-level keys emitted by `BusinessReport.to_dict()` +(source: `diff_diff/business_report.py`): + +- `schema_version: str` - `BUSINESS_REPORT_SCHEMA_VERSION`, e.g. `"2.0"`. +- `estimator: dict` - `class_name` (the fitted result class) and a + human-friendly `display_name`. +- `context: dict` - the `BusinessContext` bundle: `outcome_label`, + `outcome_unit`, `outcome_direction`, `business_question`, + `treatment_label`, `alpha`. +- `headline: dict` - the main point estimate plus framing fields. +- `target_parameter: dict` - what the headline scalar represents. + Fields: `name` (e.g. `"ATT"`, `"DID_M"`, `"dose-response"`, + `"WAS"`), `definition` (plain-English description), `aggregation` + (machine tag), `headline_attribute` (raw result attribute), and + `reference` (REGISTRY.md citation string). +- `assumption: dict` - named assumptions relied on (parallel trends, + no anticipation, SUTVA, ...). Note: singular `"assumption"`, not + `"assumptions"`. +- `pre_trends: dict` - pre-trends test result with verdict string + (e.g. `"clean"`, `"inconclusive"`, `"violated"`), p-value, and + power assessment if available. Note: underscore-split + `"pre_trends"`. +- `sensitivity: dict` - HonestDiD sensitivity summary when available. +- `sample: dict` - sample size and coverage details. 
Note: bare + `"sample"`, not `"sample_summary"`. +- `heterogeneity: dict` - heterogeneity summary if applicable. +- `robustness: dict` - placebo / robustness summaries if available. +- `diagnostics: dict` - a wrapper around the auto-constructed + `DiagnosticReport`. Always has a `status` field: `"skipped"` with a + `reason` when `auto_diagnostics=False`, otherwise `"ran"` with the + full DR `to_dict()` payload under `diagnostics["schema"]` and a + mirrored `overall_interpretation` string. Parse `schema` (not + `diagnostics` directly) to access the DR sections documented below. +- `next_steps: list[dict]` - Baker et al. next-step guidance from + `practitioner_next_steps`. +- `caveats: list[str]` - free-text caveats generated from failed + checks. +- `references: list[dict]` - citations relevant to the estimator. + +### DiagnosticReport `to_dict()` schema (v2.0) + +Top-level keys (source: `diff_diff/diagnostic_report.py`): + +- `schema_version: str` - `DIAGNOSTIC_REPORT_SCHEMA_VERSION`. +- `estimator: str` - the fitted result class name. +- `headline_metric: dict` - the main scalar the report headlines. +- `target_parameter: dict` - same shape as the BR field above. +- `parallel_trends: dict` - PT test result. +- `pretrends_power: dict` - power-aware pre-trends assessment when + applicable. +- `sensitivity: dict` - HonestDiD sensitivity summary. +- `placebo: dict` - placebo-test results. +- `bacon: dict` - Goodman-Bacon decomposition when applicable. +- `design_effect: dict` - survey / clustering design-effect summary. +- `heterogeneity: dict` - group-time heterogeneity summary. +- `epv: dict` - events-per-variable / sample-adequacy. +- `estimator_native_diagnostics: dict` - estimator-specific + diagnostics (e.g. SDiD weight concentration, TROP factor-model + fit). +- `skipped: dict` - checks skipped on this estimator type, with the + reason. +- `warnings: list[str]` - top-level aggregated warnings. +- `overall_interpretation: str` - rendered prose summary of the + sections. +- `next_steps: list[dict]` - same shape as the BR field. + +Each section value is a dict. Parse it in two layers: + +1. `status: str` — execution state, not qualitative interpretation. + The values actually emitted by `DiagnosticReport.to_dict()` are: + `"ran"` (section executed), `"not_applicable"` (check does not + apply to this estimator or design), `"not_run"` (implementation + pending), `"no_scalar_by_design"` (for estimators that return a + table instead of a scalar headline, e.g. dCDH with + `trends_linear=True, L_max>=2`), and `"skipped"` (auto-diagnostics + disabled or the section was short-circuited at top level). +2. `verdict: str` (only present when `status == "ran"`) — qualitative + interpretation of the executed check. Candidate values include + `"clean"`, `"inconclusive"`, `"violated"`, and section-specific + labels. + +`reason: str` is an optional free-text explanation that usually +accompanies non-`"ran"` statuses; it may also appear on `"ran"` +sections as supplementary context. The rest of each section dict is +section-specific payload (e.g. p-values, coefficients, cohort tables). + +Forthcoming schema additions (not yet shipped): a top-level +`sanity_checks` block (machine-legible pass/warn/fail summary) and a +`mismatch_warnings` list (post-hoc estimator-mismatch detection) are +queued for a later wave. Treat their current absence as expected. + + +## §7. Glossary + citations + +**ATT**: Average Treatment Effect on the Treated. The target parameter +of most DiD estimators. 
+ +**Parallel trends**: counterfactual trends in treated and control +outcomes would have moved together absent treatment. Untestable directly; +pre-treatment dynamics are a necessary (not sufficient) indicator. + +**No anticipation**: units do not respond to treatment before it occurs. +If plausible, test via pre-treatment event-study coefficients. + +**SUTVA**: Stable Unit Treatment Value Assumption. Rules out spillovers +and interference between units. + +**Forbidden comparison**: in TWFE, a comparison where already-treated +units serve as controls for later-treated units. Weights are negative +and the resulting estimate can flip sign vs. the true ATT. + +**Cohort / treatment timing**: first-treatment period for an +absorbing-treatment unit. Units sharing a cohort share an adoption date. + +**Staggered adoption**: two or more cohorts present in the panel. + +**Doubly-robust (DR) / IPW / RA**: three covariate-adjustment strategies +in `CallawaySantAnna`. DR is consistent if either the propensity model +or the outcome model is correctly specified. + +### Primary references + +- **Baker, Andrew, Brantly Callaway, Scott Cunningham, Andrew + Goodman-Bacon, and Pedro H. C. Sant'Anna (2025).** "Difference-in- + Differences Designs: A Practitioner's Guide." arXiv:2503.13323. + The 8-step workflow and best-practice framing. Ships as + `get_llm_guide("practitioner")`. +- **Roth, Jonathan, Pedro H. C. Sant'Anna, Alyssa Bilinski, and John + Poe (2023).** "What's Trending in Difference-in-Differences? A + Synthesis of the Recent Econometrics Literature." Journal of + Econometrics 235(2): 2218-2244. Canonical-assumption framing; + classification of estimator relaxations. +- **Goodman-Bacon, Andrew (2021).** "Difference-in-Differences with + Variation in Treatment Timing." Journal of Econometrics + 225(2): 254-277. TWFE weight decomposition; + `bacon_decompose` implements this. +- **Callaway, Brantly, and Pedro H. C. Sant'Anna (2021).** + "Difference-in-Differences with Multiple Time Periods." Journal of + Econometrics 225(2): 200-230. Group-time ATT. +- **Sun, Liyang, and Sarah Abraham (2021).** "Estimating Dynamic + Treatment Effects in Event Studies with Heterogeneous Treatment + Effects." Journal of Econometrics 225(2): 175-199. IW estimator. +- **de Chaisemartin, Clément, and Xavier d'Haultfoeuille (2020).** + "Two-Way Fixed Effects Estimators with Heterogeneous Treatment + Effects." American Economic Review 110(9): 2964-2996. DID_M + estimator. +- **Borusyak, Kirill, Xavier Jaravel, and Jann Spiess (2024).** + "Revisiting Event-Study Designs: Robust and Efficient Estimation." + Review of Economic Studies 91(6): 3253-3285. Imputation estimator. +- **Gardner, John (2022).** "Two-Stage Differences in Differences." + arXiv:2207.05943. Two-stage estimator. +- **Wooldridge, Jeffrey M. (2021).** "Two-Way Fixed Effects, the Two- + Way Mundlak Regression, and Difference-in-Differences Estimators." + ETWFE formulation. +- **Arkhangelsky, Dmitry, Susan Athey, David Hirshberg, Guido Imbens, + and Stefan Wager (2021).** "Synthetic Difference-in-Differences." + American Economic Review 111(12): 4088-4118. SDiD estimator. +- **Rambachan, Ashesh, and Jonathan Roth (2023).** "A More Credible + Approach to Parallel Trends." Review of Economic Studies + 90(5): 2555-2591. HonestDiD sensitivity. +- **Bilinski, Alyssa, and Laura A. Hatfield (2019).** "Nothing to See + Here? Non-Inferiority Approaches to Parallel Trends and Other + Model Assumptions." arXiv:1805.03273. Equivalence test. +- **Sant'Anna, Pedro H. 
C., and Jun Zhao (2020).** "Doubly Robust + Difference-in-Differences Estimators." Journal of Econometrics + 219(1): 101-122. DR adjustment. +- **Chen, Xiaohong, Pedro H. C. Sant'Anna, and Haitian Xie (2025).** + "Efficient Difference-in-Differences and Event Study Estimators." + Primary source for the `EfficientDiD` estimator (PT-All / PT-Post + framing and efficient combination weights). +- **Callaway, Brantly, Andrew Goodman-Bacon, and Pedro H. C. + Sant'Anna (2024).** "Difference-in-Differences with a Continuous + Treatment." Primary source for `ContinuousDiD`; introduces the + Parallel Trends vs Strong Parallel Trends distinction underlying + `ATT(d|d)`, `ATT(d)`, `ACRT(d)`, and `ACRT^{glob}`. + +### Online resources + +- **psantanna.com/did-resources** - practitioner checklist + reading + list maintained by Pedro Sant'Anna. +- **bcallaway11.github.io/did** - `did` R package tutorials + (Callaway-Sant'Anna). + + +## §8. Intentional omissions + +This guide does **not**: + +- Recommend a specific estimator for a specific dataset. When multiple + estimators fit, §4 lists them and names the trade-offs; the choice is + the agent's. +- Enumerate every possible design edge case. The literature cited in §7 + covers them; this guide is a navigation aid, not a substitute. +- Promise forward-compatibility of the BR / DR schema or the alert + catalogue. Treat these as experimental until the 12-item foundation- + gap list closes. +- Replace `bacon_decompose()`, `compute_honest_did()`, or any of the + estimator-native diagnostics. Post-fit validation is mandatory, not + optional, and belongs in the final write-up. +- Cover methods outside diff-diff's estimator suite (e.g., instrumental + variables, regression discontinuity, synthetic control for a single + treated unit). When those apply, point the user at dedicated + libraries. + +**If in doubt, consult the primary references in §7 and use +`get_llm_guide("practitioner")` for the Baker et al. workflow.** diff --git a/diff_diff/profile.py b/diff_diff/profile.py new file mode 100644 index 00000000..b343f9d4 --- /dev/null +++ b/diff_diff/profile.py @@ -0,0 +1,714 @@ +"""Descriptive panel-profiling utility for agent-facing use. + +``profile_panel()`` inspects a DiD panel and returns a :class:`PanelProfile` +dataclass of structural facts — panel balance, treatment-type classification, +outcome characteristics, and a list of factual :class:`Alert` observations. + +This module is descriptive, not opinionated. Alerts report what is (e.g. +"smallest cohort has 7 units"), never what to do about it. Estimator +selection is the caller's responsibility; consult +``diff_diff.get_llm_guide("autonomous")`` for the estimator-support matrix +and per-design-feature reasoning. +""" + +from __future__ import annotations + +from dataclasses import dataclass +from typing import Any, Dict, List, Mapping, Optional, Tuple, cast + +import numpy as np +import pandas as pd + +_OBSERVATION_COVERAGE_THRESHOLD = 0.70 +_MIN_COHORT_SIZE_THRESHOLD = 10 +_SHORT_PRE_PANEL_THRESHOLD = 3 +_SHORT_POST_PANEL_THRESHOLD = 3 + + +@dataclass(frozen=True) +class Alert: + """A factual observation about a panel. + + ``severity`` is ``"info"`` (descriptive) or ``"warn"`` (descriptive and + likely relevant to the caller's estimator choice). Alerts never + recommend a specific estimator. + """ + + code: str + severity: str + message: str + observed: Any + + +@dataclass(frozen=True) +class PanelProfile: + """Structural facts about a DiD panel. + + Returned by :func:`profile_panel`. 
Mirrors the ``BusinessContext`` + frozen-dataclass pattern. Consume ``.to_dict()`` for a JSON-serializable + representation and reason against the bundled + ``llms-autonomous.txt`` guide. + """ + + n_units: int + n_periods: int + n_obs: int + is_balanced: bool # every (unit, time) cell appears at least once + observation_coverage: float # unique (unit, time) keys / (n_units * n_periods) + + treatment_type: str + is_staggered: bool + n_cohorts: int + cohort_sizes: Mapping[Any, int] + has_never_treated: bool + has_always_treated: bool + treatment_varies_within_unit: bool + + first_treatment_period: Optional[Any] + last_treatment_period: Optional[Any] + min_pre_periods: Optional[int] + min_post_periods: Optional[int] + + outcome_dtype: str + outcome_is_binary: bool + outcome_has_zeros: bool + outcome_has_negatives: bool + outcome_missing_fraction: float + outcome_summary: Mapping[str, float] + + alerts: Tuple[Alert, ...] + + def to_dict(self) -> Dict[str, Any]: + """Return a JSON-serializable dict representation of the profile.""" + return { + "n_units": self.n_units, + "n_periods": self.n_periods, + "n_obs": self.n_obs, + "is_balanced": self.is_balanced, + "observation_coverage": self.observation_coverage, + "treatment_type": self.treatment_type, + "is_staggered": self.is_staggered, + "n_cohorts": self.n_cohorts, + "cohort_sizes": {_jsonable_key(k): int(v) for k, v in self.cohort_sizes.items()}, + "has_never_treated": self.has_never_treated, + "has_always_treated": self.has_always_treated, + "treatment_varies_within_unit": self.treatment_varies_within_unit, + "first_treatment_period": _jsonable(self.first_treatment_period), + "last_treatment_period": _jsonable(self.last_treatment_period), + "min_pre_periods": self.min_pre_periods, + "min_post_periods": self.min_post_periods, + "outcome_dtype": self.outcome_dtype, + "outcome_is_binary": self.outcome_is_binary, + "outcome_has_zeros": self.outcome_has_zeros, + "outcome_has_negatives": self.outcome_has_negatives, + "outcome_missing_fraction": self.outcome_missing_fraction, + "outcome_summary": {k: float(v) for k, v in self.outcome_summary.items()}, + "alerts": [ + { + "code": a.code, + "severity": a.severity, + "message": a.message, + "observed": _jsonable(a.observed), + } + for a in self.alerts + ], + } + + +def profile_panel( + df: pd.DataFrame, + *, + unit: str, + time: str, + treatment: str, + outcome: str, +) -> PanelProfile: + """Describe the structure of a DiD panel. + + Reports structural facts — balance, treatment-type classification, + outcome characteristics, factual alerts. Descriptive, not opinionated: + the profile says what is, never what to do about it. Estimator + selection is up to the caller. + + Parameters + ---------- + df : pandas.DataFrame + Long-format panel data containing the four named columns. + unit : str + Column identifying the cross-sectional unit. + time : str + Column identifying the time period. + treatment : str + Column holding the treatment indicator or dose. See Notes for the + classification rules. + outcome : str + Column holding the outcome variable. + + Returns + ------- + PanelProfile + Frozen dataclass. Call ``.to_dict()`` for a JSON-serializable view. + + Raises + ------ + ValueError + If any of the four column names is not present in ``df``. + + Examples + -------- + >>> import pandas as pd + >>> from diff_diff import profile_panel + >>> df = pd.DataFrame({ + ... "u": [1, 1, 2, 2], + ... "t": [0, 1, 0, 1], + ... "tr": [0, 0, 1, 1], + ... "y": [0.1, 0.2, 0.1, 0.9], + ... 
}) + >>> profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + >>> profile.is_balanced + True + >>> profile.treatment_type + 'binary_absorbing' + + Notes + ----- + Classification rules for ``treatment_type``: + + - ``"binary_absorbing"``: numeric treatment whose observed non-NaN + values are a subset of :math:`\\{0, 1\\}` (one or two distinct + values) AND each unit's treatment sequence (ordered by ``time``) + is weakly monotone non-decreasing. All-zero and all-one panels + are valid degenerate cases. + - ``"binary_non_absorbing"``: values a subset of :math:`\\{0, 1\\}` + with at least two distinct values observed, where at least one + unit switches from 1 back to 0. + - ``"continuous"``: numeric treatment with more than two distinct + values, or a 2-valued numeric whose values are not in + :math:`\\{0, 1\\}` (matches the ``ContinuousDiD`` convention). + - ``"categorical"``: non-numeric dtype (object / category) or a + column that is entirely NaN. + + Bool-dtype columns (``True`` / ``False``) are classified the same + way as numeric ``{0, 1}``: the library's binary estimators validate + on value support via :func:`diff_diff.utils.validate_binary`, so + ``True`` / ``False`` behave like ``1`` / ``0`` for absorbing / + non-absorbing classification. + + ``has_never_treated`` is computed across both binary and + continuous numeric treatment types: some unit has ``treatment == + 0`` in every observed non-NaN row. For binary this flags the + clean-control group; for continuous this flags zero-dose controls + (required by ``ContinuousDiD``). Always ``False`` for + ``"categorical"``. + + ``has_always_treated`` has binary-only semantics: some unit has + ``treatment == 1`` in every observed non-NaN row (no pre-treatment + information in the DiD sense). For ``"continuous"`` and + ``"categorical"`` treatment this field is always ``False`` + regardless of dose positivity — pre-treatment periods on + continuous DiD are determined by the separate ``first_treat`` + column passed to ``ContinuousDiD.fit``, not by whether the dose + is strictly positive. + + Rows with ``NaN`` in ``unit`` or ``time`` are dropped up front and + surfaced via the ``missing_id_rows_dropped`` alert; all subsequent + structural facts are computed on the non-missing subset, so + ``observation_coverage`` is always in ``[0, 1]``. Duplicate + ``(unit, time)`` rows are surfaced separately via the + ``duplicate_unit_time_rows`` alert. + + The profile does not recommend an estimator. Consult + ``diff_diff.get_llm_guide("autonomous")`` for the estimator-support + matrix and per-design-feature reasoning. + """ + _validate_columns(df, unit=unit, time=time, treatment=treatment, outcome=outcome) + + input_row_count = int(len(df)) + if input_row_count == 0: + raise ValueError("profile_panel: DataFrame is empty; at least one row is required.") + + missing_id_mask = cast(pd.Series, df[[unit, time]].isna().any(axis=1)) + n_rows_with_missing_id = int(missing_id_mask.sum()) + if n_rows_with_missing_id > 0: + df = df.loc[~missing_id_mask] + n_obs = int(len(df)) + if n_obs == 0: + raise ValueError( + f"profile_panel: no rows remain after dropping " + f"{n_rows_with_missing_id} row(s) with missing unit or time " + "identifier; at least one valid row is required." 
+ ) + + n_units = int(df[unit].nunique()) + n_periods = int(df[time].nunique()) + n_unique_keys = int(df[[unit, time]].drop_duplicates().shape[0]) + denom = n_units * n_periods + observation_coverage = float(n_unique_keys / denom) if denom > 0 else 0.0 + is_balanced = n_unique_keys == denom + n_duplicate_rows = n_obs - n_unique_keys + + ( + treatment_type, + is_staggered, + cohort_sizes, + has_never_treated, + has_always_treated, + first_tp, + last_tp, + ) = _classify_treatment(df, unit=unit, time=time, treatment=treatment) + + if pd.api.types.is_numeric_dtype(df[treatment]) or pd.api.types.is_bool_dtype(df[treatment]): + per_unit_distinct = df.groupby(unit)[treatment].nunique(dropna=True) + treatment_varies_within_unit = bool((per_unit_distinct > 1).any()) + else: + treatment_varies_within_unit = False + + min_pre, min_post = _compute_pre_post( + df, + unit=unit, + time=time, + treatment=treatment, + treatment_type=treatment_type, + ) + + outcome_col = cast(pd.Series, df[outcome]) + outcome_dtype = str(outcome_col.dtype) + valid = cast(pd.Series, outcome_col.dropna()) + outcome_missing_fraction = ( + float(1.0 - len(valid) / len(outcome_col)) if len(outcome_col) > 0 else 0.0 + ) + outcome_is_binary, outcome_has_zeros, outcome_has_negatives = _classify_outcome(valid) + outcome_summary = _summarize_outcome(valid) + + dtype_kind = getattr(outcome_col.dtype, "kind", "O") + alerts = _compute_alerts( + n_periods=n_periods, + observation_coverage=observation_coverage, + cohort_sizes=cohort_sizes, + has_never_treated=has_never_treated, + has_always_treated=has_always_treated, + min_pre_periods=min_pre, + min_post_periods=min_post, + outcome_is_binary=outcome_is_binary, + outcome_dtype_kind=dtype_kind, + n_duplicate_rows=n_duplicate_rows, + n_rows_with_missing_id=n_rows_with_missing_id, + ) + + return PanelProfile( + n_units=n_units, + n_periods=n_periods, + n_obs=n_obs, + is_balanced=is_balanced, + observation_coverage=observation_coverage, + treatment_type=treatment_type, + is_staggered=is_staggered, + n_cohorts=len(cohort_sizes), + cohort_sizes=cohort_sizes, + has_never_treated=has_never_treated, + has_always_treated=has_always_treated, + treatment_varies_within_unit=treatment_varies_within_unit, + first_treatment_period=first_tp, + last_treatment_period=last_tp, + min_pre_periods=min_pre, + min_post_periods=min_post, + outcome_dtype=outcome_dtype, + outcome_is_binary=outcome_is_binary, + outcome_has_zeros=outcome_has_zeros, + outcome_has_negatives=outcome_has_negatives, + outcome_missing_fraction=outcome_missing_fraction, + outcome_summary=outcome_summary, + alerts=tuple(alerts), + ) + + +def _validate_columns(df: pd.DataFrame, **cols: str) -> None: + missing = [(role, name) for role, name in cols.items() if name not in df.columns] + if missing: + pairs = ", ".join(f"{role}={name!r}" for role, name in missing) + raise ValueError( + f"profile_panel: column(s) not found in DataFrame: {pairs}. " + f"Provided columns: {list(df.columns)}" + ) + + +def _classify_treatment( + df: pd.DataFrame, + *, + unit: str, + time: str, + treatment: str, +) -> Tuple[ + str, + bool, + Dict[Any, int], + bool, + bool, + Optional[Any], + Optional[Any], +]: + """Return (type, is_staggered, cohort_sizes, has_never, has_always, first_tp, last_tp).""" + col = df[treatment] + is_numeric = pd.api.types.is_numeric_dtype(col) + is_bool = pd.api.types.is_bool_dtype(col) + + # Bool-dtype treatment columns are treated as binary 0/1 inputs. 
+ # The library's binary estimators validate value support via + # `validate_binary`, which accepts bool because True/False coerce + # to 1/0 numerically. Classifying bool columns as "categorical" + # here would route a valid binary design away from the supported + # estimator set. + if (not is_numeric) and (not is_bool): + return ("categorical", False, {}, False, False, None, None) + + distinct = col.dropna().unique() + n_distinct = len(distinct) + values_set = set(distinct.tolist()) + if n_distinct == 0: + return ("categorical", False, {}, False, False, None, None) + + # has_never_treated has a single well-defined meaning across binary + # and continuous numeric treatment: some unit has treatment == 0 in + # every observed non-NaN row. For binary this is the clean-control + # group; for continuous this is the zero-dose control required by + # ContinuousDiD (P(D=0) > 0). + unit_max = df.groupby(unit)[treatment].max().to_numpy() + unit_min = df.groupby(unit)[treatment].min().to_numpy() + has_never_treated = bool(np.any(unit_max == 0)) + + is_binary_valued = values_set <= {0, 1, 0.0, 1.0} + # has_always_treated has binary-only semantics: "unit is treated in + # every observed period" = unit_min == 1 on a binary panel (no + # pre-treatment information). For continuous panels, positive dose + # throughout does not mean "always treated in the DiD sense" + # (pre-treatment periods are determined by `first_treat`, not by + # whether the dose is positive), so this field is False for + # continuous / categorical types. + has_always_treated = is_binary_valued and bool(np.any(unit_min == 1)) + + if not is_binary_valued: + return ( + "continuous", + False, + {}, + has_never_treated, + has_always_treated, + None, + None, + ) + + sorted_df = df.sort_values([unit, time]) + + # Monotonicity check on the observed non-NaN subsequence per unit. + # A path like [0, 1, NaN, 0] must be detected as non-absorbing: the + # non-NaN subsequence [0, 1, 0] violates weak monotonicity. + is_absorbing = True + for _, group in sorted_df.groupby(unit, sort=False): + vals = group[treatment].to_numpy() + mask = ~pd.isna(vals) + # Cast to int so np.diff on a bool-dtype column performs + # arithmetic (1 - 0 = 1, 0 - 1 = -1) rather than XOR (which + # would mask a True -> False transition). + observed = vals[mask].astype(np.int64, copy=False) + if len(observed) >= 2 and bool(np.any(np.diff(observed) < 0)): + is_absorbing = False + break + + if not is_absorbing: + return ( + "binary_non_absorbing", + False, + {}, + has_never_treated, + has_always_treated, + None, + None, + ) + + first_treat = sorted_df[sorted_df[treatment] == 1].groupby(unit, sort=False)[time].min() + cohort_counts = first_treat.value_counts().sort_index() + cohort_sizes: Dict[Any, int] = {k: int(v) for k, v in cohort_counts.items()} + first_tp = min(cohort_sizes) if cohort_sizes else None + last_tp = max(cohort_sizes) if cohort_sizes else None + is_staggered = len(cohort_sizes) >= 2 + + return ( + "binary_absorbing", + is_staggered, + cohort_sizes, + has_never_treated, + has_always_treated, + first_tp, + last_tp, + ) + + +def _compute_pre_post( + df: pd.DataFrame, + *, + unit: str, + time: str, + treatment: str, + treatment_type: str, +) -> Tuple[Optional[int], Optional[int]]: + """Return (min_pre, min_post) across treated units using each unit's + observed (unit, time) support. 
On unbalanced panels this correctly + reflects the actual pre/post exposure of the least-supported treated + unit, rather than the global panel period set which could overstate + exposure and suppress short-panel alerts. + """ + if treatment_type != "binary_absorbing": + return None, None + + support = df[[unit, time]].drop_duplicates() + sorted_df = df.sort_values([unit, time]) + first_treat_per_unit = ( + sorted_df[sorted_df[treatment] == 1].groupby(unit, sort=False)[time].min() + ) + if first_treat_per_unit.empty: + return None, None + + pre_counts: List[int] = [] + post_counts: List[int] = [] + treated_units = first_treat_per_unit.index.tolist() + for u in treated_units: + c_u = first_treat_per_unit.loc[u] + unit_periods = support.loc[support[unit] == u, time] + pre_counts.append(int((unit_periods < c_u).sum())) + post_counts.append(int((unit_periods >= c_u).sum())) + + return int(min(pre_counts)), int(min(post_counts)) + + +def _classify_outcome(valid: pd.Series) -> Tuple[bool, bool, bool]: + n_distinct = valid.nunique(dropna=False) + if n_distinct == 0: + return False, False, False + + is_numeric = pd.api.types.is_numeric_dtype(valid) + if is_numeric: + distinct_set = set(valid.unique().tolist()) + is_binary = n_distinct == 2 and (distinct_set <= {0, 1} or distinct_set <= {0.0, 1.0}) + has_zeros = bool((valid == 0).any()) + has_negatives = bool((valid < 0).any()) + return is_binary, has_zeros, has_negatives + + return False, False, False + + +def _summarize_outcome(valid: pd.Series) -> Dict[str, float]: + if len(valid) == 0 or not pd.api.types.is_numeric_dtype(valid): + return {} + return { + "min": float(valid.min()), + "max": float(valid.max()), + "mean": float(valid.mean()), + "std": float(valid.std(ddof=1)) if len(valid) > 1 else 0.0, + } + + +def _compute_alerts( + *, + n_periods: int, + observation_coverage: float, + cohort_sizes: Mapping[Any, int], + has_never_treated: bool, + has_always_treated: bool, + min_pre_periods: Optional[int], + min_post_periods: Optional[int], + outcome_is_binary: bool, + outcome_dtype_kind: str, + n_duplicate_rows: int, + n_rows_with_missing_id: int, +) -> List[Alert]: + alerts: List[Alert] = [] + + if n_rows_with_missing_id > 0: + alerts.append( + Alert( + code="missing_id_rows_dropped", + severity="warn", + message=( + f"Dropped {n_rows_with_missing_id} row(s) with missing " + "unit or time identifier; structural facts are computed " + "from the non-missing subset." + ), + observed=int(n_rows_with_missing_id), + ) + ) + + if n_duplicate_rows > 0: + alerts.append( + Alert( + code="duplicate_unit_time_rows", + severity="warn", + message=( + f"Found {n_duplicate_rows} duplicate (unit, time) row(s); " + "balance and coverage are computed from the unique support." + ), + observed=int(n_duplicate_rows), + ) + ) + + if cohort_sizes: + smallest = min(cohort_sizes.values()) + if smallest < _MIN_COHORT_SIZE_THRESHOLD: + alerts.append( + Alert( + code="min_cohort_size_below_10", + severity="warn", + message=( + f"Smallest cohort has {smallest} units; " + "cohort-level inference will be noisy." 
+ ), + observed=int(smallest), + ) + ) + if len(cohort_sizes) == 1: + alerts.append( + Alert( + code="only_one_cohort", + severity="info", + message=("All treated units adopt at the same time " "(non-staggered design)."), + observed=1, + ) + ) + if not has_never_treated: + alerts.append( + Alert( + code="all_units_treated_simultaneously", + severity="info", + message=( + "Every unit is treated and every treated unit " + "adopts in the same period; no untreated " + "comparison group exists in the panel." + ), + observed=None, + ) + ) + + if min_pre_periods is not None and min_pre_periods < _SHORT_PRE_PANEL_THRESHOLD: + alerts.append( + Alert( + code="short_pre_panel", + severity="warn", + message=( + f"Minimum pre-treatment periods across treated units is " + f"{min_pre_periods}; parallel-trends and event-study " + "diagnostics have limited power." + ), + observed=int(min_pre_periods), + ) + ) + if min_post_periods is not None and min_post_periods < _SHORT_POST_PANEL_THRESHOLD: + alerts.append( + Alert( + code="short_post_panel", + severity="info", + message=( + f"Minimum post-treatment periods across treated units is " + f"{min_post_periods}; dynamic-effect estimation is " + "limited." + ), + observed=int(min_post_periods), + ) + ) + + if cohort_sizes and not has_never_treated: + alerts.append( + Alert( + code="no_never_treated", + severity="info", + message=( + "No never-treated comparison units; every unit in the " + "panel is eventually treated." + ), + observed=False, + ) + ) + + if has_always_treated: + alerts.append( + Alert( + code="has_always_treated_units", + severity="info", + message=( + "Some units are treated in every observed period; they " + "provide no pre-treatment information." + ), + observed=True, + ) + ) + + if observation_coverage < _OBSERVATION_COVERAGE_THRESHOLD: + alerts.append( + Alert( + code="panel_highly_unbalanced", + severity="warn", + message=( + f"Observation coverage is {observation_coverage:.1%}; " + "panel is highly unbalanced." 
+ ), + observed=float(observation_coverage), + ) + ) + + if n_periods == 2: + alerts.append( + Alert( + code="only_two_periods", + severity="info", + message="Only two time periods are observed (2x2 design).", + observed=2, + ) + ) + + if outcome_is_binary and outcome_dtype_kind == "f": + alerts.append( + Alert( + code="outcome_looks_binary_but_dtype_float", + severity="info", + message=("Outcome takes values in {0, 1} but is stored with a " "float dtype."), + observed=None, + ) + ) + + return alerts + + +def _jsonable(x: Any) -> Any: + """Coerce a value to a JSON-serializable primitive.""" + if x is None: + return None + if isinstance(x, bool): + return bool(x) + if isinstance(x, (int, float, str)): + return x + if isinstance(x, np.bool_): + return bool(x) + if isinstance(x, np.integer): + return int(x) + if isinstance(x, np.floating): + return float(x) + if isinstance(x, (pd.Timestamp, np.datetime64)): + return str(x) + if isinstance(x, dict): + return {_jsonable_key(k): _jsonable(v) for k, v in x.items()} + if isinstance(x, (list, tuple)): + return [_jsonable(v) for v in x] + return str(x) + + +def _jsonable_key(k: Any) -> Any: + """Coerce a mapping key to a JSON-compatible primitive.""" + if isinstance(k, bool): + return bool(k) + if isinstance(k, (int, float, str)): + return k + if isinstance(k, np.bool_): + return bool(k) + if isinstance(k, np.integer): + return int(k) + if isinstance(k, np.floating): + return float(k) + return str(k) diff --git a/tests/test_guides.py b/tests/test_guides.py index bc0abe83..2d08871d 100644 --- a/tests/test_guides.py +++ b/tests/test_guides.py @@ -1,4 +1,5 @@ """Tests for the bundled LLM guide accessor.""" + import importlib.resources import pytest @@ -7,7 +8,7 @@ from diff_diff._guides_api import _VARIANT_TO_FILE -@pytest.mark.parametrize("variant", ["concise", "full", "practitioner"]) +@pytest.mark.parametrize("variant", ["concise", "full", "practitioner", "autonomous"]) def test_all_variants_load(variant): text = get_llm_guide(variant) assert isinstance(text, str) @@ -19,9 +20,10 @@ def test_default_is_concise(): def test_full_is_largest(): - lengths = {v: len(get_llm_guide(v)) for v in ("concise", "full", "practitioner")} + lengths = {v: len(get_llm_guide(v)) for v in ("concise", "full", "practitioner", "autonomous")} assert lengths["full"] > lengths["concise"] assert lengths["full"] > lengths["practitioner"] + assert lengths["full"] > lengths["autonomous"] def test_content_stability_practitioner_workflow(): @@ -32,6 +34,22 @@ def test_content_stability_self_reference_after_rewrite(): assert "get_llm_guide" in get_llm_guide("concise") +def test_content_stability_autonomous_fingerprints(): + text = get_llm_guide("autonomous") + assert "profile_panel" in text + assert "estimator-support matrix" in text.lower() + + +def test_autonomous_contains_intact_estimator_matrix(): + # Section 3 is a markdown table with 10 data columns + the estimator + # name column -> rows have at least 11 pipe characters. This guards + # against the matrix being accidentally deleted or truncated. + text = get_llm_guide("autonomous") + assert any( + line.count("|") >= 11 for line in text.splitlines() + ), "Section 3 estimator-support matrix appears to be missing or truncated." 
+ + def test_wheel_content_matches_package_resource(): for variant, filename in _VARIANT_TO_FILE.items(): on_disk = ( diff --git a/tests/test_profile_panel.py b/tests/test_profile_panel.py new file mode 100644 index 00000000..b1d7a9f5 --- /dev/null +++ b/tests/test_profile_panel.py @@ -0,0 +1,868 @@ +"""Tests for ``diff_diff.profile_panel`` and the ``PanelProfile`` dataclass.""" + +from __future__ import annotations + +import dataclasses +import json +from typing import Any, Dict, Iterable, Optional + +import numpy as np +import pandas as pd +import pytest + +from diff_diff import PanelProfile, profile_panel +from diff_diff.profile import Alert + + +def _make_panel( + *, + n_units: int, + periods: Iterable[int], + first_treat: Optional[Dict[int, int]] = None, + outcome_fn: Any = None, +) -> pd.DataFrame: + """Build a balanced long panel with optional per-unit first-treatment timing. + + ``first_treat`` maps unit -> first treatment period (inclusive). Units not + in the mapping are never-treated. + """ + first_treat = first_treat or {} + rows = [] + rng = np.random.default_rng(0) + for u in range(1, n_units + 1): + for t in periods: + tr = 1 if (u in first_treat and t >= first_treat[u]) else 0 + if outcome_fn is not None: + y = outcome_fn(u, t, tr, rng) + else: + y = float(u) + 0.1 * t + 0.5 * tr + rows.append({"u": u, "t": t, "tr": tr, "y": y}) + return pd.DataFrame(rows) + + +def _alert_codes(profile: PanelProfile) -> set[str]: + return {a.code for a in profile.alerts} + + +def test_balanced_binary_2x2(): + first_treat = {u: 1 for u in range(11, 21)} + df = _make_panel(n_units=20, periods=[0, 1], first_treat=first_treat) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "binary_absorbing" + assert profile.is_staggered is False + assert profile.has_never_treated is True + assert profile.n_units == 20 + assert profile.n_periods == 2 + assert profile.is_balanced is True + + +def test_staggered_multi_cohort(): + first_treat: Dict[int, int] = {} + first_treat.update({u: 3 for u in range(1, 11)}) + first_treat.update({u: 5 for u in range(11, 21)}) + first_treat.update({u: 7 for u in range(21, 31)}) + df = _make_panel(n_units=40, periods=range(1, 9), first_treat=first_treat) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "binary_absorbing" + assert profile.is_staggered is True + assert profile.n_cohorts == 3 + assert profile.cohort_sizes == {3: 10, 5: 10, 7: 10} + assert profile.first_treatment_period == 3 + assert profile.last_treatment_period == 7 + assert profile.has_never_treated is True + + +def test_binary_non_absorbing_switcher(): + rows = [] + rng = np.random.default_rng(0) + for u in range(1, 21): + treat_seq = [0, 1, 1, 0, 0] if u > 10 else [0, 0, 0, 0, 0] + for t, tr in enumerate(treat_seq): + rows.append({"u": u, "t": t, "tr": tr, "y": rng.normal()}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "binary_non_absorbing" + assert profile.cohort_sizes == {} + assert profile.is_staggered is False + assert profile.has_never_treated is True + + +def test_continuous_treatment(): + rng = np.random.default_rng(0) + rows = [] + for u in range(1, 41): + dose = float(rng.uniform(0, 5)) + for t in range(4): + rows.append({"u": u, "t": t, "tr": dose, "y": rng.normal()}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + 
assert profile.treatment_type == "continuous" + assert profile.cohort_sizes == {} + assert profile.is_staggered is False + # Each unit has a constant dose across all periods → time-invariant. + assert profile.treatment_varies_within_unit is False + + +def test_continuous_treatment_with_time_varying_dose(): + """Time-varying dose must be flagged so agents routed to + ContinuousDiD do not hit the fit-time "dose must be time-invariant" + ValueError. treatment_varies_within_unit == True signals the + incompatibility.""" + rng = np.random.default_rng(0) + rows = [] + for u in range(1, 21): + for t in range(4): + dose = float(rng.uniform(0, 5)) + rows.append({"u": u, "t": t, "tr": dose, "y": rng.normal()}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "continuous" + assert profile.treatment_varies_within_unit is True + + +def test_binary_absorbing_varies_within_unit(): + """Binary-absorbing panels have within-unit treatment variation by + construction (0 pre, 1 post). The field is True.""" + first_treat = {u: 2 for u in range(11, 21)} + df = _make_panel(n_units=20, periods=range(0, 4), first_treat=first_treat) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_varies_within_unit is True + + +def test_continuous_positive_dose_does_not_fire_has_always_treated(): + """Valid ContinuousDiD panels have units with a constant positive + dose across all periods AND well-defined pre-treatment periods + (via a separate `first_treat` column). `has_always_treated` has + binary-only semantics, so it must be False on continuous panels + regardless of dose positivity. Previously the field conflated + "positive dose throughout" with "always treated in the DiD sense", + which fired the misleading `has_always_treated_units` alert on + valid continuous-DiD panels.""" + rng = np.random.default_rng(0) + rows = [] + for u in range(1, 21): + dose = 0.0 if u <= 5 else 2.5 + for t in range(4): + rows.append({"u": u, "t": t, "tr": dose, "y": rng.normal()}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "continuous" + assert profile.has_never_treated is True + assert profile.has_always_treated is False, ( + "has_always_treated must be False on continuous panels regardless " + "of dose positivity (binary-only semantics)" + ) + assert "has_always_treated_units" not in _alert_codes(profile) + + +def test_bool_dtype_treatment_is_binary_absorbing(): + """Bool-dtype treatment columns (True/False) must classify the same + way as numeric {0, 1}. The library's binary estimators validate on + value support via `validate_binary`, which accepts bool because + True/False coerce to 1/0 numerically. 
Classifying bool as + "categorical" would silently route valid binary DiD panels away + from the supported estimator set.""" + first_treat = {u: 2 for u in range(11, 21)} + rows = [] + for u in range(1, 21): + for t in range(4): + treated = u in first_treat and t >= first_treat[u] + rows.append({"u": u, "t": t, "tr": bool(treated), "y": float(u) + 0.1 * t}) + df = pd.DataFrame(rows) + assert df["tr"].dtype == bool + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "binary_absorbing" + assert profile.has_never_treated is True + assert profile.has_always_treated is False + assert profile.treatment_varies_within_unit is True + assert profile.cohort_sizes == {2: 10} + + +def test_bool_dtype_non_absorbing(): + """Reversible 0 -> 1 -> 0 treatment expressed as a bool column must + classify as binary_non_absorbing, same as numeric.""" + rows = [] + for u in range(1, 11): + seq = [False, True, True, False, False] if u > 5 else [False] * 5 + for t, tr in enumerate(seq): + rows.append({"u": u, "t": t, "tr": tr, "y": float(u) + 0.1 * t}) + df = pd.DataFrame(rows) + assert df["tr"].dtype == bool + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "binary_non_absorbing" + assert profile.has_never_treated is True + + +def test_categorical_treatment_object_dtype(): + rows = [] + for u in range(1, 11): + arm = "A" if u <= 5 else "B" + for t in range(4): + rows.append({"u": u, "t": t, "tr": arm, "y": float(u) + 0.1 * t}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "categorical" + assert profile.has_never_treated is False + assert profile.has_always_treated is False + + +def test_no_never_treated_alert(): + first_treat = {u: 2 for u in range(1, 21)} + df = _make_panel(n_units=20, periods=range(0, 5), first_treat=first_treat) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.has_never_treated is False + codes = _alert_codes(profile) + assert "no_never_treated" in codes + + +def test_has_always_treated_alert(): + rows = [] + for u in range(1, 21): + for t in range(5): + tr = 1 if u <= 5 else (1 if t >= 3 else 0) + rows.append({"u": u, "t": t, "tr": tr, "y": float(u) + 0.1 * t}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.has_always_treated is True + codes = _alert_codes(profile) + assert "has_always_treated_units" in codes + + +def test_unbalanced_panel_below_threshold(): + first_treat = {u: 3 for u in range(11, 21)} + df = _make_panel(n_units=20, periods=range(0, 5), first_treat=first_treat) + df = df.iloc[::3].reset_index(drop=True) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.is_balanced is False + assert profile.observation_coverage < 0.70 + codes = _alert_codes(profile) + assert "panel_highly_unbalanced" in codes + + +def test_binary_outcome_float_dtype_alert(): + first_treat = {u: 2 for u in range(11, 31)} + df = _make_panel( + n_units=30, + periods=range(0, 4), + first_treat=first_treat, + outcome_fn=lambda u, t, tr, rng: float(tr), + ) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.outcome_is_binary is True + assert profile.outcome_dtype == "float64" + codes = _alert_codes(profile) + assert "outcome_looks_binary_but_dtype_float" in codes + + +def 
test_outcome_missing_fraction_computed(): + first_treat = {u: 2 for u in range(11, 21)} + df = _make_panel(n_units=20, periods=range(0, 4), first_treat=first_treat) + df.loc[0:9, "y"] = np.nan + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert 0.0 < profile.outcome_missing_fraction < 1.0 + assert profile.outcome_missing_fraction == pytest.approx(10 / len(df)) + + +def test_short_pre_panel_alert(): + first_treat = {u: 1 for u in range(11, 21)} + df = _make_panel(n_units=20, periods=[0, 1, 2, 3], first_treat=first_treat) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.min_pre_periods == 1 + codes = _alert_codes(profile) + assert "short_pre_panel" in codes + + +def test_missing_column_raises_value_error(): + df = pd.DataFrame({"u": [1, 2], "t": [0, 1], "y": [0.0, 1.0]}) + with pytest.raises(ValueError, match="treatment"): + profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + + +def test_panel_profile_is_frozen(): + first_treat = {u: 2 for u in range(11, 21)} + df = _make_panel(n_units=20, periods=range(0, 4), first_treat=first_treat) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + with pytest.raises(dataclasses.FrozenInstanceError): + profile.n_units = 999 # type: ignore[misc] + + +def test_to_dict_is_json_serializable(): + first_treat = {u: 3 for u in range(11, 21)} + df = _make_panel(n_units=20, periods=range(0, 6), first_treat=first_treat) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + payload = profile.to_dict() + as_json = json.dumps(payload) + roundtripped = json.loads(as_json) + assert roundtripped["treatment_type"] == "binary_absorbing" + assert set(roundtripped.keys()) >= { + "n_units", + "n_periods", + "n_obs", + "is_balanced", + "observation_coverage", + "treatment_type", + "is_staggered", + "n_cohorts", + "cohort_sizes", + "has_never_treated", + "has_always_treated", + "treatment_varies_within_unit", + "first_treatment_period", + "last_treatment_period", + "min_pre_periods", + "min_post_periods", + "outcome_dtype", + "outcome_is_binary", + "outcome_has_zeros", + "outcome_has_negatives", + "outcome_missing_fraction", + "outcome_summary", + "alerts", + } + + +def test_alerts_are_factual_no_recommender_language(): + first_treat = {u: 1 for u in range(11, 21)} + df = _make_panel(n_units=12, periods=[0, 1, 2, 3], first_treat=first_treat) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + forbidden_substrings = ( + "recommend", + "should use", + "use estimator", + "we suggest", + "you should", + ) + for alert in profile.alerts: + lowered = alert.message.lower() + for phrase in forbidden_substrings: + assert phrase not in lowered, ( + f"alert {alert.code!r} contains recommender-adjacent phrase " + f"{phrase!r} in message: {alert.message!r}" + ) + + +def test_alert_dataclass_is_frozen(): + a = Alert(code="x", severity="info", message="m", observed=None) + with pytest.raises(dataclasses.FrozenInstanceError): + a.code = "y" # type: ignore[misc] + + +def test_all_zero_treatment_is_binary_absorbing(): + """Degenerate binary: no unit is ever treated. 
Must classify as binary, + not continuous, so the documented taxonomy matches the implementation.""" + df = _make_panel(n_units=20, periods=range(0, 4), first_treat=None) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "binary_absorbing" + assert profile.has_never_treated is True + assert profile.has_always_treated is False + assert profile.cohort_sizes == {} + assert profile.n_cohorts == 0 + + +def test_all_one_treatment_is_binary_absorbing_always_treated(): + """Degenerate binary: every unit treated in every period. Must classify as + binary_absorbing with has_always_treated=True.""" + rows = [] + for u in range(1, 21): + for t in range(4): + rows.append({"u": u, "t": t, "tr": 1, "y": float(u) + 0.1 * t}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "binary_absorbing" + assert profile.has_never_treated is False + assert profile.has_always_treated is True + codes = _alert_codes(profile) + assert "has_always_treated_units" in codes + + +def test_binary_with_nans_only_zeros_observed_is_binary(): + """Binary panel with some NaNs and only 0 observed among non-NaN values — + still classify as binary, not continuous.""" + rows = [] + for u in range(1, 11): + for t in range(4): + tr = 0 if (u + t) % 2 == 0 else np.nan + rows.append({"u": u, "t": t, "tr": tr, "y": float(u) + 0.1 * t}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "binary_absorbing" + + +def test_all_nan_treatment_is_categorical(): + """Treatment column entirely NaN — classify as categorical (no info).""" + rows = [] + for u in range(1, 11): + for t in range(4): + rows.append({"u": u, "t": t, "tr": np.nan, "y": float(u) + 0.1 * t}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "categorical" + + +def test_top_level_import_surface(): + """profile_panel, PanelProfile, and Alert must be importable from the + top-level namespace so `help(diff_diff)` points at real symbols.""" + import diff_diff + + assert callable(diff_diff.profile_panel) + assert diff_diff.PanelProfile.__name__ == "PanelProfile" + assert diff_diff.Alert.__name__ == "Alert" + for name in ("profile_panel", "PanelProfile", "Alert"): + assert name in diff_diff.__all__, f"{name} missing from __all__" + + +def test_duplicate_unit_time_rows_do_not_inflate_coverage(): + """Duplicate (unit, time) rows must not make a panel look balanced. 
+ observation_coverage must stay in [0, 1] and derive from the unique + (unit, time) support, and the duplicate_unit_time_rows alert fires.""" + first_treat = {u: 2 for u in range(11, 21)} + df = _make_panel(n_units=20, periods=range(0, 4), first_treat=first_treat) + df_dup = pd.concat([df, df.iloc[:5].copy()], ignore_index=True) + profile = profile_panel(df_dup, unit="u", time="t", treatment="tr", outcome="y") + assert profile.is_balanced is True + assert 0.0 <= profile.observation_coverage <= 1.0 + assert "duplicate_unit_time_rows" in _alert_codes(profile) + + df_missing_cell = df.drop(df.index[0]).reset_index(drop=True) + df_dup_missing = pd.concat( + [df_missing_cell, df_missing_cell.iloc[:5].copy()], ignore_index=True + ) + profile2 = profile_panel(df_dup_missing, unit="u", time="t", treatment="tr", outcome="y") + assert profile2.is_balanced is False + assert profile2.observation_coverage < 1.0 + assert "duplicate_unit_time_rows" in _alert_codes(profile2) + + +def test_reversal_through_nan_is_binary_non_absorbing(): + """A 0 -> 1 -> NaN -> 0 path must be detected as non-absorbing: the + observed non-NaN subsequence violates weak monotonicity. Previously a + NaN-inclusive diff could report False monotonicity violation.""" + rows = [] + for u in range(1, 11): + treat_seq = [0, 1, np.nan, 0] + for t, tr in enumerate(treat_seq): + rows.append({"u": u, "t": t, "tr": tr, "y": float(u) + 0.1 * t}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "binary_non_absorbing" + + +def test_continuous_zero_dose_controls_flag_has_never_treated(): + """Continuous treatment with some zero-dose units must flag + has_never_treated=True. Previously continuous panels hardcoded + has_never_treated=False regardless of control availability. + has_always_treated has binary-only semantics and must remain + False on continuous panels regardless of dose positivity.""" + rows = [] + rng = np.random.default_rng(0) + for u in range(1, 21): + dose = 0.0 if u <= 5 else float(rng.uniform(0.5, 3.0)) + for t in range(4): + rows.append({"u": u, "t": t, "tr": dose, "y": rng.normal()}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.treatment_type == "continuous" + assert profile.has_never_treated is True + assert profile.has_always_treated is False + + +def test_guide_api_strings_resolve_against_public_api(): + """Sanity-check that every estimator referenced in the autonomous guide + exists in the public API, plus the `hausman_pretest` classmethod location + and the `not_yet_treated` control-group string. 
Guards against guide + drift that the CI reviewer has previously flagged.""" + import diff_diff + from diff_diff import get_llm_guide + + text = get_llm_guide("autonomous") + + for name in ( + "DifferenceInDifferences", + "MultiPeriodDiD", + "TwoWayFixedEffects", + "CallawaySantAnna", + "SunAbraham", + "ChaisemartinDHaultfoeuille", + "ImputationDiD", + "TwoStageDiD", + "StackedDiD", + "WooldridgeDiD", + "EfficientDiD", + "SyntheticDiD", + "TROP", + "TripleDifference", + "StaggeredTripleDifference", + "ContinuousDiD", + "HeterogeneousAdoptionDiD", + ): + assert name in text, f"estimator {name!r} missing from guide" + assert hasattr(diff_diff, name), f"{name!r} in guide but not exported" + + assert hasattr( + diff_diff.EfficientDiD, "hausman_pretest" + ), "EfficientDiD.hausman_pretest classmethod missing from the public API" + + assert "EfficientDiD.hausman_pretest" in text + assert "Hausman.hausman_pretest" not in text + + assert 'control_group="not_yet_treated"' in text + assert "notyettreated" not in text + + # HAD targets WAS / WAS_d_lower, not ATT; event-study is per-event- + # time, not per-cohort. Guard against the guide drifting back to + # ATT-shaped / per-cohort phrasing. + assert "Weighted Average Slope (WAS)" in text + assert "WAS_d_lower" in text + assert "per-cohort Pierce-Schott" not in text + + # EfficientDiD has three paths when no never-treated exists: + # PT-Post, PT-All, or control_group="last_cohort". The guide must + # mention last_cohort in the no-never-treated section so agents do + # not rule out the supported path. + assert 'control_group="last_cohort"' in text + + # SunAbraham requires a never-treated cohort; the fit path raises a + # ValueError when none exists. Guard the matrix / prose contract so + # the guide cannot drift back to claiming SunAbraham is optional. + sun_abraham_row = next( + line for line in text.splitlines() if "`SunAbraham`" in line and "|" in line + ) + cells = [cell.strip() for cell in sun_abraham_row.strip("|").split("|")] + # Column order: estimator, binary_absorbing, staggered, continuous, + # triple-diff, never-treated-required, covariate, few-treated, + # heterogeneous-adoption, clustered-SE. + assert cells[5] == "✓", ( + "SunAbraham matrix row must mark never-treated-required=✓ " f"(row: {sun_abraham_row!r})" + ) + + # HAD Assumption 3 is not testable per REGISTRY.md; the guide must + # not claim otherwise. + assert "Assumption 3" in text # mentioned as untestable, not as validated + assert "validate Assumptions 3 and 7" not in text + assert "not testable" in text + + # EfficientDiD requires never-treated under BOTH assumption="PT-All" + # and assumption="PT-Post" — PT-Post is not a "drop the requirement" + # escape hatch. Only control_group="last_cohort" admits all-treated + # panels. Guard against guide drift back to the incorrect wording. + assert "PT-Post is the weaker" in text or "both" in text.lower() + # The old claim "switch to `assumption=\"PT-Post\"` to drop" must + # not reappear in any form. + assert 'switch to `assumption="PT-Post"` to drop' not in text + + # Matrix covariate cells: SyntheticDiD accepts fit(covariates=...) + # and residualizes the outcome; ContinuousDiD.fit has no covariate + # surface. Guard the matrix rows against drift. 
+ sdid_row = next(line for line in text.splitlines() if "`SyntheticDiD`" in line and "|" in line) + sdid_cells = [c.strip() for c in sdid_row.strip("|").split("|")] + assert sdid_cells[6] in ("✓", "partial"), ( + "SyntheticDiD covariate-adjustment cell must be ✓ or partial " + f"(residualization path exists); got {sdid_cells[6]!r}" + ) + cdid_row = next(line for line in text.splitlines() if "`ContinuousDiD`" in line and "|" in line) + cdid_cells = [c.strip() for c in cdid_row.strip("|").split("|")] + assert cdid_cells[6] == "✗", ( + "ContinuousDiD covariate-adjustment cell must be ✗ " + f"(no covariate surface on fit()); got {cdid_cells[6]!r}" + ) + + # §5 API signatures: compute_pretrends_power takes a fitted results + # object (not df), plot_sensitivity takes SensitivityResults, + # plot_honest_event_study takes HonestDiDResults. Guard against + # drift back to the df-first / results-only signatures. + assert "`compute_pretrends_power(results" in text + assert "`plot_sensitivity(sensitivity_results" in text + assert "`plot_honest_event_study(honest_results" in text + + # §6 BR/DR schema alignment. The emitted top-level keys are + # singular / underscored ("assumption", "pre_trends", "sample"), + # not the plural / run-together variants. DiagnosticReport emits + # sections at the top level (not nested under a "checks" dict) + # and uses "estimator" (the string class name) / "headline_metric" + # / "estimator_native_diagnostics". Guard each real key and + # forbid the obsolete ones. + for real_key in ( + "`assumption: dict`", + "`pre_trends: dict`", + "`sample: dict`", + "`headline_metric: dict`", + "`estimator_native_diagnostics: dict`", + "`overall_interpretation: str`", + ): + assert real_key in text, f"BR/DR §6 missing real key: {real_key}" + for obsolete_key in ( + "`assumptions: dict`", + "`pretrends: dict`", + "`main_result: dict`", + "`sample_summary: dict`", + "`estimator_type: str`", + "`checks: dict`", + ): + assert obsolete_key not in text, f"BR/DR §6 still lists obsolete key: {obsolete_key}" + + # BR `diagnostics` is a wrapper (status + schema/reason + possibly + # overall_interpretation), not the DR payload directly. Guard the + # wrapper wording so the guide does not drift back to telling + # agents to parse BR["diagnostics"] as the DR schema. + assert 'diagnostics["schema"]' in text + # target_parameter includes a `reference` field per + # describe_target_parameter(); guard its documentation. + assert "`reference` (REGISTRY.md citation string)" in text + + # Methodology source attribution: EfficientDiD is Chen, Sant'Anna, + # Xie (2025), not Arkhangelsky-Imbens. ContinuousDiD is Callaway, + # Goodman-Bacon, Sant'Anna (2024). Guard both attributions in the + # §4 prose and the §7 citation list. + assert "Chen, Sant'Anna, Xie 2025" in text + assert "(Arkhangelsky-Imbens)" not in text + assert "Callaway, Goodman-Bacon, Sant'Anna 2024" in text + # ContinuousDiD prose must distinguish the PT vs SPT identified + # targets rather than collapsing everything into "ACR". + assert "ATT(d|d)" in text + assert "ACRT" in text + assert "Strong Parallel Trends" in text + + # ContinuousDiD requires zero-dose (P(D=0) > 0) because Remark 3.1 + # lowest-dose-as-control is unimplemented; matrix col 5 must be ✓. 
+ assert cdid_cells[5] == "✓", ( + "ContinuousDiD matrix row must mark never-treated-required=✓ " + f"(P(D=0) > 0 required per Remark 3.1); got {cdid_cells[5]!r}" + ) + assert "P(D=0) > 0" in text + + # ContinuousDiD DOES support staggered adoption natively (via the + # `first_treat` column). Matrix column 2 (staggered) must be ✓. + assert cdid_cells[2] == "✓", ( + "ContinuousDiD matrix row must mark staggered=✓ " + "(adoption timing via first_treat is supported); " + f"got {cdid_cells[2]!r}" + ) + + # ContinuousDiD also requires dose to be time-invariant per unit; + # this is the second eligibility prerequisite the guide must spell + # out. Guide text must mention the invariant explicitly AND the + # `treatment_varies_within_unit` field used to detect it. + assert "time-invariant" in text + assert "treatment_varies_within_unit" in text + + # DR §6 section statuses: execution-state vocabulary must include + # the actual emitted values ("ran", "not_applicable", "not_run", + # "no_scalar_by_design", "skipped"), and `verdict` must be + # documented separately from `status`. Guard against drift back + # to the pass/warn/inconclusive-as-status framing. + for real_status in ( + '"ran"', + '"not_applicable"', + '"not_run"', + '"no_scalar_by_design"', + ): + assert real_status in text, f"DR §6 section-status vocabulary must document {real_status}" + # `status` must not be described as "pass/warn/inconclusive" — + # those belong under `verdict`. + assert '`"pass"` / `"warn"` / `"inconclusive"`' not in text + assert "verdict" in text.lower() + + # Balanced-panel eligibility: ContinuousDiD, EfficientDiD, + # SyntheticDiD, and HeterogeneousAdoptionDiD all hard-reject + # unbalanced panels at fit() time. The guide must surface this + # so agents gate these estimators on PanelProfile.is_balanced + # before selecting them. + assert "is_balanced" in text, ( + "Guide must mention PanelProfile.is_balanced as an eligibility " + "check for balance-sensitive estimators" + ) + for estimator in ( + "ContinuousDiD", + "EfficientDiD", + "SyntheticDiD", + "HeterogeneousAdoptionDiD", + "StaggeredTripleDifference", + ): + idx = 0 + found = False + while idx < len(text): + loc = text.find(estimator, idx) + if loc < 0: + break + window = text[max(0, loc - 400) : loc + 400] + if "balanced" in window.lower() or "is_balanced" in window: + found = True + break + idx = loc + 1 + assert found, ( + f"Guide must mention a balanced-panel constraint near the " + f"{estimator!r} bullet / row (hard-rejects unbalanced panels " + "at fit time)" + ) + + # HeterogeneousAdoptionDiD staggered support is `partial` and + # specifically last-cohort-only (Appendix B.2): with first_treat_col + # supplied, fit() auto-filters to F_last + never-treated; without + # first_treat_col, a multi-cohort panel raises. Guide must surface + # this explicitly so agents don't route a general staggered panel + # to HAD expecting a multi-cohort estimand.
+ assert "last-cohort-only" in text or "last cohort" in text.lower(), ( + "Guide must name the last-cohort-only restriction on HAD " + "staggered support (Appendix B.2)" + ) + assert "first_treat_col" in text, ( + "Guide must mention that first_treat_col is required to activate " + "HAD's staggered last-cohort auto-filter" + ) + assert "ChaisemartinDHaultfoeuille" in text, ( + "Guide must point at ChaisemartinDHaultfoeuille as the fallback " + "for full staggered support" + ) + + # Balanced-panel gate is incomplete with `is_balanced` alone because + # duplicate (unit, time) rows don't flip is_balanced. Guide must + # require BOTH is_balanced == True AND absence of the + # duplicate_unit_time_rows alert before routing to the duplicate- + # intolerant estimators (ContinuousDiD silently overwrites + # duplicates via last-row-wins; EfficientDiD/HAD raise). + assert "duplicate_unit_time_rows" in text, ( + "Guide must name the duplicate_unit_time_rows alert as part of " + "the balanced-panel eligibility gate" + ) + assert "BOTH" in text or "both" in text, ( + "Guide must require BOTH is_balanced and absence of the " + "duplicate_unit_time_rows alert before routing to duplicate-" + "intolerant estimators" + ) + + # ChaisemartinDHaultfoeuille handles non-absorbing / reversible + # treatment; SUTVA is still assumed (no native interference or + # spillover support per REGISTRY.md). Guard against the guide + # drifting back to advertising dCDH as "robust to spillover + # designs" or similar. + for phrase in ( + "robust to spillover", + "interference-robust", + "supports spillover", + "and to spillover", + ): + assert phrase not in text, ( + f"Guide must not advertise unsupported dCDH capability " + f"{phrase!r}: SUTVA is assumed across the estimator suite." + ) + + # Repeated-cross-section (§4.10) must not claim broad + # applicability. The documented RCS-capable estimators are + # CallawaySantAnna(panel=False), TripleDifference, and + # StaggeredTripleDifference; EfficientDiD and + # HeterogeneousAdoptionDiD explicitly reject RCS per REGISTRY.md. + assert "most estimators remain applicable" not in text, ( + "§4.10 must not claim broad RCS applicability; only the " + "explicitly documented RCS-capable subset is applicable." + ) + assert "panel=False" in text, ( + "§4.10 must point at CallawaySantAnna(panel=False) as the " "explicit RCS mode" + ) + # The section must explicitly name at least one panel-only + # estimator as rejected for RCS, so agents do not silently route + # RCS data to it. + rcs_section_start = text.find("§4.10 Repeated cross-sections") + assert rcs_section_start >= 0 + rcs_section = text[rcs_section_start : rcs_section_start + 2500] + for panel_only in ( + "EfficientDiD", + "HeterogeneousAdoptionDiD", + "StaggeredTripleDifference", + ): + assert panel_only in rcs_section, ( + f"§4.10 must explicitly name {panel_only!r} as panel-only " + "so RCS data is not routed to it" + ) + + # The explicit RCS-capable bullet list must NOT put + # StaggeredTripleDifference next to the RCS-support language. + # The estimator has no panel=False mode and fit() rejects + # unbalanced input; only TripleDifference (non-staggered) is + # cross-sectional-DDD-capable. 
+ explicit_support_block = text.find("Explicit RCS support", rcs_section_start) + rejected_block = text.find("Explicitly rejected for RCS", rcs_section_start) + assert 0 <= explicit_support_block < rejected_block, ( + "§4.10 must separate an Explicit RCS support list from the " "Explicitly rejected list" + ) + explicit_segment = text[explicit_support_block:rejected_block] + assert "StaggeredTripleDifference" not in explicit_segment, ( + "StaggeredTripleDifference must NOT appear in the Explicit RCS " + "support list — it is panel-only and balance-enforced." + ) + + +def test_min_pre_post_use_per_unit_observed_support(): + """On an unbalanced panel where one treated unit is missing its + earliest pre-period, min_pre_periods must reflect that unit's actual + observed support. Previously _compute_pre_post used the global period + set, which could hide short-panel cases and suppress the short_pre_panel + alert.""" + rows = [] + for u in range(1, 21): + first_treat = 3 + for t in range(0, 6): + if u == 1 and t <= 1: + continue + tr = 1 if t >= first_treat else 0 + rows.append({"u": u, "t": t, "tr": tr, "y": float(u) + 0.1 * t}) + df = pd.DataFrame(rows) + profile = profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + assert profile.min_pre_periods == 1 + assert "short_pre_panel" in _alert_codes(profile) + + +def test_missing_unit_or_time_ids_are_dropped_consistently(): + """NaN values in unit or time must not push observation_coverage above + 1.0. `nunique()` drops NaN while `drop_duplicates()` keeps NaN as a + distinct key, which previously produced coverage > 1 silently. The + fix drops NaN-id rows up front, emits the missing_id_rows_dropped + alert, and computes all structural facts on the non-missing subset.""" + first_treat = {u: 2 for u in range(11, 21)} + df = _make_panel(n_units=20, periods=range(0, 4), first_treat=first_treat) + df_with_missing = df.copy() + df_with_missing.loc[[0, 1, 2], "u"] = np.nan + df_with_missing.loc[[5, 6], "t"] = np.nan + profile = profile_panel(df_with_missing, unit="u", time="t", treatment="tr", outcome="y") + assert 0.0 <= profile.observation_coverage <= 1.0 + codes = _alert_codes(profile) + assert "missing_id_rows_dropped" in codes + drop_alert = next(a for a in profile.alerts if a.code == "missing_id_rows_dropped") + assert drop_alert.observed == 5 + + +def test_row_with_both_ids_missing_counted_once(): + """A row with BOTH unit and time NaN must count as one dropped row, + not two. 
Previously `isna().sum()` summed the two columns and + double-counted rows missing both identifiers.""" + first_treat = {u: 2 for u in range(11, 21)} + df = _make_panel(n_units=20, periods=range(0, 4), first_treat=first_treat) + df_both_missing = df.copy() + df_both_missing.loc[0, "u"] = np.nan + df_both_missing.loc[0, "t"] = np.nan + profile = profile_panel(df_both_missing, unit="u", time="t", treatment="tr", outcome="y") + drop_alert = next(a for a in profile.alerts if a.code == "missing_id_rows_dropped") + assert drop_alert.observed == 1 + + +def test_empty_dataframe_raises_value_error(): + """Direct empty input must raise, not silently return a 'balanced' + profile with zero units/periods.""" + df = pd.DataFrame({"u": [], "t": [], "tr": [], "y": []}) + with pytest.raises(ValueError, match="empty"): + profile_panel(df, unit="u", time="t", treatment="tr", outcome="y") + + +def test_empty_after_id_drop_raises_value_error(): + """If every row has a missing unit or time identifier, the panel is + empty after the drop; raise rather than returning is_balanced=True + on zero rows.""" + df = pd.DataFrame( + { + "u": [np.nan, np.nan], + "t": [0, 1], + "tr": [0, 1], + "y": [0.1, 0.2], + } + ) + with pytest.raises(ValueError, match="no rows remain"): + profile_panel(df, unit="u", time="t", treatment="tr", outcome="y")