7 changes: 6 additions & 1 deletion pyproject.toml
@@ -11,7 +11,7 @@ build-backend = "hatchling.build"

[project]
name = "scitex"
version = "2.27.3"
version = "2.28.0"
description = "A comprehensive Python library for scientific computing and data analysis"
readme = "README.md"
requires-python = ">=3.10"
@@ -52,6 +52,7 @@ classifiers = [
# pip install scitex[audio,scholar] # multiple modules
# pip install scitex[all] # everything
dependencies = [
"scitex-config>=0.3.0",
# Essential Scientific Computing
"numpy",
"pandas",
@@ -138,6 +139,10 @@ app = [
"scitex-app>=0.1.0",
]

# HPC Module - Generic SLURM dispatch (srun, sbatch, sync, poll, fetch)
# Use: pip install scitex[hpc]
hpc = ["scitex-hpc>=0.1.0"]

# AI Module - LLM APIs and ML tools
# Use: pip install scitex[ai]
ai = [
@@ -25,6 +25,12 @@ All SciTeX packages MUST use the `SCITEX_<MODULE_NAME>_*` prefix for environment
- Show `$ENV_VAR_NAME` in CLI help defaults, not resolved values
- Configuration is external (env vars, config files) — never hardcode secrets or defaults that should be user-configurable

## Resolution precedence

Per-field precedence is **direct (CLI flag) → config_dict (YAML) → env var → default**. Env vars sit between YAML and built-in defaults — they're a fallback when no YAML config is loaded, never an override of explicit YAML or CLI values.

Use `scitex_config.PriorityConfig.resolve()` rather than hand-rolling the cascade. See `03_interface_02_cli.md` §6c for the full CLI rule and `~/proj/scitex-config/src/scitex_config/_PriorityConfig.py` for the canonical implementation.
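
To make the ordering concrete, here is a minimal sketch of the cascade. It is an illustration only, not the `PriorityConfig` implementation: the function name `resolve_field`, the `SCITEX_DEMO_` prefix, and the `environ` parameter are hypothetical, and real code should still call `PriorityConfig.resolve()`.

```python
import os
from typing import Any, Mapping, Optional


def resolve_field(
    key: str,
    direct_val: Optional[Any] = None,
    config_dict: Optional[Mapping[str, Any]] = None,
    env_prefix: str = "SCITEX_DEMO_",  # hypothetical prefix for the demo
    default: Any = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Any:
    """Illustrative cascade: direct -> config_dict -> env var -> default."""
    environ = os.environ if environ is None else environ
    # 1. An explicit CLI value always wins.
    if direct_val is not None:
        return direct_val
    # 2. A value loaded from YAML comes next.
    if config_dict and config_dict.get(key) is not None:
        return config_dict[key]
    # 3. The env var is only consulted when YAML supplied nothing.
    env_name = f"{env_prefix}{key.upper()}"
    if env_name in environ:
        return environ[env_name]
    # 4. Built-in default as the last resort.
    return default
```

Note that step 3 never runs when YAML already holds a value, which is the "fallback, not override" behaviour described above.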

## Feature Flags

All SciTeX feature flags follow the **opt-out** pattern (default enabled, explicitly disable):
48 changes: 47 additions & 1 deletion src/scitex/_skills/general/01_arch_06_local-state-directories.md
@@ -151,9 +151,55 @@ If a package already ships a different layout (`~/.scitex/<pkg>_config.yaml`, `~

Do not keep permanent back-compat shims — legacy locations silently defeat `SCITEX_DIR`.

## 9. Related
## 9. Cross-package SoC — each package owns a domain

The `.scitex/<pkg-short>/` layout is also the canonical place each package stores config that **other packages then consume**. The rule of thumb:

> If a question has one obviously-correct answer that should be the same everywhere ("what is this machine called?", "what is the SLURM cluster?", "where is the scholar cache?"), exactly one scitex-* package owns it. Every other package imports its API.

Anti-pattern: each package re-deriving the answer (e.g. `socket.gethostname()` in five places, getting different results because of FQDN drift, login-node aliasing, container hostnames). When the answer drifts between packages the user sees inconsistency on the dashboard, in logs, in cron entries.

### Worked example — machine identity (owner: `scitex-resource`)

`scitex-resource` owns "what machine am I?" because resource detection is its domain. Config lives at `~/.scitex/resource/config.yaml`:

```yaml
machine:
  canonical_name: mba
  aliases:
    - Yusukes-MacBook-Air
    - Yusukes-MacBook-Air.local
  role: head
  hpc:                # optional
    cluster: spartan
    login_only: true

API (resolution: `$SCITEX_RESOURCE_MACHINE` → project config → user config → short hostname):

```python
from scitex_resource import get_machine_name, get_machine_config

name = get_machine_name() # always returns the same string everywhere
cfg = get_machine_config() # full block with aliases / role / hpc
```

Consumers — `scitex-orochi`, `scitex-hpc`, `scitex-agent-container` — call `get_machine_name()` instead of rolling their own hostname logic. The user sets the canonical name once per host; every package agrees.
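
The resolution order named above can be sketched as follows. This is an assumption-laden illustration, not the `scitex-resource` code: `resolve_machine_name` and its parameters are hypothetical, and the config dicts are assumed to have the `machine.canonical_name` shape shown in the YAML.

```python
import os
import socket
from typing import Any, Mapping, Optional


def resolve_machine_name(
    project_cfg: Optional[Mapping[str, Any]] = None,
    user_cfg: Optional[Mapping[str, Any]] = None,
    environ: Optional[Mapping[str, str]] = None,
    hostname: Optional[str] = None,
) -> str:
    """Hypothetical sketch: env var -> project config -> user config -> short hostname."""
    environ = os.environ if environ is None else environ
    # 1. The env var overrides everything else.
    env_name = environ.get("SCITEX_RESOURCE_MACHINE")
    if env_name:
        return env_name
    # 2-3. Project config is checked before user config.
    for cfg in (project_cfg, user_cfg):
        name = ((cfg or {}).get("machine") or {}).get("canonical_name")
        if name:
            return name
    # 4. Fall back to the short hostname (drop any domain suffix).
    raw = socket.gethostname() if hostname is None else hostname
    return raw.split(".", 1)[0]
```

Keeping this logic in one function is what lets every consumer agree: the FQDN and alias handling lives in a single place instead of five.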

### When to make a new package the owner

You're tempted to add config to `~/.scitex/<your-pkg>/config.yaml` for a fact that other packages will also need. Ask:

1. **Is the fact about `<your-pkg>`'s domain?** If yes, you own it. Expose a public function. Done.
2. **Is the fact about a domain another scitex-* package already owns?** Consume their API. Don't duplicate the config.
3. **Is the fact ecosystem-wide and no package owns it yet?** Decide who *should* own it (whose name fits best), put the config there, expose the API there. Don't create a "scitex-shared" or "scitex-common" package: that is an anti-pattern (everyone depends on it, no one feels responsible for it).

The `runtime/` directory follows the same rule: `<pkg-short>/runtime/` is exclusively for *that* package's regenerable state. Never write into another package's `runtime/`.

## 10. Related

- `03_interface_02_cli.md` §6b — config-file resolution uses this layout.
- `01_arch_03_modules-and-standalone-packages.md` §5–§6 — `PathManager` dependency-injection pattern.
- `01_arch_04_environment-variables.md` — `SCITEX_DIR` and per-package `SCITEX_<PKG>_CONFIG`.
- `06_skills_03_public-vs-private.md` — private skills live under `<pkg-short>/shared/skills/`.
- `scitex-resource` `_machine.py` — reference implementation of cross-package SoC (machine identity).
120 changes: 119 additions & 1 deletion src/scitex/_skills/general/03_interface_02_cli.md
@@ -320,6 +320,29 @@ Precedence (highest first): `--config PATH` → `$SCITEX_<PKG>_CONFIG` → `<pro

Canonical filename is always `config.yaml` (not `<pkg>_config.yaml`). Project scope overrides user scope; `--config PATH` and `$SCITEX_<PKG>_CONFIG` override both. The full layout rule (two roots, prefix-stripping such as `scitex-dev` → `dev`, forbidden locations, `SCITEX_DIR` relocation, `PathManager` usage) lives in `01_arch_06_local-state-directories.md`. Document the fallback order in `--help`.

### 6c. Value precedence — `scitex_config.PriorityConfig`

For every individual config field (host, port, package list, etc.), resolve via the cascade:

```
direct (CLI flag) → config_dict (YAML) → env var → default
```

**Direct CLI flags ALWAYS win** — never let the YAML override an explicit `--flag`. Config files come second, env vars third, hardcoded defaults last.

**Implementation: do not hand-roll the cascade.** Use `scitex_config.PriorityConfig`:

```python
from scitex_config import PriorityConfig

cfg = PriorityConfig(config_dict=yaml_data, env_prefix="SCITEX_DEV_")
host = cfg.resolve("host", direct_val=cli_args.host, default="localhost")
```

Reference implementations: `scitex_hpc._config.JobConfig.resolve()`, the `scitex_config` package itself.

**Why centralized:** when every CLI follows the same precedence, operators learn it once. A YAML silently overriding a CLI flag is a class of "I told you to do X but you did Y" bugs that disappear when this rule is enforced uniformly.

## 7. MCP tool parity

When a CLI command has an MCP tool counterpart:
@@ -336,7 +359,97 @@ When a CLI command has an MCP tool counterpart:
- **Rule:** a user must be able to `cmd --json | jq ...` with zero
log contamination on stdout.

## 9. Audit checklist
## 9. Mutating operations — observation → dry-run → execute

For any command that **changes remote state** (sync, deploy, install, push, delete, etc.), provide three modes selected by mutually exclusive flags. Default = observation. The operator must understand what will happen *before* it happens.

### 9a. Three modes

| Mode | Flag | Behavior |
|---|---|---|
| **Observation** (default) | (none) | Read-only audit. Print what's currently true, what's stale, what's missing. Exit 0 if everything matches reference; 1 otherwise. |
| **Dry-run** | `--dry-run` | Print the **exact commands** that would execute (`git pull`, `pip install`, `scp`, etc.) — one per line per (target, item). No execution. |
| **Execute** | `--<verb>-<scope>` (e.g. `--update-hosts`, `--upload-files`) | Actually run the dry-run plan. |
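
One way to wire the three modes, sketched with the standard-library `argparse`. The program name `demo-sync` and the helper names are hypothetical, and `--update-hosts` stands in for whichever scope-named execute flag the command defines.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo-sync")  # hypothetical CLI name
    # --dry-run and the execute flag are mutually exclusive; neither = observation.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--dry-run", action="store_true",
                      help="print the exact commands, run nothing")
    mode.add_argument("--update-hosts", action="store_true",
                      help="execute the plan on the selected hosts")
    return parser


def select_mode(argv: Optional[List[str]] = None) -> str:
    """Map parsed flags onto the three modes from the table."""
    args = build_parser().parse_args(argv)
    if args.update_hosts:
        return "execute"
    if args.dry_run:
        return "dry-run"
    return "observation"


def observation_exit_code(stale_items: List[str]) -> int:
    """Exit 0 when everything matches the reference, 1 otherwise."""
    return 0 if not stale_items else 1
```

Passing both flags makes `argparse` print a usage error and exit with status 2, so the mutual exclusion needs no extra code.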

### 9b. Flag-naming for execute mode

**Name the action by the scope it touches.** A flag called `--apply` is too abstract — the operator can't tell what gets modified. A flag called `--update-hosts` says "this updates hosts" so the scope is unambiguous.

Examples:
- `--update-hosts` (sync git + pip on remote machines)
- `--upload-files` (push artifacts somewhere)
- `--delete-stale-runs` (remove old job dirs)

Avoid: `--apply`, `--commit`, `--go`, `--run` — too generic.

### 9c. Reference flag for source-of-truth

When the operation converges to a target state, name the source explicitly:

```
--reference origin/develop # default: github branch
--reference localhost # local working copy
--reference pypi # latest published version
--reference <host>:<branch> # arbitrary peer
```

This makes "what are we converging toward?" answerable from the flags alone.
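
A small parser for the four forms listed above might look like this. It is a sketch under assumptions: the `Reference` tuple, the `kind` labels, and the rule "a colon means `<host>:<branch>`" are illustrative choices, not the `scitex-dev` implementation.

```python
from typing import NamedTuple, Optional


class Reference(NamedTuple):
    kind: str             # "git", "localhost", "pypi", or "peer" (labels are hypothetical)
    host: Optional[str]   # set only for the peer form
    ref: Optional[str]    # branch or git ref, when applicable


def parse_reference(value: str = "origin/develop") -> Reference:
    """Classify a --reference value into one of the documented forms."""
    if value in ("localhost", "pypi"):
        return Reference(kind=value, host=None, ref=None)
    if ":" in value:
        # <host>:<branch> names an arbitrary peer.
        host, branch = value.split(":", 1)
        return Reference(kind="peer", host=host, ref=branch)
    # Anything else is treated as a git ref such as origin/develop.
    return Reference(kind="git", host=None, ref=value)
```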

### 9d. Filter flags name the object scope

Filter flags should be **plural nouns** that name the scope:

| Flag | Scope |
|---|---|
| `--hosts mba,nas` | machines (filter sync targets) |
| `--packages scitex-io,scitex-nn` | packages (filter what to operate on) |
| `--branches main,develop` | git branches |
| `--users alice,bob` | accounts |

Singular forms (`--host`, `--package`) are accepted as aliases but plurals signal "list values".
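
With `argparse`, plural filter flags with singular aliases can be declared as below. The parser name `demo-filter` and the `csv_list` helper are hypothetical; treating an omitted flag as "no filter" is an assumption of this sketch.

```python
import argparse
from typing import List


def csv_list(value: str) -> List[str]:
    """Split 'mba,nas' into ['mba', 'nas'], ignoring empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_filter_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo-filter")  # hypothetical CLI name
    # The plural is the primary spelling; the singular is accepted as an alias.
    parser.add_argument("--hosts", "--host", dest="hosts",
                        type=csv_list, default=None,
                        help="comma-separated machines (default: no filter)")
    parser.add_argument("--packages", "--package", dest="packages",
                        type=csv_list, default=None,
                        help="comma-separated packages (default: no filter)")
    return parser
```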

### 9e. Dry-run is **enforced**, not optional

For destructive or wide-scope mutating operations (renames, mass deletes, cross-host syncs), execute mode must **refuse to run** unless a matching `--dry-run` has been performed recently and the operator confirms the same plan is being executed.

**Reference implementation: `scitex-dev rename-symbols`** (canonical workflow):

```
1. Clean git tree ← refuses if uncommitted changes
2. Dry-run preview ← refuses execute without recent --dry-run
3. Review the change list ← inspect the dry-run file list + counts
4. Real run ← matching --dry-run gates the execute
5. Test the result ← reminder printed after execute
```

**Enforcement mechanism:** the dry-run writes a manifest (set of operations + hash of inputs) to a state file (e.g. `~/.scitex/<pkg>/last-dry-run.json`). Execute reads it, recomputes the hash from current state, and refuses to run if:
- No manifest exists, OR
- Manifest is older than N minutes, OR
- Recomputed hash doesn't match (state changed since dry-run)

The error message tells the operator: "Re-run `--dry-run` first; the plan may have changed." A `--force` escape hatch is provided for CI/scripted contexts where the dry-run was already performed and audited out-of-band.
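
The manifest gate can be sketched in a few functions. This is a simplified illustration, not the `rename-symbols` code: the function names and the 15-minute `MAX_AGE_SECONDS` are hypothetical, and the hash here covers the planned operations rather than every input a real tool would track.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import List, Optional

MAX_AGE_SECONDS = 15 * 60  # hypothetical value for "N minutes"


def plan_hash(operations: List[str]) -> str:
    """Stable hash of the planned operations."""
    payload = json.dumps(operations).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def write_manifest(path: Path, operations: List[str],
                   now: Optional[float] = None) -> None:
    """Called by --dry-run: record the plan, its hash, and a timestamp."""
    manifest = {
        "created": time.time() if now is None else now,
        "hash": plan_hash(operations),
        "operations": operations,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")


def check_manifest(path: Path, current_operations: List[str],
                   now: Optional[float] = None) -> Optional[str]:
    """Called by execute: return None to proceed, else a refusal reason."""
    if not path.exists():
        return "no manifest: re-run --dry-run first"
    manifest = json.loads(path.read_text(encoding="utf-8"))
    now = time.time() if now is None else now
    if now - manifest["created"] > MAX_AGE_SECONDS:
        return "manifest too old: re-run --dry-run first"
    if manifest["hash"] != plan_hash(current_operations):
        return "plan changed since dry-run: re-run --dry-run first"
    return None
```

A `--force` path would simply skip `check_manifest`; keeping the check in one function makes that bypass explicit and easy to audit.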

**Why enforce, not just suggest:** a non-enforced dry-run gets skipped under time pressure ("I already know what it'll do"). An enforced dry-run is a literal checkpoint where the operator sees the plan, then opts in. This eliminates "I thought it would do X but it did Y" failures.

### 9f. Worked example

```bash
# Observation: where do hosts stand vs origin/develop?
scitex-dev ecosystem packages

# Filter to a subset
scitex-dev ecosystem packages --hosts mba --packages scitex-io

# Preview what sync would do (writes a manifest)
scitex-dev ecosystem packages --hosts mba --dry-run

# Execute — refuses if no recent matching --dry-run
scitex-dev ecosystem packages --hosts mba --update-hosts
```

The same command surface gives a three-step ritual: observe → preview → act. The operator who reads the dry-run output knows exactly what `--update-hosts` will do, and the tool enforces the order.

## 10. Audit checklist

When auditing a new or existing SciTeX CLI:

@@ -347,8 +460,13 @@ When auditing a new or existing SciTeX CLI:
- [ ] Deprecated names hard-error with redirect (§5)
- [ ] Env vars use `SCITEX_<PKG>_*` prefix (§6a)
- [ ] Config file path follows §6b
- [ ] Per-field precedence via `scitex_config.PriorityConfig` (§6c)
- [ ] MCP parity if applicable (§7)
- [ ] stdout/stderr separation clean (§8)
- [ ] Mutating ops: observation default + `--dry-run` + scope-named execute flag (§9)
- [ ] Filter flags use plural scope nouns (`--hosts`, `--packages`) (§9d)
- [ ] `--reference` names the source-of-truth for state-converging ops (§9c)
- [ ] Destructive ops enforce `--dry-run` before execute via manifest gate (§9e; reference: `scitex-dev rename-symbols`)

## Cross-references

@@ -23,7 +23,7 @@ When user says "update all packages" or "full release", for each package:

- `mcp__scitex__dev_ecosystem_list` — initial status check across all packages
- `mcp__scitex__dev_ecosystem_fix_mismatches` — auto-fix installed vs pyproject.toml mismatches after PyPI publish
- CLI equivalent: `scitex-dev ecosystem fix-mismatches --confirm`
- CLI equivalent: `scitex-dev ecosystem packages --update-hosts`

## Dashboard

@@ -49,23 +49,22 @@ scitex-dev show-config # Show resolved dev config
scitex-dev search-docs <query> # Search package docs
```

### Sync
### Sync / verify packages

```bash
scitex-dev ecosystem sync # Local editable reinstall (dry-run default)
scitex-dev ecosystem sync --confirm # Execute
scitex-dev ecosystem sync-remote --host nas # Push to remote host over SSH
scitex-dev ecosystem sync-remote --confirm --host all
```

### Fix version mismatches
The unified `ecosystem packages` command observes, previews, or executes
sync against `--reference` (default `origin/develop`):

```bash
scitex-dev ecosystem fix-mismatches # Preview
scitex-dev ecosystem fix-mismatches --confirm # Execute
scitex-dev ecosystem packages # observation: state per (host, package)
scitex-dev ecosystem packages --hosts nas # filter by host (plural noun)
scitex-dev ecosystem packages --packages scitex-io --dry-run # preview the plan
scitex-dev ecosystem packages --hosts all --update-hosts # execute (after dry-run)
```

Aligns installed version, pyproject toml version, and git tag for every package.
The legacy `sync-remote` and `fix-mismatches` subcommands are deprecated
names, not working aliases: they print a redirect and exit with code 2.

Aligns installed version, pyproject.toml version, and git tag for every package.

### Utilities

@@ -145,11 +144,12 @@ Run `scitex-dev ecosystem list` for the authoritative roster and current version
Detects both **version mismatches** (toml != tag != PyPI) and **code-version mismatches** (commits exist since last tag but version not bumped).

```bash
scitex-dev ecosystem fix-mismatches # Preview mismatches
scitex-dev ecosystem fix-mismatches --confirm # Fix them
scitex-dev ecosystem packages # Observation: print mismatches
scitex-dev ecosystem packages --dry-run # Preview the fix plan
scitex-dev ecosystem packages --update-hosts # Execute the fixes
```

Or via MCP: `mcp__scitex__dev_ecosystem_fix_mismatches`.
Or via MCP: `mcp__scitex__dev_ecosystem_packages`.

Python API:

Expand Down
32 changes: 31 additions & 1 deletion src/scitex/_skills/general/06_skills_04_editable-installation.md
@@ -50,7 +50,37 @@ is_editable = direct_url and '"editable": true' in direct_url
"src/<pkg_name>/_skills" = "<pkg_name>/_skills"
```

- After every wheel-publish, verify a fresh `pip install scitex-<pkg>` into a clean venv still sees the skills:
### Why setuptools needs the explicit `package-data` entry

`[tool.setuptools.packages.find] where = ["src"]` only picks up **Python packages** (directories containing `__init__.py`). Markdown files in subdirectories like `_skills/<pkg>/SKILL.md` are NOT auto-included. The result is a silent failure mode:

- `git ls-files` shows `SKILL.md` is tracked ✅
- The source tree under `src/<pkg>/_skills/` is intact ✅
- `python -m build --wheel` builds successfully ✅
- But the resulting wheel does NOT contain the file ❌

PyPI users who `pip install <pkg>` see no skill page, and skill-discovery agents iterate over an empty `<pkg>._skills` namespace. The CI workflow won't catch this because nothing imports the markdown file.

### Pre-publish verification (5-second check that catches the silent failure)

After every `python -m build`, **before tagging the release**, verify the wheel actually contains the data files you expect:

```bash
unzip -l dist/<pkg>-<version>-py3-none-any.whl | grep -E '_skills|SKILL\.md'
```

Expected output (one line per shipped skill leaf):
```
6716 2026-04-28 01:58 <pkg>/_skills/<pip-name>/SKILL.md
```

No matching lines = the wheel is missing skills. Re-check `pyproject.toml`'s `[tool.setuptools.package-data]` (or the hatch `force-include` block), rebuild, re-verify. **Don't tag until this passes.**
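
The same check can be scripted for CI with the standard-library `zipfile` module, since a wheel is a zip archive. The function name `skill_files_in_wheel` is hypothetical; the match rule mirrors the `grep -E '_skills|SKILL\.md'` pattern above.

```python
import zipfile
from pathlib import Path
from typing import List


def skill_files_in_wheel(wheel_path: Path) -> List[str]:
    """Return archive members that look like shipped skill files."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return [
            name for name in wheel.namelist()
            if "_skills" in name or "SKILL.md" in name
        ]
```

A release script could call this on each file under `dist/` and abort before tagging when the returned list is empty.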

A real instance of this trap: in scitex-hpc 0.6.1 (2026-04-28), `SKILL.md` was added to git but the package-data entry was missing. The wheel built without errors, the version bump succeeded, and CI was green. The omission was caught at the `unzip -l` step before the tag was pushed, and 0.6.2 shipped with the fix the same day. Without the unzip check, 0.6.1 would have been an "everything looks done" release that failed silently for every PyPI user.

### Post-install verification (after the wheel is live)

For belt-and-suspenders, confirm a fresh `pip install scitex-<pkg>` into a clean venv resolves the skill:

```bash
python -c "from importlib.resources import files; print(list(files('<pkg_name>._skills').iterdir()))"