From f491e69a2586e096720134d43e0106b762d7512c Mon Sep 17 00:00:00 2001 From: Nicholas Karlson Date: Wed, 21 Jan 2026 15:24:49 -0800 Subject: [PATCH] Docs: add Track D Playbook outline + strengthen workbook entry points --- docs/source/workbook/index.rst | 1 + docs/source/workbook/track_d.rst | 8 ++ .../source/workbook/track_d_chapter_index.rst | 2 + .../source/workbook/track_d_outputs_guide.rst | 10 ++- .../track_d_playbook/01_orientation.rst | 49 ++++++++++++ .../02_accounting_data_pipeline.rst | 47 +++++++++++ .../03_trackd_dataset_contract.rst | 45 +++++++++++ .../track_d_playbook/04_nso_case_story.rst | 45 +++++++++++ .../05_core_analysis_recipes.rst | 49 ++++++++++++ .../06_time_series_and_forecasting.rst | 45 +++++++++++ .../07_risk_controls_and_quality.rst | 43 +++++++++++ .../08_byod_in_the_real_world.rst | 47 +++++++++++ .../09_reporting_and_storytelling.rst | 46 +++++++++++ .../track_d_playbook/10_capstone_projects.rst | 47 +++++++++++ .../track_d_playbook/a_cli_cheatsheet.rst | 51 ++++++++++++ .../workbook/track_d_playbook/a_glossary.rst | 77 +++++++++++++++++++ .../workbook/track_d_playbook/index.rst | 47 +++++++++++ .../workbook/track_d_student_edition.rst | 8 ++ 18 files changed, 663 insertions(+), 4 deletions(-) create mode 100644 docs/source/workbook/track_d_playbook/01_orientation.rst create mode 100644 docs/source/workbook/track_d_playbook/02_accounting_data_pipeline.rst create mode 100644 docs/source/workbook/track_d_playbook/03_trackd_dataset_contract.rst create mode 100644 docs/source/workbook/track_d_playbook/04_nso_case_story.rst create mode 100644 docs/source/workbook/track_d_playbook/05_core_analysis_recipes.rst create mode 100644 docs/source/workbook/track_d_playbook/06_time_series_and_forecasting.rst create mode 100644 docs/source/workbook/track_d_playbook/07_risk_controls_and_quality.rst create mode 100644 docs/source/workbook/track_d_playbook/08_byod_in_the_real_world.rst create mode 100644 
docs/source/workbook/track_d_playbook/09_reporting_and_storytelling.rst create mode 100644 docs/source/workbook/track_d_playbook/10_capstone_projects.rst create mode 100644 docs/source/workbook/track_d_playbook/a_cli_cheatsheet.rst create mode 100644 docs/source/workbook/track_d_playbook/a_glossary.rst create mode 100644 docs/source/workbook/track_d_playbook/index.rst diff --git a/docs/source/workbook/index.rst b/docs/source/workbook/index.rst index d34e5df..abd5c84 100644 --- a/docs/source/workbook/index.rst +++ b/docs/source/workbook/index.rst @@ -14,6 +14,7 @@ PyStatsV1 Workbook troubleshooting track_c track_d_student_edition + track_d_playbook/index track_d_chapter_index track_d track_d_dataset_map diff --git a/docs/source/workbook/track_d.rst b/docs/source/workbook/track_d.rst index 5df32c0..4f308c6 100644 --- a/docs/source/workbook/track_d.rst +++ b/docs/source/workbook/track_d.rst @@ -28,6 +28,14 @@ What you get When you run ``pystatsv1 workbook init --track d``, PyStatsV1 creates a local folder containing: +Big picture map (recommended) +----------------------------- + +If you feel like you are learning lots of commands but losing the "why", read: + +- :doc:`Track D Playbook: Big Picture ` + + * convenience runner scripts (``d01`` … ``d23``) that map to Track D chapters * a reproducible, pre-installed dataset under ``data/synthetic/`` (seed=123) * an ``outputs/track_d/`` folder where results are written diff --git a/docs/source/workbook/track_d_chapter_index.rst b/docs/source/workbook/track_d_chapter_index.rst index a07be94..c99a476 100644 --- a/docs/source/workbook/track_d_chapter_index.rst +++ b/docs/source/workbook/track_d_chapter_index.rst @@ -3,6 +3,8 @@ Track D chapter index (PyPI) This page is a "table of contents" for running Track D from the PyPI workbook. +Track D is the “big picture” track: you’re learning how to do **statistics on accounting data** (not toy datasets) in a way that is repeatable, testable, and usable on your own books. 
The workflow loop is the same in every chapter: start from accounting tables (canonical demos or your own exports), normalize them into a consistent GL contract (see :doc:`track_d_byod`), validate the structure, then analyze with scripts that produce tidy CSVs, figures, and short JSON/MD summaries. In the PyPI workbook you run chapters with ``pystatsv1 workbook run dXX`` and outputs land under ``outputs/track_d/`` (see :doc:`track_d_outputs_guide`). When you bring your own data, you use the BYOD pipeline (export → ``tables/`` → normalize → ``normalized/`` → analyze); start at :doc:`track_d_byod` and :doc:`track_d_playbook/index` for the end-to-end “how it all fits together.” Keep asking one question as you go: *what does this accounting structure measure, and what statistical summary answers a real decision problem?* + After you've initialized a Track D workbook: .. code-block:: bash diff --git a/docs/source/workbook/track_d_outputs_guide.rst b/docs/source/workbook/track_d_outputs_guide.rst index 717f9a1..9cf7f28 100644 --- a/docs/source/workbook/track_d_outputs_guide.rst +++ b/docs/source/workbook/track_d_outputs_guide.rst @@ -172,15 +172,17 @@ This is a reliable "lab rhythm" that works for almost any Track D chapter: Optional: changing the output location -------------------------------------- -Most Track D scripts support an ``--outdir`` argument. +Most Track D scripts support an ``--outdir`` argument **when you run the script directly**. -This is useful if you want one folder per lab group, or you want to keep a "clean" outputs folder. +The ``pystatsv1 workbook run ...`` command is the simplest way to run Track D, but it does not forward +extra arguments to the underlying script. So if you want a custom outputs folder, run the script with Python: .. 
code-block:: console - pystatsv1 workbook run d01 --outdir outputs/track_d_groupA + # from inside your Track D workbook folder + python scripts/d01.py --outdir outputs/track_d_groupA -If you are new to command-line tools, ignore this at first and use the default. +If you are new to command-line tools, ignore this at first and use the default ``outputs/track_d`` folder. Common gotchas -------------- diff --git a/docs/source/workbook/track_d_playbook/01_orientation.rst b/docs/source/workbook/track_d_playbook/01_orientation.rst new file mode 100644 index 0000000..163dbd7 --- /dev/null +++ b/docs/source/workbook/track_d_playbook/01_orientation.rst @@ -0,0 +1,49 @@ +Orientation: what Track D is and how to use it +============================================== + +**Why this exists:** Track D can feel like “a lot of scripts.” This chapter shows the *workflow* that ties everything together. + +Learning objectives +------------------- + +- Explain Track D in one sentence (statistics on accounting data). +- Describe the Track D workflow: export → normalize → validate → analyze → communicate. +- Know the three kinds of Track D work: case study, labs, and BYOD. +- If you forget a command, run ``pystatsv1 --help`` or ``pystatsv1 trackd byod --help``. + +Outline +------- + +The Track D workflow in one page +-------------------------------- + +- Start from an accounting export (or the NSO case study dataset). +- Get the data into the Track D dataset contract (either already canonical, or via BYOD normalization). +- Run a chapter script to answer a question (and write outputs). +- Use the artifacts (CSV/PNG/JSON) to write a short business interpretation. + +What you should have at the end +------------------------------- + +- A reproducible folder with inputs + scripts + outputs (so you can rerun later). +- A small set of charts/tables that tell a story about revenue, costs, or risk. +- A written summary that a manager could act on. 
+ +Common mental model mistakes (and fixes) +---------------------------------------- + +- Mistake: treating accounting data as “just categories.” Fix: it’s a time-stamped database with structure. +- Mistake: skipping validation. Fix: always run a quick check before believing results. +- Mistake: staring at raw rows. Fix: aggregate into daily/monthly totals and compare periods. + +Where this connects in the workbook +----------------------------------- + +- :doc:`index` (the Playbook overview / map) +- :doc:`../track_d_student_edition` (how students actually run chapters) +- :doc:`../track_d_outputs_guide` (how to read what scripts produce) +- :doc:`../track_d_byod` (how to analyze your own exports) + +.. note:: + + This page is intentionally an outline right now. Expand it incrementally as we refine Track D narrative. diff --git a/docs/source/workbook/track_d_playbook/02_accounting_data_pipeline.rst b/docs/source/workbook/track_d_playbook/02_accounting_data_pipeline.rst new file mode 100644 index 0000000..9e4cd24 --- /dev/null +++ b/docs/source/workbook/track_d_playbook/02_accounting_data_pipeline.rst @@ -0,0 +1,47 @@ +Accounting data as a dataset pipeline +===================================== + +**Why this exists:** Students often know debits/credits but not how that becomes an analyzable dataset. This bridges that gap. + +Learning objectives +------------------- + +- Describe the path from business events to statements and analytics. +- Recognize the difference between a chart of accounts, journal, ledger, and trial balance. +- Explain what a “normalization step” does and why it matters. + +Outline +------- + +From events to reports +---------------------- + +- Business event → journal entry (date, accounts, amounts, memo). +- Journal entries are the “source record”; the ledger is the “by-account view” of those entries. +- Trial balance is a snapshot of balances by account. +- Statements are *views* built from the trial balance and classifications. 
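The journal → ledger → trial balance chain described above can be sketched in a few lines of plain Python. This is a minimal illustration, not PyStatsV1 code: the sample postings, the tuple layout, and the sign convention (debits positive, credits negative) are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical postings: (entry_id, date, account, signed_amount).
# Assumed convention for this sketch only: debits > 0, credits < 0.
journal = [
    (1, "2026-01-05", "Cash", 500.0),
    (1, "2026-01-05", "Sales", -500.0),
    (2, "2026-01-07", "Rent Expense", 200.0),
    (2, "2026-01-07", "Cash", -200.0),
]

# Source record check: each journal entry should sum to zero.
entry_totals = defaultdict(float)
for entry_id, _date, _account, amount in journal:
    entry_totals[entry_id] += amount
unbalanced = [eid for eid, total in entry_totals.items() if abs(total) > 1e-9]

# Ledger: the same postings, regrouped into a by-account view.
ledger = defaultdict(list)
for entry_id, date, account, amount in journal:
    ledger[account].append((date, entry_id, amount))

# Trial balance: one net balance per account at this point in time.
trial_balance = {
    account: sum(amount for _date, _eid, amount in lines)
    for account, lines in ledger.items()
}

print(unbalanced)
print(trial_balance)
```

Because every entry balances, the trial balance as a whole also sums to zero, which is a cheap sanity check before building any statement-style view.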
+ +From reports to analysis +------------------------ + +- Analytics usually starts from the journal/ledger (not the formatted financial statements). +- We create time series (daily/monthly totals), ratios, and variance explanations. +- We then ask: what changed, why, and what should we do next? + +Where BYOD fits +--------------- + +- Different systems export different CSV shapes. +- Adapters convert exports into the Track D canonical tables. +- After normalization, you typically work from ``normalized/gl_journal.csv`` (plus ``normalized/chart_of_accounts.csv``). +- After normalization, analysis scripts don’t care where the data came from. + +Where this connects in the workbook +----------------------------------- + +- :doc:`../track_d_dataset_map` (what tables exist and what they mean) +- :doc:`../track_d_byod` (the adapter/normalize/validate workflow) + +.. note:: + + This page is intentionally an outline right now. Expand it incrementally as we refine Track D narrative. diff --git a/docs/source/workbook/track_d_playbook/03_trackd_dataset_contract.rst b/docs/source/workbook/track_d_playbook/03_trackd_dataset_contract.rst new file mode 100644 index 0000000..996d9fe --- /dev/null +++ b/docs/source/workbook/track_d_playbook/03_trackd_dataset_contract.rst @@ -0,0 +1,45 @@ +The Track D dataset contract (what scripts expect) +================================================== + +**Why this exists:** Track D works because every chapter agrees on a shared data contract. This chapter explains the contract at a high level. + +Learning objectives +------------------- + +- Know the minimum tables required for GL-based analysis (``chart_of_accounts`` + ``gl_journal``). +- Explain what ``normalized/`` outputs are and why we prefer them for analysis. +- Understand where synthetic datasets come from (seeded, reproducible). + +Outline +------- + +Inputs vs normalized outputs +---------------------------- + +- BYOD projects store raw exports under ``tables/`` (source-specific). 
+- Normalization produces ``normalized/chart_of_accounts.csv`` and ``normalized/gl_journal.csv`` (canonical). +- Everything after that is “just analysis.” + +Column naming and why it matters +-------------------------------- + +- Stable column headers allow scripts to be reused across systems. +- If headers drift, you want a failure early (during normalize/validate), not silent bad analysis. + +What ``pystatsv1 trackd validate`` does conceptually +---------------------------------------------------- + +- Uses a profile (for example, ``core_gl``) to decide what tables/columns are required. +- Checks basic schema and required columns. +- Catches common data issues: missing dates, non-numeric amounts, or malformed account identifiers. + +Where this connects in the workbook +----------------------------------- + +- :doc:`../track_d_dataset_map` (table-by-table map) +- :doc:`../track_d_outputs_guide` (artifacts and how to use them) +- :doc:`../track_d_byod` (normalization and validation commands) + +.. note:: + + This page is intentionally an outline right now. Expand it incrementally as we refine Track D narrative. diff --git a/docs/source/workbook/track_d_playbook/04_nso_case_story.rst b/docs/source/workbook/track_d_playbook/04_nso_case_story.rst new file mode 100644 index 0000000..fad7476 --- /dev/null +++ b/docs/source/workbook/track_d_playbook/04_nso_case_story.rst @@ -0,0 +1,45 @@ +NSO case study: why these numbers exist +======================================= + +**Why this exists:** A narrative is the easiest way to keep students oriented. This chapter frames the NSO dataset as a business story. + +Learning objectives +------------------- + +- Describe the NSO business in 60 seconds (what it sells, who it serves, why data matters). +- Identify the main questions Track D is trying to answer with NSO. +- Explain why a case study is useful before BYOD. 
+ +Outline +------- + +The business story we’re modeling +--------------------------------- + +- What NSO does and what financial drivers matter (sales volume, margins, seasonality). +- What data we have and what we *don’t* have (and how that affects conclusions). + +The analysis questions Track D repeats +-------------------------------------- + +- Performance: what happened this period vs last period? +- Drivers: which categories/accounts explain the change? +- Risk: where are anomalies, volatility, or concentration? +- Decisions: what would you recommend based on evidence? +- Each chapter produces a small set of artifacts (CSV/PNG/JSON/MD) that support one of these questions. + +Transfer to your own data +------------------------- + +- The same questions apply to any small business ledger. +- BYOD lets students swap in their own exports later (see :doc:`../track_d_byod`). + +Where this connects in the workbook +----------------------------------- + +- :doc:`../track_d` (Track D overview) +- :doc:`../track_d_chapter_index` (where the story shows up in chapters) + +.. note:: + + This page is intentionally an outline right now. Expand it incrementally as we refine Track D narrative. diff --git a/docs/source/workbook/track_d_playbook/05_core_analysis_recipes.rst b/docs/source/workbook/track_d_playbook/05_core_analysis_recipes.rst new file mode 100644 index 0000000..ba4e979 --- /dev/null +++ b/docs/source/workbook/track_d_playbook/05_core_analysis_recipes.rst @@ -0,0 +1,49 @@ +Core analysis recipes (what students actually do) +================================================= + +**Why this exists:** This is the practical chapter: recurring tasks and the patterns behind them. + +Learning objectives +------------------- + +- Compute daily/monthly totals and compare periods. +- Build a simple sales proxy from ledger data. +- Create a small set of plots/tables that answer one clear question. 
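As a rough sketch of the first two objectives (daily/monthly totals and a simple sales proxy), the example below aggregates a tiny in-memory table using only the standard library. The column names (`date`, `account_type`, `amount`) and the sign convention (positive = revenue) are hypothetical stand-ins; the real ``normalized/gl_journal.csv`` headers may differ, so check the dataset contract before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for normalized/gl_journal.csv.
# Column names and signs are assumptions made for this sketch.
SAMPLE = """date,account_type,amount
2026-01-30,Revenue,120.00
2026-01-30,Expense,-40.00
2026-01-31,Revenue,80.00
2026-02-01,Revenue,150.00
2026-02-02,Revenue,75.00
"""

daily = defaultdict(float)    # sales proxy by day
monthly = defaultdict(float)  # sales proxy by month (YYYY-MM)

for row in csv.DictReader(io.StringIO(SAMPLE)):
    if row["account_type"] != "Revenue":
        continue  # the sales proxy uses revenue accounts only
    amount = float(row["amount"])
    daily[row["date"]] += amount
    monthly[row["date"][:7]] += amount

# Compare periods: month-over-month change in the sales proxy.
change = monthly["2026-02"] - monthly["2026-01"]

print(dict(daily))
print(dict(monthly))
print(change)
```

To run this against a real file, replace `io.StringIO(SAMPLE)` with an open file handle and adjust the column names and revenue filter to match your data.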
+ +Outline +------- + +Recipe: daily totals +-------------------- + +- Start from ``normalized/gl_journal.csv`` (canonical) and choose a revenue (or cash) account group. +- Group by date, sum signed amounts. +- If you’re using BYOD, you can generate daily totals with ``pystatsv1 trackd byod daily-totals --project ``. +- Plot a time series; note spikes and missing days. +- Write one sentence about the pattern you see. + +Recipe: monthly P&L by category +------------------------------- + +- Start from ``normalized/gl_journal.csv`` and map accounts to categories (revenue/COGS/opex). +- Aggregate by month and category; compute shares and changes. +- Identify the top 3 drivers of change month-over-month. +- Sales proxy = sum of signed amounts for revenue accounts by day/month. + +Recipe: concentration and outliers +---------------------------------- + +- Start from ``normalized/gl_journal.csv`` and find the largest transactions and their accounts. +- Compute the share of total explained by the top N rows. +- Flag unusual values for follow-up documentation. + +Where this connects in the workbook +----------------------------------- + +- :doc:`../track_d_byod` (Bring Your Own Data hub) +- :doc:`../track_d_outputs_guide` (how to read artifacts) +- :doc:`../track_d_byod_gnucash_demo_analysis` (daily totals helper + example plots) + +.. note:: + + This page is intentionally an outline right now. Expand it incrementally as we refine Track D narrative. diff --git a/docs/source/workbook/track_d_playbook/06_time_series_and_forecasting.rst b/docs/source/workbook/track_d_playbook/06_time_series_and_forecasting.rst new file mode 100644 index 0000000..8db55c1 --- /dev/null +++ b/docs/source/workbook/track_d_playbook/06_time_series_and_forecasting.rst @@ -0,0 +1,45 @@ +Time series + forecasting for accounting data +============================================= + +**Why this exists:** Forecasting becomes less scary once you’ve built clean daily/monthly series. 
This chapter outlines the progression. + +Learning objectives +------------------- + +- Explain trend, seasonality, and noise using accounting time series. +- Build a baseline forecast and evaluate it. +- Understand when forecasting is inappropriate (garbage in / structural breaks). + +Outline +------- + +Start with baselines +-------------------- + +- Start from ``normalized/gl_journal.csv`` and build a clean daily/monthly series (revenue proxy, expense totals, or cash). +- Last value, moving average, seasonal naive. +- Always do a simple backtest (train on earlier months, test on later months). +- Compare forecasts with simple error metrics. + +Add explanatory variables +------------------------- + +- Promotions, holidays, payroll cycles, or other known drivers. +- Use regression as a driver model (not magic). + +Keep it business-grounded +------------------------- + +- Always interpret: what would make the forecast wrong? +- Document assumptions and data limitations. +- Structural breaks examples: pricing changes, a new location, system migrations, one-time events, policy changes. + +Where this connects in the workbook +----------------------------------- + +- :doc:`../track_d_chapter_index` (chapters that introduce forecasting ideas) +- :doc:`../track_d_my_own_data` (how to apply the same methods to your exports) + +.. note:: + + This page is intentionally an outline right now. Expand it incrementally as we refine Track D narrative. diff --git a/docs/source/workbook/track_d_playbook/07_risk_controls_and_quality.rst b/docs/source/workbook/track_d_playbook/07_risk_controls_and_quality.rst new file mode 100644 index 0000000..347c05a --- /dev/null +++ b/docs/source/workbook/track_d_playbook/07_risk_controls_and_quality.rst @@ -0,0 +1,43 @@ +Risk, controls, and data quality checks +======================================= + +**Why this exists:** Accounting data is only useful if you trust it. Track D teaches a light version of audit/control thinking for analysts. 
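To give a flavor of the light control thinking this chapter has in mind, here is a minimal sketch of three quick trust checks: missing dates, non-numeric amounts, and exact duplicate rows. It is illustrative only and is not how ``pystatsv1 trackd validate`` is implemented; the column names (`date`, `account_id`, `amount`) are assumptions for the example.

```python
import csv
import io
from collections import Counter

# Hypothetical sample with deliberate problems; column names are assumed.
SAMPLE = """date,account_id,amount
2026-01-05,4000,100.00
,4000,50.00
2026-01-06,5000,abc
2026-01-05,4000,100.00
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))


def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


# Row indexes (0-based, header excluded) that fail each check.
missing_dates = [i for i, r in enumerate(rows) if not r["date"].strip()]
bad_amounts = [i for i, r in enumerate(rows) if not is_number(r["amount"])]

# Exact duplicates: identical values in every column.
counts = Counter(tuple(r.values()) for r in rows)
duplicates = [key for key, n in counts.items() if n > 1]

print(missing_dates, bad_amounts, duplicates)
```

A flagged row is a prompt for follow-up, not a verdict: a duplicate may be a legitimate repeated sale, so record what was flagged and ask for accounting context before removing anything.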
+ +Learning objectives +------------------- + +- Describe why controls and reconciliation matter for analytics. +- Run simple anomaly checks and interpret them carefully. +- Explain the difference between an error and a legitimate outlier. + +Outline +------- + +Practical checks that scale +--------------------------- + +- Start from ``normalized/gl_journal.csv`` (and optionally ``normalized/chart_of_accounts.csv``). +- Missing dates, negative amounts where unexpected, duplicated rows or duplicated transaction references (when present). +- Unusual spikes relative to typical ranges. + +Sampling mindset +---------------- + +- You can’t check everything; choose samples based on risk and materiality. +- Document what you checked and what you didn’t. + +When to stop and ask for accounting context +------------------------------------------- + +- A statistical red flag is not automatically fraud or error. +- Your next step is often: ask for invoices, contracts, or policy notes (e.g., revenue timing, refunds, capitalization). + +Where this connects in the workbook +----------------------------------- + +- :doc:`../track_d_outputs_guide` (where checks appear in script outputs) +- :doc:`../track_d_byod` (validate step and why it exists) + +.. note:: + + This page is intentionally an outline right now. Expand it incrementally as we refine Track D narrative. diff --git a/docs/source/workbook/track_d_playbook/08_byod_in_the_real_world.rst b/docs/source/workbook/track_d_playbook/08_byod_in_the_real_world.rst new file mode 100644 index 0000000..ebbd9df --- /dev/null +++ b/docs/source/workbook/track_d_playbook/08_byod_in_the_real_world.rst @@ -0,0 +1,47 @@ +BYOD in the real world (adapters, exports, privacy) +=================================================== + +**Why this exists:** Students need a 'real export' experience. This chapter frames BYOD as a repeatable, safe workflow. 
+ +Learning objectives +------------------- + +- Explain what adapters do and why we prefer them to manual spreadsheet cleaning. +- Know the BYOD commands: ``pystatsv1 trackd byod init``, ``pystatsv1 trackd byod normalize``, ``pystatsv1 trackd validate``, ``pystatsv1 trackd byod daily-totals``. +- Handle privacy safely (what to redact and how to share). + +Outline +------- + +The BYOD workflow +----------------- + +- Initialize a project folder with templates. +- Drop in an export under ``tables/`` (source-specific). +- Normalize to canonical outputs under ``normalized/``. +- Run analysis helpers and Track D scripts on normalized outputs. +- Adapters keep the cleanup step repeatable and testable — no one-off spreadsheet edits. + +Adapters and tradeoffs +---------------------- + +- ``passthrough``: already canonical data. +- ``core_gl``: generic GL export cleaning. +- ``gnucash_gl``: specific to GnuCash multi-line export. + +Privacy + classroom sharing +--------------------------- + +- Never publish raw exports that include names, addresses, invoice details. +- Prefer aggregated outputs (daily totals) for sharing examples. +- If you must share, redact names, shorten the date range, and consider rounding amounts. + +Where this connects in the workbook +----------------------------------- + +- :doc:`../track_d_byod` (BYOD hub) +- :doc:`../track_d_byod_gnucash` (GnuCash tutorial pack) + +.. note:: + + This page is intentionally an outline right now. Expand it incrementally as we refine Track D narrative. diff --git a/docs/source/workbook/track_d_playbook/09_reporting_and_storytelling.rst b/docs/source/workbook/track_d_playbook/09_reporting_and_storytelling.rst new file mode 100644 index 0000000..7e626f7 --- /dev/null +++ b/docs/source/workbook/track_d_playbook/09_reporting_and_storytelling.rst @@ -0,0 +1,46 @@ +Reporting: turning outputs into decisions +========================================= + +**Why this exists:** Students often stop at plots. 
This chapter outlines the "so what?" layer Track D is aiming for. + +Learning objectives +------------------- + +- Write short, evidence-based interpretations (not just numbers). +- Choose plots/tables that match the question. +- Communicate uncertainty and limitations clearly. + +Outline +------- + +A simple reporting template +--------------------------- + +- Question → method → key results → interpretation → next step. +- Use the chapter's ``*_summary.json`` + (when present) ``*_memo.md`` / ``*_executive_memo.md`` as your starting draft. +- Include 1–2 plots and 1 table, not 10. + +What makes a good figure +------------------------ + +- Readable axes, clear labels, one message per chart. +- Show comparisons (before/after, categories, distributions). + +Common reporting failures +------------------------- + +- Too many metrics with no narrative. +- No context: missing denominators, time windows, or baselines. +- Confusing accounting signs (debit/credit) with “good/bad”. +- Always state the sign convention you’re using (e.g., “positive = revenue inflow”). + +Where this connects in the workbook +----------------------------------- + +- :doc:`../track_d_byod_gnucash_demo_analysis` (example: daily totals → plots → interpretation) +- :doc:`../track_d_outputs_guide` (artifact types and how to use them) +- :doc:`../track_d_chapter_index` (see D09 for the plotting/reporting style contract) + +.. note:: + + This page is intentionally an outline right now. Expand it incrementally as we refine Track D narrative. 
diff --git a/docs/source/workbook/track_d_playbook/10_capstone_projects.rst b/docs/source/workbook/track_d_playbook/10_capstone_projects.rst new file mode 100644 index 0000000..d513c56 --- /dev/null +++ b/docs/source/workbook/track_d_playbook/10_capstone_projects.rst @@ -0,0 +1,47 @@ +Capstones: applying Track D to your own accounting data +======================================================= + +**Why this exists:** This is where Track D becomes a portfolio piece: a reproducible analysis on realistic accounting exports. + +Learning objectives +------------------- + +- Define a narrow, answerable question for a business dataset. +- Build a reproducible pipeline from export → normalized → analysis → report. +- Deliver a short write-up with artifacts that support the claims. + +Outline +------- + +Capstone scope options +---------------------- + +- Performance review: compare two quarters and explain drivers. +- Cash-flow proxy: build a daily inflow/outflow series and summarize volatility. +- Expense audit: identify top sources of expense growth and anomalies. + +Deliverables checklist +---------------------- + +- BYOD project folder (``tables/`` exports + ``normalized/`` outputs + ``config.toml``). +- A small ``outputs/`` folder with plots/tables. +- A short report (1–2 pages) with interpretation and caveats. +- Optional: ``normalized/daily_totals.csv`` generated via ``pystatsv1 trackd byod daily-totals``. + +Rubric outline (draft) +---------------------- + +- Reproducibility (can someone rerun it?). +- Correctness (schema and basic checks pass). +- Insight (the narrative matches the evidence). +- Communication (clear figures and concise writing). + +Where this connects in the workbook +----------------------------------- + +- :doc:`../track_d_my_own_data` (bridge from case study to BYOD) +- :doc:`../track_d_byod` (normalization workflow) + +.. note:: + + This page is intentionally an outline right now. Expand it incrementally as we refine Track D narrative. 
diff --git a/docs/source/workbook/track_d_playbook/a_cli_cheatsheet.rst b/docs/source/workbook/track_d_playbook/a_cli_cheatsheet.rst new file mode 100644 index 0000000..6e0d082 --- /dev/null +++ b/docs/source/workbook/track_d_playbook/a_cli_cheatsheet.rst @@ -0,0 +1,51 @@ +Appendix: Track D + BYOD CLI cheatsheet +======================================= + +**Why this exists:** A single page students can keep open while working. Everything here should stay stable; if it changes, ``--help`` is the source of truth. + +Learning objectives +------------------- + +- Know the minimal commands to run Track D from PyPI. +- Know the minimal commands to normalize and analyze your own exports. + +Outline +------- + +Install + create workbook (PyPI) +-------------------------------- + +- Install: ``pip install "pystatsv1[workbook]"`` +- Create workbook: ``pystatsv1 workbook init --track d --dest track_d_workbook`` +- List targets: ``pystatsv1 workbook list --track d --workdir track_d_workbook`` +- Optional: ``cd track_d_workbook`` (then you can omit ``--workdir`` below) + +Run Track D scripts +------------------- + +- Peek datasets: ``pystatsv1 workbook run d00_peek_data --workdir track_d_workbook`` +- Run a chapter: ``pystatsv1 workbook run d01 --workdir track_d_workbook`` +- Need options? ``pystatsv1 workbook --help`` + +BYOD: normalize + validate +-------------------------- + +- Init: ``pystatsv1 trackd byod init --dest --profile core_gl`` +- Put your export under ``/tables/`` (typically ``gl_journal.csv``). 
+- Normalize: ``pystatsv1 trackd byod normalize --project `` +- Normalize (fallback): ``pystatsv1 trackd byod normalize --project --profile core_gl`` +- Validate: ``pystatsv1 trackd validate --datadir /normalized --profile core_gl`` +- Daily totals: ``pystatsv1 trackd byod daily-totals --project `` +- Help: ``pystatsv1 trackd byod --help`` + +Where this connects in the workbook +----------------------------------- + +- :doc:`../track_d_student_edition` (full student workflow) +- :doc:`../track_d_byod` (BYOD hub with tutorials) + +.. note:: + + If anything here differs from what you see locally, run the relevant ``--help`` command and follow that. + General help: ``pystatsv1 --help`` + diff --git a/docs/source/workbook/track_d_playbook/a_glossary.rst b/docs/source/workbook/track_d_playbook/a_glossary.rst new file mode 100644 index 0000000..e1e537a --- /dev/null +++ b/docs/source/workbook/track_d_playbook/a_glossary.rst @@ -0,0 +1,77 @@ +Glossary (draft) +================ + +This is a student-friendly glossary for Track D. Keep definitions short, practical, and tied to what you see in the PyStatsV1 outputs. + +Accounting terms +---------------- + +- **Account**: A labeled bucket that records a type of financial activity (e.g., Cash, Sales, Rent Expense). +- **Chart of accounts (COA)**: The full list of accounts a business uses, usually grouped into Assets, Liabilities, Equity, Revenue, Expenses. +- **Account type**: A label like Asset/Liability/Equity/Revenue/Expense that helps classify accounts for reporting and analysis. +- **Debit / credit**: The two-sided bookkeeping convention used to keep entries balanced. In Track D, the scripts handle the analysis sign convention; when unsure, check the chapter page or the output summaries for “positive means what?”. +- **Journal entry**: A dated record of debits/credits for one event (it should balance to zero when summed). +- **Posting**: One line within a journal entry (one account + amount). 
A single entry usually has 2+ postings. +- **General ledger (GL)**: The journal entries viewed “by account over time” (a database-like history, not a formatted report). +- **Transaction vs posting**: A *transaction* is the whole event; *postings* are the individual lines inside it. Analytics often works on postings, then aggregates back up. +- **Trial balance (TB)**: A snapshot of balances by account at a point in time; the starting point for building statements. +- **Financial statements**: Summaries built from the trial balance using classifications (Income Statement / Balance Sheet / Cash Flow). +- **Balance**: The net total in an account after combining debits and credits over a period. +- **Opening balance / beginning balance**: The starting balance at the beginning of a period. +- **Accrual vs cash basis**: The timing rule for when revenue/expense is recorded (earned/incurred vs when cash moves). +- **Accounts receivable (AR)**: Money customers owe you (an asset). **Accounts payable (AP)** is money you owe suppliers (a liability). +- **Revenue (sales)**: Money earned from customers for goods/services. Often recorded before cash is received (accrual). +- **Expense**: Costs incurred to run the business (rent, wages, utilities). Often recorded before cash is paid (accrual). +- **COGS (cost of goods sold)**: Direct costs tied to producing/selling products; used to compute gross margin. +- **Gross margin**: Revenue minus COGS (often shown as dollars and as a % of revenue). +- **Depreciation**: Allocating an asset’s cost over time (a non-cash expense). +- **Reconciliation**: A check that two “views” of the world agree (e.g., bank statement vs cash ledger). +- **Materiality**: A practical threshold for “big enough to matter.” Helps decide what to investigate first. + +Analytics terms +--------------- + +- **Observation / row**: One record in a table (e.g., one posting, one invoice, one daily total). 
+- **Metric**: A number you track (daily revenue proxy, monthly payroll total, cash balance).
+- **Aggregation**: Summarizing many rows into totals by day/month/category (the main move in Track D).
+- **Grouping key**: The fields you aggregate by (date, month, account, department, customer segment).
+- **Tidy data**: A table where each row is one observation and each column is one variable (easy to filter, group, and plot).
+- **Time series**: A metric tracked over time (daily revenue proxy, monthly expenses, weekly cash balance).
+- **Baseline**: A simple comparison point (last month, last year, moving average, seasonal naive).
+- **Variance**: A change between two periods or scenarios (this month vs last month; actual vs budget). This is the accounting sense of the word, not the statistical variance that measures spread.
+- **Driver**: The category/account that explains most of a variance (the “why” behind the change).
+- **Decomposition**: Breaking a total change into parts (e.g., which accounts explain revenue growth).
+- **KPI**: A key performance indicator (e.g., gross margin %, days-to-pay, cash coverage).
+- **Distribution**: How values are spread across their range (helpful for separating typical from unusual transactions).
+- **Outlier**: A value that is unusual relative to typical observations (not automatically an error).
+- **Seasonality**: Predictable repeating patterns over time (holidays, summer sales, payroll cycles).
+- **Structural break**: A real change in how the business operates that makes “past ≈ future” less reliable.
+- **Backtest**: Testing a forecasting method on past data (train earlier, test later).
+- **Error metric**: A summary of forecast accuracy (e.g., MAE, mean absolute error, or MAPE, mean absolute percentage error). Lower is usually better, but context matters.
+
+Track D + BYOD terms
+--------------------
+
+- **Track D**: The “big picture” track: statistics on accounting data, using reproducible scripts and artifacts.
+- **Workflow loop**: Export → normalize → validate → analyze → communicate (repeat this across chapters and BYOD projects).
+- **Dataset contract**: The required table names + column headers + meanings that scripts assume.
+- **Canonical dataset**: A known-good demo dataset shipped with the workbook (used for learning and expected outputs).
+- **Source export**: A CSV export from your accounting system (often messy and source-specific).
+- **BYOD (Bring Your Own Data)**: Using your own accounting exports instead of the canonical demos.
+- **BYOD project folder**: A reproducible folder created by ``pystatsv1 trackd byod init`` (contains ``config.toml``, ``tables/``, and outputs).
+- **config.toml**: The project’s settings file (profile + adapter + any source-specific knobs).
+- **tables/**: Where you place *raw exports* (source-specific CSVs).
+- **Adapter**: Code that converts a source export into the Track D contract (repeatable + testable cleanup).
+- **Normalize / normalization**: Running ``pystatsv1 trackd byod normalize`` to produce canonical outputs under ``normalized/``.
+- **normalized/**: Canonical tables produced by normalization (typically ``normalized/gl_journal.csv`` and ``normalized/chart_of_accounts.csv``).
+- **Schema**: The expected columns + types for a table (what must exist for scripts to run correctly).
+- **Validate**: A fast schema + sanity check that catches missing columns, bad types, and common structural problems.
+- **Daily totals**: A first analysis-ready time series derived from ``normalized/gl_journal.csv`` (often written to ``normalized/daily_totals.csv``).
+- **Profile**: A preset that defines which tables/columns are required for a workflow (e.g., ``core_gl``).
+- **Artifacts**: The outputs created by scripts (tidy CSVs, figures, JSON summaries, short memos) under ``outputs/track_d/``.
+- **Reproducible**: Someone else can rerun your project and get the same tables/figures (given the same inputs).
+
+.. note::
+
+   Keep this glossary short and student-friendly. If you add a new term, prefer a one-sentence definition plus one concrete example.
+   If anything here conflicts with the CLI or its outputs, treat the code as the source of truth and update the docs and ``--help`` text to match.
diff --git a/docs/source/workbook/track_d_playbook/index.rst b/docs/source/workbook/track_d_playbook/index.rst
new file mode 100644
index 0000000..ce447ac
--- /dev/null
+++ b/docs/source/workbook/track_d_playbook/index.rst
@@ -0,0 +1,47 @@
+.. _track_d_playbook:
+
+=============================
+Track D Playbook: Big Picture
+=============================
+
+Track D is about one idea: **use statistics to understand accounting data**.
+The loop is: export → normalize → validate → analyze → communicate.
+
+The case study (NSO) gives you realistic, messy numbers, but the goal is transfer:
+you should be able to take *your own accounting exports* and run the same kind of
+analysis with PyStatsV1.
+
+This playbook is a short “map of the territory.” Each chapter is an outline (for now),
+meant to be filled in gradually.
+
+How to use this playbook
+------------------------
+
+1. Read :doc:`01_orientation` once (it explains the full Track D workflow).
+2. Use :doc:`05_core_analysis_recipes` as your “what do I do next?” page while working.
+3. When you bring your own data, jump to :doc:`08_byod_in_the_real_world` (and see ``pystatsv1 trackd byod daily-totals``).
+
+Where to find the commands and file paths
+-----------------------------------------
+
+- **Student entry point**: :doc:`../track_d_student_edition`
+- **Track D chapter list**: :doc:`../track_d_chapter_index`
+- **Dataset map + outputs**: :doc:`../track_d_dataset_map`, :doc:`../track_d_outputs_guide`
+- **Bring your own data (BYOD)**: :doc:`../track_d_byod`
+- **This playbook**: :doc:`index`
+
+.. toctree::
+   :maxdepth: 2
+
+   01_orientation
+   02_accounting_data_pipeline
+   03_trackd_dataset_contract
+   04_nso_case_story
+   05_core_analysis_recipes
+   06_time_series_and_forecasting
+   07_risk_controls_and_quality
+   08_byod_in_the_real_world
+   09_reporting_and_storytelling
+   10_capstone_projects
+   a_cli_cheatsheet
+   a_glossary
diff --git a/docs/source/workbook/track_d_student_edition.rst b/docs/source/workbook/track_d_student_edition.rst
index 73e28e2..b44b413 100644
--- a/docs/source/workbook/track_d_student_edition.rst
+++ b/docs/source/workbook/track_d_student_edition.rst
@@ -10,6 +10,14 @@ If you do only one thing first, do this:
 1) Follow the workbook quickstart.
 2) Run ``d00_peek_data`` to *see the data*.
+
+Big picture
+-----------
+
+If you want a quick map of how the Track D pieces fit together, read:
+
+- :doc:`Track D Playbook: Big Picture `
+
 3) Run ``d01`` to *see the accounting invariants*.
 4) Use the “skill map” below to keep your bearings.