handle nfo, reject cams ticob, flag dividend mismatches, name-match family pans #10

Merged
sandeeprjs92 merged 3 commits into main from fix/registrar-edge-cases
Apr 15, 2026

@sandeeprjs92

Three registrar-file edge cases that came up in public feedback. Each commit addresses one concern, with tests. Research first, code second — see below.

Research summary

Before writing anything I searched public BSE StAR MF and CAMS docs and compared against the upstream source pipeline.

| Code | Public docs | Source pipeline | Decision |
|---|---|---|---|
| NFO | Standard 'New Fund Offer' transaction type, documented by BSE StAR MF and every AMC. | Not handled. | Classify as (BUY, new_fund_offer). |
| TICOB / TOCOB | No public documentation found. | Upstream explicitly rejects these for CAMS with 'unsupported type'. | Port the rejection. |
| FC | Not found in BSE StAR MF, CAMS public docs, or upstream pipeline. | Not handled. | Intentionally left alone. Guessing at buy/sell semantics for a code I can't verify would be worse than a visible no-op. Commenters welcome to clarify their registrar's specific meaning and send a follow-up PR. |
| Dividend option mismatch | Real issue: SEBI renamed 'Dividend' to 'IDCW' in 2021, so historical feed files disagree with updated scheme masters. | Captures the raw flag but does not reconcile. | Capture it, normalise to canonical plan_type, reject the row to the correction queue on mismatch. |
| Messy investor data | | Both pipelines match by PAN only. | Add canonical name fallback for family PANs. |

Commits

1. add nfo classification and reject cams ticob/tocob

  • CamsAdapter and all three KFintechFormatX adapters classify NFO as (BUY, 'new_fund_offer')
  • New FeedAdapter.rejected_types class attribute (default empty set). CamsAdapter sets it to {'TICOB', 'TOCOB'}
  • Cleaner.run drops rejected rows before any further processing
  • Tests cover NFO classification for both registrars and TICOB/TOCOB rejection at the cleaner level
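
A minimal sketch of the rejected-types mechanism described in this commit, using plain dicts instead of whatever row container Cleaner.run really works on. The `type_map` attribute, the `classify` and `drop_rejected` names, and the `txn_type` key are assumptions for illustration; only `FeedAdapter`, `CamsAdapter`, `rejected_types`, and the NFO / TICOB / TOCOB values come from the description above.

```python
from typing import ClassVar, Dict, FrozenSet, Optional, Tuple


class FeedAdapter:
    # Transaction types the adapter refuses outright. Empty by default so
    # adapters that do not opt in are unaffected. (frozenset is a choice
    # made for this sketch; the PR only says "set".)
    rejected_types: ClassVar[FrozenSet[str]] = frozenset()
    # Hypothetical classification table: raw type -> (side, subtype).
    type_map: ClassVar[Dict[str, Tuple[str, str]]] = {}

    def classify(self, raw_type: str) -> Optional[Tuple[str, str]]:
        """Return (side, subtype) for a known type, or None if unhandled."""
        return self.type_map.get(raw_type.strip().upper())


class CamsAdapter(FeedAdapter):
    rejected_types = frozenset({"TICOB", "TOCOB"})
    type_map = {"NFO": ("BUY", "new_fund_offer")}


def drop_rejected(rows, adapter):
    """Split rows into (kept, rejected) before any further processing."""
    kept, rejected = [], []
    for row in rows:
        code = str(row.get("txn_type", "")).strip().upper()
        (rejected if code in adapter.rejected_types else kept).append(row)
    return kept, rejected
```

Returning the rejected rows alongside the kept ones is a choice made for the sketch so the drop stays observable; the description only says the cleaner drops them.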

2. capture dividend option flag and reject plan_type mismatches

  • CAMS adapter field map picks up REINVEST_F → dividend_option_flag, normalised via a {Y: idcw_reinvest, N: idcw_payout} lookup
  • KFintech field map picks up DIVOPT → dividend_option_flag, normalised via a lookup that folds both legacy 'DIVIDEND PAYOUT' / 'DIVIDEND REINVESTMENT' wording and the current 'IDCW PAYOUT' / 'IDCW REINVESTMENT' wording to the same canonical values
  • adapter.normalize() emits a plan_type_from_feed column after the rename
  • core/validator.py raises ValidationError(CorrectionType.OTHER) when the feed's declared plan_type disagrees with the scheme master's value
  • Silent feeds (empty DIVOPT / growth-only funds) don't trip the check — scheme master stays the source of truth
  • Tests cover mismatch → raise, match → pass, silent feed → pass

3. resolve family pan by canonical investor name when ownership_type is silent

  • New private _canonicalize_name helper — upper-case plus whitespace collapse, deliberately not fuzzy
  • When a family PAN has multiple accounts and ownership_type doesn't tie-break, try a unique name-equality match before falling through to the individual default
  • Only activates when (a) multiple accounts share the PAN, (b) the row has an investor_name, (c) exactly one stored account matches after canonicalisation. Ambiguous multi-match still raises AmbiguousPanError so operators resolve it in the correction queue
  • The adapters already map CAMS INVNAME and KFintech INV_NAME to the investor_name canonical column, so no adapter change needed
  • Tests cover happy match, multi-match falls through, silent feed unchanged

Local verification

  • 105 unit tests pass (was 96 on main, added 9)
  • Ruff clean
  • Bandit clean
  • Forbidden-strings scan clean (108 files)

Acting on a LinkedIn comment about edge-case transaction types.

NFO (New Fund Offer) is a standard BSE StAR MF purchase type used
for initial subscriptions during a fund offer period. Classify it as
(BUY, new_fund_offer) in both CAMS and KFintech adapters. Add NFO to
the CAMS _BUY_TYPES set.

CAMS TICOB and TOCOB (transfer in/out close-of-business variants)
are not real orders — the source pipeline rejects them at validation
time. Port that rejection by adding a per-adapter rejected_types set
and filtering those rows out in Cleaner.run before anything else
runs. Default is an empty set so other adapters are unaffected.

I also researched FC and standalone COB via web search and the
BSE StAR MF file structure spec. Neither is documented in public
CAMS or BSE references and neither appears in the upstream source
pipeline. Leaving them unhandled until a commenter or maintainer
can clarify the specific registrar semantics; guessing at buy/sell
for a code I cannot verify would be worse than a visible no-op.

Acting on a LinkedIn comment about dividend option mismatches.

CAMS ships a REINVEST_F column ('Y' = reinvest, 'N' = payout) and
KFintech ships DIVOPT with human-readable strings that differ across
file vintages: legacy files still use 'DIVIDEND PAYOUT' /
'DIVIDEND REINVESTMENT' wording while post-SEBI-2021 files use
'IDCW PAYOUT' / 'IDCW REINVESTMENT'. Both map to the same canonical
plan_type in the scheme master.

Each adapter's normalize() now emits a plan_type_from_feed column
populated from the raw flag via a case-insensitive, whitespace-
collapsed lookup. The per-row validator reads that column and raises
a ValidationError (CorrectionType.OTHER) whenever the feed's
declared plan_type disagrees with the scheme master's value.

When the feed is silent on dividend option (empty column or 'GROWTH'
with no payout/reinvest distinction), the scheme master remains the
source of truth and validation passes.
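
A hedged sketch of the per-row check described in this commit message. `CorrectionType` and `ValidationError` here are stand-ins defined only so the example runs; their real definitions and constructor signatures in the project may differ, and `check_plan_type` is a hypothetical name.

```python
from enum import Enum
from typing import Optional


class CorrectionType(Enum):
    OTHER = "other"  # stand-in for the project's enum member


class ValidationError(Exception):
    """Stand-in exception carrying the correction type for the queue."""

    def __init__(self, correction_type: CorrectionType, message: str = ""):
        super().__init__(message)
        self.correction_type = correction_type


def check_plan_type(plan_type_from_feed: Optional[str], scheme_plan_type: str) -> None:
    """Raise when the feed's declared plan_type contradicts the scheme master."""
    if not plan_type_from_feed:
        # Silent feed: the scheme master stays the source of truth.
        return
    if plan_type_from_feed != scheme_plan_type:
        raise ValidationError(
            CorrectionType.OTHER,
            f"feed says {plan_type_from_feed!r}, scheme master says {scheme_plan_type!r}",
        )
```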

Acting on a LinkedIn comment about messy / inconsistent investor
data on family PANs.

When a family PAN maps to multiple accounts and the row doesn't
carry an ownership_type the resolver can tie-break with, try an
investor-name equality match after canonicalisation. Canonicalisation
is upper-case + whitespace collapse — deliberately not fuzzy, because
false positives on family accounts are worse than an AmbiguousPanError
the operator can fix in the correction queue.

Only activates when (a) the PAN has multiple accounts, (b) the row
carries an investor_name, (c) exactly one stored account matches.
If two stored accounts share the same canonical name or the row is
silent on the name, we fall through to the existing individual
default / ambiguous path unchanged.

The adapters already map CAMS INVNAME and KFintech INV_NAME to the
investor_name canonical column, so no adapter change needed.
@sandeeprjs92 merged commit 5ae771b into main on Apr 15, 2026
15 checks passed
@sandeeprjs92 deleted the fix/registrar-edge-cases branch on April 15, 2026 05:26