@ali-sefidmouy
Contributor

No description provided.

@tavallaie
Contributor

  1. Incorrect parent-child matching logic
    The matcher tests for shared-allele overlap (q_set & d_set) instead of applying the Mendelian subset rule. This produces many false positives: unrelated profiles that happen to carry common alleles rank highly.

  2. No proper Likelihood Ratio (CLR) calculation
    CLR is set to the number of "consistent" loci. A real forensic evaluation requires the product of per-locus LRs computed from allele frequencies, so the current ranking does not reflect the true strength of the relationship.

  3. No mutation handling
    True parent-child pairs with even a single ±1 step mutation are incorrectly penalized or excluded.

  4. No candidate pre-filtering / indexing
    Every query does a full scan of ~500k profiles, which is extremely slow and will likely time out in evaluation. The challenge requires efficient candidate filtering in order to scale.

  5. Bidirectional logic not properly implemented
    The subset check must work in both directions (query as child or query as parent), but the current overlap test does not satisfy this.

  6. Mutated_loci always 0
    This required output field is never populated.

  7. Inconclusive vs mismatch confusion
    True mismatches are counted as "inconclusive" instead of as exclusions.
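
For point 4, one possible shape for the pre-filter is an inverted index from (locus, allele) to profile ids, so a query only touches profiles that share alleles with it. This is only a rough sketch: the profile layout (a dict of locus to allele tuple), the names `build_index` and `candidates`, and the `max_misses` tolerance are all assumptions, not anything from the PR. Shared-allele overlap is used here purely as a coarse filter; the exact Mendelian check and the LR would still run on the survivors.

```python
from collections import Counter, defaultdict


def build_index(database):
    """Map (locus, allele) -> set of profile ids carrying that allele.

    `database` is assumed to be {profile_id: {locus: (allele, allele)}}.
    """
    index = defaultdict(set)
    for pid, profile in database.items():
        for locus, alleles in profile.items():
            for allele in alleles:
                index[(locus, allele)].add(pid)
    return index


def candidates(query, index, max_misses=2):
    """Return ids that share an allele with the query at all but
    `max_misses` of the query's typed loci.

    `max_misses` (hypothetical knob) leaves room for mutated loci;
    a locus missing from a candidate also counts as a miss here.
    """
    hits = Counter()
    typed = 0
    for locus, alleles in query.items():
        if not alleles:
            continue  # untyped locus in the query: skip it
        typed += 1
        seen = set()
        for allele in set(alleles):
            seen |= index.get((locus, allele), set())
        hits.update(seen)  # each profile gets at most one vote per locus
    needed = typed - max_misses
    return {pid for pid, n in hits.items() if n >= needed}


database = {
    "p1": {"D3": (15, 16), "vWA": (17, 18)},
    "p2": {"D3": (14, 14), "vWA": (19, 20)},
    "p3": {"D3": (15, 15), "vWA": (20, 21)},
}
index = build_index(database)
query = {"D3": (15, 17), "vWA": (18, 19)}
strict = candidates(query, index, max_misses=0)
loose = candidates(query, index, max_misses=1)
```

The index is built once and reused across queries, so the per-query cost depends on how many profiles carry the query's alleles rather than on the full database size.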

Score impact: the true parent rarely appears in the top 10 because of false positives and the lack of statistical weighting.
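
To make the statistical weighting in point 2 concrete, here is a minimal sketch of a combined LR as the product of per-locus parent-child LRs. It assumes Hardy-Weinberg proportions, no mutation model, no missing-data handling, and a hypothetical layout where profiles are {locus: (allele, allele)} and `freqs` is {locus: {allele: frequency}}. The function names are illustrative and not taken from the PR.

```python
from math import prod


def locus_lr(parent, child, freq):
    """LR at one locus: P(child | parent transmits one allele) / P(child | unrelated)."""
    a, b = child
    transmit = 0.0
    for t in parent:  # each parental allele is passed on with probability 1/2
        if a == b:
            if t == a:
                transmit += 0.5 * freq[a]  # other copy comes from the population
        else:
            if t == a:
                transmit += 0.5 * freq[b]
            if t == b:
                transmit += 0.5 * freq[a]
    unrelated = freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]
    return transmit / unrelated


def combined_lr(parent_profile, child_profile, freqs):
    """Product of per-locus LRs over loci typed in both profiles.

    Returns 1.0 (neutral) when no locus is shared. Without a mutation
    model, a single inconsistent locus drives the product to 0.
    """
    loci = parent_profile.keys() & child_profile.keys() & freqs.keys()
    return prod(
        locus_lr(parent_profile[l], child_profile[l], freqs[l]) for l in loci
    )


freqs = {
    "D3": {15: 0.1, 16: 0.2, 17: 0.3},
    "vWA": {17: 0.25, 18: 0.25},
}
parent = {"D3": (15, 16), "vWA": (17, 17)}
child = {"D3": (15, 17), "vWA": (17, 17)}
clr = combined_lr(parent, child, freqs)
```

Under these assumptions the value is the same whichever profile is labeled the parent, so a query can be scored once for both directions. A zero at one locus would still need to be replaced by a mutation-aware term before true pairs with a ±1 step survive ranking.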

@tavallaie tavallaie merged commit 6a3d1b3 into pyday-iran:main Dec 28, 2025
1 check failed
2 participants