
Conversation

@johnzfitch
Owner

  • Add manual_analysis.py script for watermark detection when spaCy model unavailable
  • Include sample article text (LLM learning research) for analysis
  • Generate detailed JSON analysis output with 46 clause pairs analyzed
  • Result: LIKELY_HUMAN verdict with 0.209 final score (below 0.45 threshold)

Copilot AI review requested due to automatic review settings November 22, 2025 12:35
@johnzfitch johnzfitch merged commit 0fb0d57 into main Nov 22, 2025
8 checks passed

Copilot AI left a comment


Pull request overview

This PR adds a manual watermark analysis script for detecting Echo Rule watermarks in text when spaCy models are unavailable. The script analyzes an LLM learning research article and generates a detailed JSON report with 46 clause pairs analyzed, concluding with a "LIKELY_HUMAN" verdict (score: 0.209).

Key Changes:

  • Implements phonetic, structural, and semantic echo analysis for watermark detection
  • Provides fallback analysis without requiring full spaCy NLP models
  • Generates comprehensive JSON output with detailed scoring breakdown

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 11 comments.

File Description

  • scripts/manual_analysis.py — New script implementing Echo Rule watermark detection with phonetic, structural, and semantic analysis using cmudict, Levenshtein, and rule-based POS tagging
  • data/analysis_output.json — Generated JSON output containing analysis results for 46 clause pairs with detailed phonetic, structural, and semantic scores
  • data/analysis_input.txt — Sample input text about LLM learning research used for watermark analysis testing


sem_score, sem_details = analyze_semantic_echo(pair.zone_a, pair.zone_b)

# Combined score (using Tier 1 weights: 40% phonetic, 30% structural, 30% semantic)
combined = 0.4 * phon_score + 0.3 * struct_score + 0.3 * sem_score

Copilot AI Nov 22, 2025


The same weights (0.4, 0.3, 0.3) are duplicated here. This creates a maintenance issue where changes to the weighting algorithm must be made in multiple places. Consider extracting these to module-level constants.
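A minimal sketch of the suggested refactor, hoisting the duplicated Tier 1 weights to module-level constants so the weighting lives in one place (the constant and function names here are hypothetical, not from the PR):

```python
# Hypothetical module-level constants replacing the duplicated 0.4/0.3/0.3 literals
PHONETIC_WEIGHT = 0.4
STRUCTURAL_WEIGHT = 0.3
SEMANTIC_WEIGHT = 0.3


def combine_scores(phon_score: float, struct_score: float, sem_score: float) -> float:
    """Single definition of the Tier 1 weighted combination."""
    return (PHONETIC_WEIGHT * phon_score
            + STRUCTURAL_WEIGHT * struct_score
            + SEMANTIC_WEIGHT * sem_score)
```

Every call site that currently inlines `0.4 * ... + 0.3 * ... + 0.3 * ...` would then call `combine_scores(...)` instead, so a future weight change happens in exactly one place.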

Comment on lines +533 to +545
if report.final_score >= 0.45:
report.verdict = "HIGH_PROBABILITY_AI"
report.confidence = min(0.95, 0.5 + report.final_score)
report.reasoning.append(f"High echo score ({report.final_score:.3f}) suggests Echo Rule watermark presence")
elif report.final_score >= 0.35:
report.verdict = "MODERATE_PROBABILITY_AI"
report.confidence = 0.3 + report.final_score
report.reasoning.append(f"Moderate echo score ({report.final_score:.3f}) - possible watermark presence")
elif report.final_score >= 0.25:
report.verdict = "LOW_PROBABILITY_AI"
report.confidence = 0.2 + report.final_score * 0.5
report.reasoning.append(f"Low echo score ({report.final_score:.3f}) - unlikely watermark presence")
else:

Copilot AI Nov 22, 2025


The threshold values (0.45, 0.35, 0.25) for verdict determination are hardcoded magic numbers without clear documentation. These critical thresholds should be extracted as named constants (e.g., HIGH_PROBABILITY_THRESHOLD, MODERATE_PROBABILITY_THRESHOLD, LOW_PROBABILITY_THRESHOLD) with documentation explaining their basis in research or testing.
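A sketch of the suggested constants, using the threshold values visible in the diff; the verdict-selection helper and its name are hypothetical:

```python
# Hypothetical named constants for the hardcoded verdict thresholds.
# The values mirror the diff; their empirical basis should be documented.
HIGH_PROBABILITY_THRESHOLD = 0.45
MODERATE_PROBABILITY_THRESHOLD = 0.35
LOW_PROBABILITY_THRESHOLD = 0.25


def classify(final_score: float) -> str:
    """Map a final echo score to a verdict using the named thresholds."""
    if final_score >= HIGH_PROBABILITY_THRESHOLD:
        return "HIGH_PROBABILITY_AI"
    if final_score >= MODERATE_PROBABILITY_THRESHOLD:
        return "MODERATE_PROBABILITY_AI"
    if final_score >= LOW_PROBABILITY_THRESHOLD:
        return "LOW_PROBABILITY_AI"
    return "LIKELY_HUMAN"
```

With the constants named, the PR's reported result (0.209, below the 0.45 threshold) maps to `LIKELY_HUMAN` without the reader having to decode magic numbers.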

Comment on lines +551 to +556
if report.average_phonetic > 0.4:
report.reasoning.append(f"Elevated phonetic echoes ({report.average_phonetic:.3f}) - sound patterns at clause boundaries")
if report.average_structural > 0.5:
report.reasoning.append(f"Strong structural parallelism ({report.average_structural:.3f}) - similar grammatical patterns")
if report.average_semantic > 0.4:
report.reasoning.append(f"Semantic coherence ({report.average_semantic:.3f}) - related concepts across boundaries")

Copilot AI Nov 22, 2025


The threshold values (0.4, 0.5, 0.4) for generating specific observations about phonetic, structural, and semantic echoes are magic numbers. Extract these as named constants for clarity and consistency.

def main():
"""Main entry point."""
# Read input file
input_file = Path('/home/user/specHO/data/analysis_input.txt')

Copilot AI Nov 22, 2025


The file paths are hardcoded with absolute paths to /home/user/specHO/. This reduces portability and will fail if the script is run in a different environment or by a different user. Consider using relative paths or making the paths configurable via command-line arguments or environment variables.
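One way to address this, sketched with `argparse` and repository-relative defaults (the flag names and default paths here are illustrative, not from the PR):

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Make the input/output locations configurable instead of hardcoded."""
    parser = argparse.ArgumentParser(description="Echo Rule watermark analysis")
    parser.add_argument("--input", type=Path,
                        default=Path("data/analysis_input.txt"),
                        help="input text file (relative to the working directory)")
    parser.add_argument("--output", type=Path,
                        default=Path("data/analysis_output.json"),
                        help="destination for the JSON report")
    return parser.parse_args(argv)
```

`main()` would then read `args.input` and write `args.output`, so the script runs unchanged in any checkout rather than only under /home/user/specHO/.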

Comment on lines +113 to +116
"""Check if a word is a content word (not a function word)."""
return word.lower() not in FUNCTION_WORDS and len(word) > 2



Copilot AI Nov 22, 2025


The function is_content_word filters words with len(word) > 2, which will exclude valid two-letter content words like "AI" (line 399), "us", "be", etc. While some of these are function words, this threshold may incorrectly filter legitimate short content words. Consider adjusting the logic to handle acronyms and meaningful short words appropriately.

Suggested change
"""Check if a word is a content word (not a function word)."""
return word.lower() not in FUNCTION_WORDS and len(word) > 2
"""Check if a word is a content word (not a function word).

Excludes words in FUNCTION_WORDS, allows acronyms (all uppercase,
length >= 2), and otherwise requires more than two characters.
"""
if word.lower() in FUNCTION_WORDS:
    return False
if len(word) >= 2 and word.isupper():
    return True  # Acronym (e.g., "AI", "US")
return len(word) > 2

print_report(report, verbose=True)

# Save JSON report
output_file = Path('/home/user/specHO/data/analysis_output.json')

Copilot AI Nov 22, 2025


The file paths are hardcoded with absolute paths to /home/user/specHO/. This reduces portability and will fail if the script is run in a different environment or by a different user. Consider using relative paths or making the paths configurable via command-line arguments or environment variables.

Comment on lines +153 to +154
if len(parts[i].strip()) > 10 and len(parts[i+1].strip()) > 10:
pairs.append(create_clause_pair(parts[i], parts[i+1], sep))

Copilot AI Nov 22, 2025


The clause length check (len(parts[i].strip()) > 10) uses a magic number 10. This threshold seems arbitrary and could filter out valid short clauses. Consider making this a named constant with a clear rationale, or adjusting the threshold based on word count rather than character count for more semantic meaning.
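A sketch of the word-count variant of this check; the constant value and helper name are hypothetical and would need tuning against real clause data:

```python
# Hypothetical replacement for the repeated character-count check `> 10`:
# filter by word count, which tracks clause substance more directly.
MIN_CLAUSE_WORDS = 3


def clause_long_enough(clause: str) -> bool:
    """Keep a clause only if it contains at least MIN_CLAUSE_WORDS words."""
    return len(clause.strip().split()) >= MIN_CLAUSE_WORDS
```

Both clause-splitting sites (the separator loop here and the regex path below) could then share this single predicate instead of repeating the literal.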

Comment on lines +162 to +163
if len(before.strip()) > 10 and len(after.strip()) > 10:
pairs.append(create_clause_pair(before, after, match.group(1)))

Copilot AI Nov 22, 2025


The same magic number check (len(before.strip()) > 10 and len(after.strip()) > 10) is repeated. This should be extracted to a named constant for consistency and maintainability.

final_echo = check_final_sounds(phonetics_a, phonetics_b)

# Combine scores (weighted)
combined = 0.4 * avg_sim + 0.3 * initial_echo + 0.3 * final_echo

Copilot AI Nov 22, 2025


The weights (0.4, 0.3, 0.3) for the average similarity, initial echo, and final echo components are hardcoded magic numbers. These should be extracted as named constants (e.g., AVG_SIMILARITY_WEIGHT, INITIAL_ECHO_WEIGHT, FINAL_ECHO_WEIGHT) for better maintainability and documentation of the scoring algorithm.

HAS_LEVENSHTEIN = False

try:
import numpy as np

Copilot AI Nov 22, 2025


Import of 'np' is not used.

Suggested change
import numpy as np
