Releases: ksanyok/TextHumanize
v0.11.0 — 3x Dictionary Expansion + Composer Fix
What's New
Massive Dictionary Expansion (3x total)
All 9 language dictionaries expanded from 2,281 to 6,881 entries (3.0x growth):
| Language | Before | After | Growth |
|---|---|---|---|
| English | 257 | 1,391 | 5.4x |
| Russian | 291 | 956 | 3.3x |
| Ukrainian | 252 | 780 | 3.1x |
| German | 235 | 724 | 3.1x |
| French | 263 | 599 | 2.3x |
| Spanish | 255 | 613 | 2.4x |
| Italian | 244 | 616 | 2.5x |
| Polish | 244 | 617 | 2.5x |
| Portuguese | 240 | 585 | 2.4x |
All 9 categories expanded: synonyms, bureaucratic words/phrases, AI connectors, sentence starters, colloquial markers, perplexity boosters, split conjunctions, abbreviations.
Bug Fixes
- Composer package name — root `composer.json` had the incorrect name `ksanyok/texthumanize` (no hyphen). Fixed to `ksanyok/text-humanize`. Also changed `type` from `project` to `library` with proper Packagist metadata.
- TOC dots preservation — table-of-contents leader dots (`...........`) no longer collapse into an ellipsis.
Install
```bash
# Python
pip install texthumanize

# PHP
composer require ksanyok/text-humanize
```

1,455 tests passing.
v0.10.0 — Grammar, Uniqueness, Health Score, Semantic & Sentence Readability
What's New in v0.10.0
5 New Analysis Modules (all offline, no ML/API)
| Module | Function | Description |
|---|---|---|
| Grammar Checker | `check_grammar()` / `fix_grammar()` | Rule-based grammar checking for 9 languages |
| Uniqueness Score | `uniqueness_score()` / `compare_texts()` | N-gram fingerprinting uniqueness analysis |
| Content Health | `content_health()` | Composite quality: readability + grammar + uniqueness + AI + coherence |
| Semantic Similarity | `semantic_similarity()` | Measures semantic preservation between original and processed text |
| Sentence Readability | `sentence_readability()` | Per-sentence difficulty scoring (easy/medium/hard/very_hard) |
Custom Dictionary API
```python
result = humanize(text, custom_dict={
    "implement": "build",
    "utilize": ["use", "apply", "employ"],  # random pick
})
```

Massively Expanded Dictionaries
All 9 language dictionaries balanced (367-439 entries each):
- FR: 281→397, ES: 275→388, IT: 272→379, PL: 257→368, PT: 256→367
- EN/RU/UK: added perplexity_boosters
Stats
- 28 files changed, +2333 lines
- 1455 tests passing (82 new)
- 17 new public exports
- Zero external dependencies
v0.9.0 — Kirchenbauer Watermark, HTML Diff, Quality Gate, Selective Humanization, Stylometric Anonymizer
What's New
Kirchenbauer Watermark Detector
Green-list z-test based on Kirchenbauer et al. 2023. Uses SHA-256 hash of previous token to partition vocabulary into green/red lists (γ=0.25), computes z-score and p-value. Flags AI watermark at z ≥ 4.0.
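Given T scored tokens of which n_G landed on the green list, the Kirchenbauer z-statistic is z = (n_G − γT) / sqrt(T·γ·(1−γ)). A minimal self-contained sketch of the hash-seeded partition and counting loop (illustrative only — helper names and hashing details are assumptions, not TextHumanize's internals):

```python
import hashlib
import math

GAMMA = 0.25  # green-list fraction

def is_green(prev_token: str, token: str, gamma: float = GAMMA) -> bool:
    """Seed a vocabulary partition with SHA-256 of the previous token;
    the current token is 'green' if its seeded hash falls in the
    bottom `gamma` slice of the hash range."""
    seed = hashlib.sha256(prev_token.encode()).hexdigest()
    h = hashlib.sha256((seed + token).encode()).digest()
    return h[0] / 256.0 < gamma

def watermark_z(tokens: list[str], gamma: float = GAMMA) -> float:
    """z-score of the observed green count vs. the gamma*T expectation."""
    pairs = list(zip(tokens, tokens[1:]))
    t = len(pairs)
    if t == 0:
        return 0.0
    n_green = sum(is_green(prev, tok) for prev, tok in pairs)
    return (n_green - gamma * t) / math.sqrt(t * gamma * (1 - gamma))
```

Unwatermarked text should hover near z ≈ 0; the detector flags AI watermarking at z ≥ 4.0, which corresponds to a one-sided p-value of roughly 3×10⁻⁵.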
```python
from texthumanize import detect_watermarks

report = detect_watermarks(text)
print(report.kirchenbauer_score, report.kirchenbauer_p_value)
```

HTML Diff Report
explain() now supports multiple output formats:
```python
html = explain(result, fmt='html')      # self-contained HTML page
json_str = explain(result, fmt='json')  # RFC 6902 JSON Patch
diff = explain(result, fmt='diff')      # unified diff
```

Quality Gate
CLI + GitHub Action + pre-commit hook to check text for AI artifacts:
```bash
python -m texthumanize.quality_gate README.md docs/ --ai-threshold 25
```

Selective Humanization
Process only AI-flagged sentences, leaving human text untouched:
```python
result = humanize(text, only_flagged=True)
```

Stylometric Anonymizer
Disguise authorship by transforming text toward a target style:
```python
from texthumanize import anonymize_style

result = anonymize_style(text, target='blogger')
```

Stats
- 1,373 Python tests passing
- 40 new tests for v0.9.0 features
- Ruff lint clean
- 22 files changed, 1,637 additions
v0.8.1 — Dual License, Commercial Pricing, Enterprise-Ready
What's New
Licensing:
- Dual license model — free for personal/academic use, commercial licenses from $99/year
- COMMERCIAL.md — dedicated page with pricing tiers, feature comparison, FAQ
- Clear LICENSE file with no ambiguity for legal/compliance teams
Benchmarks:
- `benchmarks/full_benchmark.py` — reproducible benchmark suite (speed, memory, predictability, AI detection, quality)
- Real measured data in README: 26K-38K chars/sec, ~2.5 MB peak memory, 100% determinism
Enterprise Documentation:
- "For Business & Enterprise" section with corporate requirements mapping
- Processing Modes: `normalize` / `style_soft` / `rewrite` with audit trail
- Change Report section with `explain()` examples
- Predictability guarantees with seed-based determinism proof
Commercial License Tiers
| Tier | Price | Includes |
|---|---|---|
| Indie | $99/yr | 1 project, 1 dev |
| Startup | $299/yr | 3 projects, 5 devs |
| Business | $799/yr | Unlimited, 20 devs |
| Enterprise | Contact us | On-prem, SLA |
Install / Update
```bash
pip install git+https://github.com/ksanyok/TextHumanize.git@v0.8.1
```

Full Changelog: https://github.com/ksanyok/TextHumanize/blob/main/CHANGELOG.md
v0.8.0 — Style Presets, Auto-Tuner, Semantic Guards, TS/JS Port
🎯 Highlights
The most feature-complete release yet — 27,000+ lines of code across 3 platforms.
Added
- Style Presets — 5 predefined targets: student, copywriter, scientist, journalist, blogger
- Auto-Tuner — feedback loop that learns optimal intensity from processing history
- Semantic preservation guards — expanded context guards with 20+ patterns across EN/RU/UK/DE
- Typography-only fast path — AI ≤ 5% skips semantic stages entirely
- TypeScript/JavaScript port — full pipeline with adaptive intensity (28 tests)
- Complete documentation rewrite — README (2500+ lines), API Reference (660+ lines), Cookbook (14 recipes)
Changed
- change_ratio calculation — switched to SequenceMatcher (fixes critical inflation bug)
- Graduated retry — retries at lower intensity instead of full rollback
- German dictionaries — bureaucratic 22→64, phrases 14→25, connectors 12→20, synonyms 26→45
Fixed
- DE zero-change bug (dictionary contained only infinitives)
- Natural text over-processing (AI ≤ 5%)
- Validator change_ratio consistency
📊 Stats
| Platform | Lines of Code | Tests |
|---|---|---|
| Python | 16,820 | 1,333 |
| PHP | 10,000 | 223 |
| TypeScript | 1,031 | 28 |
| Total | 27,851 | 1,584 |
Benchmark: 100% (45/45) · Coverage: 99% · Speed: 56K chars/sec
Full Changelog: https://github.com/ksanyok/TextHumanize/blob/main/CHANGELOG.md
Install / Update
```bash
pip install git+https://github.com/ksanyok/TextHumanize.git@v0.8.0
```

v0.7.0 — AI Detection 2.0, C2PA Watermarks, Streaming
Added
- 13 AI-detection metrics — new perplexity_score metric (character-level trigram model)
- Ensemble boosting — 3-classifier aggregation: base weighted sum (50%), strong-signal detector (30%), majority voting (20%). 90.9% accuracy
- Benchmark suite — 11 labeled samples, per-label accuracy breakdown
- CLI detect subcommand — `texthumanize detect [file]` with emoji verdicts
- Streaming progress callback — `humanize_batch(texts, on_progress=callback)`
- C2PA / IPTC watermark detection — content provenance pattern detection
- Tone replacements for UK/DE/FR/ES — informal ↔ formal replacement pairs
- PHP examples/ — basic_usage.php & advanced.php
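The ensemble boosting described above combines three classifier outputs with fixed 50% / 30% / 20% weights. As a plain illustration of that aggregation (the actual classifiers are internal to the library; this is only the weighting step):

```python
def ensemble_score(base: float, strong_signal: float, majority: float) -> float:
    """Aggregate three classifier outputs (each in 0..1) with the
    50% / 30% / 20% weighting from the release notes."""
    weights = (0.50, 0.30, 0.20)
    scores = (base, strong_signal, majority)
    return sum(w * s for w, s in zip(weights, scores))
```

Because the weights sum to 1, the combined score stays in the same 0..1 range as its inputs, so the same threshold logic can be applied before and after boosting.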
Changed
- Zipf metric rewritten — log-log linear regression with R² goodness-of-fit
- Confidence formula — 4-component with text length, metric agreement, extreme bonus
- Grammar detection expanded — 5 → 9 indicators
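The rewritten Zipf metric fits log-frequency against log-rank by least squares and reports the R² goodness-of-fit. A minimal version of such a fit — a sketch under those assumptions, not the package's code — looks like this:

```python
import math
from collections import Counter

def zipf_fit(text: str) -> tuple[float, float]:
    """Least-squares fit of log(freq) = a + b*log(rank); returns
    (slope, R^2). Natural text tends toward a slope near -1."""
    freqs = sorted(Counter(text.lower().split()).values(), reverse=True)
    if len(freqs) < 2:
        return 0.0, 0.0
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    # R^2 for simple linear regression is the squared Pearson correlation
    r2 = (sxy ** 2) / (sxx * syy) if syy else 1.0
    return slope, r2
```

A high R² means the rank-frequency curve is genuinely Zipf-like; a low R² flags vocabulary distributions that deviate from natural-language statistics.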
Full Changelog: https://github.com/ksanyok/TextHumanize/blob/main/CHANGELOG.md
Install
```bash
pip install git+https://github.com/ksanyok/TextHumanize.git@v0.7.0
```

v0.6.0 — Batch Processing, Quality Metrics, 99% Coverage
Added
- humanize_batch() / humanizeBatch() — batch processing (Python + PHP)
- HumanizeResult.similarity — Jaccard similarity metric (0..1)
- HumanizeResult.quality_score — overall quality score (0..1)
- 1255 Python tests — up from 500, with 99% code coverage
- 223 PHP tests (825 assertions)
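The new `HumanizeResult.similarity` is described as a Jaccard metric. For reference, Jaccard similarity over word sets is |A ∩ B| / |A ∪ B| — a quick self-contained sketch (the library's token granularity may differ):

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard index of the two texts' word sets: |A ∩ B| / |A ∪ B|."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)
```

A value near 1.0 means the humanized output kept most of the original vocabulary; values near 0.0 indicate heavy rewriting.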
Changed
- Python test coverage 85% → 99% (28 of 38 modules at 100%)
- mypy clean — 0 type errors across all 38 source files
- Dead code removed — 11 unreachable blocks cleaned up
Fixed
- ToneAnalyzer MARKETING direction
- PHP SentenceSplitter Cyrillic support
- 37 mypy type errors fixed
Full Changelog: https://github.com/ksanyok/TextHumanize/blob/main/CHANGELOG.md
Install
```bash
pip install git+https://github.com/ksanyok/TextHumanize.git@v0.6.0
```

v0.5.0 — Code Quality and Coverage
What's New in v0.5.0
Quality Engineering
- 500 tests - up from 382, covering 85% of the codebase (was 80%)
- Zero lint errors - fixed all 67 ruff errors across the project
- PEP 561 compliance - py.typed marker for downstream type checkers
- Pre-commit hooks - ruff lint+format, trailing whitespace, YAML/TOML checks
- mypy integration - type checking configuration in pyproject.toml
- Enhanced CI/CD - ruff lint step + mypy type check + XML coverage output
Coverage Improvements
| Module | Before | After |
|---|---|---|
| morphology.py | 55% | 93% |
| coherence.py | 68% | 96% |
| paraphrase.py | 71% | 87% |
| watermark.py | 74% | 87% |
| Overall | 80% | 85% |
PHP Fixes
- SentenceSplitter - PREG_OFFSET_CAPTURE offset properly cast to int
- ToneAnalyzer - preg_match offset cast to int for mb_substr() compatibility
Full Changelog
See CHANGELOG.md for details.
Full backward compatibility with v0.4.0 - no breaking changes.