03 The 40 Checks Explained
Every piece of Ghost Writer content is validated against 40 checks organized into 10 blocks (A–J). Hard checks must pass; soft checks inform quality. Content passes when there are 0 hard fails and ≤3 soft fails.
Block A. Counters: Perplexity-based detectors, burstiness analysis, vocabulary uniformity.
| ID | Name | Hard/Soft | Target | Detection Vector | How Ghost Writer Defeats It |
|---|---|---|---|---|---|
| 1 | Sentence Length Variance | Hard | stdev ≥ 5 | Low burstiness | Injects varied sentence lengths; prompt demands mix of fragments and long sentences |
| 2 | Vocabulary Richness (TTR) | Soft | TTR ≥ 0.45 | Lexical uniformity | Type-token ratio enforced; voice profile adds domain terms |
| 3 | Hapax Legomena Ratio | Soft | ≥ 0.25 | Repetitive vocabulary | Words used once; human-like vocabulary diversity |
| 4 | Average Sentence Length | Soft | 8–25 words | Uniform structure | Prompt specifies avg sentence length per voice |
| 5 | Short Sentence Presence | Hard | ≥ 1 sentence ≤ 5 words | No fragments | Prompt: "Use fragments when they hit harder" |
| 6 | Long Sentence Presence | Soft | ≥ 1 sentence ≥ 25 words | Overly simple structure | Prompt: "Mix 3-word fragments with 25–35 word complex sentences" |
| 7 | N-gram Diversity | Soft | Varied distribution | Predictable token patterns | Temperature 0.85–0.95; varied generation |
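
The length and vocabulary targets in the table above can be measured with a few lines of text statistics. The following is a minimal sketch, not Ghost Writer's actual code: the function names, the regex-based sentence splitter, the use of population standard deviation, and the definition of the hapax ratio as "words used once divided by total tokens" are all assumptions made for illustration.

```javascript
// Rough sentence splitter: breaks after ., ! or ? followed by whitespace.
// This is a heuristic (it will mis-split abbreviations), not a real tokenizer.
function splitSentences(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Lowercased word tokens: runs of letters, digits, and apostrophes.
function words(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}']+/gu) ?? [];
}

// Computes the metrics behind checks 1-6. Returns null for empty input.
function blockAStats(text) {
  const sentenceLengths = splitSentences(text).map((s) => words(s).length);
  const tokens = words(text);
  if (sentenceLengths.length === 0 || tokens.length === 0) return null;

  const mean = sentenceLengths.reduce((a, b) => a + b, 0) / sentenceLengths.length;
  // Population variance (divide by n); the source does not say which variant is used.
  const variance =
    sentenceLengths.reduce((acc, n) => acc + (n - mean) ** 2, 0) / sentenceLengths.length;

  const counts = new Map();
  for (const t of tokens) counts.set(t, (counts.get(t) ?? 0) + 1);
  const hapaxCount = [...counts.values()].filter((c) => c === 1).length;

  return {
    sentenceLengths,
    stdev: Math.sqrt(variance),                      // check 1: target >= 5
    ttr: counts.size / tokens.length,                // check 2: target >= 0.45
    hapaxRatio: hapaxCount / tokens.length,          // check 3: target >= 0.25 (assumed denominator)
    avgLength: mean,                                 // check 4: target 8 to 25 words
    hasShort: sentenceLengths.some((n) => n <= 5),   // check 5
    hasLong: sentenceLengths.some((n) => n >= 25),   // check 6
  };
}
```

Calling `blockAStats(draft)` returns one object per draft, so each threshold in the table can be compared against a single field. TTR and the hapax ratio both fall as a text gets longer, so thresholds like these are only meaningful for drafts of roughly similar length.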
Block B. Counters: Binary AI/human classifiers, model attribution.
| ID | Name | Hard/Soft | Target | Detection Vector | How Ghost Writer Defeats It |
|---|---|---|---|---|---|
| 8 | Conjunction Starters | Hard | ≥ 1 paragraph starts with And/But/So | AI-typical sentence structure | Prompt: "Start at least 2 paragraphs with conjunctions" |
| 9 | Fragment Usage | Soft | Contains fragments | Uniform sentence types | Regex checks for short sentences; prompt encourages fragments |
| 10 | Parenthetical Asides | Soft | Contains () or — | No human thought injection | Prompt: "Include at least one parenthetical aside" |
| 11 | Temperature Variance | Soft | 0.85–0.95 | Low-temperature uniformity | Generation uses 0.85+; variants use 0.85, 0.89, 0.93 |
| 12 | Model Attribution Defense | Soft | Varied patterns | Model fingerprinting | Voice-specific vocabulary, structural habits break attribution |
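
Checks 8 and 10 are simple pattern tests. A sketch of how they might look follows; the function names are hypothetical, and treating a blank line as the paragraph separator is an assumption about the input format.

```javascript
// Check 8: count paragraphs whose first word is And, But, or So.
// Paragraphs are assumed to be separated by one or more blank lines.
function countConjunctionStarters(text) {
  return text
    .split(/\n\s*\n/)
    .filter((p) => /^\s*(And|But|So)\b/.test(p)).length;
}

// Check 10: true if the text has a non-empty parenthetical or an em dash (U+2014).
function hasAside(text) {
  return /\([^)]+\)|\u2014/.test(text);
}
```

The `\b` word boundary keeps words such as "Sofa" or "Android" from counting as conjunction starters.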
Block C. Counters: Phrase-based detectors, readability uniformity, lexical patterns.
| ID | Name | Hard/Soft | Target | Detection Vector | How Ghost Writer Defeats It |
|---|---|---|---|---|---|
| 13 | Phrase Blacklist | Hard | 0 hits | 120+ AI-detectable phrases | Blacklist in prompt; post-generation replacement if hit |
| 14 | Lexical Diversity | Soft | TTR ≥ 0.50 | Vocabulary repetition | Higher TTR target; voice vocabulary |
| 15 | Readability Variance | Soft | Flesch-Kincaid reading ease 20–100 | Uniform readability | Voice-driven; no fixed grade level |
| 16 | Syntactic Variety | Soft | stdev ≥ 4 | Uniform syntax | Sentence length stdev check |
| 17 | Emotional Authenticity | Soft | Voice-driven | Flat tone | Tone anchors in voice profile |
| 18 | Metaphor/Analogy Presence | Soft | ≥ 1 | No creative comparison | Prompt: "Use at least one unexpected metaphor or analogy" |
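
Check 13 amounts to a case-insensitive phrase scan with a target of zero hits. The sketch below assumes simple substring matching; the three phrases are placeholders chosen for illustration and are not taken from Ghost Writer's actual 120+ entry list.

```javascript
// Placeholder phrases for illustration only; the real blacklist is not shown in this document.
const SAMPLE_BLACKLIST = ["delve into", "in conclusion", "it is important to note"];

// Returns the blacklist phrases found in the text (case-insensitive substring match).
function findBlacklistHits(text, blacklist = SAMPLE_BLACKLIST) {
  const lower = text.toLowerCase();
  return blacklist.filter((phrase) => lower.includes(phrase.toLowerCase()));
}

// Check 13 passes only when there are no hits.
function passesPhraseBlacklist(text, blacklist = SAMPLE_BLACKLIST) {
  return findBlacklistHits(text, blacklist).length === 0;
}
```

Returning the matched phrases, rather than just a boolean, gives a replacement step something concrete to act on.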
Block D. Counters: Invisible Unicode watermarks, metadata embedding.
| ID | Name | Hard/Soft | Target | Detection Vector | How Ghost Writer Defeats It |
|---|---|---|---|---|---|
| 19 | Unicode Normalization | Hard | No invisible chars | Zero-width, BOM, etc. | normalizeText() strips U+200B–U+200F, U+2028–U+202F, U+FEFF |
| 20 | Metadata Clean | Hard | None | Embedded metadata | Plain text output only; no hidden markers |
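
The stripping step named in check 19 can be expressed as a single regular expression over the listed code point ranges. Only the function name `normalizeText()` and the ranges come from the table; the body below is an assumed implementation, and `hasInvisibleChars` is a hypothetical helper for the check itself.

```javascript
// Code points listed for check 19: U+200B-U+200F, U+2028-U+202F, U+FEFF.
const INVISIBLE_CHARS = /[\u200B-\u200F\u2028-\u202F\uFEFF]/g;

// Assumed implementation: delete every listed character.
function normalizeText(text) {
  return text.replace(INVISIBLE_CHARS, "");
}

// Hypothetical helper: true if any listed character is still present.
// Uses a non-global regex so repeated calls do not depend on lastIndex.
function hasInvisibleChars(text) {
  return /[\u200B-\u200F\u2028-\u202F\uFEFF]/.test(text);
}
```

The range U+2028–U+202F includes the line separator, the paragraph separator, and the narrow no-break space, which can sit between words. Deleting them outright may join adjacent words, so a variant that replaces those with a regular space could be preferable.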
Block E. Counters: Detector confidence thresholds, sentence-level analysis, plagiarism, anti-humanizer.
| ID | Name | Hard/Soft | Target | Detection Vector | How Ghost Writer Defeats It |
|---|---|---|---|---|---|
| 21 | Confidence Score Target | Soft | < 30% AI on all detectors | High AI probability | Triple-detector pipeline; regenerate/revise if fail |
| 22 | Sentence-Level Clean | Soft | No sentence > 80% AI | Per-sentence flagging | Revise failed sentences; splice back |
| 23 | Plagiarism Clear | Hard | < 5% match | Originality.ai | Generate from scratch; no copy-paste |
| 24 | Anti-Humanizer | Hard | Generated with human patterns, not paraphrased | Paraphrase detection | Content built with human patterns from scratch |
| 25 | Language Authenticity | Soft | Matches voice profile | Generic AI tone | Voice profile calibration in prompt |
Block F. Counters: Non-native bias exploitation, domain mismatch, length mismatch.
| ID | Name | Hard/Soft | Target | Detection Vector | How Ghost Writer Defeats It |
|---|---|---|---|---|---|
| 26 | Non-Native Bias Clear | Soft | No exploitation | Detector bias against non-native writers | Ethical constraint; no deliberate bias gaming |
| 27 | Domain Pattern Match | Soft | Uses domain vocabulary | Generic wording | Voice profile domain terms; check for presence |
| 28 | Length Optimization | Soft | Matches platform best length | Wrong length for platform | Platform spec bestLength; word count check |
Block G. Counters: Pattern diversity attacks, translation artifacts, authorship inconsistency.
| ID | Name | Hard/Soft | Target | Detection Vector | How Ghost Writer Defeats It |
|---|---|---|---|---|---|
| 29 | Pattern Diversity | Soft | stdev ≥ 6 | Structural uniformity | Higher stdev target for adversarial robustness |
| 30 | Translation Proof | Soft | English-native | Translation artifacts | Native generation; no translation step |
| 31 | Authorship Consistency | Soft | Single voice throughout | Mixed styles | Voice profile maintains consistency |
Block H. Counters: Single-detector gaming, encoding issues, platform violations.
| ID | Name | Hard/Soft | Target | Detection Vector | How Ghost Writer Defeats It |
|---|---|---|---|---|---|
| 32 | Multi-Detector Validation | Hard | GPTZero + Pangram + Originality | Single-detector optimization | All three must pass when configured |
| 33 | Plain Text Normalization | Hard | Normalized output | Hidden chars, encoding | Same normalizeText() as Block D |
| 34 | Platform Format Compliance | Hard | ≤ platform max chars | Over-length content | Truncate to maxChars; email split by spec |
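
For check 34, the table only says that content is truncated to `maxChars`. One way to do that without cutting a word in half is sketched below. The function name and the word-boundary behaviour are assumptions; email splitting is not covered.

```javascript
// Truncate to at most maxChars, backing up to the last space when one exists.
// Note: String.length counts UTF-16 code units, which may differ from a
// platform's own character count for emoji and some other symbols.
function truncateToMaxChars(text, maxChars) {
  if (text.length <= maxChars) return text;
  const cut = text.slice(0, maxChars);
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut).trimEnd();
}
```

For example, `truncateToMaxChars("hello world again", 13)` returns `"hello world"`, which stays within the limit and ends on a whole word.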
Block I. Counters: Self-evaluation bias, FPR exploitation, AI-only classification.
| ID | Name | Hard/Soft | Target | Detection Vector | How Ghost Writer Defeats It |
|---|---|---|---|---|---|
| 35 | Third-Party Benchmark | Soft | Validated externally | Internal-only scoring | Real detector APIs; no mock scores |
| 36 | FPR Exploitation Clear | Soft | No FPR gaming | False positive rate exploitation | Ethical constraint; no detector gaming |
| 37 | AI-Assisted Classification | Soft | AI-Assisted or Human | Pangram "AI" classification | Target: Human or AI-Assisted, not AI |
Block J. Counters: Disclosure gaps, audit gaps, provenance ambiguity.
| ID | Name | Hard/Soft | Target | Detection Vector | How Ghost Writer Defeats It |
|---|---|---|---|---|---|
| 38 | Disclosure Compliance | Soft | Transparent use | Hidden AI use | Tool provides detection scores for user decision |
| 39 | Audit Trail | Soft | Logged | No traceability | QA report serves as audit trail |
| 40 | Provenance Proof | Soft | Built from scratch | Paraphrased/spun content | Content constructed with human patterns from scratch |
Pass rule: `passed = (hardFails === 0) && (softFails <= 3)`
- Hard fail: Content is revised (blacklist replacement + GPT revision)
- Soft fail: Counted; if >3, triggers revision
- Max revisions: 3 for QA; 3 for detector failures
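
The pass rule and the revision cap can be combined into a small loop. This is a sketch under stated assumptions: the shape of a check result (`{ id, hard, passed }`) is hypothetical, and `runChecks` and `revise` are caller-supplied stand-ins for the real QA and revision steps, which are not shown in this document.

```javascript
const MAX_SOFT_FAILS = 3;    // from the pass rule: softFails <= 3
const MAX_QA_REVISIONS = 3;  // from "Max revisions: 3 for QA"

// `results` is an array of { id, hard, passed } objects, one per check (assumed shape).
function evaluate(results) {
  const hardFails = results.filter((r) => r.hard && !r.passed).length;
  const softFails = results.filter((r) => !r.hard && !r.passed).length;
  return {
    hardFails,
    softFails,
    passed: hardFails === 0 && softFails <= MAX_SOFT_FAILS,
  };
}

// Re-run the checks after each revision until the content passes or the cap is reached.
function qaLoop(content, runChecks, revise) {
  let current = content;
  let verdict = evaluate(runChecks(current));
  let revisions = 0;
  while (!verdict.passed && revisions < MAX_QA_REVISIONS) {
    current = revise(current, verdict);
    revisions += 1;
    verdict = evaluate(runChecks(current));
  }
  return { content: current, revisions, ...verdict };
}
```

The loop can exit with `passed: false` once the cap is hit, so a caller needs to decide what to do with content that still fails after three revisions.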