
Plan: EMA Error Decay + Integrated Bigram/Char Focus Generation

Context

Two problems with the current n-gram focus system:

  1. Focus stickiness: Bigram anomaly uses cumulative (error_count+1)/(sample_count+2) Laplace smoothing. A bigram with 20 errors / 25 samples would need ~54 consecutive correct strokes to drop below the 1.5x threshold. Once confirmed, a bigram dominates focus for many drills even as the user visibly improves, while worse bigrams can't take over.

  2. Post-processing bigram focus causes repetition: When a bigram is in focus, apply_bigram_focus() post-processes finished text by replacing 40% of words with dictionary words containing the bigram. This selects randomly from candidates with no duplicate tracking, causing repeated words. It also means the bigram doesn't influence the actual word selection — it's bolted on after generation and overrides the focused char (the weakest char gets replaced by bigram[0]).

This plan addresses both: (A) switch error rate to EMA so anomalies respond to recent performance, and (B) integrate bigram focus directly into the word selection algorithm alongside char focus, enabling both to be active simultaneously.


Part A: EMA Error Rate Decay

Approach

Add an error_rate_ema: f64 field to both NgramStat and KeyStat, updated via exponential moving average on each keystroke (same pattern as existing filtered_time_ms). Use this EMA for all anomaly computations instead of cumulative (error_count+1)/(sample_count+2).

Both bigram AND char error rates must use EMA — error_anomaly_ratio divides one by the other, so asymmetric decay would distort the comparison.

Alpha = 0.1 (same as timing EMA). Half-life ~7 samples. A bigram at 30% error rate recovering with all-correct strokes: drops below 1.5x threshold after ~15 correct (~2 drills). This is responsive without being twitchy.
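
These numbers are easy to verify. A standalone sketch (EMA_ALPHA mirrors the existing timing constant; the 4% baseline char error rate is an illustrative assumption, not a value from the codebase):

const EMA_ALPHA: f64 = 0.1;

fn main() {
    // Half-life: all-correct samples needed for the EMA to halve.
    // (1 - alpha)^n = 0.5  =>  n = ln(0.5) / ln(0.9) ≈ 6.6
    let half_life = 0.5f64.ln() / (1.0 - EMA_ALPHA).ln();
    println!("half-life ≈ {half_life:.1} samples");

    // Recovery: a bigram EMA at 0.30 decays by (1 - alpha) per correct
    // stroke. Count strokes until it falls below 1.5x the baseline.
    let mut ema = 0.30;
    let threshold = 1.5 * 0.04; // assumed 4% char error rate
    let mut strokes = 0;
    while ema >= threshold {
        ema *= 1.0 - EMA_ALPHA;
        strokes += 1;
    }
    println!("below threshold after {strokes} correct strokes"); // ~16
}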

Changes

src/engine/ngram_stats.rs

NgramStat struct (line 34):

  • Add error_rate_ema: f64 with #[serde(default = "default_error_rate_ema")] and default value 0.5
  • Add fn default_error_rate_ema() -> f64 { 0.5 } (Laplace-equivalent neutral prior)
  • Remove recent_correct: Vec<bool> — superseded by EMA and never read (resulting struct sketched below)
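
A sketch of the resulting struct shape (the count field types and surrounding fields are assumptions; only the new field and its default come from this plan):

use serde::{Deserialize, Serialize};

fn default_error_rate_ema() -> f64 {
    0.5 // neutral prior, matching Laplace (0+1)/(0+2)
}

#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct NgramStat {
    pub error_count: u32,
    pub sample_count: u32,
    // EMA of the per-stroke error signal (1.0 = error, 0.0 = correct).
    // Data persisted before this change lacks the field; serde fills
    // in the neutral 0.5 prior on load.
    #[serde(default = "default_error_rate_ema")]
    pub error_rate_ema: f64,
    // ...existing timing fields (filtered_time_ms, etc.) unchanged
}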

update_stat() (line 67):

  • After existing error_count increment, add EMA update:
    let error_signal = if correct { 0.0 } else { 1.0 };
    if stat.sample_count == 1 {
        stat.error_rate_ema = error_signal;
    } else {
        stat.error_rate_ema = EMA_ALPHA * error_signal + (1.0 - EMA_ALPHA) * stat.error_rate_ema;
    }
    
  • Remove recent_correct push/trim logic (lines 89-92)
  • Keep error_count and sample_count (needed for gating thresholds and display)

smoothed_error_rate_raw() (line 95): Remove. Once smoothed_error_rate() on both BigramStatsStore and TrigramStatsStore switches to error_rate_ema, this function has no callers.

BigramStatsStore::smoothed_error_rate() (line 120): Change to return stat.error_rate_ema instead of smoothed_error_rate_raw(stat.error_count, stat.sample_count).

TrigramStatsStore::smoothed_error_rate() (line 333): Same change — return stat.error_rate_ema.

error_anomaly_ratio() (line 123): No changes needed — it calls self.smoothed_error_rate() and char_stats.smoothed_error_rate(), which now both return EMA values.

Default for NgramStat (line 50): Set error_rate_ema: 0.5 (neutral — same as Laplace (0+1)/(0+2)).

src/engine/key_stats.rs

KeyStat struct (line 7):

  • Add error_rate_ema: f64 with #[serde(default = "default_error_rate_ema")] and default value 0.5
  • Add fn default_error_rate_ema() -> f64 { 0.5 } helper
  • Note: KeyStat IS persisted to disk. The #[serde(default)] ensures backward compat — existing data without the field gets 0.5.

update_key() (line 50) — called for correct strokes:

  • Add EMA update: stat.error_rate_ema = if stat.total_count == 1 { 0.0 } else { EMA_ALPHA * 0.0 + (1.0 - EMA_ALPHA) * stat.error_rate_ema }
  • Use total_count (already incremented on the line before) to detect first sample

update_key_error() (line 83) — called for error strokes:

  • Add EMA update: stat.error_rate_ema = if stat.total_count == 1 { 1.0 } else { EMA_ALPHA * 1.0 + (1.0 - EMA_ALPHA) * stat.error_rate_ema }

smoothed_error_rate() (line 90): Change to return stat.error_rate_ema (or 0.5 for missing keys).
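
Both call sites apply the same formula with a different signal, so a shared helper keeps them symmetric. A sketch (apply_error_ema is a hypothetical name, not an existing function; it assumes total_count has already been incremented, per the bullets above):

const EMA_ALPHA: f64 = 0.1;

// Fold one stroke into the EMA. error_signal is 1.0 for an error stroke,
// 0.0 for a correct one. The first-ever sample seeds the EMA directly
// instead of blending with the 0.5 prior.
fn apply_error_ema(stat: &mut KeyStat, error_signal: f64) {
    stat.error_rate_ema = if stat.total_count == 1 {
        error_signal
    } else {
        EMA_ALPHA * error_signal + (1.0 - EMA_ALPHA) * stat.error_rate_ema
    };
}

// update_key():       apply_error_ema(stat, 0.0);
// update_key_error(): apply_error_ema(stat, 1.0);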

src/app.rs

rebuild_ngram_stats() (line 1155):

  • Reset error_rate_ema to 0.5 alongside error_count and total_count for KeyStat stores (lines 1165-1172)
  • NgramStat stores already reset to Default which has error_rate_ema: 0.5
  • The replay loop (line 1177) naturally rebuilds EMA by calling update_stat() and update_key()/update_key_error() in order

No other app.rs changes needed — the streak update and focus selection code reads through error_anomaly_ratio() which now uses EMA values transparently.


Part B: Integrated Bigram + Char Focus Generation

Approach

Replace the exclusive FocusTarget enum (either char OR bigram) with a FocusSelection struct that carries both independently. The weakest char comes from skill_tree progression; the worst bigram anomaly comes from the anomaly system. Both feed into the PhoneticGenerator simultaneously. Remove apply_bigram_focus() post-processing entirely.

Changes

src/engine/ngram_stats.rs — Focus selection

Replace FocusTarget enum (line 510):

// Old
pub enum FocusTarget { Char(char), Bigram(BigramKey) }

// New
#[derive(Clone, Debug, PartialEq)]
pub struct FocusSelection {
    pub char_focus: Option<char>,
    pub bigram_focus: Option<(BigramKey, f64, AnomalyType)>,
}

Replace FocusReasoning enum (line 523):

// Old
pub enum FocusReasoning {
    BigramWins { bigram_anomaly_pct: f64, anomaly_type: AnomalyType, char_key: Option<char> },
    CharWins { char_key: char, bigram_best: Option<(BigramKey, f64)> },
    NoBigrams { char_key: char },
    Fallback,
}

// New — reasoning is now just the selection itself (both fields self-describe)
// FocusReasoning is removed; FocusSelection carries all needed info.

Simplify select_focus_target_with_reasoning() → select_focus():

pub fn select_focus(
    skill_tree: &SkillTree,
    scope: DrillScope,
    ranked_key_stats: &KeyStatsStore,
    ranked_bigram_stats: &BigramStatsStore,
) -> FocusSelection {
    let unlocked = skill_tree.unlocked_keys(scope);
    let char_focus = skill_tree.focused_key(scope, ranked_key_stats);
    let bigram_focus = ranked_bigram_stats.worst_confirmed_anomaly(ranked_key_stats, &unlocked);
    FocusSelection { char_focus, bigram_focus }
}

Remove select_focus_target() and select_focus_target_with_reasoning() — replaced by select_focus().

src/generator/mod.rs — Trait update

Update TextGenerator trait (line 14):

pub trait TextGenerator {
    fn generate(
        &mut self,
        filter: &CharFilter,
        focused_char: Option<char>,
        focused_bigram: Option<[char; 2]>,
        word_count: usize,
    ) -> String;
}

src/generator/phonetic.rs — Integrated word selection

generate() method — rewrite word selection with tiered approach:

Note: find_matching(filter, None) is used (not focused_char) because we do our own tiering below. find_matching returns ALL words matching the CharFilter — the focused param only sorts, never filters — but passing None avoids an unnecessary sort we'd discard anyway.

fn generate(
    &mut self,
    filter: &CharFilter,
    focused_char: Option<char>,
    focused_bigram: Option<[char; 2]>,
    word_count: usize,
) -> String {
    let matching_words: Vec<String> = self.dictionary
        .find_matching(filter, None)  // no char-sort; we tier ourselves
        .iter().map(|s| s.to_string()).collect();
    let use_real_words = matching_words.len() >= MIN_REAL_WORDS;

    // Pre-categorize words into tiers for real-word mode
    let bigram_str = focused_bigram.map(|b| format!("{}{}", b[0], b[1]));
    let focus_char_lower = focused_char.filter(|ch| ch.is_ascii_lowercase());

    let (bigram_indices, char_indices, other_indices) = if use_real_words {
        let mut bi = Vec::new();
        let mut ci = Vec::new();
        let mut oi = Vec::new();
        for (i, w) in matching_words.iter().enumerate() {
            if bigram_str.as_ref().is_some_and(|b| w.contains(b.as_str())) {
                bi.push(i);
            } else if focus_char_lower.is_some_and(|ch| w.contains(ch)) {
                ci.push(i);
            } else {
                oi.push(i);
            }
        }
        (bi, ci, oi)
    } else {
        (vec![], vec![], vec![])
    };

    let mut words: Vec<String> = Vec::new();
    let mut recent: Vec<String> = Vec::new(); // anti-repeat window

    for _ in 0..word_count {
        if use_real_words {
            let word = self.pick_tiered_word(
                &matching_words,
                &bigram_indices,
                &char_indices,
                &other_indices,
                &recent,
            );
            recent.push(word.clone());
            if recent.len() > 4 { recent.remove(0); }
            words.push(word);
        } else {
            let word = self.generate_phonetic_word(
                filter, focused_char, focused_bigram,
            );
            words.push(word);
        }
    }
    words.join(" ")
}

New pick_tiered_word() method:

fn pick_tiered_word(
    &mut self,
    all_words: &[String],
    bigram_indices: &[usize],
    char_indices: &[usize],
    other_indices: &[usize],
    recent: &[String],
) -> String {
    // Tier selection probabilities:
    // Both available: 40% bigram, 30% char, 30% other
    // Only bigram:    50% bigram, 50% other
    // Only char:      70% char, 30% other (matches current behavior)
    // Neither:        100% other
    //
    // Try up to 6 times to avoid repeating a recent word.
    for _ in 0..6 {
        let tier = self.select_tier(bigram_indices, char_indices, other_indices);
        let idx = tier[self.rng.gen_range(0..tier.len())];
        let word = &all_words[idx];
        if !recent.contains(word) {
            return word.clone();
        }
    }
    // Fallback: after 6 rejected draws, accept a random word from the
    // full pool even if it repeats a recent one.
    let idx = self.rng.gen_range(0..all_words.len());
    all_words[idx].clone()
}

select_tier() helper: Returns reference to the tier to sample from based on availability and probability roll. Only considers a tier "available" if it has >= 2 words (prevents unavoidable repeats when a tier has just 1 word and the anti-repeat window rejects it). Falls through to the next tier when the selected tier is too small.
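
A sketch of that helper under the probabilities listed in pick_tiered_word() (the >= 2 availability cutoff comes from the paragraph above; self.rng is the generator's existing rand::Rng source):

fn select_tier<'a>(
    &mut self,
    bigram_tier: &'a [usize],
    char_tier: &'a [usize],
    other_tier: &'a [usize],
) -> &'a [usize] {
    // A tier is usable only with >= 2 candidates, so the anti-repeat
    // window can always find an alternative inside it.
    let has_bigram = bigram_tier.len() >= 2;
    let has_char = char_tier.len() >= 2;
    let roll: f64 = self.rng.gen();
    match (has_bigram, has_char) {
        (true, true) if roll < 0.40 => bigram_tier,  // 40% bigram
        (true, true) if roll < 0.70 => char_tier,    // next 30% char
        (true, false) if roll < 0.50 => bigram_tier, // 50% bigram
        (false, true) if roll < 0.70 => char_tier,   // 70% char
        _ if !other_tier.is_empty() => other_tier,   // remainder: other
        _ if has_char => char_tier,                  // fall through when
        _ if has_bigram => bigram_tier,              // "other" is empty
        _ => other_tier, // caller ensures some tier is non-empty
    }
}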

try_generate_word() / generate_phonetic_word() — add bigram awareness for Markov fallback:

  • Accept focused_bigram: Option<[char; 2]> parameter
  • Only attempt bigram forcing when both chars pass the CharFilter (avoids pathological starts when bigram chars are rare/unavailable in current filter scope)
  • When eligible: 30% chance to start word with bigram[0] and force bigram[1] as second char, then continue Markov chain from [' ', bigram[0], bigram[1]] prefix
  • Falls back to the existing focused_char logic otherwise (see the sketch below)
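
A sketch of the forcing step (the 30% roll and the [' ', b0, b1] prefix come from the bullets above; filter.allows() and continue_markov_from() are illustrative names standing in for the existing filter check and chain-walk code, not real APIs):

// Inside generate_phonetic_word(), before the existing focused_char logic:
if let Some([b0, b1]) = focused_bigram {
    // Only force the bigram when both chars are legal under the filter;
    // a rare or locked char would otherwise produce pathological starts.
    if filter.allows(b0) && filter.allows(b1) && self.rng.gen_bool(0.30) {
        let mut word = String::new();
        word.push(b0);
        word.push(b1);
        // Continue the Markov chain as if the word began with b0, b1
        // (i.e., from the [' ', b0, b1] prefix).
        self.continue_markov_from(&[' ', b0, b1], &mut word, filter);
        return word;
    }
}
// ...fall through to the existing focused_char handling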

src/generator/code_syntax.rs + src/generator/passage.rs

Add _focused_bigram: Option<[char; 2]> parameter to their generate() signatures (ignored, matching trait).

src/app.rs — Pipeline update

generate_text() (line 653):

  • Call select_focus() (new function) instead of select_focus_target()
  • Extract focused_char from selection.char_focus (the actual weakest char)
  • Extract focused_bigram from selection.bigram_focus.map(|(k, _, _)| k.0)
  • Pass both to generator.generate(filter, focused_char, focused_bigram, word_count)
  • Remove the apply_bigram_focus() call (lines 784-787)
  • Post-processing passes (capitalize, punctuate, numbers, code_patterns) continue to receive focused_char — this is now the real weakest char, not the bigram's first char

Remove apply_bigram_focus() method (lines 1087-1131) entirely.

Store FocusSelection on App:

  • Add pub current_focus: Option<FocusSelection> field to App (default None)
  • Set in generate_text() right after select_focus() — captures the focus that was actually used to generate the current drill's text
  • Lifecycle: Set when drill starts (in generate_text()). Persists through the drill result screen (so the user sees what was in focus for the drill they just completed). Cleared to None when: starting the next drill (overwritten), leaving drill screen, changing drill scope/mode, or on import/reset. This is a snapshot, not live-recomputed — the header always shows what generated the current text.
  • Used by the drill header display in main.rs (reads app.current_focus instead of re-calling select_focus()); the full wiring is sketched below
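
The resulting wiring in generate_text(), as a sketch (local and field names like scope, filter, word_count, and self.key_stats stand in for the surrounding code; assumes BigramKey's first field is the [char; 2] pair, per the extraction bullet above):

let selection = select_focus(&self.skill_tree, scope, &self.key_stats, &self.bigram_stats);
self.current_focus = Some(selection.clone());

let focused_char = selection.char_focus;
let focused_bigram = selection.bigram_focus.map(|(key, _pct, _kind)| key.0);

let text = generator.generate(&filter, focused_char, focused_bigram, word_count);
// apply_bigram_focus() is gone; later passes (capitalize, punctuate,
// numbers, code_patterns) still receive focused_char.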

src/main.rs — Drill header + stats adapter

Drill header (line 1134):

  • Read app.current_focus to build focus_text (no re-computation — shows what generated the text)
  • Display format: Focus: 'n' + "th" (both), Focus: 'n' (char only), Focus: "th" (bigram only)
  • Replace the current select_focus_target() call with reading the stored selection
  • When current_focus is None, show no focus text (construction sketched below)
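
A sketch of the focus_text construction (formats per the bullet above; assumes BigramKey's pair is reachable as key.0):

let focus_text = match &app.current_focus {
    Some(sel) => match (sel.char_focus, &sel.bigram_focus) {
        (Some(c), Some((key, _, _))) => {
            format!("Focus: '{c}' + \"{}{}\"", key.0[0], key.0[1])
        }
        (Some(c), None) => format!("Focus: '{c}'"),
        (None, Some((key, _, _))) => {
            format!("Focus: \"{}{}\"", key.0[0], key.0[1])
        }
        (None, None) => String::new(), // selection ran but found nothing
    },
    None => String::new(), // no drill text generated yet
};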

build_ngram_tab_data() (line 2253):

  • Call select_focus() instead of select_focus_target_with_reasoning()
  • Update NgramTabData struct: replace focus_target: FocusTarget and focus_reasoning: FocusReasoning with focus: FocusSelection

src/ui/components/stats_dashboard.rs — Focus panel

NgramTabData (line 28):

  • Replace focus_target: FocusTarget and focus_reasoning: FocusReasoning with focus: FocusSelection
  • Remove FocusTarget and FocusReasoning imports

render_ngram_focus() (line 1352):

  • Show both focus targets when both active:
    • Line 1: Focus: Char 'n' + Bigram "th" (or just one if only one active)
    • Line 2: Details — Char 'n': weakest key | Bigram "th": error anomaly 250%
  • When neither active: show fallback message
  • Rendering adapts based on which focuses are present

Files Modified

  1. src/engine/ngram_stats.rs — EMA field on NgramStat, EMA-based smoothed_error_rate, FocusSelection struct, select_focus(), remove old FocusTarget/FocusReasoning
  2. src/engine/key_stats.rs — EMA field on KeyStat, EMA updates in update_key/update_key_error, EMA-based smoothed_error_rate
  3. src/generator/mod.rs — TextGenerator trait: add focused_bigram parameter
  4. src/generator/phonetic.rs — Tiered word selection with bigram+char, anti-repeat window, Markov bigram awareness
  5. src/generator/code_syntax.rs — Add ignored focused_bigram parameter
  6. src/generator/passage.rs — Add ignored focused_bigram parameter
  7. src/app.rs — Use select_focus(), pass both focuses to generator, remove apply_bigram_focus(), store current_focus
  8. src/main.rs — Update drill header, update build_ngram_tab_data() adapter
  9. src/ui/components/stats_dashboard.rs — Update NgramTabData, render_ngram_focus for dual focus display

Test Updates

Part A (EMA)

  • Update test_error_anomaly_bigrams: Set error_rate_ema directly instead of relying on cumulative error_count/sample_count for anomaly ratio computation
  • Update test_worst_confirmed_anomaly_dedup and _prefers_error_on_tie: Same — set EMA values
  • New test_error_rate_ema_decay: Verify that after N correct strokes, error_rate_ema drops as expected, and that the anomaly ratio crosses below the threshold after reasonable recovery (~15 correct strokes from a 30% error rate). Sketched after this list.
  • New test_error_rate_ema_rebuild_from_history: Verify that rebuilding from drill history produces same EMA as live updates (deterministic replay)
  • New test_ema_ranking_stability_during_recovery: Two bigrams both confirmed. Bigram A has higher anomaly. User corrects bigram A over several drills while bigram B stays bad. Verify that A's anomaly drops below B's and B becomes the new worst_confirmed_anomaly — clean handoff without oscillation.
  • Update key_stats tests: Verify EMA updates in update_key() and update_key_error(), backward compat (serde default)
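
A sketch of the decay test (update_stat's signature and the store constructor are assumptions to be adjusted to the real API):

#[test]
fn test_error_rate_ema_decay() {
    let mut stats = BigramStatsStore::default();
    let key = BigramKey(['t', 'h']);

    // Seed a bad bigram: 3 errors at the end of every 10 strokes (~30%).
    for i in 0..30 {
        let correct = i % 10 < 7;
        stats.update_stat(key, correct);
    }
    let before = stats.smoothed_error_rate(key);
    assert!(before > 0.2, "seeded EMA should reflect errors: {before}");

    // Recover with 15 all-correct strokes; the EMA must drop sharply.
    for _ in 0..15 {
        stats.update_stat(key, true);
    }
    let after = stats.smoothed_error_rate(key);
    assert!(after < before * 0.3, "EMA should decay: {before} -> {after}");
}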

Part B (Integrated focus)

  • Replace focus reasoning tests (test_select_focus_with_reasoning_*): Replace with test_select_focus_* testing FocusSelection struct — verify both char_focus and bigram_focus are populated independently
  • New test_phonetic_bigram_focus_increases_bigram_words: Generate 1200 words with focused_bigram, verify significantly more words contain the bigram than without
  • New test_phonetic_dual_focus_no_excessive_repeats: Generate text with both focuses, verify no word appears > 3 times consecutively
  • Update build_ngram_tab_data_maps_fields_correctly: Update for FocusSelection struct instead of FocusTarget/FocusReasoning
  • New test_find_matching_focused_is_sort_only (in dictionary.rs or phonetic.rs): Verify that find_matching(filter, Some('k')) and find_matching(filter, None) return the same set of words (same membership, potentially different order). Guards against future regressions where focused param accidentally becomes a filter.
  • No apply_bigram_focus tests exist to remove (method was untested)

Verification

  1. cargo build — no compile errors
  2. cargo test — all tests pass
  3. Manual: Start adaptive drill, observe both char and bigram appearing in focus header
  4. Manual: Verify drill text contains focused bigram words AND focused char words mixed naturally
  5. Manual: Verify no excessive word repetition (the old apply_bigram_focus problem)
  6. Manual: Practice a bigram focus target correctly for 2-3 drills → verify it drops out of focus and a different bigram (or char-only) takes over
  7. Manual: N-grams tab shows both focuses in the Active Focus panel
  8. Manual: Narrow terminal (<60 cols) stacks anomaly panels vertically; very short terminal (<10 rows available for panels) shows only error anomalies panel; focus panel always shows at least line 1