claude-adam

13 Commits

Author	SHA1	Message	Date
lukaszraczylo	7ed2aecdfa	docs(logo): swap to swaddled-baby design with hands Replaces the geometric-A-with-observation-dot with a softer, more on-theme design: a swaddled-baby silhouette (rounded A-shape bundle), face nestled inside, and the wrap-band extended past the bundle on both sides as little hands. Maintains currentColor + zero external assets; reads cleanly down to favicon size. Ties the visual identity to the 'Story behind Adam' section: the project is named after the author's son, and now the logo is too.	2026-05-13 02:02:02 +01:00
lukaszraczylo	a30f8b1158	docs: replace ASCII pipeline diagram with mermaid flowchart GitHub renders mermaid natively. Diagram now shows three subgraphs (Observation → Analysis → Review + apply) with a nested Pre-processors subgraph inside Analysis. Includes: - Dotted edge labeled 'user runs /reflect' marking the observe→analyze boundary. - Diamond gate node for auto-apply decision (conf≥4 · low blast · cooldown cool) with explicit yes/no branches. - Feedback loop: applied/ entries measure back into adam-ab-measure.mjs on subsequent reflects. - Color-coded classDef for stores (blue), processes (orange), and the clustering trace artifact (purple). ASCII art retired — diagram now legible at any zoom on github.com.	2026-05-13 01:54:38 +01:00
lukaszraczylo	d3e4350d71	docs: modernize README + add SVG logo + inspiration story - New 'Story behind Adam' section at the top: the project is named after the author's newborn son, whose observe-act-adjust-observe-again learning loop is the methodology ADAM applies to LLM sessions. - New SVG logo at assets/logo.svg: stylized 'A' with a captured observation point inside the apex and a feedback crossbar. Uses currentColor + gradient so it adapts to light/dark GitHub themes. - Centered header block with project tagline + 5 badges (License, Version, Tests, Node, Platform). - New 'Highlights' section: 8 emoji-tagged one-liners covering the v0.3.3 design pillars (zero LLM cost observation, A/B measurement, sliding windows, observability, etc.). - New 'How it works' ASCII pipeline diagram: observation -> analysis pre-processors -> analyst -> review + apply. - Signals table now includes per-signal sliding window column. - Rubric section restructured: gates, modifiers (dampener), and skill_edit-specific requirements clearly separated. - New 'Inspecting the analyst's reasoning' section documenting adam-explain.mjs + /reflect --explain. - Layout updated for v0.3.3 state files (active-nudges.json, ab-tracking.jsonl, reinforcements.jsonl, last-trace.txt) and all 9 new helper scripts under adam/scripts/. - Test count: 27 -> 87. - Closing line crediting Adam.	2026-05-13 01:50:59 +01:00
lukaszraczylo	871592a75b	Merge branch 'adam-v0.3.3-fixes' (v0.3.3) v0.3.3	2026-05-13 01:02:40 +01:00
lukaszraczylo	012c40b9ab	chore(v0.3.3): analyst observability, A/B measurement, journal hygiene Storage/window/exclusion split (#7): ISO-week journal rotation with safety fuse replaces size-based rotation (fixes silent under-counting when clusters straddle boundaries). Per-signal sliding windows via adam-window.mjs guard against stale signal accumulation. Legacy YYYY-MM-DD-<ts>.jsonl files remain readable. Error fingerprint normalization (#3): adam-observe.mjs extracts canonical error codes (ENOENT, ECONNREFUSED, etc.) and normalizes paths/timestamps/hex before hashing. 'Connection refused' and 'ECONNREFUSED' now cluster identically. Correction corpus expansion (#1): strong tokens (stop, wrong, undo, try again, different approach, etc.) fire on any occurrence. Weak tokens (no, actually, wait) require negation/contrast co-occurrence within 8 tokens. Kills the 'actually, I think...' false positive. Analyst observability (#6): mandatory clustering trace block; adam-explain.mjs parses to summary/full/json. Cluster decisions now surface rejection reasons (threshold, contradiction, window). Persisted to ~/.claude/adam/last-trace.txt. Dead_end nudge proposal type (#2): single-session auto-apply gate (>=3 dead_end events). Action appends to active-nudges.json, surfaced via adam-nudge.mjs at next SessionStart. Lower blast than skill_edit. Per-(skill, fingerprint) cooldown (#4): adam-cooldown.mjs replaces coarse per-skill check. proposal_fingerprint = djb2(skill_slug + cluster_id + normalized_diff_body). Legacy applied/rejected records gate via 'legacy' fingerprint fallback through resolveSkill helper (handles target_skill, skill, or target: <path>). task_completed scoring integration (#8): adam-score.mjs computes per-session urgency dampener (3 task_completed -> 0.5) and reinforcement candidates (skills cited in >=3 clean completions). New 'reinforcement' proposal type appends to reinforcements.jsonl on apply (no code/memory mutation). 
A/B effectiveness measurement (#5): every auto-applied edit appends to ab-tracking.jsonl. adam-ab-measure.mjs computes 7d pre/post signal-count delta per entry (improved / neutral / regressed / no_baseline / pending). Analyst surfaces regressions at top of /reflect output. Upgrade UX overhaul (#9): adam-upgrade.mjs implements --list/--diff/--accept/--accept-all. SessionStart nudge prints pending-merge warning when .adam-new files exist (latency ~20ms via fixed shortlist). install.sh emits unmissable final-message hint after creating any .adam-new file. Simplify pass: adam-utils.mjs deduplicates readJsonlSafe / listJsonlFiles / parseFrontmatter across 8 scripts. Net -46 LOC. Test coverage: 30 -> 87 tests. Every new feature has feature-validating assertions (false-case coverage included). T77 statically verifies install.sh references every adam-*.mjs source script (would have caught the missing adam-utils inclusion that review #2 surfaced).	2026-05-13 01:02:33 +01:00
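The per-(skill, fingerprint) cooldown in this commit defines proposal_fingerprint as djb2(skill_slug + cluster_id + normalized_diff_body). The following is a minimal sketch of that idea using the classic djb2 string hash (seed 5381, multiply by 33, add the character code). The helper names `normalizeDiffBody` and `proposalFingerprint`, the normalization rules (trim trailing whitespace, drop blank lines), and the hex output are assumptions for illustration, not the code in adam-cooldown.mjs.

```javascript
// Classic djb2: hash = hash * 33 + charCode, seeded with 5381, kept to 32 bits.
function djb2(str) {
  let hash = 5381;
  for (let i = 0; i < str.length; i++) {
    // Math.imul keeps the multiplication in 32-bit integer space.
    hash = (Math.imul(hash, 33) + str.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// ASSUMPTION: "normalized" means trailing whitespace and blank lines are ignored.
function normalizeDiffBody(diff) {
  return diff
    .split("\n")
    .map((line) => line.trimEnd())
    .filter((line) => line.length > 0)
    .join("\n");
}

// Hypothetical helper: plain concatenation, as written in the commit message.
function proposalFingerprint(skillSlug, clusterId, diffBody) {
  return djb2(skillSlug + clusterId + normalizeDiffBody(diffBody)).toString(16);
}
```

With this shape, two proposals that differ only in whitespace share a fingerprint and therefore a cooldown, while the same diff aimed at a different skill does not.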
lukaszraczylo	7ddda26bb4	feat: task_completed signal — post-task skill capture (v0.3.2) Adds an 11th signal type emitted when a run of work (between two UserPromptSubmit events) crosses three quality gates: - >=5 tool calls (TASK_TOOL_MIN) - >=3 distinct tool kinds (TASK_DIVERSITY_MIN, filters single-tool sweeps like "wrote 5 files") - 0 correction signals during the run (filters tasks where the user pushed back; correction-during-task disqualifies the recipe) Payload carries tool_count, tool_kinds, active_skills, active_agents so the agent can cluster by sorted tool-kind tuple and route through the existing skill-overlap rule (skill_new vs skill_edit). Importantly: cross_session_evidence is FALSE on first occurrence, so resulting skill_new proposals always queue for review — they only auto-apply when the same multi-tool recipe recurs in a second session (then the existing rubric kicks in). Post-task creation captures novel patterns while preserving the rule "auto-apply requires cross-session". Hook adds state fields: task_tool_count, task_tool_kinds, task_corrections. All reset on UserPromptSubmit boundary and on session change. Agent gets one new signal-types-table row and one clustering bullet referencing the existing skill-overlap rule. 3 new tests (30 passed, 0 failed): - 5 tools + 5 kinds + 0 corrections fires task_completed - 5 tools + 1 kind (Edit only) does NOT fire (diversity gate) - 5 tools + 3 kinds + correction-on-closing-prompt does NOT fire v0.3.2	2026-05-10 22:34:33 +01:00
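The three quality gates described in this commit can be sketched as a single predicate. The constant names TASK_TOOL_MIN and TASK_DIVERSITY_MIN and the state field names come from the commit message; the function name `shouldEmitTaskCompleted` and the assumption that task_tool_kinds is an array of tool-kind strings are hypothetical.

```javascript
const TASK_TOOL_MIN = 5;      // minimum tool calls in the run
const TASK_DIVERSITY_MIN = 3; // minimum distinct tool kinds in the run

// ASSUMPTION: `state` holds the counters accumulated since the last
// UserPromptSubmit, with task_tool_kinds as an array (duplicates allowed).
function shouldEmitTaskCompleted(state) {
  const distinctKinds = new Set(state.task_tool_kinds).size;
  return (
    state.task_tool_count >= TASK_TOOL_MIN &&
    distinctKinds >= TASK_DIVERSITY_MIN &&
    state.task_corrections === 0 // any correction during the run disqualifies it
  );
}
```

The three cases below mirror the tests listed in the commit:

```javascript
shouldEmitTaskCompleted({
  task_tool_count: 5,
  task_tool_kinds: ["Read", "Edit", "Bash", "Grep", "Write"],
  task_corrections: 0,
}); // true

shouldEmitTaskCompleted({
  task_tool_count: 5,
  task_tool_kinds: ["Edit", "Edit", "Edit", "Edit", "Edit"],
  task_corrections: 0,
}); // false: diversity gate

shouldEmitTaskCompleted({
  task_tool_count: 5,
  task_tool_kinds: ["Read", "Edit", "Bash", "Edit", "Read"],
  task_corrections: 1,
}); // false: correction during the run
```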
lukaszraczylo	6d8ff37cb2	v0.3.1: code review pass + DX overhaul Bug fixes (HIGH): - adam-observe.mjs: errorFingerprint no longer false-positives when toolResponse.is_error === false; ERROR_RE only used as fallback when is_error is undefined. - adam-observe.mjs: resetSessionLocal now clears tool_window so retry_loop cannot fire on the first tool of a new session by matching prior session. - adam-archive.mjs: ts dedup uses Map<ts, count> instead of Set<ts>; two journal entries sharing a millisecond are no longer both archived when only one is referenced in source_entries. - adam-nudge.mjs: only counts proposal filenames matching /^\d{4}-\d{2}-\d{3}-/ pattern; README/notes in proposals/ no longer bump. - skills/adam-self-improvement/SKILL.md: contradiction_flag veto now applied at apply time (carry-over from earlier review). Test isolation: - adam/tests/run-tests.sh: ALWAYS runs against an isolated $HOME under mktemp -d. Previously truncated live ~/.claude/adam/journal.jsonl on every run, which was destructive on production state. Conciseness: - agents/adam.md: -19 LOC (cuts: vestigial cursor sentence, duplicate not-do bullets, blast-radius bullet collapse, Inputs paths delegate to SKILL.md, win-cluster-vs-struggle-cluster commentary already enforced by cluster-key separation, # Overlap section spec compressed). - skills/adam-self-improvement/SKILL.md: -4 LOC (framing paragraph, dead catch-all bullet for non-eligible types). Auto-prune script DELETED: - The cumulative-count primitive cannot distinguish "never used" from "used before tracking began"; mtime gate is meaningless for installed files. Auto-prune deferred to v0.4 with a per-key lastSeen schema. Cross-platform: - macOS (BSD coreutils) and Linux (Alpine, glibc + musl) verified. - All scripts use portable forms (stat -f || stat -c, mktemp -d -t). - README documents platform support explicitly. 
DX overhaul: - install.sh: hardened; supports `curl | bash` via auto-clone, --version=vX.Y.Z pinning, --yes / --dry-run flags, jq-based settings.json merge with diff prompt and backup, conservative file copy that detects local mtime drift and writes <file>.adam-new instead of clobbering, idempotent across re-runs. - adam-uninstall.sh: NEW. Soft-archives ~/.claude/adam/ to .bak.<ts>/ by default; --purge to delete; --yes for non-interactive; jq-based settings.json cleanup with diff prompt. - README.md: curl one-liner install + version-pinned variant at top, What's New section through v0.3.1, upgrade-safe data files callout, uninstaller documentation, platform support note, expanded rubric showing skill_edit gate. Test count: 27 passed, 0 failed (was 27; no regression). v0.3.1	2026-05-10 21:33:17 +01:00
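The adam-archive.mjs fix in this commit (Map<ts, count> instead of Set<ts>) can be illustrated as follows. With a Set, every journal entry whose timestamp appears in source_entries is archived, so two entries sharing a millisecond both move even if only one was referenced. Counting references avoids that. This is a sketch under assumptions: the function name `partitionJournal` and the `{ ts }` entry shape are hypothetical, and when timestamps collide it simply archives the first match.

```javascript
// Split journal entries into those to archive and those to keep, consuming
// one reference from source_entries per archived entry.
function partitionJournal(entries, sourceEntries) {
  // Count how many times each timestamp is referenced.
  const remaining = new Map();
  for (const ts of sourceEntries) {
    remaining.set(ts, (remaining.get(ts) ?? 0) + 1);
  }

  const archived = [];
  const kept = [];
  for (const entry of entries) {
    const left = remaining.get(entry.ts) ?? 0;
    if (left > 0) {
      archived.push(entry);
      remaining.set(entry.ts, left - 1); // this reference is now used up
    } else {
      kept.push(entry);
    }
  }
  return { archived, kept };
}
```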
lukaszraczylo	780401e96a	feat: causal diagnosis step on every proposal (v0.3.0) Closes the gap between categorical signal capture (we saw 3 retries) and causal proposal drafting (here is why and what to do). Mirrors the NL trace reflection step Hermes Agent uses before mutating prompts. Adds # Diagnosis section to every proposal body — four labelled lines: - Trigger: what the user wanted / context - Action: what the assistant did - Mismatch: how the action diverged - Outcome: surfacing event with >=1 verbatim transcript quote Constraints: - <=5 LOC of prose total - >=1 backtick-wrapped quote <=80 chars from transcript context window - Cannot speculate; "Mismatch: unclear" is allowed but takes -1 confidence - Win clusters use "Mismatch: None" with recovery quote in Outcome Skill enforces structure at apply time (presence + 4 labelled lines + quote) for both auto-apply and walk-the-queue paths. No semantic check — humans judge causal correctness during walk-the-queue. Adds optional frontmatter field `diagnosis_summary` (<=120 chars from the Mismatch line) so applied/ and rejected/ are searchable by causal pattern. New rubric penalty: -1 confidence when Diagnosis flags Mismatch: unclear. Stops weak-causation proposals from auto-applying (drops below conf>=4). No hook changes. All 27 tests still pass. Spec: ~/.claude/docs/superpowers/specs/2026-05-10-adam-causal-diagnosis-design.md v0.3.0	2026-05-10 21:02:36 +01:00
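The structural check described in this commit (section present, four labelled lines, at least one backtick-wrapped quote of at most 80 characters) could look roughly like the sketch below. The function name `checkDiagnosis`, the return shape, and the rule that the section ends at the next `# ` heading are assumptions. As the commit notes, nothing here judges whether the diagnosis is causally correct.

```javascript
const DIAGNOSIS_LABELS = ["Trigger", "Action", "Mismatch", "Outcome"];

// Structure-only validation of a proposal body's "# Diagnosis" section.
function checkDiagnosis(body) {
  const lines = body.split("\n");
  const start = lines.findIndex((line) => line.trim() === "# Diagnosis");
  if (start === -1) return { ok: false, reason: "missing # Diagnosis section" };

  // ASSUMPTION: the section runs until the next top-level heading.
  const rest = lines.slice(start + 1);
  const end = rest.findIndex((line) => line.startsWith("# "));
  const section = end === -1 ? rest : rest.slice(0, end);

  for (const label of DIAGNOSIS_LABELS) {
    // Allow an optional list marker ("- " or "* ") before the label.
    const found = section.some((line) =>
      line.replace(/^[-*\s]+/, "").startsWith(label + ":")
    );
    if (!found) return { ok: false, reason: `missing ${label} line` };
  }

  // At least one backtick-wrapped quote whose content is <= 80 characters.
  const quotes = section.join("\n").match(/`[^`\n]+`/g) ?? [];
  if (!quotes.some((quote) => quote.length - 2 <= 80)) {
    return { ok: false, reason: "no backtick-wrapped quote of <= 80 chars" };
  }
  return { ok: true, reason: null };
}
```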
lukaszraczylo	2dc76bf203	feat: lessons-learned loop — win signals + skill_edit auto-apply Adds two new hook signal types: - correction_free_streak: 5 consecutive UserPromptSubmits without a correction phrase - clean_recovery: 3 clean PostToolUse events after a struggle signal (tool_error_loop / dead_end / retry_loop) Both carry active_skills/active_agents payloads computed from a 10-event activity ring, so ADAM can attribute wins to whichever skill was active during the streak/recovery. Promotes skill_edit to auto-apply under a strict gate (all required): - conf >= 4 + cross-session evidence (existing rules) - # Why cites a win-signal entry whose active_skills includes target - diff append-only, +lines <= 30 - resulting SKILL.md size <= 2x current size - 7-day cooldown per target (last_auto_edit in applied/ frontmatter) - 30-day blacklist on user rejection (auto_apply_blacklist in rejected/) Skill enforces the gate at apply time as defense in depth: re-stats target, re-checks cooldown and blacklist, verifies append-only, reverts and refuses on byte-cap breach. User-rejected skill_edit proposals automatically write auto_apply_blacklist: true. Win signals participate in the existing v0.2.0 source_entries archive lifecycle, so already-applied evidence does not re-cluster. Test suite: +5 cases (5 new asserts pass), 27 total passing. Spec: ~/.claude/docs/superpowers/specs/2026-05-10-adam-proactive-design.md Plan: ~/.claude/docs/superpowers/plans/2026-05-10-adam-proactive.md v0.2.1	2026-05-10 20:51:12 +01:00
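The strict skill_edit gate listed in this commit can be sketched as one function that collects every failed condition. The thresholds (conf >= 4, 30 added lines, 2x size, 7-day cooldown, 30-day blacklist) are from the commit message; the function name `skillEditGate`, all field names, and the reading of "append-only" as "the new text starts with the current text" are assumptions for illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns { autoApply, reasons }; an empty reasons list means every gate passed.
// ASSUMPTION: currentText ends with a newline, so added lines are counted
// from the appended suffix alone.
function skillEditGate(p, nowMs) {
  const reasons = [];
  if (p.confidence < 4) reasons.push("conf < 4");
  if (!p.crossSessionEvidence) reasons.push("no cross-session evidence");
  if (!p.citesWinSignal) reasons.push("# Why does not cite a win signal for target");

  if (!p.proposedText.startsWith(p.currentText)) {
    reasons.push("diff is not append-only");
  } else {
    const suffix = p.proposedText.slice(p.currentText.length);
    const added = suffix === "" ? 0 : suffix.replace(/\n$/, "").split("\n").length;
    if (added > 30) reasons.push("more than 30 added lines");
  }

  const currentBytes = Buffer.byteLength(p.currentText, "utf8");
  const proposedBytes = Buffer.byteLength(p.proposedText, "utf8");
  if (proposedBytes > 2 * currentBytes) reasons.push("result exceeds 2x current size");

  if (p.lastAutoEditMs != null && nowMs - p.lastAutoEditMs < 7 * DAY_MS) {
    reasons.push("within 7-day cooldown");
  }
  if (p.lastRejectedMs != null && nowMs - p.lastRejectedMs < 30 * DAY_MS) {
    reasons.push("within 30-day blacklist");
  }
  return { autoApply: reasons.length === 0, reasons };
}
```

Returning the full list of reasons rather than the first failure is a design choice of this sketch; it makes a refused auto-apply easier to explain when the proposal falls back to the review queue.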
lukaszraczylo	7962e85578	v0.2.0: drop cursor, add source_entries lifecycle, mandate memory frontmatter Lifecycle redesign: - Each proposal records source_entries: [<ts>...] in frontmatter listing the journal timestamps that fed its cluster. - After apply/reject, skill calls adam/scripts/adam-archive.mjs which moves matching entries from journal.jsonl to journal/actioned-<id>.jsonl. - Agent reads applied/ + rejected/ frontmatter on each /reflect, builds an excluded-timestamps set, skips any leftover already-actioned entries. - cursor field in state.json is vestigial; agent ignores it. Effect: journal stays bounded by active observations. Rule changes re-evaluate the remainder without manual rewind. Race-safer for parallel sessions on shared state.json (no cursor write contention). Memory drafting: - agents/adam.md adds 'Memory drafting protocol' parallel to Skill drafting. - Memory proposals MUST contain auto-memory frontmatter (name, description, type, originSessionId) in '# Proposed change' body. - Skill enforces frontmatter check at apply time; refuses if missing. Tests: 18 -> 21. Two new tests for adam-archive happy path + no-op. Migration: existing applied proposals lack source_entries. Their backing journal entries archived as a one-time bulk migration; legacy proposals annotated with migration note. v0.2.0	2026-05-10 04:29:49 +01:00
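The agent-side half of this lifecycle (build an excluded-timestamps set from applied/ and rejected/ frontmatter, then skip any journal entries already actioned) reduces to a small filter. The function name `activeEntries` and the object shapes are hypothetical; the sketch assumes the frontmatter has already been parsed into objects and treats legacy proposals without source_entries as contributing nothing.

```javascript
// Keep only journal entries that no applied or rejected proposal has claimed.
function activeEntries(journal, actionedProposals) {
  const excluded = new Set(
    // Legacy proposals may lack source_entries; treat that as an empty list.
    actionedProposals.flatMap((proposal) => proposal.source_entries ?? [])
  );
  return journal.filter((entry) => !excluded.has(entry.ts));
}
```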
lukaszraczylo	2b91db6bf3	rubric: lower single-session struggle threshold to >=1 entry The hook emits struggle signals only after crossing internal thresholds (3 retries, 8 tools no-prompt, 4 edits to one file, 2 build failures, etc.). Each journal entry is therefore meaningful evidence on its own. Old rule required >=3 entries within single session, which the once-per-thing emission design rarely produces. New rule: >=1 struggle entry qualifies for proposal at +2 weight (cross-session bonus does not stack). Auto-apply still requires cross_session_evidence; single-session-only proposals always queue for review. v0.1.1	2026-05-10 03:08:02 +01:00
lukaszraczylo	8ccd4aa4a3	Add MIT LICENSE v0.1.0	2026-05-10 02:41:58 +01:00
lukaszraczylo	78bf0f1e1e	Initial commit: ADAM self-improvement layer for Claude Code - 8 friction signals via lightweight hook (correction, retry_loop, weak_agent, tool_error_loop, dead_end, edit_churn, build_loop, subagent_dispatch_pattern) - Deterministic confidence rubric with cross-session evidence gate - /reflect skill to dispatch the analyst subagent and walk the queue - Skill overlap detection (prefer skill_edit over skill_new on collision) - Solution synthesis from transcript context for new skill drafts - Soft-delete trash, never hard rm - 18 tests covering all signals	2026-05-10 02:32:13 +01:00