docs: update README for v0.5.0 release — MOSS-grounded improvements

2026-07-22 05:08:38 +00:00 · 2026-05-24 11:19:15 +01:00
parent 440fb52eb1
commit 2d9257922f
1 changed files with 5 additions and 3 deletions
@@ -13,7 +13,7 @@ Watches the friction in your coding sessions, clusters the signals via an LLM an

 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
 [![Version](https://img.shields.io/github/v/release/lukaszraczylo/claude-adam?label=version&color=blue)](https://github.com/lukaszraczylo/claude-adam/releases)
-[![Tests](https://img.shields.io/badge/tests-87%20passing-brightgreen.svg)](./adam/tests/run-tests.sh)
+[![Tests](https://img.shields.io/badge/tests-114%20passing-brightgreen.svg)](./adam/tests/run-tests.sh)
 [![Node](https://img.shields.io/badge/node-22%2B-339933.svg)](https://nodejs.org)
 [![Platform](https://img.shields.io/badge/platform-macOS%20%7C%20Linux-lightgrey.svg)]()

@@ -63,8 +63,8 @@ bash ~/.claude/adam/tests/run-tests.sh   # expect: 87 passed, 0 failed
 Pin a release for reproducibility:

 ```sh
-curl -fsSL https://raw.githubusercontent.com/lukaszraczylo/claude-adam/v0.3.3/install.sh \
-  | VERSION=v0.3.3 bash
+curl -fsSL https://raw.githubusercontent.com/lukaszraczylo/claude-adam/v0.5.0/install.sh \
+  | VERSION=v0.5.0 bash
 ```

 ## How it works
@@ -252,6 +252,8 @@ Or pass `--explain` to `/reflect` to render the full trace inline.

 ## What's new

+- **v0.5.0** — MOSS-grounded self-evolution (arXiv 2605.22794). Transcript capture: `context_window` field on struggle signals captures 8 surrounding events for evidence-based diagnosis. Two-stage analysis pipeline: diagnose+plan → inter-stage validation → implement (§3.3). Evidence batching via `adam-batch.mjs`: pre-clusters journal into coherent failure batches (§3.1). Pre-apply verification: 4-check deterministic gate before auto-apply (§3.4). Auto-rollback via `adam-rollback.mjs`: reverts regressed proposals detected by A/B measurement, creates regression nudges (§3.5). Harness self-modification: new `harness_edit` proposal type lets ADAM propose edits to its own scripts with test-suite-gated apply (§1 Table 1). Keypoint matrix: 5 capability dimensions scored per batch for structured evaluation (§4.2). 114 tests (up from 94).
+- **v0.4.0** — expanded struggle detection: `silent_drift` (5 consecutive read-only tools), `error_after_recovery` (same error fingerprint returns after clean recovery); severity-sum scoring with per-type divisors; extended `STRUGGLE_TYPES` set. 94 tests (up from 87).
 - **v0.3.3** — analyst observability, A/B measurement, journal hygiene. ISO-week journal rotation replaces 5MB size-based (fixes silent cluster-straddling under-count); per-signal sliding windows via `adam-window.mjs`; error fingerprint normalisation; correction corpus expanded + weak-token co-occurrence requirement (kills the `"actually, I think..."` false positive); mandatory clustering trace + `adam-explain.mjs`; new `nudge` and `reinforcement` proposal types; per-(skill, fingerprint) cooldown via `adam-cooldown.mjs`; `task_completed` scoring (dampener + reinforcement); A/B effectiveness measurement; upgrade UX overhaul (`adam-upgrade.mjs --list/--diff/--accept`); shared `adam-utils.mjs`. 87 tests (up from 30).
 - **v0.3.2** — `task_completed` signal: post-task skill capture for downstream reinforcement scoring (consumed in v0.3.3).
 - **v0.3.1** — code review pass: bug fixes (`errorFingerprint` no longer false-positives on `is_error: false`, archive script handles same-millisecond duplicates correctly, `tool_window` clears on session change, nudge filters proposal filenames by pattern), prose conciseness cuts, hardened `install.sh` with curl one-liner + settings.json merge, `adam-uninstall.sh`, isolated test harness.