mirror of
https://github.com/lukaszraczylo/claude-adam.git
synced 2026-06-22 02:01:44 +00:00
Compare commits
2 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| fcddb6bf79 | |||
| d929101af4 |
@@ -13,7 +13,7 @@ Watches the friction in your coding sessions, clusters the signals via an LLM an
|
||||
|
||||
[](LICENSE)
|
||||
[](https://github.com/lukaszraczylo/claude-adam/releases)
|
||||
[](./adam/tests/run-tests.sh)
|
||||
[](./adam/tests/run-tests.sh)
|
||||
[](https://nodejs.org)
|
||||
[]()
|
||||
|
||||
@@ -54,7 +54,7 @@ The installer copies files into `~/.claude/`, offers to merge ADAM's hook entrie
|
||||
Then:
|
||||
|
||||
```sh
|
||||
bash ~/.claude/adam/tests/run-tests.sh # expect: 132 passed, 0 failed
|
||||
bash ~/.claude/adam/tests/run-tests.sh # expect: 138 passed, 0 failed
|
||||
# … start a fresh Claude Code session …
|
||||
/reflect # walks the proposal queue
|
||||
/reflect --explain # also shows the analyst's clustering trace
|
||||
@@ -63,10 +63,27 @@ bash ~/.claude/adam/tests/run-tests.sh # expect: 132 passed, 0 failed
|
||||
Pin a release for reproducibility:
|
||||
|
||||
```sh
|
||||
curl -fsSL https://raw.githubusercontent.com/lukaszraczylo/claude-adam/v0.6.0/install.sh \
|
||||
| VERSION=v0.6.0 bash
|
||||
curl -fsSL https://raw.githubusercontent.com/lukaszraczylo/claude-adam/v0.6.3/install.sh \
|
||||
| VERSION=v0.6.3 bash
|
||||
```
|
||||
|
||||
### Staying up to date
|
||||
|
||||
`install.sh` records the installed release in `~/.claude/adam/.version`. The
|
||||
SessionStart hook (`adam-nudge.mjs`) then checks the latest GitHub release **at
|
||||
most once a day** (cached in `~/.claude/adam/.update-check.json`, network call
|
||||
hard-capped at 1.5 s, fully best-effort — it never blocks or slows session
|
||||
start). When a newer release exists it prints a one-line, **notify-only** prompt:
|
||||
|
||||
```
|
||||
[adam] update available: v0.6.3 → v0.6.4. Apply: curl -fsSL …/install.sh | bash
|
||||
(re-runs install.sh — resets ADAM's own /reflect-applied skill edits; apply when you're ready)
|
||||
```
|
||||
|
||||
It is deliberately **not** auto-applied: re-running `install.sh` overwrites
|
||||
ADAM's own `/reflect`-applied skill edits, so you decide when to take an update.
|
||||
Disable the check entirely with `ADAM_NO_UPDATE_CHECK=1` in your environment.
|
||||
|
||||
## How it works
|
||||
|
||||
```mermaid
|
||||
@@ -248,11 +265,13 @@ Or pass `--explain` to `/reflect` to render the full trace inline.
|
||||
│ ├── adam-apply-reinforcement.mjs # reinforcement proposal apply
|
||||
│ ├── adam-upgrade.mjs # .adam-new file UX (list/diff/accept)
|
||||
│ └── adam-archive.mjs # post-apply journal cleanup
|
||||
└── tests/run-tests.sh # 132 isolated tests; never touches live state
|
||||
└── tests/run-tests.sh # 138 isolated tests; never touches live state
|
||||
```
|
||||
|
||||
## What's new
|
||||
|
||||
- **v0.6.3** — release-update notifier. `install.sh` now writes a `~/.claude/adam/.version` marker; `adam-nudge.mjs` (SessionStart) compares it against the latest GitHub release at most once/day (cached, 1.5 s network cap, best-effort — never blocks) and prints a **notify-only** one-line update prompt. Deliberately not auto-applied: re-running the installer resets ADAM's own `/reflect`-applied skill edits, so you choose when to update. Opt out with `ADAM_NO_UPDATE_CHECK=1`. See "Staying up to date". 138 tests (up from 134).
|
||||
- **v0.6.2** — two fixes surfaced by running ADAM's loop on a large real journal. **(1) A/B volume normalization** (`adam-ab-measure.mjs`): regressions are now measured on the signal's *share* of total activity (rate = count / window-total), not raw count — so a generally busier journal after an apply no longer masquerades as a regression. Falls back to raw delta when the signal is the only activity in the window (preserves prior behavior + tests); output adds `raw_delta_pct`, `pre_total`, `post_total`, `normalized` for transparency. **(2) Memory frontmatter schema** (`agents/adam.md`, `SKILL.md`): the drafting protocol now emits the live auto-memory shape — `name` = slug + a `metadata: {node_type, type, originSessionId}` block — instead of flat `type:`/`originSessionId:`, so auto-applied memories load and categorize correctly. 134 tests (up from 132).
|
||||
- **v0.6.1** — new `file_reread` signal (MOSS §1 harness self-modification, proposed and approved through ADAM's own `/reflect` loop). Consecutive Reads of the same file at different `offset`/`limit` escaped `retry_loop`'s arg-hash dedup and leaked into `tool_error_loop`; `file_reread` now catches them (same file ≥3× in the 10-event window, offset-agnostic, guarded against double-counting byte-identical reads). Fully wired: detection (`adam-observe.mjs`), 14-day window (`adam-window.mjs`), severity divisor 3 (`adam-score.mjs`), file-basename clustering (`adam-batch.mjs`), and the analyst rubric/spec. 132 tests (up from 126).
|
||||
- **v0.6.0** — review hardening. Struggle signals now emit `active_skills`, so `silent_drift`'s primary cluster key and the §5b skill-attribution sub-clustering (+1 rubric bonus) actually fire (both were silently dead). `proposal_fingerprint` is now deterministically computable via `adam-cooldown.mjs --compute` instead of asking the LLM analyst to hand-compute a djb2 hash; spec now mandates a *stable* cluster id so fingerprints reproduce across runs. `reinforcement` proposals are correctly excluded from A/B tracking (the spec previously contradicted itself). `adam-nudge.mjs` pending-upgrade check now mirrors the full install set (`adam-utils`/`adam-batch`/`adam-rollback` were missing). Doc/test-count drift corrected. 126 tests (up from 114).
|
||||
- **v0.5.0** — MOSS-grounded self-evolution (arXiv 2605.22794). Transcript capture: `context_window` field on struggle signals captures 8 surrounding events for evidence-based diagnosis. Two-stage analysis pipeline: diagnose+plan → inter-stage validation → implement (§3.3). Evidence batching via `adam-batch.mjs`: pre-clusters journal into coherent failure batches (§3.1). Pre-apply verification: 4-check deterministic gate before auto-apply (§3.4). Auto-rollback via `adam-rollback.mjs`: reverts regressed proposals detected by A/B measurement, creates regression nudges (§3.5). Harness self-modification: new `harness_edit` proposal type lets ADAM propose edits to its own scripts with test-suite-gated apply (§1 Table 1). Keypoint matrix: 5 capability dimensions scored per batch for structured evaluation (§4.2). 114 tests (up from 94).
|
||||
|
||||
@@ -3,11 +3,19 @@
|
||||
//
|
||||
// Reads ~/.claude/adam/ab-tracking.jsonl (one line per auto-apply event,
|
||||
// written by adam-self-improvement/SKILL.md), then for each entry old enough
|
||||
// (>= --min-age-days; default 7) compares signal counts in the 7-day window
|
||||
// BEFORE applied_at against the 7-day window AFTER applied_at across the
|
||||
// (>= --min-age-days; default 7) compares the originating signal in the 7-day
|
||||
// window BEFORE applied_at against the 7-day window AFTER applied_at across the
|
||||
// full journal corpus (active + rotated). Surfaces regressions so /reflect
|
||||
// can flag proposals that made things worse.
|
||||
//
|
||||
// Volume normalization: when the windows contain other (non-originating)
|
||||
// activity, the delta is computed on the signal's SHARE of total activity
|
||||
// (rate = count / total), not its raw count — so a generally busier journal
|
||||
// after apply does not masquerade as a regression. When the signal is the only
|
||||
// activity in the windows, it falls back to the raw-count delta. Output carries
|
||||
// both `delta_pct` (drives status) and `raw_delta_pct` + `normalized` for
|
||||
// transparency.
|
||||
//
|
||||
// CLI:
|
||||
// adam-ab-measure.mjs [--home <path>] [--format json|table] [--min-age-days N]
|
||||
//
|
||||
@@ -92,31 +100,60 @@ export function computeDeltas(entries, journal, opts = {}) {
|
||||
|
||||
const preStart = appliedAt - windowDays * DAY_MS;
|
||||
const postEnd = appliedAt + windowDays * DAY_MS;
|
||||
// preCount/postCount = originating-signal occurrences; preTotal/postTotal =
|
||||
// ALL journal entries in the window (the activity denominator).
|
||||
let preCount = 0;
|
||||
let postCount = 0;
|
||||
let preTotal = 0;
|
||||
let postTotal = 0;
|
||||
for (const je of journal || []) {
|
||||
if (!je || typeof je !== "object") continue;
|
||||
if (!sigSet.has(je.type)) continue;
|
||||
const t = tsMs(je);
|
||||
if (Number.isNaN(t)) continue;
|
||||
if (t >= preStart && t < appliedAt) preCount++;
|
||||
else if (t >= appliedAt && t < postEnd) postCount++;
|
||||
const inPre = t >= preStart && t < appliedAt;
|
||||
const inPost = t >= appliedAt && t < postEnd;
|
||||
if (!inPre && !inPost) continue;
|
||||
if (inPre) preTotal++; else postTotal++;
|
||||
if (!sigSet.has(je.type)) continue;
|
||||
if (inPre) preCount++; else postCount++;
|
||||
}
|
||||
|
||||
let status;
|
||||
let deltaPct;
|
||||
let rawDeltaPct = null;
|
||||
let normalized = false;
|
||||
if (preCount === 0) {
|
||||
status = "no_baseline";
|
||||
deltaPct = null;
|
||||
} else {
|
||||
deltaPct = ((postCount - preCount) / preCount) * 100;
|
||||
rawDeltaPct = Math.round(((postCount - preCount) / preCount) * 10000) / 100;
|
||||
// Volume normalization: when the windows contain non-originating activity,
|
||||
// compare the signal's SHARE of activity (rate), not its absolute count —
|
||||
// otherwise a generally busier post-window masquerades as a regression.
|
||||
// No background (signal IS the only activity) → fall back to raw delta,
|
||||
// preserving prior behavior.
|
||||
const hasBackground = (preTotal - preCount) + (postTotal - postCount) > 0;
|
||||
if (hasBackground && postTotal > 0) {
|
||||
const preRate = preCount / preTotal; // preTotal >= preCount > 0
|
||||
const postRate = postCount / postTotal;
|
||||
deltaPct = ((postRate - preRate) / preRate) * 100;
|
||||
normalized = true;
|
||||
} else {
|
||||
deltaPct = ((postCount - preCount) / preCount) * 100;
|
||||
}
|
||||
// Round to 2 dp for stable comparison + presentation.
|
||||
deltaPct = Math.round(deltaPct * 100) / 100;
|
||||
if (deltaPct <= IMPROVED_PCT) status = "improved";
|
||||
else if (deltaPct >= REGRESSED_PCT) status = "regressed";
|
||||
else status = "neutral";
|
||||
}
|
||||
out.push({ ...base, pre_count: preCount, post_count: postCount, delta_pct: deltaPct, status });
|
||||
out.push({
|
||||
...base,
|
||||
pre_count: preCount, post_count: postCount,
|
||||
pre_total: preTotal, post_total: postTotal,
|
||||
raw_delta_pct: rawDeltaPct, normalized,
|
||||
delta_pct: deltaPct, status,
|
||||
});
|
||||
}
|
||||
return out;
|
||||
}
|
||||
|
||||
@@ -2014,6 +2014,101 @@ done
|
||||
assert_grep "$ROOT/journal.jsonl" '"type":"retry_loop"' "3x byte-identical reads emit retry_loop"
|
||||
assert_no_grep "$ROOT/journal.jsonl" '"type":"file_reread"' "byte-identical reads NOT double-counted as file_reread (sameToolArgs>=RETRY guard)"
|
||||
|
||||
# --- Test 112: A/B volume normalization — busier journal does NOT fake a regression ---
|
||||
echo "Test 112: A/B volume-normalized (raw +200% but flat share → neutral)"
|
||||
reset_state
|
||||
applied_at_ms=$(node -e 'console.log(Date.now() - 14*86400000)')
|
||||
cat > "$ROOT/ab-tracking.jsonl" <<EOF
|
||||
{"applied_at":$applied_at_ms,"proposal_id":"ab-vol-001","proposal_type":"memory","target_skill":"vol","proposal_fingerprint":"fpV","originating_signals":[{"type":"correction","count":2,"session_ids":["sV"]}],"pre_window_days":7}
|
||||
EOF
|
||||
> "$ROOT/journal.jsonl"
|
||||
# pre window: 2 correction + 8 dead_end (rate 0.2)
|
||||
for i in 1 2; do ts=$(node -e "console.log(new Date(Date.now()-(15+$i*0.2)*86400000).toISOString())"); echo "{\"ts\":\"$ts\",\"session\":\"sV\",\"type\":\"correction\"}" >> "$ROOT/journal.jsonl"; done
|
||||
for i in 1 2 3 4 5 6 7 8; do ts=$(node -e "console.log(new Date(Date.now()-(15+$i*0.1)*86400000).toISOString())"); echo "{\"ts\":\"$ts\",\"session\":\"sV\",\"type\":\"dead_end\"}" >> "$ROOT/journal.jsonl"; done
|
||||
# post window: 6 correction + 24 dead_end (rate 0.2 — share unchanged, raw count +200%)
|
||||
for i in $(seq 1 6); do ts=$(node -e "console.log(new Date(Date.now()-(8+$i*0.1)*86400000).toISOString())"); echo "{\"ts\":\"$ts\",\"session\":\"sV\",\"type\":\"correction\"}" >> "$ROOT/journal.jsonl"; done
|
||||
for i in $(seq 1 24); do ts=$(node -e "console.log(new Date(Date.now()-(8+$i*0.05)*86400000).toISOString())"); echo "{\"ts\":\"$ts\",\"session\":\"sV\",\"type\":\"dead_end\"}" >> "$ROOT/journal.jsonl"; done
|
||||
out=$(ABMEASURE_RUN --format json 2>/dev/null)
|
||||
if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const a=JSON.parse(b);const e=a.find(x=>x.proposal_id==="ab-vol-001");process.exit(e&&e.normalized===true&&e.raw_delta_pct===200&&e.status==="neutral"?0:1)})'; then
|
||||
echo " PASS: volume growth normalized → neutral (raw +200%)"; PASS=$((PASS+1))
|
||||
else
|
||||
echo " FAIL: volume normalization wrong (got: $out)"; FAIL=$((FAIL+1))
|
||||
fi
|
||||
rm -f "$ROOT/ab-tracking.jsonl"
|
||||
|
||||
# --- Test 113: A/B genuine rate regression still flagged ---
|
||||
echo "Test 113: A/B genuine share increase → regressed"
|
||||
reset_state
|
||||
applied_at_ms=$(node -e 'console.log(Date.now() - 14*86400000)')
|
||||
cat > "$ROOT/ab-tracking.jsonl" <<EOF
|
||||
{"applied_at":$applied_at_ms,"proposal_id":"ab-vol-002","proposal_type":"memory","target_skill":"vol2","proposal_fingerprint":"fpV2","originating_signals":[{"type":"correction","count":2,"session_ids":["sV2"]}],"pre_window_days":7}
|
||||
EOF
|
||||
> "$ROOT/journal.jsonl"
|
||||
# pre: 2 correction + 8 dead_end (rate 0.2)
|
||||
for i in 1 2; do ts=$(node -e "console.log(new Date(Date.now()-(15+$i*0.2)*86400000).toISOString())"); echo "{\"ts\":\"$ts\",\"session\":\"sV2\",\"type\":\"correction\"}" >> "$ROOT/journal.jsonl"; done
|
||||
for i in 1 2 3 4 5 6 7 8; do ts=$(node -e "console.log(new Date(Date.now()-(15+$i*0.1)*86400000).toISOString())"); echo "{\"ts\":\"$ts\",\"session\":\"sV2\",\"type\":\"dead_end\"}" >> "$ROOT/journal.jsonl"; done
|
||||
# post: 6 correction + 6 dead_end (rate 0.5 — share up → genuine regression)
|
||||
for i in $(seq 1 6); do ts=$(node -e "console.log(new Date(Date.now()-(8+$i*0.1)*86400000).toISOString())"); echo "{\"ts\":\"$ts\",\"session\":\"sV2\",\"type\":\"correction\"}" >> "$ROOT/journal.jsonl"; done
|
||||
for i in 1 2 3 4 5 6; do ts=$(node -e "console.log(new Date(Date.now()-(8+$i*0.07)*86400000).toISOString())"); echo "{\"ts\":\"$ts\",\"session\":\"sV2\",\"type\":\"dead_end\"}" >> "$ROOT/journal.jsonl"; done
|
||||
out=$(ABMEASURE_RUN --format json 2>/dev/null)
|
||||
if echo "$out" | node -e 'let b="";process.stdin.on("data",d=>b+=d).on("end",()=>{const a=JSON.parse(b);const e=a.find(x=>x.proposal_id==="ab-vol-002");process.exit(e&&e.normalized===true&&e.status==="regressed"?0:1)})'; then
|
||||
echo " PASS: genuine share increase → regressed"; PASS=$((PASS+1))
|
||||
else
|
||||
echo " FAIL: genuine regression missed (got: $out)"; FAIL=$((FAIL+1))
|
||||
fi
|
||||
rm -f "$ROOT/ab-tracking.jsonl"
|
||||
|
||||
# --- Test 114: update notifier nudges from cache when a newer release exists (no network) ---
|
||||
echo "Test 114: update notifier — cached newer release prints nudge"
|
||||
reset_state
|
||||
printf 'v0.6.2\n' > "$ROOT/.version"
|
||||
node -e "require('fs').writeFileSync('$ROOT/.update-check.json', JSON.stringify({last_check: Date.now(), latest: 'v9.9.9'}))"
|
||||
out=$(echo '{"hook_event_name":"SessionStart","session_id":"sUP"}' | NUDGE_RUN 2>/dev/null)
|
||||
if echo "$out" | grep -q "update available: v0.6.2 → v9.9.9"; then
|
||||
echo " PASS: update nudge printed from cache (offline)"; PASS=$((PASS+1))
|
||||
else
|
||||
echo " FAIL: expected update nudge (got: $out)"; FAIL=$((FAIL+1))
|
||||
fi
|
||||
rm -f "$ROOT/.version" "$ROOT/.update-check.json"
|
||||
|
||||
# --- Test 115: update notifier silent when installed is current ---
|
||||
echo "Test 115: update notifier — up-to-date is silent"
|
||||
reset_state
|
||||
printf 'v9.9.9\n' > "$ROOT/.version"
|
||||
node -e "require('fs').writeFileSync('$ROOT/.update-check.json', JSON.stringify({last_check: Date.now(), latest: 'v9.9.9'}))"
|
||||
out=$(echo '{"hook_event_name":"SessionStart","session_id":"sUP2"}' | NUDGE_RUN 2>/dev/null)
|
||||
if echo "$out" | grep -q "update available"; then
|
||||
echo " FAIL: nudged despite being current (got: $out)"; FAIL=$((FAIL+1))
|
||||
else
|
||||
echo " PASS: no nudge when up-to-date"; PASS=$((PASS+1))
|
||||
fi
|
||||
rm -f "$ROOT/.version" "$ROOT/.update-check.json"
|
||||
|
||||
# --- Test 116: ADAM_NO_UPDATE_CHECK disables the notifier ---
|
||||
echo "Test 116: ADAM_NO_UPDATE_CHECK opt-out"
|
||||
reset_state
|
||||
printf 'v0.6.2\n' > "$ROOT/.version"
|
||||
node -e "require('fs').writeFileSync('$ROOT/.update-check.json', JSON.stringify({last_check: Date.now(), latest: 'v9.9.9'}))"
|
||||
out=$(echo '{"hook_event_name":"SessionStart","session_id":"sUP3"}' | HOME="$TMP_HOME" ADAM_NO_UPDATE_CHECK=1 node "$NUDGE" 2>/dev/null)
|
||||
if echo "$out" | grep -q "update available"; then
|
||||
echo " FAIL: notifier ran despite opt-out (got: $out)"; FAIL=$((FAIL+1))
|
||||
else
|
||||
echo " PASS: ADAM_NO_UPDATE_CHECK suppressed the check"; PASS=$((PASS+1))
|
||||
fi
|
||||
rm -f "$ROOT/.version" "$ROOT/.update-check.json"
|
||||
|
||||
# --- Test 117: no .version marker → notifier no-op (no crash) ---
|
||||
echo "Test 117: missing .version marker → notifier silent, hook still runs"
|
||||
reset_state
|
||||
node -e "require('fs').writeFileSync('$ROOT/.update-check.json', JSON.stringify({last_check: Date.now(), latest: 'v9.9.9'}))"
|
||||
out=$(echo '{"hook_event_name":"SessionStart","session_id":"sUP4"}' | NUDGE_RUN 2>/dev/null)
|
||||
if echo "$out" | grep -q "update available"; then
|
||||
echo " FAIL: nudged without a .version marker (got: $out)"; FAIL=$((FAIL+1))
|
||||
else
|
||||
echo " PASS: no marker → no update nudge"; PASS=$((PASS+1))
|
||||
fi
|
||||
rm -f "$ROOT/.update-check.json"
|
||||
|
||||
echo
|
||||
echo "Results: $PASS passed, $FAIL failed"
|
||||
[ "$FAIL" = "0" ]
|
||||
|
||||
+16
-9
@@ -250,10 +250,12 @@ Required structure:
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: <human-readable name, ≤80 chars>
|
||||
description: <one-line description used to decide future relevance — be specific, ≤200 chars>
|
||||
type: user | feedback | project | reference
|
||||
originSessionId: <session_id from journal entries that fed this cluster>
|
||||
name: <slug — snake_case, MUST equal the target filename without `.md`, e.g. feedback_go_test_cache>
|
||||
description: "<one-line used to decide future relevance — be specific, ≤200 chars>"
|
||||
metadata:
|
||||
node_type: memory
|
||||
type: user | feedback | project | reference
|
||||
originSessionId: <session_id from journal entries that fed this cluster>
|
||||
---
|
||||
|
||||
<Body content per type, see CLAUDE.md memory schema:
|
||||
@@ -263,12 +265,17 @@ originSessionId: <session_id from journal entries that fed this cluster>
|
||||
- reference: pointer to external system + what's there.>
|
||||
```
|
||||
|
||||
The frontmatter MUST match the live auto-memory schema exactly: `name` is the
|
||||
slug (NOT a prose title), and `node_type`, `type`, `originSessionId` live under
|
||||
a `metadata:` block (verify against an existing file in the target memory dir
|
||||
before drafting — match its shape).
|
||||
|
||||
Constraints:
|
||||
- Frontmatter fields `name`, `description`, `type` are **required**. Skill enforces this at apply time.
|
||||
- `originSessionId` is required — must be a `session` value from one of the cluster's journal entries.
|
||||
- Top-level `name` + `description` and nested `metadata.node_type` (always `memory`) + `metadata.type` are **required**. Skill enforces this at apply time.
|
||||
- `metadata.originSessionId` is required — must be a `session` value from one of the cluster's journal entries.
|
||||
- ≤50 LOC of body content. Surgical.
|
||||
- Slug (used in `target` path filename) must not collide with any existing memory file.
|
||||
- For `type=feedback` and `type=project`, body MUST contain `**Why:**` and `**How to apply:**` lines (CLAUDE.md memory schema).
|
||||
- `name`/slug (also the `target` path filename) must not collide with any existing memory file.
|
||||
- For `type: feedback` and `type: project`, body MUST contain `**Why:**` and `**How to apply:**` lines (CLAUDE.md memory schema).
|
||||
|
||||
## Diagnosis drafting protocol (required for every proposal)
|
||||
|
||||
@@ -509,7 +516,7 @@ MOSS's core thesis: "routing, hook ordering, state invariants, and dispatch live
|
||||
2. `cross_session_evidence == true` (≥5 occurrences across ≥3 sessions)
|
||||
3. `auto_apply_eligible: false` — **always**. Harness edits are never auto-applied.
|
||||
4. `blast_radius: high`
|
||||
5. Proposal includes a `# Test verification` section with the command `bash ~/.claude/adam/tests/run-tests.sh` and the expected result "132 passed, 0 failed" (or current pass count). The skill runs this test before applying.
|
||||
5. Proposal includes a `# Test verification` section with the command `bash ~/.claude/adam/tests/run-tests.sh` and the expected result "138 passed, 0 failed" (or current pass count). The skill runs this test before applying.
|
||||
6. Change is surgical: ≤30 LOC diff, single file.
|
||||
7. `# Diagnosis` reconstructs the causal chain from harness-level behavior (not from text-artifact behavior). The mismatch must name a specific code path (function, regex, threshold) in the target file.
|
||||
|
||||
|
||||
+86
-4
@@ -1,9 +1,17 @@
|
||||
#!/usr/bin/env node
|
||||
// adam-nudge.mjs — SessionStart hook. Prints two kinds of reminders:
|
||||
// adam-nudge.mjs — SessionStart hook. Prints reminders:
|
||||
// 1. Pending proposals (≥3 queued in adam/proposals/).
|
||||
// 2. Cross-session nudges (entries in adam/active-nudges.json whose
|
||||
// source_session differs from the current session and that haven't
|
||||
// expired or exhausted their max_displays).
|
||||
// 3. Pending local-edit upgrades (`.adam-new` sidecars).
|
||||
// 4. New-release notice: if a newer GitHub release exists than the installed
|
||||
// `.version`, print a notify-only one-line update prompt. Cached + checked
|
||||
// at most once/day, network call hard-capped at 1.5s, fully best-effort —
|
||||
// never blocks SessionStart. Opt out with ADAM_NO_UPDATE_CHECK=1.
|
||||
// NOTE: notify-only by design — applying an update re-runs install.sh,
|
||||
// which resets ADAM's own /reflect-applied skill edits. The user chooses
|
||||
// when to accept that, so we never auto-install.
|
||||
import { readdirSync, readFileSync, writeFileSync, existsSync } from "node:fs";
|
||||
import { join } from "node:path";
|
||||
import { homedir } from "node:os";
|
||||
@@ -14,7 +22,13 @@ const ADAM_ROOT = join(CLAUDE_ROOT, "adam");
|
||||
const PROPOSALS = join(ADAM_ROOT, "proposals");
|
||||
const NUDGES_FILE = join(ADAM_ROOT, "active-nudges.json");
|
||||
const STATE_FILE = join(ADAM_ROOT, "state.json");
|
||||
const VERSION_FILE = join(ADAM_ROOT, ".version");
|
||||
const UPDATE_CHECK_FILE = join(ADAM_ROOT, ".update-check.json");
|
||||
const THRESHOLD = 3;
|
||||
const UPDATE_CHECK_INTERVAL_MS = 24 * 60 * 60 * 1000;
|
||||
const UPDATE_FETCH_TIMEOUT_MS = 1500;
|
||||
const RELEASES_API = "https://api.github.com/repos/lukaszraczylo/claude-adam/releases/latest";
|
||||
const INSTALL_ONELINER = "curl -fsSL https://raw.githubusercontent.com/lukaszraczylo/claude-adam/main/install.sh | bash";
|
||||
|
||||
// Known installable paths (mirrors install.sh copy_file list). Checking a
|
||||
// fixed shortlist keeps SessionStart latency under control vs full FS walk.
|
||||
@@ -118,7 +132,75 @@ function emitPendingUpgrades() {
|
||||
} catch { /* never break SessionStart */ }
|
||||
}
|
||||
|
||||
function main() {
|
||||
// --- update notifier (notify-only; see header note) ---
|
||||
|
||||
function readVersion() {
|
||||
try { return readFileSync(VERSION_FILE, "utf8").trim() || null; } catch { return null; }
|
||||
}
|
||||
|
||||
// Parse "vX.Y.Z" (leading v optional; pre-release/build suffix ignored).
|
||||
function parseSemver(s) {
|
||||
if (typeof s !== "string") return null;
|
||||
const m = s.trim().replace(/^v/i, "").match(/^(\d+)\.(\d+)\.(\d+)/);
|
||||
return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
|
||||
}
|
||||
|
||||
// isNewer(a, b): true iff version a is strictly newer than b. Unparseable → false.
|
||||
function isNewer(a, b) {
|
||||
const pa = parseSemver(a), pb = parseSemver(b);
|
||||
if (!pa || !pb) return false;
|
||||
for (let i = 0; i < 3; i++) { if (pa[i] !== pb[i]) return pa[i] > pb[i]; }
|
||||
return false;
|
||||
}
|
||||
|
||||
async function fetchLatestTag() {
|
||||
// Best-effort, hard-capped. Any failure (offline / timeout / rate-limit /
|
||||
// parse / fetch-unavailable) returns null and the caller silently skips.
|
||||
try {
|
||||
if (typeof fetch !== "function") return null;
|
||||
const ctrl = new AbortController();
|
||||
const timer = setTimeout(() => ctrl.abort(), UPDATE_FETCH_TIMEOUT_MS);
|
||||
let tag = null;
|
||||
try {
|
||||
const res = await fetch(RELEASES_API, {
|
||||
signal: ctrl.signal,
|
||||
headers: { "User-Agent": "claude-adam-nudge", "Accept": "application/vnd.github+json" },
|
||||
});
|
||||
if (res && res.ok) {
|
||||
const j = await res.json();
|
||||
if (j && typeof j.tag_name === "string") tag = j.tag_name;
|
||||
}
|
||||
} finally { clearTimeout(timer); }
|
||||
return tag;
|
||||
} catch { return null; }
|
||||
}
|
||||
|
||||
function printUpdateNudge(latest, installed) {
|
||||
process.stdout.write(
|
||||
`[adam] update available: ${installed} → ${latest}. Apply: ${INSTALL_ONELINER}\n` +
|
||||
` (re-runs install.sh — resets ADAM's own /reflect-applied skill edits; apply when you're ready)\n`
|
||||
);
|
||||
}
|
||||
|
||||
async function emitUpdateCheck() {
|
||||
if (process.env.ADAM_NO_UPDATE_CHECK) return; // explicit opt-out
|
||||
const installed = readVersion();
|
||||
if (!installed) return; // no marker → nothing to compare
|
||||
const cache = readJson(UPDATE_CHECK_FILE, {}) || {};
|
||||
const now = Date.now();
|
||||
let nudged = false;
|
||||
// Instant nudge from cache (no network).
|
||||
if (cache.latest && isNewer(cache.latest, installed)) { printUpdateNudge(cache.latest, installed); nudged = true; }
|
||||
// Refresh cache at most once/day, best-effort — drives the nudge on the NEXT run.
|
||||
if (!cache.last_check || (now - Number(cache.last_check)) > UPDATE_CHECK_INTERVAL_MS) {
|
||||
const latest = await fetchLatestTag();
|
||||
const next = { last_check: now, latest: latest || cache.latest || null };
|
||||
try { writeFileSync(UPDATE_CHECK_FILE, JSON.stringify(next)); } catch { /* swallow */ }
|
||||
if (latest && !nudged && isNewer(latest, installed)) printUpdateNudge(latest, installed);
|
||||
}
|
||||
}
|
||||
|
||||
async function main() {
|
||||
const stdinSession = readSessionInput();
|
||||
const stateSession = (() => {
|
||||
const st = readJson(STATE_FILE, null);
|
||||
@@ -128,7 +210,7 @@ function main() {
|
||||
emitProposalReminder();
|
||||
emitActiveNudges(currentSession);
|
||||
emitPendingUpgrades();
|
||||
await emitUpdateCheck();
|
||||
}
|
||||
|
||||
try { main(); } catch { /* never block SessionStart */ }
|
||||
process.exit(0);
|
||||
main().catch(() => { /* never block SessionStart */ }).finally(() => process.exit(0));
|
||||
|
||||
+14
@@ -143,6 +143,20 @@ copy_file "$SRC/adam/tests/fixtures/seed-corrections.jsonl" "$DEST/adam
|
||||
# install marker — used by future runs to detect local mtime drift
|
||||
run "touch \"$DEST/adam/.install-marker\""
|
||||
|
||||
# version marker — records the installed release tag for the update notifier
|
||||
# (adam-nudge.mjs compares it against the latest GitHub release).
|
||||
ADAM_VERSION=""
|
||||
if [ -n "$VERSION" ]; then
|
||||
ADAM_VERSION="$VERSION"
|
||||
elif [ "$PIPED" = 1 ] && [ -n "${REF:-}" ]; then
|
||||
ADAM_VERSION="$REF"
|
||||
else
|
||||
ADAM_VERSION="$(git -C "$SRC" describe --tags --abbrev=0 2>/dev/null || true)"
|
||||
fi
|
||||
[ -z "$ADAM_VERSION" ] && ADAM_VERSION="unknown"
|
||||
run "printf '%s\\n' \"$ADAM_VERSION\" > \"$DEST/adam/.version\""
|
||||
log " version marker: $ADAM_VERSION"
|
||||
|
||||
# --------------------------------------------------------------------- settings.json
|
||||
SETTINGS="$DEST/settings.json"
|
||||
EXAMPLE="$SRC/settings.json.example"
|
||||
|
||||
@@ -300,7 +300,7 @@ Before writing any proposal:
|
||||
- For `skill_new`: confirm the slug doesn't collide with any existing skill in `~/.claude/skills/`. If it does, refuse and ask user to rename.
|
||||
- For `skill_edit`: confirm the diff is append-only (no `-` lines that remove existing content) and that target SKILL.md exists. When auto-applying, ALSO re-verify the eligibility gate steps in §3 (cooldown, blacklist, byte cap) before any `Edit` call — never trust frontmatter alone.
|
||||
- For `skill_edit` with `auto_apply_eligible: true`: confirm `contradiction_flag` is absent or null in frontmatter. Refuse auto-apply if `contradiction_flag` is set with any non-empty value (treat the agent's flag as a hard veto on auto-apply; user can still manually approve in walk-the-queue if they disagree with the heuristic).
|
||||
- For `memory`: confirm `# Proposed change` body starts with `---` frontmatter containing required fields `name`, `description`, `type`, `originSessionId`. Refuse if frontmatter missing — agent must redraft per the Memory drafting protocol.
|
||||
- For `memory`: confirm `# Proposed change` body starts with `---` frontmatter matching the live auto-memory schema — top-level `name` (the slug) + `description`, plus a `metadata:` block with `node_type: memory`, `type`, and `originSessionId`. Cross-check the shape against an existing file in the target memory dir. Refuse if frontmatter is flat (`type:`/`originSessionId:` at top level) or missing the `metadata:` block — agent must redraft per the Memory drafting protocol.
|
||||
- For `harness_edit`: confirm `auto_apply_eligible: false` (never auto-apply). Confirm `confidence ≥ 5`. Confirm `# Test verification` section names the test command. Confirm diff is ≤30 LOC and targets a single allowed harness file (see `agents/adam.md` §"Harness self-modification"). Run test suite before AND after applying — revert on any regression.
|
||||
- Confirm `source_entries` is present in proposal frontmatter as a non-empty list (used for archive). Warn (do not refuse) if missing — legacy proposals from before v0.2.0 won't have it.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user