mirror of
https://github.com/lukaszraczylo/claude-mnemonic.git
synced 2026-06-05 23:03:55 +00:00
5c2685c7b6
* feat(leann-phase2): implement hybrid vector storage and graph-based search
- [x] Add AST-aware code chunking for Go, Python, and TypeScript using tree-sitter
- [x] Implement LEANN-inspired hybrid vector storage with hub detection and selective embedding storage (60-80% savings)
- [x] Add observation relationship graph with CSR format and edge detection (file overlap, semantic similarity, temporal, concept)
- [x] Implement graph-aware search with two-level traversal and relationship-based ranking
- [x] Add auto-tuning system for dynamic hub threshold adjustment based on query performance
- [x] Add comprehensive metrics tracking for vector storage, queries, latency, and graph traversals
- [x] Update configuration system with graph and hybrid storage settings
- [x] Add graph stats and vector metrics endpoints to worker service
- [x] Enhance UI sidebar with advanced metrics display and graph visualization
- [x] Optimize struct field alignment throughout codebase for memory efficiency
- [x] Update documentation with LEANN Phase 2 features and performance benefits
- [x] Add tree-sitter dependency for AST parsing
* fix: add fts5 build tag to CI workflow
Pass build-tags: "fts5" to shared workflow to properly compile
sqlite-vec-go-bindings with SQLite FTS5 support.
This fixes test failures in hybrid vector storage tests that require
CGO and FTS5 build tags.
Requires shared-actions@8f7f235 or later.
* docs: add testing documentation and macOS ARM64 known issue
Document the macOS ARM64 CGO linking issue with sqlite-vec-go-bindings
that prevents hybrid package tests from compiling locally.
Added:
- .github/TESTING.md: Comprehensive testing guide with platform-specific
issues, workarounds, and CI configuration details
- internal/vector/hybrid/README.md: Package-specific documentation
explaining the macOS limitation
- .github/CI_FIX_SUMMARY.md: Technical details of the CI fix
Key points:
- 41 out of 42 packages test successfully on all platforms
- hybrid package tests fail only on macOS ARM64 (local dev issue)
- Linux CI tests pass with proper build-tags: "fts5" configuration
- Production builds and runtime functionality unaffected
This is a known limitation of sqlite-vec-go-bindings on macOS ARM64
and does not impact CI/CD or production deployments.
* fix: add SQLite busy_timeout to prevent database locked errors
Set PRAGMA busy_timeout=5000 (5 seconds) to allow SQLite to retry
when the database is locked instead of failing immediately.
This fixes race conditions when multiple goroutines try to write
simultaneously, particularly in tests where StoreObservation spawns
async cleanup goroutines.
Root cause:
- StoreObservation launches goroutine -> CleanupOldObservations
- Multiple concurrent cleanups caused "database is locked" errors
- Without busy_timeout, SQLite fails immediately on lock contention
Solution:
- Add 5-second busy timeout for automatic retry on lock
- Standard practice for concurrent SQLite usage
- Works with existing WAL mode configuration
Fixes TestObservationStore_CleanupOldObservations in CI.
* docs: complete summary of all CI test fixes
Comprehensive documentation of all fixes applied:
1. Missing build tags (fts5)
2. Database locked errors (busy_timeout)
All 41/42 packages now pass tests. The hybrid package has a known
macOS ARM64 limitation that doesn't affect CI or production.
No functionality was removed - all fixes are additive only.
* fix: add SQLite driver import to hybrid tests for CGO linking
Add blank import of mattn/go-sqlite3 to hybrid test files to ensure
the SQLite driver is linked into the test binary. This provides the
SQLite symbols that sqlite-vec-go-bindings requires.
Root cause:
- hybrid package imports sqlitevec (transitively depends on sqlite-vec CGO)
- Test binary needs SQLite symbols for linking
- sqlitevec tests already had this import, but hybrid tests didn't
- Without the driver import, linker fails with "undefined symbols"
This fix enables hybrid tests to run with -race flag on all platforms.
Before: 41/42 packages pass (hybrid failed to link)
After: 42/42 packages pass ✅
Fixes hybrid test compilation on macOS ARM64, Linux, and Windows.
* docs: remove outdated macOS limitation documentation
The hybrid test linking issue has been fixed by adding the SQLite
driver import. All tests now pass on all platforms including macOS.
Removed:
- internal/vector/hybrid/README.md (documented workaround no longer needed)
- .github/TESTING.md (macOS limitation section obsolete)
All 42/42 packages now test successfully with -race flag.
* docs: final comprehensive summary of all CI fixes
All three issues now resolved:
1. Missing fts5 build tags
2. Database busy_timeout for concurrent writes
3. Missing SQLite driver import in hybrid tests
Result: 42/42 packages pass with -race on all platforms.
Credit to reviewer for identifying the race detector concern.
830 lines
24 KiB
Go
830 lines
24 KiB
Go
// Package sdk provides SDK agent integration for claude-mnemonic.
|
|
package sdk
|
|
|
|
import (
|
|
"bytes"
|
|
"context"
|
|
"fmt"
|
|
"os"
|
|
"os/exec"
|
|
"path/filepath"
|
|
"strings"
|
|
"time"
|
|
|
|
json "github.com/goccy/go-json"
|
|
|
|
"github.com/lukaszraczylo/claude-mnemonic/internal/config"
|
|
"github.com/lukaszraczylo/claude-mnemonic/internal/db/gorm"
|
|
"github.com/lukaszraczylo/claude-mnemonic/pkg/models"
|
|
"github.com/lukaszraczylo/claude-mnemonic/pkg/similarity"
|
|
"github.com/rs/zerolog/log"
|
|
)
|
|
|
|
// BroadcastFunc is a callback for broadcasting events to SSE clients.
|
|
type BroadcastFunc func(event map[string]interface{})
|
|
|
|
// SyncObservationFunc is a callback for syncing observations to vector DB.
|
|
type SyncObservationFunc func(obs *models.Observation)
|
|
|
|
// SyncSummaryFunc is a callback for syncing summaries to vector DB.
|
|
type SyncSummaryFunc func(summary *models.SessionSummary)
|
|
|
|
// Processor handles SDK agent processing of observations and summaries using Claude Code CLI.
|
|
type Processor struct {
|
|
observationStore *gorm.ObservationStore
|
|
summaryStore *gorm.SummaryStore
|
|
broadcastFunc BroadcastFunc
|
|
syncObservationFunc SyncObservationFunc
|
|
syncSummaryFunc SyncSummaryFunc
|
|
sem chan struct{}
|
|
claudePath string
|
|
model string
|
|
}
|
|
|
|
// SetBroadcastFunc sets the broadcast callback for SSE events.
|
|
func (p *Processor) SetBroadcastFunc(fn BroadcastFunc) {
|
|
p.broadcastFunc = fn
|
|
}
|
|
|
|
// SetSyncObservationFunc sets the callback for syncing observations to vector DB.
|
|
func (p *Processor) SetSyncObservationFunc(fn SyncObservationFunc) {
|
|
p.syncObservationFunc = fn
|
|
}
|
|
|
|
// SetSyncSummaryFunc sets the callback for syncing summaries to vector DB.
|
|
func (p *Processor) SetSyncSummaryFunc(fn SyncSummaryFunc) {
|
|
p.syncSummaryFunc = fn
|
|
}
|
|
|
|
// broadcast sends an event via the broadcast callback if set.
|
|
func (p *Processor) broadcast(event map[string]interface{}) {
|
|
if p.broadcastFunc != nil {
|
|
p.broadcastFunc(event)
|
|
}
|
|
}
|
|
|
|
// MaxConcurrentCLICalls is the maximum number of concurrent Claude CLI calls.
|
|
// This prevents overwhelming the API and manages resource usage.
|
|
const MaxConcurrentCLICalls = 4
|
|
|
|
// NewProcessor creates a new SDK processor.
|
|
func NewProcessor(observationStore *gorm.ObservationStore, summaryStore *gorm.SummaryStore) (*Processor, error) {
|
|
cfg := config.Get()
|
|
|
|
// Find Claude Code CLI
|
|
claudePath := cfg.ClaudeCodePath
|
|
if claudePath == "" {
|
|
// Try to find in PATH
|
|
path, err := exec.LookPath("claude")
|
|
if err != nil {
|
|
return nil, fmt.Errorf("claude CLI not found in PATH and CLAUDE_CODE_PATH not set")
|
|
}
|
|
claudePath = path
|
|
}
|
|
|
|
// Verify it exists
|
|
if _, err := os.Stat(claudePath); err != nil {
|
|
return nil, fmt.Errorf("claude CLI not found at %s: %w", claudePath, err)
|
|
}
|
|
|
|
return &Processor{
|
|
claudePath: claudePath,
|
|
model: cfg.Model,
|
|
observationStore: observationStore,
|
|
summaryStore: summaryStore,
|
|
sem: make(chan struct{}, MaxConcurrentCLICalls),
|
|
}, nil
|
|
}
|
|
|
|
// IsAvailable checks if the Claude CLI is available for processing.
|
|
func (p *Processor) IsAvailable() bool {
|
|
_, err := os.Stat(p.claudePath)
|
|
return err == nil
|
|
}
|
|
|
|
// ProcessObservation processes a single tool observation and extracts insights.
|
|
func (p *Processor) ProcessObservation(ctx context.Context, sdkSessionID, project string, toolName string, toolInput, toolResponse interface{}, promptNumber int, cwd string) error {
|
|
// Skip certain tools that aren't worth processing
|
|
if shouldSkipTool(toolName) {
|
|
log.Info().Str("tool", toolName).Msg("Skipping tool (not interesting for memory)")
|
|
return nil
|
|
}
|
|
|
|
// Convert tool data to strings for pre-filtering
|
|
inputStr := toJSONString(toolInput)
|
|
outputStr := toJSONString(toolResponse)
|
|
|
|
// Pre-filter trivial operations without calling Haiku
|
|
if shouldSkipTrivialOperation(toolName, inputStr, outputStr) {
|
|
log.Debug().Str("tool", toolName).Msg("Skipping trivial operation (pre-filter)")
|
|
return nil
|
|
}
|
|
|
|
log.Info().Str("tool", toolName).Msg("Processing tool execution with Claude CLI")
|
|
|
|
// Note: Removed the "file already has observations" check
|
|
// Each tool execution can produce unique insights even for the same file
|
|
// Similarity-based deduplication will handle true duplicates
|
|
|
|
// Build the prompt
|
|
exec := ToolExecution{
|
|
ToolName: toolName,
|
|
ToolInput: inputStr,
|
|
ToolOutput: outputStr,
|
|
CWD: cwd,
|
|
}
|
|
prompt := BuildObservationPrompt(exec)
|
|
|
|
// Acquire semaphore slot (limits concurrent CLI calls)
|
|
select {
|
|
case p.sem <- struct{}{}:
|
|
defer func() { <-p.sem }()
|
|
case <-ctx.Done():
|
|
return ctx.Err()
|
|
}
|
|
|
|
// Call Claude Code CLI
|
|
response, err := p.callClaudeCLI(ctx, prompt)
|
|
if err != nil {
|
|
log.Error().Err(err).Str("tool", toolName).Msg("Failed to call Claude CLI for observation")
|
|
return err
|
|
}
|
|
|
|
// Parse observations from response
|
|
observations := ParseObservations(response, sdkSessionID)
|
|
if len(observations) == 0 {
|
|
log.Info().Str("tool", toolName).Msg("No observations extracted (Claude deemed not significant)")
|
|
return nil
|
|
}
|
|
|
|
// Get existing observations for deduplication
|
|
existingObs, err := p.observationStore.GetRecentObservations(ctx, project, 50)
|
|
if err != nil {
|
|
log.Warn().Err(err).Msg("Failed to get existing observations for dedup check")
|
|
existingObs = nil // Continue without dedup
|
|
}
|
|
|
|
// Store each observation (with deduplication check)
|
|
const similarityThreshold = 0.4 // Same threshold as retrieval clustering
|
|
var storedCount, skippedCount int
|
|
|
|
for _, obs := range observations {
|
|
// Capture file modification times for staleness detection
|
|
obs.FileMtimes = captureFileMtimes(obs.FilesRead, obs.FilesModified, cwd)
|
|
|
|
// Convert to stored observation for similarity check
|
|
storedObs := obs.ToStoredObservation()
|
|
|
|
// Check if this observation is too similar to existing ones
|
|
if existingObs != nil && similarity.IsSimilarToAny(storedObs, existingObs, similarityThreshold) {
|
|
log.Debug().
|
|
Str("type", string(obs.Type)).
|
|
Str("title", obs.Title).
|
|
Msg("Skipping observation - too similar to existing")
|
|
skippedCount++
|
|
continue
|
|
}
|
|
|
|
id, createdAtEpoch, err := p.observationStore.StoreObservation(ctx, sdkSessionID, project, obs, promptNumber, 0)
|
|
if err != nil {
|
|
log.Error().Err(err).Msg("Failed to store observation")
|
|
continue
|
|
}
|
|
|
|
storedCount++
|
|
log.Info().
|
|
Int64("id", id).
|
|
Str("type", string(obs.Type)).
|
|
Str("title", obs.Title).
|
|
Int("trackedFiles", len(obs.FileMtimes)).
|
|
Msg("Observation stored")
|
|
|
|
// Sync to vector DB if callback is set
|
|
if p.syncObservationFunc != nil {
|
|
fullObs := models.NewObservation(sdkSessionID, project, obs, promptNumber, 0)
|
|
fullObs.ID = id
|
|
fullObs.CreatedAtEpoch = createdAtEpoch
|
|
p.syncObservationFunc(fullObs)
|
|
}
|
|
|
|
// Broadcast new observation event for dashboard refresh
|
|
p.broadcast(map[string]interface{}{
|
|
"type": "observation",
|
|
"action": "created",
|
|
"id": id,
|
|
"project": project,
|
|
})
|
|
|
|
// Add to existing for subsequent dedup checks within same batch
|
|
if existingObs != nil {
|
|
existingObs = append(existingObs, storedObs)
|
|
}
|
|
}
|
|
|
|
if skippedCount > 0 {
|
|
log.Info().
|
|
Int("stored", storedCount).
|
|
Int("skipped", skippedCount).
|
|
Msg("Observation processing complete (duplicates skipped)")
|
|
}
|
|
|
|
return nil
|
|
}
|
|
|
|
// ProcessSummary processes a session summary request.
|
|
func (p *Processor) ProcessSummary(ctx context.Context, sessionDBID int64, sdkSessionID, project, userPrompt, lastUserMsg, lastAssistantMsg string) error {
|
|
// Debug: log what we received
|
|
log.Debug().
|
|
Int64("sessionId", sessionDBID).
|
|
Int("lastAssistantMsgLen", len(lastAssistantMsg)).
|
|
Str("lastAssistantMsgPreview", truncateForLog(lastAssistantMsg, 200)).
|
|
Msg("ProcessSummary called")
|
|
|
|
// Skip summary generation if there's no meaningful assistant response
|
|
// This prevents generic "initial session setup" summaries
|
|
if !hasMeaningfulContent(lastAssistantMsg) {
|
|
log.Info().
|
|
Int64("sessionId", sessionDBID).
|
|
Int("msgLen", len(lastAssistantMsg)).
|
|
Msg("Skipping summary - no meaningful assistant response")
|
|
return nil
|
|
}
|
|
|
|
// Build the summary prompt
|
|
req := SummaryRequest{
|
|
SessionDBID: sessionDBID,
|
|
SDKSessionID: sdkSessionID,
|
|
Project: project,
|
|
UserPrompt: userPrompt,
|
|
LastUserMessage: lastUserMsg,
|
|
LastAssistantMessage: lastAssistantMsg,
|
|
}
|
|
prompt := BuildSummaryPrompt(req)
|
|
|
|
// Acquire semaphore slot (limits concurrent CLI calls)
|
|
select {
|
|
case p.sem <- struct{}{}:
|
|
defer func() { <-p.sem }()
|
|
case <-ctx.Done():
|
|
return ctx.Err()
|
|
}
|
|
|
|
// Call Claude Code CLI
|
|
response, err := p.callClaudeCLI(ctx, prompt)
|
|
if err != nil {
|
|
log.Error().Err(err).Int64("sessionId", sessionDBID).Msg("Failed to call Claude CLI for summary")
|
|
return err
|
|
}
|
|
|
|
// Parse summary from response
|
|
summary := ParseSummary(response, sessionDBID)
|
|
if summary == nil {
|
|
log.Info().Int64("sessionId", sessionDBID).Msg("No summary generated (skipped or empty)")
|
|
return nil
|
|
}
|
|
|
|
// Filter out summaries that describe the memory agent itself
|
|
if isSelfReferentialSummary(summary) {
|
|
log.Info().Int64("sessionId", sessionDBID).Msg("Skipping self-referential summary (describes agent, not user work)")
|
|
return nil
|
|
}
|
|
|
|
// Store the summary (promptNumber=0, discoveryTokens=0 for summaries)
|
|
id, createdAtEpoch, err := p.summaryStore.StoreSummary(ctx, sdkSessionID, project, summary, 0, 0)
|
|
if err != nil {
|
|
log.Error().Err(err).Msg("Failed to store summary")
|
|
return err
|
|
}
|
|
|
|
log.Info().
|
|
Int64("id", id).
|
|
Int64("sessionId", sessionDBID).
|
|
Msg("Summary stored")
|
|
|
|
// Sync to vector DB if callback is set
|
|
if p.syncSummaryFunc != nil {
|
|
fullSummary := models.NewSessionSummary(sdkSessionID, project, summary, 0, 0)
|
|
fullSummary.ID = id
|
|
fullSummary.CreatedAtEpoch = createdAtEpoch
|
|
p.syncSummaryFunc(fullSummary)
|
|
}
|
|
|
|
// Broadcast new summary event for dashboard refresh
|
|
p.broadcast(map[string]interface{}{
|
|
"type": "summary",
|
|
"action": "created",
|
|
"id": id,
|
|
"project": project,
|
|
})
|
|
|
|
return nil
|
|
}
|
|
|
|
// callClaudeCLI calls the Claude Code CLI with the given prompt.
|
|
func (p *Processor) callClaudeCLI(ctx context.Context, prompt string) (string, error) {
|
|
// Build the full prompt with system instructions
|
|
fullPrompt := systemPrompt + "\n\n" + prompt
|
|
|
|
// Create command with timeout
|
|
ctx, cancel := context.WithTimeout(ctx, 60*time.Second)
|
|
defer cancel()
|
|
|
|
// Use claude CLI with --print flag for non-interactive output
|
|
// and -p for prompt input
|
|
// Add --tools "" to disable tools (we only need text analysis)
|
|
// Add --strict-mcp-config to skip loading MCP servers
|
|
// Add --disable-slash-commands to skip command loading
|
|
// These flags significantly speed up processing by avoiding plugin/MCP initialization
|
|
cmd := exec.CommandContext(ctx, p.claudePath,
|
|
"--print",
|
|
"--tools", "",
|
|
"--strict-mcp-config",
|
|
"--disable-slash-commands",
|
|
"-p", fullPrompt) // #nosec G204 -- claudePath is from config, fullPrompt is internal
|
|
|
|
// Set model if specified (use haiku for cost efficiency)
|
|
if p.model != "" {
|
|
cmd.Args = append([]string{cmd.Args[0], "--model", p.model}, cmd.Args[1:]...)
|
|
} else {
|
|
// Default to haiku for processing (cheap and fast)
|
|
cmd.Args = append([]string{cmd.Args[0], "--model", "haiku"}, cmd.Args[1:]...)
|
|
}
|
|
|
|
// Run from /tmp to avoid triggering our own hooks
|
|
// (hooks are triggered based on working directory)
|
|
cmd.Dir = "/tmp"
|
|
|
|
// Disable any plugin hooks by setting an env var that our hooks can check
|
|
cmd.Env = append(os.Environ(), "CLAUDE_MNEMONIC_INTERNAL=1")
|
|
|
|
// Capture output
|
|
var stdout, stderr bytes.Buffer
|
|
cmd.Stdout = &stdout
|
|
cmd.Stderr = &stderr
|
|
|
|
// Run command
|
|
err := cmd.Run()
|
|
if err != nil {
|
|
log.Error().
|
|
Err(err).
|
|
Str("stderr", stderr.String()).
|
|
Msg("Claude CLI execution failed")
|
|
return "", fmt.Errorf("claude CLI failed: %w (stderr: %s)", err, stderr.String())
|
|
}
|
|
|
|
return stdout.String(), nil
|
|
}
|
|
|
|
// shouldSkipTool returns true for tools that aren't worth processing.
|
|
func shouldSkipTool(toolName string) bool {
|
|
// Skip tools that rarely produce meaningful observations
|
|
skipTools := map[string]bool{
|
|
// Internal tracking tools
|
|
"TodoWrite": true,
|
|
"Task": true,
|
|
"TaskOutput": true,
|
|
|
|
// File discovery tools (just listings, no insights)
|
|
"Glob": true,
|
|
"ListDir": true,
|
|
"LS": true,
|
|
"KillShell": true,
|
|
|
|
// Question/interaction tools (no code insights)
|
|
"AskUserQuestion": true,
|
|
|
|
// Plan mode tools (planning, not execution)
|
|
"EnterPlanMode": true,
|
|
"ExitPlanMode": true,
|
|
|
|
// Skill/command execution (meta-operations)
|
|
"Skill": true,
|
|
"SlashCommand": true,
|
|
}
|
|
|
|
skip, found := skipTools[toolName]
|
|
if found {
|
|
return skip
|
|
}
|
|
return false // Process remaining tools: Read, Edit, Write, Grep, Bash, WebFetch, WebSearch, NotebookEdit
|
|
}
|
|
|
|
// shouldSkipTrivialOperation performs local pre-filtering to skip trivial operations
|
|
// without making a Haiku API call. Returns true if the operation is too trivial to process.
|
|
func shouldSkipTrivialOperation(toolName, inputStr, outputStr string) bool {
|
|
// Skip if output is too small to be meaningful (less than 50 chars)
|
|
// Reduced from 100 to capture more meaningful small operations
|
|
if len(outputStr) < 50 {
|
|
return true
|
|
}
|
|
|
|
// Skip if output indicates an error or empty result
|
|
lowerOutput := strings.ToLower(outputStr)
|
|
trivialOutputs := []string{
|
|
"no matches found",
|
|
"file not found",
|
|
"directory not found",
|
|
"permission denied",
|
|
"command not found",
|
|
"no such file",
|
|
"is a directory",
|
|
"[]", // Empty array result
|
|
"{}", // Empty object result
|
|
}
|
|
for _, trivial := range trivialOutputs {
|
|
if strings.Contains(lowerOutput, trivial) || outputStr == trivial {
|
|
return true
|
|
}
|
|
}
|
|
|
|
// Tool-specific pre-filtering
|
|
switch toolName {
|
|
case "Read":
|
|
// Skip reading config files that rarely contain project-specific insights
|
|
boringFiles := []string{
|
|
"package-lock.json", "yarn.lock", "pnpm-lock.yaml",
|
|
"go.sum", "Cargo.lock", "Gemfile.lock", "poetry.lock",
|
|
".gitignore", ".dockerignore", ".eslintignore",
|
|
"tsconfig.json", "jsconfig.json", "vite.config",
|
|
"tailwind.config", "postcss.config",
|
|
}
|
|
for _, boring := range boringFiles {
|
|
if strings.Contains(inputStr, boring) {
|
|
return true
|
|
}
|
|
}
|
|
|
|
case "Grep":
|
|
// Skip grep results with too many matches (likely generic search)
|
|
if strings.Count(outputStr, "\n") > 50 {
|
|
return true
|
|
}
|
|
|
|
case "Bash":
|
|
// Skip simple status commands
|
|
boringCommands := []string{
|
|
"git status", "git diff", "git log", "git branch",
|
|
"ls ", "pwd", "echo ", "cat ", "which ", "type ",
|
|
"npm list", "npm outdated", "npm audit",
|
|
}
|
|
for _, boring := range boringCommands {
|
|
if strings.Contains(strings.ToLower(inputStr), boring) {
|
|
return true
|
|
}
|
|
}
|
|
}
|
|
|
|
return false
|
|
}
|
|
|
|
// toJSONString converts an interface to a JSON string.
|
|
func toJSONString(v interface{}) string {
|
|
if v == nil {
|
|
return ""
|
|
}
|
|
if s, ok := v.(string); ok {
|
|
return s
|
|
}
|
|
b, err := json.Marshal(v)
|
|
if err != nil {
|
|
return fmt.Sprintf("%v", v)
|
|
}
|
|
return string(b)
|
|
}
|
|
|
|
// captureFileMtimes captures current modification times for tracked files.
|
|
// Returns a map of absolute file paths to their mtime in epoch milliseconds.
|
|
func captureFileMtimes(filesRead, filesModified []string, cwd string) map[string]int64 {
|
|
mtimes := make(map[string]int64)
|
|
|
|
// Helper to get mtime for a file path
|
|
getMtime := func(path string) (int64, bool) {
|
|
// Resolve relative paths against cwd
|
|
absPath := path
|
|
if !filepath.IsAbs(path) && cwd != "" {
|
|
absPath = filepath.Join(cwd, path)
|
|
}
|
|
|
|
info, err := os.Stat(absPath)
|
|
if err != nil {
|
|
return 0, false
|
|
}
|
|
return info.ModTime().UnixMilli(), true
|
|
}
|
|
|
|
// Capture mtimes for all read files
|
|
for _, path := range filesRead {
|
|
if mtime, ok := getMtime(path); ok {
|
|
mtimes[path] = mtime
|
|
}
|
|
}
|
|
|
|
// Capture mtimes for all modified files
|
|
for _, path := range filesModified {
|
|
if mtime, ok := getMtime(path); ok {
|
|
mtimes[path] = mtime
|
|
}
|
|
}
|
|
|
|
return mtimes
|
|
}
|
|
|
|
// GetFileMtimes returns current modification times for a list of file paths.
|
|
// This is used for staleness checking when injecting context.
|
|
func GetFileMtimes(paths []string, cwd string) map[string]int64 {
|
|
return captureFileMtimes(paths, nil, cwd)
|
|
}
|
|
|
|
// GetFileContent reads file content for verification purposes.
|
|
// Returns content and ok status.
|
|
func GetFileContent(path, cwd string) (string, bool) {
|
|
absPath := path
|
|
if !filepath.IsAbs(path) && cwd != "" {
|
|
absPath = filepath.Join(cwd, path)
|
|
}
|
|
|
|
content, err := os.ReadFile(absPath) // #nosec G304 -- intentional file read for verification
|
|
if err != nil {
|
|
return "", false
|
|
}
|
|
|
|
// Limit to first 2000 chars for verification (enough context, not too expensive)
|
|
if len(content) > 2000 {
|
|
return string(content[:2000]) + "\n...[truncated]", true
|
|
}
|
|
return string(content), true
|
|
}
|
|
|
|
// VerifyObservation checks if an observation is still valid given the current file contents.
|
|
// Returns true if the observation is still accurate, false if it should be deleted.
|
|
func (p *Processor) VerifyObservation(ctx context.Context, obs *models.Observation, cwd string) bool {
|
|
// Build file content context
|
|
var fileContents []string
|
|
var paths []string
|
|
|
|
// Combine files_read and files_modified
|
|
for _, path := range obs.FilesRead {
|
|
paths = append(paths, path)
|
|
}
|
|
for _, path := range obs.FilesModified {
|
|
paths = append(paths, path)
|
|
}
|
|
|
|
// Get current content of tracked files
|
|
for _, path := range paths {
|
|
if content, ok := GetFileContent(path, cwd); ok {
|
|
fileContents = append(fileContents, fmt.Sprintf("=== %s ===\n%s", path, content))
|
|
}
|
|
}
|
|
|
|
if len(fileContents) == 0 {
|
|
// No files available to verify against - keep the observation
|
|
return true
|
|
}
|
|
|
|
// Build verification prompt
|
|
prompt := fmt.Sprintf(`You are verifying if a previously recorded observation is still accurate.
|
|
|
|
OBSERVATION:
|
|
- Type: %s
|
|
- Title: %s
|
|
- Subtitle: %s
|
|
- Narrative: %s
|
|
- Facts: %v
|
|
|
|
CURRENT FILE CONTENTS:
|
|
%s
|
|
|
|
TASK: Check if the observation is still accurate given the current file contents.
|
|
Reply with ONLY one of:
|
|
- VALID - if the observation is still accurate
|
|
- INVALID - if the observation is no longer accurate (the code/behavior changed)
|
|
- UNCERTAIN - if you can't determine validity (files might be incomplete)
|
|
|
|
Your response:`,
|
|
obs.Type,
|
|
obs.Title.String,
|
|
obs.Subtitle.String,
|
|
obs.Narrative.String,
|
|
obs.Facts,
|
|
strings.Join(fileContents, "\n\n"),
|
|
)
|
|
|
|
// Call Claude CLI for quick verification
|
|
response, err := p.callClaudeCLI(ctx, prompt)
|
|
if err != nil {
|
|
log.Warn().Err(err).Msg("Failed to verify observation, keeping it")
|
|
return true // On error, keep the observation
|
|
}
|
|
|
|
response = strings.TrimSpace(strings.ToUpper(response))
|
|
|
|
// Parse response
|
|
if strings.Contains(response, "INVALID") {
|
|
log.Info().
|
|
Int64("id", obs.ID).
|
|
Str("title", obs.Title.String).
|
|
Msg("Observation verified as INVALID - will delete")
|
|
return false
|
|
}
|
|
|
|
// VALID or UNCERTAIN - keep the observation
|
|
log.Debug().
|
|
Int64("id", obs.ID).
|
|
Str("title", obs.Title.String).
|
|
Str("result", response).
|
|
Msg("Observation verified")
|
|
return true
|
|
}
|
|
|
|
// isSelfReferentialSummary checks if a summary describes the memory agent itself
|
|
// rather than actual user work. These summaries should be filtered out.
|
|
func isSelfReferentialSummary(summary *models.ParsedSummary) bool {
|
|
// Combine all summary fields for checking
|
|
content := strings.ToLower(summary.Request + " " + summary.Completed + " " + summary.Learned + " " + summary.NextSteps + " " + summary.Investigated + " " + summary.Notes)
|
|
|
|
// Indicators that the summary is about the memory agent, not user work
|
|
selfReferentialPhrases := []string{
|
|
// Agent references
|
|
"memory extraction",
|
|
"memory agent",
|
|
"extraction agent",
|
|
"hook execution",
|
|
"hook mechanism",
|
|
// Session meta-state
|
|
"session initialization",
|
|
"session setup",
|
|
"session has just started",
|
|
"session just started",
|
|
"agent initialization",
|
|
"no technical learnings",
|
|
"no code or project work",
|
|
// Waiting states
|
|
"waiting for the user",
|
|
"waiting for user",
|
|
"awaiting actual",
|
|
"awaiting claude code",
|
|
"awaiting tool",
|
|
"awaiting user",
|
|
// Meta checkpoint references
|
|
"progress checkpoint",
|
|
"checkpoint request",
|
|
// Common no-work phrases
|
|
"no work has been completed",
|
|
"no work completed",
|
|
"no work done",
|
|
"no actual work",
|
|
"nothing has been completed",
|
|
"nothing completed",
|
|
// Role/guideline parroting
|
|
"role definition",
|
|
"operational guidelines",
|
|
"providing role",
|
|
"providing guidelines",
|
|
// System prompt echoes
|
|
"extract meaningful observations",
|
|
"meaningful learnings",
|
|
"analyze tool executions",
|
|
"observations for future sessions",
|
|
// Empty session indicators
|
|
"empty session",
|
|
"no substantive work",
|
|
"no meaningful work",
|
|
"just beginning",
|
|
"just begun",
|
|
}
|
|
|
|
matchCount := 0
|
|
for _, phrase := range selfReferentialPhrases {
|
|
if strings.Contains(content, phrase) {
|
|
matchCount++
|
|
}
|
|
}
|
|
|
|
// If the summary mentions 2+ self-referential phrases, it's about the agent
|
|
return matchCount >= 2
|
|
}
|
|
|
|
// hasMeaningfulContent checks if the assistant response contains meaningful content
|
|
// worth generating a summary for. This filters out initial greetings, empty sessions,
|
|
// and sessions where only system messages were exchanged.
|
|
func hasMeaningfulContent(assistantMsg string) bool {
|
|
// Skip if empty or too short (need substantial content)
|
|
if len(strings.TrimSpace(assistantMsg)) < 200 {
|
|
return false
|
|
}
|
|
|
|
lowerMsg := strings.ToLower(assistantMsg)
|
|
|
|
// Skip messages that are primarily about system/hook status or meta-instructions
|
|
skipIndicators := []string{
|
|
// System/hook markers
|
|
"hook success",
|
|
"callback hook",
|
|
"session start",
|
|
"sessionstart",
|
|
"system-reminder",
|
|
// Agent self-references
|
|
"memory extraction agent",
|
|
"memory agent",
|
|
"extraction agent",
|
|
// No-work indicators
|
|
"no technical learnings",
|
|
"waiting for",
|
|
"waiting to",
|
|
"no code or project work",
|
|
"no substantive",
|
|
"no work has been completed",
|
|
"no work done",
|
|
"awaiting tool",
|
|
"awaiting user",
|
|
// Meta-instruction echoes
|
|
"role definition",
|
|
"operational guidelines",
|
|
"analyze tool executions",
|
|
"extract meaningful observations",
|
|
}
|
|
|
|
skipCount := 0
|
|
for _, skip := range skipIndicators {
|
|
if strings.Contains(lowerMsg, skip) {
|
|
skipCount++
|
|
}
|
|
}
|
|
// If multiple skip indicators found, this is likely a system-only session
|
|
if skipCount >= 2 {
|
|
return false
|
|
}
|
|
|
|
// Check for indicators of actual work being done
|
|
workIndicators := []string{
|
|
// Concrete file operations (with paths)
|
|
".go", ".ts", ".js", ".py", ".md", ".json", ".yaml", ".yml",
|
|
// Code modifications
|
|
"edited", "modified", "created", "deleted", "updated", "changed",
|
|
"added", "removed", "fixed", "implemented", "refactored",
|
|
// Tool results
|
|
"```", "lines ", "function ", "const ", "var ", "let ",
|
|
"type ", "struct ", "class ", "def ", "func ",
|
|
}
|
|
|
|
matchCount := 0
|
|
for _, indicator := range workIndicators {
|
|
if strings.Contains(lowerMsg, strings.ToLower(indicator)) {
|
|
matchCount++
|
|
}
|
|
}
|
|
|
|
// Require at least 2 work indicators to generate a summary
|
|
return matchCount >= 2
|
|
}
|
|
|
|
// truncateForLog truncates a string for logging purposes.
|
|
func truncateForLog(s string, maxLen int) string {
|
|
if len(s) <= maxLen {
|
|
return s
|
|
}
|
|
return s[:maxLen] + "..."
|
|
}
|
|
|
|
const systemPrompt = `You are a memory extraction agent for Claude Code sessions. Your job is to analyze tool executions and extract meaningful observations that would be useful for future sessions.
|
|
|
|
GUIDELINES:
|
|
1. Create observations for any meaningful learnings - be generous, not restrictive
|
|
2. Focus on: decisions made, bugs fixed, patterns discovered, project structure, code changes, refactoring
|
|
3. Even small changes can be worth remembering if they reveal something about the codebase
|
|
4. Be concise but informative in your observations
|
|
5. Use appropriate type tags: decision, bugfix, feature, refactor, discovery, change
|
|
|
|
CONCEPT TAGS (use 1-3 of these):
|
|
- how-it-works, why-it-exists, what-changed, problem-solution, gotcha
|
|
- pattern, trade-off, best-practice, anti-pattern, architecture
|
|
- security, performance, testing, debugging, workflow, tooling
|
|
- refactoring, api, database, configuration, error-handling
|
|
|
|
OUTPUT FORMAT:
|
|
When you find something worth remembering, output:
|
|
<observation>
|
|
<type>decision|bugfix|feature|refactor|discovery|change</type>
|
|
<title>Short descriptive title</title>
|
|
<subtitle>One-line summary</subtitle>
|
|
<narrative>Detailed explanation</narrative>
|
|
<facts>
|
|
<fact>Specific fact 1</fact>
|
|
</facts>
|
|
<concepts>
|
|
<concept>tag1</concept>
|
|
</concepts>
|
|
<files_read>
|
|
<file>/path/to/file</file>
|
|
</files_read>
|
|
<files_modified>
|
|
<file>/path/to/file</file>
|
|
</files_modified>
|
|
</observation>
|
|
|
|
If the tool execution is truly trivial (just a directory listing, empty result, etc.), respond with:
|
|
<skip reason="trivial"/>
|
|
|
|
Prefer creating observations over skipping - memories are valuable for future context!`
|