fix(pi): heal sessions/memory schema so plugin_version column lands#264
Conversation
The pi extension hard-codes its sessions/memory CREATE TABLE inline instead of going through DeeplakeApi.ensureSessionsTable / healMissingColumns like the other agents. When `plugin_version` was added to the canonical schema (2026-05-18) the pi INSERT picked it up but the inline CREATE did not, so every pi-created sessions table was one column short from birth and every INSERT failed with `column "plugin_version" of relation "sessions" does not exist` (42703) — with no heal to recover (e.g. prod org c2d29f27 ran zero schema introspections while peers self-healed). - Add plugin_version to the inline sessCreate (and memCreate) so fresh tables match the canonical schema. - Add a session_start schema-heal: introspect information_schema once and ALTER ADD COLUMN only the genuinely-missing columns (never IF NOT EXISTS — Deeplake returns 500 when the column already exists). Best-effort so it never blocks capture. Existing 13-column pi tables self-heal on next start. - Lock an INSERT ⊆ CREATE schema-consistency invariant in the pi source test so the two can never silently drift again.
📝 WalkthroughWalkthroughAdds a Changesplugin_version Column and Schema Heal
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Coverage ReportNo Generated for commit 431012e. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/pi/pi-extension-source.test.ts (1)
61-69: ⚡ Quick winAdd a regression guard for the schema-heal contract too.
These assertions prove fresh-table sessions DDL is correct, but they stay green if
SCHEMA_HEALdropsplugin_versionor stops healingMEMORY_TABLE—which would re-break pre-existing 13-column pi tables while the suite still passes. Please add a source-level assertion that the heal config covers both tables withplugin_version TEXT NOT NULL DEFAULT ''.As per coding guidelines,
tests/**: "Prefer asserting on specific values (paths, messages) over generic substrings."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/pi/pi-extension-source.test.ts` around lines 61 - 69, The current tests verify that plugin_version exists in sessionsCreateColumns() and sessionsInsertColumns(), but they do not verify the SCHEMA_HEAL configuration itself, which could drop plugin_version or fail to heal MEMORY_TABLE while tests still pass. Add a new regression test after the existing plugin_version assertions that directly verifies the SCHEMA_HEAL config includes both the sessions table and MEMORY_TABLE with the exact column specification plugin_version TEXT NOT NULL DEFAULT '', using specific value assertions (not substring matching) to ensure the heal contract is properly enforced.Source: Coding guidelines
pi/extension-source/hivemind.ts (1)
1308-1310: 🏗️ Heavy liftKeep the heal set sourced from the full inline schema.
SCHEMA_HEALonly knows aboutplugin_version, so the next column added to the canonical memory/sessions schema will strand already-created pi tables again even ifmemCreate/sessCreateare updated. The shared path heals against the full column list for exactly this reason; please generate the CREATE SQL and the heal list from the same local column definitions so existing tables keep self-healing on future schema changes.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pi/extension-source/hivemind.ts` around lines 1308 - 1310, The SCHEMA_HEAL array is hardcoded with only the plugin_version column, which means future schema changes won't be automatically healed in existing tables. Instead of maintaining SCHEMA_HEAL as a separate hardcoded list, generate it from the same column definitions used to create the SESSIONS_TABLE and MEMORY_TABLE in the memCreate and sessCreate functions. Extract the column definitions from those creation statements and dynamically build SCHEMA_HEAL to include all missing columns from the canonical schema, ensuring that any future columns added to the table creation logic will automatically be included in the heal logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pi/extension-source/hivemind.ts`:
- Around line 1314-1318: The dlQuery call in the information_schema columns
selection is directly interpolating the table name and creds.workspaceId into
the SQL string without escaping, which allows SQL injection if either value
contains a single quote. Escape both the table and creds.workspaceId values
before interpolating them into the query string, or use parameterized query
parameters if the dlQuery function supports them, to prevent the query from
becoming invalid or allowing malicious injection.
---
Nitpick comments:
In `@pi/extension-source/hivemind.ts`:
- Around line 1308-1310: The SCHEMA_HEAL array is hardcoded with only the
plugin_version column, which means future schema changes won't be automatically
healed in existing tables. Instead of maintaining SCHEMA_HEAL as a separate
hardcoded list, generate it from the same column definitions used to create the
SESSIONS_TABLE and MEMORY_TABLE in the memCreate and sessCreate functions.
Extract the column definitions from those creation statements and dynamically
build SCHEMA_HEAL to include all missing columns from the canonical schema,
ensuring that any future columns added to the table creation logic will
automatically be included in the heal logic.
In `@tests/pi/pi-extension-source.test.ts`:
- Around line 61-69: The current tests verify that plugin_version exists in
sessionsCreateColumns() and sessionsInsertColumns(), but they do not verify the
SCHEMA_HEAL configuration itself, which could drop plugin_version or fail to
heal MEMORY_TABLE while tests still pass. Add a new regression test after the
existing plugin_version assertions that directly verifies the SCHEMA_HEAL config
includes both the sessions table and MEMORY_TABLE with the exact column
specification plugin_version TEXT NOT NULL DEFAULT '', using specific value
assertions (not substring matching) to ensure the heal contract is properly
enforced.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 2dce2cb9-3cac-4663-a4b4-c5304c77a3d7
📒 Files selected for processing (2)
pi/extension-source/hivemind.tstests/pi/pi-extension-source.test.ts
…S_COLUMNS The other agents (claude_code/codex/cursor/hermes + session-queue) hand-maintain their own inline sessions INSERT column lists — the same shape of drift that broke pi (INSERT wrote plugin_version, schema didn't). They derive CREATE + heal from SESSIONS_COLUMNS, so the protecting invariant is INSERT ⊆ SESSIONS_COLUMNS. They all satisfy it today; this locks it so the pi 42703 bug can't reappear in any agent via a future INSERT edit that adds a column not in the canonical set.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/shared/agent-sessions-insert-schema.test.ts (1)
6-39: ⚡ Quick winWell-structured guard against schema drift.
The regex-based extraction correctly anchors on the canonical leading columns and normalizes case. The assertion on line 33 will fail clearly if the pattern is missing, and the outer test description (line 43) will show which file was being processed.
💬 Optional: improve error message for regex mismatch
Consider adding a descriptive message to the assertion on line 33:
- expect(m).not.toBeNull(); + expect(m, 'sessions INSERT with canonical leading columns (id, path, filename, message, ...) not found').not.toBeNull();This would make failures slightly easier to debug, though the current approach is already adequate.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/shared/agent-sessions-insert-schema.test.ts` around lines 6 - 39, The assertion on line 33 in the sessionsInsertColumns function can be improved by adding a descriptive error message as the second parameter to the expect() call. This message should explain what the regex pattern is looking for (the canonical leading columns of the INSERT statement) and help developers understand why the pattern match failed. This makes test failures easier to debug when the regex fails to match the expected structure in one of the files listed in SESSIONS_INSERT_FILES.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/shared/agent-sessions-insert-schema.test.ts`:
- Around line 6-39: The assertion on line 33 in the sessionsInsertColumns
function can be improved by adding a descriptive error message as the second
parameter to the expect() call. This message should explain what the regex
pattern is looking for (the canonical leading columns of the INSERT statement)
and help developers understand why the pattern match failed. This makes test
failures easier to debug when the regex fails to match the expected structure in
one of the files listed in SESSIONS_INSERT_FILES.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: c847a3ba-cf8a-4083-ac03-b4a1c318b0b4
📒 Files selected for processing (1)
tests/shared/agent-sessions-insert-schema.test.ts
…contract - Escape table + workspaceId with sqlStr in the schema-heal information_schema query (matches buildIntrospectionSql); an unescaped quote in a custom HIVEMIND_*_TABLE / workspace id would otherwise break the heal and silently disable recovery. - Add a test that locks the SCHEMA_HEAL contract (heals both sessions + memory for plugin_version), so dropping it can't silently re-break existing tables.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/pi/pi-extension-source.test.ts (1)
77-86: ⚡ Quick winConsider parsing the SCHEMA_HEAL structure instead of substring checks.
The test uses generic substring checks (lines 81-82) and a count-based assertion (line 85), which is inconsistent with the existing test helpers (
sessionsInsertColumns(),sessionsCreateColumns()) that parse the source into structured arrays. As per coding guidelines, tests should prefer asserting on specific values over generic substrings.The current approach could miss structural issues, such as:
- A table appearing in the wrong position within a tuple
- A third table accidentally added
- The healing entry paired with the wrong table
♻️ Suggested refactor to parse and verify the structure
- it("session_start SCHEMA_HEAL heals sessions + memory tables for plugin_version", () => { - const block = PI_SRC.match(/const SCHEMA_HEAL[^=]*=\s*\[([\s\S]*?)\];/); - expect(block).not.toBeNull(); - const heal = block![1]; - expect(heal).toContain("SESSIONS_TABLE"); - expect(heal).toContain("MEMORY_TABLE"); - const pluginVersionHeals = - heal.match(/\["plugin_version",\s*"TEXT NOT NULL DEFAULT ''"\]/g) ?? []; - expect(pluginVersionHeals.length).toBeGreaterThanOrEqual(2); - }); + function parseSchemaHeal(): Map<string, string[]> { + const block = PI_SRC.match(/const SCHEMA_HEAL[^=]*=\s*\[([\s\S]*?)\];/); + expect(block).not.toBeNull(); + const entries = new Map<string, string[]>(); + // Match each [TABLE_NAME, [[col, type], ...]] tuple + for (const m of block![1].matchAll(/\[([A-Z_]+),\s*\[([\s\S]*?)\]\]/g)) { + const table = m[1]; + const colDefs = m[2]; + const cols: string[] = []; + for (const col of colDefs.matchAll(/\["([^"]+)",\s*"[^"]+"\]/g)) { + cols.push(col[1]); + } + entries.set(table, cols); + } + return entries; + } + + it("session_start SCHEMA_HEAL heals sessions + memory tables for plugin_version", () => { + const heal = parseSchemaHeal(); + expect(heal.has("SESSIONS_TABLE")).toBe(true); + expect(heal.has("MEMORY_TABLE")).toBe(true); + expect(heal.get("SESSIONS_TABLE")).toEqual(["plugin_version"]); + expect(heal.get("MEMORY_TABLE")).toEqual(["plugin_version"]); + });This approach:
- Parses the actual array structure into a Map<table, columns[]>
- Asserts specific table names and their exact column lists
- Would catch if plugin_version is missing, duplicated, or paired with the wrong table
- Follows the same pattern as the existing
sessionsInsertColumns()/sessionsCreateColumns()helpers🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/pi/pi-extension-source.test.ts` around lines 77 - 86, The test currently uses generic substring checks with toContain() and count-based assertions on the SCHEMA_HEAL structure, which is inconsistent with existing test helpers like sessionsInsertColumns() and sessionsCreateColumns(). Refactor this test to parse the SCHEMA_HEAL array into a structured format (such as a Map mapping table names to their column definitions), then replace the generic substring assertions with specific assertions on the parsed structure. Verify that the SESSIONS_TABLE and MEMORY_TABLE are present with their exact column lists, and check that the plugin_version column with its specific definition appears in the correct tables, rather than relying on substring matching and length counts.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/pi/pi-extension-source.test.ts`:
- Around line 77-86: The test currently uses generic substring checks with
toContain() and count-based assertions on the SCHEMA_HEAL structure, which is
inconsistent with existing test helpers like sessionsInsertColumns() and
sessionsCreateColumns(). Refactor this test to parse the SCHEMA_HEAL array into
a structured format (such as a Map mapping table names to their column
definitions), then replace the generic substring assertions with specific
assertions on the parsed structure. Verify that the SESSIONS_TABLE and
MEMORY_TABLE are present with their exact column lists, and check that the
plugin_version column with its specific definition appears in the correct
tables, rather than relying on substring matching and length counts.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 1618b4c8-9b1f-47a6-915e-f93e2730b873
📒 Files selected for processing (2)
pi/extension-source/hivemind.tstests/pi/pi-extension-source.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- pi/extension-source/hivemind.ts
Problem
The
piagent extension (pi/extension-source/hivemind.ts) hard-codes itssessions/memoryCREATE TABLEinline instead of going throughDeeplakeApi.ensureSessionsTable/healMissingColumnslike every other agent (claude-code, codex, cursor, hermes).When
plugin_versionwas added to the canonicalSESSIONS_COLUMNS(2026-05-18), the pi INSERT picked it up but the inline CREATE did not — so every pi-createdsessionstable was 13 columns from birth, and pi had no schema-heal to add the missing column. Result: every session write failed with…permanently, with no recovery. Diagnosed in prod: pi org
c2d29f27generated 230 real 42703s over 2 days and ran zeroinformation_schemaintrospections (pi never heals), while claude-code/codex/cursor/hermes orgs self-healed. The org'ssessionsdataset was literally "initialized with 13 columns" — the stale pi CREATE.Fix
plugin_versionto the inlinesessCreate(andmemCreatefor consistency) so fresh pi tables match the canonical schema.session_startschema-heal: introspectinformation_schemaonce andALTER ADD COLUMNonly the genuinely-missing columns (neverIF NOT EXISTS— Deeplake returns HTTP 500 when the column already exists). Best-effort so it never blocks capture. Existing 13-column pi tables self-heal on next start.message_embeddingcheck).Tests
tests/pi/pi-extension-source.test.ts: two new invariants — fail before the fix (missing: ['plugin_version']), pass after.🤖 Generated with Claude Code
Summary by CodeRabbit
plugin_versiontracking to session and memory tables.plugin_versioncolumns after the sessions table becomes queryable (non-blocking).INSERTcolumn lists match the canonicalCREATE TABLEdefinitions, includingplugin_version.INSERTstatements and the canonical schema.