-
Notifications
You must be signed in to change notification settings - Fork 161
Description
Summary
When using fn.select() with groupBy(), the query returns only group keys ({ __key_0: ... }) instead of the computed fields from the functional select. This prevents using $selected in having/fn.having clauses after fn.select + groupBy.
Current Behavior
const result = createLiveQueryCollection({
query: (q) =>
q
.from({ sessions: sessionsCollection })
.groupBy(({ sessions }) => sessions.customer_id)
.fn.select((row) => ({
taskId: row.sessions.customer_id,
latestActivity: max(row.sessions.date),
sessionCount: count(row.sessions.id),
}))
.orderBy(({ $selected }) => $selected.latestActivity),
})
// Returns: [{ __key_0: 1 }, { __key_0: 2 }]
// Expected: [{ taskId: 2, latestActivity: Date, sessionCount: 1 }, ...]Expected Behavior
The query should return the computed fields from fn.select, similar to how regular .select() works with groupBy.
Root Cause Analysis
The issue is in how processGroupBy() handles queries with fnSelect:
-
fn.selectruns beforegroupBy(index.ts lines 223-236): Each row gets$selectedwith computed values from the functional select. -
processGroupBy()doesn't receivefnSelect(index.ts lines 260-267):pipeline = processGroupBy( pipeline, query.groupBy, query.having, query.select, // ← Only passes regular select, not fnSelect query.fnHaving, )
-
Without
selectClause, only keys are returned (group-by.ts lines 242-247):} else { // No SELECT clause - just use the group keys for (let i = 0; i < groupByClause.length; i++) { finalResults[`__key_${i}`] = aggregatedRow[`__key_${i}`] } }
-
The
$selectedfromfnSelectis lost because the else branch ignoresselectResults(which contains the pre-groupByfnSelectvalues).
Additional Complexity
Even if we preserve $selected from fnSelect, there's a semantic issue:
- With
groupBy, multiple rows are grouped together - But
fnSelectruns before groupBy, so each row has its own$selectedvalues - Which row's
$selectedshould be used for the group? The first one?
For aggregates like count(), sum(), max() used in fn.select:
- These return expression objects when
fnSelectruns (before groupBy) - The actual aggregation happens in the
groupByoperator - But
processGroupBy()doesn't know about these aggregates because they're inside a JS function, not an expression tree it can analyze
Failing Unit Tests
These tests demonstrate the expected behavior:
test(`orderBy can reference aggregate field from fn.select`, () => {
const sessionStats = createLiveQueryCollection({
startSync: true,
query: (q) =>
q
.from({ sessions: sessionsCollection })
.where(({ sessions }) => eq(sessions.status, `completed`))
.groupBy(({ sessions }) => sessions.customer_id)
.fn.select((row) => ({
taskId: row.sessions.customer_id,
latestActivity: max(row.sessions.date),
sessionCount: count(row.sessions.id),
}))
.orderBy(({ $selected }) => $selected.latestActivity),
})
// FAILS: Returns [{ __key_0: 1 }, { __key_0: 2 }]
expect(sessionStats.toArray).toEqual([
{ taskId: 2, latestActivity: new Date(`2023-02-01`), sessionCount: 1 },
{ taskId: 1, latestActivity: new Date(`2023-03-01`), sessionCount: 3 },
])
})
test(`HAVING can reference aggregate field from fn.select`, () => {
const sessionStats = createLiveQueryCollection({
startSync: true,
query: (q) =>
q
.from({ sessions: sessionsCollection })
.where(({ sessions }) => eq(sessions.status, `completed`))
.groupBy(({ sessions }) => sessions.customer_id)
.fn.select((row) => ({
taskId: row.sessions.customer_id,
latestActivity: max(row.sessions.date),
sessionCount: count(row.sessions.id),
}))
.having(({ $selected }) => gt($selected.sessionCount, 2)),
})
// FAILS: Returns []
expect(sessionStats.toArray).toEqual([
{ taskId: 1, latestActivity: new Date(`2023-03-01`), sessionCount: 3 },
])
})
test(`fn.having can reference aggregate field from fn.select`, () => {
const sessionStats = createLiveQueryCollection({
startSync: true,
query: (q) =>
q
.from({ sessions: sessionsCollection })
.where(({ sessions }) => eq(sessions.status, `completed`))
.groupBy(({ sessions }) => sessions.customer_id)
.fn.select((row) => ({
taskId: row.sessions.customer_id,
latestActivity: max(row.sessions.date),
sessionCount: count(row.sessions.id),
totalAmount: sum(row.sessions.amount),
}))
.fn.having(({ $selected }) => $selected.sessionCount > 2),
})
// FAILS: Returns 0 results
expect(sessionStats.size).toBe(1)
expect(sessionStats.toArray[0]?.taskId).toBe(1)
})Possible Solutions
Option 1: Preserve fnSelect results for non-aggregate fields
- Pass
fnSelectflag toprocessGroupBy() - In the else branch, preserve
$selectedfrom the first row of each group - Limitation: Only works for non-aggregate computed fields
Option 2: Run fnSelect after groupBy
- Change execution order so
fnSelectruns aftergroupBycompletes - The function would receive the aggregated row with computed aggregates
- This would require changes to the pipeline structure
Option 3: Document as unsupported
- Keep
fn.select+groupByas unsupported - Users should use regular
.select()for aggregates with groupBy fn.selectworks for non-grouped transformations
Workaround
For now, use regular .select() with groupBy for aggregates:
// This works:
.groupBy(({ sessions }) => sessions.customer_id)
.select(({ sessions }) => ({
taskId: sessions.customer_id,
latestActivity: max(sessions.date),
sessionCount: count(sessions.id),
}))
.having(({ $selected }) => gt($selected.sessionCount, 2))Related
- PR Fix: Enable $selected access in orderBy after fn.select #1183 - Added
$selectedsupport forfn.selectwithorderBy/having(works without groupBy) - PR Support references to aggregated fields from SELECT in ORDERBY and HAVING #1094 - Original
$selectedimplementation for regularselect