Conversation

@ascorbic
Contributor

@ascorbic ascorbic commented May 3, 2025

Summary

Adds support for live data to content collections. Defines a new type of content loader that fetches data at runtime rather than build time, allowing users to query the data with an API similar to existing content collections.

import { defineCollection } from "astro:content";

const products = defineCollection({
  type: "live",
  loader: {
    name: "store-loader",
    loadCollection: async ({ filter }) => {
      // ...
      return {
        entries: products.map((product) => ({
          id: product.id,
          data: product,
        })),
      };
    },
    loadEntry: async ({ filter }) => {
      // ...
      return {
        id: filter.id,
        data: product,
      };
    },
  },
});
export const collections = { products };

Links

@ascorbic ascorbic force-pushed the feat/live-loaders branch 3 times, most recently from fcd8968 to 06aeff3 on May 3, 2025 07:19
@ascorbic ascorbic force-pushed the feat/live-loaders branch from 06aeff3 to 9502a4f on May 3, 2025 07:24
@ascorbic ascorbic marked this pull request as draft May 3, 2025 07:25
@ascorbic ascorbic changed the title from Add live content loaders RFC to Live content loaders on May 3, 2025
@ascorbic ascorbic self-assigned this May 3, 2025
@ematipico
Member

I noticed that the RFC doesn't cover error handling. Can we cover that part? Things that could go wrong:

  • timeout while fetching a collection
  • invalid data
  • parsing error of data
  • more?

Do live collections provide a means to handle these errors gracefully? If so, how? If not, how can users mitigate them? Looking forward to seeing this part covered by the RFC.

@sarah11918
Member

I know everyone's super excited for real-time, updating data! 🎉 Just some thoughts from me that came to mind while reading!

You've mentioned user confusion between "similar but different" APIs in the drawbacks, but I worry about using identically-named helper functions that do (and take? and return?) different things.

getEntry() and getCollection() are familiar to existing content collections users. (But, not familiar to someone who hasn't used existing collections!) But with different implementations under the hood, I wonder whether the similarity is actually an advantage, or just a greater chance for confusion? (Assuming I'm understanding this properly.)

From a docs/support standpoint:

"What arguments does getCollection() take?" "Oh, depends which kind of collection you're querying. It can take a filtering function, or a query object."

"What does getCollection() return?" "Oh, it depends. There might be a cacheHint object included."

At that point, I'd probably prefer to document a getLiveEntry() and getLiveCollection() and then you can do whatever you want with those that makes sense for their specific context, and not have to shoehorn them into existing functions. 😄

I could see keeping identical naming for e.g. people being able to keep their existing querying code when swapping out an existing collection for a live content loader collection. But it doesn't seem like this is a smooth (or even envisioned?) path anyway? They aren't even configured in the same file, so treating these as a completely new and different thing seems reasonable?

I'm also assuming that creating new functions probably also means we run less risk of introducing unexpected behaviour in existing projects by changing getCollection() and getEntry() to now be "smart" and know which kind of collection they are querying?

(Feel free to ignore the rest, but I would be remiss if I didn't at least bring it up...)

There's even a world where "Content Collections" remain "at build/request time" and this is an entirely new paradigm (just to open up the world of possibilities)! These functions feel like an example of being superimposed onto our existing collections where maybe they want to be something different. And if you were allowed to operate "outside the collections/loader box", maybe there are more cases where you'd feel the freedom of flexibility? (e.g. trying to follow the established pattern of a config file for something that isn't really a config...) Maybe it's not even a "loader", but a Firehose!

(Would it also be easier to launch/promote/write about? Easier for content creators not to have to update their existing "content collections" material yet again, but instead get to talk about a "new thing" now: real-time, live, updating data? Avoiding the need to further distinguish between "which version of content collections are you using?")

(Again, assuming I'm understanding everything! This just feels like some pretty significant differences that are, as you said, close to, but not exactly the same as, some existing things. And from a docs / support standpoint, these are always the most challenging things to handle!)

@ascorbic
Contributor Author

@sarah11918 my rationale is that they are returning the same thing. We already do a lot of smart stuff under the hood so that the same function works for content layer collections, compat-mode collections, and legacy data and content collections. These all have very different internal implementations, but for the user they are all queried in the same way. I think the best argument for separating them is that these are "more different" than the others.

@ascorbic
Contributor Author

I've updated the RFC with two changes based on @ematipico and @sarah11918's feedback:

  • change to use separate exports: getLiveCollection and getLiveEntry
  • add explicit error handling. It now returns an object with { data, error } instead of directly returning the data or throwing
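
For illustration, here's a minimal usage sketch of that pattern in a page's frontmatter. The property names (entries, entry) follow the examples later in this thread, and the error handling shown is just one option; the exact shape may differ from the final API.

// Hypothetical usage sketch of the { data, error } result shape described above.
import { getLiveCollection, getLiveEntry } from "astro:content";

const { entries: products, error } = await getLiveCollection("products");
if (error) {
  // Handle the failure gracefully instead of throwing during render,
  // e.g. log it and show a fallback UI.
  console.error(error);
}

const { entry: product, error: entryError } = await getLiveEntry("products", Astro.params.id);
if (entryError || !product) {
  return Astro.redirect("/404");
}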

@stipsan

stipsan commented May 27, 2025

  • support for user-defined Zod schemas, executed at runtime, to validate or transform the data returned by the loader

Is it possible for an Astro integration to define generated schemas and plug them into the Astro typegen pipeline?

For Sanity we're interested in providing a loader creator that can hook into our TypeGen pipeline in order to allow a minimal API that is fully typed:

// src/sanity.types.ts (generated by TypeGen, simplified here)
export type Product = {
  _id: string
  _type: 'product'
  slug: string
  title: string
  category: string[]
}


// src/live.config.ts
import {defineLiveCollection, type ExtractFilterFromType} from '@sanity/astro'

const products = defineLiveCollection({type: 'product'});

export const collections = { products };

With the inference we can type it so that TS would throw on a type that doesn't exist in the Sanity Schema:

const products = defineLiveCollection({type: 'produc'});
                                             // ^? 'produc' doesn't exist. Did you mean 'product'?

A valid type lets us infer the return type of defineLiveCollection({type: 'product'}) to be:

// somehow `@sanity/astro` feeds Astro this typegen
import {type InferData, type InferEntryFilter, type InferCollectionFilter} from '@sanity/astro'
import type { LiveLoader } from "astro/loaders";
import type {Product} from './sanity.types'

type ProductData = InferData<Product>
//   ^? {_id:string; _type:'product'; slug: string; title: string; category: string[]}
type ProductEntryFilter = InferEntryFilter<Product>
//   ^? {slug: string; title: string; category: string[]}
type ProductCollectionFilter = InferCollectionFilter<Product>
//   ^? {_id: string | string[]; slug: string | string[]; title: string | string[]; category: string[]}

interface SanityLoaders {
  'product': LiveLoader<ProductData, ProductEntryFilter, ProductCollectionFilter>
}

export function defineLiveCollection<DocumentType extends keyof SanityLoaders>(options: { type: DocumentType }): SanityLoaders[DocumentType]

And all the usage sites would be fully typed:

const { entries: allProducts } = await getLiveCollection("products");
//               ^? {_id:string; _type:'product'; slug: string; title: string; category: string[]}

const { entries: clothes } = await getLiveCollection("products", {
  categories: ["clothes"]
  // ^? `categories` doesn't exist on `ProductCollectionFilter`, did you mean `category`?
});

const { entry: productById } = await getLiveEntry("products", Astro.params.id);
//             ^? {_id:string; _type:'product'; slug: string; title: string; category: string[]}
const { entry: productBySlug } = await getLiveEntry("products", { id: Astro.params.slug });
//                                                              ^? `id` doesn't exist on `ProductEntryFilter`, did you mean `slug`?

@ascorbic
Contributor Author

@stipsan that's interesting. Which part of that would be missing now? Have you seen the existing injectTypes helper that lets you generate a d.ts? The Zod schema is optional, so as it stands, the way you've implemented it there with the generic would type it correctly for the user. You can presumably trust the Sanity API to return the correct data?
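
For context, a rough sketch of how an integration can use the existing injectTypes helper from the astro:config:done hook; the module augmentation content and integration name here are placeholders, not a real Sanity API:

// Hypothetical integration sketch using the existing injectTypes helper.
import type { AstroIntegration } from "astro";

export function sanityTypegen(): AstroIntegration {
  return {
    name: "sanity-typegen",
    hooks: {
      "astro:config:done": ({ injectTypes }) => {
        // In a real integration this content would come from the TypeGen pipeline.
        injectTypes({
          filename: "sanity.d.ts",
          content: `declare module "@sanity/astro" { /* generated types */ }`,
        });
      },
    },
  };
}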

@stipsan

stipsan commented May 28, 2025

@ascorbic Thanks! I’ll check out that helper and give it a spin in a POC 🙌

You can presumably trust the Sanity API to return the correct data?

Yes and no. We don’t yet enforce server-side schema validation on writes to Content Lake; it’s currently handled client-side by Sanity Studio.

However, Sanity Studio isn’t the only entity that can write to Content Lake. Content Lake accepts any valid JSON document. Users frequently automate content imports from external systems (Shopify, Salesforce, Mux, etc.) directly into Content Lake.

The only strict guarantee Content Lake provides is compliance with its query language rules (GROQ). For example, if I query:

*[_type == "post"]{_type, _id, title}

We know upfront, per GROQ rules, this query returns either an array of objects with keys _type, _id, and title, or an empty array. _id and _type are special system properties (_type must be a string, and _id must be globally unique). From the filter, we know _type is 'post'. But title could be any valid JSON. At runtime, the typesafe declaration becomes:

type Json =
  | string
  | number
  | boolean
  | null
  | { [key: string]: Json | undefined }
  | Json[];

type QueryResult = {
  _id: string;
  _type: 'post';
  title: Json;
}[];

The TypeGen can analyze your sanity.config.ts from Sanity Studio to narrow down types:

type QueryResult = {
  _id: string;
  _type: 'post';
  title: string | null;
}[];

However, since this isn’t enforced, parser codegen becomes beneficial.

It seems achievable by using Astro’s createCodegenDir alongside generating Zod parsers when calling injectTypes. I’ll explore implementing Astro’s Content Loader API to verify this.
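
As a rough sketch of that idea (file names are placeholders, and the generated Zod source is assumed to come from a separate TypeGen step), an integration could write the generated parsers into the directory returned by createCodegenDir during astro:config:setup:

// Hypothetical sketch: write generated Zod parsers into Astro's codegen directory
// so loaders can import them at runtime. The generated source is produced
// elsewhere by a TypeGen step and passed in here.
import { writeFileSync } from "node:fs";
import type { AstroIntegration } from "astro";

export function sanityCodegen(generatedZodSource: string): AstroIntegration {
  return {
    name: "sanity-codegen",
    hooks: {
      "astro:config:setup": ({ createCodegenDir }) => {
        const codegenDir = createCodegenDir(); // URL to the integration's .astro/ directory
        writeFileSync(new URL("schemas.mjs", codegenDir), generatedZodSource, "utf-8");
      },
    },
  };
}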

I imagine it’d be helpful for integration authors to have a dedicated API for schema codegen, similar to injectTypes, generating Zod schemas usable by both build-time and live loaders. This is particularly beneficial when dealing with live content previews during publishing workflows.

By default, our TypeGen creates types reflective of draft document states. Drafts may be incomplete or temporarily invalid—like images awaiting captions. Hence, customization options for userland are valuable. Users might be fine with handling string | null, or they might prefer coercion via parsers:

// src/sanity.types.ts (simplified example reflecting live preview states)
export type Product = {
  _id: string;
  _type: 'product';
  slug: string | null;
  title: string | null;
  category: string[] | null;
};

import { z } from 'astro:content';
import { defineLiveCollection } from '@sanity/astro';

const products = defineLiveCollection({
  type: 'product',
  schema: (schema) => schema.extend({
    title: schema.shape.title.unwrap().default('Untitled'),
    category: z.preprocess((val) => (Array.isArray(val) ? val : []), schema.shape.category.unwrap()),
  }),
// ^? the type of Product is now narrowed to `{...Product, title: string | 'Untitled'; category: string[]}`
});

export const collections = { products };

We frequently see this challenge with live preview-capable applications, especially as the studio schema evolves over time.

@boutell

boutell commented Jun 4, 2025

I'm coming at this from a slightly different perspective. For ApostropheCMS, we decided our first priority was to make on-page editing possible within Astro.

To do that, we decided to invert the control flow. While Astro is of course the front end, in a combined ApostropheCMS / Astro project there is a [...slug].astro route that does a lot of the lifting. That Astro route makes a call to ApostropheCMS, which responds with the information Astro needs to render that page.

And even after Astro starts rendering a page template, it still often needs to insert user-edited content at a particular point, in a way that maintains editability. So we do that by providing an ApostropheArea Astro component, which in turn invokes various widget-specific Astro components to render different types of content widgets.

This allows ApostropheCMS development to proceed as normal, but with Astro as the rendering engine, mapping the concepts of our CMS one to one to various folders of Astro templates that produce 100% of the actual markup.

Since the CMS must know and manage the structure of all of the data, it makes sense to make those representations in one place (e.g. the CMS is the model layer), and then let Astro concern itself exclusively with presentation (Astro is the view layer).

So at least for right now, we probably wouldn't use live content loaders very much.

However, it could make sense for us to support them in the future, particularly if we find ourselves talking to customers who are less interested in ApostropheCMS "driving the bus" as it were, and more interested in static builds with the addition of some dynamic content from the CMS, managed via the CMS back end.

This isn't a criticism of the new API, which looks well-suited to its purpose, as long as error handling is taken into account. I bring all this up just to create awareness of different perspectives and use cases that might not map to it as expected.

@ascorbic ascorbic changed the title from Live content loaders to Live content collections on Jun 11, 2025
@gingerchew

gingerchew commented Jun 14, 2025

I just watched the vod of the TBD the other day and saw that the object passed to defineLiveCollection has a property type: 'live'. Is that redundant now that the live collections are defined in a function of their own as opposed to piggybacking on defineCollection? Feels like a footgun I would gladly end up firing multiple times myself 😅

After talking with Fryuni in the discord, I have a better understanding of why the type property is there. Ignore my previous comment :)

@Alynva

Alynva commented Jun 19, 2025

I've read the announcement and the docs, but haven't tested it yet. Is this feature exclusive to SSR? I always find it confusing that the docs don't explicitly state which rendering methods a feature is compatible with. I mean, after understanding what the feature is and what its capabilities are, it's not hard to guess. But it would be way better if it was just stated at the start of the docs page, kinda like the "added in" field...

@ascorbic
Contributor Author

Having spent some time working on building loaders, I think it would be good to add lastModified to cacheHint, and possibly remove maxAge because that's not something that loaders can normally decide.
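
For illustration only, a loadEntry result using that proposed shape might look like the sketch below. fetchProduct and the product fields are placeholders, and this is not the current API.

// Hypothetical sketch of the proposed cacheHint change: report lastModified,
// and omit maxAge since the loader usually can't decide how long to cache.
declare function fetchProduct(id: string): Promise<{ id: string; updatedAt: string }>;

const storeLoader = {
  name: "store-loader",
  loadEntry: async ({ filter }: { filter: { id: string } }) => {
    const product = await fetchProduct(filter.id);
    return {
      id: product.id,
      data: product,
      cacheHint: {
        tags: [`product-${product.id}`],
        lastModified: new Date(product.updatedAt),
      },
    };
  },
};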

@palockocz

Hi, based on a thread on Discord:
https://discord.com/channels/830184174198718474/1390746735290355723

Would it be possible to return other/custom data as well? For example:

return {
  entries: posts.map((post) => ({
    id: post.slug.current,
    data: post,
  })),
  totalPage: 10 // This extra field is not passed through
};

@deslunes

Hi! I'm really interested in content collections, and making them live extends them to more use cases. But I'm a bit concerned about on-request fetching: depending on the collection, it could end up costing more than my budget, or hit service limits.

I don't really know if this should be part of live content collections or a separate suggestion, but I think it could be a great addition to be able to fetch every X amount of time, in a cron-like way, and cache the data until the next fetch happens.

I think it's a great middle ground between on-request fetching and build-time fetching: it's not as responsive, but it gives the developer enough control for the kind of data they want to implement. The content collection could be updated every minute, every hour, or at a specific time on a specific day.
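
In the meantime, one userland mitigation could be a small time-based cache inside the loader itself. This is just a sketch, assuming the loader module stays alive between requests in a long-lived server process; fetchProducts is a placeholder.

// Sketch: cache the upstream response for a fixed interval so repeated
// requests don't each hit the external service.
declare function fetchProducts(): Promise<Array<{ id: string }>>;

const TTL_MS = 60_000; // refetch at most once per minute
let cached: { entries: Array<{ id: string; data: unknown }>; fetchedAt: number } | undefined;

const cachedStoreLoader = {
  name: "cached-store-loader",
  loadCollection: async () => {
    if (cached && Date.now() - cached.fetchedAt < TTL_MS) {
      return { entries: cached.entries };
    }
    const products = await fetchProducts();
    cached = {
      entries: products.map((product) => ({ id: product.id, data: product })),
      fetchedAt: Date.now(),
    };
    return { entries: cached.entries };
  },
};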

@tumes

tumes commented Oct 3, 2025

I apologize because I'm just now starting to wrap my head around the internals of Astro and this may be a moot question because I am just not seeing something but: Is there a mechanism to expose bindings from adapters into loaders? Or perhaps that particular burden will fall more on the adapters than the base loader implementation? Mostly curious because Cloudflare requires service bindings for calls within the same zone so the most immediate solution to get around that is to separate things with custom domains but I reckon exposing those bindings for loaders would be useful even outside of that restriction since most of the bindings are for data stores anyway.

edit: It’s super late where I am so I can’t test it until tomorrow, but my last thought when I was drifting to sleep was whether I could hack around this by passing a binding in as a filter. Consequently would it make sense to have an argument where an unexposed data source can be passed down the chain from a runtime call for a collection or entry, or would that be too clunky or infeasible?

Edit 2: Yep, super hacky but it does work to pass it in as a filter. Do you think there's merit to adding another argument for injecting a dependency like a binding to loaders?
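
For reference, a sketch of that workaround is below. Every name here (STORE, ServiceBinding, the internal URL) is a hypothetical placeholder for a Cloudflare service binding, not an Astro or Cloudflare API guarantee.

// Sketch: pass a runtime binding through the filter argument so the loader can use it.
type ServiceBinding = { fetch: (input: string) => Promise<Response> };

const storeLoader = {
  name: "store-loader",
  loadCollection: async ({ filter }: { filter?: { binding?: ServiceBinding } }) => {
    // Called from a page as, e.g.:
    //   getLiveCollection("products", { binding: Astro.locals.runtime.env.STORE })
    const response = await filter?.binding?.fetch("https://internal/products");
    const products: Array<{ id: string }> = response ? await response.json() : [];
    return { entries: products.map((product) => ({ id: product.id, data: product })) };
  },
};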

@JonathonRP

I don't see any statement anywhere about whether this works fine with just a static build or whether SSR is needed.
So does this work fine with just a static build and deployment?

@delucis
Member

delucis commented Dec 2, 2025

So does this work fine with just static build and deployment?

It does, but there are no advantages over the current content collections.

@JonathonRP

It does, but there are no advantages over the current content collections.

Interesting, so it should work fine as if it were just a regular content collection? But to get the benefits of live, SSR is needed, correct?

@delucis
Member

delucis commented Dec 2, 2025

Interesting, so it should work fine as if it were just a regular content collection? But to get the benefits of live, SSR is needed, correct?

Correct. There may be subtle cases where there are advantages either way, but basically that’s right, yes.

For example, maybe you have a really big data source, and only need some of its data to build the site — a live collection would avoid loading all that data up front in a traditional collection loader. On the flip side, because live loaders load content at render time, if you are building a static site it might be less efficient to repeatedly request that data across multiple pages with a live loader than to load it once upfront.
