| Module | Purpose | Header-only | Packaged |
|---|---|---|---|
| instinct-proto | Protobuf message definitions | No | Yes |
| instinct-core | Functional interfaces and utilities | Yes | Yes |
| instinct-llm | LLM provider integration, prompt templates, tokenizer | Yes | Yes |
| instinct-transformer | Built-in support for transformer-based models other than chat completion, e.g. reranker, embeddings | Yes | Yes |
| instinct-retrieval | RAG components, notebook for evaluation | Yes | Yes |
| instinct-server | HTTP server for chains | Yes | Yes |
| instinct-assistant | OpenAI's assistant api implementation | Yes | No |
| instinct-apps | Reference apps and other samples | No | No |
Side notes:
- Header-only: This module contains only
.hppheader files.instinct-protorelies on.cand.hgenerated byprotoc, which cannot be header-only. This issue is tracked.instinct-appscontain executables so they won't be header-only too. - Packaged: This module will have
installrule and will be built as aconanpackage as well.instinct-appscontain applications that will be shipped in Docker images or brew cask packages instead.
| IInputParser sub-class | Input Type | Notes |
|---|---|---|
| PromptValueVariantInputPaser | PromptValueVariant | PromptValueVariant is a variant that accepts any possible values as a prompt value including std::string, Message, MessageList and more. |
LLM and ChatModel are not StepFunctions on their own. Related StepFunctionPtrs can be created using AsModelFunction method on them.
These model classes expect a PromptValue message type from context and produce a Generation message type.
Current available model classes are listed in following table.
| Model sub-class | Notes |
|---|---|
| OllamaLLM | Text completion API of Ollama |
| ChatLLM | Chat completion API of Ollama |
| ChatOpenAI | Chat completion API of OpenAI. It can be used with any OpenAI compatible server like nitro. |
Subclasses of IPromptTemplate are StepFunction. You can directly create chain with them. They are expecting JSONMappingContext from context and producing PromptValue message, and are thus often put before model functions.
| IPromptTemplate subclass | Notes |
|---|---|
| PlainPromptTemplate | To create a single message prompt using string template. |
| PlainChatPromptTemplate | To create a list of message prompts with string template. |
| FewShotPromptTemplate | To create a single message containing few-shot examples. |
Subclasses of IOutputParser<T> are not StepFunction. But specials operator overrides are implemented so that they can be concatenated with StepFunctionPtr.
Output parsers expect a Generation message from context. Some output parser may relay on certain format instruction as part of prompt value. A StepFunction for producing a string of format instruction can be generated with AsInstructorFunction, and the output should be used in a prompt template. To put these pieces together, common practice should be constructing a MappingStepFunction that containing format instruction, which is followed by a prompt template instance consuming map data.
| IOutputParser subclass | Output type | Note |
|---|---|---|
| StringOutputParser | std::string | To extract string value from a Generation message. |
| MultilineGenerationOutputParser | MultilineGeneration | To extract multi-line texts from a Generation message. |
More output parsr classes will be added in near future.
Subclass of IChatMemory are short memory for LLMs. They persist conversation records, which will be used in next round of prompting.
As there are two aspects regarding memory, two distinct StepFunctionPtr can be created with a single memory class. That's AsLoadMemoryFunction to obtain a StepFunctionPtr that load message list to context, and AsSaveMemoryFunction to save prompt and answer pair into internal storage.
| IChatMemory subclass | Note |
|---|---|
| EphemeralChatMemory | To save data in memory with no persistence involved |
More chat memory implementation will be added in near future.
IDocStore and IVectorStore are primary database for documents and their embeddings. A IVectorStore instance is similar to a IDocStore instance in terms of CURD operations with documents, but it has extra capabilities to embed documents as vectors and perform vector search on vector data.
Currently, only DuckDBDocStore and DuckDBVectorStore are implemented for local usage. DuckDBVectorStore handles documents recall in a brute-force way that scan entire table with embeddings using array_cosine_simliarity function. Performance would be degraded if database is large enough. In benchmarks with SIFT-1M dataset, a single recall would result round trip of more than three seconds on my local machine with 10-core M1 MAX CPU. Of course, optimizations could be done using ANNs like faiss, but it's totally reasonable for dataset less than 1M with limited concurrency.
The most important role in RAG pipeline is Retriever. Some common retrieving patterns are supported already.
| What are stored in DocStore | Query handling | What are stored in VectorStore | Returned Docs | |
|---|---|---|---|---|
| VectorStoreRetriever | original doc | raw query | embedding of original doc | original doc |
| CMV, ChunkedMultiVectorRetriever | original doc or chunked parent parts | raw query | embedding of chunked children of a parent part | original doc or chunked parent parts |
| CMV with summary as guidance | original doc | raw query | embedding of doc summary | original doc |
| CMV with hypothetical queries as guidance | original doc | raw query | embedding of generated questions | original doc |
| MultiQueryRetriever | original doc | multiple generated queries | embedding of original doc | original doc |
| MultiPathRetriever | original doc | raw query | embedding of original doc | original docs are re-ordered or filtered |
Please refer to following links for more exemplary usage for retrievers.
Ingestor are utility classes that load document data from data sources into retrievers.
| IIngestor sub-class | Data source | Notes |
|---|---|---|
| SingleFileIngestor | single plain text file | TXT,MD,HTML are supported with corresponding configurations. |
| PDFFileIngestor | single pdf file | Text are extracted and merged from text page objects. |
| ParquetFileIngestor | single source of parquet file | Both local file and remote url are supported. Columns mapping should be provided. |
| DirectoryTreeIngestor | a local directory | Files are globbed, recursively if requested. Selected files are then passed to another ingestor for single file handling. |