feat: add langfuse plugin for LLM observability#13095
feat: add langfuse plugin for LLM observability#13095sihyeonn wants to merge 7 commits intoapache:masterfrom
Conversation
|
This PR contains three fundamentally different features:
Apache APISIX community practice requires each PR to focus on a single feature. These three features have different review dimensions, testing requirements, and regression risks; please separate them into independent PRs. |
|
Hi @sihyeonn, following up on the previous review comments. Please let us know if you have any updates. Thank you. |
187a719 to
d3003c0
Compare
|
Hi @sihyeonn, my suggestion is to only keep the feature-related code in the current PR, and submit other changes in separate PRs. |
Add a new langfuse plugin that sends AI request traces to Langfuse via its ingestion API using batch-processor-manager pattern. Features: - Auto-detect AI endpoints (configurable patterns) - Generate trace-create and generation-create batch items per request - W3C traceparent header parsing and propagation - Token usage extraction (ai-proxy ctx > nginx vars > response body) - Support for X-Langfuse-Tags and X-Langfuse-Metadata headers - Basic auth with encrypt_fields for secret key protection - Connection keepalive for Langfuse API calls Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>
Move langfuse_host, langfuse_public_key, langfuse_secret_key, ssl_verify, timeout, detect_ai_requests, and ai_endpoints from per-route schema to metadata_schema (plugin_attr), following the opentelemetry plugin pattern. Per-route schema now only contains include_metadata. Remove encrypt_fields (custom feature, not in OSS). Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>
Remove /v1-prefixed entries since has_suffix matching already covers them via shorter suffixes (e.g. /chat/completions matches both /chat/completions and /v1/chat/completions). Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>
d3003c0 to
9c4b25a
Compare
Baoyuantop
left a comment
There was a problem hiding this comment.
Please add plugin doc
|
|
||
| local _M = { | ||
| version = 0.1, | ||
| priority = 398, |
There was a problem hiding this comment.
clickhouse-logger has same priority with langfuse
| if not trace_id then | ||
| local request_id = core.request.header(ctx, "X-Request-Id") | ||
| if request_id then | ||
| trace_id = request_id |
There was a problem hiding this comment.
According to the W3C traceparent specification, the trace_id must consist of 32 hex characters. The X-Request-Id is typically in UUID format (with hyphens, 36 characters); using it directly as the trace_id will result in a traceparent header that does not comply with the specification. Recommendation: Either remove the hyphens during conversion, or do not use the X-Request-Id as the trace_id.
Add English documentation for the langfuse plugin covering: - Plugin description and purpose - Plugin metadata attributes (global config via plugin_attr) - Per-route schema attributes - Enable/disable instructions with examples - Register in docs config.json under Loggers Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>
- Change priority from 398 to 396 to avoid conflict with clickhouse-logger - Strip hyphens from X-Request-Id and UUID-generated IDs when used as trace_id, ensuring compliance with W3C traceparent spec which requires trace_id to be exactly 32 lowercase hex characters - Validate that the resulting trace_id matches the expected format, falling back to a freshly generated UUID if it does not Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>
content_by_lua blocks do not inherit globals — add local core require in TEST 3 and TEST 4 to fix 'attempt to index global core (a nil value)'. Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>
| local plugin_name = "langfuse" | ||
| local batch_processor_manager = bp_manager_mod.new("langfuse logger") | ||
|
|
||
| local metadata_schema = { |
There was a problem hiding this comment.
Missing encrypt_fields: langfuse_secret_key in plugin_metadata will be stored in plaintext in etcd. Please either add encrypt_fields support or document this risk clearly. See lago.lua L111 for reference.
| if schema_type == core.schema.TYPE_METADATA then | ||
| return core.schema.check(metadata_schema, conf) | ||
| end | ||
| return core.schema.check(schema, conf) |
There was a problem hiding this comment.
check_schema uses unwrapped schema: _M.schema is wrapped via batch_processor_manager:wrap_schema(schema), but check_schema validates against the raw schema for non-metadata types. This means batch processor config fields (e.g., batch_max_size) will fail validation. Please change to:
return core.schema.check(_M.schema, conf)| - udp-logger # priority: 400 | ||
| - file-logger # priority: 399 | ||
| - clickhouse-logger # priority: 398 | ||
| - langfuse # priority: 396 |
There was a problem hiding this comment.
langfuse (priority 396) should be listed after tencent-cloud-cls (priority 397) to maintain descending priority order
- Add encrypt_fields for langfuse_secret_key in metadata_schema to prevent plaintext storage in etcd - Fix check_schema to validate against _M.schema (batch-processor-wrapped) instead of raw schema, so batch_max_size and other batch fields pass validation - Fix plugin list order in config.yaml.example to maintain descending priority order (tencent-cloud-cls:397 before langfuse:396) Signed-off-by: Sihyeon Jang <sihyeon.jang@navercorp.com>
Description
Add a new
langfuseplugin that sends LLM request traces to Langfuse for AI observability. The plugin uses the batch-processor-manager pattern (same as http-logger) to send trace data via Langfuse's ingestion API.Key features:
trace-createandgeneration-createbatch items per requesttraceparentheader parsing and propagation for distributed tracingctx.ai_token_usage(ai-proxy) > nginx vars > response body parsingX-Langfuse-TagsandX-Langfuse-Metadatacustom headersencrypt_fieldsfor secret key protection in etcdWhich issue(s) this PR fixes:
Fixes #
Checklist