The following is copied from the backend repo. Any changes made here should be copied back.

RAG SSE Events Reference

Purpose: Complete reference of Server-Sent Events (SSE) emitted by the RAG edge function
Last Updated: 2025-12-19
Source: rag/index.ts

Overview

The RAG edge function streams responses using Server-Sent Events (SSE). Each event is a JSON object with a type field indicating the event category. Events are wrapped with timestamps for client-side timing.

Event Wrapper Format

All events are wrapped with a timestamp:

{
  "type": "<event-type>",
  "<payload-fields>": "...",
  "timestamp": "2025-11-26T15:29:40.910Z"
}
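A minimal sketch of how a client might type and parse this wrapper, assuming each event arrives as a `data: ` line carrying JSON (as in the client example in this document). The names `SseEnvelope` and `parseSseLine` are illustrative and do not come from rag/index.ts. `timestamp` is typed as optional because some examples in this reference omit it.

```typescript
// Hypothetical client-side type; not taken from rag/index.ts.
interface SseEnvelope {
  type: string;
  timestamp?: string; // ISO 8601 string when present
  [payloadField: string]: unknown;
}

// Parse one line of the stream. Returns null for lines that are not
// "data: " lines, or whose payload is not a JSON object with a string `type`.
function parseSseLine(line: string): SseEnvelope | null {
  const prefix = "data: ";
  if (line.slice(0, prefix.length) !== prefix) return null;
  try {
    const parsed: unknown = JSON.parse(line.slice(prefix.length));
    if (typeof parsed !== "object" || parsed === null) return null;
    const candidate = parsed as Record<string, unknown>;
    if (typeof candidate.type !== "string") return null;
    return candidate as SseEnvelope;
  } catch {
    return null; // malformed JSON
  }
}
```

Returning null instead of throwing lets the read loop skip comments, blank separator lines, and malformed payloads without aborting the stream.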

Progress Events

Progress events track the lifecycle of a RAG query, from initialization to completion.

`progress` - Pipeline Progress

Location: index.ts:717-731

{
  "type": "progress",
  "phase": "<phase-name>",
  "detail": {
    /* phase-specific data */
  },
  "timestamp": "2025-11-26T15:29:40.910Z"
}

Phase Values

Phase	Detail Fields	Description
`first_event`	`{ ttf_ms: number }`	Time to first response - Server-measured duration from request start to first SSE event
`stream_initializing`	`{}`	Stream setup starting
`vectorstore_initializing`	`{ method: string }`	Vector store setup starting (method: "SimilaritySearch", "SelfQuery", etc.)
`vectorstore_ready`	`{ method: string }`	Vector store ready for queries
`starting_rag_chain`	`{}`	RAG chain execution starting
`rag_chain_started`	`{}`	RAG chain is running
`completed`	`{ duration_ms: number }`	Total duration - Query completed with total execution time

Example - First Event (Time to First Response):

{
  "type": "progress",
  "phase": "first_event",
  "detail": { "ttf_ms": 1704 },
  "timestamp": "2025-11-26T15:29:40.910Z"
}

Example - Completed (Total Duration):

{
  "type": "progress",
  "phase": "completed",
  "detail": { "duration_ms": 6500 },
  "timestamp": "2025-11-26T15:29:47.410Z"
}
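The two timing values can be pulled out of a sequence of progress events as sketched here. This is an illustration only: the type `RagProgressEvent` and the function `collectTimings` are hypothetical names, and the `detail` shape is a loose union of the fields listed in the Phase Values table.

```typescript
// Phase names as listed in the Phase Values table.
type ProgressPhase =
  | "first_event"
  | "stream_initializing"
  | "vectorstore_initializing"
  | "vectorstore_ready"
  | "starting_rag_chain"
  | "rag_chain_started"
  | "completed";

// Hypothetical client-side type; not taken from rag/index.ts.
interface RagProgressEvent {
  type: "progress";
  phase: ProgressPhase;
  detail: { ttf_ms?: number; duration_ms?: number; method?: string };
  timestamp: string;
}

interface Timings {
  ttfMs: number | null;   // from phase "first_event"
  totalMs: number | null; // from phase "completed"
}

// Fold a list of progress events into the two timing values they carry.
function collectTimings(events: RagProgressEvent[]): Timings {
  const timings: Timings = { ttfMs: null, totalMs: null };
  for (const ev of events) {
    if (ev.phase === "first_event" && typeof ev.detail.ttf_ms === "number") {
      timings.ttfMs = ev.detail.ttf_ms;
    } else if (ev.phase === "completed" && typeof ev.detail.duration_ms === "number") {
      timings.totalMs = ev.detail.duration_ms;
    }
  }
  return timings;
}
```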

Chain Events

LangChain RAG chain events stream the actual query processing.

`event` - Chain Lifecycle Events

Location: index.ts:910

{
  "type": "event",
  "event": "<langchain-event-type>",
  "data": {
    /* event-specific data */
  },
  "timestamp": "..."
}

Event Types

Event	Data	Description
`on_chain_start`	`{ input: { input: string } }`	Chain started with user question
`on_chain_stream`	`{ chunk: { answer?, context?, chatHistory?, sourceRelevance?, _pairs? } }`	Streaming chunk with partial data
`on_chain_end`	`{ output: { ... } }`	Chain completed with full output

Example - Chain Stream with Answer Chunk:

{
  "type": "event",
  "event": "on_chain_stream",
  "data": {
    "chunk": { "answer": "Rosie's" }
  },
  "timestamp": "..."
}

Example - Chain Stream with Source Relevance:

{
  "type": "event",
  "event": "on_chain_stream",
  "data": {
    "chunk": {
      "sourceRelevance": {
        "sourcesContributed": false,
        "confidence": 70,
        "reasoning": "The final answer provides specific information... but there is no clear evidence that this information was derived from the retrieved documents."
      }
    }
  },
  "timestamp": "..."
}
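A sketch of how a client might rebuild the answer text from `on_chain_stream` events. It assumes each `answer` value is a new fragment rather than the cumulative text, which this reference does not state explicitly. `ChainStreamEvent` and `assembleAnswer` are hypothetical names.

```typescript
// Hypothetical client-side type covering only the fields used here.
interface ChainStreamEvent {
  type: "event";
  event: string; // "on_chain_start", "on_chain_stream", or "on_chain_end"
  data: {
    chunk?: { answer?: string; [key: string]: unknown };
    [key: string]: unknown;
  };
}

// Concatenate the `answer` fragments of on_chain_stream events, in arrival order.
// Chunks that carry other keys (context, sourceRelevance, ...) are skipped.
function assembleAnswer(events: ChainStreamEvent[]): string {
  let answer = "";
  for (const ev of events) {
    if (ev.event !== "on_chain_stream") continue;
    const fragment = ev.data.chunk?.answer;
    if (typeof fragment === "string") answer += fragment;
  }
  return answer;
}
```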

Answer Events

`results` - Streaming Answer Text

Location: index.ts:926, 940, 949

{
  "type": "results",
  "chunk": "<answer-text-fragment>",
  "timestamp": "..."
}

Example:

{
  "type": "results",
  "chunk": "Rosie's Ethos revolves around",
  "timestamp": "..."
}

Source & Confidence Events

`events-answer-confidence` - Overall Answer Confidence

Location: index.ts:1340-1343

KPI Bar Metric: Answer Confidence %

{
  "type": "events-answer-confidence",
  "chunk": "{\"confidence\":70,\"reasoning\":\"...\",\"sourcesFound\":15,\"sourcesContributed\":false,\"llmAssessmentConfidence\":100}"
}

Parsed Chunk:

{
  "confidence": 70,
  "reasoning": "Answer based on general knowledge only. The final answer provides specific information but there is no clear evidence that this information was derived from the retrieved documents.",
  "sourcesFound": 15,
  "sourcesContributed": false,
  "llmAssessmentConfidence": 100
}

Field	Type	Description
`confidence`	number	Overall confidence 10-95%
`reasoning`	string	Human-readable explanation
`sourcesFound`	number	Documents retrieved from vector store
`sourcesContributed`	boolean	Key indicator - Did sources contribute to answer?
`llmAssessmentConfidence`	number	LLM's confidence in its own assessment

Confidence Ranges:

75-95%: Sources contributed meaningful information
65-74%: Sources found, moderate contribution
30-40%: Sources didn't contribute or none found
10-29%: No relevant sources in knowledge base
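The ranges above could be mapped to a label on the client as in this sketch. The band names and the function `confidenceBand` are made up for illustration. Values that fall outside the listed ranges (for example 41-64) are reported as "unclassified" rather than guessed.

```typescript
// Hypothetical labels for the documented ranges.
type ConfidenceBand =
  | "sources-contributed"   // 75-95
  | "moderate-contribution" // 65-74
  | "no-contribution"       // 30-40
  | "no-relevant-sources"   // 10-29
  | "unclassified";         // anything the list does not cover

// Map a confidence percentage to the band it falls in.
function confidenceBand(confidence: number): ConfidenceBand {
  if (confidence >= 75 && confidence <= 95) return "sources-contributed";
  if (confidence >= 65 && confidence < 75) return "moderate-contribution";
  if (confidence >= 30 && confidence <= 40) return "no-contribution";
  if (confidence >= 10 && confidence < 30) return "no-relevant-sources";
  return "unclassified";
}
```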

`events-sources-attention` - Source Attention Weights

Location: index.ts:1064-1070

{
  "type": "events-sources-attention",
  "chunk": "{\"bySource\":{...},\"asPercent\":{...}}"
}

`events-sources-data` - Source Contribution Data

Location: index.ts:1379-1401

{
  "type": "events-sources-data",
  "chunk": "{\"sources\":[{\"index\":1,\"percent\":45.2,\"title\":\"Document Title\",\"source\":\"https://example.com/doc\",\"chunkCount\":3}],\"usedAttention\":true,\"totalSources\":5}"
}

Parsed Chunk:

{
  "sources": [
    {
      "index": 1,
      "percent": 45.2,
      "title": "Document Title",
      "source": "https://example.com/doc",
      "chunkCount": 3
    }
  ],
  "usedAttention": true,
  "totalSources": 5
}

Field	Type	Description
`sources[].index`	number	1-based index of the source
`sources[].percent`	number	Contribution percentage (0-100) based on attention weights
`sources[].title`	string | null	Document title from vector metadata
`sources[].source`	string	Source URL
`sources[].chunkCount`	number	Number of chunks from this source used in the response
`usedAttention`	boolean	Whether attention-based scoring was used
`totalSources`	number	Total number of contributing sources
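Because `chunk` is itself a JSON string, the payload needs a second `JSON.parse` after the outer event has been parsed. The sketch below shows that step plus a simple ranking by `percent`. The type and function names are hypothetical, and the cast performs no runtime validation.

```typescript
// Shapes follow the field table above; names are illustrative.
interface SourceContribution {
  index: number;
  percent: number;
  title: string | null;
  source: string;
  chunkCount: number;
}

interface SourcesData {
  sources: SourceContribution[];
  usedAttention: boolean;
  totalSources: number;
}

// Second parse: `chunk` is a JSON-encoded string inside the event.
function parseSourcesData(chunk: string): SourcesData {
  return JSON.parse(chunk) as SourcesData; // trusts the server; no validation
}

// Return sources ordered by contribution, highest first, without mutating the input.
function rankSources(data: SourcesData): SourceContribution[] {
  return data.sources.slice().sort((a, b) => b.percent - a.percent);
}
```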

`events-sources-no-contribution` - Sources Did Not Contribute

Location: index.ts:1244-1255

{
  "type": "events-sources-no-contribution",
  "chunk": "{\"sourcesContributed\":false,\"documentsRetrieved\":15,\"confidence\":70,\"reasoning\":\"...\",\"message\":\"Sources were retrieved but did not contribute meaningful information to the answer\"}"
}

`events-sources-message` - Source Explanation Message

Location: index.ts:1010-1038

{
  "type": "events-sources-message",
  "chunk": "{\"message\":\"...\",\"reason\":\"llm_self_reflection\",\"confidence\":70,\"details\":\"...\"}"
}

`events-metadata-summary` - Document Metadata Summary

Location: index.ts:1179-1182

{
  "type": "events-metadata-summary",
  "chunk": "{\"https://example.com/doc\":{\"title\":\"...\",\"count\":16}}"
}

Transaction Events

`transaction-summary` - Query Transaction Data

Location: index.ts:1404-1407

KPI Bar Metrics: Rosie Cost, RoC Distributed

{
  "type": "transaction-summary",
  "chunk": "{\"platform_fee\":0.02,\"roc_credits_distributed\":0.01,\"usage_duration_seconds\":2.5,...}"
}

Parsed Chunk:

{
  "platform_fee": 0.0236,
  "roc_credits_distributed": 0.0142,
  "usage_duration_seconds": 3.4,
  "transaction_type": "query_usage",
  "transaction_id": "uuid-...",
  "balance_before": 99.9764,
  "balance_after": 99.9528,
  "sources": [
    {
      "source_url": "https://example.com/doc",
      "source_title": "Document Title",
      "contributor_id": "uuid-...",
      "contributor_name": "Jane Doe",
      "portion": 0.45,
      "roc_earned": 0.0064,
      "chunks_used": 3,
      "dublin_core": {
        "dc_title": "Document Title",
        "dc_creator": "Jane Doe",
        "dc_publisher": "Acme Corp",
        "dc_date": "2025-01-15",
        "dc_rights": "Confidential"
      }
    }
  ]
}

Field	Type	Description
`platform_fee`	number	Rosie Cost - Amount charged for query
`roc_credits_distributed`	number	RoC Distributed - Credits to content contributors
`usage_duration_seconds`	number	Query processing duration
`transaction_id`	string	Unique transaction identifier
`balance_before`	number	User balance before query
`balance_after`	number	User balance after query
`sources`	array	Per-source RoC breakdown (see Source Fields below)

Source Fields

Each source in the sources array contains:

Field	Type	Description
`source_url`	string	Document URL
`source_title`	string | null	Document title from vector metadata
`contributor_id`	string	ID of the user who contributed the document
`contributor_name`	string | null	Full name of the contributor
`portion`	number	Content portion (0-1) based on attention weights
`roc_earned`	number | null	RoC credits earned (null if self-use)
`chunks_used`	number	Number of chunks from this source used
`dublin_core`	object | null	Dublin Core metadata for popover display

Dublin Core Metadata

The dublin_core object (when present) contains ISO 15836 metadata fields:

Field	Type	Description
`dc_title`	string | null	Document title
`dc_creator`	string | null	Author/creator
`dc_publisher`	string | null	Publishing entity
`dc_date`	string | null	Publication date
`dc_rights`	string | null	Copyright/license info
`dc_description`	string | null	Brief summary
`dc_source`	string | null	Original source URL
`dc_identifier`	string | null	DOI, ISBN, etc.

Note: dublin_core may be null if no Dublin Core metadata was provided during document upload.
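A sketch of two small helpers a client might apply to a parsed `transaction-summary` payload: summing per-source RoC while treating `null` (self-use) as zero, and computing how much the balance changed. The interfaces cover only the fields used here, and all names are illustrative. In the example payload above the balance change equals `platform_fee`; whether that holds for every transaction is not stated in this reference.

```typescript
// Subset of the documented fields; names of the types are hypothetical.
interface TransactionSource {
  source_url: string;
  portion: number;           // 0-1
  roc_earned: number | null; // null for self-use
  chunks_used: number;
}

interface TransactionSummary {
  platform_fee: number;
  roc_credits_distributed: number;
  balance_before: number;
  balance_after: number;
  sources: TransactionSource[];
}

// Sum per-source RoC, counting null (self-use) as zero.
function totalRocEarned(tx: TransactionSummary): number {
  return tx.sources.reduce((sum, s) => sum + (s.roc_earned ?? 0), 0);
}

// Balance change, rounded to 4 decimals to hide floating-point noise.
function balanceDelta(tx: TransactionSummary): number {
  return Math.round((tx.balance_before - tx.balance_after) * 1e4) / 1e4;
}
```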

Model Information Events

`model_info` - LLM Provider and Model Information

Location: index.ts:850-856

Sent at the start of streaming to inform the client which LLM provider and model are being used.

{
  "type": "model_info",
  "provider": "openai",
  "model": "gpt-4o-mini",
  "mode": "platform",
  "timestamp": "2025-12-13T10:30:00.000Z"
}

Field	Type	Values	Description
`provider`	string	`"openai"`, `"anthropic"`	LLM provider being used
`model`	string	e.g., `"gpt-4o-mini"`, `"claude-sonnet-4-20250514"`	Specific model name
`mode`	string	`"byok"`, `"platform"`	Whether the user supplied their own API key (BYOK) or the platform key is used

Mode Values:

"byok": User provided their own API key via x-openai-key or x-anthropic-key header
"platform": Using platform's default API key (model is enforced to platform default)

Example - BYOK Mode (User's API Key):

{
  "type": "model_info",
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "mode": "byok",
  "timestamp": "..."
}

Example - Platform Mode (Platform API Key):

{
  "type": "model_info",
  "provider": "openai",
  "model": "gpt-4o-mini",
  "mode": "platform",
  "timestamp": "..."
}
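A client that wants to validate this event before displaying it could use a type guard along these lines. It is a sketch: `ModelInfo` and `isModelInfo` are hypothetical names, and the accepted `provider` and `mode` values are the ones listed in the table above.

```typescript
// Shape follows the field table above.
interface ModelInfo {
  type: "model_info";
  provider: "openai" | "anthropic";
  model: string;
  mode: "byok" | "platform";
  timestamp?: string;
}

// Runtime check that an already-parsed value looks like a model_info event.
function isModelInfo(value: unknown): value is ModelInfo {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v.type === "model_info" &&
    (v.provider === "openai" || v.provider === "anthropic") &&
    typeof v.model === "string" &&
    (v.mode === "byok" || v.mode === "platform")
  );
}
```

A strict guard like this rejects providers added later, so a real client may prefer to accept any string for `provider` and fall back to a generic label.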

Utility Events

`heartbeat` - Keep-Alive

Location: index.ts:743

Sent every 15 seconds to keep the connection alive.

{
  "type": "heartbeat",
  "timestamp": "2025-11-26T15:29:55.910Z"
}
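The 15-second cadence also gives clients a way to notice a stalled connection. The sketch below treats the stream as stale once more than two heartbeat intervals pass without any event. The threshold of two intervals and the names used are assumptions made for illustration, not behavior defined by the edge function.

```typescript
const HEARTBEAT_INTERVAL_MS = 15000; // cadence stated in this section

// True when no event (heartbeat or otherwise) has arrived for more than
// `missedBeats` heartbeat intervals. Times are epoch milliseconds.
function isConnectionStale(
  lastEventAtMs: number,
  nowMs: number,
  missedBeats: number = 2,
): boolean {
  return nowMs - lastEventAtMs > HEARTBEAT_INTERVAL_MS * missedBeats;
}
```

A client would record `Date.now()` whenever any event arrives and call `isConnectionStale` from a timer, reconnecting or surfacing an error when it returns true.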

`error` - Error Messages

Location: index.ts:801, 864, 1437

{
  "type": "error",
  "message": "Detailed error message"
}

KPI Bar Event Mapping (Phase 3.5.2)

KPI Metric	SSE Event	Field Path	Display
Duration to First	`progress` (phase=first_event)	`detail.ttf_ms`	`1.7s`
Total Duration	`progress` (phase=completed)	`detail.duration_ms`	`6.5s`
Rosie Cost	`transaction-summary`	`platform_fee`	`$0.02`
RoC Distributed	`transaction-summary`	`roc_credits_distributed`	`$0.01`
IPR Costs	N/A (placeholder)	-	`$0.00`
Answer Confidence	`events-answer-confidence`	`confidence`, `sourcesContributed`	`70% ⚠`

Warning Icon Logic: Show ⚠ when sourcesContributed is false.
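The Display column can be produced with formatters like the ones sketched here. The function names are hypothetical; the rounding (one decimal for seconds, two for dollar amounts) is inferred from the sample values in the table rather than taken from the KPI bar implementation.

```typescript
// Milliseconds to seconds with one decimal, e.g. for detail.ttf_ms.
function formatSeconds(ms: number): string {
  return (ms / 1000).toFixed(1) + "s";
}

// Dollar amount with two decimals, e.g. for platform_fee.
function formatDollars(amount: number): string {
  return "$" + amount.toFixed(2);
}

// Confidence percentage, with the warning icon when sources did not contribute.
function formatConfidence(confidence: number, sourcesContributed: boolean): string {
  return confidence + "%" + (sourcesContributed ? "" : " ⚠");
}
```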

Client Implementation Example

// With EventSource (GET requests only, no request body):
const eventSource = new EventSource("/functions/v1/rag", {
  /* ... */
});

// Or with fetch for POST requests:
const response = await fetch("/functions/v1/rag", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ question: "..." }),
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

let kpiData = {
  ttfMs: null,
  totalDurationMs: null,
  platformFee: null,
  rocDistributed: null,
  confidence: null,
  sourcesContributed: null,
};

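let buffer = ""; // holds a partial line that spans two network chunks

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // stream: true keeps multi-byte characters intact across chunk boundaries
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // last element is incomplete until its newline arrives

  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;

    const event = JSON.parse(line.slice(6));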

    switch (event.type) {
      case "progress":
        if (event.phase === "first_event") {
          kpiData.ttfMs = event.detail.ttf_ms;
        } else if (event.phase === "completed") {
          kpiData.totalDurationMs = event.detail.duration_ms;
        }
        break;

      case "results":
        // Append to answer display
        appendAnswer(event.chunk);
        break;

      case "events-answer-confidence": {
        // chunk is a JSON string, so it needs a second parse
        const conf = JSON.parse(event.chunk);
        kpiData.confidence = conf.confidence;
        kpiData.sourcesContributed = conf.sourcesContributed;
        break;
      }

      case "transaction-summary": {
        const tx = JSON.parse(event.chunk);
        kpiData.platformFee = tx.platform_fee;
        kpiData.rocDistributed = tx.roc_credits_distributed;
        break;
      }
    }
  }
}

// Update KPI bar
updateKpiBar(kpiData);
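Network chunks do not necessarily end on line boundaries, so a single `data: ` line can be split across two reads. One way to handle this is a small helper that returns only complete lines and carries the partial tail over to the next read. This is a sketch: `takeCompleteLines` is a hypothetical name, and it assumes `\n` line endings as the loop above does.

```typescript
interface LineSplit {
  lines: string[];   // complete lines, without the trailing newline
  remainder: string; // partial last line, to be passed in with the next chunk
}

// Join the carried-over remainder with the new chunk text and split off
// every complete line. Whatever follows the last newline is returned as
// the new remainder.
function takeCompleteLines(remainder: string, chunkText: string): LineSplit {
  const parts = (remainder + chunkText).split("\n");
  const rest = parts.pop() ?? "";
  return { lines: parts, remainder: rest };
}
```

Because the helper is a pure function, the chunk-boundary cases can be unit tested without a network connection.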

Related Documentation

How-RAG-Works.md - RAG pipeline overview
DASHBOARDS_README.md - Dashboard system documentation
TRANSACTION_MINI_DASHBOARD.md - Transaction dashboard component
PHASE3.5_PLAN.md - KPI bar requirements

Change Log

Date	Change
2026-01-16	Added `title` and `chunkCount` fields to `events-sources-data` sources array
2025-12-19	Added `dublin_core` metadata to `transaction-summary` sources array; documented all source fields
2025-12-13	Added `model_info` event for multi-provider LLM support
2025-11-25	Created SSE events reference documentation
