RAG Edge Function

Purpose: Streaming Retrieval-Augmented Generation (RAG) endpoint for knowledge base Q&A with RoC (Return on Contribution) credit distribution.

Status: Active
Last Updated: 2026-01-20
Version: 2.2.0


Overview

The RAG edge function is the core Q&A endpoint that:

  • Streams LLM-generated answers via Server-Sent Events (SSE)
  • Retrieves relevant documents from Supabase vector store
  • Calculates source attention weights for citation accuracy
  • Distributes RoC credits to content contributors
  • Supports multiple LLM providers (OpenAI, Anthropic)
  • Maintains conversation history via Supabase (per-org tables)
  • Persists metadata with assistant responses (model, confidence, sources, transaction, timing)
  • Extracts topic tags for Hot Topics analytics (privacy-preserving query analytics)

Base URL

https://<project-ref>.supabase.co/functions/v1/rag

Authentication

All requests require JWT authentication via the Authorization header:

Authorization: Bearer <user-jwt>

The JWT must contain app claims with:

  • authId - User's auth UUID
  • orgId - Organization ID
  • accessLevel - User's access level
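
For illustration, a minimal sketch of reading these claims from the bearer token. Where the claims sit in the payload is an assumption, and signature verification is assumed to happen upstream:

// Decode the JWT payload (base64url) and read the app claims.
// NOTE: assumes the claims sit at the top level of the payload and
// skips signature verification, which is handled before this point.
interface AppClaims {
  authId: string;
  orgId: string;
  accessLevel: string;
}

function readAppClaims(authHeader: string): AppClaims {
  const token = authHeader.replace(/^Bearer\s+/i, "");
  const payloadPart = token.split(".")[1]
    .replace(/-/g, "+")
    .replace(/_/g, "/");
  const { authId, orgId, accessLevel } = JSON.parse(atob(payloadPart));
  if (!authId || !orgId) throw new Error("Missing required app claims");
  return { authId, orgId, accessLevel };
}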

Request Format

Method: POST
Content-Type: application/json

Request Body

{
  // Required
  chatMessages: {
    system: string;   // System prompt context
    user: string;     // User's question
  };

  // Optional
  sessionId?: string;          // Conversation session ID (for memory)
  metadataFilter?: object;     // Vector search filter
  specificity?: "1" | "2" | "3"; // Retrieval method (default: "1")
  modelVersion?: string;       // LLM model name (default: platform default)
  provider?: "openai" | "anthropic"; // LLM provider (default: "openai")
  apiKey?: string;             // BYOK API key (optional)
  verbose?: boolean;           // Enable verbose logging
}

Specificity Values

| Value | Method | Description |
|-------|--------|-------------|
| "1" | SimilaritySearch | Standard vector similarity search (default) |
| "2" | Self-Query | LLM-structured query (temporarily disabled) |
| "3" | Parent | Parent document retrieval (not implemented) |

API Key Options

  1. Platform Key (default): Uses platform's OpenAI/Anthropic key
  2. Header Key: x-openai-key or x-anthropic-key header
  3. Body Key: apiKey field in request body

When using platform keys, the model is enforced to the platform default (e.g., gpt-4.1-mini).
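
A sketch of that precedence (header key, then body key, then platform key) and the model clamp. Names are illustrative, not the actual module API:

// Resolve the key in precedence order: header > body > platform env.
// When the platform key is used, force the platform default model.
function resolveApiKey(
  req: Request,
  body: { apiKey?: string; provider?: string; modelVersion?: string },
) {
  const isAnthropic = body.provider === "anthropic";
  const headerKey = req.headers.get(
    isAnthropic ? "x-anthropic-key" : "x-openai-key",
  );
  const key = headerKey ?? body.apiKey ??
    Deno.env.get(isAnthropic ? "ANTHROPIC_API_KEY" : "OPENAI_API_KEY");
  const usingPlatformKey = !headerKey && !body.apiKey;
  const model = usingPlatformKey
    ? Deno.env.get("OPENAI_DEFAULT_MODEL") ?? "gpt-4.1-mini"
    : body.modelVersion ?? Deno.env.get(
      isAnthropic ? "ANTHROPIC_DEFAULT_MODEL_BYOK" : "OPENAI_DEFAULT_MODEL_BYOK",
    );
  return { key, model, usingPlatformKey };
}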


Response Format

The response is a Server-Sent Events (SSE) stream. See SSE_EVENTS_REFERENCE.md for complete event documentation.

Key SSE Events

| Event Type | Description |
|------------|-------------|
| model_info | LLM provider/model being used |
| progress | Pipeline progress phases |
| results | Streaming answer text chunks |
| events-answer-confidence | Answer confidence score |
| events-sources-data | Source contribution breakdown |
| transaction-summary | Query cost and RoC distribution |
| error | Error messages |
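
As a rough client-side sketch (not an official SDK), the stream can be read with fetch and a line-based parser; the event names match the table above:

// Minimal SSE client: POST the query, then parse "event:"/"data:" lines.
// Simplified: ignores multi-line data fields and blank-line delimiters.
async function streamRag(url: string, jwt: string, body: unknown) {
  const res = await fetch(url, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${jwt}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });
  const reader = res.body!.pipeThrough(new TextDecoderStream()).getReader();
  let buffer = "";
  let event = "message";
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += value;
    const lines = buffer.split("\n");
    buffer = lines.pop()!; // keep any partial line for the next chunk
    for (const line of lines) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) {
        const data = line.slice(5).trim();
        if (event === "results") console.log(data); // answer text chunk
        if (event === "error") console.error(data);
      }
    }
  }
}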

Transaction Summary Structure

The transaction-summary event includes Dublin Core metadata for each source:

{
  transaction_id: string;
  platform_fee: number;
  duration_seconds: number;
  duration_ms: number;
  sources_used: number;
  roc_credits_distributed: number;
  contributors: number;
  hourly_rate: number;
  roc_split_percent: number;
  balance_before?: number;
  balance_after?: number;
  sources: [
    {
      source_url: string;
      source_title?: string;
      contributor_id: string;
      contributor_name?: string;
      portion: number;
      roc_earned?: number;
      chunks_used?: number;
      dublin_core?: {           // Dublin Core metadata for popover display
        dc_title?: string;
        dc_creator?: string;
        dc_publisher?: string;
        dc_date?: string;
        dc_rights?: string;
        dc_description?: string;
        dc_source?: string;
        dc_identifier?: string;
      } | null;
    }
  ];
}
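
The per-source roc_earned values follow from the transaction-level fields. A sketch of the relationship, assuming portion is a 0-1 attention weight and the portions of credited sources sum to 1:

// Derive a source's RoC credit from the transaction-level split.
function rocForSource(
  summary: { platform_fee: number; roc_split_percent: number },
  portion: number,
): number {
  return summary.platform_fee * (summary.roc_split_percent / 100) * portion;
}

// e.g., platform_fee = 10, roc_split_percent = 60, portion = 0.5
// → 10 * 0.6 * 0.5 = 3 credits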

Example Request

curl -X POST \
  'https://<project>.supabase.co/functions/v1/rag' \
  -H 'Authorization: Bearer <jwt>' \
  -H 'Content-Type: application/json' \
  -d '{
    "chatMessages": {
      "system": "You are a helpful assistant.",
      "user": "What is the company mission?"
    },
    "sessionId": "session-123",
    "specificity": "1"
  }'

With BYOK (Bring Your Own Key)

curl -X POST \
  'https://<project>.supabase.co/functions/v1/rag' \
  -H 'Authorization: Bearer <jwt>' \
  -H 'Content-Type: application/json' \
  -H 'x-anthropic-key: sk-ant-...' \
  -d '{
    "chatMessages": {
      "system": "You are a helpful assistant.",
      "user": "Explain the architecture."
    },
    "provider": "anthropic",
    "modelVersion": "claude-sonnet-4-20250514"
  }'

Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| OPENAI_API_KEY | Yes | Platform OpenAI API key |
| ANTHROPIC_API_KEY | No | Platform Anthropic API key |
| OPENAI_DEFAULT_MODEL | No | Default model for platform (default: gpt-4.1-mini) |
| OPENAI_DEFAULT_MODEL_BYOK | No | Default model for BYOK users |
| ANTHROPIC_DEFAULT_MODEL_BYOK | No | Default Anthropic model for BYOK |
| RETRIEVER_K | No | Number of documents to retrieve (default: 5) |
| SUPABASE_DISTANCE | No | Distance metric: cosine, inner, euclidean |
| ATTENTION_THRESHOLD | No | Min attention % to include source (default: 0) |
| VERBOSE | No | Enable verbose logging (default: true) |
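
A sketch of loading these in the function with the documented defaults (the config object shape is illustrative):

// Read configuration once at startup, applying the defaults above.
const config = {
  openaiKey: Deno.env.get("OPENAI_API_KEY"), // required
  anthropicKey: Deno.env.get("ANTHROPIC_API_KEY"),
  defaultModel: Deno.env.get("OPENAI_DEFAULT_MODEL") ?? "gpt-4.1-mini",
  retrieverK: Number(Deno.env.get("RETRIEVER_K") ?? "5"),
  distance: Deno.env.get("SUPABASE_DISTANCE"), // cosine | inner | euclidean
  attentionThreshold: Number(Deno.env.get("ATTENTION_THRESHOLD") ?? "0"),
  verbose: (Deno.env.get("VERBOSE") ?? "true") === "true",
};

if (!config.openaiKey) {
  throw new Error("OPENAI_API_KEY must be set");
}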

File Structure

rag/
├── README.md                      # This file
├── index.ts                       # Main RAG endpoint
├── _deps.ts                       # Dependency re-exports
├── deno.json                      # Deno configuration
├── SSE_EVENTS_REFERENCE.md        # Complete SSE event documentation
├── test.html                      # Browser-based test client
├── rag.test.js                    # Integration tests
├── helpers-utils.test.ts          # Unit tests
│
├── # Dashboard-related docs (deprecated - see dashboardApi)
├── DASHBOARDS_README.md
├── DASHBOARD_SESSION_SUMMARY.md
├── TRANSACTION_MINI_DASHBOARD.md
├── transaction-mini-dashboard.js
└── dashboard-account-summary.html

Development

Run Locally

deno task run:rag

Run with Debug Inspection

deno task run:rag:inspect

Deploy

deno task deploy:rag

Architecture

Request Flow

1. JWT Validation → Extract user claims (authId, orgId)
2. API Key Resolution → Platform vs BYOK
3. Vector Store Init → Supabase vector store with RLS
4. Document Retrieval → Similarity search with scores
5. Attention Calculation → Source contribution weights
6. LLM Generation → Streaming answer via LangChain
7. Source Relevance → LLM self-reflection on source contribution
8. Topic Extraction → Extract topic tags (if sources contributed)
9. Transaction Logging → RoC credit distribution + topic tags
10. Balance Update → Deduct platform fee, credit contributors
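
In outline form (every helper below is a hypothetical declared stub, not the actual module API), the flow maps roughly to:

// Illustrative outline of the steps above; the stubs stand in for
// the real modules listed under "Internal Modules".
type ScoredDoc = { pageContent: string; score: number };

declare function validateJwt(req: Request): { authId: string; orgId: string };
declare function resolveKey(req: Request): { key: string; model: string };
declare function retrieve(orgId: string, query: string): Promise<ScoredDoc[]>;
declare function attentionWeights(docs: ScoredDoc[]): number[];
declare function streamAnswer(
  model: string,
  key: string,
  docs: ScoredDoc[],
): ReadableStream<Uint8Array>;

async function handleRag(req: Request): Promise<Response> {
  const { orgId } = validateJwt(req);                    // step 1
  const { key, model } = resolveKey(req);                // step 2
  const { chatMessages } = await req.json();
  const docs = await retrieve(orgId, chatMessages.user); // steps 3-4
  const weights = attentionWeights(docs);                // step 5
  const stream = streamAnswer(model, key, docs);         // step 6
  // Steps 7-10 (self-reflection, topic extraction, transaction
  // logging, balance update) run after generation completes.
  return new Response(stream, {
    headers: { "Content-Type": "text/event-stream" },
  });
}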

Key Dependencies

| Module | Purpose |
|--------|---------|
| @langchain/openai | OpenAI LLM integration |
| @langchain/anthropic | Anthropic LLM integration |
| @langchain/core | LangChain core abstractions |
| @langchain/community | Supabase vector store |
| @xata.io/client | Conversation history storage |
| @supabase/supabase-js | Database client |

Internal Modules

| Import | Purpose |
|--------|---------|
| @core/deps.ts | Centralized dependency exports |
| @utils/helpers-utils.ts | Utility functions |
| @stores/supabase-store.ts | Vector store operations |
| @stores/xata.ts | Conversation memory |
| @business-model/services/ | Transaction logging, RoC distribution |

RoC Credit Distribution

When sources contribute to an answer:

  1. Platform Fee: Charged to the querying user (based on query duration)
  2. RoC Split: 60% of platform fee distributed to contributors
  3. Portion-Based: Credits weighted by source attention scores
  4. Self-Use Filter: Users don't receive RoC for their own content
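
For example (illustrative numbers): a query that incurs a 10-credit platform fee yields a 6-credit RoC pool (60%). If two external sources contributed with attention portions of 0.75 and 0.25, their contributors earn 4.5 and 1.5 credits respectively; if the second source belongs to the querying user, its 1.5 credits are not paid out under the self-use filter.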

Dublin Core Metadata

Source documents may include Dublin Core metadata (ISO 15836) extracted from vector metadata:

  • dc_title - Document title
  • dc_creator - Author/creator
  • dc_publisher - Publishing entity
  • dc_date - Publication date
  • dc_rights - Copyright/license info
  • dc_description - Brief summary
  • dc_source - Original source URL
  • dc_identifier - DOI, ISBN, etc.

Frontend clients can display this metadata in popovers for source attribution.
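
For instance, a popover might collapse the fields into a single attribution line; a sketch (layout and helper name are the client's choice):

// Join whatever Dublin Core fields are present into one line,
// falling back to the raw source URL when metadata is absent.
interface DublinCore {
  dc_title?: string;
  dc_creator?: string;
  dc_publisher?: string;
  dc_date?: string;
}

function attributionLine(dc: DublinCore | null, sourceUrl: string): string {
  if (!dc) return sourceUrl;
  const parts = [dc.dc_creator, dc.dc_title, dc.dc_publisher, dc.dc_date];
  return parts.filter(Boolean).join(", ") || sourceUrl;
}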


Hot Topics (Query Analytics)

The RAG function extracts topic tags from user queries for privacy-preserving analytics. This enables the "Hot Topics" dashboard visualization without exposing raw query text.

How It Works

  1. Topic Extraction: After the RAG chain completes successfully, an LLM extracts 2-5 topic tags from the query
  2. Smart Gating: Topics are only extracted when sources actually contribute to the answer
    • Skipped when resultsFound === false (no documents retrieved)
    • Skipped when sourcesContributed === false (LLM didn't use sources)
  3. Storage: Topics are stored in transactions.metadata.query_topics:
    {
      query_topics: {
        topics: ["budget-planning", "q4-forecast"],
        extracted_at: "2026-01-19T...",
        confidence: 0.9,
        version: "v1"
      }
    }
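
The gating in step 2 reduces to a simple guard; a sketch (extractTopics is a hypothetical stand-in for the real module in _shared/core/topic-extraction.ts):

// Skip extraction unless retrieval found documents AND the LLM
// actually drew on them; HOT_TOPICS_ENABLED=false disables it entirely.
declare function extractTopics(query: string): Promise<string[]>;

async function maybeExtractTopics(
  query: string,
  resultsFound: boolean,
  sourcesContributed: boolean,
): Promise<string[] | null> {
  if (Deno.env.get("HOT_TOPICS_ENABLED") === "false") return null;
  if (!resultsFound || !sourcesContributed) return null;
  return await extractTopics(query);
}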

Tag Structure Rules

Tags follow consistent semantic structure for aggregation:

  • Document types: [type]-[subject] ordering

    • transcript-meeting (not meeting-transcript)
    • report-financial (not financial-report)
    • policy-hr (not hr-policy)
  • Actions: [action]-[subject] ordering

    • planning-budget (not budget-planning)
    • audit-compliance (not compliance-audit)
  • Time periods: Standalone

    • q4, annual, monthly, 2025

Canonical Terms

The LLM is instructed to use canonical terms for consistency:

  • transcript (not: transcription, notes, minutes)
  • meeting (not: call, session, standup)
  • report (not: document, file, summary)
  • policy (not: guideline, rule, procedure)
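
The mapping lives in the extraction prompt; as a defensive illustration (not part of the source), the same canonicalization could also be enforced in post-processing:

// Map synonym parts of a tag onto the canonical terms above.
const CANONICAL: Record<string, string> = {
  transcription: "transcript", notes: "transcript", minutes: "transcript",
  call: "meeting", session: "meeting", standup: "meeting",
  document: "report", file: "report", summary: "report",
  guideline: "policy", rule: "policy", procedure: "policy",
};

function canonicalizeTag(tag: string): string {
  return tag.split("-").map((p) => CANONICAL[p] ?? p).join("-");
}

// canonicalizeTag("minutes-call") → "transcript-meeting"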

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| HOT_TOPICS_ENABLED | true | Set to false to disable topic extraction |

Privacy Considerations

  • Raw queries never stored - Only derived topic tags
  • Minimum threshold - Dashboard only shows topics with ≥3 queries
  • No user attribution - Topics not linked to individual users
  • Time boundary - Topics older than 90 days excluded from visualization
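
Taken together, the last three rules imply a dashboard-side aggregation like the following sketch (the row shape is assumed from the storage example above):

// Count topic occurrences within the 90-day window, then drop any
// topic below the 3-query minimum before visualization.
interface TopicRow {
  topics: string[];
  extracted_at: string;
}

function hotTopics(rows: TopicRow[], now = new Date()): Map<string, number> {
  const cutoffMs = now.getTime() - 90 * 24 * 60 * 60 * 1000;
  const counts = new Map<string, number>();
  for (const row of rows) {
    if (new Date(row.extracted_at).getTime() < cutoffMs) continue;
    for (const tag of row.topics) {
      counts.set(tag, (counts.get(tag) ?? 0) + 1);
    }
  }
  for (const [tag, n] of counts) {
    if (n < 3) counts.delete(tag);
  }
  return counts;
}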


Change Log

2026-01-20

  • FEATURE: Hot Topics query analytics
    • Extracts 2-5 topic tags from user queries using LLM
    • Tags stored in transactions.metadata.query_topics
    • Smart gating: skips extraction when sources don't contribute to answer
    • Consistent tag structure with [type]-[subject] ordering
    • Canonical terms for better aggregation
    • Can be disabled via HOT_TOPICS_ENABLED=false
    • New module: _shared/core/topic-extraction.ts
    • New tests in rag.test.js for extraction behavior

2026-01-16

  • FEATURE: Added metadata persistence with chat history
    • Assistant responses now include model, confidence, sources, transaction, and timing metadata

2025-12-18

  • FEATURE: Added Dublin Core metadata to transaction-summary SSE event
    • New dublin_core field on each source in the sources[] array
    • Includes: dc_title, dc_creator, dc_publisher, dc_date, dc_rights, dc_description, dc_source, dc_identifier
    • Extracted from vector metadata at query time
    • Enables frontend to display metadata popovers in real-time query results

2025-12-13

  • FEATURE: Added multi-provider LLM support (OpenAI, Anthropic)
  • FEATURE: Added model_info SSE event for provider/model transparency
  • FEATURE: BYOK (Bring Your Own Key) support via headers

2025-12-10

  • FEATURE: Added LLM self-reflection for source contribution assessment
  • FEATURE: Answer confidence scoring based on source relevance

2025-11-25

  • FEATURE: Added attention-based source attribution
  • FEATURE: RoC credit distribution with per-source breakdown
  • FEATURE: Transaction logging with balance tracking

2025-09-10

  • Initial streaming RAG implementation
  • Supabase vector store integration
  • Xata conversation memory