Backend Requirements: Chat Metadata Persistence

Date: 2026-01-16
Status: Draft
From: Frontend Claude
To: Backend Claude

Problem Statement

Currently, rich metadata (sources, confidence, transaction details) is available only for the most recent chat query. When a session contains multiple exchanges, clicking the copy button on an older message copies only the prompt and response text, not the metadata.

Additionally, when users switch sessions via the session history dropdown, they cannot see previous conversations or their associated metadata.

What We Need

1. Persist Metadata with Each Assistant Response

When the RAG endpoint saves a chat message, please also persist the metadata that is currently streamed via SSE events. This metadata is already generated - it just needs to be stored.

Metadata to persist (per assistant response):

Category	SSE Event Source	Why We Need It
Model info	`model_info`	Show which model answered (for copy output, audit)
Confidence	`events-answer-confidence`	Include confidence % and reasoning in copied content
Sources	`events-sources-data`	Generate source attribution with Mermaid charts
Transaction	`transaction-summary`	Include cost, RoC distribution, Dublin Core provenance
Timing	`progress` events	Include response timing in copied content
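One way the backend could gather this metadata while streaming is sketched below. The event names come from the table above; everything else is an assumption made for illustration: the `MetadataCollector` class, the `PersistedMetadata` field names, and the choice to treat payloads as opaque JSON (their real shapes are whatever the SSE handlers already emit). How timing is derived from `progress` events is not specified here, so the sketch keeps only the raw payloads.

```typescript
// Payload shapes are not defined in this document, so they are kept opaque (assumption).
type Json = unknown;

// Hypothetical record stored alongside one assistant response.
interface PersistedMetadata {
  model: Json | null;
  confidence: Json | null;
  sources: Json | null;
  transaction: Json | null;
  progress: Json[]; // raw `progress` payloads; timing would be derived from these
}

type SingleField = "model" | "confidence" | "sources" | "transaction";

// SSE event name (from the table above) -> metadata field it fills.
const EVENT_TO_FIELD = new Map<string, SingleField>([
  ["model_info", "model"],
  ["events-answer-confidence", "confidence"],
  ["events-sources-data", "sources"],
  ["transaction-summary", "transaction"],
]);

// Hypothetical helper: one instance per assistant response being streamed.
class MetadataCollector {
  private meta: PersistedMetadata = {
    model: null,
    confidence: null,
    sources: null,
    transaction: null,
    progress: [],
  };

  // Call this wherever an SSE event is written to the client.
  record(eventName: string, payload: Json): void {
    if (eventName === "progress") {
      this.meta.progress.push(payload);
      return;
    }
    const field = EVENT_TO_FIELD.get(eventName);
    if (field !== undefined) {
      this.meta[field] = payload; // the latest payload wins
    }
    // Unrelated events are ignored.
  }

  // Copy of the collected metadata, ready to be saved with the message.
  snapshot(): PersistedMetadata {
    return { ...this.meta, progress: [...this.meta.progress] };
  }
}
```

Calling `record` at the same point where each event is emitted and `snapshot` when the message is saved would keep the stored metadata identical to what the client saw, without regenerating anything.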

2. API Endpoint to Fetch Session History with Metadata

We need an endpoint to retrieve a full conversation with metadata when a user switches sessions.

Request:

GET /api/chat/history?sessionId={sessionId}

Expected Response:

{
  "sessionId": "my-session",
  "messages": [
    {
      "id": "msg-1",
      "role": "user",
      "content": "What is Smart Data?",
      "timestamp": "2026-01-16T10:00:00Z",
      "metadata": null
    },
    {
      "id": "msg-2",
      "role": "assistant",
      "content": "Smart Data is...",
      "timestamp": "2026-01-16T10:00:05Z",
      "metadata": {
        "model": { "provider": "openai", "model": "gpt-4o-mini", "mode": "platform" },
        "confidence": { "confidence": 85, "sourcesContributed": true, "reasoning": "...", "sourcesFound": 5 },
        "sources": { "totalSources": 5, "usedAttention": true, "sources": [...] },
        "transaction": { "platform_fee": 0.02, "roc_credits_distributed": 0.01, "sources": [...] },
        "timing": { "ttf_ms": 1500, "duration_ms": 5000 }
      }
    }
  ]
}
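For reference, here is roughly how the frontend would type and consume this response. This is a sketch under assumptions, not a contract: the type names, `historyUrl`, `parseHistory`, and `loadHistory` are hypothetical, the elided `sources` arrays are typed as `unknown[]` because their item shape is not given above, and the HTTP call is injected as `fetchJson` so the sketch does not depend on a particular client.

```typescript
// Field names mirror the expected response above; elided arrays stay unknown[].
interface MessageMetadata {
  model: { provider: string; model: string; mode: string };
  confidence: { confidence: number; sourcesContributed: boolean; reasoning: string; sourcesFound: number };
  sources: { totalSources: number; usedAttention: boolean; sources: unknown[] };
  transaction: { platform_fee: number; roc_credits_distributed: number; sources: unknown[] };
  timing: { ttf_ms: number; duration_ms: number };
}

interface ChatMessage {
  id: string;
  role: "user" | "assistant";
  content: string;
  timestamp: string; // ISO 8601, as in the example
  metadata: MessageMetadata | null;
}

interface ChatHistory {
  sessionId: string;
  messages: ChatMessage[];
}

// Builds the request URL; session IDs are escaped in case they contain spaces or symbols.
function historyUrl(sessionId: string): string {
  return `/api/chat/history?sessionId=${encodeURIComponent(sessionId)}`;
}

// Minimal shape check. The metadata object itself is trusted, not validated.
// A missing metadata field is normalized to null.
function parseHistory(raw: unknown): ChatHistory {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("history response is not an object");
  }
  const { sessionId, messages } = raw as { sessionId?: unknown; messages?: unknown };
  if (typeof sessionId !== "string" || !Array.isArray(messages)) {
    throw new Error("history response is missing sessionId or messages");
  }
  const parsed = messages.map((m: unknown, i: number): ChatMessage => {
    if (typeof m !== "object" || m === null) {
      throw new Error(`message ${i} is not an object`);
    }
    const msg = m as Partial<ChatMessage>;
    if (
      typeof msg.id !== "string" ||
      (msg.role !== "user" && msg.role !== "assistant") ||
      typeof msg.content !== "string" ||
      typeof msg.timestamp !== "string"
    ) {
      throw new Error(`message ${i} is malformed`);
    }
    return {
      id: msg.id,
      role: msg.role,
      content: msg.content,
      timestamp: msg.timestamp,
      metadata: msg.metadata ?? null,
    };
  });
  return { sessionId, messages: parsed };
}

// fetchJson is injected (assumption) so any HTTP client can be plugged in.
async function loadHistory(
  sessionId: string,
  fetchJson: (url: string) => Promise<unknown>,
): Promise<ChatHistory> {
  return parseHistory(await fetchJson(historyUrl(sessionId)));
}
```

Because `parseHistory` treats an absent `metadata` field the same as an explicit `null`, older messages stored before this change would load without special handling on either side.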

Why We Need This

1. Copy Any Conversation (not just the latest)

Users want to copy older Q&A exchanges with full source attribution and confidence scores - not just the most recent one.

2. Session Recall

When users select a previous session from the history dropdown, we want to restore the full conversation with all metadata, so they can continue where they left off.

3. Content Provenance

For audit and trust purposes, users need to trace any answer back to its original sources, including Dublin Core metadata and contributor attribution.

How Frontend Will Use This

On session switch (session-history-btn dropdown selection):
- Call the history endpoint
- Render messages in chat UI
- Cache metadata client-side for copy functionality

On copy button click:
- Look up metadata from client cache
- Generate markdown with sources chart, confidence, etc.

Future: Copy entire conversation button:
- Iterate all messages and generate combined markdown
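The cache-and-copy flow above might look roughly like the following on the frontend. This is only a sketch: `metadataCache`, `cacheHistory`, `toMarkdown`, and the exact markdown layout are assumptions, only a few metadata fields are used, and the Mermaid sources chart is left out because the per-source shape is not specified in this document.

```typescript
// Subset of the metadata used for the copy output (assumption: fields are optional).
interface CachedMetadata {
  model?: { provider: string; model: string };
  confidence?: { confidence: number; reasoning: string };
  sources?: { totalSources: number };
}

interface HistoryMessage {
  id: string;
  role: "user" | "assistant";
  content: string;
  metadata: CachedMetadata | null;
}

// Client-side cache keyed by message id, refilled on every session switch.
const metadataCache = new Map<string, CachedMetadata>();

function cacheHistory(messages: HistoryMessage[]): void {
  metadataCache.clear();
  for (const m of messages) {
    if (m.metadata !== null) {
      metadataCache.set(m.id, m.metadata);
    }
  }
}

// Builds the copied text for one exchange. Messages without cached metadata
// fall back to prompt and response only.
function toMarkdown(prompt: string, answer: HistoryMessage): string {
  const lines: string[] = [`**Prompt:** ${prompt}`, "", answer.content];
  const meta = metadataCache.get(answer.id);
  if (meta === undefined) {
    return lines.join("\n");
  }
  lines.push("", "---");
  if (meta.model) {
    lines.push(`Model: ${meta.model.provider}/${meta.model.model}`);
  }
  if (meta.confidence) {
    lines.push(`Confidence: ${meta.confidence.confidence}%`);
  }
  if (meta.sources) {
    lines.push(`Sources: ${meta.sources.totalSources}`);
  }
  return lines.join("\n");
}
```

A "copy entire conversation" button could reuse `toMarkdown` by pairing each user message with the assistant message that follows it and joining the results.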

Backwards Compatibility

- Existing messages without metadata should still load (the endpoint returns null for metadata)
- Frontend will handle missing metadata gracefully

Questions

- What endpoint structure works best for your architecture?
- Any concerns about storage size with full metadata per message?
- Should metadata be filtered based on user role (e.g., hide contributor_id from non-admins)?

Acceptance Criteria

[ ] Assistant responses are saved with associated metadata
[ ] History endpoint returns messages with metadata for a given session
[ ] Existing messages (without metadata) continue to work
[ ] Response time for history fetch is acceptable (< 500 ms for a typical session)