
To Do List
-
Metadata Filters Bug
- netlify/edge-functions/utils/helpers.generateJsonFilter does not work with multiple orgIds; only the last one loaded takes effect
- Modify /home/steven/github/collabventures/llm-meeting-minutes/netlify/functions/utils/cli/loadDocsToVectorStore.mjs to accept an orgId parameter and dynamically import the correct file
- There should be one loadDocSources file for each orgId
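A rough sketch of how the CLI could pick one loadDocSources file per orgId. The `--orgId=` flag, the `./docSources` directory, and the `loadDocSources.<orgId>.mjs` naming are all assumptions for illustration, not the current layout of the repo.

```javascript
// Hypothetical layout: one module per organization,
// e.g. ./docSources/loadDocSources.gswg.mjs
const DOC_SOURCES_DIR = "./docSources";

// Accept only simple identifiers so an orgId cannot escape the directory.
function docSourcesPath(orgId) {
  if (typeof orgId !== "string" || !/^[A-Za-z0-9_-]+$/.test(orgId)) {
    throw new Error(`Invalid orgId: ${String(orgId)}`);
  }
  return `${DOC_SOURCES_DIR}/loadDocSources.${orgId}.mjs`;
}

// Read "--orgId=<value>" from an argv-style array (assumed flag name).
function parseOrgId(argv) {
  const flag = argv.find((arg) => arg.startsWith("--orgId="));
  return flag ? flag.slice("--orgId=".length) : undefined;
}

// Dynamically import the module that belongs to one organization.
async function loadDocSources(orgId) {
  return import(docSourcesPath(orgId));
}
```

Usage might look like `node loadDocsToVectorStore.mjs --orgId=gswg`, with the script calling `loadDocSources(parseOrgId(process.argv))`. Validating the orgId before building the path keeps a bad argument from importing an arbitrary file.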
-
loadSource produces "title": "2024-01-11 GSWG undefined Transcripts"; replace the undefined with ""
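One possible fix, sketched under the assumption that the title is assembled from a list of parts (date, group, an optional sub-part, document type). `buildTitle` is a hypothetical helper, not an existing function in the repo. It skips missing parts instead of substituting "", which also avoids a double space.

```javascript
// Join title parts, skipping anything missing so "undefined" never leaks into the title.
function buildTitle(parts) {
  return parts
    .filter((part) => part !== undefined && part !== null && String(part).trim() !== "")
    .map((part) => String(part).trim())
    .join(" ");
}

console.log(buildTitle(["2024-01-11", "GSWG", undefined, "Transcripts"]));
// prints "2024-01-11 GSWG Transcripts"
```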
-
Streaming not working
-
https://js.langchain.com/docs/modules/model_io/llms/subscribing_events is supposed to return the number of tokens used, which could feed into the business model, but it's not working. Tested with retrievalAugmentedGeneration and _xata
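A minimal sketch of a usage tracker for the token-count idea above. It assumes the end-of-call result carries `llmOutput.tokenUsage` with `promptTokens`, `completionTokens`, and `totalTokens`, which is how OpenAI models report usage in LangChain.js. Other providers, or streaming mode, may report nothing, so the handler tolerates a missing field. `createTokenUsageTracker` is a hypothetical name.

```javascript
// Accumulates token counts across LLM calls.
// Assumption: results expose llmOutput.tokenUsage with
// promptTokens / completionTokens / totalTokens.
function createTokenUsageTracker() {
  const totals = { promptTokens: 0, completionTokens: 0, totalTokens: 0 };
  const handler = {
    // Shaped like a LangChain.js callback handler method.
    handleLLMEnd(output) {
      const usage = output?.llmOutput?.tokenUsage;
      if (!usage) return; // nothing reported for this call
      for (const key of Object.keys(totals)) {
        totals[key] += usage[key] ?? 0;
      }
    },
  };
  return { totals, handler };
}
```

The `handler` object could be passed in a `callbacks` array when creating the model or invoking a chain, and `totals` read afterwards. If `totals` stays at zero, that would suggest the event fires without usage data rather than not firing at all.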
-
Create a Xata table for a status line, retrieved by chat_script.js and displayed in the chat console so users know if there's a problem with LangChain, Supabase, or Xata.
-
Semantic Corrections
- Potentially use GitHub Actions to suggest changes (see "Using AI and LLMs in docs-as-code pipelines", 2024-01-24 Zoom call)
- Load: Summarization
- Create a load contents page
- Metadata form or load from URL (meeting page agenda), document type like transcripts, meeting page, document, etc.
- Upload files
- Load from URL, check for YouTube
- Clear vector store
- Cleanse: Plan & Execute Chain
- Create a cleanse document page describing the process
- Create chain (Conversation Retrieval Agent with Tools) to:
- Load vector store
- Use Chat History
- Create a table with columns: row number, incorrect text, semantic correction
- Prompt the user to select all, none, or individual rows to correct
- Save corrected text in Markdown format with appropriate tags and metadata
- Load new markdown file into vector store
- Summarize document
- Save summary in markdown format with appropriate tags and metadata
- Load new markdown file into vector store
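The select-and-correct step of the cleanse process could be sketched as below. This only covers applying the chosen rows to the text. The row shape (`number`, `incorrect`, `correction`) and the `applyCorrections` name are assumptions, and the chain that proposes the corrections is out of scope here.

```javascript
// Each row mirrors the proposed table: row number, incorrect text, semantic correction.
// selection is "all", "none", or an array of row numbers chosen by the user.
function applyCorrections(text, rows, selection) {
  const usable = rows.filter((row) => row.incorrect); // ignore rows with empty search text
  const chosen =
    selection === "all"
      ? usable
      : selection === "none"
        ? []
        : usable.filter((row) => selection.includes(row.number));
  // split/join replaces every literal occurrence without regex escaping.
  return chosen.reduce(
    (result, row) => result.split(row.incorrect).join(row.correction),
    text
  );
}

const rows = [
  { number: 1, incorrect: "Lang chain", correction: "LangChain" },
  { number: 2, incorrect: "Zata", correction: "Xata" },
];
console.log(applyCorrections("Lang chain stores vectors in Zata.", rows, [2]));
// prints "Lang chain stores vectors in Xata."
```

Literal replacement keeps the step predictable: only text the user approved changes, and the corrected output can then be saved as Markdown and reloaded into the vector store.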
-
Keys
- Use Chat Messages div to display errors or instructions if no keys provided
- BYOK for OpenAI
- BYOK for vector store? (teams for shared vector store?)
- Use hashed key as prefix for SessionId
- Save settings to Local Storage (not secure)
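A sketch of the hashed-key prefix idea for SessionId, assuming SHA-256 and a 12-character hex prefix (both are illustrative choices). `sessionIdFor` and its parameters are hypothetical. It uses Web Crypto so the same code could run in the browser, and falls back to `node:crypto` where no global `crypto` exists.

```javascript
// Derive a session id whose prefix is a truncated SHA-256 hash of the API key,
// so sessions group by key without the key itself being stored or logged.
async function sessionIdFor(apiKey, sessionSuffix, prefixLength = 12) {
  // Web Crypto is global in browsers and recent Node; fall back to node:crypto otherwise.
  const subtle =
    globalThis.crypto?.subtle ?? (await import("node:crypto")).webcrypto.subtle;
  const digest = await subtle.digest("SHA-256", new TextEncoder().encode(apiKey));
  const hex = Array.from(new Uint8Array(digest), (byte) =>
    byte.toString(16).padStart(2, "0")
  ).join("");
  return `${hex.slice(0, prefixLength)}-${sessionSuffix}`;
}
```

The same key always yields the same prefix, so chat history stored under that prefix can be found again without keeping the raw key on the server side.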
-
Rosie Docs
- Change chatContainer in chat.njk to render Markdown
- Create tool to save output in markdown and push to Rosie-Core
- Create tool to translate to another language like Esperanto, Chinese, Japanese, Spanish, French
-
Misc
- Write Pricing Guide
- Create Rosie Help Bot
- Usage by organization member (see community thread)
Done List
- Redirect handled errors to console UI
- Implement GPT-4.0 variable
- Query: Conversational. See also langchain_playground: tools_AgentWebBrowserChat
- Add Filter UI by metadata
- Implement Messages Roles
- Chat History to vector store Xata
- Load text documents
- Load VTT text transcripts and convert them to JSON
- Add text to vector store
- Q&A from vector stores HNSWLib (local), Pinecone, Xata
- Ingest YouTube transcripts
- Implement LangChain meeting minutes
- BUG: New AI role repeats last User message
- Change Inputs to Datalists. Using JSON for now.
- Implement LangChain
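For reference, the VTT-to-JSON item in the list above could look roughly like the sketch below. It is not the code that was shipped. It assumes cues are separated by blank lines and that speaker names appear as a "Name: " prefix on the cue text, as transcript exports commonly do. `vttToJson` is a hypothetical name.

```javascript
// Convert WebVTT text into an array of { start, end, speaker, text } cues.
// Assumption: speaker names appear as a "Name: " prefix on the cue text.
function vttToJson(vtt) {
  const cues = [];
  // Cues are separated by blank lines; normalize Windows line endings first.
  const blocks = vtt.replace(/\r\n/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n").filter((line) => line.trim() !== "");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // header, note, or empty block
    const [start, rest] = lines[timingIndex].split("-->").map((s) => s.trim());
    const end = rest.split(/\s+/)[0]; // drop any cue settings after the end time
    const raw = lines.slice(timingIndex + 1).join(" ");
    const match = raw.match(/^([^:]{1,60}):\s+(.*)$/);
    cues.push({
      start,
      end,
      speaker: match ? match[1] : null,
      text: match ? match[2] : raw,
    });
  }
  return cues;
}
```

One known weakness of the prefix heuristic: cue text that itself starts with a short phrase and a colon is read as a speaker name, so a list of known participants would make the match safer.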
Notes
-
Python in Node.js
from langchain.chains import VectorDBQA
from langchain.chat_models import ChatOpenAI
qa = VectorDBQA.from_chain_type(llm=ChatOpenAI(), chain_type="stuff", vectorstore=db, k=1)  # db: an existing vector store built elsewhere; k=1 retrieves one document
query = "What is the document about?"
print(qa.run(query))
- https://medium.com/@imicknl/how-to-improve-your-chatgpt-on-your-data-solution-d1e842d87404
- https://js.langchain.com/docs/modules/agents/tools/how_to/agents_with_vectorstores
- Executive Meetings AI - HackMD
Legend
To-do WIP Done Important
Napkin Business Canvas
Rosie-AI
Generating reputation.
is an AI assistant
for organizations
fearful of AI-generated content that may cause reputational harm.
Unlike the commonly used ChatGPT on the web, which cannot access current information, or add-ons that create clueless meeting minutes, drafts, or summaries,
Rosie-AI trains itself with industry- and organization-specific resources. It then assists you in creating authentic, context-driven content, thereby continuously training itself with complete transparency and attribution of sources.
This is worth the price of an entry-level employee for our customers,
which we find through channels like Trust Over IP.
Parking Lot
- Working on authentic content
- We believe more knowledge is shared in meetings than is captured and capitalized on.
- LangChain: syncing data sources to vector stores
- Test deleting docs from Xata
- Need a vector store that meets the Indexing API requirements
- Xata should work
- Pinecone should work
- 2023-09-13: Chroma will be offering hosted vector stores for serverless apps
- 2023-09-21: Indexing API is not available in JavaScript yet.
- See the Pinecone docs on querying by metadata (such as source), then delete all matching docs before adding new ones
- Finding information in long documents with AI using vector databases and MapReduceChain from Langchain
- Create Summarization Provider option
- Add Model Version to Chat Settings to allow for 16k
- Added env RETRIEVER_K
- Added RETRIEVER_K slider for Chat Settings
- Are AssemblyAI transcripts better than Zoom's? Can "Plan & Execute" map Speaker A to a Zoom participant? If so, then use AssemblyAI transcripts.
- AssemblyAI: Live transcript and adding Speaker A's name in real time. See also https://picovoice.ai/docs/quick-start/eagle-web/
- Integrate Audio into LangChain.js apps in 5 Minutes - YouTube
- https://www.assemblyai.com/docs/Models/speech_recognition#custom-vocabulary
- Not good enough?
- Create /.netlify/functions/webhook.mjs?testing=123 to call background function and avoid timeout
- Implement AssemblyAI webhook.
- Not possible with AssemblyAI Loader
- https://www.assemblyai.com/docs/Models/speech_recognition#custom-vocabulary
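For the webhook idea in this list (a regular function that hands work to a background function to avoid the timeout), a rough sketch follows. The background function name `transcribe-background` is hypothetical. The parts relied on are Netlify's conventions that a function whose file name ends in `-background` runs as a background function, and that invoking one is acknowledged with 202. `fetchImpl` is injectable only to make the sketch testable.

```javascript
// Hypothetical background function name. Netlify treats functions whose
// file name ends in "-background" as background functions.
const BACKGROUND_PATH = "/.netlify/functions/transcribe-background";

// Receives the webhook call, forwards it to the background function, and
// answers right away so the caller never waits on the long-running work.
async function handleWebhook(request, fetchImpl = fetch) {
  const url = new URL(request.url);
  const payload = {
    testing: url.searchParams.get("testing"),
    body: await request.text(),
  };
  const target = new URL(BACKGROUND_PATH, url.origin);
  const queued = await fetchImpl(target, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(payload),
  });
  // A background function invocation is acknowledged with 202 Accepted.
  const accepted = queued.status === 202;
  return new Response(JSON.stringify({ queued: accepted }), {
    status: accepted ? 200 : 502,
    headers: { "content-type": "application/json" },
  });
}
```

In `webhook.mjs` this could be exposed with `export default (request) => handleWebhook(request);`, assuming the Request/Response style of function handler. Wrapping it that way keeps the platform's second argument from being mistaken for `fetchImpl`.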