telequest's Issues
Add handler for questions
We have a start handler for the bot, so it responds when someone types /start. Add a question handler, maybe `/q`, so that when someone sends a message such as `/q what time is the event?` the bot knows it's a question and can respond.
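One way to approach the parsing half of this is sketched below. It assumes the command is `/q` (the issue only suggests it) and that in group chats the command may arrive as `/q@botname`. The function name `extract_question` is hypothetical, and wiring it into the bot framework's handler registration is not shown.

```python
from typing import Optional

# Assumed command name; the issue only says "maybe /q".
QUESTION_COMMAND = "/q"


def extract_question(text: str) -> Optional[str]:
    """Return the question that follows /q, or None if this is not a /q message."""
    parts = text.strip().split(maxsplit=1)
    if not parts:
        return None
    # In group chats the command can carry the bot's username: "/q@SomeBot".
    command = parts[0].split("@", 1)[0]
    if command.lower() != QUESTION_COMMAND:
        return None
    question = parts[1].strip() if len(parts) > 1 else ""
    # Treat a bare "/q" with no text as "no question".
    return question or None
```

A handler could call this on the incoming message text and only run the question-answering path when the result is not `None`.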
Add function(s) to store multiple messages to the database.
In the database code, we have a function to store a single message to MongoDB. Add a function to take in a list of messages and store everything to MongoDB. The messages will all belong to the same group chat. The function should take in the chat_id of the group chat as well.
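A minimal sketch of such a function follows. It assumes each message is already a dict and that `collection` is a pymongo-style collection exposing `insert_many`; the name `store_messages` and the `chat_id` field name are hypothetical, not taken from the existing database code.

```python
from typing import Any, Dict, List


def store_messages(collection: Any, chat_id: int, messages: List[Dict[str, Any]]) -> int:
    """Store all messages for one group chat and return how many were inserted."""
    if not messages:
        # insert_many rejects an empty list, so return early.
        return 0
    # Tag every document with the chat it belongs to (field name is an assumption).
    docs = [{**message, "chat_id": chat_id} for message in messages]
    result = collection.insert_many(docs)
    return len(result.inserted_ids)
```

Passing the collection in as a parameter keeps the function easy to test with a stand-in object.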
Create initial telegram bot
MongoDB (or other NoSQL) for storing chat info
We need to store chat information in databases to be able to use them. For a single message, we likely need to store
- ID of the group it was sent in
- User ID
- date/time it was sent
- Text content of message
- ID of the message it was replying to, if it is a reply
- Other metadata eg links in message or if it's solely media
We could also use a relational DB, but non-relational / NoSQL might be better. See [MongoDB NoSQL vs SQL](https://www.mongodb.com/nosql-explained/nosql-vs-sql)
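The fields listed above could be sketched as a single document type. Every field name here is a hypothetical choice for illustration; the issue does not fix a schema.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class StoredMessage:
    chat_id: int                                # ID of the group it was sent in
    user_id: int                                # ID of the sender
    sent_at: datetime                           # date/time it was sent
    text: str                                   # text content of the message
    reply_to_message_id: Optional[int] = None   # set only when it is a reply
    links: List[str] = field(default_factory=list)  # links found in the message
    is_media_only: bool = False                 # True when there is no text, only media


message = StoredMessage(
    chat_id=-100123,
    user_id=555,
    sent_at=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    text="What time is the event?",
)
document = asdict(message)  # plain dict, ready to hand to a document store
```

Because optional fields default to empty values, a document store can hold replies, plain messages, and media-only messages in the same collection without a schema migration.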
Read chat history
When the bot is newly added to a Telegram group, it needs to read the last 30,000 messages and store them in MongoDB. It should also convert them to embeddings and store them in Pinecone.
We need a listener that can determine when the bot is added to a group and then read the chat history.
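The detection half of the listener might look like the sketch below. It assumes the raw update is available as a dict and that a change in the bot's own membership arrives under a `my_chat_member` key with `old_chat_member` and `new_chat_member` statuses; that shape should be checked against the Bot API documentation. The function name is hypothetical, and reading the history itself is not covered.

```python
from typing import Any, Dict, Optional

# Membership statuses as assumed from the Bot API; verify before relying on them.
PRESENT = {"member", "administrator"}
ABSENT = {"left", "kicked"}
GROUP_TYPES = {"group", "supergroup"}


def chat_id_if_bot_added(update: Dict[str, Any]) -> Optional[int]:
    """Return the group chat ID if this update says the bot was just added."""
    change = update.get("my_chat_member")
    if not change:
        return None
    chat = change.get("chat", {})
    if chat.get("type") not in GROUP_TYPES:
        return None
    old_status = change.get("old_chat_member", {}).get("status")
    new_status = change.get("new_chat_member", {}).get("status")
    # Only a transition from absent to present counts as "newly added".
    if old_status in ABSENT and new_status in PRESENT:
        return chat.get("id")
    return None
```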
Create contributing doc
Vector database
These are used to store embeddings for LLMs. Vector embeddings are representations of text as lists of numbers that help capture semantic similarities. We will use one to store embeddings of Telegram messages.
Pinecone is a popular vector database, but there might be others. Research and suggest a good one @makoohara.
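To make "capture semantic similarities" concrete, here is a small sketch of cosine similarity, a common way to compare two embeddings. The three-number vectors are made up for illustration; real embeddings have many more dimensions, and the issue does not say which metric the chosen database will use.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for message embeddings (values are invented).
event_time = [0.9, 0.1, 0.0]
event_start = [0.8, 0.2, 0.1]
lunch_menu = [0.0, 0.1, 0.9]

closer = cosine_similarity(event_time, event_start)
farther = cosine_similarity(event_time, lunch_menu)
```

A vector database performs this kind of comparison at scale, returning the stored vectors closest to a query vector.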
Fix ask function
In get_answers.py, the ask function should use the correct call for creating a chat completion.
Create landing page
Create a first page that gives info about the application. Use simple styles with white as the main background color. @andriikorotun24 and @casscruzz are working on this.
Test out frontend for additions
Run the landing page and make any additions you deem needed. If you do not have any additions to make, update @Liet3 promptly so you can also look at the backend code.
Test out rate limits
OpenAI sets some strict rate limits on its APIs. After the Telegram bot is created, use it to test the rate limits on the embedding and GPT APIs.
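Once the limits are known, one common way to cope with them is retrying with exponential backoff. The sketch below is library-agnostic: `RateLimited` is a hypothetical stand-in for whatever error the client raises when a limit is hit, and the delay values are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for the client's rate-limit error."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, waiting base_delay * 2**attempt seconds after each rate-limit error."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries, let the caller see the error
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("max_retries must be non-negative")
```

The `sleep` parameter exists so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.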
OpenAI for embeddings
Create a basic setup of using OpenAI's APIs for embeddings and Q/A.
Store vector embeddings from chat history to Pinecone
In vectordb.py, we have a function `upload_vectors` that can upload a list of at most 100 embeddings to the vector database.
Write a function that can take in a very large list of embedding data, break it into batches of at most 100, and call the `upload_vectors` function on each batch.
Assume the structure of the embedding data is the same as in `upload_vectors`: https://github.com/minerva-university/TeleQuest/blob/dev/db/vectordb.py#L33
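A minimal sketch of the batching logic is below. The real `upload_vectors` signature is not reproduced here, so the uploader is passed in as a callable that is assumed to accept one list of embedding items; `batched` and `upload_in_batches` are hypothetical names.

```python
from typing import Any, Callable, Iterator, List, Sequence

MAX_BATCH_SIZE = 100  # limit stated in the issue


def batched(items: Sequence[Any], size: int = MAX_BATCH_SIZE) -> Iterator[List[Any]]:
    """Yield consecutive slices of items, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def upload_in_batches(
    embeddings: Sequence[Any],
    upload_vectors: Callable[[List[Any]], Any],
    batch_size: int = MAX_BATCH_SIZE,
) -> int:
    """Call upload_vectors once per batch and return the number of batches sent."""
    batches_sent = 0
    for batch in batched(embeddings, batch_size):
        upload_vectors(batch)
        batches_sent += 1
    return batches_sent
```

With 250 items this makes three calls, with 100, 100, and 50 items respectively.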
Docker
ChatCompletion endpoint randomly freezes.
Add function(s) to read message from database
Given a list of message IDs, we want to get those messages from the MongoDB database. Add this function (or these functions) to the database code.
The answer in this StackOverflow post is a nice starting point: https://stackoverflow.com/questions/23137870/mongodb-check-if-field-is-one-of-many-values
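Following that starting point, a lookup with MongoDB's `$in` operator could be sketched as below. It assumes a pymongo-style collection exposing `find` and that messages are keyed by a field called `message_id`; both the function name and the field name are hypothetical.

```python
from typing import Any, Dict, List, Sequence


def get_messages_by_ids(
    collection: Any,
    message_ids: Sequence[int],
    id_field: str = "message_id",  # assumed field name, adjust to the real schema
) -> List[Dict[str, Any]]:
    """Return every stored message whose ID is in message_ids."""
    if not message_ids:
        return []  # nothing to look up, skip the query
    # $in matches documents whose field equals any value in the list.
    cursor = collection.find({id_field: {"$in": list(message_ids)}})
    return list(cursor)
```

Note that the results are not guaranteed to come back in the same order as `message_ids`, so callers that care about order should sort afterwards.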