microsoft / azuredataretrievalaugmentedgenerationsamples
Samples to demonstrate pathways for Retrieval Augmented Generation (RAG) for Azure Data
License: MIT License
In order to protect and secure Microsoft, private or internal repositories in GitHub for Open Source which are not related to open source projects or require collaboration with 3rd parties (customers, partners, etc.) must be migrated to GitHub inside Microsoft, a.k.a. GitHub Enterprise Cloud with Enterprise Managed User (GHEC EMU).
✍️ Please RSVP to opt-in or opt-out of the migration to GitHub inside Microsoft.
❗Only users with admin permission in the repository are allowed to respond. Failure to provide a response will result in your repository getting automatically archived.🔒
Reply with a comment on this issue containing one of the following optin or optout command options below.
✅ Opt-in to migrate
@gimsvc optin --date <target_migration_date in mm-dd-yyyy format>
Example:
@gimsvc optin --date 03-15-2023
OR
❌ Opt-out of migration
@gimsvc optout --reason <staging|collaboration|delete|other>
Example:
@gimsvc optout --reason staging
Options:
staging: This repository will ship as Open Source or go public.
collaboration: Used for external or 3rd party collaboration with customers, partners, suppliers, etc.
delete: This repository will be deleted because it is no longer needed.
other: Other reasons not specified.
I am following this tutorial about using the vector similarity search functionality of Azure Cosmos DB for MongoDB vCore. To do so, I created a Cosmos DB resource using "Try Azure Cosmos DB" with a resource group located in East US.
I connected to the database using this connection string:
import urllib.parse  # urllib.parse.quote is used below
import pymongo
COSMOS_MONGO_USER = 'cosmosrgeastus3xxxxxxxxxxxxxxxxxxxxxb'
COSMOS_MONGO_PWD = 'zxxxxxxxxxxxxxxxxxxxxxxxxxxxxx='
COSMOS_MONGO_SERVER = 'cosmosrgeastus318282c5-ac03-48af-82f4db.mongo.cosmos.azure.com'
COSMOS_MONGO_PORT = '10255'
mongo_conn = "mongodb://"+urllib.parse.quote(COSMOS_MONGO_USER)+":"+urllib.parse.quote(COSMOS_MONGO_PWD)+"@"+COSMOS_MONGO_SERVER+':'+COSMOS_MONGO_PORT+"?ssl=true&replicaSet=globaldb&retrywrites=false&maxIdleTimeMS=120000&appName=@cosmosrgeastus318282c5-ac03-48af-82f4db@"
mongo_client = pymongo.MongoClient(mongo_conn)
Despite a warning ("You appear to be connected to a CosmosDB cluster"), the client seems to be created successfully.
Note:
According to the tutorial, the connection string is supposed to be
mongo_conn = "mongodb+srv://"+urllib.parse.quote(COSMOS_MONGO_USER)+":"+urllib.parse.quote(COSMOS_MONGO_PWD)+"@"+COSMOS_MONGO_SERVER+"?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
However, using that raises an exception "ConfigurationError: The DNS query name does not exist: _mongodb._tcp.cosmosrgeastus318282c5-ac03-48af-82f4db.mongo.cosmos.azure.com."
That is why I changed it to the actual connection string provided by the Azure CosmosDB resource alongside the user, password and server values.
Then I created a database and a collection
# create a database called TutorialDB
db = mongo_client['TutorialDB']
# Create collection if it doesn't exist
COLLECTION_NAME = "CarrierManualCollection"
collection = db[COLLECTION_NAME]
if COLLECTION_NAME not in db.list_collection_names():
    # Creates an unsharded collection that uses the DB's shared throughput
    db.create_collection(COLLECTION_NAME)
    print("Created collection '{}'.\n".format(COLLECTION_NAME))
else:
    print("Using collection: '{}'.\n".format(COLLECTION_NAME))
As expected, this prints Created collection 'CarrierManualCollection'.
Then I tried to create an IVF index, since "IVF is supported on all cluster tiers, including the free tier".
db.command({
    'createIndexes': COLLECTION_NAME,
    'indexes': [
        {
            'name': 'VectorSearchIndex',
            'key': {
                "contentVector": "cosmosSearch"
            },
            'cosmosSearchOptions': {
                'kind': 'vector-ivf',
                'numLists': 1,
                'similarity': 'COS',
                'dimensions': 1536
            }
        }
    ]
})
But I got this error message:
OperationFailure: cosmosSearchOptions, full error: {'ok': 0.0, 'errmsg': 'cosmosSearchOptions', 'code': 197, 'codeName': 'InvalidIndexSpecificationOption'}
The expected behavior is to get a success message that allows me to continue with the tutorial adding data to the collection.
What am I missing?
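One thing that may be worth ruling out: the host in this report ends in mongo.cosmos.azure.com and uses port 10255, while the other reports on this page that connect with the tutorial's mongodb+srv string use hosts ending in mongocluster.cosmos.azure.com. Assuming that suffix is what separates a vCore cluster from a request-unit (RU) account, which this thread does not confirm, the sketch below classifies a host name before any index is created. The function name classify_cosmos_host and the labels it returns are hypothetical.

```python
# Host-name suffixes as they appear in the connection details quoted on this page.
# Treating them as "vCore" vs "RU" is an assumption, not something the thread confirms.
VCORE_SUFFIX = ".mongocluster.cosmos.azure.com"
RU_SUFFIX = ".mongo.cosmos.azure.com"


def classify_cosmos_host(server: str) -> str:
    """Return 'vcore', 'ru', or 'unknown' for a bare host, optionally with :port or a trailing slash."""
    host = server.strip().lower().rstrip("/")
    host = host.split(":")[0]  # drop an optional :port
    if host.endswith(VCORE_SUFFIX):
        return "vcore"
    if host.endswith(RU_SUFFIX):
        return "ru"
    return "unknown"
```

If the host is classified as "ru", the cosmosSearchOptions error may simply mean the resource is not a vCore cluster, in which case the tutorial's index definition would not apply to it.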
Hi, I'm trying the tutorial notebook for CosmosDB-MongoDB-vCore.
I have no problems connecting with:
mongo_conn = "mongodb+srv://"+urllib.parse.quote(COSMOS_MONGO_USER)+":"+urllib.parse.quote(COSMOS_MONGO_PWD)+"@"+COSMOS_MONGO_SERVER+"?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
mongo_client = pymongo.MongoClient(mongo_conn)
But then, when creating the DB and listing collection names with:
db = mongo_client['ExampleDB']
COLLECTION_NAME = "ExampleCollection"
collection = db[COLLECTION_NAME]
if COLLECTION_NAME not in db.list_collection_names():
    db.create_collection(COLLECTION_NAME)
the call to:
db.list_collection_names()
does not go through and returns the error:
ServerSelectionTimeoutError: c.cosmos-db-openai-explore.mongocluster.cosmos.azure.com:10260: timed out (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 30s, Topology Description: <TopologyDescription id: 6602e5685fa5c332a820d706, topology_type: Unknown, servers: [<ServerDescription ('c.cosmos-db-openai-explore.mongocluster.cosmos.azure.com', 10260) server_type: Unknown, rtt: None, error=NetworkTimeout('c.cosmos-db-openai-explore.mongocluster.cosmos.azure.com:10260: timed out (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>
Any advice? Thanks in advance!
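The NetworkTimeout inside the error above says the driver could not open a connection to the host on port 10260 within the configured time. A quick way to separate a network problem from a driver problem is to try a plain TCP connection outside pymongo. The sketch below does only that; the helper name can_reach is hypothetical, and the idea that a firewall rule or blocked outbound port is responsible is an assumption, not something this report confirms.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False
```

For example, can_reach("c.cosmos-db-openai-explore.mongocluster.cosmos.azure.com", 10260) returning False would point at the network path (client IP not allowed, or outbound port blocked) rather than at the notebook code.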
Do you have plans to support more languages for the add-your-data vCore feature? If so, what is the ETA? Thanks for your reply :)
Postgres notebook has a rewrite issue. On my side, notebook cell keeps running. Discussed with Hossein. A different but related issue occurs on his side. Data keeps appending.
python repo:
CosmosDB-MongoDB-vCore
CosmosDB-NoSQL_CognitiveSearch
I'm trying to set up the connection for my MongoDB vCore cluster using pymongo. Here is the code I used:
mongo_conn = "mongodb+srv://"+COSMOS_MONGO_USER+":"+COSMOS_MONGO_PWD+"@"+COSMOS_MONGO_SERVER+"?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
mongo_client = pymongo.MongoClient(mongo_conn)
This is the error I received:
ConfigurationError: nameserver is not a dns.nameserver.Nameserver instance or text form, IP address, nor a valid https URL
The COSMOS_MONGO_SERVER value was in this format:
sample-db.mongocluster.cosmos.azure.com/
Environment:
Python: 3.11.5
Pymongo : 4.5
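Two details in this report differ from the other connection snippets on this page: the user name and password are concatenated without percent-encoding, and the server value carries a trailing slash. The sketch below builds the same mongodb+srv string while normalizing both. It is not claimed to explain the ConfigurationError, which is raised during DNS resolution; it only removes two variables from the picture. The function name build_vcore_uri is hypothetical, and the query options are copied from the snippet above.

```python
from urllib.parse import quote_plus


def build_vcore_uri(user: str, password: str, server: str) -> str:
    """Build a mongodb+srv URI with encoded credentials and a normalized host."""
    # Remove surrounding whitespace and any trailing slash so exactly one "/"
    # ends up between the host and the options.
    host = server.strip().rstrip("/")
    return (
        "mongodb+srv://"
        + quote_plus(user) + ":" + quote_plus(password)
        + "@" + host
        + "/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
    )
```

The result can be passed to pymongo.MongoClient in place of the hand-concatenated string.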