
docs's People

Contributors

arnoldligtvoet, atroyn, beggers, cakecrusher, carolinedlu, davelak, davidateg, dbasch, dooart, fluder-paradyne, giuliohome, hammadb, helgesverre, jeffchuber, joanfm, kishkath, levand, mithleshupadhyay, nirga, nxp4code, russell-pollari, satyamdalai, sourishkrout, sudhcha, sweep-ai[bot], swyxio, tazarov, timothycarambat, tuanacelik, weiligu


docs's Issues

Document float comparison

Document best practices for comparing floats in where filters, i.e., don't use $eq; use $gt and $lt with a tolerance instead.
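A minimal sketch of the tolerance approach, for illustration only. The helper name `approx_equal_filter`, the metadata field `score`, and the default tolerance are assumptions, not part of the library; the filter is built as a plain dict that combines `$gt` and `$lt` under `$and`.

```python
def approx_equal_filter(field, target, tolerance=1e-6):
    """Build a where filter matching values within `tolerance` of `target`.

    Hypothetical helper: it only constructs a dict and never calls the database.
    """
    return {
        "$and": [
            {field: {"$gt": target - tolerance}},
            {field: {"$lt": target + tolerance}},
        ]
    }


# Why exact equality is fragile: this sum is not exactly 0.3 in binary floating point.
stored_value = 0.1 + 0.2
exact_match = stored_value == 0.3  # False

where = approx_equal_filter("score", 0.3)
lower = where["$and"][0]["score"]["$gt"]
upper = where["$and"][1]["score"]["$lt"]
in_range = lower < stored_value < upper  # True
```

The resulting dict would then be passed as the `where` argument of a query, for example `collection.get(where=where)`. A suitable tolerance depends on how the values were computed and stored.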

Performance tips

Help users understand the performance and memory requirements of the system at various scales.

have a "standard" format for embeddings integrations

Right now they are all very different. Maybe we include info like:

  • description
  • dimensionality
  • max input tokens
  • tokenizer
  • cost
  • languages (english, other?)
  • performance stats? on MTEB/BEIR?
  • decoder
  • how to get an API Key
  • link to their docs / marketing
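One way the proposed fields could be captured is as a single record type that every integration page fills in. This is only a sketch: the class name, field names, and the example values are all hypothetical placeholders, not data about any real provider.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List, Optional


@dataclass
class EmbeddingIntegrationInfo:
    """Hypothetical record mirroring the fields suggested in this issue."""
    name: str
    description: str
    dimensionality: Optional[int] = None
    max_input_tokens: Optional[int] = None
    tokenizer: Optional[str] = None
    cost: Optional[str] = None
    languages: List[str] = field(default_factory=list)
    benchmark_stats: Dict[str, float] = field(default_factory=dict)  # e.g. MTEB/BEIR
    decoder: Optional[str] = None
    api_key_instructions: Optional[str] = None
    docs_url: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are still unset or empty."""
        return [f.name for f in fields(self)
                if getattr(self, f.name) in (None, [], {})]


# Placeholder entry, not real provider data.
example = EmbeddingIntegrationInfo(
    name="example-provider",
    description="Placeholder entry for illustration",
    dimensionality=384,
    languages=["english"],
)
todo = example.missing_fields()
```

A check like `missing_fields()` could flag integration pages that have not yet filled in every section of the standard format.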

[DOCS] Incorrect claim about uniqueness of user provided document id's

The User Guide states:

Each document must have a unique associated id. If you try to .add a document with an id that already exists in the collection, it will raise an exception.

The below example demonstrates that this is not the case:

import chromadb

client = chromadb.Client()
collection = client.create_collection(name="my_collection")
collection.add(
    documents=['doc1', 'doc2'],
    ids=['id1', 'id1'] # <-- duplicate IDs
)

collection.get(
    ids=["id1"]
)

Result:

{'ids': ['id1', 'id1'],
 'embeddings': None,
 'documents': ['doc1', 'doc2'],
 'metadatas': [None, None]}

The add() implementation for duckdb and the add() implementation for clickhouse both appear to confirm this: uniqueness is not enforced when new records are inserted, and no exception is raised.
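Until the behavior or the docs change, a caller could guard against this on the client side. The sketch below is an assumption about how one might do that; `find_duplicate_ids` and `require_unique_ids` are hypothetical helpers, not library functions, and they only catch duplicates within a single batch, not IDs that already exist in the collection.

```python
from collections import Counter


def find_duplicate_ids(ids):
    """Return the sorted list of IDs that occur more than once in `ids`."""
    counts = Counter(ids)
    return sorted(id_ for id_, n in counts.items() if n > 1)


def require_unique_ids(ids):
    """Raise ValueError if `ids` contains duplicates; otherwise return `ids`."""
    duplicates = find_duplicate_ids(ids)
    if duplicates:
        raise ValueError(f"duplicate ids in batch: {duplicates}")
    return ids


duplicates = find_duplicate_ids(["id1", "id1"])   # the batch from the example above
clean = find_duplicate_ids(["id1", "id2"])
```

Calling `require_unique_ids(ids)` before `collection.add(...)` would turn the silent duplicate insert shown above into an explicit error.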

Typo in docs/embeddings.md

I'm new to this project and wanted to create embeddings using val = default_ef("foo"), but I was getting absolutely massive embedding sizes. I think the string "foo" gets turned into a list of its characters, and then each character gets its own embedding. I think it should be val = default_ef(["foo"]) instead.
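The suspected cause can be illustrated without the library. The sketch below uses a hypothetical stand-in, `toy_embedding_function`, that returns one vector per item it iterates over; whether default_ef works the same way internally is the reporter's guess, not something confirmed here. What it does show is that iterating over a Python string yields its characters.

```python
def toy_embedding_function(texts):
    """Hypothetical stand-in: returns one 2-dimensional vector per item in `texts`."""
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


# A bare string is iterated character by character: 'f', 'o', 'o'.
from_string = toy_embedding_function("foo")

# A list containing one string is iterated once.
from_list = toy_embedding_function(["foo"])

count_from_string = len(from_string)  # 3 vectors, one per character
count_from_list = len(from_list)      # 1 vector for the whole document
```

Wrapping a single document in a list avoids the accidental per-character behavior.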
