Introducing Chroma Database

Understanding vectors paves the way for understanding vector databases. Many vector databases are available, and LangChain integrates with a large number of them. You can explore the full list of supported vector databases at https://python.langchain.com/v0.1/docs/integrations/vectorstores/.
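
To make the idea concrete, here is a minimal sketch of what a vector database does at its core: it stores vectors under IDs and returns the ones closest to a query vector. The TinyVectorStore class, its method names, and the sample vectors are hypothetical and exist only for illustration. Real vector databases typically use approximate indexes rather than the brute-force scan shown here.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class TinyVectorStore:
    """Hypothetical in-memory store: maps IDs to vectors and finds the nearest ones."""

    def __init__(self):
        self._vectors = {}

    def add(self, item_id, vector):
        self._vectors[item_id] = list(vector)

    def query(self, vector, n_results=1):
        # Brute force: score every stored vector, then keep the closest ones.
        scored = [(item_id, squared_l2(vector, v)) for item_id, v in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1])
        return scored[:n_results]


store = TinyVectorStore()
store.add("cat", [0.9, 0.1])
store.add("dog", [0.8, 0.3])
store.add("car", [0.1, 0.9])

# The query vector sits closest to "cat", then "dog".
nearest = store.query([1.0, 0.0], n_results=2)
print(nearest)
```

A smaller distance means a closer match, which is the same convention you'll see in Chroma's query results.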

Getting Started With Chroma

Chroma is an open-source vector database designed with developer productivity in mind. Chroma is available in Python and JavaScript as of this writing. In your notebook, you can install it with:

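pip install chromadb

Then import the library and create an in-memory client, which keeps data only while the process is running:

import chromadb

chroma_client = chromadb.Client()

To use Chroma through LangChain, wrap it with the langchain_chroma integration. Here, embeddings_model is assumed to be an embedding model you defined earlier in your notebook:

from langchain_chroma import Chroma

db = Chroma(
  embedding_function=embeddings_model,
)

To keep your data on disk between sessions, create a persistent client instead:

client = chromadb.PersistentClient(path="/path/to/storage/directory")

Next, create a collection to hold your documents:

collection = chroma_client.create_collection(name="olympics_collection")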

# Chroma embeds each document and stores it under the ID at the same position.
collection.add(
    documents=[
        "The 2024 Olympics had the most gender-balanced field of play in "
        "history, with equal numbers of male and female athletes.",
        "The United States won the most medals, with 40 gold and 126 total "
        "medals. China came in second with 40 gold medals and 91 total medals.",
        "France spent around $10 billion to host the games, which was more than "
        "three times less than the cost of the 2020 Tokyo Olympics."
    ],
    ids=["id-1", "id-2", "id-3"]
)
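
Because each document is stored under the ID at the same position, the two lists need the same length and the IDs need to be unique. For longer lists, you might generate the IDs rather than typing them by hand. The make_ids helper below is hypothetical and not part of Chroma; it's only a sketch of one way to do that.

```python
def make_ids(documents, prefix="id"):
    """Return one unique ID per document: 'id-1', 'id-2', and so on."""
    return [f"{prefix}-{i}" for i in range(1, len(documents) + 1)]


docs = ["first document", "second document", "third document"]
ids = make_ids(docs)
print(ids)
```

You could then pass docs and ids to collection.add as the documents and ids arguments.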

results = collection.query(
  query_texts=["Which country won the most medals?"],
  n_results=2
)
print(results)

The query returns the two closest documents, along with their distances:

{
  "ids": [
    [
        "id-2",
        "id-1"
    ]
  ],
  "distances": [
    [
      0.5161427855491638,
      1.2563385963439941
    ]
  ],
  "metadatas": [
    [
        None,
        None
    ]
  ],
  "embeddings": None,
  "documents": [
    [
      "The United States won the most medals, with 40 gold and 126 total medals. China came in second with 40 gold medals and 91 total medals.",
      "The 2024 Olympics had the most gender-balanced field of play in history, with equal numbers of male and female athletes."
    ]
  ],
  "uris": None,
  "data": None,
  "included": [
    "metadatas",
    "documents",
    "distances"
  ]
}
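
Each field in the result holds one inner list per query text, ordered from the closest match to the farthest. Here is a minimal sketch of how you might flatten that structure into rows. The unpack_matches helper is hypothetical, and the results dictionary is a hand-copied subset of the output above so the snippet runs without a Chroma client.

```python
# Hand-copied subset of the query output shown above.
results = {
    "ids": [["id-2", "id-1"]],
    "distances": [[0.5161427855491638, 1.2563385963439941]],
    "documents": [[
        "The United States won the most medals, with 40 gold and 126 total medals. "
        "China came in second with 40 gold medals and 91 total medals.",
        "The 2024 Olympics had the most gender-balanced field of play in history, "
        "with equal numbers of male and female athletes.",
    ]],
}


def unpack_matches(results, query_index=0):
    """Return (id, document, distance) tuples for one query text."""
    return list(zip(
        results["ids"][query_index],
        results["documents"][query_index],
        results["distances"][query_index],
    ))


matches = unpack_matches(results)
for doc_id, document, distance in matches:
    print(f"{doc_id} (distance {distance:.3f}): {document}")

# The first row is the closest match because it has the smallest distance.
best_id, best_document, best_distance = matches[0]
```

In this example, the best match is the document stored under id-2, which is the one that mentions medal counts.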