RAG with Azure AI Search

Nov 15 2024 · Python 3.12, Microsoft Azure, JupyterLab

Lesson 02: Vector Search in Azure AI Search

Demo 02

In this demo, you’ll create an app, embed and index textual data, and search through it using vector search.

Project Setup

Open the starter project in Visual Studio Code to get started.

Data Preparation

In the next cell, under Generate Embeddings, you initialize an Azure OpenAI client with your credentials. This lets your app send requests to your Azure OpenAI resource, such as the embedding requests you’ll make shortly.
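As a rough illustration of that initialization, here is a minimal sketch. It assumes the credentials live in environment variables; the variable names below are hypothetical and the course notebook may load its settings differently. The AzureOpenAI class comes from the openai package and is imported only when the client is actually built.

```python
import os
from typing import Mapping

# Hypothetical setting names; adjust them to match your own notebook or .env file.
REQUIRED_SETTINGS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
)


def load_openai_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the settings the client needs, failing early if any is missing."""
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise ValueError(f"Missing settings: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_SETTINGS}


def make_openai_client(settings: Mapping[str, str]):
    """Build an Azure OpenAI client from the loaded settings."""
    # Imported here so load_openai_settings works without the openai package.
    from openai import AzureOpenAI

    return AzureOpenAI(
        azure_endpoint=settings["AZURE_OPENAI_ENDPOINT"],
        api_key=settings["AZURE_OPENAI_API_KEY"],
        api_version=settings["AZURE_OPENAI_API_VERSION"],
    )
```

Validating the settings before building the client means a missing key shows up as a clear error in the notebook rather than as a failed request later on.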

Embedding Creation

Your data is now ready for embedding! Embedding converts each piece of text into a vector, which is a necessary step before you can index your data for vector search.
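The embedding step can be sketched roughly as follows. This is an assumption-laden example, not the course's code: the helper names, the content and contentVector field names, and the deployment argument are all hypothetical. The client argument is expected to behave like the AzureOpenAI client from the openai package, whose embeddings.create call returns one embedding per input.

```python
from typing import Any, Sequence


def embed_texts(client: Any, texts: Sequence[str], deployment: str) -> list[list[float]]:
    """Return one embedding vector per input text, in input order."""
    # `deployment` is the name of your embedding model deployment (an assumption here).
    response = client.embeddings.create(input=list(texts), model=deployment)
    return [item.embedding for item in response.data]


def embed_documents(
    client: Any,
    docs: Sequence[dict],
    deployment: str,
    text_field: str = "content",          # hypothetical field holding the text
    vector_field: str = "contentVector",  # hypothetical field for the vector
) -> list[dict]:
    """Copy each document and add its embedding under `vector_field`."""
    vectors = embed_texts(client, [doc[text_field] for doc in docs], deployment)
    # Build new dicts so the original documents are left unchanged.
    return [{**doc, vector_field: vector} for doc, vector in zip(docs, vectors)]
```

Sending all texts in one request keeps the number of calls low, though for large datasets you would likely split the texts into smaller batches.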

Search Client Configuration

Now, you can go ahead and create a search client. This is what you’ll use to work with Azure AI Search. You’ll create one by initializing an instance of SearchIndexClient, which manages the indexes on your search service. In the next cell, under Setup Fields, that’s exactly what your app is doing, along with setting up the fields you want to search. Execute this cell to create the search client.
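To make the idea of index fields concrete, here is a sketch of a field list written as plain dictionaries in the shape used by the Azure AI Search REST API, so it runs without the SDK. The notebook itself uses SDK classes instead. The title, category, and content names come from this lesson; the id and contentVector fields, the profile name, and the 1536 dimension count are assumptions that depend on your data and embedding model.

```python
EMBEDDING_DIMENSIONS = 1536  # assumption: must match your embedding model's output size


def build_fields(vector_profile: str = "my-hnsw-profile") -> list[dict]:
    """Describe the index fields as REST-style dictionaries."""
    return [
        # Every index needs exactly one key field that uniquely identifies a document.
        {"name": "id", "type": "Edm.String", "key": True, "filterable": True},
        {"name": "title", "type": "Edm.String", "searchable": True},
        {"name": "category", "type": "Edm.String", "searchable": True, "filterable": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {
            # Vector field: a collection of floats tied to a vector search profile.
            "name": "contentVector",
            "type": "Collection(Edm.Single)",
            "searchable": True,
            "dimensions": EMBEDDING_DIMENSIONS,
            "vectorSearchProfile": vector_profile,
        },
    ]
```

In the Python SDK, the counterparts of these dictionaries are the SimpleField, SearchableField, and SearchField classes.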

Vector Search Setup

It’s now time to configure vector search! To do this, you’ll create an instance of VectorSearch. In the next cell, you configure it to use the Hierarchical Navigable Small World (HNSW) algorithm. The name argument is simply a label you’ll use to refer to this algorithm configuration.
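The shape of such a configuration can be sketched as a REST-style dictionary, which avoids needing the SDK to run it. The algorithm and profile names are hypothetical, and the HNSW parameter values are illustrative rather than taken from the course. In the SDK, the corresponding classes are VectorSearch, HnswAlgorithmConfiguration, and VectorSearchProfile.

```python
def build_vector_search(
    algorithm_name: str = "my-hnsw",          # hypothetical label for the algorithm
    profile_name: str = "my-hnsw-profile",    # hypothetical label for the profile
) -> dict:
    """Describe an HNSW vector search configuration as a REST-style dictionary."""
    return {
        "algorithms": [
            {
                "name": algorithm_name,
                "kind": "hnsw",
                # Illustrative tuning values; adjust them for your own workload.
                "hnswParameters": {
                    "m": 4,
                    "efConstruction": 400,
                    "efSearch": 500,
                    "metric": "cosine",
                },
            }
        ],
        # A profile points at an algorithm by name; vector fields point at a profile.
        "profiles": [{"name": profile_name, "algorithm": algorithm_name}],
    }


def profiles_reference_known_algorithms(vector_search: dict) -> bool:
    """Check that every profile names an algorithm defined in the same config."""
    known = {algorithm["name"] for algorithm in vector_search["algorithms"]}
    return all(profile["algorithm"] in known for profile in vector_search["profiles"])
```

The check at the end reflects why the name matters: profiles and fields find the algorithm configuration through that label, so a typo there breaks the link.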

Semantic Search Configuration

To enhance your search even further, you’ll configure semantic search in the next cell. This lets you specify which fields semantic search should prioritize. Uncomment the code under # TODO: Configure semantic search, and execute it to create the configuration. Name it my-semantic-config and specify the title, category, and content fields as the prioritized fields.
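A semantic configuration with those three fields might look like the sketch below, again written as a REST-style dictionary so it runs without the SDK. Which role each field plays is an assumption: here title is the title field, content is a content field, and category is a keywords field. The SDK expresses the same thing with SemanticConfiguration, SemanticPrioritizedFields, and SemanticField.

```python
def build_semantic_config(name: str = "my-semantic-config") -> dict:
    """Describe a semantic configuration as a REST-style dictionary."""
    return {
        "name": name,
        "prioritizedFields": {
            # Assumed roles for the three fields named in the lesson.
            "titleField": {"fieldName": "title"},
            "prioritizedContentFields": [{"fieldName": "content"}],
            "prioritizedKeywordsFields": [{"fieldName": "category"}],
        },
    }


def prioritized_field_names(config: dict) -> list[str]:
    """List every field name the configuration prioritizes."""
    fields = config["prioritizedFields"]
    names = [fields["titleField"]["fieldName"]]
    names += [f["fieldName"] for f in fields["prioritizedContentFields"]]
    names += [f["fieldName"] for f in fields["prioritizedKeywordsFields"]]
    return names
```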

Indexing and Querying

With all configurations properly set up, you’re ready to index your data. Your index is named vectest. Uncomment the code in the next cell, execute it, and wait for your index to be created. The output displays “vectest created”, which confirms that your index was created successfully.
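Once documents are indexed, a vector query ranks them by how close their vectors are to the query vector. The sketch below illustrates that ranking with an exact, brute-force cosine similarity search in plain Python. It is not the HNSW algorithm the service uses, which approximates this result much faster on large indexes, and the contentVector field name is an assumption.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vector: Sequence[float],
    docs: Sequence[dict],
    k: int = 3,
    vector_field: str = "contentVector",  # hypothetical field name
) -> list[tuple[float, dict]]:
    """Return the k documents most similar to the query, best match first."""
    scored = [(cosine_similarity(query_vector, doc[vector_field]), doc) for doc in docs]
    # Sort on the score only, so documents themselves are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Comparing the query against every document is fine for a handful of records, but the cost grows with the size of the index, which is the reason approximate methods such as HNSW exist.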

Cleanup

You can try out a few more queries if you want, but don’t forget to clean up after yourself when you’re done! That’s precisely what the last cell does — it deletes your index to preserve resources and avoid incurring unnecessary costs.
