Instruction 01
Why Microsoft Azure?
Microsoft Azure is an open and flexible cloud computing platform. It offers a wide range of products and services capable of handling global, large-scale enterprise workloads. As of this writing, you can create a free account to try out over two hundred of its offerings for a limited 30-day period.
Azure AI Search
Azure AI Search (formerly known as Azure Cognitive Search) provides information retrieval services for both traditional and generative AI search. Like all other Azure products and services, it comes with many additional benefits, such as security, scalability, monitoring, integration with other services, customer support, and more.
Azure AI Search also includes indexing for custom data, support for different types of queries, query performance tuning, and many other AI features.
As the AI buzz keeps growing, it’s a good idea to learn more, and to make use of it in your own apps!
So, what are the offerings of Azure AI Search — and why isn’t traditional search enough? Read on to learn about the core features and capabilities of Azure AI Search.
Exploring Core Features of Azure AI Search
At the heart of Azure AI Search is information retrieval. With Azure AI Search, you get a full search engine capable of performing full-text search, vector search, and hybrid search.
Azure AI Search performs full-text search with Apache Lucene, a powerful search library that produces quick, relevant results for complex textual queries. Vector search is what makes retrieval for generative AI possible, because Machine Learning (ML) models and Large Language Models (LLMs) store and work with data represented as vectors. Vectors are numerical representations of data that let you index it by semantic or conceptual similarity, and they work with text, media formats, documents, graphs, and many other kinds of data. Hybrid search combines the power of both textual and vector search.
Each of these kinds of searches comes with its own unique extra capabilities you’ll find useful in a number of situations. For example:
- Lucene search means you can search any part of a text and still get good results.
- Vector search means you can get an idea of how good your search results are with regard to your queries and the indexed data. This gives you the opportunity to apply the many advanced AI techniques out there to enhance searches for your use case.
To put these capabilities to work, Azure AI Search supports textual queries, fuzzy search, autocomplete, geospatial search, vector queries, and more.
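If you're curious what a query looks like in code, here's a minimal sketch of a hybrid (full-text plus vector) query using the azure-search-documents Python SDK. The service endpoint, the docs-index index, the contentVector field, and the embed() helper are all placeholder assumptions for illustration, not details from any real setup.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

# Assumed service endpoint, index name, and key -- replace with your own.
search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="docs-index",
    credential=AzureKeyCredential("<your-query-key>"),
)

query_text = "quarterly revenue figures"
query_vector = embed(query_text)  # assumed helper returning the query's embedding

results = search_client.search(
    search_text=query_text,            # full-text (Lucene) half of the hybrid query
    vector_queries=[
        VectorizedQuery(
            vector=query_vector,
            k_nearest_neighbors=5,
            fields="contentVector",    # vector field defined on the index
        )
    ],
    select=["content"],
    top=5,
)

for doc in results:
    print(doc["@search.score"], doc["content"])
```

Dropping either the search_text or the vector_queries argument turns this same call into a pure full-text or pure vector query.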
Before a search is possible, however, you need to index the data you want to search over. Thankfully, Azure AI Search includes indexing for all the kinds of search described above! Internally, it handles the data preparation each type of indexing requires, such as chunking, enrichment, caching, and embedding.
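For context, here's a hedged sketch of what defining such an index might look like with the azure-search-documents Python SDK (roughly version 11.4 or later). The index and field names, the vector dimensions, and the service details are assumptions, not values you have to use.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    HnswAlgorithmConfiguration,
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    SearchableField,
    SimpleField,
    VectorSearch,
    VectorSearchProfile,
)

index_client = SearchIndexClient(
    endpoint="https://<your-service>.search.windows.net",  # assumed endpoint
    credential=AzureKeyCredential("<your-admin-key>"),
)

index = SearchIndex(
    name="docs-index",  # assumed index name
    fields=[
        SimpleField(name="id", type=SearchFieldDataType.String, key=True),
        SearchableField(name="content", type=SearchFieldDataType.String),  # full-text field
        SearchField(
            name="contentVector",
            type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
            searchable=True,
            vector_search_dimensions=1536,  # must match your embedding model
            vector_search_profile_name="default-profile",
        ),
    ],
    vector_search=VectorSearch(
        algorithms=[HnswAlgorithmConfiguration(name="default-hnsw")],
        profiles=[
            VectorSearchProfile(
                name="default-profile",
                algorithm_configuration_name="default-hnsw",
            )
        ],
    ),
)

index_client.create_or_update_index(index)
```

With an index like this in place, you upload documents (each carrying its precomputed embedding in contentVector), and queries like the one sketched earlier start returning results.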
Another huge benefit of Azure AI Search is simply that it’s a Microsoft Azure service — this means you get scalability, security, and global reach. With servers across the globe, you can get these offerings close to the geographical location of your target market.
Finally, you get easy integrations with the many Azure services and products available. Some integrations, such as the Azure OpenAI Service, come with pre-built features and simplified, richer customizations you can use right away. This gives you the opportunity to build rich and unique apps.
Understanding RAGs
Retrieval-Augmented Generation (RAG) is an AI pattern in which LLMs are enhanced with custom data. LLMs are built by training on existing information; besides teaching the model how to generate human-readable responses, this also means its responses are limited to that training data. RAG relies on the LLM's generative capabilities to produce relevant results grounded in custom, dynamic data: it supplies the model with new data, but leverages the model's existing generative knowledge to produce new, relevant responses.
Be aware that there are limitations to LLMs. For example, ChatGPT's training data cut-off is (at this time of writing) 2021, which means it won't be able to answer questions about anything after that point unless it searches online.
Another limitation is that not every piece of information is available online. Even if ChatGPT searches online for something it doesn't have data on, it still won't gain access to private information, such as your business's financials or internal knowledge base, so it can't answer any questions about those.
RAG overcomes these limitations by allowing you to index custom data and search over it. This is one of the most popular uses of AI worldwide; businesses are constantly integrating RAG and variations of it into their apps and other services.
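To make the idea concrete, here's a minimal sketch of the RAG flow: retrieve matching chunks from an Azure AI Search index, then hand them to an LLM as grounding context. The index and field names, the Azure OpenAI deployment, and the keys are placeholder assumptions.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from openai import AzureOpenAI

search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",  # assumed search endpoint
    index_name="docs-index",                                # assumed index name
    credential=AzureKeyCredential("<your-query-key>"),
)
llm = AzureOpenAI(
    azure_endpoint="https://<your-openai-resource>.openai.azure.com",
    api_key="<your-openai-key>",
    api_version="2024-02-01",
)

question = "What was our Q3 revenue?"

# 1. Retrieve: find the indexed chunks most relevant to the question.
hits = search_client.search(search_text=question, top=3)
context = "\n\n".join(doc["content"] for doc in hits)

# 2. Augment and generate: ground the LLM's answer in the retrieved text.
response = llm.chat.completions.create(
    model="<your-chat-deployment>",  # assumed Azure OpenAI deployment name
    messages=[
        {"role": "system", "content": "Answer only from the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(response.choices[0].message.content)
```

Because the answer is generated from whatever you index, the model can respond about private data it was never trained on.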
Comparing Azure AI Search with custom LangChain RAG solutions
Despite all of Azure AI Search’s capabilities, in this module, the emphasis is on RAG with Azure AI Search. This means you’ll use more of the vector and hybrid indexing, querying, and optimization features of Azure AI Search. Compared with LangChain, there are some noteworthy points to keep in mind.
LangChain is an open-source framework for building LLM-based apps. It has a host of components that provide integrations with lots of other tools and frameworks in the AI/ML ecosystem, exposing simple, unified APIs that make building complex AI apps a breeze. By chaining these components together, you can build RAG apps with minimal effort, all from within an editor. It's for good reason that its growth is so rapid!
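To make the comparison concrete, here's a hedged sketch of the same RAG pattern built by chaining LangChain components. It assumes the langchain-openai, langchain-community, and faiss-cpu packages are installed and an OpenAI API key is configured; the sample texts and model name are placeholders.

```python
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Index a couple of sample texts into an in-memory vector store.
vector_store = FAISS.from_texts(
    ["Our Q3 revenue was $1.2M.", "The support portal runs on Azure."],
    embedding=OpenAIEmbeddings(),
)
retriever = vector_store.as_retriever(search_kwargs={"k": 2})

prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)

def format_docs(docs):
    # Join the retrieved documents into one context string for the prompt.
    return "\n\n".join(doc.page_content for doc in docs)

# Chain the components: retrieve -> prompt -> LLM -> plain text.
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)

print(chain.invoke("What was our Q3 revenue?"))
```

Swapping the in-memory FAISS store for a managed vector store (Azure AI Search included) mainly means swapping out the retriever, which is part of what makes LangChain so flexible.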
Here’s how Azure AI Search compares with LangChain:
- Azure AI Search is a cloud service. While it has SDKs for different programming languages that let you work with it from your own code, you need to create an account online to use it, create the various resources, and assign the necessary permissions. This makes Azure AI Search more involved to set up, as it includes a number of things you may not even need. LangChain, on the other hand, doesn't require account creation, but you'll have to find the other components from the LangChain community and follow their requirements. Because you only include the features you need, it's simpler and more flexible to use than Azure AI Search.
- Azure AI Search is purpose-built for search. The upside is that it integrates easily with many other tools that are also part of the Microsoft Azure platform. LangChain isn't built just for search; it's built for everything LLM-related, and it also offers integrations with many other products and services via community components.
- Azure AI Search is a proprietary Microsoft product. This means you're limited to what the platform provides, as well as by its pricing models and architectural infrastructure. LangChain, however, is free and open source: you can access the source code online and customize it to suit your use case. Azure AI Search can get pretty costly to use, while the open-source nature of LangChain means it's possible to build a similarly rich app at little or no cost.
- Azure AI Search is much older and, consequently, more reliable than LangChain. As of this writing, LangChain has yet to reach a 1.0 or stable release. This doesn't mean LangChain isn't stable, but at version 0.3 it's still some way from where the maintainers want it to be for large-scale, production-grade use. There's lots of content and support for Azure AI Search because it's been around for much longer and is backed by Microsoft; there's much less for LangChain, although sufficient information is still out there.
- Because of Microsoft Azure's rich ecosystem and Azure AI Search's feature-rich capabilities, it can be overwhelming for beginners. There are many ways to use it, and lots of documentation full of technical terms, which can sometimes make it intimidating to find what you're looking for. LangChain, by contrast, is made up of a relatively small core and focused community libraries that provide integrations with products and services outside LangChain.
- Azure AI Search is easily scalable: it can be deployed in many regions, and most things you'll ever need are already built for you. You only need to identify them and follow the rules. With LangChain, you have to build everything yourself. While this means you get to design things your way, it can also mean a lot of time spent building a solution comparable to what Azure AI Search gives you.
In the next segment, you’ll see various ways to access and use Azure AI Search.