Working with Azure Search in C#.NET

What is Azure Search

Azure Search is a service in the Microsoft Azure cloud platform that provides indexing and querying capabilities. It is intended to give developers complex search capabilities for mobile and web development while hiding infrastructure requirements and search algorithm complexities.

I like the idea of having an index service provider as part of my solution, as it allows developers to perform searches quickly and effortlessly without going through the pain of writing essay-length SQL queries.

The key capabilities of Azure Search include scalable full-text search over multiple languages, geo-spatial search, filtering and faceted navigation, type-ahead queries, hit highlighting, and custom analyzers.

Why Azure Search (or an index provider in general)

  1. It’s Brilliant – Having an index provider sitting between the application and the database is a great way to shield the database from unnecessary or inefficient read requests. This allows database I/O and computing power to be reserved for operations that truly matter.
  2. It’s Modern – The traditional search approach requires developers to write essay-length SQL queries to retrieve simple aggregated data. An index provider such as Azure Search lets developers issue a search request without writing a complicated search algorithm (for example, geo-spatial search) or a complex SQL query (for example, faceted navigation).
  3. It Scales – The solutions that benefit most from an index provider are relatively large enterprise systems. Such systems often require scaling to a certain extent. Scaling Azure Search in the Microsoft Azure cloud platform is several clicks away, compared with running an on-premise index provider such as Solr.
  4. It’s FREE – Microsoft Azure provides a free tier for Azure Search. Developers can create and delete Azure Search services for development and self-learning purposes.

Use Case

A car classified site is a good example of a system that can make use of an index service provider such as Azure Search. I will use a local car classified site, Carlist.my, to illustrate several features such a system can potentially tap into.

Type-ahead queries allow developers to implement auto-suggestions as the user types. In the following example, as I was typing “merce”, the site returned a list of Mercedes-Benz car models that I might be interested in.

[Screenshot: type-ahead suggestions on Carlist.my for “merce”]

Facets allow developers to retrieve aggregated data (such as car make, car model, car type, car location, car price range, etc.) without writing complex SQL queries, which also spares the database the read load. In the following example, the site returns a list of Mercedes-Benz models, with the count in brackets indicating how many classified ads are available for each model.

[Screenshot: facet counts per Mercedes-Benz model on Carlist.my]

Filters allow developers to retrieve documents that fit the search criteria without writing complex SQL queries with endless INNER JOIN and WHERE clauses. In the following example, I specified that I want all used Mercedes-Benz cars, model E-Class E200, variant Avantgarde, from Kuala Lumpur, with a price range of RM100,000 to RM250,000. You can imagine the kind of INNER JOIN and WHERE clauses the poor developer would have to build dynamically if this were retrieved from a database directly.

[Screenshot: filtered search results on Carlist.my]

Another feature the car classified site could potentially tap into is geo-spatial search, although I have not seen it implemented there. For example, if I were to search for a specific car at a dealership, the portal could suggest similar cars from other dealerships near the one I’m looking at. That way, when I make a trip to visit a dealership, I can also visit other nearby dealerships that have similar cars.

Using C#.NET to work with Azure Search

Let’s roll up our sleeves and hack some code. I will be using a C#.NET console application to illustrate how we can design and create an index, upload documents into the index and perform several types of searches on the index. This solution simulates some of the code a car classified portal might require.

First, we create a solution named AzureSearchSandbox.

We will need the “Microsoft.Azure.Search” NuGet package from NuGet.org. Open your Package Manager Console and run the command Install-Package Microsoft.Azure.Search.

Upon successful installation, you will see that several NuGet packages have been added to the packages.config file in your solution.

Note that you only need to install “Microsoft.Azure.Search”; the other packages are dependencies, which are resolved automatically.

In your solution, add a new class, Car.cs.

This Car object will be used to represent your Azure Search document.
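A minimal sketch of what such a document class could look like. Only the Category and Price fields are named in this article; Id, Make and Model are assumed here for illustration, as is the ToString() override used for console output.

```csharp
// Hypothetical document shape: only Category and Price are named in the article.
public class Car
{
    // Every Azure Search document needs a unique string key.
    public string Id { get; set; }

    public string Make { get; set; }

    public string Model { get; set; }

    public string Category { get; set; }

    public double Price { get; set; }

    // Convenience for printing search results to the console.
    public override string ToString()
    {
        return $"{Make} {Model} ({Category})";
    }
}
```

For example, new Car { Make = "Perodua", Model = "Myvi", Category = "Hatchback" }.ToString() yields "Perodua Myvi (Hatchback)".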

Next, we create a static class, Helper.cs, to take care of the initialization of the index in Azure Search.

Initialize() is the main method that kick-starts our Azure Search index. Unlike on-premise index services such as Solr, which require a certain amount of setup, it doesn’t take long to get our index up and running. With Solr, I have to install Solr using NuGet, install the right Java version, set the environment variable and finally create a core in the Solr admin portal. With Azure Search, no upfront setup is required.

The index is created in the CreateIndex() method, where we tell the Azure Search client SDK that we want an index with the fields we define in our Index object.

To ensure this code runs against a fresh index, the DeleteIfIndexExist() method removes any previous index. We call it right before CreateIndex().
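The three methods described above could be sketched as follows. This assumes version 3.x or later of the Microsoft.Azure.Search package (earlier versions differ slightly in type names); the index name "cars" and the field list are hypothetical and simply mirror a Car class with Id, Make, Model, Category and Price properties.

```csharp
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public static class Helper
{
    // Hypothetical index name; Azure Search index names must be lowercase.
    private const string IndexName = "cars";

    public static ISearchIndexClient Initialize(string serviceName, string apiKey)
    {
        var serviceClient = new SearchServiceClient(serviceName, new SearchCredentials(apiKey));

        // Start from a clean slate, then create the index.
        DeleteIfIndexExist(serviceClient);
        CreateIndex(serviceClient);

        // The index client is what upload and search operate on.
        return serviceClient.Indexes.GetClient(IndexName);
    }

    private static void DeleteIfIndexExist(SearchServiceClient serviceClient)
    {
        if (serviceClient.Indexes.Exists(IndexName))
        {
            serviceClient.Indexes.Delete(IndexName);
        }
    }

    private static void CreateIndex(SearchServiceClient serviceClient)
    {
        // Assumed fields; names match the document class properties.
        var definition = new Index
        {
            Name = IndexName,
            Fields = new[]
            {
                new Field("Id", DataType.String) { IsKey = true },
                new Field("Make", DataType.String) { IsSearchable = true },
                new Field("Model", DataType.String) { IsSearchable = true },
                new Field("Category", DataType.String) { IsFilterable = true, IsFacetable = true },
                new Field("Price", DataType.Double) { IsFilterable = true, IsSortable = true }
            }
        };

        serviceClient.Indexes.Create(definition);
    }
}
```

Marking Category as filterable and facetable, and Price as filterable, is what allows the filter and facet queries later in this article to run against those fields.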

Next, we add a new class Uploader.cs to deal with the documents we are about to upload into our Azure Search index.

PrepareDocuments() is a simple method that constructs a list of dummy Car objects for our searches later on.

The Upload() method gets the dummy Car objects from PrepareDocuments() and passes them to the Azure Search client SDK, which uploads the documents to the index in a single batch. Note that we added a 2000-millisecond sleep to give the service time to process the car documents before moving on to the next part of the code, which is search. In practice, we would not want a sleep in our upload code; instead, the component that takes care of searching should expect that newly uploaded documents are not searchable immediately. We also catch IndexBatchException to handle the case where part of the batch upload fails. In this example, we merely output the keys of the failed documents. In practice, we should implement a retry, or at least log the failed documents.
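A sketch of the upload step under the same assumption (Microsoft.Azure.Search 3.x or later). The generic signature is an illustrative choice so that the block does not depend on a particular document class; the method described in this article works on Car objects directly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public static class Uploader
{
    // Hypothetical generic variant of the Upload() method described above.
    public static void Upload<T>(ISearchIndexClient indexClient, IEnumerable<T> documents)
        where T : class
    {
        try
        {
            // Send all documents to the index in one batch.
            var batch = IndexBatch.Upload(documents);
            indexClient.Documents.Index(batch);
        }
        catch (IndexBatchException e)
        {
            // Thrown when at least one document in the batch failed.
            // A real application would retry or log these keys.
            var failedKeys = e.IndexingResults
                .Where(r => !r.Succeeded)
                .Select(r => r.Key);

            Console.WriteLine("Failed to index some documents: {0}",
                string.Join(", ", failedKeys));
        }

        // Demo only: give the service time to process the batch before searching.
        Thread.Sleep(2000);
    }
}
```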

Once the upload is completed, we add another class, Searcher.cs, to take care of searching.

The SearchDocuments() method handles searching the index we created earlier. There is no fancy algorithm here; we only pass the Azure Search client SDK specific instructions on what we are looking for and display the results. In this method, we take care of simple text search, filters and facets. The Azure Search client SDK provides many more capabilities. Feel free to explore SearchParameters and the response object on your own.
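A sketch of such a search method, again assuming Microsoft.Azure.Search 3.x or later. The generic signature and the optional filter and facets parameters are illustrative assumptions, not necessarily the shape of the method in the repository.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public static class Searcher
{
    // Hypothetical signature: text search with an optional filter and facet list.
    public static void SearchDocuments<T>(
        ISearchIndexClient indexClient,
        string searchText,
        string filter = null,
        IList<string> facets = null) where T : class
    {
        var parameters = new SearchParameters
        {
            Filter = filter,   // OData filter expression, or null for none
            Facets = facets    // names of facetable fields, or null for none
        };

        var response = indexClient.Documents.Search<T>(searchText, parameters);

        // Print each matching document.
        foreach (var result in response.Results)
        {
            Console.WriteLine(result.Document);
        }

        // Print facet values and counts, if any were requested.
        if (response.Facets != null)
        {
            foreach (var facet in response.Facets)
            {
                foreach (var value in facet.Value)
                {
                    Console.WriteLine("{0}: {1} ({2})", facet.Key, value.Value, value.Count);
                }
            }
        }
    }
}
```

With a Car document class, a plain text search would be Searcher.SearchDocuments<Car>(indexClient, "Perodua"), and a facet-only query over all documents could pass "*" as the search text together with facets: new[] { "Category" }.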

Putting them all together in Program.cs

First, we define the search service name and API key used to create a search index client instance. The instance is returned by Helper.Initialize(). We will use this instance for both upload and search later.

After initializing the index, we call Upload() method to upload some dummy car documents to the index.

Next, we perform the following searches:

  1. Simple text search. We search for the text “Perodua” in the documents.

The result is as follows. The Azure Search index returns 2 documents that contain the keyword “Perodua”.

[Screenshot: console output of the “Perodua” text search]

  2. Using a filter. This is a more targeted and efficient way to look for documents. In the following example, we first look for documents whose Category field equals ‘Hatchback’; second, we look for documents whose Price field is greater than 100,000 and whose Category is ‘Sedan’. More details on how to write filter expressions can be found in the Azure Search expression syntax documentation.

The result is as follows: cars in the Hatchback category, then cars that cost more than 100,000 and are in the Sedan category.

[Screenshot: console output of the filter searches]
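Azure Search filters are OData expressions passed as plain strings. As a sketch, the two filters from step 2 could be built with a small helper like the one below. The CarFilters class and its method names are hypothetical, and the field names Category and Price are assumed to match the index definition. String literals in OData are wrapped in single quotes, with embedded single quotes doubled.

```csharp
using System.Globalization;

// Hypothetical helper for building OData filter strings.
public static class CarFilters
{
    // Wraps a value in single quotes and doubles any embedded single quote.
    private static string Quote(string value)
    {
        return "'" + value.Replace("'", "''") + "'";
    }

    // Matches documents whose Category equals the given value.
    public static string CategoryEquals(string category)
    {
        return "Category eq " + Quote(category);
    }

    // Matches documents above a price threshold within one category.
    public static string PriceAboveInCategory(double minPrice, string category)
    {
        // Invariant culture keeps the number free of locale-specific separators.
        return string.Format(CultureInfo.InvariantCulture,
            "Price gt {0} and Category eq {1}", minPrice, Quote(category));
    }
}
```

CarFilters.CategoryEquals("Hatchback") produces Category eq 'Hatchback', and CarFilters.PriceAboveInCategory(100000, "Sedan") produces Price gt 100000 and Category eq 'Sedan'. Either string can be assigned to the Filter property of SearchParameters.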

  3. Searching with facets. With facets, developers no longer need to write long queries that combine Count() and endless GROUP BY clauses.

If we had a traditional database table representing the dummy car data, this would be equivalent to “SELECT Category, Count(*) FROM Car GROUP BY Category”.

The result is as follows:

[Screenshot: console output of the facet search]
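To make the GROUP BY comparison concrete, here is what a facet on Category computes, written as an in-memory LINQ query. This is only an illustration of the aggregation, not how Azure Search implements it; the FacetDemo class and the sample categories are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of what a facet count represents.
public static class FacetDemo
{
    // Counts items per category, like
    // SELECT Category, COUNT(*) FROM Car GROUP BY Category.
    public static Dictionary<string, int> CountByCategory(IEnumerable<string> categories)
    {
        return categories
            .GroupBy(c => c)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

Calling FacetDemo.CountByCategory(new[] { "Hatchback", "Sedan", "Hatchback" }) returns a dictionary with Hatchback mapped to 2 and Sedan mapped to 1. With Azure Search, the service computes these counts alongside the search results, so neither the application nor the database has to.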

This example might not look like a big deal if you have a small data set or a simple data structure. Facets are super handy and fast when you have a large amount of data and your queries are getting more complex. The ability to define which facets are required in C# code makes the “query” much cleaner and easier to maintain.

One last thought…

You can clone the above source code from my GitHub repo. I will be happy to accept pull requests if you want to demonstrate other capabilities of the Azure Search client SDK. Remember to change the API key and service name before compiling. Have fun hacking Azure Search!