Azure WebJob with Azure Queue


A cron job is an essential part of a complex system: it executes a script or program at a specific time interval. Traditionally, a developer or system administrator creates a Windows Scheduled Task to run scheduled jobs within the operating system.

In one project, I had multiple .exe programs scheduled to update the production database at midnight for various use cases, such as expiring user credit. This gets the job done within the application context, but it is not the cleanest approach when my system administrator needs to take care of 20 other cron jobs coming from different machines and different operating systems.

The next thing I implemented was to expose a cron job through a WCF API endpoint. For example, I opened a WCF API endpoint that triggers the sending of email notifications. This endpoint matches users’ saved criteria against the business inventory on a daily basis. (Yes, this is the annoying 9:00 AM email notification you get every day. Sorry!) The WCF API endpoint does nothing if no one hits it. It is a simple HTTP endpoint waiting for something to tell it to get up and work.

The reason to expose the cron job as a WCF API endpoint is to give my system administrator a centralized system to trigger and monitor all the cron jobs in one place, rather than logging into multiple servers (operating systems) to monitor and troubleshoot. This works all right, except that now my cron job is stuck in a WCF project instead of being a simple script or a lightweight .exe program.

Azure WebJob

The next option is Azure WebJob. Azure WebJob enables me to run programs or scripts as background processes in the context of a web app. It runs and scales as part of Azure Web Apps. With Azure WebJob, I can write my cron job as a simple script or a lightweight .exe rather than a WCF service, and my system administrator gets a centralized interface, the Azure Portal, to monitor and configure all the cron jobs. In fact, it’s pretty cool that I can trigger an .exe program through a public HTTP URL using the Web Hook property of the WebJob.
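As a rough illustration of the Web Hook idea, here is a minimal sketch that sends an HTTP POST to a WebJob’s web hook URL from C#. The URL, user name and password below are placeholders and assumptions on my part: copy the real Web Hook value from the WebJob’s properties in the Azure Portal and use your own deployment credentials.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class TriggerWebJob
{
    static void Main()
    {
        RunAsync().GetAwaiter().GetResult();
    }

    static async Task RunAsync()
    {
        // Placeholders (assumptions): paste the Web Hook URL shown in the Azure Portal
        // and the deployment credentials from your publish profile.
        const string webHookUrl = "https://<your-site>.scm.azurewebsites.net/api/triggeredwebjobs/<your-job>/run";
        const string userName = "<deployment user name>";
        const string password = "<deployment password>";

        using (var client = new HttpClient())
        {
            // The endpoint is protected with basic authentication.
            string token = Convert.ToBase64String(
                Encoding.ASCII.GetBytes(userName + ":" + password));
            client.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Basic", token);

            // An empty POST asks Azure to start the WebJob.
            HttpResponseMessage response = await client.PostAsync(webHookUrl, null);
            Console.WriteLine("Status: " + (int)response.StatusCode);
        }
    }
}
```

Any scheduler or monitoring tool that can send an authenticated HTTP POST could take the place of this small console program.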

Azure WebJob goes beyond the traditional cron job definition (timer based). Azure WebJobs can run continuously, on demand or on a schedule.

The following file types are accepted:

  • .cmd, .bat, .exe (using windows cmd)
  • .ps1 (using powershell)
  • .sh (using bash)
  • .php (using php)
  • .py (using python)
  • .js (using node)
  • .jar (using java)

I will use C#.NET to create a few .exe programs to demonstrate how Azure WebJob works.

Prerequisites (nice to have)

Preferably, you should have Microsoft Azure SDK installed on your machine. At the point of writing this, I’m running Visual Studio 2015 Update 3, so I have the SDK for VS2015 installed using Web Platform Installer.

Note that this is NOT a MUST have. You can still write your WebJob in any of the above-mentioned languages and upload your job manually in the Azure Portal. I recommend installing the SDK because it makes development and deployment much easier.

Working with Visual Studio

If you have the Microsoft Azure SDK installed for Visual Studio, you will see the Azure WebJob project template. We will start with a simple Hello World program.

Your WebJob project will be pre-loaded with some sample code. As usual, you will still need to build it once for NuGet to resolve the packages.

The following packages are part of your packages.config if you started the project with the Azure WebJobs template. No big deal if you didn’t; you can install them manually, although it’s a little tedious.

For now, we will ignore the fancy built-in SDK support for various Azure services. We will create a simple Hello World program and deploy it to Azure to get an end-to-end experience.

Remove everything else so that you are left with a Program.cs that writes a simple message:
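A minimal sketch of what such a Program.cs could look like is shown here. The namespace and the message text are my own placeholders, not necessarily what the original project used.

```csharp
using System;

namespace HelloWorldWebJob // hypothetical namespace
{
    class Program
    {
        static void Main()
        {
            // Console output is captured in the WebJob log in the Azure Portal.
            Console.WriteLine("Hello World from Azure WebJob!");
        }
    }
}
```

Because the program exits right after writing the message, it is also a handy way to observe how Azure treats a job that finishes quickly.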

Go to your Azure Portal. Click on Get publish profile to download your publishing profile. You will need to import it into Visual Studio later when publishing your WebJob.

Import your publish profile into your project in Visual Studio. The Import dialog appears the first time you publish your project as a WebJob.

Right click on your project and select “Publish as Azure WebJob”. You will see the following dialog to set up your publishing profile.

Import the publishing profile settings file you downloaded earlier into your WebJob project in Visual Studio.

Validate your connection.

Click on “Publish” to ship your application to Azure.

Upon successful publishing, you should see the message “Web App was published successfully…”

Go to your Azure portal and verify that your WebJob is indeed listed.

Select your WebJob and click on the Logs button at the top to see the following page.

The impressive part about using WebJob in Azure is the following WebJobs monitoring page. You can use this page to monitor the status of multiple WebJobs and drill down into their respective logs. No extra cost, coding or configuration; it all works out of the box!

Now we have our first Hello World application running in Azure. We deployed our WebJob to run continuously. Because our program exits as soon as it has written its message, Azure restarts it automatically: once a run completes, the status changes to PendingRestart and the job waits 60 seconds before starting again.

The WebJobs SDK sample project on GitHub demonstrates comprehensively how you can work with WebJob through Azure Queue, Azure Blob, Azure Table and Azure Service Bus. In this article, we will do a little more coding by using WebJob to interact with Azure Queue.

Azure WebJob with Azure Queue Storage

The Microsoft.Azure.WebJobs namespace provides QueueTriggerAttribute. We will use it to trigger a method in our WebJob.

It works like this: whenever a new message is added to the queue, the WebJob is triggered to pick it up.

Before we continue with our code, we first need to create an Azure Storage account to host the queue. Here, I have a storage account named “danielfoo”.

We will use Microsoft Azure Storage Explorer to inspect our storage account visually. It’s a handy tool to visualize your data. If you do not have it, no worries, just imagine the queue message in your mind 🙂

Let’s add a new console application project to our solution to put some messages in our queue.

There are two packages that you’ll need to install into your queue project:

We will write the following simple code to initialize a queue and put a message into it.
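A minimal sketch of such a message generator, assuming the WindowsAzure.Storage NuGet package and a reference to System.Configuration, might look like this. The message text is a placeholder, and reading the setting through ConfigurationManager.AppSettings is my assumption about how the connection string is stored.

```csharp
using System;
using System.Configuration;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

namespace Queue.MessageGenerator
{
    class Program
    {
        static void Main()
        {
            // Assumption: the connection string is stored under appSettings in app.config.
            string connectionString =
                ConfigurationManager.AppSettings["StorageConnectionString"];
            CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);

            // Get a reference to the queue and create it if it does not exist yet.
            CloudQueueClient client = account.CreateCloudQueueClient();
            CloudQueue queue = client.GetQueueReference("danielqueue");
            queue.CreateIfNotExists();

            // Put a single text message into the queue.
            queue.AddMessage(new CloudQueueMessage("Hello from Queue.MessageGenerator"));
            Console.WriteLine("Message added to the queue.");
        }
    }
}
```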

Of course, you will have to configure your StorageConnectionString in app.config so that the code can find the connection string.

You can get your account name and key from Azure Portal.

Let’s execute our console application to test if our queue can be created and whether a message can be placed into the queue properly.

After execution, look at Storage Explorer to verify that the message is in the queue.

Now we will dequeue this message so that it will not interfere with the actual QueueTrigger in our exercise later.
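If you prefer to dequeue in code rather than through Storage Explorer, a minimal sketch could look like the following. It makes the same assumptions as before: the WindowsAzure.Storage package, and a StorageConnectionString entry under appSettings.

```csharp
using System;
using System.Configuration;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

class DequeueMessage
{
    static void Main()
    {
        CloudStorageAccount account = CloudStorageAccount.Parse(
            ConfigurationManager.AppSettings["StorageConnectionString"]);
        CloudQueue queue = account.CreateCloudQueueClient()
                                  .GetQueueReference("danielqueue");

        // GetMessage hides the message from other readers for a while;
        // it returns null when the queue is empty.
        CloudQueueMessage message = queue.GetMessage();
        if (message != null)
        {
            Console.WriteLine("Dequeued: " + message.AsString);

            // The message is only removed for good once it is deleted.
            queue.DeleteMessage(message);
        }
    }
}
```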

Next, we will create a new WebJob project that gets triggered whenever a message is added to the queue, by using QueueTriggerAttribute from the Microsoft.Azure.WebJobs namespace.

This time we neither remove Functions.cs nor modify Program.cs.

Make sure that the queue name in the method parameter in Functions.cs matches the one you defined earlier in your Queue.MessageGenerator project. In this example, we are using the name “danielqueue”.
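For reference, a Functions.cs along the lines of what the template generates could look roughly like this. It is a sketch assuming the Microsoft.Azure.WebJobs package; I have left out the namespace for brevity.

```csharp
using System.IO;
using Microsoft.Azure.WebJobs;

public class Functions
{
    // Called by the WebJobs SDK whenever a new message lands in "danielqueue".
    // The message body is bound to the string parameter, and anything written
    // to the TextWriter shows up in the WebJob log.
    public static void ProcessQueueMessage(
        [QueueTrigger("danielqueue")] string message, TextWriter log)
    {
        log.WriteLine(message);
    }
}
```

The string passed to QueueTrigger is the only link between the WebJob and the queue, which is why the two names have to match.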

Program.cs
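The unmodified Program.cs is essentially a host that keeps running and listening for triggers. A minimal sketch, again assuming the Microsoft.Azure.WebJobs package and omitting the namespace, is:

```csharp
using Microsoft.Azure.WebJobs;

class Program
{
    static void Main()
    {
        // With no arguments, JobHost reads the AzureWebJobsStorage and
        // AzureWebJobsDashboard connection strings from the configuration file.
        var host = new JobHost();

        // Blocks the main thread and keeps listening for queue messages.
        host.RunAndBlock();
    }
}
```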

Remember to fill in the following connection string in your App.config. This tells the WebJob which storage account to monitor.

Now, let’s start the WebJob.QueueTrigger project as a new instance and let it wait for a new message to be added to “danielqueue”.

Then, we will start the Queue.MessageGenerator project as a new instance to drop a message into the queue for WebJob.QueueTrigger to pick up.

Yes! Our local debug session has detected that a new message was added to “danielqueue” and has hit the ProcessQueueMessage function.

Let’s publish our WebJob.QueueTrigger to Azure to see it process the queue message in the Azure context instead of on the local machine. After successful publishing, we now have 2 WebJobs.

Select QueueTrigger (the WebJob we just published) and click on the Logs button at the top. You will see the following log of queue message processing.

If you drill down into a particular message, you will be redirected to the Invocation Details page.

We have just set up our WebJob to work with Azure Queue!

That wraps up everything I want to show you in working with Azure WebJob and Azure Queue.

Obviously, in reality you will write something more complex than simply writing to the log in your WebJob. You may write some logic to perform a certain task. You may even use this to trigger another, more complex job sitting in another service.

In the queue, obviously you also wouldn’t write a real “message” like I did. You will probably create one queue for a very specific purpose. For example, you might create a queue to store a list of IDs, where each ID is required by another type of process such as indexing. The WebJob will then index the entities (represented by the IDs) in batches (let’s say 4 messages at a time) instead of taking a large surge of load in a short period of time.

A few more thoughts…

  1. By default, JobHostConfiguration.Queues.BatchSize is 16, which means 16 queue messages are handled concurrently. I recommend overriding the default with a smaller value (let’s say, 4) to ensure that the other end, which does the heavier processing (for example indexing a document in Solr or Azure Search), is able to handle the load. The maximum value for JobHostConfiguration.Queues.BatchSize is 32. If having the WebJob handle 32 messages at a go is not sufficient for you, you can further tweak the performance by setting a short JobHostConfiguration.Queues.MaxPollingInterval to make sure you do not accumulate too many messages before the processing kicks in.
  2. If for whatever reason you have maxed out the built-in configuration (such as BatchSize and MaxPollingInterval) and it is still not good enough, a quick win is to scale up your WebApp. Note that you cannot scale your WebJob alone, because a WebJob sits in the context of a WebApp. If scaling up the WebApp for the sake of the WebJob sounds inefficient, consider migrating your jobs to a Worker Role.
  3. WebJobs are good for lightweight processing. They are good for tasks that only need to be run periodically, on a schedule, or when triggered. They are cheap and easy to set up and run. Worker Roles are good for more resource intensive workloads, or if you need to modify the environment where they are running (for example the .NET framework version). Worker Roles are more expensive and slightly more difficult to set up and run, but they offer significantly more power when you need to scale. There is a pretty comprehensive blog post by kloud comparing WebJob and Worker Role.
  4. Azure Storage Queue gives no guarantee on message ordering. In other words, a message placed into the queue first is not necessarily processed first. Delivery for Azure Queue is At-Least-Once but not At-Most-Once, so a message may be processed more than once, and the application code needs to handle such duplicates. If this troubles you, consider Service Bus Queue instead: its ordering is First-In-First-Out (FIFO) and its delivery is At-Least-Once and At-Most-Once. If you are wondering why people still use Azure Storage Queue, it is because Storage Queue is designed to handle very large scale queuing. For example, the maximum queue size for Storage Queue is 200 TB while for Service Bus Queue it is 1 GB to 80 GB; the maximum number of queues for Storage Queue is unlimited while for Service Bus Queue it is 10,000. For a complete comparison, please refer to the Microsoft documentation.
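To make the first point concrete, here is a minimal sketch of a Program.cs that overrides the queue settings. It assumes the Microsoft.Azure.WebJobs package, where (in the SDK versions I am familiar with) these settings are exposed through the Queues property of JobHostConfiguration. The batch size of 4 comes from the example above; the 10-second polling interval is purely an illustrative value.

```csharp
using System;
using Microsoft.Azure.WebJobs;

class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();

        // Process at most 4 queue messages concurrently instead of the default 16.
        config.Queues.BatchSize = 4;

        // Illustrative value (assumption): cap the back-off between polls of an
        // idle queue so that new messages are picked up sooner.
        config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(10);

        var host = new JobHost(config);
        host.RunAndBlock();
    }
}
```

Tune both values against what the downstream system can actually absorb; a smaller batch protects the indexer, while a shorter polling interval mainly reduces latency.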

I hope you have enjoyed reading this article. If you find it useful, please share it with your friends who might benefit from this. Cheers!
