“Attach to process…” tricks

When you are working on a large system where your code base (project or solution) is only a subset of the entire system, you will often use “Attach to process…” to tap into the execution of the application pool for debugging.

“Attach to process…” might not be the most convenient approach to debugging, but at times it can be more efficient than launching the whole application from your code base. There is still value in using “Attach to process…”, although the approach can be painful occasionally.

Here are a few tricks to make the debugging process less painful.

Trick No.1 – Identify your AppPool

When you are working with a slightly more complex application, where your application has dependencies on other applications, you might have multiple w3wp.exe processes running, like the following:

Which process should you attach to?

All of them? Of course that would work, but obviously that’s not the kindest thing you can do to the poor machine’s memory and CPU…

The following command helps you identify the AppPool name and the process ID so that Visual Studio attaches only to the relevant w3wp.exe:

C:\Windows\System32\inetsrv>appcmd list wp
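Its output looks something like the following, one line per worker process, with the process ID in quotes (the pool names here are illustrative):

WP "4480" (applicationPool:MyMainAppPool)
WP "6524" (applicationPool:DependencyAppPool)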

Based on the AppPool name, you now know which process ID to attach to in Visual Studio.

Trick No.2 – ReAttach (Visual Studio extension)

Imagine you have to use “Attach to process…” for debugging purposes in your project. You modify your code a little to change the logic or add some validation, and want to continue debugging. It is annoying to keep launching the “Attach to Process…” dialog to select the w3wp.exe.

ReAttach is a Visual Studio extension that helps you reattach to the process you attached to a moment ago.

Download and install it, and you will have a new option in your Visual Studio menu, “Reattach to Process…”, on top of the usual “Attach to Process…”.

The extension basically helps you reattach the debugger to the process you attached to earlier. It will continue to work as long as your process ID did not change or disappear.

If for whatever reason you need to run iisreset and your process ID changes, “Reattach to process…” will filter out all the irrelevant processes and only show you the available w3wp.exe processes.

Hope these 2 tricks help your debugging by attaching to your AppPool. Until next time!

Technical Interview Part 2

What I do instead

Before the technical interview session, a technical assignment is given to the candidate on a Friday so that they can work on it over the weekend. The technical assignment typically takes a few hours to complete. I have a range of questions, from demonstrating a design pattern, to building a simple application with database interaction, to an SEO analysis algorithm.

The objectives of the technical assignment are:

  1. Ensure the candidate can code.
  2. Evaluate how modern his development approach is (for example, whether the candidate will use Elmah or Log4Net over writing a custom logger class).
  3. Evaluate how seriously he treats his code (if a candidate delivers a half-hearted solution, it indicates the same for the code he writes at work).
  4. Evaluate whether he goes the extra mile (such as implementing unit tests and proper exception handling).

During the technical interview, I will do the following:

  1. Have the candidate explain the core of the solution. I will then ask a few questions based on his implementation, for example: if I want to implement a certain change, where should I modify the code? This is to ensure I’m talking to the person who wrote the code.
  2. Find a flaw in the system and press on it again and again, in a professional and respectful manner. This is to evaluate how well the candidate responds to criticism, whether the candidate gets defensive and whether the candidate is open to feedback.
  3. Challenge the candidate on how he can upgrade his solution to be production-ready at both the code and infrastructure level. This is to evaluate how much thought he has given to his solution and how much exposure the candidate has had dealing with production systems.

Next, I will move on to a list of generic topics on software development. Here are examples of the topics I cover.

Source Control

Every developer uses source control to a certain extent. I will normally ask what kind of source control the candidate has used. The top 3 answers are TFS, Git and SVN. I will ask the candidate to share with me the differences between the top 2 source controls he is familiar with. The idea here is to discuss what the candidate is familiar with so that he can show his best thoughts. Depending on what the candidate brings to the table, I will get a sense of what kind of developer he is.

For example, if a candidate tells me that checking out a branch in TFS means downloading a whole new copy of the code, while Git merely applies the delta difference on the same copy of the code, it indicates this candidate used to work with a giant code base, has some level of branching experience, and appreciates that Git is much more efficient on client-side storage.

Another example: if the candidate brings up terms such as Rebase, I will follow up by asking what the difference between Rebase and Merge is on a theoretical level, and when it is a good scenario to use Rebase over Merge on a practical level. Depending on the scenario given, I might (or might not) have further questions to validate the usage. The idea here is that I’m following up on topics suggested by the candidate himself. If a candidate cannot provide solid evidence of how familiar he is with a topic he suggested himself, it indicates the candidate is throwing fancy terms around hoping to impress the interviewer.

Design Pattern

Despite the challenges I highlighted earlier on design patterns, I still think design patterns are a good topic to cover during a technical interview, because the right application of a design pattern indicates the complexity of code the candidate has dealt with, and hence the need for the design pattern.

Ever since I took on the role of interviewer, I have made it a point to read up on additional design patterns that I have never used. The good news is, most candidates consistently bring up only a handful of design patterns. The top 3 are Singleton, Abstract Factory and Repository.

Although Dependency Injection is not strictly a design pattern, a lot of candidates do mention Dependency Injection as something they know under the design pattern topic. I do not dismiss this answer just because it does not fit the definition. My objective here is to assess the candidate’s ability to design his code structure, not to run a competition of giving definitions.

Singleton

What do I look out for during the design pattern discussion? Take Singleton, for example. After the candidate mentions he knows Singleton, I will follow up with the question, “What is a good use case for Singleton?”. The typical answer I get is something along the lines of “when you only need to have a single instance of the class”. Good. At this point, I know the candidate is aware of the definition of Singleton, although he did not provide the use case I asked for. I will rephrase my question slightly differently to remind the candidate that I’m looking for a use case.

One “interesting” answer that always pops up is to apply Singleton in the data layer (CRUD operations to the database). I call this interesting because anyone who gives it a little more thought, or has done some research on Singleton, will realize it’s a bad idea to apply the pattern in the data layer. However, this misconception comes up very frequently.

I will take this opportunity to explain to the candidate the kind of problems that will surface from applying Singleton in the data layer. Why do I do that? Having the right skills or information is important, but having the right attitude is equally (if not more) important. You can teach someone a new skill, but it is extremely challenging to change someone’s attitude. At this point, if the candidate appears to be enlightened by the new information, I know the candidate is coachable. In most situations, I would rather have a coachable new hire (even without a top-notch skill set) than someone with a superstar skill set but a poor attitude.
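To make the explanation concrete, here is a minimal sketch of the kind of trouble I walk the candidate through (the class is hypothetical, not from any real submission): a Singleton data access class that holds one shared connection for the whole process.

    using System.Data.SqlClient;

    public sealed class DataAccess
    {
        public static readonly DataAccess Instance = new DataAccess();

        // One connection shared by every caller for the lifetime of the process.
        private readonly SqlConnection _connection;

        private DataAccess()
        {
            _connection = new SqlConnection("...");  // connection string elided
            _connection.Open();
        }

        public int CountStudents()
        {
            // Under concurrent web requests, multiple threads hit this single
            // connection at once. SqlConnection is not thread-safe, so calls
            // fail or corrupt state, and a faulted connection stays broken
            // forever because the instance is never recreated.
            using (var command = new SqlCommand("SELECT COUNT(*) FROM Student", _connection))
            {
                return (int)command.ExecuteScalar();
            }
        }
    }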

Database

Working with a database comes across a developer’s path very frequently. It is an unspoken rule that a developer must be able to work with databases. With the number of storage options in the market, it is difficult to discuss all of them, so in this article we will stick with the most popular option for most .NET developers: SQL Server.

When hiring a junior developer, the candidate will have to prove his ability to write T-SQL: Insert, Update, Delete and different kinds of joins. No big deal. For a senior developer, I would normally ask the candidate what other exposure he has apart from T-SQL. Asking about his actual involvement in SQL Server gives me a very good indicator of what kind of system the candidate has dealt with.

For example, if the candidate claims he took care of database backups, I will follow up with what the backup cycle was and the types of backup he was using. If all the candidate did was a full backup on a daily basis, it indicates the database he was dealing with was not very large and data loss did not seem like a big deal, which means the data was not extremely critical.

If the candidate mentions he has scaled SQL Server, I will follow up with what type of replication he applied and the rationale behind the decision. I will also ask what other strategies he considered before using replication, because replication is an expensive option. If the candidate brings up Redis cache or an index provider such as Solr or Azure Search, it shows the candidate has looked beyond the SQL Server context, which indicates he is someone with a very broad skill set across technologies.

Once, a candidate told me he had implemented table partitioning in his database. I asked what logical condition he based his partitions on. He said the primary key, which was of GUID data type. That was an interesting answer, because the general approach is to create partitions based on a date or some other logical condition. I explained how I would implement table partitioning instead, and the reasoning behind it. His eyes brightened up.

Notice that I did not say, “This is wrong. The correct way is this.” Instead, I made it a discussion of “This is what I would do instead”. The same information was delivered, but the outcome is very different.

The candidate impressed me because he knew about table partitioning, which most developers don’t. It suggests the candidate is someone who takes the extra effort to learn new skills to solve problems. Most importantly, the way he responded to the information I shared with him suggested he is coachable. I took this candidate into my team and he has proven to be a star team member.

A few final thoughts…

There are a lot of other topics that I cover during the interview. Most of them are generic topics such as tweaking software performance and security. The purpose of having a standardized list of topics is to ensure I use a similar benchmark for candidates applying for the same position. The reason for starting with a generic topic and drilling further down is to allow candidates to talk about areas they are familiar with so that they can showcase their sharpest thoughts.

When a candidate brings up a certain topic for discussion, I assume he knows the topic very well, and I hand the power to drive the discussion over to the candidate to a certain extent. I prefer to talk about what the candidate is familiar with (instead of what I am) so that I can truly assess his level of technical competency. Frankly, there is very little value in talking about a topic the candidate only read an article about 6 months ago. Whichever topics the candidate brings up, though, I will drill really deep to ensure he indeed knows them, rather than just throwing some fancy words around.

During a technical interview, I’m looking at more than just technical skills. Technical skills are learnable. What really interests me is:

  • Whether the candidate is coachable.
  • How big a passion the candidate has for technology.
  • What the candidate’s approach is to solving problems.
  • What the candidate’s attitude is in dealing with technology and PEOPLE.
  • How much potential the candidate has, so that the company can groom him to be a superstar developer and beyond.

The technical topics I have for the candidate are merely a vehicle for exposing the areas I’m interested to learn about the candidate. I’m never interested in knowing the difference between a clustered index and a non-clustered index, or the difference between an Azure Web Job, an Azure Worker Role and an Azure Function. Given a laptop with internet, anyone can Google them in 5 seconds. What I am interested to discover is whether this candidate is coachable, his passion, his approach, his attitude and his potential!

Ideally, we should hire the right person with the right skills. However, such angels rarely come by. If I have to choose between the right person and the right skills, I will choose the right person any day, provided, of course, that the candidate still has a reasonable skill set for the role he is applying for. New skills are learnable, and very often a new skill can be learned quickly. Coaching a person takes much more time, energy and cha-ching, and that’s if you are lucky.

If you are not lucky, a bad apple not only brings down productivity but also breaks the current harmonious team. It is much more effective to filter out the potential troublemaker than to “coach” or “develop” him later. There is no point hiring bad apples just to hit headcount. With people, slow is fast.

Some companies have a couple of strong technical guys interview candidates whom they might not eventually work with; the interviewers are hiring for the company at large. Other companies have the Team Lead or Architect within the team interview the candidates whom they will eventually work with; they are hiring for the team. I have been in both situations, and personally I prefer the latter.

Being able to work with the person I interviewed earlier gives me additional consideration and deeper thought on whether the candidate will be a good fit for my team. Another good reason is that it allows me to validate and refine my interview techniques. An interview is all about the perceptions of and assumptions made about the candidate. I have made good decisions and I have made bad decisions. However, in situations where I made a wrong assumption based on a wrong perception, having first-hand experience working with the candidate I interviewed lets me adjust my interview technique on a continuous basis.

Finally, I don’t claim that what I’m doing is the only way or the best way. We live and we learn 🙂 I have found this approach to work quite well, hence I continue practicing it. If you have any thoughts on this, please leave me a comment. I hope you have found something useful in this article. Until next time. Cheers!

Technical Interview Part 1

A technical interview is both an exciting and a stressful moment. It is exciting because there is a potential career opportunity ahead of you. It is stressful because you are subconsciously aware that the people in the room are there to judge you.

I have been on both sides of the table, as an interviewer and as an interviewee. It is stressful to be an interviewee for obvious reasons. We need to try hard to sell ourselves, and “sales” is not a skill that comes naturally to technical people like us. Furthermore, you never know what kind of psychopath you might meet, asking you to find a bug in his rocket-science algorithm. It is equally stressful to be an interviewer. Your shoulders carry the responsibility of evaluating whether a candidate will be the right fit for the organization in the long term. Being too lenient, you might let a bad apple into the existing harmonious team; being too strict, you might lose a dark horse who just needs a little polishing.

As an interviewee

Let’s deal with the stress problem for the interviewee first. Throughout my experience, I have noticed I perform best when I’m not feeling nervous. The key to not feeling nervous is not to feel desperate for a job. Always look for a new job when you least need it. When you don’t “need” the job, you go into the interview room as an equal. Low need, high power, and vice versa. Did you notice that the term interview basically means “viewing each other”? You go in as an equal to evaluate the company as much as the company is evaluating you. Having this mentality allows you to feel more confident. Again, from my personal experience, when I go into a technical interview with this mindset, I often have a pleasant technical discussion with the interviewer.

As an interviewer

Now for the interviewer. I’m not sure how many interviewers feel stressed. I did not feel that being an interviewer was a stressful task until my 3rd year of conducting interviews. Interviewees will usually be polite and humble. Most of the time, interviewees will do their best not to offend or make things difficult for the interviewer. I always felt I had the upper hand while conducting interviews, hence I never thought there was a problem. It was only when I pulled myself out of the technical interviewer’s role and took a more holistic view from the organization’s perspective that I realized there are many other aspects I need to take into consideration while conducting a technical interview.

For example, during one interview I found out that talking to me in a technical interview was the 7th round of interviews the candidate had gone through. He had taken an online technical test and other technical interviews prior to talking to me. My final feedback on the candidate was a clear ‘No’. However, that got me thinking about how and why the candidate was able to pass the previous 6 rounds of interviews but not my technical interview. Is there something wrong with the way I ask technical questions? Or does it simply mean the previous 6 rounds of interviews were not done effectively?

Another example: the organization had an expansion plan to grow by another 100 headcount in 1.5 years. That is equivalent to approximately 6 new hires a month. Aggressive? Definitely! However, based on the current hiring rate, we would not be able to hit the number. What needs to be done differently? Should I lower my technical benchmark? Should I say we can’t meet this number simply because we cannot find the talent? How big (or small) is the impact on the projects if we do not meet the numbers? Most importantly, where should I find the balance?

The software development skill set has both breadth and depth. Ideally, it would be perfect to pair an interviewer and an interviewee who have identical technical domain experience. The reality is that due to today’s technology breadth, developers often focus on very different vertical skill sets. For example, the interviewer might be an expert in Azure WebJobs, Azure Storage and MVC, while the interviewee has been working on Angular, Web API and SQL Server. Both of them are experts in their respective full-stack domains, but there is very little common ground.

Let’s face it: neither the interviewer nor the interviewee will know every topic in great depth, even just within the Microsoft stack of technologies. How can the technical interview be conducted in a fruitful manner given this breadth-and-depth nature? Do I dismiss a candidate just because he doesn’t share a similar background with me, even though he is talented, passionate and willing to learn?

What is the solution? Should the interviewer ask something more generic and academic like object-oriented concepts? Something more holistic yet sophisticated like design patterns? Or something more brainy like algorithms?

Popular topics interviewers use

Object-oriented concepts

In my previous job, my technical interview was the 1st round of interviews after the candidate had passed a Codility test. The online test assessed the candidate’s basic programming skill (fixing a logical operator in a small function) and ability to write a basic SQL query with some joins. I thought it was necessary to cover the basics of object-oriented concepts for a C#.NET developer, so I ended up asking questions like:

  • Explain to me what method overloading and method overriding are.
  • What are the differences between an interface and an abstract class?

I was under the assumption these questions were alright, until one day a candidate answered me so fluently it was as if he was reading out from a book, except he didn’t have a book in front of him. This suggested the candidate had rehearsed these answers a thousand times before talking to me.

Well, the reality is, at first I thought the candidate was simply a bright developer who knew these concepts well. I decided to give him a slightly more challenging question to see how far he could go. The question was based on what he had explained earlier: that an abstract class can contain both methods with empty implementations and concrete implementations, while an interface can only contain method signatures without implementations. Great!

My next question was: if an abstract class can hold both methods with empty implementations and concrete implementations, why do we still need interfaces? I was expecting him to explain something along the lines of a child class being able to inherit only 1 abstract class but implement multiple interfaces. I would have been happy to accept the answer and move on to the next topic even if he had given me a one-liner. To my surprise, he kept quiet and could not provide any explanation.
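For reference, the one-liner I was waiting for can be demonstrated in a few lines of C# (the type names here are made up for illustration):

    // A class can inherit from only ONE abstract class...
    public abstract class Animal
    {
        public abstract void Speak();                       // signature only, no implementation
        public void Breathe() { /* concrete implementation */ }
    }

    public interface IRunner { void Run(); }
    public interface ISwimmer { void Swim(); }

    // ...but it can implement MULTIPLE interfaces at the same time.
    public class Dog : Animal, IRunner, ISwimmer
    {
        public override void Speak() { }
        public void Run() { }
        public void Swim() { }
    }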

From there, I realized there are candidates who really put a lot of effort into preparing for technical interviews, such as rehearsing answers to those top 50 interview questions from a Google search. Ahem… the truth was, I was too lazy to craft any original interview questions back then, so I ended up using questions from those top 50 lists, which candidates can easily prepare for. The problem with this was that I ended up evaluating how much preparation work a candidate had done rather than how strong his technical capability was. It was a bad idea to use the top 50 interview questions.

The top 50 interview questions

When you use those top 50 interview questions, not only can you not accurately assess the candidate, you will also push away those developers who really know their stuff. Remember, an interview is about the interviewer and the interviewee viewing each other. Under normal circumstances, a company will put one of its best guys forward to conduct the interview. If the best guy in the company can only conduct an interview based on the top 50 interview questions, it will really make me think twice about whether I want to join the company when it offers me a job.

In fact, I encountered this once. I was talking to an interviewer at an MNC who had prepared a long list of technical questions. We covered those questions in approximately 30 minutes instead of his normal 60 minutes. At one point, after he asked question A, I knew he would follow up with question B, so I explained the answer to question B along with the answer to question A. At the end of the interview, his feedback was that it was as if I already had the list of questions he was holding. The truth was, I had gone through those questions 5629 times when I was preparing interview questions for my own candidates.

Eventually, I did not take up the offer from the MNC. There were many factors that influenced the decision. One of them was that knowing the best technical guy in the team could only do what I had done 2 years earlier wasn’t very motivating.

I have stopped using those top 50 interview questions. They are for amateurs 🙂

Design pattern

Design patterns have seemed like a favorite topic for discussion during technical interviews in the past few years. The topic got so popular that even recruiters without a computer science background started asking candidates to explain design patterns. It took me by surprise when two HR-looking ladies (they were recruiters) asked me to explain the design patterns I have worked with. I got the feeling they did not understand 9 out of 10 sentences coming out of my mouth, because they never asked any follow-up question based on what I explained. They probably just wanted to see how clearly I could articulate my ideas.

A design pattern is something you implement once, and it becomes second nature in your project. Developers do not apply 7 patterns in one go, revisit them every 3 weeks to evaluate whether they are still appropriate, and, if they are not, revamp them and apply another 5 new patterns. This simply does not happen for any software with a real delivery timeline. Most developers will be working with 1-2 patterns on a daily basis. This is again a breadth and depth issue. The interviewer might be an expert in Adapter and Abstract Factory while the interviewee is an expert in Observer and Singleton. It is not always possible to have an in-depth discussion on all design patterns.

Shouldn’t a good developer know a few more patterns, at least on a theoretical level? Yes, I think that’s a valid point. However, there will still be a gap between the interviewer’s and the interviewee’s levels of understanding. For example, the interviewer has been working with Adapter for the last 3 years while the interviewee has only read 3 articles on the Adapter pattern (or vice versa). The level of discussion between them on the Adapter pattern is going to be shallow.

The bad news is, some interviewers don’t seem to recognize this breadth and depth gap. Some interviewers insist on discussing rigid details of a specific design pattern. It ends up being an unpleasant experience for both the interviewer and the interviewee: the interviewee feels inferior for not being able to provide an answer, while the interviewer feels unsatisfied because he cannot have a meaningful technical discussion with the interviewee to assess his technical skill.

The good news is, when design pattern based questions are done right, they give both the interviewer and the interviewee a good discussion to explore areas that neither of them might have thought of before as individuals.

Algorithm

This is a very safe approach to use during technical interviews, because all programmers are expected to have solid logical thinking. Algorithms are all about combining programming techniques and logical thinking to solve a specific problem. It is a very suitable approach for assessing the interviewee’s ability to solve a problem using code.

Interview questions based on algorithms can be as simple as writing a function to print a few asterisks (*) on the screen, detecting whether an input is an odd or even number, sorting a series of numbers, or printing a calendar. Usually, a company that uses algorithm-based questions will have 3 levels of questions: easy, medium and hard. If you want to secure a job, you should at least get the easy and medium ones right. The hard question is there for the interviewer to distinguish a grand-master coder from a senior coder.

The ironic part about algorithm-based questions is that a lot of candidates tend to shy away from them.

Example 1: About 8 years back, it was still pretty common to have the candidate write down the solution on paper. The question was about a simple string manipulation function. Unfortunately, the candidate, who appeared to be an experienced developer, handed me an empty paper and left, saying in an apologetic tone that this job might not be right for him.

Example 2: One company that I know of asks the candidate to code a function that detects whether an integer input is an odd or even number and displays an appropriate message, using a provided laptop with Visual Studio on it. The answer is surprisingly simple: use the modulus operator (%) and put an if check on the remainder. However, it took a candidate applying for a senior developer position 20 minutes of typing a few keystrokes and a few backspaces, then typing a few keystrokes and a few backspaces again.
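For context, the expected answer is only a few lines of C# (a minimal sketch):

    using System;

    public static class Parity
    {
        public static void PrintOddOrEven(int input)
        {
            // The remainder of dividing by 2 determines the parity.
            if (input % 2 == 0)
                Console.WriteLine(input + " is even");
            else
                Console.WriteLine(input + " is odd");
        }
    }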

Example 3: Codility has been a handy tool for conducting programming tests online to save everyone’s time. I recently found out about a friend who applied for a Tech Lead position. He was asked to write a function working with a zero-based index in Codility. To my surprise, he could not understand the question. He did not even attempt to write the solution and closed the browser.

It appears that interviewees feel very stressed when the technical interview involves writing algorithms. In the above examples, the questions were not complicated and the answers were not complex. I believe all 3 candidates could have done reasonably well if they had not been in “technical interview” mode.

In the next article, I will discuss how I conduct technical interviews instead…

Azure WebJob with Azure Queue

Cron jobs are an essential part of complex systems, executing certain scripts or programs at a specific time interval. Traditionally, developers or system administrators create Windows Scheduled Tasks to execute scheduled jobs within the operating system.

In one project, I used to have multiple .exe programs scheduled to update the production database around midnight for various use cases, such as expiring user credit. This gets the job done within the application context, but it is not the cleanest way when my system administrator needs to take care of 20 other cron jobs coming from different machines and different operating systems.

The next thing I implemented was exposing a cron job through a WCF API endpoint. For example, I opened a WCF API endpoint to trigger the functionality of sending email notifications. This endpoint would match users’ saved criteria against business inventory on a daily basis. (Yes, this is the annoying 9.00AM email spam notification you get every day. Sorry!) The WCF API endpoint does not do anything if no one hits it. It is a simple HTTP endpoint waiting for something to tell it to get up and work.

The reason for exposing the cron job as a WCF API endpoint was to give my system administrator a centralized system to trigger and monitor all the cron jobs in one place, rather than logging into multiple servers (operating systems) to monitor and troubleshoot. This worked alright, except that now I had my cron job stuck in a WCF project instead of a simple script or a lightweight .exe program.

Azure WebJob

The next option is Azure WebJobs. An Azure WebJob enables me to run programs or scripts in a web app context as background processes. It runs and scales as part of an Azure Web App. With Azure WebJobs, I can now write my cron job as a simple script or a lightweight .exe rather than WCF. My system administrator also gets a centralized interface, the Azure Portal, to monitor and configure all the cron jobs. In fact, it’s pretty cool that I can trigger a .exe program via a public HTTP URL using the Web Hook property of the WebJob.

Azure WebJobs go beyond the traditional, timer-based cron job definition. Azure WebJobs can run continuously, on demand, or on a schedule.

The following file types are accepted:

  • .cmd, .bat, .exe (using Windows cmd)
  • .ps1 (using PowerShell)
  • .sh (using Bash)
  • .php (using PHP)
  • .py (using Python)
  • .js (using Node)
  • .jar (using Java)

I will use C#.NET to create a few .exe programs to demonstrate how Azure WebJobs work.

Prerequisites (nice to have)

Preferably, you should have the Microsoft Azure SDK installed on your machine. At the time of writing, I’m running Visual Studio 2015 Update 3, so I have the SDK for VS2015 installed via the Web Platform Installer.

Note that this is NOT a must-have. You can still write your WebJob in any of the above-mentioned languages and upload your job manually in the Azure Portal. The reason I recommend installing it is that it makes your development and deployment much easier.

Working with Visual Studio

If you have the Microsoft Azure SDK installed for Visual Studio, you will see the Azure WebJob template. We are going to get started with a simple Hello World program.

Your WebJob project will be pre-loaded with some sample code. As usual, you will still need to build it once for NuGet to resolve the packages.

The following packages are part of your packages.config if you started the project from the Azure WebJobs template. It’s no big deal if you didn’t; you can install them manually, although it’s a little tedious.

For now, we will ignore the fancy built-in SDK support for various Azure services. We will create a simple Hello World program and deploy it to Azure to get an end-to-end experience.

Remove everything else so that you are left with a Program.cs writing a simple message:
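Something along these lines is enough (a minimal sketch; the message text is arbitrary):

    using System;

    class Program
    {
        static void Main()
        {
            // Anything written to the console shows up in the WebJob logs.
            Console.WriteLine("Hello World from my first Azure WebJob!");
        }
    }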

Go to your Azure Portal. Click on “Get publish profile” to download your publishing profile. You will need to import it into Visual Studio later when publishing your WebJob.

Import your publish profile into your project in Visual Studio. The Import dialog will kick in the first time you publish your project as a WebJob.

Right click on your project and select “Publish as Azure WebJob…”, and you will see the following dialog to set up your publishing profile.

Import the publishing profile settings file you downloaded earlier into your WebJob project in Visual Studio.

Validate your connection.

Click on “Publish” to ship your application to Azure.

Upon successful publishing, you should see the message “Web App was published successfully…”.

Go to your Azure portal and verify that your WebJob is indeed listed.

Select your WebJob and click on the Logs button at the top to see the following page.

The impressive part about using WebJobs in Azure is the following WebJobs monitoring page. You can use this page to monitor the status of multiple WebJobs and drill down deeper into the respective logs. No extra cost, coding or configuration; it all works out of the box!

Now we have our first Hello World application running in Azure. We have deployed our WebJob to run continuously, which means it gets triggered automatically every 60 seconds. Once a round is completed, the status changes to PendingRestart and the job waits for the next 60 seconds to kick in.

The WebJob SDK sample project on GitHub comprehensively demonstrates how you can make WebJobs work with Azure Queue, Azure Blob, Azure Table and Azure Service Bus. In this article, we will do a little bit more coding by using a WebJob to interact with Azure Queue.

Azure WebJob with Azure Queue Storage

The Microsoft.Azure.WebJobs namespace provides the QueueTriggerAttribute. We will use it to trigger a method in our WebJob.

It works like this: whenever a new message is added into the queue, the WebJob is triggered to pick up the message from the queue.

Before we continue with our code, we first need to create an Azure Storage account to host the queue. Here, I have a storage account named “danielfoo”.

We will use the Microsoft Azure Storage Explorer to get a visual on our storage account. It’s a handy tool for visualizing your data. If you do not have it, no worries, just picture the queue messages in your mind 🙂

Let’s add a new console application project to our solution to put some messages into our queue.

There are two packages that you’ll need to install into your queue project:

We will write the following simple code to initialize a queue and put a message into it.
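A minimal sketch of that generator, assuming the classic WindowsAzure.Storage package (the message text is arbitrary; the queue name must match what the WebJob will listen on later):

    using System;
    using System.Configuration;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Queue;

    class Program
    {
        static void Main()
        {
            // Read the storage connection string from app.config.
            var account = CloudStorageAccount.Parse(
                ConfigurationManager.AppSettings["StorageConnectionString"]);

            // Get a reference to the queue and create it if it does not exist yet.
            var client = account.CreateCloudQueueClient();
            var queue = client.GetQueueReference("danielqueue");
            queue.CreateIfNotExists();

            // Drop a message into the queue.
            queue.AddMessage(new CloudQueueMessage("Hello from Queue.MessageGenerator!"));
            Console.WriteLine("Message added to danielqueue.");
        }
    }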

Of course, you will have to configure your StorageConnectionString in app.config for the code to pick up the connection string.

You can get your account name and key from Azure Portal.
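Assuming the appSettings approach used in the sketch above, the app.config entry looks something like this (the account key is a placeholder):

    <configuration>
      <appSettings>
        <add key="StorageConnectionString"
             value="DefaultEndpointsProtocol=https;AccountName=danielfoo;AccountKey=YOUR_ACCOUNT_KEY" />
      </appSettings>
    </configuration>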

Let’s execute our console application to test whether our queue is created and a message is placed into the queue properly.

After execution, look at Storage Explorer to verify that the message is indeed in the queue.

Now we will dequeue this message so that it will not interfere with the actual QueueTrigger in our exercise later.

Next, we will create a new WebJob project that gets triggered whenever a message is added into the queue, using the QueueTriggerAttribute from the Microsoft.Azure.WebJobs namespace.

This time we do not remove Functions.cs or modify Program.cs.

Make sure that your Functions.cs method parameter contains the same queue name as you defined earlier in your Queue.MessageGenerator project. In this example, we are using the name “danielqueue”.
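The Functions.cs generated by the template is essentially the following; the only thing to verify is that the queue name in the attribute matches (here, “danielqueue”):

    using System.IO;
    using Microsoft.Azure.WebJobs;

    public class Functions
    {
        // This method is triggered whenever a new message lands on 'danielqueue'.
        public static void ProcessQueueMessage([QueueTrigger("danielqueue")] string message, TextWriter log)
        {
            log.WriteLine(message);
        }
    }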

Program.cs
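The template’s Program.cs boils down to a JobHost that runs and blocks, listening for triggers:

    using Microsoft.Azure.WebJobs;

    class Program
    {
        static void Main()
        {
            // The JobHost listens for triggers (such as new queue messages)
            // and invokes the matching methods in Functions.cs.
            var host = new JobHost();
            host.RunAndBlock();
        }
    }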

Remember to fill in the following connection strings in your App.config. This is how the WebJob knows which storage account to monitor.
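The WebJobs SDK expects two connection strings, AzureWebJobsDashboard and AzureWebJobsStorage (the values below are placeholders):

    <connectionStrings>
      <add name="AzureWebJobsDashboard"
           connectionString="DefaultEndpointsProtocol=https;AccountName=danielfoo;AccountKey=YOUR_ACCOUNT_KEY" />
      <add name="AzureWebJobsStorage"
           connectionString="DefaultEndpointsProtocol=https;AccountName=danielfoo;AccountKey=YOUR_ACCOUNT_KEY" />
    </connectionStrings>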

Now, let’s start the WebJob.QueueTrigger project as a new instance and let it wait for a new message to be added into “danielqueue”.

Then, we will start the Queue.MessageGenerator project as a new instance to drop a message into the queue for WebJob.QueueTrigger to pick up.

Yes! Our local debug session has detected that a new message was added into “danielqueue” and has hit the ProcessQueueMessage function.

Let’s publish our WebJob.QueueTrigger to Azure to see it processing the queue messages in the Azure context instead of on the local machine. After successful publishing, we now have 2 WebJobs.

Select QueueTrigger (the WebJob we just published) and click on the Logs button at the top. You will see the following log of queue message processing.

If you drill down into a particular message, you will be redirected to the Invocation Details page.

We have just set up our WebJob to work with Azure Queue!

That wraps up everything I wanted to show you about working with Azure WebJobs and Azure Queue.

Obviously, in reality you will write something more complex in your WebJob than simply outputting to the log. You may write some logic to perform a certain task. You may even use it to trigger another, more complex job sitting in another service.

In the queue, obviously, you also wouldn’t write a literal “message” like I did. You will probably create one queue for a very specific purpose. For example, you might create a queue to store a list of IDs, where each ID represents an entity required by another type of process, such as indexing. The consumer then indexes the entities (represented by the IDs) in batches (let’s say 4 messages at a time) instead of creating a large surge of load in a short period of time.

A few more thoughts…

  1. By default, JobHostConfiguration.Queues.BatchSize is 16, meaning the host handles 16 queue messages concurrently. I recommend overriding the default with a smaller value (let’s say 4) to ensure the other end, which does the heavier processing (for example indexing a document in Solr or Azure Search), is able to handle the load. The maximum value for BatchSize is 32. If having the WebJob handle 32 messages at a go is not sufficient for you, you can further tweak the throughput by setting a short JobHostConfiguration.Queues.MaxPollingInterval to make sure you do not accumulate too many messages before the processing kicks in. See the sketch after this list.
  2. If for whatever reason you have maxed out the built-in configuration (such as BatchSize and MaxPollingInterval) and it is still not good enough, a quick win is to scale up your Web App. Note that you cannot scale your WebJob alone, because the WebJob sits in the context of a Web App. If scaling up the Web App for the sake of the WebJob sounds inefficient, consider migrating your jobs to a Worker Role.
  3. WebJobs are good for lightweight processing. They are good for tasks that only need to run periodically, on a schedule, or when triggered. They are cheap and easy to set up and run. Worker Roles are good for more resource-intensive workloads, or if you need to modify the environment they run in (for example, the .NET Framework version). Worker Roles are more expensive and slightly more difficult to set up and run, but they offer significantly more power when you need to scale. There is a pretty comprehensive blog post by Kloud comparing WebJobs and Worker Roles.
  4. Azure Storage Queue gives no guarantee on message ordering. In other words, a message placed into the queue first does not necessarily get processed first. Delivery for Azure Storage Queue is At-Least-Once but not At-Most-Once, which means a message can potentially be processed more than once, and the application code needs to handle such duplication after a message is picked up. If this troubles you, you should consider Service Bus Queue, where ordering is First-In-First-Out (FIFO) and delivery is both At-Least-Once and At-Most-Once. If you are wondering why people still use Azure Storage Queue, it is because Storage Queue is designed to handle super-large-scale queuing. For example, the maximum queue size for Storage Queue is 200 TB while a Service Bus Queue is 1 GB to 80 GB; the maximum number of queues for Storage Queue is unlimited while for Service Bus it is 10,000. For a complete comparison, please refer to the Microsoft docs.
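A minimal sketch of the overrides mentioned in point 1, assuming the Microsoft.Azure.WebJobs SDK (the values are examples only):

    using System;
    using Microsoft.Azure.WebJobs;

    class Program
    {
        static void Main()
        {
            var config = new JobHostConfiguration();

            // Process 4 queue messages concurrently instead of the default 16,
            // so the downstream system (e.g. an indexer) is not overwhelmed.
            config.Queues.BatchSize = 4;

            // Poll the queue at most every 10 seconds so messages do not
            // accumulate for too long before processing kicks in.
            config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(10);

            // Run the host with the tweaked configuration.
            var host = new JobHost(config);
            host.RunAndBlock();
        }
    }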

I hope you have enjoyed reading this article. If you find it useful, please share it with your friends who might benefit from it. Cheers!

Tech Talk 2016

2016 has been a fruitful year for me in software development. Apart from my day job at Sitecore as a lead developer, I also had a lot of fun with services in the Azure cloud in my spare time.

Another new “adventure” I tried in 2016 was speaking at Tech Talks in local tech communities, from my own office, a university and the Microsoft office, to being an online panelist in a Google Hangout discussion.

Scaling SQL Server @ Microsoft Malaysia Level 26, Tower 3

Software Industry Career Advice @ Multimedia University, Computing Faculty Lecture Hall, Cyberjaya

Working with Azure Search @ Sitecore Malaysia, Level 18

Standardizing DevOps Across Organization @ Continuous Discussions (#c9d9) Google Hangout

It has been a rewarding experience to contribute and make a difference in the tech communities through Tech Talks. It amazed me when some audience members asked me follow-up questions based on what I had shared earlier. It is truly satisfying to learn that I have inspired some of the audience with useful information or ideas they can benefit from.

It has been an honor to share the stage with many knowledgeable speakers. As much as I enjoyed sharing, I have learned just as much from them. Thanks to those who invited me over to speak, those who helped me make the Tech Talks possible, and those who came and supported me. Thank you! I am truly lucky to have you all, amazing people, as friends in the tech communities.

Signing off for 2016… It has been a fantastic year, looking forward to 2017!

Continuous Integration and Continuous Delivery with NuGet

Continuous Integration (CI) is a development practice that requires developers to integrate code into a shared repository. Each commit is then verified by an automated build, and sometimes by automated tests.

Why is Continuous Integration important? If you have been programming in a team, you have probably encountered a situation where one developer committed code that caused every developer’s code base to break. It can be extremely painful to isolate the code that broke the code base. Continuous Integration serves as a preventive measure: it builds the latest code base to verify whether there are any breaking changes. If there are, it raises an alert, perhaps by sending an email to the developer who last committed the code, perhaps by notifying the whole development team, or even by rejecting the commit. If there isn’t any breaking change, CI proceeds to run a set of unit tests to ensure the last commit has not modified any logic in an unexpected manner. This process is sometimes also known as Gated CI, which guarantees the sanity of the code base within a relatively short period of time (usually a few minutes).

The idea of Continuous Integration goes beyond validating the code base a team of developers is working on. If the code base utilizes components from other development teams, it is also about continuously pulling the latest components to build against the current code base. If the code base utilizes other micro-services, it is about continuously connecting to the latest version of those micro-services. On the other hand, if the code base output is utilized by other development teams, it is also about continuously delivering the output so that other development teams can pull the latest version to integrate with. If the code base output is a micro-service, it is about continuously exposing the latest micro-service so that other micro-services can connect and integrate with the latest version. The process of delivering the output for other teams to utilize leads us to another concept, known as Continuous Delivery.

Continuous Delivery (CD) is a development practice where the development team builds software in such a manner that the latest version can be released to production at any time. The delivery could mean the software being delivered to a staging or pre-production server, or simply to a private development NuGet feed.

Why is Continuous Delivery important? With today’s fast pace of change in software development, stakeholders and customers wanted all the features yesterday. Product Managers do not want to wait 1 week for the team to “get ready” to release. The business expectation is that as soon as the code is written and the functionality is tested, the software should be READY to ship. Development teams must establish an efficient delivery process where delivering software is as simple as pushing a button. A good benchmark is that the delivery can be accomplished by anyone in the team, perhaps by a QA after he has verified the quality of the deliverable, or by the Product Manager when he thinks the time is right. In a complex enterprise system, it is not always possible to ship code to production quickly. Therefore, complex enterprise systems are often broken into smaller components or micro-services. In that case, the components or micro-services must be ready to be pushed to a shared platform so that other components or micro-services can consume the deliverable as soon as it is available. This delivery process must be in a READY state at all times. Whether to deliver the whole system or a smaller component should be purely a business decision.

Note that Continuous Delivery does not necessarily mean Continuous Deployment. Continuous Deployment is where every change goes through the pipeline and automatically gets pushed into production. This can lead to several production deployments every day, which is not always desirable. Continuous Delivery allows the development team to do frequent deployments, but they may choose not to. In today’s standard .NET development, a NuGet package is commonly used for delivering either a component or a whole application.

NuGet is the package manager for the Microsoft development platform. A NuGet package is a set of well-managed libraries and the relevant files. NuGet packages can be installed and added to a .NET solution from the GUI or the command line. Instead of referencing individual libraries in the form of .dll files, developers can reference a NuGet package, which provides much better management of dependencies and assembly versions. In a more holistic view, a NuGet package can even be an application deliverable by itself.
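As an illustration of how lightweight the packaging and delivery step can be, a build server typically runs something along these lines (the project name, version and feed URL are placeholders):

    nuget pack MyComponent.csproj -Version 1.4.2 -Properties Configuration=Release
    nuget push MyComponent.1.4.2.nupkg -Source https://proget.example.com/nuget/internal-feed/ -ApiKey <your-api-key>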

Real life use cases

Example 1: Micro-services

In a cloud-based (software as a service) solution, domains are encapsulated in their respective micro-services. Every development team is responsible for their own micro-services.

Throughout the Sprint, developers commit code into TFS. After every commit, TFS builds the latest code base. Once the build process is completed, unit tests are executed to ensure existing logic is still intact. Several NuGet packages are then generated, representing the micro-services (WCF, web application, etc.). These services are deployed by a deployment tool known as Octopus Deploy to a Staging environment (hosted in AWS EC2) for QA to perform testing. This process continues until the last User Story is completed by the developers.

In a matter of clicks, those NuGet packages can also be deployed to a Pre-production environment (hosted in AWS EC2) for other types of testing. Lastly, with the blessing of the Product Manager, the DevOps team uses the same deployment tool to promote the same NuGet packages that were tested by QA earlier into Production. Throughout this process, it is very important that there is no manual intervention by hand (such as copying a dll or changing a configuration) so as to ensure the integrity of the NuGet packages and the deployment process. The entire delivery process must be pre-configured or pre-scripted to ensure it is consistent, repeatable and robust.

Example 2: Components

In a complex enterprise application, functionalities are split into components. Each component is a set of binaries (dll) and other relevant files. A component is not a stand-alone application; it has no practical use until it sits on the larger platform. Development teams are responsible for their respective components.

Throughout the Sprint, developers commit code into a Git repository. The repository is monitored by TeamCity (the build server). TeamCity pulls the latest changes and executes a set of PowerShell scripts. From the PowerShell scripts, an instance of the platform is set up, the latest code base is built, and the output is placed on top of the platform. Various tests are executed on the platform to ensure the component’s functionality is intact. Then, a set of NuGet packages is generated from the PowerShell scripts and published as the build artifacts. These artifacts are used by QA to run other forms of tests. This process continues until the last User Story is completed by the developers.

When QA gives the green light, and with the blessing of the Product Manager, the NuGet packages are promoted to ProGet (an internal NuGet feed). This promotion process happens in a matter of clicks. No manual intervention (modifying the dependencies, version, etc.) should happen, to ensure the integrity of the NuGet packages.

Once a NuGet package is promoted / pushed into ProGet, other components pull this latest component into their own code bases. In Scaled Agile, a release train is planned on a frequent and consistent time frame; internal releases happen on a weekly basis. This weekly build always pulls all of the latest components from ProGet to generate a platform installer.

Summary

From the examples, we can tell that Continuous Integration and Continuous Delivery are fairly simple concepts. There is neither black magic nor rocket science in either of the use cases. The choice of tools and approaches largely depends on the nature of the software we are building. While designing software, it is always a good idea to keep Continuous Integration and Continuous Delivery in mind to maximize team productivity and to achieve quick and robust delivery.

JSON Data in SQL Server

The rise of NoSQL databases such as MongoDB is largely due to the agility of storing data in a non-structured format. A fixed schema is not required, unlike in traditional relational databases such as SQL Server.

However, a NoSQL database such as MongoDB is not a full-fledged database system; it is designed for very specific use cases. If you don’t know why you need NoSQL in your system, chances are you don’t need it. Those who do find it essential often use a NoSQL database for only a certain portion of their system, and use an RDBMS for the remaining parts, which have more traditional business use cases.

Wouldn’t it be nice if an RDBMS were able to support a similar data structure, with the ability to store flexible data formats without altering database tables?

Yes, it is possible. For years, software developers have been storing various kinds of JSON data in a single table column. Developers then use a library such as Newtonsoft.Json within the application (data access layer) to deserialize the data and make sense of the JSON.

Reading / Deserializing JSON
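A minimal sketch of that classic approach (the Movie shape with Name and Genres is an illustrative example, not the exact model from the original screenshots):

    using Newtonsoft.Json;

    public class Movie
    {
        public string Name { get; set; }
        public string[] Genres { get; set; }
    }

    public static class MovieReader
    {
        public static string GetMovieName(string json)
        {
            // Deserialize the ENTIRE JSON document just to read one field.
            var movie = JsonConvert.DeserializeObject<Movie>(json);
            return movie.Name;
        }
    }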

This works. However, the JsonConvert.DeserializeObject method has to work extremely hard, deserializing the whole JSON document just to retrieve a simple field such as Name.

Imagine there is a requirement to search for certain Genres in a table that has 1 million rows of records: the application code will have to read all 1 million rows and then perform the filtering on the application side. Bad for performance. Now imagine you have a more complex data structure than the example above…

The searching mechanism would be much more efficient if developers could pass a query (SQL statement) for the database to handle the filtering. Unfortunately, SQL Server does not support querying JSON data out of the box.

It was impossible to query JSON data directly in SQL Server until the introduction of a library known as JSON Select. JSON Select allows you to write SQL statements that query JSON data directly from SQL Server.

How JSON Select Works

First, you need to download the installer from their website. When you run the installer, you need to specify the database in which you wish to install the library:

What this installer essentially does is create 10 functions in the database you targeted. You can see the functions at:

SSMS > Databases > [YourTargetedDatabase] > Programmability > Functions > Scalar-valued Functions

Next, you can start pumping some JSON data into your table to test it out.

I created a Student table with the following structure for my experiment:

In my StudentData column, I entered multiple rows of records in the following structure:
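Each record is a small JSON document; the values here are illustrative, based on the fields queried later (Name, Age, City):

    {
        "Name": "Daniel",
        "Age": 21,
        "City": "Kuala Lumpur"
    }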

For demonstration purposes, I entered multiple rows, as follows:

If you want to write a simple statement to read the list of student names from the JSON data, you can simply write:

You will get the following result in SSMS:

How about a more complex query? Does it work with aggregate functions?

If you want to find out how many students come from each city and what their average age is, you can write your SQL statement as follows:

You will get the following result in SSMS:

It appears the library allows you to query any JSON data in your table column using normal T-SQL syntax. The only difference is that you need to use the predefined scalar-valued functions to wrap the values you want to retrieve.

A Few Last Thoughts…

  1. The good thing about this library is that it allows developers to have a hybrid storage model (NoSQL & relational database) under one roof, minus the deserialization code at the application layer. Developers can continue using the classical RDBMS for typical business use cases and leverage the functions provided by the library to deal with JSON data.
  2. The bad thing about this library is that it lacks a proven track record and commercial use cases to demonstrate its robustness and stability.
  3. Although the library is not free, the license cost is relatively affordable at AU$50. The library is also free for evaluation.
  4. SQL Server 2016 provides native support for JSON data. This library is only useful for SQL Server 2005 to 2014, where upgrading to 2016 is not a feasible option (see the sketch below for what the native support looks like).
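For comparison, on SQL Server 2016 and later, the native JSON functions cover the same scenarios out of the box. A minimal sketch against a Student table like the one above:

    -- Read one field out of the JSON column.
    SELECT JSON_VALUE(StudentData, '$.Name') AS Name
    FROM Student;

    -- Aggregate on JSON fields: students per city and their average age.
    SELECT JSON_VALUE(StudentData, '$.City') AS City,
           COUNT(*) AS StudentCount,
           AVG(CAST(JSON_VALUE(StudentData, '$.Age') AS INT)) AS AverageAge
    FROM Student
    GROUP BY JSON_VALUE(StudentData, '$.City');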