“Attach to process…” tricks

When you are working on a large system where your code base (project or solution) is only a subset of the entire system, you will often use “Attach to process…” to tap into the execution of the application pool for debugging.

“Attach to process…” might not be the most convenient approach for debugging, but at times it can be more efficient than launching the whole application from your code base. There is still value in using “Attach to process…”, although the approach can be painful occasionally.

Here are a few tricks to make the debugging process less painful.

Trick No.1 – Identify your AppPool

When you are working with a slightly more complex application, where your application has dependencies on other applications, you might have multiple w3wp.exe processes running, like the following:

Which process should you attach to?

All of them? Of course that would work, but obviously that’s not the kindest thing you can do to the poor machine’s memory and CPU…

The following command helps you identify the AppPool name and the process ID so that Visual Studio only attaches to the relevant w3wp.exe:

C:\Windows\System32\inetsrv>appcmd list wp
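
For reference, the output looks something like this (the process IDs and AppPool names below are made up; yours will differ):

WP "8748" (applicationPool:MyWebsiteAppPool)
WP "9512" (applicationPool:MyServicesAppPool)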

Based on the AppPool name, you now know the process ID you should attach to in Visual Studio.

Trick No.2 – ReAttach (Visual Studio extension)

Imagine you have to use “Attach to process…” for debugging purposes in your project. You modify your code a little to change the logic or add some validation and want to continue debugging. It is annoying to keep launching the “Attach to Process” dialog above just to select the same w3wp.exe.

ReAttach is a Visual Studio extension that helps you reattach to the process ID you attached to a moment ago.

Download and install it, and you will have a new option in your Visual Studio menu, “Reattach to Process…”, alongside the usual “Attach to Process…”.

The extension basically helps you reattach the debugger to the process you attached to earlier. It will continue to work as long as your process ID has not changed or disappeared.

If for whatever reason you need to run iisreset and your process ID changes, “Reattach to process…” will filter out all the irrelevant processes and only show you the available w3wp.exe instances.

Hope these two tricks make debugging by attaching to your AppPool less painful. Until next time!

Technical Interview Part 2

What I do instead

Before the technical interview session, a technical assignment is given to the candidate on Friday so that they can work on it over the weekend. The technical assignment typically takes a few hours to complete. I have a range of questions, from demonstrating a design pattern, to building a simple application with database interaction, to an SEO analysis algorithm.

The objectives of the technical assignment are:

  1. Ensure the candidate can code.
  2. Evaluate how modern his development approach is (for example, whether the candidate will use Elmah or Log4Net over writing a custom logger class).
  3. Evaluate how seriously he treats his code (if a candidate delivers a half-hearted solution, it indicates the same for his work code).
  4. Evaluate whether he goes the extra mile (such as implementing unit tests and proper exception handling).

During the technical interview, I will do the following:

  1. Have the candidate explain the core of the solution. I will then ask a few questions based on his implementation. For example, if I want to implement a certain change, where should I modify the code? This is to ensure I’m talking to the person who wrote the code.
  2. Find a flaw in the system and press on it again and again – in a professional and respectful manner. This is to evaluate how well the candidate responds to criticism, whether the candidate gets defensive and whether the candidate is open to feedback.
  3. Challenge the candidate on how he would upgrade his solution to be production ready at both the code and infrastructure level. This is to evaluate how much thought he has given to his solution and how much exposure the candidate has had dealing with production systems.

Next, I will move on to a list of generic topics on software development. Here are examples of the topics I cover.

Source Control

Every developer uses source control to a certain extent. I will normally ask what kind of source control the candidate has used. The top 3 answers are TFS, Git and SVN. I will ask the candidate to share with me the differences between the top 2 source controls he is familiar with. The idea here is to discuss what the candidate is familiar with so that he can show his best thoughts. Depending on what the candidate brings to the table, I will get a sense of what kind of developer the candidate is.

For example, if a candidate tells me that checking out a branch in TFS downloads a whole new copy of the code, while Git merely applies the delta difference on the same copy of the code, it indicates this candidate has worked with a giant code base, has some level of branching experience, and appreciates that Git is much more efficient on client-side storage.

Another example: if the candidate brings up terms such as Rebase, I will follow up by asking what the difference between Rebase and Merge is on a theoretical level, and when it is a good scenario to use Rebase over Merge on a practical level. Depending on the scenario given, I might (or might not) have further questions to validate the usage. The idea here is that I’m following up on the topics suggested by the candidate himself. If a candidate cannot provide solid evidence of how familiar he is with a topic he suggested himself, it indicates the candidate is throwing fancy terms around hoping to impress the interviewer.

Design Pattern

Despite the challenges I highlighted earlier on design patterns, I still think design patterns are a good topic to cover during a technical interview, because the right application of a design pattern indicates the complexity of code the candidate has dealt with, and hence the need for the pattern.

Ever since I took on the role of interviewer, I have made it a point to read up on additional design patterns that I have never used. The good news is, most candidates consistently bring up only a handful of design patterns. The top 3 are Singleton, Abstract Factory and Repository.

Although Dependency Injection is not strictly a design pattern, a lot of candidates do mention Dependency Injection as something they know under the design pattern topic. I do not dismiss this answer just because it does not fit the definition. My objective here is to assess the candidate’s ability to design his code structure, not to run a competition on giving definitions.

Singleton

What do I look out for during the Design Pattern discussion? Take Singleton for example. After the candidate mentions he knows Singleton, I will follow up with the question “What is a good use case for Singleton?”. The typical answer I get is something along the lines of “when you only need to have a single instance of the class”. Good. At this point, I know that the candidate is aware of the definition of Singleton, although he did not provide the use case I asked for. I will rephrase my question slightly to remind the candidate that I’m looking for a use case.
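
The kind of answer I am fishing for can be expressed in a few lines of C#. The class below is a hypothetical example of a reasonable use case, a lazily created, thread-safe, application-wide cache of configuration values:

using System;
using System.Collections.Concurrent;

public sealed class ConfigurationCache
{
    // Lazy<T> gives us thread-safe, on-demand creation of the single instance.
    private static readonly Lazy<ConfigurationCache> _instance =
        new Lazy<ConfigurationCache>(() => new ConfigurationCache());

    private readonly ConcurrentDictionary<string, string> _settings =
        new ConcurrentDictionary<string, string>();

    // Private constructor blocks direct instantiation.
    private ConfigurationCache() { }

    public static ConfigurationCache Instance => _instance.Value;

    // Each setting is loaded once and served from memory afterwards.
    public string Get(string key) => _settings.GetOrAdd(key, LoadFromStore);

    private static string LoadFromStore(string key)
    {
        // In a real system this would read from a file, database or remote service.
        return "value-for-" + key;
    }
}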

One “interesting” answer that always pops up is to apply Singleton in the data layer (CRUD operations to the database). I call this interesting because anyone who gives it a little more thought or has done some research on Singleton will realize it’s a bad idea to apply the pattern in the data layer. However, this misconception comes up very frequently.

I will take this opportunity to explain to the candidate the kind of problems that will surface from applying Singleton in the data layer. Why do I do that? Having the right skill or information is important, but having the right attitude is equally (if not more) important. You can teach someone a new skill, but it is extremely challenging to change someone’s attitude. At this point, if the candidate appears to be enlightened by the new information, I know the candidate is coachable. In most situations, I would rather have a coachable new hire (even without a top-notch skill set) over someone with a superstar skill set but a poor attitude.

Database

Working with a database crosses a developer’s path very frequently. It is an unspoken rule that a developer must be able to work with a database. With the number of storage options in the market, it is difficult to discuss all of them, so in this article we will stick with the most popular option for most .NET developers: SQL Server.

When hiring a junior developer, the candidate will have to prove his ability to write T-SQL. Insert, Update, Delete and different kinds of joins. No big deal. For a senior developer, I would normally ask the candidate what other exposure he has apart from T-SQL. Asking about the actual involvement with SQL Server gives me a very good indicator of what kind of system the candidate has dealt with.

For example, if the candidate claims he takes care of database backups, I will follow up with what the backup cycle is and the types of backup he was using. If all the candidate did was a full backup on a daily basis, it indicates the database he was dealing with was not very large and that data loss did not seem like a big deal, which means the data was not extremely critical.

If the candidate mentions he has scaled SQL Server, I will follow up with what type of replication he applied and the rationale behind the decision. I will also ask what other strategies he considered before using replication, because replication is an expensive option. If the candidate brings up a Redis cache or an index provider such as Solr or Azure Search, it shows the candidate has looked beyond the SQL Server context, which indicates he is someone with a very broad skill set across technologies.

Once, a candidate told me he implemented table partitioning in his database. I asked what logical condition he based his partitions on. He said the primary key, which was a GUID data type. That was an interesting answer, because the general approach is to partition based on a date or some other logical condition. I explained to him how I would implement table partitioning instead and the reasoning behind it. His eyes brightened up.

Notice that I did not say “This is wrong. The correct way is this.” Instead, I framed it as a discussion: “This is what I would do instead.” The same information was delivered, but the outcome is very different.

The candidate impressed me because he knew about table partitioning, which most developers don’t. It suggests that the candidate is someone who takes the extra effort to learn new skills to solve problems. Most importantly, the way he responded to the information I shared with him suggested he is someone coachable. I took this candidate into my team and he has proven to be a star team member.

A few final thoughts…

There are a lot of other topics that I cover during the interview. Most of them are generic topics such as tweaking software performance and security. The purpose of having a standardized list of topics is to ensure I use a similar benchmark for candidates applying for the same position. The reason for starting with a generic topic and drilling further down is to allow the candidates to talk about areas they are familiar with so that they can showcase their sharpest thoughts.

When a candidate brings up a certain topic for discussion, I assume he knows the topic very well. I’m handing over the power to drive the discussion to the candidate to a certain extent. I prefer to talk about what the candidate is familiar with (instead of what I am familiar with) so that I can truly assess his level of technical competency. Frankly, there is very little value in talking about a topic the candidate only read an article about 6 months ago. However, whichever topics the candidate brings up, I will drill really deep to ensure he indeed knows about them rather than just throwing some fancy words around.

During a technical interview, I’m looking at more than just technical skills. Technical skills are learnable. What really interests me is:

  • Is the candidate coachable?
  • How much passion does the candidate have for technology?
  • What is the candidate’s approach to solving problems?
  • What is the candidate’s attitude when dealing with technology and PEOPLE?
  • How much potential does the candidate have, so that the company can groom him to be a superstar developer and beyond?

The technical topics I have for the candidate are merely a way for me to explore the areas I’m interested in learning about the candidate. I’m never interested in knowing the difference between a clustered index and a non-clustered index, or the difference between an Azure WebJob, an Azure Worker Role and an Azure Function. Given a laptop with internet, anyone can Google them in 5 seconds. What I am interested to discover is whether this candidate is coachable, his passion, his approach, his attitude and his potential!

Ideally, we should hire the right person with the right skills. However, such angels rarely come by. If I have to choose between the right person and the right skills, I will choose the right person any day. Of course, provided the candidate still has a reasonable level of skill for the role he is applying for. New skills are learnable and very often a new skill can be picked up quickly. Coaching a person takes much more time, energy and cha-ching – if you are lucky.

If you are not lucky, a bad apple not only brings down productivity but also breaks the current harmonious team. It is much more effective to filter out the potential troublemaker than to “coach” or “develop” him later. There is no point hiring bad apples just to hit headcount. With people, slow is fast.

Some companies have a couple of strong technical guys interview candidates whom they might not eventually work with. These interviewers are hiring for the company as a whole. Other companies have the Team Lead / Architect within the team interview the candidates they will eventually work with. They are hiring for the team. I have been in both situations and personally I prefer the latter.

Being able to work with the person I interviewed earlier makes me give additional consideration and deeper thought to whether the candidate will be a good fit for my team. Another good reason is that it allows me to validate and refine my interview techniques. An interview is all about the perceptions and assumptions made about the candidate. I have made good decisions and I have made bad decisions. However, in situations where I made a wrong assumption based on a wrong perception, I can adjust my interview technique on a continuous basis if I have first-hand experience working with the candidate I interviewed.

Finally, I don’t claim what I’m doing is the only way or the best way. We live and we learn 🙂 I found this approach works quite well, hence I continue practicing it. If you have any thoughts on this, please leave me a comment. Hope you have found something useful in this article. Until next time. Cheers!

Technical Interview Part 1

A technical interview is both an exciting and a stressful moment. It is exciting because there is a potential career opportunity ahead of you. It is stressful because you are subconsciously aware that the people in the room are there to judge you.

I have been on both sides of the table – as an interviewer and as an interviewee. It is stressful to be an interviewee for obvious reasons. We need to try hard to sell ourselves, and “sales” is not a skill that comes naturally to technical people like us. Furthermore, you never know what kind of psychopath you might meet, asking you to find a bug in his rocket science algorithm. It is equally stressful to be an interviewer. Now your shoulders carry the responsibility of evaluating whether a candidate will be the right fit for the organization in the long term. Being too lenient, you might get a bad apple into the existing harmonious team; being too strict, you might lose a dark horse who might just need a little polishing.

As an interviewee

Let’s deal with the stress problem for the interviewee first. Throughout my experience, I have noticed I perform best when I’m not feeling nervous. The key to not feeling nervous is not to feel desperate for a job. Always look for a new job when you least need it. When you don’t “need” the job, you go into the interview room as an equal. Low need, high power, and vice versa. Did you notice the term interview basically means “viewing each other”? You go in as an equal to evaluate the company as much as the company is evaluating you. The outcome of having this mentality is that you feel more confident. Again, from my personal experience, when I go into a technical interview with this mindset, I often have a pleasant technical discussion with the interviewer.

As an interviewer

Now for the interviewer. I’m not sure how many interviewers feel stressed. I did not feel that being an interviewer was a stressful task until I had been conducting interviews for 3 years. Interviewees will usually be polite and humble. Most of the time, interviewees will do their best not to offend or make things difficult for the interviewer. I always felt I had the upper hand while conducting interviews, hence I never thought there was a problem. It was only when I pulled myself out of the technical interviewer’s role and took a more holistic view from the organization’s perspective that I realized there are many other aspects I need to take into consideration while conducting a technical interview.

For example, during one interview I found out that talking to me in a technical interview was the 7th round of interviews the candidate had gone through. He had taken an online technical test and other technical interviews prior to talking to me. My final feedback on the candidate was a clear ‘No’. However, that got me thinking about how and why the candidate was able to pass the previous 6 rounds of interviews but not my technical interview. Is there something wrong with the way I ask technical questions? Or does it simply mean the previous 6 rounds of interviews were not done effectively?

Another example: the organization has an expansion plan to grow by another 100 headcount in 1.5 years. That is equivalent to approximately 6 new hires a month. Aggressive? Definitely! However, based on the current hiring rate, we will not be able to hit the number. What needs to be done differently? Should I lower my technical benchmark? Should I say we can’t meet this number simply because we cannot find the talent? How big (or small) is the impact on the projects if we do not meet the numbers? Most importantly, where should I find the balance?

The software development skill set has both breadth and depth. Ideally, it would be perfect to pair an interviewer and interviewee who have identical technical domain experience. The reality is that, due to today’s technology breadth, developers often focus on very different vertical skill sets. For example, the interviewer might be an expert in Azure WebJobs, Azure Storage and MVC, while the interviewee has been working on Angular, Web API and SQL Server. Both of them are experts in their respective full-stack domains, but there is very little common ground.

Let’s face it, neither the interviewer nor the interviewee can know every topic in great depth, even just within the Microsoft stack of technologies. How can the technical interview be conducted in a fruitful manner given this breadth and depth? Do I dismiss a candidate just because he doesn’t share a similar background with me, even though he is talented, passionate and willing to learn?

What is the solution? Should the interviewer ask something more generic and academic like object-oriented concepts? Something more holistic yet sophisticated like design patterns? Or something more brainy like algorithms?

Popular topics interviewers use

Object-oriented concepts

In my previous job, my technical interview was the 1st round of interviews after the candidate had passed a Codility test. The online test assesses the candidate’s basic programming skill (fixing a logical operator in a small function) and the ability to write a basic SQL query with some joins. I thought it was necessary to cover the basics of object-oriented concepts for a C#.NET developer. So I ended up asking questions like:

  • Explain to me what method overloading and method overriding are.
  • What are the differences between an interface and an abstract class?

I was under the assumption these questions were alright until one day a candidate answered me so fluently it was as if he was reading it out from a book – except he didn’t have a book in front of him. This suggests the candidate had rehearsed these answers a thousand times before talking to me.

Well, the reality is, at first I thought the candidate was such a bright developer who knew these concepts so well. I decided to give him a slightly more challenging question to see how far he could go. The question was based on what he had explained earlier: an abstract class can contain both methods with no implementation and methods with a concrete implementation, while an interface can only contain method signatures without implementation. Great!

My next question was: if an abstract class can hold both methods without implementation and methods with a concrete implementation, why do we still need interfaces? I was expecting him to explain something along the lines of a child class only being able to inherit 1 abstract class but implement multiple interfaces. I would have been happy to accept the answer and move on to the next topic even if he gave me a one-liner answer. To my surprise, he kept quiet and could not provide any explanation.
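
For the record, the one-liner I was fishing for can be shown in a few lines of C#. The types below are hypothetical, purely to illustrate the single-inheritance versus multiple-interfaces point:

using System;

public abstract class PaymentProviderBase
{
    // No implementation; every provider must supply its own fee calculation.
    public abstract decimal CalculateFee(decimal amount);

    // Concrete implementation shared by all providers.
    public virtual void Log(string message) => Console.WriteLine(message);
}

public interface IRefundable { void Refund(decimal amount); }
public interface IRecurring { void Schedule(DateTime nextRun); }

// Only one base class is allowed, but any number of interfaces can be implemented.
public class CreditCardProvider : PaymentProviderBase, IRefundable, IRecurring
{
    public override decimal CalculateFee(decimal amount) => amount * 0.03m;
    public void Refund(decimal amount) { /* refund logic */ }
    public void Schedule(DateTime nextRun) { /* scheduling logic */ }
}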

From there, I realized there are candidates who really put a lot of effort into preparing for technical interviews, such as rehearsing answers for those top 50 interview questions from Google results. Ahem… the truth was, I was too lazy to craft any original interview questions back then, so I ended up using questions from those top 50 interview questions, which candidates can easily prepare for. The problem with this was that I ended up evaluating how much preparation work a candidate had done rather than how strong his technical capability was. It was a bad idea to use the top 50 interview questions.

The top 50 interview questions

When you use those top 50 interview questions, not only can you not accurately assess the candidate, you will also push away those developers who really know their stuff. Remember, an interview is about the interviewer and the interviewee viewing each other. Under normal circumstances, a company will put one of their best guys forward to conduct the interview. If the best guy in the company can only conduct an interview based on the top 50 interview questions, it will really make me think twice about whether I want to join the company when the company offers me a job.

In fact, I encountered this once. I was talking to an interviewer at an MNC who had prepared a long list of technical questions. We covered those questions in approximately 30 minutes instead of his normal 60 minutes. At one point, after he asked question A, I knew he would follow up with question B, so I explained the answer for question B along with the answer to question A. At the end of the interview, his feedback was that it was as if I already had the list of questions he was holding. The truth was, I had gone through those questions 5629 times when I was preparing interview questions for my own candidates.

Eventually, I did not take up the offer from the MNC. There were many factors that influenced the decision. One of them was knowing that the best technical guy in the team could only do what I did 2 years ago; it wasn’t very motivating.

I have stopped using those top 50 interview questions. They are for amateurs 🙂

Design pattern

Design patterns have seemed like a favorite topic for discussion during technical interviews in the past few years. The topic got so popular that even recruiters without a computer science background started asking candidates to explain design patterns. It took me by surprise when two HR-looking ladies (they were recruiters) asked me to explain the design patterns I have worked with. I got the feeling they did not understand 9 out of 10 sentences that came out of my mouth, because they never asked any follow-up question based on what I explained. They probably just wanted to see how clearly I could articulate my ideas.

A design pattern is something you implement once and it becomes second nature in your project. Developers do not apply 7 patterns in one go and then revisit them every 3 weeks to evaluate whether they are still appropriate, and if they are not, revamp them and apply another 5 new patterns. This simply does not happen for any software with a real delivery timeline. Most developers will be working with 1-2 patterns on a daily basis. This becomes a breadth and depth issue. The interviewer might be an expert in Adapter and Abstract Factory while the interviewee is an expert in Observer and Singleton. It is not always possible to have an in-depth discussion on all design patterns.

Shouldn’t a good developer know a few more patterns, at least on a theoretical level? Yes, I think that’s a valid point. However, there will still be a gap between the interviewer’s and interviewee’s levels of understanding. For example, the interviewer has been working with Adapter for the last 3 years while the interviewee has only read 3 articles on the Adapter pattern (or vice versa). The level of discussion between interviewer and interviewee on the Adapter pattern is going to be shallow.

The bad news is, some interviewers don’t seem to recognize the breadth and depth gap. Some interviewers insist on discussing rigid details of a specific design pattern. It ends up being an unpleasant experience for both interviewer and interviewee: the interviewee feels inferior for not being able to provide an answer, while the interviewer feels unsatisfied because he cannot have a meaningful technical discussion with the interviewee to assess his technical skill.

The good news is, when design pattern based questions are done right, they give both the interviewer and the interviewee a good discussion to explore areas neither of them might have thought of before as individuals.

Algorithm

This is a very safe approach to use during technical interviews because all programmers are expected to have solid logical thinking. An algorithm is all about combining programming techniques and logical thinking to solve a specific problem. It is a very suitable approach for assessing the interviewee’s ability to solve a problem using code.

Interview questions based on algorithms can be as simple as writing a function to print a few asterisks (*) on the screen, detecting whether an input is an odd or even number, sorting a series of numbers, or printing a calendar. Usually a company that uses algorithm-based questions will have 3 levels of questions: easy, medium and hard. If you want to secure the job, you should at least get the easy and medium ones right. The hard question is there for the interviewer to identify a grand-master coder over a senior coder.

The ironic part about algorithm-based questions is that a lot of candidates tend to shy away from them.

Example 1: About 8 years back, it was still pretty common to have the candidate write down the solution on paper. The question was about a simple string manipulation function. Unfortunately, the candidate, who appeared to be an experienced developer, handed me an empty paper and left with an apologetic tone, saying this job might not be right for him.

Example 2: One company that I know of asks the candidate to code a function that detects whether an integer input is an odd or even number and displays an appropriate message, using a provided laptop with Visual Studio on it. The answer is surprisingly simple: use the modulus operator (%) and put an If check on the remainder. However, it took a candidate applying for a senior developer position 20 minutes of typing a few keystrokes and a few backspaces, then a few more keystrokes and a few more backspaces.
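
For reference, the whole exercise boils down to something like this (a sketch; the company’s exact wording and requirements may differ):

using System;

class Program
{
    static void Main()
    {
        Console.Write("Enter an integer: ");
        int input = int.Parse(Console.ReadLine());

        // The remainder of dividing by 2 decides which message to display.
        if (input % 2 == 0)
            Console.WriteLine("{0} is an even number", input);
        else
            Console.WriteLine("{0} is an odd number", input);
    }
}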

Example 3: Codility has been a handy tool for conducting programming tests online to save everyone’s time. I recently found out about a friend who applied for a Tech Lead position. He was asked to write a function working with a zero-based index in Codility. To my surprise, he could not understand the question. He did not even attempt to write the solution and closed the browser.

It appears that interviewees feel very stressed when the technical interview involves writing algorithms. In the above examples, the questions were not complicated and the answers were not complex. I believe all 3 candidates in the above examples could have done reasonably well if they were not in “technical interview” mode.

In the next article, I will discuss how I conduct technical interviews instead…

Azure WebJob with Azure Queue

Cron jobs are an essential part of complex systems, executing certain scripts or programs at specific time intervals. Traditionally, developers or system administrators create Windows Scheduled Tasks to execute scheduled jobs within the operating system.

In one project, I used to have multiple .exe programs scheduled to update the production database during midnight for various use cases, such as expiring user credit. This gets the job done within the application context, but it is not the cleanest approach when my system administrator needs to take care of 20 other cron jobs coming from different machines and different operating systems.

The next thing I implemented was exposing a cron job through a WCF API endpoint. For example, I opened a WCF API endpoint to trigger the functionality for sending email notifications. This endpoint maps users’ saved criteria against business inventory on a daily basis. (Yes, this is the annoying 9.00AM email spam notification you get every day. Sorry!) The WCF API endpoint does not do anything if no one hits it. It is a simple HTTP endpoint waiting for something to tell it to get up and work.

The reason for exposing the cron job as a WCF API endpoint was to give my system administrator a centralized system to trigger and monitor all the cron jobs in one place, rather than logging into multiple servers (operating systems) to monitor and troubleshoot. This works alright, except that now I have my cron job stuck in a WCF project instead of a simple script or a lightweight .exe program.

Azure WebJob

The next option is Azure WebJobs. Azure WebJobs enable me to run programs or scripts in a web app context as background processes. They run and scale as part of Azure Web Apps. With Azure WebJobs, I can now write my cron job as a simple script or a lightweight .exe rather than a WCF service. My system administrator also gets a centralized interface, the Azure Portal, to monitor and configure all the cron jobs. In fact, it’s pretty cool that I can trigger a .exe program through a public HTTP URL using the Web Hook property of the WebJob.

Azure WebJob goes beyond the traditional cron job definition (timer based). Azure WebJobs can run continuously, on demand or on a schedule.

The following file types are accepted:

  • .cmd, .bat, .exe (using windows cmd)
  • .ps1 (using powershell)
  • .sh (using bash)
  • .php (using php)
  • .py (using python)
  • .js (using node)
  • .jar (using java)

I will use C#.NET to create a few .exe programs to demonstrate how Azure WebJobs work.

Prerequisites (nice to have)

Preferably, you should have the Microsoft Azure SDK installed on your machine. At the time of writing, I’m running Visual Studio 2015 Update 3, so I have the SDK for VS2015 installed using the Web Platform Installer.

Note that this is NOT a must-have. You can still write your WebJob in any of the above-mentioned languages and upload your job manually in the Azure Portal. The reason I recommend installing it is to make your development and deployment much easier.

Working with Visual Studio

If you have the Microsoft Azure SDK installed for Visual Studio, you will see the Azure WebJob template. We will start with a simple Hello World program.

Your WebJob project will be pre-loaded with some sample code. As usual, you will still need to build it once for NuGet to resolve the packages.

The following packages are part of your packages.config if you started the project with the Azure WebJobs template. No big deal if you didn’t; you can install them manually, although it’s a little tedious.

For now, we will ignore the fancy built-in SDK support for various Azure services. We will create a simple Hello World program and deploy it to Azure to get an end-to-end experience.

Remove everything else so that you are left with Program.cs writing a simple message:
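
A minimal sketch of what that pared-down Program.cs can look like (the namespace name is just a placeholder):

using System;

namespace WebJob.HelloWorld
{
    class Program
    {
        static void Main()
        {
            // The console output shows up in the WebJob logs in the Azure Portal.
            Console.WriteLine("Hello World from Azure WebJob at {0} (UTC)", DateTime.UtcNow);
        }
    }
}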

Go to your Azure Portal. Click on Get publish profile to download your publishing profile. You will need to import it into Visual Studio later when publishing your WebJob.

Import your publish profile into your project in Visual Studio. The Import dialog will kick in the first time you publish your project as a WebJob.

Right-click on your project and select “Publish as Azure WebJob”; you will see the following dialog to set up your publishing profile.

Import the publishing profile settings file you downloaded earlier into your WebJob project in Visual Studio.

Validate your connection.

Click on “Publish” to ship your application to Azure.

Upon successful publishing, you should see the message “Web App was published successfully…”

Go to your Azure portal and verify that your WebJob is indeed listed.

Select your WebJob, click on the Logs button on top to see the following page.

The impressive part about using WebJobs in Azure is the following WebJobs monitoring page. You can use this page to monitor the status of multiple WebJobs and drill down deeper into the respective logs. No extra cost, coding or configuration; it all works out of the box!

Now we have our first Hello World application running in Azure. We have deployed our WebJob to run continuously, which means it will get triggered automatically every 60 seconds. Once the first round is completed, the status will change to PendingRestart and wait for the next 60 seconds to kick in.

The WebJobs SDK sample project on GitHub comprehensively demonstrates how you can work with WebJobs through Azure Queue, Azure Blob, Azure Table and Azure Service Bus. In this article, we will do a little bit more coding by using a WebJob to interact with Azure Queue.

Azure WebJob with Azure Queue Storage

The Microsoft.Azure.WebJobs namespace provides the QueueTriggerAttribute. We will use it to trigger a method in our WebJob.

It works like this: whenever a new message is added to the queue, the WebJob is triggered to pick up and process that message.

Before we continue with our code, we first need to create an Azure Storage account to host the queue. Here, I have a storage account named “danielfoo”.

We will use Microsoft Azure Storage Explorer to get a visual on our storage account. It’s a handy tool to visualize your data. If you do not have it, no worries, just imagine the queue message in your mind 🙂

Let’s add a new console application project in our solution to put some messages in our queue.

There are two packages that you’ll need to install into your queue project:

We will write the following simple code to initialize a queue and put a message into the queue.
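
A sketch of what that console application can look like, assuming WindowsAzure.Storage is one of the two packages mentioned above (the queue name “danielqueue” matches the one used later):

using System;
using System.Configuration;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

namespace Queue.MessageGenerator
{
    class Program
    {
        static void Main()
        {
            // Read the storage connection string configured in app.config.
            var connectionString = ConfigurationManager.AppSettings["StorageConnectionString"];
            var storageAccount = CloudStorageAccount.Parse(connectionString);

            // Create the queue if it does not exist yet, then drop a message into it.
            var queueClient = storageAccount.CreateCloudQueueClient();
            var queue = queueClient.GetQueueReference("danielqueue");
            queue.CreateIfNotExists();

            queue.AddMessage(new CloudQueueMessage("Hello from Queue.MessageGenerator at " + DateTime.UtcNow));
            Console.WriteLine("Message added to danielqueue.");
        }
    }
}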

Of course, you will have to configure your StorageConnectionString in app.config so that the code can pick up the connection string.
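
Roughly like this, assuming the code reads the value from appSettings as in the sketch above (the account name and key are placeholders for your own values):

<configuration>
  <appSettings>
    <add key="StorageConnectionString"
         value="DefaultEndpointsProtocol=https;AccountName=[your account name];AccountKey=[your account key]" />
  </appSettings>
</configuration>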

You can get your account name and key from Azure Portal.

Let’s execute our console application to test if our queue can be created and whether a message can be placed into the queue properly.

After execution, look at Storage Explorer to verify if the message is already in the queue.

Now we will dequeue this message so that it will not interfere with the actual QueueTrigger in our exercise later.

Next, we will create a new WebJob project that gets triggered whenever a message is added to the queue, using the QueueTriggerAttribute from the Microsoft.Azure.WebJobs namespace.

This time we do not remove Functions.cs nor modify Program.cs.

Make sure that the method parameter in your Functions.cs contains the same queue name as the one you defined earlier in your Queue.MessageGenerator project. In this example, we are using the name “danielqueue”.
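
The template-generated Functions.cs looks roughly like this (reproduced from memory, so minor details may differ from the current template); the only change needed is the queue name:

using System.IO;
using Microsoft.Azure.WebJobs;

namespace WebJob.QueueTrigger
{
    public class Functions
    {
        // Fires whenever a new message lands in "danielqueue".
        public static void ProcessQueueMessage([QueueTrigger("danielqueue")] string message, TextWriter log)
        {
            log.WriteLine(message);
        }
    }
}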

Program.cs
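
The template-generated Program.cs, left unchanged, looks roughly like this:

using Microsoft.Azure.WebJobs;

namespace WebJob.QueueTrigger
{
    class Program
    {
        static void Main()
        {
            // The JobHost discovers the triggered functions in Functions.cs and blocks while listening.
            var config = new JobHostConfiguration();
            var host = new JobHost(config);
            host.RunAndBlock();
        }
    }
}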

Remember to fill in the following connection strings in your App.config. This is how the WebJob knows which storage account to monitor.
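
Something along these lines; both entries point to the storage account that hosts “danielqueue”, and the values are placeholders:

<connectionStrings>
  <add name="AzureWebJobsDashboard"
       connectionString="DefaultEndpointsProtocol=https;AccountName=[your account name];AccountKey=[your account key]" />
  <add name="AzureWebJobsStorage"
       connectionString="DefaultEndpointsProtocol=https;AccountName=[your account name];AccountKey=[your account key]" />
</connectionStrings>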

Now, let’s start the WebJob.QueueTrigger project as a new instance and let it wait for a new message to be added to “danielqueue”.

Then, we will start the Queue.MessageGenerator project as a new instance to drop a message into the queue for WebJob.QueueTrigger to pick up.

Yes! Our local debug session has detected that a new message was added to “danielqueue” and has hit the ProcessQueueMessage function.

Let’s publish our WebJob.QueueTrigger to Azure to see it processing the queue message in the Azure context instead of on the local machine. After successful publishing, we now have 2 WebJobs.

Select QueueTrigger (the WebJob we just published) and click on the Logs button at the top. You will see the following log of queue message processing.

If you drill down into a particular message, you will be redirected to the Invocation Details page.

We have just set up our WebJob to work with Azure Queue!

That wraps up everything I want to show you in working with Azure WebJob and Azure Queue.

Obviously, in reality you will write something more complex than simply outputting the log in your WebJob. You may write some logic to perform a certain task. You may even use this to trigger another, more complex job sitting in another service.

In the queue, obviously you also wouldn’t write a literal “message” like I did. You will probably create one queue for a very specific purpose. For example, you could create a queue to store a list of IDs, where each ID is required by another type of process such as indexing. The consumer would then index the entities (represented by the IDs) in batches (let’s say 4 messages at a time) instead of generating a large surge of load in a short period of time.

A few more thoughts…

  1. By default, JobHostConfiguration.QueuesConfiguration.BatchSize handles 16 queue messages concurrently. I recommend overriding the default with a smaller value (let’s say, 4) to ensure the other end which does the heavier processing (for example indexing a document in Solr or Azure Search) is able to handle the load. The maximum value for JobHostConfiguration.QueuesConfiguration.BatchSize is 32. If having the WebJob handle 32 messages at a go is not sufficient for you, you can further tweak the performance by setting a short JobHostConfiguration.QueuesConfiguration.MaxPollingInterval time to make sure you do not accumulate too many messages before the processing kicks in (see the sketch after this list).
  2. If for whatever reason you have maxed out the built-in configuration (such as BatchSize, MaxPollingInterval) and it is still not good enough, a quick win is to scale up your Web App. Note that you cannot scale your WebJob alone because the WebJob sits within the context of a Web App. If scaling up the Web App for the sake of the WebJob sounds inefficient, consider migrating your jobs to a Worker Role.
  3. WebJobs are good for lightweight processing. They are good for tasks that only need to be run periodically, scheduled, or triggered. They are cheap and easy to set up and run. Worker Roles are good for more resource-intensive workloads or if you need to modify the environment where they are running (for example, the .NET framework version). Worker Roles are more expensive and slightly more difficult to set up and run, but they offer significantly more power when you need to scale. There is a pretty comprehensive blog post by Kloud comparing WebJobs and Worker Roles.
  4. Azure Storage Queue has no guarantee on message ordering. In other words, a message placed into the queue first does not necessarily get processed first. Delivery for Azure Queue is At-Least-Once but not At-Most-Once. In other words, a message can potentially be processed more than once. The application code will need to handle the duplication of whatever happens after a message is picked up. If this troubles you, you should consider Service Bus Queue. Its ordering is First-In-First-Out (FIFO) and delivery is At-Least-Once and At-Most-Once. If you are wondering why people still use Azure Storage Queue, it is because Storage Queue is designed to handle super large scale queuing. For example, the maximum queue size for Storage Queue is 200 TB while Service Bus Queue is 1 GB to 80 GB; the maximum number of queues for Storage Queue is unlimited while for Service Bus Queue it is 10,000. For a complete comparison reference, please refer to the Microsoft docs.
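
Here is a sketch of overriding the queue defaults from point 1. In the SDK version I used, these settings are exposed through JobHostConfiguration.Queues; the values below are illustrative and should be tuned against what the downstream processing can handle:

using System;
using Microsoft.Azure.WebJobs;

class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();
        config.Queues.BatchSize = 4;                                 // default is 16, maximum is 32
        config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(10); // shorten the wait between polls
        new JobHost(config).RunAndBlock();
    }
}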

I hope you have enjoyed reading this article. If you find it useful, please share it with your friends who might benefit from this. Cheers!

Tech Talk 2016

2016 has been a fruitful year for me in software development. Apart from my day job at Sitecore as a lead developer, I also had a lot of fun with services in the Azure cloud in my spare time.

Another new “adventure” I tried in 2016 was speaking at Tech Talks in local tech communities – from my own office, a university and the Microsoft office, to being an online panelist in a Google Hangout discussion.

Scaling SQL Server @ Microsoft Malaysia Level 26, Tower 3

Software Industry Career Advice @ Multimedia University, Computing Faculty Lecture Hall, Cyberjaya

Working with Azure Search @ Sitecore Malaysia, Level 18

Standardizing DevOps Across Organization @ Continuous Discussions (#c9d9) Google Hangout

It has been a rewarding experience to contribute and make a difference in the tech communities through Tech Talks. It amazed me when some of the audience asked me follow-up questions based on what I had shared earlier. It is truly satisfying to learn that I have inspired some of the audience with useful information or ideas they can benefit from.

It has been an honor to share the stage with many knowledgeable speakers. As much as I enjoyed sharing, I have learned just as much from them. Thanks to those who invited me over to speak, those who helped make the Tech Talks possible and those who came and supported me. Thank you! I am truly lucky to have you all, amazing people, as friends in the tech communities.

Signing off for 2016… It has been a fantastic year, looking forward to 2017!

Standardizing and Scaling DevOps Across the Organization

Large, complex organizations implementing DevOps find that they need to balance between supporting small teams and an agile way of working, and finding ways to standardize their DevOps processes, tooling and implementation — in order to scale DevOps throughout the organization (and save costs).

On Tuesday I participated in an online panel on the subject of Standardizing and Scaling DevOps Across the Organization, as part of Continuous Discussions (#c9d9), a series of community panels about Agile, Continuous Delivery and DevOps. Watch a recording of the panel:

Continuous Discussions is a community initiative by Electric Cloud, which powers Continuous Delivery and Release Management at businesses like SpaceX, Cisco, GE and E*TRADE by automating their build, test and deployment processes.

Working with Azure Search in C#.NET

What is Azure Search

Azure Search is a service in the Microsoft Azure cloud platform providing indexing and querying capabilities. Azure Search is intended to provide developers with complex search capabilities for mobile and web development while hiding infrastructure requirements and search algorithm complexities.

I like the idea of having an index service provider as part of my solution, as it allows developers to perform searches quickly and effortlessly without going through the pain of writing essay-length SQL queries.

The key capabilities of Azure Search include scalable full-text search over multiple languages, geo-spatial search, filtering and faceted navigation, type-ahead queries, hit highlighting, and custom analyzers.

Why Azure Search (or index provider in general)

  1. It’s Brilliant – Having an index provider sitting between the application and the database is a great way to shield the database from unnecessary or inefficient Read requests. This allows database I/O and computing power to be reserved for operations that truly matter.
  2. It’s Modern – The traditional search approach requires developers to write essay-length SQL queries to retrieve simple aggregated data. An index provider such as Azure Search allows developers to initiate a search request without writing a complicated search algorithm (for example, geo-spatial search), and no complex SQL query is required (for example, faceted navigation).
  3. It Scales – The solutions that benefit the most from an index provider are relatively large enterprise systems. Such systems often require scaling to a certain extent. Scaling Azure Search in the Microsoft Azure cloud platform is several clicks away, compared to maintaining an on-premise service provider such as Solr.
  4. It’s FREE – Microsoft Azure provides a free tier for Azure Search. Developers can create and delete Azure Search services for development and self-learning purposes.

Use Case

A car classified site is a good example of a system that can make use of an index service provider such as Azure Search. I will use a local car classified site, Carlist.my, to illustrate several features the system can potentially tap into.

Type-ahead Queries allow developers to implement auto-suggestions as the user types. In the following example, as I was typing “merce”, the site returned a list of potential Mercedes-Benz car models that I might be interested in.

Facets allow developers to retrieve aggregated data (such as car make, car model, car type, car location, car price range, etc.) without writing complex SQL queries, which also means saving Read load on the database. In the following example, the site returns a list of Mercedes-Benz models with the count in brackets to indicate how many classified ads are available for each specific model.

Filters allow developers to retrieve documents that fit the search criteria without writing complex SQL queries with endless INNER JOIN and WHERE clauses. In the following example, I specified that I want all used Mercedes-Benz cars, model E-Class E200, variant Avantgarde, from Kuala Lumpur, within the price range of RM100,000 to RM250,000. You can imagine the kind of INNER JOIN and WHERE clauses the poor developer has to construct dynamically if this were to be retrieved from a database directly.

Another feature the car classified site could potentially tap into is Geo-spatial Search, although it does not appear to be implemented. For example, if I were to search for a specific car at a dealership, the portal could suggest similar cars from other dealerships near the dealership I’m looking at. That way, when I make a trip to visit a dealership, I can also visit other nearby dealerships that have similar cars.

Using C#.NET to work with Azure Search

Let’s roll up our sleeves and hack some code. I will be using a C#.NET console application to illustrate how we can design and create an index, upload documents into the index and perform several types of searches on the index. This solution will be used to simulate some of the code a car classified portal could potentially need.

First, we create a solution named AzureSearchSandbox.

We will need the “Microsoft.Azure.Search” NuGet package from NuGet.org. Open your Package Manager Console and run the following command:
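
The command is simply the package name (add -Version if you want to pin a specific release):

Install-Package Microsoft.Azure.Search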

Upon successful installation, you will see that several NuGet packages have been added to the packages.config file in your solution.

Note that you only need to install “Microsoft.Azure.Search”; the other packages are dependencies and are resolved automatically.

In your solution, add a new class Car.cs

This Car object will be used to represent your Azure Search document.
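
A sketch of the document shape used in this walkthrough (the exact fields in my repo may differ slightly; Category and Price are the fields used for the filter and facet examples later):

public class Car
{
    public string Id { get; set; }        // key field of the index
    public string Make { get; set; }      // e.g. "Perodua", "Mercedes-Benz"
    public string Model { get; set; }
    public string Category { get; set; }  // e.g. "Hatchback", "Sedan"; used for filters and facets
    public int Price { get; set; }        // used for range filters and facets
}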

Next, we create a static Helper.cs class to take care of the initialization of the index in Azure Search.

Initialize() is the main method to kick-start our Azure Search index. Unlike other on-premise index services that require a certain amount of setup, such as Solr, it doesn’t take long to get our index up and running. With Solr, I have to install Solr using NuGet, install the right Java version, set the environment variable and finally create a core in the Solr admin portal. With Azure Search, no upfront setup is required.

The index is created in the CreateIndex() method, where we tell the Azure Search client SDK that we want an index with the fields we define in our Index object.

To ensure this code runs against a fresh index, we have the DeleteIfIndexExist() method to remove any previous index. We call it right before the CreateIndex() method.
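
Here is a sketch of how Helper.cs can be put together with the Microsoft.Azure.Search SDK of that era; the method names follow the ones described above, while the index and field names are assumptions chosen to match the Car class:

using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public static class Helper
{
    private const string IndexName = "car";

    // Creates (or recreates) the index and returns a client scoped to it.
    public static ISearchIndexClient Initialize(string searchServiceName, string apiKey)
    {
        var serviceClient = new SearchServiceClient(searchServiceName, new SearchCredentials(apiKey));

        DeleteIfIndexExist(serviceClient);
        CreateIndex(serviceClient);

        return serviceClient.Indexes.GetClient(IndexName);
    }

    private static void DeleteIfIndexExist(SearchServiceClient serviceClient)
    {
        if (serviceClient.Indexes.Exists(IndexName))
        {
            serviceClient.Indexes.Delete(IndexName);
        }
    }

    private static void CreateIndex(SearchServiceClient serviceClient)
    {
        var definition = new Index
        {
            Name = IndexName,
            Fields = new[]
            {
                new Field("Id", DataType.String)       { IsKey = true },
                new Field("Make", DataType.String)     { IsSearchable = true },
                new Field("Model", DataType.String)    { IsSearchable = true },
                new Field("Category", DataType.String) { IsSearchable = true, IsFilterable = true, IsFacetable = true },
                new Field("Price", DataType.Int32)     { IsFilterable = true, IsFacetable = true }
            }
        };

        serviceClient.Indexes.Create(definition);
    }
}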

Next, we add a new class Uploader.cs to deal with the documents we are about to upload into our Azure Search index.

PrepareDocuments() is a simple method to construct a list of dummy Car objects for our searches later on.

The Upload() method gets the dummy Car objects from PrepareDocuments() and passes them to the Azure Search client SDK to upload the documents into the index as a batch. Note that we added a 2000 millisecond sleep to allow the service to upload and process the car documents properly before moving on to the next part of the code, which is search. However, in a practical sense, we would not want a sleep in our upload code. Instead, the component that takes care of searching should expect the index not to be available immediately. We also catch IndexBatchException to handle the documents in case the batch upload fails. In this example, we merely output the index key. In a practical sense, we should implement a retry or at least log the failed documents.
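
A sketch of Uploader.cs; the handful of dummy cars below are illustrative only (the list in my repo is longer), but they line up with the search results described later, for example the two Perodua documents:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public static class Uploader
{
    public static List<Car> PrepareDocuments()
    {
        return new List<Car>
        {
            new Car { Id = "1", Make = "Perodua", Model = "Myvi", Category = "Hatchback", Price = 45000 },
            new Car { Id = "2", Make = "Perodua", Model = "Bezza", Category = "Sedan", Price = 40000 },
            new Car { Id = "3", Make = "Mercedes-Benz", Model = "E200", Category = "Sedan", Price = 250000 },
            new Car { Id = "4", Make = "Volkswagen", Model = "Golf", Category = "Hatchback", Price = 160000 }
        };
    }

    public static void Upload(ISearchIndexClient indexClient)
    {
        var batch = IndexBatch.Upload(PrepareDocuments());

        try
        {
            indexClient.Documents.Index(batch);
        }
        catch (IndexBatchException ex)
        {
            // In a real system, retry or at least log the failed documents instead of just printing the keys.
            Console.WriteLine("Failed to index: {0}",
                string.Join(", ", ex.IndexingResults.Where(r => !r.Succeeded).Select(r => r.Key)));
        }

        // Demo only: give the service a moment to make the fresh documents searchable.
        Thread.Sleep(2000);
    }
}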

Once the index upload operation is completed, we add another class, Searcher.cs, to take care of the searching capability.

The SearchDocuments() method handles the searching mechanism on the index we created earlier. No fancy algorithm, just passing specific instructions to the Azure Search client SDK on what we are looking for and displaying the results. In this method, we take care of simple text search, filters and facets. There are many more capabilities the Azure Search client SDK can provide. Feel free to explore the SearchParameters and response objects on your own.
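
A sketch of Searcher.cs; the method shape (one method covering text search, filter and facets) is an assumption on my part, but the SDK calls are the standard Documents.Search<T>() with SearchParameters:

using System;
using System.Collections.Generic;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public static class Searcher
{
    public static void SearchDocuments(ISearchIndexClient indexClient, string searchText,
        string filter = null, IList<string> facets = null)
    {
        var parameters = new SearchParameters();
        if (!string.IsNullOrEmpty(filter)) parameters.Filter = filter;   // OData syntax, e.g. "Category eq 'Hatchback'"
        if (facets != null) parameters.Facets = facets;                  // e.g. new[] { "Category" }

        var results = indexClient.Documents.Search<Car>(searchText, parameters);

        foreach (var result in results.Results)
        {
            Console.WriteLine("{0} {1} ({2}) - RM{3:N0}",
                result.Document.Make, result.Document.Model, result.Document.Category, result.Document.Price);
        }

        // Facet results come back as value/count pairs, e.g. Hatchback (2), Sedan (2).
        if (results.Facets != null)
        {
            foreach (var facet in results.Facets)
            {
                foreach (var value in facet.Value)
                {
                    Console.WriteLine("{0}: {1} ({2})", facet.Key, value.Value, value.Count);
                }
            }
        }
    }
}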

Putting them all together in Program.cs

First, we define the index service name and API key to create a search index client instance. The instance is returned by Helper.Initialize(). We will make use of this instance for both searching and uploading later.

After initializing the index, we call the Upload() method to upload some dummy car documents to the index.
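
Putting the pieces together, Program.cs ends up as a sketch along these lines (the service name and API key are placeholders, and the three Searcher calls correspond to the three searches described below):

class Program
{
    static void Main()
    {
        // Replace with your own Azure Search service name and admin API key.
        var indexClient = Helper.Initialize("[your service name]", "[your api key]");

        Uploader.Upload(indexClient);

        Searcher.SearchDocuments(indexClient, "Perodua");                                    // 1. simple text search
        Searcher.SearchDocuments(indexClient, "*",
            filter: "Category eq 'Hatchback' or (Price gt 100000 and Category eq 'Sedan')"); // 2. filter
        Searcher.SearchDocuments(indexClient, "*", facets: new[] { "Category" });            // 3. facets
    }
}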

Next, we perform the following searches:

  1. Simple text search. We will search for the text “Perodua” in the documents.

The result is as follows: the Azure Search index returns 2 documents which contain the keyword “Perodua”.

2. Using Filter. A more targeted and efficient approach to looking for documents. In the following example, first we look for documents whose Category field equals ‘Hatchback’; second, we look for documents whose Price field is greater than 100,000 and whose Category is ‘Sedan’. More details on how to write the filter expression syntax can be found in the Azure Search documentation.

The result is as follows: cars with the category of Hatchback, plus cars that cost more than 100,000 and are in the Sedan category.

3. Searching facets. With facets, developers no longer need to write long queries that combine Count() and endless GROUP BY clauses.

If we had a traditional database table representing the dummy car data, this would be equivalent to “SELECT Category, Count(*) FROM Car GROUP BY Category”.

The result is as follows:

This example might not look like a big deal if you have a small data set or a simple data structure. Facets are super handy and fast when you have a large amount of data and your query is getting more complex. The ability to define which facets are required in C# code makes the “query” much cleaner and easier to maintain.

One last thought…

You can clone the above source code from my GitHub repo. I will be happy to accept a Pull Request if you want to demonstrate how to make use of other capabilities in the Azure Search client SDK. Remember to change the API key and service name before compiling. Have fun hacking Azure Search!

Scaling SQL Server

Scaling a database refers to the ability to serve significantly more requests to both read and write data without compromising performance. In many enterprise applications, the performance bottleneck often happens at the database, hence scaling the database is a critical part of improving system performance. In the last article, on Microservices, we discussed scaling a database horizontally and vertically at a high level. In this article we will talk more in-depth about:

  1. Three types of replication in SQL Server
  2. Distributing database load using log shipping
  3. Tools to shield your database from being hit.

3 Types of Replication

In typical enterprise applications, Read requests significantly outnumber Write requests. By implementing Replication, you effectively offload the bulk of the Read requests to the Subscribers while reserving the Publisher for writing.

Transactional Replication

Transactional Replication is the simplest form of replication to understand and to implement. Transactional Replication is implemented by having a Publisher publish the changes, while one or more Subscribers replay the transaction log. Data changes and schema modifications made at the Publisher are delivered to the Subscriber(s) as they occur (almost in real time). In this way, transactional consistency is guaranteed.

The incremental changes in the Publisher are propagated to the Subscribers as they occur. If a row changes 3 times in the Publisher, the Subscriber will also change 3 times. It is not just the net data change that gets propagated over.

For example, if a row in the Product table changes price three times, from $1.00 to $1.10, to $1.20 and finally to $1.30, transactional replication allows an application to respond to each change, perhaps sending a notification to the user when the price hits $1.20. It is not simply the net data change to the row from $1.00 to $1.30. This is ideal for applications that require access to intermediate data states, for example a stock market price alert application that tracks near real-time stock price changes to send price alerts to users.

Is it possible to scale your Publisher horizontally? Yes, Bidirectional Transactional Replication and Peer-to-Peer Transactional Replication will help you achieve it. However, Microsoft strongly recommends that write operations for each row be performed at only one node, for 2 reasons. First, if a row is modified at more than one node, it can cause a conflict or even a lost update when the row is propagated to other nodes. Second, there is always some latency involved when changes are replicated. For applications that require the latest change to be seen immediately, dynamically load balancing the application across multiple nodes can be problematic.

From experience, the optimal solution is to scale your Subscribers horizontally by having multiple nodes. Keep your Publisher on one node and scale the Publisher vertically, when you really have to.


Merge Replication

In Merge Replication, the Subscriber synchronizes with the Publisher when connected to the network and exchanges all rows that have changed between the Publisher and the Subscriber since the last synchronization occurred. You may see Merge Replication as a batch update from Subscriber to Publisher that propagates only the net data changes. For example, if a row changes five times at a Subscriber before it synchronizes with the Publisher, the row will only change once at the Publisher to reflect the net data change (which is the 5th value). Then, the unified changes in the Publisher will be propagated back to the other Subscribers.

Merge Replication is suitable for situations where Subscribers need to receive data, make changes offline, and later synchronize those changes with the Publisher and other Subscribers. An example is a nationwide POS (point of sale) system where retail branches are spread across multiple physical locations. The retail branches first initialize a snapshot from the Publisher database, then make local offline changes (for example, through sales) to the Subscriber database. The sales are not required to be propagated back to the Publisher immediately. The many other retail branches also do not need immediate updates on the changes that happened at another retail branch, although a more recent update would be beneficial, for example knowing whether a nearby branch has the stock that the local branch has run out of, in order to recommend customers where to go. Once a day or multiple times a day, depending on business needs, the sales numbers in the retail branches (Subscribers) are propagated back to HQ (Publisher), and the executives in the HQ office can view the daily sales report.

[Image: merge-replication]

Conflicts can and will happen in Merge Replication. The good news is that conflicts are resolved without user intervention because SQL Server has a built-in mechanism to resolve conflicts on data changes. However, if you have unique use cases where you want to ensure SQL Server is doing exactly what you intended while resolving a conflict, you can view the conflict in the Microsoft Replication Conflict Viewer and modify the outcome of the resolution.

Snapshot Replication

Snapshot Replication propagates data exactly as it appears at a specific moment in time and does not monitor for updates to the data. When synchronization occurs, the entire snapshot is generated and sent to the Subscribers. In simpler terms, Snapshot Replication takes a snapshot of the Publisher's data state and overwrites it at the Subscribers. No conflict can happen because this replication simply overwrites the whole data set. Snapshot Replication is also used in both Transactional Replication and Merge Replication to initialize the database at the Subscribers.

[Image: snapshot-replication]

Snapshot Replication is suitable for systems where:

  • Subscribers deal with data that does not change frequently.
  • Subscribers do not require the most recent set of data for a long time.
  • The data set is small.

I was working on a small project with an advertising agency where the requirement was to analyze the trend of discussion and sentiment related to telcos in a community forum. I quickly hacked together some web scraping code to scrape what I needed and populate it into my database. As a result, I got 730 discussion topics and the total size of the database was about 5 MB. From there I needed to write more algorithms to work out the trend and sentiment of the discussion. During the development of my algorithms, I did not need the most up-to-date dataset reflecting what was happening in the live forum, so I happily worked (read) against a Subscriber node to develop and test my algorithms. A few weeks later, when the forum had many more discussion topics, I simply replicated the changes from the Publisher to my Subscriber in a fairly short amount of time.

On production, knowing the analytical code reads the database extensively, I pointed my code at the Subscriber node. My Publisher node does not get any hits apart from the necessary insert operations from my web scraping service. Since my users do not need to know what was being discussed on a daily or real-time basis (because an accurate trend and sentiment require months and months worth of data), I configured my Snapshot Replication to happen on a monthly basis, and the replication completes in a fairly short amount of time. Every month the users get a fresh copy of the trend and sentiment report based on at least a year's worth of backdated forum discussion data, without any performance degradation.

By implementing Snapshot Replication, I let my web scraping service write to the Publisher without worrying about whether anyone needs the database for reading to generate a report at any particular time. Through the Subscriber, I have also laid the foundation to generate much more sophisticated reports without degrading performance, by spinning up more Subscriber nodes when and if I need to.

Log Shipping

Log Shipping is used to automatically send transaction log backups from a primary database to one or more secondary databases. The primary and secondary databases should sit on different nodes. The transaction log backups are applied to each of the secondary databases. Log Shipping is often used as part of a disaster recovery strategy, but creative database administrators often use Log Shipping for various other purposes *wink*.

In one of the projects I worked on, the SQL Server database was storing 130k registered users and their related activities such as payment history, credit spending history, Account & Contact relationships, products, login audits, etc. The company was at a rapid expansion stage and the CFO decided it was time to bring in a Business Intelligence guy to churn out reports that give a sense of how the business is doing on a daily basis. The obvious thing to do here was to replicate the database for the BI colleague to run his heavy queries against, because running the reporting queries on the production database was going to kill the poor database. The most suitable type of replication would have been Transactional Replication. However, the challenge was that Replication required SQL Server Enterprise Edition and we were running on SQL Server Standard Edition. The bad news was that the company did not have a lot of cash lying around at our disposal, so we had to find an alternative. After getting the green light from the CTO, I implemented Log Shipping for BI reporting. In essence, I was “scaling” the SQL Server using a disaster recovery technique: offloading the reporting query load to a secondary database by replaying the transaction log, to simulate Transactional Replication on an interval basis. It was the best option we had to satisfy the various stakeholders while keeping the cost low.

[Image: log-shipping]

You can use this trick as long as your client does not require real-time data. Take note of the following practical issues while “scaling” your SQL Server using Log Shipping.

  1. Understand that Log Shipping is a three-step process. First, back up the transaction log at the primary server instance. Second, copy the transaction log file to the secondary server instance. Third, restore the log backup on the secondary server instance. From experience, the third step, Restore, is the most fragile and the one that breaks most often. To find out why a Restore fails, go to SQL Server Agent and view the job history for the detailed error message.
  2. Understand your transaction log backup cycle. If another agent or service is also automating Transaction Log backups, your Log Shipping will stop working fairly quickly (note: not immediately). Log Shipping works by taking all the transaction log records since the last backup and then truncating the log. It is important for SQL Server to match the log sequence. If another Transaction Log backup happens somewhere else, the tail of your newest Transaction Log will not match the head of the last one and the restore will fail. If you need to use Log Shipping, disable or stop all other Transaction Log backups.
  3. Move your agent job interval up gradually. Note that the Backup, Copy, and Restore jobs are run by SQL Server Agent on an interval. When you are setting up your Log Shipping, after the initial full database backup is restored at the secondary server, your Transaction Log is ready for action. When your Backup, Copy, and Restore jobs kick in depends on the interval you set while configuring your Log Shipping. I recommend you set a very short interval in the beginning so that you can spot failures in your setup quickly. You do not want to wait 6 hours for your Backup, Copy, and Restore jobs to kick in only to find out they failed, make some changes, and then wait another 6 hours. I always start with 1 minute, then 5 minutes, then move to the actual time frame depending on business requirements. In my case, it was 2 hours.
  4. Enable Standby mode so that your client can read the data. I highly recommend checking “Disconnect user in the database when restoring backups.” The SQL Server Agent Restore job will handle the Transaction Log intelligently by replaying the log that was missed previously. However, as I mentioned earlier, the Restore job is the most fragile step. Sometimes when the Restore job breaks, you have to set up the whole log shipping mechanism again. Imagine a database size of 300 GB (the actual size I was working with); it is pretty painful to wait for the whole process to complete. Hence, to ensure the integrity of the Transaction Log sequence, I would rather terminate all open connections to ensure my Restore step can be executed successfully. [Image: standby-mode]
  5. Monitor your Log Shipping continuity. Again, Log Shipping is pretty fragile, especially if this is the first time you are doing it. You need some mechanism to monitor your Log Shipping to ensure it is still running as expected 6 months down the road. You can open SQL Server Agent to view the job history and check that the jobs are all green, or configure an Alert Job to raise an alert when a job does not complete successfully. But personally, nothing is more assuring than knowing the data actually changes in the secondary database during a specific time frame. What I do is monitor a table column that is supposed to change after the Transaction Log is replayed; see the sketch after this list. In my case, I monitor the latest user login time because I know this table is updated fairly frequently and the probability that no one logged in during the last 2 hours is close to zero. I make sure that every 2 hours the value in that column changes. If you do not have a user login audit, you can make use of any table whose data will most likely change, for example the CreatedOn column of your busiest transaction table.
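For the monitoring in tip 5, a minimal sketch could be a small PowerShell script run on a schedule against the standby database. The instance, database, table, and column names below are hypothetical, and it assumes the SqlServer module (for Invoke-Sqlcmd) and an SMTP relay are available.

# Check that the secondary (standby) database actually received new data recently.
Import-Module SqlServer

$secondary = "SECONDARY-SQL01"   # hypothetical secondary server instance
$database  = "MyAppStandby"      # hypothetical standby database

$latest = Invoke-Sqlcmd -ServerInstance $secondary -Database $database `
    -Query "SELECT MAX(LastLoginTime) AS LatestLogin FROM dbo.UserLoginAudit;"

# If nothing changed in the last 2 hours, assume the Restore job has silently broken.
if ($latest.LatestLogin -lt (Get-Date).AddHours(-2)) {
    Send-MailMessage -SmtpServer "smtp.example.com" -From "monitor@example.com" -To "dba@example.com" `
        -Subject "Log Shipping may have stopped" `
        -Body "Latest replayed login on $secondary is $($latest.LatestLogin)."
}

Run it from a scheduled task (or a SQL Server Agent job on another instance) at the same cadence as your Restore job.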

Shielding your database

Shielding your database is a great way to keep your database load low so that you can serve more requests. But if you shield the database, where does the data come from? The data has to come from somewhere, and that somewhere is known as an index provider. I’m not referring to building more indexes within SQL Server, because indexes served from SQL Server still add load to SQL Server, not to mention additional storage on disk. The index providers I am referring to are fast-searching services such as Solr, Elasticsearch, Azure Search, or even Redis.

This approach drastically offloads reads from the database to another service that is built for super fast searching. Not only is the response time much faster, you also save the poor programmers from writing essay-length queries to retrieve data.

Example 1: You have a clean database in third normal form. With SQL Profiler you observe that clients have been issuing long queries, such as a query with 15 JOINs and 5 GROUP BYs on very large tables. Not only does that query result come back slowly, it also drags down the server performance and other queries are badly affected. You have asked your best SQL query guru to review the queries, studied the execution plans, and revisited the data structure, and the conclusion is that this is really what it takes to get the desired result. So, what do you do now?

An index provider comes to the rescue when you design your index schema around your query. The joins are no longer required because the index flattens the defined fields during indexing. The grouping is also no longer required because that information comes from facets. Instead of the JOIN and GROUP BY happening at the SQL Server level, you get the data from the index lightning fast, as a simple read, provided you have designed your index schema properly.
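As a rough sketch of the idea, assume the flattened documents live in a hypothetical Elasticsearch index named orders, with fields such as customerCountry and productCategory (all names here are made up). The JOIN and GROUP BY collapse into a single search request, where a terms aggregation plays the role of the GROUP BY:

# Query the hypothetical "orders" index instead of issuing the 15-JOIN query against SQL Server.
$body = @{
    query = @{ term = @{ customerCountry = "MY" } }                            # replaces the WHERE + JOINs
    aggs  = @{ byCategory = @{ terms = @{ field = "productCategory" } } }      # facet instead of GROUP BY
    size  = 10
} | ConvertTo-Json -Depth 5

$result = Invoke-RestMethod -Uri "http://localhost:9200/orders/_search" `
    -Method Post -ContentType "application/json" -Body $body

$result.aggregations.byCategory.buckets | Select-Object key, doc_count

The same shape of request works against Solr or Azure Search with their own query syntax; the point is that the expensive relational work is paid once, at indexing time.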

[Image: index-example-1]

Example 2: You are working with 2 microservices where each service has its own database keeping different domain data. Your application requires data from both services and both of them are responding to you slowly. Again, after reviewing the requests with the respective microservice owners, you all come to the conclusion that both the code and the SQL statements are already in their most optimized form. What do you do now?

An index provider comes to the rescue when you index the data from both microservices into one index provider and query the index provider instead. Not only do you bypass the two slow microservices, you are also querying one index instead of multiple databases. Of course, this is easier said than done, because it now requires the two microservices to index the required data, plus additional effort to maintain the index provider.

[Image: index-example-2]

Here are some practical tips you can consider while implementing an index provider to shield your database from getting hit:

  1. Initial indexing will take a lot of time, especially when it pulls from various sources. This is one of the reasons why some deployments take a long time. A trick to overcome this is to not rely on the on-the-fly indexing API calls provided by the index provider. Export your data as .csv and import it for indexing; see the bulk-import sketch after this list. Do it manually if your data is large enough, and automate the process if full reindexing happens on a recurring basis.
  2. Minimize your index schema changes. Every schema change will require you to reindex your indexes, which usually takes time and leads to longer downtime. The key is, when designing your schema, to think holistically about the current use case and potential use cases instead of designing a schema for just one client and one use case. At the same time, you do not want to design an overly generic schema, or it will be inefficient. The art is in finding this balance, and the way to do it is to understand your domain and context well before designing your schema.
  3. Ensure your stakeholders are aware that the index is a reflection of your database. Your database remains the single source of truth. Expect delay in your index; eventual consistency is what you aim to achieve. Often it is acceptable to have a few seconds or even minutes of delay, depending on how critical the data is. Ensure you get that acknowledgement from your stakeholders.
  4. Queue your indexing. A simple phone number update in the database could result in hundreds of reindexing requests. Have a queueing mechanism to protect your index provider from a sudden surge of reindexing requests.
  5. Take note of strongly typed fields. An index provider such as Solr can treat everything as a string, while an index provider such as Azure Search has strongly typed fields (Edm.Boolean, Edm.String, Edm.Int32, etc.). If you plan to switch index providers in the future, take care of the data types from day one, or you will end up building an additional mapping layer to deal with the data types later on.
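For tip 1, a minimal bulk-import sketch, assuming a local Solr core named products and a products.csv export whose header row matches the index schema (both hypothetical), could be as simple as posting the file to Solr's update handler, which accepts CSV directly:

# Bulk-import a CSV export into Solr instead of calling the per-document indexing API.
$solrUpdateUrl = "http://localhost:8983/solr/products/update?commit=true"
$csvPath       = "C:\exports\products.csv"

Invoke-RestMethod -Uri $solrUpdateUrl -Method Post -ContentType "application/csv" -InFile $csvPath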

Hope these help you in your journey to scale your SQL Server. Have fun!

Microservices

Standard

I was first introduced to Microservices architecture in 2014. At that point in time, I had no idea that what I was doing was known as Microservices. We designed our system that way simply because it made practical sense. We started off with 1 PHP web service, 1 PHP frontend application, 1 .NET web service, 2 .NET frontend applications and a CRM. The number of Microservices grew along with the business needs. Since then I have learned that managing Microservices is as interesting and fun as building Microservices.

Microservices are relatively small applications that interact with each other to achieve specific business requirements. Each Microservice is designed to do one thing and to do it really well. Sometimes they are independent on their own, but they often work together to accomplish more complex business requirements. The opposite of Microservices is a monolithic system, the kind of system where you have 5,000,000 lines of code in a single code base.

Why do we use Microservices?

Technology Heterogeneity – This means the diversity of technology within a solution. Your Microservices have independent stacks of technology, and you can choose the most suitable stack depending on the problem you are solving. For example, a Photo Album Printing business might have a PHP frontend application (because they want to tap into WordPress as a CMS), .NET backend business rules exposed as a Web API (because there is legacy logic and a SQL Server database), a Java image processing engine (because there are proprietary image processing libraries written in Java), and an R application to crunch big data on customer sentiment. In a Microservices architecture, you can have different stacks of technologies that work together seamlessly. They interact through the sets of APIs they expose to each other.

[Image: Technology-Heterogeneity]

This reason also aligns with Scrum. Each Scrum team potentially owns one Microservice, and there will be multiple Scrum teams based on the technological domains. When a Scrum team gets too big, that is also an indicator that it is time to break the Microservice into smaller pieces. Ideally you do not want to wait for the Microservice to become too big before you break it; be alert and do not let your Microservice bloat in the first place. Kick start another Microservice whenever you can logically scope the context boundary into a separate Microservice.

Scaling – Scaling fundamentally boils down to two approaches: vertical and horizontal. Vertical scaling is quick and easy but can get very expensive, especially when hitting the top tiers of resources. Horizontal scaling is cheaper but can be difficult to implement if the solution is not designed to scale horizontally, for example a stateful monolithic system. As a general rule of thumb, always design your solution to scale horizontally. To put this into perspective, one large virtual machine could be substantially more costly than three small virtual machines that provide the same amount of processing power, depending on which cloud provider you are working with.

Building a solution as Microservices provides the foundation to scale horizontally. Using the earlier Photo Album Printing system, say many users submit photos in bulk for processing between 9.00 AM and 12.00 PM. The DevOps guy only needs to scale up the Java image processing engine service.

[Image: Scaling for 9.00 AM-12.00 PM]

From 6.00 PM to 11.00 PM, say many visitors come to the website to browse the photos. The DevOps guy only needs to scale up the PHP frontend application.

[Image: Scaling for 6.00PM-11.00PM]

If we have a gigantic monolithic system, we have to scale the entire system regardless of which component is being utilized the most. To put this into perspective, imagine keeping your car engine running just because you want the air conditioning. Heads will not roll, it is just not the most efficient way to use your technologies.

Ease of Deployment – If you have tried waking up at 2.00 AM for a “major deployment”, or been through a 20-hour deployment, you probably will agree it is important to have clean and quick deployments. I can vividly remember how nervous my CIO got whenever we had a “major deployment”. Sometimes he would come in early in the morning together with us to give us moral support by supplying us with coffee and McDonald’s. Despite the heart-warming breakfast, it was really stressful for everyone to go through such deployments. Long story short, we improved our deployment process to the point of making 4 production deployments within a day with zero downtime. That would not have been possible (or would have been significantly more difficult) if we had not built our code on a Microservices architecture.

Deployment for Microservices is definitely easier compared to a gigantic monolithic system. The database that a Microservice uses is simpler, which makes altering the database schema less painful. There is less code, which indirectly means there is less configuration to deal with during deployment. The scope of what the Microservice is designed for is smaller, which makes post-deployment testing (both automated and manual) faster. In the worst case scenario, rolling back a small service is significantly more straightforward than rolling back a monolithic system with 25 other dependencies, some of which need to be rolled back together.

Scaling Microservices

The secret to scaling Microservices is: start small, think big. You might start your Microservice as a small service coded by a solo developer in 2 weeks. Although a service could be small, you need to think about how to deal with it when your audience grows 10 times larger. As we discussed earlier, scaling vertically is easy but can be fairly expensive as you get nearer to the top tiers. You want to design your Microservice to scale horizontally from day one.

How do you build a horizontal-scale friendly Microservice? The most common reason some services cannot scale horizontally efficiently is session state. When you have session state stuck in your service's memory, your client will always have to go back to the same service instance, or you will discover all kinds of weird behavior. Of course, you can overcome this problem by enabling stickiness in your load balancer, or by keeping all the session state in an additional SQL Server database (ASP.NET's SQLServer session state mode instead of InProc). However, why would you want to get yourself into this situation in the first place? If having session state within the service is going to make your scaling effort more challenging, avoid relying on session state from day one so that you can scale horizontally, effortlessly. Building your service based on RESTful principles is a good starting point.

If your Microservice really has to make use of session state, use an additional session service such as Redis instead of keeping the session in service memory.

Front your service with a load balancer. Having your service instances sit behind a load balancer is the foundation for horizontal scaling. Configure auto-scaling in whichever cloud provider you are using; when the additional load kicks in, auto-scaling will automatically boot up additional hosts to serve the load.

[Image: load-balancer]

Another advantage of having your Microservice instances sit behind a load balancer is avoiding a single point of failure. You should consider having at least 2 hosts for this reason. For example, perhaps your Microservice only needs the computing power of 1 medium-sized virtual machine, while 2 small virtual machines provide roughly the same computing power (you have to work out the maths yourself). Having 2 small virtual machines sit behind a load balancer, rather than 1 medium virtual machine connected directly, is a good remedy against a single point of failure.

Scaling Databases

As the number of your Microservice instances grows, it usually means more load on your database. Database I/O is the most common bottleneck in software performance. Unless you have put explicit design effort into protecting your database from getting hit, you will often end up scaling your database vertically. However, you might want to keep that option as your last card.

Out of the box, SQL Server gives you the option of Transactional Replication, Merge Replication, and Snapshot Replication. You have to determine which mode is optimal for your system. If you have no idea what these are about, Transactional Replication is your safest bet. But if you are adventurous, you can mix and match different approaches.

Transactional Replication works on a Publisher and Subscriber model. All writes happen at the Publisher while all reads happen at the Subscribers. In typical services, reads far outnumber writes. In this setup, you distribute the read load across multiple Subscriber hosts, and you can continue to add Subscriber hosts as you see fit.

[Image: database]

The drawback is that you need to code your Microservice to perform all data manipulation and insertion at the Publisher and all reading at the Subscriber(s), which requires conscious effort from developers to code in that manner. Another drawback of adopting replication is the skill set required to set up and maintain the replication. Replaying the transaction log is pretty fragile, from my experience, and you need someone who understands the mechanism behind replication to troubleshoot failures effectively.
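To make the read/write split concrete, a minimal sketch of the convention (instance, database, and table names are hypothetical, and Invoke-Sqlcmd from the SqlServer module is assumed) looks like this: every write targets the Publisher, every read targets a Subscriber.

# Convention: writes always go to the Publisher, reads always go to a Subscriber.
$publisher  = "SQL-PUBLISHER"      # hypothetical Publisher instance
$subscriber = "SQL-SUBSCRIBER-01"  # hypothetical Subscriber instance
$database   = "Shop"

# Write path: inserts and updates hit the Publisher only.
Invoke-Sqlcmd -ServerInstance $publisher -Database $database `
    -Query "UPDATE dbo.Product SET Price = 1.30 WHERE ProductId = 42;"

# Read path: lookups and reporting hit a Subscriber, keeping load off the Publisher.
Invoke-Sqlcmd -ServerInstance $subscriber -Database $database `
    -Query "SELECT ProductId, Price FROM dbo.Product WHERE ProductId = 42;"

In application code this usually translates into two connection strings, one for the write side and one for the read side.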

I highly recommend giving some forward thought to how you can avoid your database getting hit unnecessarily in the first place. Tap into search engines such as Solr and Elasticsearch when suitable. Identify how your data will be used up front. Keep on-the-fly data aggregation to a minimum. Design your search indexes accordingly. At the very minimum, make use of caching.

The key to scaling your data is to achieve eventual consistency. It is alright to have your data out of sync for a short period of time, especially on non-mission-critical systems. As long as your data will be consistent eventually, you are heading in the right direction.

Scaling databases can be tricky. If you need something done by 9.00 AM tomorrow, the easiest option is to scale vertically. No code change and no SQL Server expert involved. Everyone will be happy… probably except the CFO.

Keep Failure in Mind

Failure is inevitable in software. In a Microservices architecture, the chance of failure is even greater because your service no longer depends on itself alone. Every Microservice that your Microservice depends on could go wrong at any given time. Your Microservice needs to expect other Microservices to fail and handle the failure gracefully.

Bake in your failure handling. Your Microservice depends on other Microservices at one point or another. Can your Microservice's core features operate as usual when other Microservices start failing? For example, a CMS depends on a Comment Service. On an article page, if the Comment Service is not responding, how will that affect the CMS's ability to display the article? Will the article page just crash when a visitor arrives? Or will the CMS handle the Comment Service failure gracefully by not showing the comments while the rest of the article loads as usual?
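A minimal sketch of that idea (the Comment Service URL and response shape are hypothetical): wrap the dependency call so that a failure degrades to an empty comment list instead of crashing the page.

# Fetch comments defensively: if the Comment Service is down, fall back to an empty list
# so the article itself can still be rendered.
function Get-CommentsSafely {
    param([string]$ArticleId)

    try {
        # Hypothetical internal endpoint; keep the timeout short so a dead dependency
        # cannot hold the article page hostage.
        Invoke-RestMethod -Uri "http://comment-service.internal/api/articles/$ArticleId/comments" -TimeoutSec 2
    }
    catch {
        Write-Warning "Comment Service unavailable for article $ArticleId; rendering without comments."
        @()   # graceful degradation: no comments, but the article still loads
    }
}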

Another example: I was using Redis to keep my user tokens after every successful login. At one point, Redis decided not to keep tokens for me anymore and actively rejected new connections. My users could not log in although they had entered the correct username and password, simply because a non-critical part of the authentication process had failed. We discovered the root cause later. However, to avoid such an embarrassing moment from happening again, at the code level we changed the interaction with Redis to an asynchronous call, because creating a token in Redis is not the main criterion in authentication. By making the Redis call asynchronous, users can continue to use the core functionality even though the minor portion of features that rely on the Redis token will not work.

It is fine if you do not have a sophisticated failure-handling mechanism. At the very minimum, code defensively. Expect failure to happen at every interaction point with other Microservices. Ideally we do not want any of the Microservices to fail, but when they do fail (and they will), defensive coding helps keep your Microservice minimally affected. It is better to have 70% of your core functionality working than the whole service crashing down.

The Backends for Frontends

This is another concept I discovered by accident when I was hacking some code in Android Studio in 2013. The Backends for Frontends design is especially practical for mobile applications, although you can still apply the concept to any Microservice.

In essence, the Backends for Frontends design backs your frontend application with a dedicated backend service whose primary objective is to serve that frontend application. This is a very good choice for mobile applications for several reasons.

[Image: backend-for-frontend]

First, mobile applications are known for having connectivity limitations. Instead of asking your mobile app to connect to 7 different Microservices to request various pieces of information and do the processing on the client (mobile) side, it makes more sense to have the Backend for Frontend service make the necessary server-to-server calls, process the data, and then send only the necessary data back to the mobile client.

Second, the Backend for Frontend service also serves as a security gateway. Obviously you do not want to expose all your backend core services (for example your CRM) to the public. You need to design your network so that your backend core services sit in a private network, then grant permission for your public-facing Backend for Frontend service to access that private network. By doing this, your backend core services are protected from public access, yet there is explicit permission granted to the specific Backend for Frontend service. You can implement whichever security model you see fit in the Backend for Frontend service, one that your client application must and can comply with.

[Image: backend-for-frontend-security]

Third, a mobile application sits on the client side, which makes updating the application more challenging. You want to minimize the logic on the client side. The Backend for Frontend service plays the perfect role for handling business logic; you can update the logic much more easily there than in the client application. In other words, your frontend application stays lightweight and is only responsible for UI presentation.

One Last Thought…

Microservices is a huge topic by itself. This article serves as a trigger point for you to get to know Microservices without going through a 400-page book. If you would like to learn more, there are many books available; I recommend Building Microservices by Sam Newman. I hope you have discovered something new in this article. Until next time!

PowerShell PathTooLongException

Standard

PathTooLongException can be triggered on various occasions when the path you are dealing with exceeds 260 characters, for example while copying an innocent file to a destination with a super long path. I hit PathTooLongException when I needed to clean up a build in my PowerShell build script.

For simplicity's sake, I extracted the script that triggers PathTooLongException as follows:
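A minimal sketch of such a cleanup call, assuming the build output lives under the hypothetical root used in the example path below, is enough to reproduce it:

# Deleting a build folder that contains deeply nested paths longer than 260 characters
# throws PathTooLongException in Windows PowerShell.
Remove-Item -Path "C:\inetpub\wwwroot\sc82rev160610\Data\packages" -Recurse -Force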

Here is our friendly error message:

[Image: PathTooLongException]

Here is an example of what a (really) long path looks like:

C:\inetpub\wwwroot\sc82rev160610\Data\packages\MyApplication.Experience.Profile.8.2.0.60r160607\serialization\core\MyApplication\client\Applications\ExperienceProfile\Contact\Main\PageSettings\UserTabs\Activity\ActivitySubtabs\Minor\Automations\AutomationsPanel\EngagementPlans\Item\EngagementPlan.item

We will not argue whether it is right or wrong to have such a long path. Often, we are given a piece of software to work on. We have little choice but to find the most efficient solution to deal with the situation.

PathTooLongException has been known since 2006. It was discussed extensively by the Microsoft BCL Team in 2007 here. At the time of writing, Microsoft has no plan to do anything about it. The general advice you will find is: do not use such long paths. (Wow! Really helpful!)

I had been digging around the web for a solution but unfortunately with little luck. Well, the truth is I wasn’t happy with many of the solutions, or they were just unnecessarily complicated. I eventually wrote my own “hack” after some discussion with Alistair Deneys.

The trick is to set up a short symbolic link to represent the root of the working directory, and pass the short symbolic link as the target directory to Remove-Item. We will use the mklink command to set up the short symbolic link.

The bad news is, mklink is a Command Prompt (cmd.exe) built-in command and is not recognized in PowerShell. I would have to open Command Prompt to set up the symbolic link before proceeding with my PowerShell build script. As my build script is supposed to be fully automated without manual intervention, manually setting up the symbolic link using Command Prompt is obviously an ugly option.

The following is the command we need to execute to set up the short symbolic link using Command Prompt:
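Something along these lines will do, where C:\s is a hypothetical short link name and the target is the hypothetical long root from the example path above:

mklink /D C:\s C:\inetpub\wwwroot\sc82rev160610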

The good news is we can call Command Prompt from PowerShell to set up the symbolic link fairly easily. How easy?
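A one-liner like this (again with the hypothetical link name and target) is all it takes:

# Shell out to cmd.exe from PowerShell to create the short directory symbolic link.
cmd /c mklink /D C:\s C:\inetpub\wwwroot\sc82rev160610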

Of course, we will need to ensure there isn't already a folder or another symbolic link with the same name as our target symbolic link. (Otherwise you will get another exception.)

Once we create the symbolic link by executing the above command, a “shortcut” icon will appear.

[Image: symbolic-link]

Prior to deleting the directory with the really long path, we set up the symbolic link on the fly to “cheat” PowerShell into thinking the path isn't really that long 😉

Obviously we don't want to leave any trace for another person to puzzle over what this “shortcut” is about, especially since the symbolic link is no longer required after our operation. Hence, once we are done doing what we need to do (in this case, deleting some files), we will need to clean up the symbolic link.

There isn't any dedicated command to remove a symbolic link, so we will use the bare-bones approach to remove the symbolic link / shortcut created earlier.

The interesting thing about removing a symbolic link / shortcut is that it removes the shortcut, but the shortcut then becomes a real directory (What?!). Hence we need to remove it one more time! I don't know the justification for this behaviour and I don't have the curiosity to find out for now. What I ended up doing is calling Remove-Item twice.

Then, things got a little more sticky because Remove-Item will throw ItemNotFoundException if the symbolic link or the actual directory is not there.

[Image: ItemNotFoundException]

Theoretically this should not happen, because all we need to do is create a symbolic link and delete the symbolic link twice and we are clear. However, reality is not always as friendly as theory 🙂 So we need to script our Remove-Item defensively. I ended up creating a function to handle the removal really carefully:
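A defensive helper along these lines does the job (the function name and parameters here are mine, a sketch rather than the original):

function Remove-ItemSafely {
    param(
        [string]$Path,
        [switch]$Recurse
    )

    # Remove-Item throws ItemNotFoundException if the path is already gone,
    # so check first and swallow that specific failure instead of failing the build.
    if (Test-Path -LiteralPath $Path) {
        try {
            Remove-Item -LiteralPath $Path -Recurse:$Recurse -Force -ErrorAction Stop
        }
        catch [System.Management.Automation.ItemNotFoundException] {
            # The item disappeared between the check and the delete; nothing left to do.
        }
    }
}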

The full script looks like the following:
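Putting it all together, a sketch of the whole flow, with hypothetical paths and the helper function above, looks roughly like this:

# Hypothetical names: a short link at C:\s pointing at the long working directory root.
$shortLink = "C:\s"
$longRoot  = "C:\inetpub\wwwroot\sc82rev160610"

# Create the short symbolic link on the fly.
cmd /c mklink /D $shortLink $longRoot | Out-Null

# Delete the deeply nested content through the short path so every full path stays under 260 characters.
Remove-ItemSafely -Path (Join-Path $shortLink "Data\packages") -Recurse

# Removing the symbolic link once leaves a real (empty) directory behind,
# so remove it twice, defensively.
Remove-ItemSafely -Path $shortLink
Remove-ItemSafely -Path $shortLink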

Here, my PowerShell happily helped me remove the files whose paths exceed 260 characters, and I have a fairly simple “hack” that I can embed into my PowerShell build script.

One Last Thought…

During my “hacking” time, I also tried Subst and New-PSDrive. You might be able to achieve the same using them. However, in my case, some of the dependencies did not work well with them; for example, AttachDatabase could not recognize the path set by Subst. That's why I settled on mklink.

This obviously isn't a bullet-proof solution for PathTooLongException. Imagine if the “saving” you gain by shortening the root is still not sufficient, as in the path starting from the symbolic link still exceeds 260 characters. However, I have yet to encounter such a situation and this “hack” has worked for me every time so far. If you have a simpler way to get this to work, feel free to drop me a comment. Enjoy hacking your PowerShell!