Problem: my website goes down frequently. I need to be the first one to know.

Websites do fail. Sooner or later your website or blog goes down for one reason or another. A real example happened to this very blog you're reading: www.linuxschoolonline.com. I host it on an EC2 instance. Recently, I received complaints that the site was down and no pages were loading. After some investigation, I found that the root cause was the instance running out of memory, so I had to upgrade it. But that's not the real problem. The real problem was that nobody could access the website for hours, and I only found out by chance.

IT resource monitoring should be taken very seriously, and when it comes to website availability, it becomes even more important.

Now, there are many online services that offer to monitor your website and send you e-mails when it goes down or when it encounters an error of some type. In this post, I am going to show you how you can implement your very own monitoring functionality using AWS Lambda functions.

A “Lambda function”? I like the name, but what does it do exactly?

If you're a programmer or have some programming background, the first thing that might pop into your mind when hearing the term “Lambda function” is the anonymous function that can be invoked dynamically in code. It doesn't need a separate definition; it doesn't even need a name. It can just be used in place. Lambdas are available in many programming languages, including Python, C#, and - of course - JavaScript.
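
For example, here is a tiny Python snippet illustrating that language feature (nothing to do with AWS yet):

# An anonymous (lambda) function in Python: defined and used in place.
squares = list(map(lambda x: x * x, [1, 2, 3, 4]))
print(squares)  # [1, 4, 9, 16]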

Don't worry, this post is not about that type of lambda function. AWS borrowed this familiar term to brand one of its services. An AWS Lambda function is “code on demand”. All you need to do is write a script in one of the supported languages (currently Python, Node.js, C#, Go, and Java). AWS takes this code, compiles it if necessary, creates a temporary runtime environment for it, and runs it for you. It can run periodically (like a cron job), or it can be triggered by a variety of other services, as we'll see later on.

Why don't I just create a nano EC2 instance and put my code there?

It's perfectly possible to build an EC2 instance dedicated to monitoring your website. You can even use the same instance that hosts the web application to do the monitoring. But:

  1. You will waste money, as a Lambda function is far cheaper than running an EC2 instance 24/7.
  2. You will waste time setting up the environment for your code to run: for example, installing the Java or Go runtime, or a virtual environment for Python.
  3. What if the monitoring script itself fails for some reason? You would need a way to monitor the script that monitors the website, then something to monitor that, and so on. That sounds like an infinite regress.

With a Lambda function, you lift the environment setup, the hardware, and the execution monitoring off your shoulders and hand them over to Amazon.

The monitoring script

This might be the easiest part of the whole thing. All you need is a script, written in a supported language, that communicates with the website and examines the expected output. This can get as complex as you want it to be. For example, you may want to test several pages for an expected response or run complex requests. It's up to you. In this particular example, I'll just query the front page and examine the HTTP response code. If it is not 200, then we have a problem.

import os
import requests

def check(url):
    # Issue an HTTP GET request and return the status code as a string.
    r = requests.get(url)
    return str(r.status_code)

def lambda_handler(event, context):
    # The URL to check and the expected status code come from environment variables.
    url = os.environ['url']
    response_code = os.environ['response_code']
    # Raise an unhandled exception so AWS Lambda records the invocation as an error.
    if check(url) != response_code:
        raise Exception("{url} is down".format(url=url))

The code is easy to understand. If it seems foreign to you or if you haven't coded in Python before, here is a quick breakdown:

First, we import the os and requests modules because we are going to use them later.

Then we define two functions. The check function accepts a URL string and issues a request against it using the get method provided by the requests module, then returns the status code of the response. In HTTP, a status code of 200 means “OK, we got a valid response”. Other status codes report a non-accessible resource, like 404 and 400, and some report server errors, like 500. So we are interested in status code 200. If you know the HTTP protocol well, you may argue that there are also status codes like 301 and 302 that do not imply a faulty or unreachable page; they mean that the address is being redirected to another one. But let's keep things simple and assume that we are only interested in the 200 code.
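
If you do want to treat redirects as healthy, a minimal sketch of a variant check function could look like this (the set of acceptable status codes here is just an assumption; adjust it to your needs):

import requests

# Assumption: redirects (301/302) are considered healthy responses.
HEALTHY_CODES = {200, 301, 302}

def check_with_redirects(url):
    # allow_redirects=False keeps the original status code instead of following the redirect.
    r = requests.get(url, allow_redirects=False)
    return r.status_code in HEALTHY_CODES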

Then comes the Lambda handler itself. We called it lambda_handler, but you can choose whatever name you want. The check is simple: it determines whether the response code of the HTTP request made to the URL equals the expected response code, and raises an unhandled exception when it doesn't. Notice two important things here:

  • We are getting the values of the URL and the expected response code from environment variables (more on that later).
  • We are NOT handling the exception. AWS automatically takes care of that for us. If we handled it in a try..except clause, the function would NOT report an error to AWS, and the whole monitoring setup would silently fail.

The third-party modules dilemma

Python is among the languages described as “batteries included”: it ships with enough built-in modules and libraries that developers rarely need third-party packages. But sometimes you may need to use one or two in your code. In our example, I intentionally used the requests module, although I could have done exactly the same thing with a standard library module like urllib. I want to demonstrate how you can bundle those modules with the code that you will upload to AWS Lambda.

The answer is very simple. Third-party Python modules are downloaded and installed using the popular pip tool, so requests is normally installed with a command like pip install requests.

We are going to use the same command but with a slight modification: we'll instruct it to download and place the package, with all its dependencies, in the current working directory. So, let's create a directory, place our code file in it (you can name it main.py, but any name is OK), and install the package as follows:

mkdir lambda-check
mv main.py lambda-check
cd lambda-check
pip install requests -t .

You will notice new files and directories populating the working directory. You will need to zip everything into a single file so it's ready for uploading to AWS Lambda.

Packaging the code

Now that your code and its dependencies are ready, let's package them and send them over to Amazon. You can use the zip tool that is available on Linux and macOS. Alternatively, you can use any zipping tool for Windows like WinZip, WinRAR, or even the native Windows compression tool. The end result is a zipped file containing your code.

Important note about zipping the files

The command I use on macOS and Linux is zip -r9 ~/package.zip * run inside the working directory. If you are using Windows, make sure that the compressed files are in the archive root. In other words, when you open the archive, you should see main.py and the third-party package files, not another directory. This is important because AWS unzips the file expecting to find your code file at the top level of the extracted files, not inside another directory.
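
If you prefer to script the packaging step, a minimal Python sketch that produces an archive with the files at its root could look like this (the directory and archive names are simply the ones used above):

import shutil

# Create package.zip containing the *contents* of lambda-check at the archive root,
# which is what AWS Lambda expects when it unzips the deployment package.
shutil.make_archive('package', 'zip', root_dir='lambda-check')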

Creating a new Lambda function

Now the coding and packaging part is done. The rest of the tutorial covers the AWS part. I assume that you already have an Amazon account. Log in to your account and navigate to the AWS Lambda page: https://us-west-2.console.aws.amazon.com/lambda/home?region=us-west-2#/functions. Notice that I'm using the us-west-2 region, so your URL might be different depending on your region.

Click on “Create function”; you should see a page like the following:

Enter the name of the function, choose the runtime environment, and choose “Create a custom role”. This will automatically open a new page where you can edit the role that your function will use:

Click “Allow” to enable the “lambda_basic_execution” role. The page closes and you're back on the Lambda creation page.

Click “Create function”.

Uploading your code

AWS accepts the Lambda function code in more than one way. The first is copying and pasting your code into the inline editor as shown below:

This is the easiest way, but only if your function does not need external dependencies beyond what the Lambda runtime environment supplies.

The other option (which we are going to use) is uploading a zip file containing the code and the dependencies. You can also place the zipped file in an S3 bucket and supply its link. This is shown below:

Choose “Upload a .zip file” and upload the package.zip file that we created earlier, then click “Save”.
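
If you would rather do this step from code than from the console, a rough boto3 sketch might look like the following (the function name is an assumption; use whatever name you gave your function):

import boto3

lambda_client = boto3.client('lambda')

# Upload the deployment package to an existing function.
# (create_function is the equivalent call when the function doesn't exist yet.)
with open('package.zip', 'rb') as f:
    lambda_client.update_function_code(
        FunctionName='lambda-check',  # assumption: the name chosen earlier
        ZipFile=f.read(),
    )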

Lambda function environment variables

To make things more flexible, AWS provides the ability to inject variables into the Lambda function after the code has been uploaded and deployed. Those variables are treated exactly like environment variables set at the OS level. If you examine the code once again, you should notice that we are reading the URL and the expected status code from environment variables:

    url = os.environ['url']
    response_code = os.environ['response_code']

If you are running this script locally, you can set those variables on the command line as follows (macOS and Linux):

export url="https://www.linuxschoolonline.com"
export response_code=200
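
With those variables set, you can also give the handler a quick local run before uploading anything. Here is a minimal sketch, assuming main.py is on your path and requests is importable locally; the empty event and None context are just placeholders for what Lambda would normally pass in:

import os

# Set the variables in-process instead of exporting them in the shell.
os.environ['url'] = 'https://www.linuxschoolonline.com'
os.environ['response_code'] = '200'

from main import lambda_handler

# Lambda normally supplies an event and a context object; this check uses neither.
lambda_handler({}, None)
print("The site looks healthy")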

If you are running this on the AWS Lambda runtime environment, those can be added in the Environment variables section as shown below:

The function handler and the “Unable to import 'lambda_function'” error

When I ran my very first Lambda function, I received the following error:

I spent a pretty long time before I figured out what caused this error. It's not your code or your variables.

In order for the Lambda function to work, it must know the name of the file that contains your code and the name of the method (or function) that is going to be executed. To fix this, go to “Function code” and find the “Handler” field. Type the name of the file, followed by a period, then the name of the function. In our example, the name of the file is main.py and the name of the function is lambda_handler, so it should read main.lambda_handler. Notice that we do not add the .py extension, nor do we add parentheses after the function name. This is shown below:

Giving your Lambda function a test drive

Click “Save” in the upper right corner of the screen and let's test our function. Click on the “Test” button. The first time you run the test, AWS will ask you to provide an event for the function. This is to simulate the scenario where our function is triggered by an event. In our case, we'll run it periodically and not in response to an event, so you can write any string in the box, like test.

The function takes a couple of seconds to finish. Since our zip file is small enough, the code can still be displayed in the inline editor, and the execution output appears with it. It should look something like this:

The output is null because, thankfully, the website responded with the 200 OK code. Now, let's see what happens when the page that the function tries to reach is down. Change the URL in the environment variable to something that doesn't exist like https://www.linuxschoolonline.com/missing.

If we test the function now, an exception is thrown and the output should look something like the following:

The exception that we raise in the code when the response code is not the configured one (200) works as intended, and the error message appears in the output. Now we know that our function works.

Let's change the URL back to https://www.linuxschoolonline.com and move on to the next part.

What will trigger our Lambda function to run?

The Lambda function seems to be working properly. It can connect to the page that we specify, examine the status code that we configured, and respond with an error when there is a mismatch. But so far, the function runs only when we click on the “Test” button. We need it to run periodically around the clock and also notify us (for example, through an e-mail) when something goes wrong. That is where CloudWatch comes into play.

Running the Lambda function periodically using CloudWatch Events

Lambda functions can be triggered in a variety of ways. For example, they can respond to events coming from other AWS services. In our case, we just need the function to trigger itself every five minutes. Click on “CloudWatch Events” in the designer pane as shown:

In the Rules pane, select “Create new rule”

Choose a rule name that is descriptive for you. I'll name mine “LSO_health_check”. Notice that you are not allowed to add spaces to the rule name but you can add a description.

Choose “Schedule expression” and add the following in the box: rate(5 minutes). This instructs our Lambda function to run every five minutes.

CloudWatch Events rules can also accept a cron expression, so you can specify your schedule in a format very similar to cron jobs; for example, cron(0/5 * * * ? *) would also run every five minutes. The end result should be as follows:

Click “Add”. Now your function will run every five minutes to check the website health.
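
The same schedule can also be created programmatically if you prefer. Here is a rough boto3 sketch under the assumptions above; the rule name matches the one we chose, while the account ID in the function ARN is a placeholder you would fill in:

import boto3

events = boto3.client('events')
lambda_client = boto3.client('lambda')

function_arn = 'arn:aws:lambda:us-west-2:123456789012:function:lambda-check'  # placeholder

# Create (or update) a rule that fires every five minutes.
rule = events.put_rule(Name='LSO_health_check', ScheduleExpression='rate(5 minutes)')

# Point the rule at the Lambda function...
events.put_targets(Rule='LSO_health_check', Targets=[{'Id': '1', 'Arn': function_arn}])

# ...and allow CloudWatch Events to invoke it.
lambda_client.add_permission(
    FunctionName='lambda-check',
    StatementId='lso-health-check-schedule',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn'],
)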

Get notified using CloudWatch logs

Indeed, the function will run periodically and do what it's been programmed to do. But the essence of the whole thing is that you get notified when your website is down. You may have noticed that we didn't add any Python code to send an alert of some kind when the exception is thrown. While it's perfectly possible (and easy) to do that in Python, AWS has made this task even easier and, most importantly, without writing any code!

First, click on the “Monitoring” tab as shown:

You'll see various useful metrics that show when the function was executed and its error rate, among other things. Since we are interested in the logs that this function produces, click on “View logs in CloudWatch”. You should see a page like the following:

Click on “Search Log Group”; you should see a page that looks like the following:

Let's create a CloudWatch alarm for those errors

So now that we know that Lambda functions - by default - send their output to CloudWatch logs and report their errors as CloudWatch metrics, we can easily create an alarm based on those errors.

Go to “Alarms” on the left navigation pane and select “Create Alarm”:

You will be prompted to select the metric on which the alarm will be created, choose “By Function Name” under “Lambda Metrics” as shown:

Check the Errors metric and click “Next”. In my example, it looked like this:

On the next page, you are prompted to enter the name of the alarm and when it should be triggered. The following settings instruct the alarm to fire whenever the Lambda function reports errors over a period of 10 minutes (that is, if the Errors metric reaches at least 1 for 2 datapoints, where I defined one datapoint to be a 5 minute interval). Notice that the Statistic here should be “Sum”, not “Average”. This can be found in the bottom right corner of the screenshot:
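
For reference, the same alarm could be created programmatically. A rough boto3 sketch under the settings described above (the alarm name, function name, and threshold are assumptions matching this example):

import boto3

cloudwatch = boto3.client('cloudwatch')

# Alarm when the function reports at least one error per 5 minute datapoint,
# for 2 consecutive datapoints (roughly a 10 minute window).
cloudwatch.put_metric_alarm(
    AlarmName='LSO_health_check_errors',        # assumption: any descriptive name works
    Namespace='AWS/Lambda',
    MetricName='Errors',
    Dimensions=[{'Name': 'FunctionName', 'Value': 'lambda-check'}],
    Statistic='Sum',                            # Sum, not Average
    Period=300,                                 # one datapoint = 5 minutes
    EvaluationPeriods=2,
    Threshold=1,
    ComparisonOperator='GreaterThanOrEqualToThreshold',
    # AlarmActions=[topic_arn] would attach the notification list created in the next section.
)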

Creating a CloudWatch alarm notification list (aka e-mail list)

The configuration settings we just specified will set the alarm off whenever the Lambda function reports an error within the last 10 minutes. But this is as far as it goes: when you visit the CloudWatch page, you will find a nice red icon on the Alarms link notifying you that an alarm has been set off. That's clearly not enough, so we need to create a list of e-mail addresses to which the alarm should be sent.

In the Actions pane, click on “New list” and add one or more e-mail addresses that should receive the notification. It should look like this:
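
Behind the scenes, this notification list is an SNS topic with e-mail subscriptions. If you ever want to create one from code, a minimal boto3 sketch could look like this (the topic name and address are placeholders):

import boto3

sns = boto3.client('sns')

# Create a topic for the alarm notifications and subscribe an e-mail address to it.
topic = sns.create_topic(Name='website-down-alerts')   # placeholder name
sns.subscribe(
    TopicArn=topic['TopicArn'],
    Protocol='email',
    Endpoint='you@example.com',                         # placeholder address
)
# The subscription must be confirmed from the e-mail AWS sends before notifications arrive.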

Optionally, get notified when the alarm is gone

Sometimes the website may go down for just a few minutes and then come back online, perhaps because a colleague of yours restarted the server. In such a case, you may be interested in getting notified when the website is back online. CloudWatch allows you to send a notification when the alarm returns to the OK state. To do this, click on “+Notification” in the Actions pane, choose “State is OK” from the “Whenever this alarm:” dropdown, and select the notification list that you created for the previous state (of course, you can create another one if you want to):

When done, click “Create Alarm”. You will have to confirm your e-mail address(es) by clicking on the confirmation link sent by AWS.

Testing everything

Go back to your Lambda configuration page, change the URL to something that does not exist, and click “Test”. You should shortly receive an e-mail from AWS CloudWatch informing you that the website is down.

Change the URL back to the correct one. Congratulations, you have used Lambda functions and CloudWatch to create your very own basic website monitoring tool.