OK, so you’re tasked with setting up a Kubernetes cluster on your organization’s AWS account. Or you need to set one up to learn Kubernetes or to deploy your application. You already understand what Kubernetes is and you don’t want to spend more time on theory; you want to get to the action and have the cluster up and running as quickly as possible while still understanding each step. You’ve come to the right place. In this article, we are going to set up a fully functional Kubernetes cluster. We’re not going to use the AWS managed Kubernetes service, EKS. EKS is good when you want to ease your mind and have the cluster set up for you automatically. We also won’t be using helper tools like kops. Although helper tools save you time and effort, they abstract away a large part of the operation. Besides, where’s the fun in that? So, let’s get started. But first, there are some prerequisites that you should have in place to be able to follow all the steps in this article.

Kubernetes on AWS Prerequisites

  1. You have a valid AWS account with programmatic access enabled. You can go ahead and create one here.
  2. You have the aws-cli tool installed and configured to access your account.
  3. You have a basic understanding of AWS components like EC2, VPC, ELB, etc.
  4. You have a basic understanding of Kubernetes and what it is used for.

Step 01: Creating a VPC for our cluster

So, you’re ready to start? OK. The first step is to create a dedicated VPC for our nodes to live in. The reason we are not deploying our nodes in the default VPC is that we don’t want them to be publicly accessible. When we, the administrators, need to access the nodes, we do so through what’s called a bastion server, sometimes also called a jump server. The main job of this server is to provide a gateway for us to access internal parts of the infrastructure. More on that later. So, let’s type our first command to create the VPC and capture its ID:

$ VPCID=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 --query "Vpc.VpcId" --output text)
$ echo $VPCID
vpc-0254a59c54b3f6850

We wrapped the command that creates the VPC in $() so that we can capture its output and assign it to the VPCID variable. We need this variable later on. It’s important to check that $VPCID indeed contains the ID of the VPC you created and not an error message or an empty string.
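
If you want an extra check beyond echoing the variable, you can also look the VPC up by its ID; this is optional and simply fails with an error if the ID is empty or invalid:

# Optional sanity check: errors out if $VPCID is empty or not a real VPC
aws ec2 describe-vpcs --vpc-ids $VPCID --query "Vpcs[0].State" --output text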

Step 02: Enable DNS hostnames in the VPC

Since this is a new VPC that we created from scratch, some features are not enabled by default and we need to enable them manually. For example, the VPC will not assign DNS hostnames to the EC2 instances that we launch in it, and Kubernetes uses those DNS names as node names. So, let’s enable DNS support and DNS hostnames using the following commands:

$ aws ec2 modify-vpc-attribute --enable-dns-support --vpc-id $VPCID
$ aws ec2 modify-vpc-attribute --enable-dns-hostnames --vpc-id $VPCID

If both commands are successful, there shouldn’t be any output.

Step 03: Tagging the VPC

In several situations, Kubernetes manages the infrastructure for you. However, since we are operating inside a cloud provider’s infrastructure, we need to distinguish between the resources that Kubernetes is allowed to manage and those it should not modify. In AWS, Kubernetes uses tags to make this distinction. To mark a resource as owned by a specific cluster, you add a tag whose key is kubernetes.io/cluster/<cluster-name>, where <cluster-name> is the cluster’s chosen name, and whose value is owned. However, sometimes a resource is used by multiple clusters and, thus, no specific Kubernetes cluster should, for example, delete it. For that sort of resource, we use shared as the tag value. So, let’s tag our VPC resource as shared:

$ aws ec2 create-tags --resources $VPCID --tags Key=Name,Value=lso Key=kubernetes.io/cluster/lso,Value=shared

The above command created two tags for our VPC:

  1. Name=lso
  2. kubernetes.io/cluster/lso=shared

The name we’ve chosen for our cluster is the abbreviation of Linux School Online, lso.

Step 04: Creating the route tables for our VPC

Each VPC that gets created on AWS has a default route table automatically assigned to it. At this stage, we only need to create a second route table that will be used for public Internet access. The first one will be kept for private access.

$ PRIVATE_ROUTE_TABLE_ID=$(aws ec2 describe-route-tables --filters Name=vpc-id,Values=$VPCID --query "RouteTables[0].RouteTableId" --output=text)
$ echo $PRIVATE_ROUTE_TABLE_ID
rtb-07b9b951c25344f3f

The above command just grabs the id of the first Route Table (the default one) so that we can use it later. Again, it’s a good idea to display the contents of the environment variable that we’ve just created to ensure that the command completed successfully. Now, let’s create a second Route Table to be used with our public subnet and grab its ID for later usage:

$ PUBLIC_ROUTE_TABLE_ID=$(aws ec2 create-route-table --vpc-id $VPCID --query "RouteTable.RouteTableId" --output text)
$ echo $PUBLIC_ROUTE_TABLE_ID
rtb-04a2e0d78d327db6f

Now, let’s give the route tables Name tags so that we can track them later:

$ aws ec2 create-tags --resources $PUBLIC_ROUTE_TABLE_ID --tags Key=Name,Value=lso-public
$ aws ec2 create-tags --resources $PRIVATE_ROUTE_TABLE_ID --tags Key=Name,Value=lso-private

Step 05: Create our AWS Subnets

We need to create a subnet for our nodes to live in. To create a subnet in AWS, you need to specify the availability zone and a CIDR block. In this lab, our cluster will live in a single availability zone:

$ PRIVATE_SUBNET_ID=$(aws ec2 create-subnet --vpc-id $VPCID --availability-zone us-west-2a --cidr-block 10.0.0.0/20 --query "Subnet.SubnetId" --output text)
$ echo $PRIVATE_SUBNET_ID
subnet-06830d0bd6878722f

The availability zone I selected is us-west-2a and the CIDR block is a /20. That’s 4096 addresses (4091 of them usable, since AWS reserves five per subnet). Let’s add the necessary tags for this subnet:

$ aws ec2 create-tags --resources $PRIVATE_SUBNET_ID --tags Key=Name,Value=lso-private-1a Key=kubernetes.io/cluster/lso,Value=owned Key=kubernetes.io/role/internal-elb,Value=1

We added three tags here:

  • Name=lso-private-1a
  • kubernetes.io/cluster/lso=owned
  • kubernetes.io/role/internal-elb=1

Now, let’s create the second subnet that will host anything that needs to be publicly accessible, like the bastion server and the public load balancer:

$ PUBLIC_SUBNET_ID=$(aws ec2 create-subnet --vpc-id $VPCID --availability-zone us-west-2a --cidr-block 10.0.16.0/20 --query "Subnet.SubnetId" --output text)
$ aws ec2 create-tags --resources $PUBLIC_SUBNET_ID --tags Key=Name,Value=lso-public-1a Key=kubernetes.io/cluster/lso,Value=owned  Key=kubernetes.io/role/elb,Value=1

By default, any subnet you create on AWS gets associated with the default route table. We designated our default route table for private subnets so we needn’t do anything to the private subnet we created. But to enable public access to the public subnet we’ve just created, we need to first associate this subnet with a different route table, the public route table that we created earlier:

$ aws ec2 associate-route-table --subnet-id $PUBLIC_SUBNET_ID --route-table-id $PUBLIC_ROUTE_TABLE_ID
{
    "AssociationId": "rtbassoc-0c63f1ed5b2caa505"
}

Now, what distinguishes a private subnet from a public one? The answer is that we add a route rule that sends 0.0.0.0/0 to an Internet Gateway. So, we need three things now:

  • Create an Internet Gateway and acquire its ID:

    $ INTERNET_GATEWAY_ID=$(aws ec2 create-internet-gateway --query "InternetGateway.InternetGatewayId" --output text)
    
  • Attach this Internet Gateway to the VPC:

    $ aws ec2 attach-internet-gateway --internet-gateway-id $INTERNET_GATEWAY_ID --vpc-id $VPCID
    
  • Create the necessary route rule in the public route table:

    $ aws ec2 create-route --route-table-id $PUBLIC_ROUTE_TABLE_ID --destination-cidr-block 0.0.0.0/0 --gateway-id $INTERNET_GATEWAY_ID
    

OK, so now any resource created in the public subnet will have access to (and can be accessed from) the public Internet. But what about the private subnet? Our requirements state that the private subnet should not be accessible from outside the VPC. But, currently, any resource created inside the private subnet can neither be accessed from nor access anything outside the VPC. So, what if we need to download updates and patches from the Internet? The answer is to create a NAT Gateway and configure the subnet to route outbound traffic through it. Traffic initiated from outside cannot pass through the NAT Gateway into the subnet, but traffic initiated from inside can go out. Let’s do just that:

$ NAT_GATEWAY_ALLOCATION_ID=$(aws ec2 allocate-address --domain vpc --query AllocationId --output text)
$ NAT_GATEWAY_ID=$(aws ec2 create-nat-gateway --subnet-id $PUBLIC_SUBNET_ID --allocation-id $NAT_GATEWAY_ALLOCATION_ID --query NatGateway.NatGatewayId --output text)

The NAT gateway takes a few minutes to become available. You can either wait or execute the following command to check the resource status:

$ aws ec2 describe-nat-gateways --query "NatGateways[].State" --filter "Name=nat-gateway-id,Values=$NAT_GATEWAY_ID" --output text

Execute the command several times until the output is available.
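
Alternatively, the AWS CLI has a built-in waiter for this, so you don’t have to poll manually:

# Blocks until the NAT gateway reaches the "available" state
aws ec2 wait nat-gateway-available --nat-gateway-ids $NAT_GATEWAY_ID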

Now, let’s create the route using the following command:

$ aws ec2 create-route --route-table-id $PRIVATE_ROUTE_TABLE_ID --destination-cidr-block 0.0.0.0/0 --nat-gateway-id $NAT_GATEWAY_ID

This ends our infrastructure part. You are now ready to create the components, starting with the bastion server.

Step 06: Creating the bastion server

The bastion server needs a security group that allows incoming SSH traffic on port 22. Let’s do that:

$ BASTION_SG_ID=$(aws ec2 create-security-group --group-name ssh-bastion  --description "SSH Bastion Hosts"  --vpc-id $VPCID  --query GroupId --output text)
$ aws ec2 authorize-security-group-ingress --group-id $BASTION_SG_ID --protocol tcp --port 22 --cidr 0.0.0.0/0

The first command creates the security group while the second one enables traffic to port 22 from any source.

The next thing we need to do is create a key pair for our instance. From the AWS Console, go to EC2 and select Key Pairs:

Click on Create Key Pair

Once you click Create, the public key will get created and stored in your account while the private key will be downloaded to your machine. Notice that there is no way to re-acquire the private key from AWS so make sure you don’t lose it.
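
If you prefer to stay in the terminal, the same key pair can also be created with the CLI. The commands below are an equivalent sketch that saves the private key to a local lso.pem file, which is the key name and file we reference in later commands:

# Create the key pair and store the private key locally
aws ec2 create-key-pair --key-name lso --query "KeyMaterial" --output text > lso.pem
# Restrict permissions so SSH accepts the key
chmod 400 lso.pem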

The next step is to create the EC2 instance that will act as our bastion server. I’m using Ubuntu 18.04 as the OS for this instance, but you can use whichever Linux flavor you prefer. The AMI ID of this image is ami-09c6723c6c24250c9. You can look up the AMI ID of your image of choice on this website: https://cloud-images.ubuntu.com/locator/ec2/

$ export UBUNTU_AMI_ID=ami-09c6723c6c24250c9
$ BASTION_ID=$(aws ec2 run-instances --image-id $UBUNTU_AMI_ID --instance-type t3.micro --key-name lso --security-group-ids $BASTION_SG_ID --subnet-id $PUBLIC_SUBNET_ID --associate-public-ip-address --query "Instances[0].InstanceId" --output text)

The above command creates an EC2 instance of type t3.micro, using the AMI ID of our choice. It ensures that the instance uses the lso keypair and the bastion security group. It also places it in the public subnet and associates a public IP address with it so we can access it from our laptop. Finally, we acquire the instance ID and store it in a variable to be used later.

It’s always a good idea to tag all the resources that you create for easier recognition and, more importantly, cost analysis. So, let’s add the necessary tags for our newly-created instance:

$ aws ec2 create-tags --resources $BASTION_ID --tags Key=Name,Value=ssh-bastion

Let’s ensure that we can access our instance by trying to SSH to its public IP address:

$ BASTION_IP=$(aws ec2 describe-instances --instance-ids $BASTION_ID --query "Reservations[0].Instances[0].PublicIpAddress" --output text)
$ chmod 400 lso.pem
$ ssh -i lso.pem ubuntu@$BASTION_IP

The first command grabs the public IP address of the instance, the second one ensures that the private key has the correct permissions (otherwise SSH won’t work), and the last one attempts to open a shell with the remote server.

If everything works fine, you should be in a session inside the bastion server.

Step 07: Create the necessary IAM roles

Kubernetes will need to communicate with several AWS services through their APIs. This requires IAM roles. We need two roles: one for the master and one for the workers.

The master role

Creating an IAM role involves creating a policy, creating a role, and then attaching the policy to the role. Later on, we can attach that role to an EC2 instance so that any program inside the instance may access AWS services according to the attached role’s permissions. Let’s start by creating the policy:

  1. Create a file called master_policy.json (the name is of no significance) and add the following to it:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "ec2:DescribeInstances",
                "ec2:DescribeRegions",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVolumes",
                "ec2:CreateSecurityGroup",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:ModifyInstanceAttribute",
                "ec2:ModifyVolume",
                "ec2:AttachVolume",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CreateRoute",
                "ec2:DeleteRoute",
                "ec2:DeleteSecurityGroup",
                "ec2:DeleteVolume",
                "ec2:DetachVolume",
                "ec2:RevokeSecurityGroupIngress",
                "ec2:DescribeVpcs",
                "elasticloadbalancing:AddTags",
                "elasticloadbalancing:AttachLoadBalancerToSubnets",
                "elasticloadbalancing:ApplySecurityGroupsToLoadBalancer",
                "elasticloadbalancing:CreateLoadBalancer",
                "elasticloadbalancing:CreateLoadBalancerPolicy",
                "elasticloadbalancing:CreateLoadBalancerListeners",
                "elasticloadbalancing:ConfigureHealthCheck",
                "elasticloadbalancing:DeleteLoadBalancer",
                "elasticloadbalancing:DeleteLoadBalancerListeners",
                "elasticloadbalancing:DescribeLoadBalancers",
                "elasticloadbalancing:DescribeLoadBalancerAttributes",
                "elasticloadbalancing:DetachLoadBalancerFromSubnets",
                "elasticloadbalancing:DeregisterInstancesFromLoadBalancer",
                "elasticloadbalancing:ModifyLoadBalancerAttributes",
                "elasticloadbalancing:RegisterInstancesWithLoadBalancer",
                "elasticloadbalancing:SetLoadBalancerPoliciesForBackendServer",
                "elasticloadbalancing:AddTags",
                "elasticloadbalancing:CreateListener",
                "elasticloadbalancing:CreateTargetGroup",
                "elasticloadbalancing:DeleteListener",
                "elasticloadbalancing:DeleteTargetGroup",
                "elasticloadbalancing:DescribeListeners",
                "elasticloadbalancing:DescribeLoadBalancerPolicies",
                "elasticloadbalancing:DescribeTargetGroups",
                "elasticloadbalancing:DescribeTargetHealth",
                "elasticloadbalancing:ModifyListener",
                "elasticloadbalancing:ModifyTargetGroup",
                "elasticloadbalancing:RegisterTargets",
                "elasticloadbalancing:SetLoadBalancerPoliciesOfListener",
                "iam:CreateServiceLinkedRole",
                "kms:DescribeKey"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
  2. Create the policy by referencing this file using the following command:
$ aws iam create-policy --policy-name k8s-cluster-iam-master --policy-document file://master_policy.json
{
    "Policy": {
        "PolicyName": "k8s-cluster-iam-master",
        "PermissionsBoundaryUsageCount": 0,
        "CreateDate": "2019-11-10T13:04:53Z",
        "AttachmentCount": 0,
        "IsAttachable": true,
        "PolicyId": "ANPA3P7UTH5EDNZXB6QZC",
        "DefaultVersionId": "v1",
        "Path": "/",
        "Arn": "arn:aws:iam::790250078024:policy/k8s-cluster-iam-master",
        "UpdateDate": "2019-11-10T13:04:53Z"
    }
}

Make sure you copy the policy’s Arn string from the output because we’ll need it when we attach the policy to the role.

  3. The next step is to create a role. Our role will be used by EC2 instances, so we need to specify that using a trust policy document. Create a file called trust_policy.json (again, the name does not matter) and add the following to it:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Create a role referencing this document using the following command:

$ aws iam create-role --role-name k8s-cluster-iam-master --assume-role-policy-document file://trust_policy.json
{
    "Role": {
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "sts:AssumeRole",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "ec2.amazonaws.com"
                    }
                }
            ]
        },
        "RoleId": "AROA3P7UTH5EIA2JMZEXP",
        "CreateDate": "2019-11-10T13:16:34Z",
        "RoleName": "k8s-cluster-iam-master",
        "Path": "/",
        "Arn": "arn:aws:iam::790250078024:role/k8s-cluster-iam-master"
    }
}

Create the necessary instance profile:

$ aws iam create-instance-profile --instance-profile-name k8s-cluster-iam-master-Instance-Profile
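
At this point the policy, the role, and the instance profile are still three disconnected objects. To wire them together, attach the policy to the role (using the Arn you copied earlier) and add the role to the instance profile; a sketch using the Arn from the sample output above:

# Attach the master policy to the master role (use the Arn from your own output)
aws iam attach-role-policy --role-name k8s-cluster-iam-master --policy-arn arn:aws:iam::790250078024:policy/k8s-cluster-iam-master
# Put the role inside the instance profile so EC2 instances can assume it
aws iam add-role-to-instance-profile --instance-profile-name k8s-cluster-iam-master-Instance-Profile --role-name k8s-cluster-iam-master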

The worker role

We need to replicate the above steps but for the worker role:

  1. Create a policy file (say worker_policy.json) and add the following lines to it:
{
   "Version": "2012-10-17",
   "Statement": [
       {
           "Effect": "Allow",
           "Action": [
               "ec2:DescribeInstances",
               "ec2:DescribeRegions",
               "ecr:GetAuthorizationToken",
               "ecr:BatchCheckLayerAvailability",
               "ecr:GetDownloadUrlForLayer",
               "ecr:GetRepositoryPolicy",
               "ecr:DescribeRepositories",
               "ecr:ListImages",
               "ecr:BatchGetImage"
           ],
           "Resource": "*"
       }
   ]
}
  2. You may notice that this policy contains far fewer actions than the one used for the master. That’s because most of the work is done on the master node. Now, let’s create the policy:
$ aws iam create-policy --policy-name k8s-cluster-iam-worker --policy-document file://worker_policy.json
{
    "Policy": {
        "PolicyName": "k8s-cluster-iam-worker",
        "PermissionsBoundaryUsageCount": 0,
        "CreateDate": "2019-11-10T14:00:24Z",
        "AttachmentCount": 0,
        "IsAttachable": true,
        "PolicyId": "ANPA3P7UTH5EKKE3YNLQ5",
        "DefaultVersionId": "v1",
        "Path": "/",
        "Arn": "arn:aws:iam::790250078024:policy/k8s-cluster-iam-worker",
        "UpdateDate": "2019-11-10T14:00:24Z"
    }
}
  3. Create the IAM role. We can use the same trust policy document that we used with the master role:
$ aws iam create-role --role-name k8s-cluster-iam-worker --assume-role-policy-document file://trust_policy.json
{
    "Role": {
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "sts:AssumeRole",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "ec2.amazonaws.com"
                    }
                }
            ]
        },
        "RoleId": "AROA3P7UTH5EGLAGYFRAV",
        "CreateDate": "2019-11-10T14:02:49Z",
        "RoleName": "k8s-cluster-iam-worker",
        "Path": "/",
        "Arn": "arn:aws:iam::790250078024:role/k8s-cluster-iam-worker"
    }
}

Create the instance profile for this role:

$ aws iam create-instance-profile --instance-profile-name k8s-cluster-iam-worker-Instance-Profile
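
As with the master, wire everything together by attaching the policy to the role and adding the role to the instance profile; again, a sketch using the Arn from the sample output above:

# Attach the worker policy to the worker role (use the Arn from your own output)
aws iam attach-role-policy --role-name k8s-cluster-iam-worker --policy-arn arn:aws:iam::790250078024:policy/k8s-cluster-iam-worker
# Put the role inside the instance profile
aws iam add-role-to-instance-profile --instance-profile-name k8s-cluster-iam-worker-Instance-Profile --role-name k8s-cluster-iam-worker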

Step 08: Create a base AMI

We’ll be creating several nodes, and each of them will contain basically the same software packages. Instead of repeating ourselves, we’ll do everything once on an EC2 instance and then create an AMI from it. Later on, we can use this AMI as the base image for our Kubernetes cluster nodes.

Creating the EC2 instance

Let’s start by creating the security group that this instance will use:

$ K8S_AMI_SG_ID=$(aws ec2 create-security-group --group-name k8s-ami --description "Kubernetes AMI Instances" --vpc-id $VPCID --query GroupId --output text)

The base instance will be created in the private subnet. To access it, we need a rule in the security group that allows SSH access from the bastion server:

$ aws ec2 authorize-security-group-ingress --group-id $K8S_AMI_SG_ID --protocol tcp --port 22 --source-group $BASTION_SG_ID

Now, let’s create the instance itself:

$ K8S_AMI_INSTANCE_ID=$(aws ec2 run-instances --subnet-id $PRIVATE_SUBNET_ID --image-id $UBUNTU_AMI_ID --instance-type t3.micro --key-name lso --security-group-ids $K8S_AMI_SG_ID --query "Instances[0].InstanceId" --output text)

Notice that this instance will not have a public IP address assigned to it. It uses the lso keypair and is located in the private subnet.

As usual, let’s tag our instance:

$ aws ec2 create-tags --resources $K8S_AMI_INSTANCE_ID --tags Key=Name,Value=kubernetes-node-ami

And grab the private IP address so that we can connect to it from our bastion server:

$ K8S_AMI_IP=$(aws ec2 describe-instances --instance-ids $K8S_AMI_INSTANCE_ID --query "Reservations[0].Instances[0].PrivateIpAddress" --output text)

Now, we can try connecting to the instance through our bastion host using a command like the following:

$ ssh -J ubuntu@$BASTION_IP ubuntu@$K8S_AMI_IP

The -J option lets you specify a jump host through which SSH connects to the intended instance.
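
If you’d rather not type the -J option every time, an equivalent (optional) snippet in your ~/.ssh/config declares the bastion as the jump host for the whole private range; the host pattern and key path below are illustrative assumptions:

# ~/.ssh/config (illustrative)
Host 10.0.*.*
    User ubuntu
    IdentityFile ~/lso.pem
    ProxyJump ubuntu@<bastion-public-ip>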

Installing Kubernetes components

Once we’re inside the instance, we need to start installing the different components that Kubernetes needs to function.

Docker

Before installing Docker, we need to set the iptables FORWARD policy so that traffic arriving at a Kubernetes Service can be correctly routed to the pods. If we don’t, Docker switches the FORWARD chain policy to DROP and those packets get discarded:

sudo mkdir -p /etc/systemd/system/docker.service.d/ && printf "[Service]\nExecStartPost=/sbin/iptables -P FORWARD ACCEPT" | sudo tee /etc/systemd/system/docker.service.d/10-iptables.conf

The command is just shorthand for creating the docker.service.d directory, creating a 10-iptables.conf file inside it, and adding the iptables command that will get executed when the Docker daemon starts.

Now, let’s actually install Docker:

$ sudo apt-get update && sudo apt-get install -y docker.io

Make sure the service is enabled on system start

$ sudo systemctl enable docker

Installing kubeadm, kubelet, and kubectl

Now, we need to install the core Kubernetes tools: kubeadm, which bootstraps the cluster; the kubelet, which runs on every node and manages its containers; and kubectl, the command-line client we’ll use to talk to the API server.

Add the repository key

$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

Add the repository

$ sudo apt-add-repository 'deb http://apt.kubernetes.io/ kubernetes-xenial main'

Install the packages

$ sudo apt-get update && sudo apt-get install -y kubelet kubeadm kubectl

Generate the AMI

It’s time to convert this instance to an AMI so that it can be used to spawn new instances. Any instances that are created based on this image will have Docker, kubelet, kubeadm, and kubectl already installed. Additionally, the iptables rule that we used with Docker will be applied. The first step in creating the AMI is shutting down the instance:

$ sudo shutdown -h now

We may need to wait a few seconds until the machine is fully shut down. You can check the status of the instance using a command like the following:

$ aws ec2 describe-instance-status --instance-id $K8S_AMI_INSTANCE_ID --region us-west-2 --query "InstanceStatuses[].InstanceState.Name" --output text
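
Or, instead of polling, use the CLI waiter:

# Blocks until the instance is fully stopped
aws ec2 wait instance-stopped --instance-ids $K8S_AMI_INSTANCE_ID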

Once shut down, we can create the AMI by using the following command:

$ K8S_AMI_ID=$(aws ec2 create-image --name k8s --instance-id $K8S_AMI_INSTANCE_ID --description "Kubernetes" --query ImageId --output text)

It may take a few minutes for the AMI to become available. You can check its status using a command like the following:

$ aws ec2 describe-images --owners self --image-ids $K8S_AMI_ID  --query "Images[0].State"
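
Here, too, a waiter can save you the manual polling:

# Blocks until the AMI state becomes "available"
aws ec2 wait image-available --image-ids $K8S_AMI_ID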

Step 09: Launch the Master node instance

We start by creating the security group:

$ K8S_MASTER_SG_ID=$(aws ec2 create-security-group --group-name k8s-master --description "Kubernetes Master Hosts" --vpc-id $VPCID --query GroupId --output text)

As usual, the node is created in the private subnet. Let’s allow SSH traffic from the bastion server so that we can access the node:

$ aws ec2 authorize-security-group-ingress --group-id $K8S_MASTER_SG_ID --protocol tcp --port 22 --source-group $BASTION_SG_ID

Let’s create the instance itself:

$ K8S_MASTER_INSTANCE_ID=$(aws ec2 run-instances --private-ip-address 10.0.0.10 --subnet-id $PRIVATE_SUBNET_ID --image-id $K8S_AMI_ID --instance-type t3.medium --key-name lso --security-group-ids $K8S_MASTER_SG_ID --iam-instance-profile Name=k8s-cluster-iam-master-Instance-Profile --query "Instances[0].InstanceId" --output text)

Notice that we assigned a fixed IP to our instance, 10.0.0.10. Now, let’s add the necessary Name and other Kubernetes tags to the instance:

$ aws ec2 create-tags --resources $K8S_MASTER_INSTANCE_ID --tags Key=Name,Value=lso-k8s-master Key=kubernetes.io/cluster/lso,Value=owned

In a few moments, the instance will be created. We can log in to it using the following command:

$ ssh -J ubuntu@$BASTION_IP ubuntu@10.0.0.10

The first thing we need to do is set the hostname of the instance. It’s important that the hostname matches the DNS name set by AWS:

sudo hostnamectl set-hostname $(curl -s http://169.254.169.254/latest/meta-data/hostname) && hostnamectl status

We use curl to grab the DNS hostname that AWS assigned to the instance. The IP address 169.254.169.254 is a special one: it serves the instance metadata, such as the DNS name, the IP addresses, and the instance type, among other things.
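
For example, here is how you would query a couple of other useful paths from inside an instance (the worker bootstrap script later in this article uses local-ipv4 in exactly this way):

# Private IP address of the instance
curl -s http://169.254.169.254/latest/meta-data/local-ipv4
# Instance type, e.g. t3.micro
curl -s http://169.254.169.254/latest/meta-data/instance-type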

We also need to configure the kubelet to work with the AWS cloud provider. This can be done using the following command:

$ printf '[Service]\nEnvironment="KUBELET_EXTRA_ARGS=--cloud-provider=aws --node-ip=10.0.0.10"' | sudo tee /etc/systemd/system/kubelet.service.d/20-aws.conf

What the above command does is add a drop-in file to the kubelet’s systemd configuration, ensuring the kubelet binary is started with the --cloud-provider=aws flag. Additionally, it sets the node IP to the master’s address, 10.0.0.10. Once done, we need to reload systemd and restart the kubelet to apply the new configuration:

$ sudo systemctl daemon-reload && sudo systemctl restart kubelet

Now, we need to actually bootstrap the cluster by running the kubeadm command. By itself, kubeadm uses default values, so we pass a configuration file to override some of them. Create a new file called kubeadm.yaml and add the following:

---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    cloud-provider: aws
clusterName: lso
controlPlaneEndpoint: ip-10-0-0-10.us-west-2.compute.internal
controllerManager:
  extraArgs:
    cloud-provider: aws
    configure-cloud-routes: "false"
kubernetesVersion: stable
networking:
  dnsDomain: cluster.local
  podSubnet: 10.0.0.0/16
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cloud-provider: aws

We just instructed kubeadm to configure the API server and controller manager for the AWS cloud provider, set the cluster name, and pinned the control plane endpoint to the master’s internal DNS name. Execute the following command to start the cluster:

sudo kubeadm init --config=kubeadm.yaml

The command will print a few messages about how to access the cluster. It will also print the command that you’ll use to let other nodes join the cluster. Take a note of this command. For example:

kubeadm join ip-10-0-0-10.us-west-2.compute.internal:6443 --token 0xvon9.vjw81xr695l0jy2p --discovery-token-ca-cert-hash sha256:05c7bf8417567f34817e2aafde52c21806c90a15a61f327d12a7eb6172b149a0

The next step is to apply the command that the tool printed for us:

ubuntu@ip-10-0-0-10:~$   mkdir -p $HOME/.kube
ubuntu@ip-10-0-0-10:~$   sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
ubuntu@ip-10-0-0-10:~$   sudo chown $(id -u):$(id -g) $HOME/.kube/config

Now, if you run kubectl version you should get the version of the Kubernetes server (1.16.2) as well as the kubectl client:

ubuntu@ip-10-0-0-10:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}

We’re not done yet! If you run kubectl get nodes you’ll find that the master node is NotReady:

ubuntu@ip-10-0-0-10:~$ kubectl get nodes
NAME                                      STATUS     ROLES    AGE   VERSION
ip-10-0-0-10.us-west-2.compute.internal   NotReady   master   11m   v1.16.2

The reason is that we need to install the network plugin that Kubernetes will use. In this lab, we’ll use WeaveNet:

ubuntu@ip-10-0-0-10:~$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

The deployment creates the necessary pods for the network plugin, and CoreDNS can now start running. You can see that by running the following:

ubuntu@ip-10-0-0-10:~$ kubectl get pods -n kube-system
NAME                                                              READY   STATUS    RESTARTS   AGE
coredns-5644d7b6d9-d6drr                                          1/1     Running   0          22m
coredns-5644d7b6d9-lrct9                                          1/1     Running   0          22m
etcd-ip-10-0-0-10.us-west-2.compute.internal                      1/1     Running   0          39m
kube-apiserver-ip-10-0-0-10.us-west-2.compute.internal            1/1     Running   0          39m
kube-controller-manager-ip-10-0-0-10.us-west-2.compute.internal   1/1     Running   0          39m
kube-proxy-xcdbn                                                  1/1     Running   0          40m
kube-scheduler-ip-10-0-0-10.us-west-2.compute.internal            1/1     Running   0          39m
weave-net-fjwjr                                                   2/2     Running   0          40s

If we have a look at our nodes, we find that the master node is now in the Ready state:

ubuntu@ip-10-0-0-10:~$ kubectl get nodes
NAME                                      STATUS   ROLES    AGE    VERSION
ip-10-0-0-10.us-west-2.compute.internal   Ready    master   106s   v1.16.2

Congratulations! You have a Kubernetes cluster running.

Step 10: Enabling API access from your workstation

Typically, you don’t want to log in to the master node whenever you want to execute a command against the API server (for example, deploying a Pod or a Service or troubleshooting an application). So, let’s configure the security group to allow access to the API server on its port (6443):

$ aws ec2 authorize-security-group-ingress --group-id $K8S_MASTER_SG_ID --protocol tcp --port 6443 --source-group $BASTION_SG_ID

Starting at this point, we need to enable communication between our own workstation and the Kubernetes master. We could create plain old SSH tunnels, but that would be a lot of work: we’d need to forward port 22 and port 6443 (for the API) and also resolve hostnames using the VPC’s internal DNS server. The easier solution is to use a tool like sshuttle, which is available for all major operating systems. Once installed, you can use it as follows:

sshuttle -D --dns -r ubuntu@52.12.189.248 10.0.0.0/16

Notice that we had to use the actual IP address of our bastion server instead of $BASTION_IP because sshuttle may need sudo access, which will open a root session where the environment variables that we defined are not available. Execute the following command to download the kubeconfig file from the master node:

scp -i lso.pem ubuntu@10.0.0.10:~/.kube/config .

Notice that we no longer go through the bastion server explicitly; the tunnel we created already routes the traffic through it.

Let’s ensure that the kubectl command is able to connect to the remote API server by running the following command:

$ kubectl get nodes --kubeconfig=./config
NAME                                      STATUS   ROLES    AGE   VERSION
ip-10-0-0-10.us-west-2.compute.internal   Ready    master   74m   v1.16.2
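
To avoid passing --kubeconfig on every command, you can point kubectl at the downloaded file for the current shell session (or merge it into ~/.kube/config):

# Use the downloaded kubeconfig for this shell session
export KUBECONFIG=$PWD/config
kubectl get nodes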

OK, now we have the master node running and we can successfully connect to it through the bastion server. The last step in this long tutorial is to add a worker node.

Step 11: Adding a worker node

Let’s start by creating the necessary security group for all the worker nodes that we’ll create:

K8S_NODES_SG_ID=$(aws ec2 create-security-group --group-name k8s-nodes --description "Kubernetes Nodes" --vpc-id $VPCID --query GroupId --output text)

Next, enable SSH access from the bastion server:

aws ec2 authorize-security-group-ingress --group-id $K8S_NODES_SG_ID --protocol tcp --port 22 --source-group $BASTION_SG_ID

The worker nodes will need access to the API server on the master node. We need to enable this access:

aws ec2 authorize-security-group-ingress --group-id $K8S_MASTER_SG_ID --protocol tcp --port 6443 --source-group $K8S_NODES_SG_ID

Additionally, we need the workers to be able to access the DNS addon on the master node:

aws ec2 authorize-security-group-ingress --group-id $K8S_MASTER_SG_ID --protocol all --port 53 --source-group $K8S_NODES_SG_ID

Also, the master node needs to have access to the kubelet API:

aws ec2 authorize-security-group-ingress --group-id $K8S_NODES_SG_ID --protocol tcp --port 10250 --source-group $K8S_MASTER_SG_ID 
aws ec2 authorize-security-group-ingress --group-id $K8S_NODES_SG_ID --protocol tcp --port 10255 --source-group $K8S_MASTER_SG_ID

The last rule we need to add is to enable pod intercommunication:

aws ec2 authorize-security-group-ingress --group-id $K8S_NODES_SG_ID --protocol all --port -1 --source-group $K8S_NODES_SG_ID

Now, we need to create a script that each worker node will run when it starts. The script simply joins the cluster using the command that kubeadm printed for us earlier. Create a script called user_data.sh and add the following:

#!/bin/bash
set -exuo pipefail
hostnamectl set-hostname $(curl http://169.254.169.254/latest/meta-data/hostname)
cat <<EOT > /etc/systemd/system/kubelet.service.d/20-aws.conf
[Service]
Environment="KUBELET_EXTRA_ARGS=--cloud-provider=aws --node-ip=$(curl http://169.254.169.254/latest/meta-data/local-ipv4) --node-labels=node-role.kubernetes.io/node"
EOT
systemctl daemon-reload
systemctl restart kubelet
kubeadm join ip-10-0-0-10.us-west-2.compute.internal:6443 --token zgfjak.y6tdgphoe37n9jxz --discovery-token-ca-cert-hash sha256:1605df32eecf247a0bd121fbf0ad4ee86f391c59b5bca71a5096c4bfa78d12f0

The script sets the hostname, adds the necessary flags for the kubelet service, reloads systemd, restarts the kubelet, and then joins the cluster using the command kubeadm provided for us.

Next, we need to create an Auto Scaling launch configuration. This is the template AWS uses to launch new EC2 instances automatically based on the conditions that we specify:

$ aws autoscaling create-launch-configuration --launch-configuration-name k8s-node-1.16.2-t3-medium-001 --image-id $K8S_AMI_ID --key-name lso --security-groups $K8S_NODES_SG_ID --user-data file://user_data.sh --instance-type t3.medium --iam-instance-profile k8s-cluster-iam-worker-Instance-Profile --no-associate-public-ip-address
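
A launch configuration on its own doesn’t start any instances; it only describes them. You would normally create an Auto Scaling group that references it. The command below is a minimal single-node sketch, with the group name lso-nodes being an arbitrary choice:

# Create a one-node Auto Scaling group in the private subnet (group name is an assumption)
aws autoscaling create-auto-scaling-group --auto-scaling-group-name lso-nodes --launch-configuration-name k8s-node-1.16.2-t3-medium-001 --min-size 1 --max-size 1 --desired-capacity 1 --vpc-zone-identifier $PRIVATE_SUBNET_ID --tags "ResourceId=lso-nodes,ResourceType=auto-scaling-group,Key=kubernetes.io/cluster/lso,Value=owned,PropagateAtLaunch=true"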

In a few moments, you should have a worker node running and joining the cluster. You can check by running:

ubuntu@ip-10-0-0-10:~$ kubectl get nodes
NAME                                        STATUS   ROLES    AGE   VERSION
ip-10-0-0-10.us-west-2.compute.internal     Ready    master   42m   v1.16.2
ip-10-0-15-192.us-west-2.compute.internal   Ready    <none>   12s   v1.16.2

Step 12: Deploying a sample application

Let’s conclude this lab by deploying a sample application on our cluster. Create a YAML file (call it deploy.yaml) and add the following:

---
apiVersion: apps/v1
kind: Deployment
metadata:
 name: frontend
 labels:
   app: frontend
spec:
 replicas: 3
 selector:
   matchLabels:
     app: frontend
 template:
   metadata:
     labels:
       app: frontend
   spec:
     containers:
     - name: app
       image: nginx
---
apiVersion: v1
kind: Service
metadata:
 name: frontend-svc
spec:
 selector:
   app: frontend
 ports:
 - name: http
   port: 80
   targetPort: 80
   protocol: TCP
 type: LoadBalancer

The file contains a simple deployment for nginx and a Service that exposes the pods to the outside world. The service is of type LoadBalancer so we get to see how Kubernetes creates an Elastic Load Balancer for us automatically. Now, apply the above definition as follows:

$ kubectl apply -f deploy.yaml
deployment.apps/frontend created
service/frontend-svc created

Give it a few seconds and run the following commands to check what was created for us:

$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
frontend-86748d5fcd-j5hh5   1/1     Running   0          33s
frontend-86748d5fcd-kdt66   1/1     Running   0          33s
frontend-86748d5fcd-tn7nd   1/1     Running   0          33s

And the service:

$ kubectl get svc
NAME           TYPE           CLUSTER-IP      EXTERNAL-IP                                                              PORT(S)        AGE
frontend-svc   LoadBalancer   10.108.237.28   a2919d32df3a64941a48dfdfd30a6bc7-607501052.us-west-2.elb.amazonaws.com   80:31450/TCP   2m45s
kubernetes     ClusterIP      10.96.0.1       <none>                                                                   443/TCP        61m

OK, so the Load Balancer was created for us and the Service is displaying its auto-generated DNS name. Let’s test it by opening that DNS name in the browser:
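
If you prefer the command line, a quick curl against the ELB hostname (the EXTERNAL-IP value from your own kubectl get svc output) should return the default nginx page; note that the load balancer’s DNS name can take a minute or two to start resolving:

# Replace the hostname with the EXTERNAL-IP from your own service
curl -I http://a2919d32df3a64941a48dfdfd30a6bc7-607501052.us-west-2.elb.amazonaws.com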

If you’ve made it this far, you should pat yourself on the back. You’ve successfully built a Kubernetes cluster on AWS from scratch without using any supporting tools.

NOTE

Don’t forget to delete the resources that you created on AWS once you no longer need this lab so that you’re not charged for them.