Deploy Private Docker Registry on GCP with Nexus, Terraform and Packer

In this post, I will walk you through how to deploy Sonatype Nexus OSS 3 on Google Cloud Platform and how to create a private Docker hosted repository to store your Docker images and other build artifacts (Maven, npm, PyPI, etc.). To achieve this, we will use Packer to bake a golden machine image with Nexus preinstalled and configured, and Terraform to deploy a Google Compute Engine instance based on that image. The following diagram describes the build workflow:



PS: All the templates used in this tutorial can be found on my GitHub.

To get started, we need to create the machine image to be used with Google Compute Engine (GCE). Packer will create a temporary instance based on the CentOS image and use a shell script to provision the instance:

{
"variables" : {
"zone" : "YOUR ZONE",
"project" : "YOUR PROJECT ID",
"source_image" : "centos-7-v20181210",
"ssh_username" : "packer",
"credentials_path" : "PATH/account.json"
},
"builders" : [
{
"type": "googlecompute",
"account_file": "{{user `credentials_path`}}",
"project_id": "{{user `project`}}",
"source_image": "{{user `source_image`}}",
"ssh_username": "{{user `ssh_username`}}",
"zone": "{{user `zone`}}",
"image_name" : "nexus-v3-14-0-04"
}
],
"provisioners" : [
{
"type" : "file",
"source" : "./nexus.rc",
"destination" : "/tmp/nexus.rc"
},
{
"type" : "file",
"source" : "./repository.json",
"destination" : "/tmp/repository.json"
},
{
"type" : "shell",
"script" : "./setup.sh",
"execute_command" : "sudo -E -S sh '{{ .Path }}'"
}
]
}

The shell script will install the latest stable version of Nexus OSS based on the official documentation and wait for the service to be up and running. It will then use the Scripting API to upload and run a Groovy script:

#!/bin/bash

NEXUS_USERNAME="admin"
NEXUS_PASSWORD="admin123"

echo "Install Java JDK 8"
yum update -y
yum install -y java-1.8.0-openjdk wget

echo "Install Nexus OSS"
mkdir /opt/nexus
cd /opt/nexus
wget https://download.sonatype.com/nexus/3/latest-unix.tar.gz
tar -xvf latest-unix.tar.gz
rm latest-unix.tar.gz
# The extracted directory name depends on the version that latest-unix.tar.gz resolves to
mv nexus-3* nexus
useradd nexus
chown -R nexus:nexus /opt/nexus/
ln -s /opt/nexus/nexus/bin/nexus /etc/init.d/nexus
cd /etc/init.d
chkconfig --add nexus
chkconfig --levels 345 nexus on
mv /tmp/nexus.rc /opt/nexus/nexus/bin/nexus.rc
service nexus restart

until $(curl --output /dev/null --silent --head --fail http://localhost:8081); do
printf '.'
sleep 2
done


echo "Upload Groovy Script"
curl -v -X POST -u $NEXUS_USERNAME:$NEXUS_PASSWORD --header "Content-Type: application/json" 'http://localhost:8081/service/rest/v1/script' -d @/tmp/repository.json

echo "Execute it"
curl -v -X POST -u $NEXUS_USERNAME:$NEXUS_PASSWORD --header "Content-Type: text/plain" 'http://localhost:8081/service/rest/v1/script/docker-repository/run'
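
The wait loop in the script above polls Nexus forever if the service never starts. As a rough sketch only (Python, not part of the original templates), the same polling idea with a timeout could look like the following. The names `wait_until` and `nexus_is_up` are hypothetical helpers introduced for illustration, and the 300-second timeout is an assumption.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def wait_until(check: Callable[[], bool], interval: float = 2.0,
               timeout: float = 300.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() every `interval` seconds until it succeeds or `timeout` elapses."""
    waited = 0.0
    while waited <= timeout:
        if check():
            return True
        sleep(interval)
        waited += interval
    return False


def nexus_is_up(url: str = "http://localhost:8081") -> bool:
    """Return True if a HEAD request to the Nexus UI succeeds (hypothetical helper)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        return False


# Usage (commented out because it needs a running Nexus instance):
# if not wait_until(nexus_is_up):
#     raise SystemExit("Nexus did not start in time")
```

Failing the Packer build when the timeout is reached avoids baking an image in which the repository was never created.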

The script will create a Docker private registry listening on port 5000:

import org.sonatype.nexus.blobstore.api.BlobStoreManager; 
import org.sonatype.nexus.repository.storage.WritePolicy;

repository.createDockerHosted('mlabouardy', 5000, 443, BlobStoreManager.DEFAULT_BLOBSTORE_NAME, true, true, WritePolicy.ALLOW)
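
The repository.json file uploaded by the setup script is not shown in this post. As far as I know, the Nexus 3 Script API expects a JSON body with `name`, `type` and `content` fields, so the file could be generated with a small Python sketch like the one below. The script name `docker-repository` is inferred from the run URL used in the setup script, and `build_script_payload` is a hypothetical helper.

```python
import json

# The Groovy script from the post, embedded as the script content.
GROOVY_SCRIPT = """\
import org.sonatype.nexus.blobstore.api.BlobStoreManager;
import org.sonatype.nexus.repository.storage.WritePolicy;

repository.createDockerHosted('mlabouardy', 5000, 443, BlobStoreManager.DEFAULT_BLOBSTORE_NAME, true, true, WritePolicy.ALLOW)
"""


def build_script_payload(name: str, content: str) -> str:
    """Serialize a Groovy script into the JSON body posted to /service/rest/v1/script."""
    return json.dumps({"name": name, "type": "groovy", "content": content}, indent=2)


payload = build_script_payload("docker-repository", GROOVY_SCRIPT)
print(payload)  # redirect this output to repository.json
```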

Once the template files are defined, issue the packer build command to bake our machine image:



If you head back to the Images section of the Compute Engine dashboard, a new image named nexus-v3-14-0-04 (the image_name set in the Packer template) should be listed:



Now we are ready to deploy Nexus. We will create a Nexus server based on the machine image we baked with Packer. The template file creates a network, a firewall rule that allows inbound traffic on ports 8081 (Nexus GUI) and 22 (SSH) from anywhere, and a Google Compute Engine instance based on the Nexus image:

provider "google" {
credentials = "${file("${var.credentials}")}"
project = "${var.project}"
region = "${var.region}"
}

resource "google_compute_firewall" "nexus" {
name = "nexus-firewall"
network = "${google_compute_network.nexus.name}"

allow {
protocol = "tcp"
ports = ["22", "8081"]
}

source_ranges = ["0.0.0.0/0"]
}

resource "google_compute_network" "nexus" {
name = "nexus-network"
}

resource "google_compute_instance" "nexus" {
name = "nexus"
machine_type = "${var.instance_type}"
zone = "${var.zone}"

boot_disk {
initialize_params {
image = "${var.image_name}"
size = 100
}
}

metadata {
sshKeys = "${var.ssh_user}:${file(var.ssh_pub_key_file)}"
}

network_interface {
network = "${google_compute_network.nexus.name}"
access_config = {}
}
}

On the terminal, run the terraform init command to download and install the Google provider, shown as follows:



Create an execution plan (dry run) with the terraform plan command. It shows you things that will be created in advance, which is good for debugging and ensuring that you’re not doing anything wrong, as shown in the next screenshot:



When you’re ready, go ahead and apply the changes by issuing terraform apply:



Terraform will create the needed resources and display the public IP address of the Nexus instance in the outputs section. Jump back to the GCP Console; your Nexus instance should be created:



If you point your favorite browser to http://instance_ip:8081, you should see the Sonatype Nexus Repository Manager interface:



Click the “Sign in” button in the upper right corner and use the username “admin” and the password “admin123”. Then, click on the cogwheel to go to the server administration and configuration section. Navigate to “Repositories”, our private Docker repository should be created as follows:



The docker repository is published as expected on port 5000:



Hence, we need to allow inbound traffic on that port, so update the firewall rules accordingly:

resource "google_compute_firewall" "nexus" {
name = "nexus-firewall"
network = "${google_compute_network.nexus.name}"

allow {
protocol = "tcp"
ports = ["22", "8081", "5000"]
}

source_ranges = ["0.0.0.0/0"]
}

resource "google_compute_network" "nexus" {
name = "nexus-network"
}

Issue the terraform apply command to apply the changes:



Your private Docker registry is now available at instance_ip:5000. Let’s test it by pushing a Docker image.

Since we have exposed the private Docker registry on a plain HTTP endpoint, we need to configure the Docker daemon that will act as a client of the registry to allow insecure connections.



  • On Windows or Mac OS X: Click on the Docker icon in the tray to open Preferences. Click on the Daemon tab and add the public IP address of the Nexus instance, along with port number 5000, in the Insecure registries section. Don’t forget to click Apply & Restart for the changes to take effect.
  • Other OS: Follow the official guide.
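
On Linux, the same setting usually lives in the Docker daemon configuration file (typically /etc/docker/daemon.json) under the `insecure-registries` key. The Python sketch below only shows how that key could be merged into an existing configuration. `add_insecure_registry` is a hypothetical helper and `INSTANCE_IP` is a placeholder for your instance address.

```python
import json


def add_insecure_registry(daemon_config: dict, registry: str) -> dict:
    """Return a copy of a Docker daemon config that trusts `registry` over plain HTTP."""
    updated = dict(daemon_config)
    registries = list(updated.get("insecure-registries", []))
    if registry not in registries:  # keep the operation idempotent
        registries.append(registry)
    updated["insecure-registries"] = registries
    return updated


# INSTANCE_IP is a placeholder for the public IP of the Nexus instance.
config = add_insecure_registry({}, "INSTANCE_IP:5000")
print(json.dumps(config, indent=2))
```

The Docker daemon has to be restarted after the file is changed.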

You should now be able to log in to your private Docker registry using the following command:



And push your docker images to the registry with the docker push command:



If you head back to Nexus Dashboard, your docker image should be stored with the latest tag:



Drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

Build a Ruby based Lambda Function

At AWS re:Invent 2018, it was announced that Ruby is now a supported language for AWS Lambda. In this post, I walk you through how to write your very first Ruby-based Lambda function from scratch, followed by how to configure, deploy, and test a Lambda function.



API Gateway will forward incoming requests to the target Ruby based Lambda function, which will call the corresponding DynamoDB operation on the movies table.

To get started, create a Lambda execution role with permission to invoke the Scan operation on the DynamoDB table:

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "1",
"Effect": "Allow",
"Action": "dynamodb:Scan",
"Resource": [
"arn:aws:dynamodb:eu-west-3:*:table/movies",
"arn:aws:dynamodb:eu-west-3:*:table/movies/index/*"
]
}
]
}

The function entry point below is self-explanatory: it uses the AWS SDK (the package is pre-installed in Lambda) to instantiate a DynamoDB client in the appropriate region and issues a Scan operation on the DynamoDB table (whose name is defined in an environment variable):

require 'aws-sdk'
require 'json'

def lambda_handler(event:, context:)
dynamodb = Aws::DynamoDB::Client.new(region: ENV['AWS_REGION'])

resp = dynamodb.scan({
table_name: ENV['TABLE_NAME'],
})
{ statusCode: 200, body: JSON.generate(resp.items) }
end

The AWS SDK for Ruby is included in the Lambda execution environment by default.

Now that our handler is defined, head to the Lambda function creation form and select the IAM role from the Existing role drop-down list (you might need to refresh the page for the new role to appear). Then, click the Create function button:



Set the table name as an environment variable:



The movies table contains a set of movies:



Create a deployment package (zip file) and update the function’s code using the AWS CLI command:

zip -r deployment.zip handler.rb
aws lambda update-function-code --function-name ScanMovies --zip-file fileb://./deployment.zip

Make sure to set the Lambda function handler to handler.lambda_handler.

Once the function has been deployed, invoke it manually using the sample event data by clicking on the “Test” button in the top right of the console.



So far, we learned how to build our first Lambda function with Ruby. We also learned how to invoke it manually from the console. To leverage the power of Lambda, we are going to learn how to trigger this Lambda function in response to incoming HTTP requests (event-driven architecture) using the AWS API Gateway service:



Create a deployment stage and open your favorite browser with the API Invoke URL; you should see a message like the one shown in the following screenshot:



The following screenshot shows a properly configured Ruby based Lambda function with IAM access to DynamoDB:



Like what you’re reading? Check out my book and learn how to build, secure, deploy and manage production-ready Serverless applications in Golang with AWS Lambda.


Full guide to building a Serverless API with zero code

A common use case of API Gateway is building API endpoints on top of Lambda functions. It can also be used as an API proxy to connect to AWS services. In this guide, I will walk you through how to create your own API using only API Gateway and DynamoDB, and go through advanced features to enhance your API endpoints, such as:

  • Mapping templates, Integration Request and Integration Response.
  • Error handling and request validation.
  • Authentication with AWS Cognito and Lambda Authorizer.
  • API throttling with usage plans and API keys.
  • API documentation generation.
  • API Gateway custom domain.

Setting up DynamoDB

To get started, create a DynamoDB table called movies with an id as a partition key (leave the read/write capacity to default values):



Next, insert a few items into the table. It should look something like this:



Next, we need to grant API Gateway access to the DynamoDB table. To do so, create an IAM role assumable by API Gateway:

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"Service": "apigateway.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}

The role will give API Gateway permission to invoke the following DynamoDB operations on movies table:

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"dynamodb:PutItem",
"dynamodb:GetItem",
"dynamodb:DeleteItem",
"dynamodb:Scan",
"dynamodb:Query"
],
"Resource": [
"arn:aws:dynamodb:eu-west-3:*:table/movies",
"arn:aws:dynamodb:eu-west-3:*:table/movies/index/*"
]
}
]
}

API Endpoints

Before going into further detail about the architecture, the following diagram shows how API Gateway and DynamoDB will fit into the API architecture:



When calling the API endpoints, the request will go through the API Gateway, which will invoke the appropriate DynamoDB operation. This returns a response which is proxied by the API Gateway to the client in a JSON format.

GET /MOVIES

Create a new API called MoviesAPI from the API Gateway console, and create a new resource; let’s call it movies:



Expose a GET method on /movies resource by clicking on “Create Method”. Select AWS Service under the “Integration type” section, choose the DynamoDB service, set the HTTP method to be POST and action type to be a Scan operation.



Next, we need to transform the HTTP request coming into API Gateway to a proper Scan request for DynamoDB. In the API Gateway console, select the “Integration Request”. All the way at the bottom we can select the Body Mapping Templates. Here, create a new application/json mapping template with the following configuration:



Deploy the API from “Actions” and create a new deployment stage. An invocation URL will be displayed:



Point your browser to the URL given or use a modern REST client like Postman. The endpoint will return a list of movies in a JSON format:



The output is returned in the DynamoDB response format. To map the raw response to a traditional JSON object structure, we will use the Integration Response feature.

Click on the “GET” method and navigate to “Integration Response”. Expand the 200 response code, then expand the “Mapping Templates” section. In Content-Type, choose application/json and create a mapping template that loops through each item of the Items array, extracts the relevant attributes of the movie item, and places them into a response structure:



A mapping template is a script expressed in Velocity Template Language (VTL) and applied to the payload using JSONPath expressions.
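
To make the transformation concrete, here is a rough Python sketch of what the integration response mapping does conceptually: it walks the Items array of a Scan response and strips the DynamoDB type descriptors (`S`, `N`, and so on). This is not the VTL template itself. The helper names and the sample response are made up for illustration.

```python
def unwrap(value: dict):
    """Convert one DynamoDB attribute value, e.g. {"S": "x"}, to a plain Python value."""
    (type_tag, raw), = value.items()  # an attribute value holds exactly one type tag
    if type_tag == "S":
        return raw
    if type_tag == "N":  # numbers are transported as strings
        try:
            return int(raw)
        except ValueError:
            return float(raw)
    if type_tag == "BOOL":
        return raw
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [unwrap(v) for v in raw]
    if type_tag == "M":
        return {k: unwrap(v) for k, v in raw.items()}
    raise ValueError(f"unsupported attribute type: {type_tag}")


def flatten_scan(response: dict) -> list:
    """Turn a raw Scan response into a list of plain dictionaries."""
    return [{k: unwrap(v) for k, v in item.items()}
            for item in response.get("Items", [])]


# Made-up sample response, for illustration only.
sample = {"Items": [{"id": {"S": "1"}, "name": {"S": "Example movie"}}], "Count": 1}
print(flatten_scan(sample))
```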

As a result, you should now see a formatted response.



GET /MOVIES/:ID

The second endpoint will be responsible for fetching a movie based on an ID provided by the client. Hence, a new resource with a path parameter should be created. The value of the ID will be made available via the $input.params('id') method:



Expose a GET method, and then link the resource to the DynamoDB service. The action will be GetItem operation:



Again, specify a body mapping template for the integration request, now with the following template:
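
For reference, the body that DynamoDB expects for a GetItem call has a `TableName` and a `Key`. The Python sketch below builds such a body, assuming the `id` partition key is stored as a string (type `S`); the post does not state the key type, and `get_item_request` is a hypothetical helper.

```python
import json


def get_item_request(table_name: str, movie_id: str) -> str:
    """Build the JSON body of a DynamoDB GetItem request for a string partition key."""
    body = {
        "TableName": table_name,
        "Key": {"id": {"S": movie_id}},  # assumption: id is of type String
    }
    return json.dumps(body)


print(get_item_request("movies", "1"))
```

In the mapping template, the literal ID would be replaced by the value of the path parameter.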



When the API URL is invoked with an ID, the movie corresponding to the ID is returned if it exists.



Similarly, we will use the integration response to map the raw DynamoDB response to the same JSON object structure we defined earlier:



If you test it out once again, the following JSON will be returned:



POST /MOVIES

Now we know how the GET method works with and without path parameters. The next step will be to insert a new item to the table. Create a POST method with PutItem as an action:



We will create a mapping template to transform the client request into the structure that the DynamoDB API PutItem requires. The below mapping template creates the JSON structure required by the DynamoDB PutItem API. The three input variables are referenced from the request JSON using the $input variable:
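
The reverse transformation, from a plain client payload to the typed structure that PutItem requires, can be sketched in Python as follows. The helpers `wrap` and `put_item_request` are hypothetical, only strings, numbers and booleans are handled, and the sample item is made up.

```python
import json


def wrap(value):
    """Convert a plain Python value to a DynamoDB attribute value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def put_item_request(table_name: str, item: dict) -> str:
    """Build the JSON body of a DynamoDB PutItem request."""
    body = {"TableName": table_name,
            "Item": {key: wrap(value) for key, value in item.items()}}
    return json.dumps(body)


# Made-up sample item, for illustration only.
print(put_item_request("movies", {"id": "42", "name": "Example movie"}))
```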



Back in the “Method Execution” pane click “TEST”. Create an example request body that matches the API definition documented above and then choose “Test”. For example, your request body could be:



Navigate to the DynamoDB console and view the movies table to show that the request really was successfully processed:



Try to insert a new movie without providing the movie’s name attribute. The following error will be returned:



It’s a DynamoDB PutItem error. Fortunately, API Gateway allows you to validate the request body before invoking the downstream resources (in our example, the DynamoDB table). To achieve this, we will use API Gateway Models. A model defines the payload data structure. Model definitions are written using JSON Schema draft 4.

In the API Gateway, navigate to the Models tab and create a new model. Fill in the form as so:



The model above defines a movie entity with 3 attributes and requires id and name attributes to be defined (used during validation).
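
Conceptually, the request validator rejects a payload before it ever reaches DynamoDB. The Python sketch below mimics only the "required fields" part of that check; it is not how API Gateway implements validation, and the function names are hypothetical.

```python
REQUIRED_FIELDS = ("id", "name")  # the required attributes named in the model


def validate_movie(payload: dict) -> list:
    """Return a list of validation errors; an empty list means the payload is valid."""
    return [f"missing required field: {field}"
            for field in REQUIRED_FIELDS if field not in payload]


def status_code(payload: dict) -> int:
    """Map the validation result to the HTTP status the client would see."""
    return 400 if validate_movie(payload) else 200


print(status_code({"id": "42"}))  # the name attribute is missing
```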

Head back to “Resources” page and click on “Method Request” from the POST method, enable the request validator option as below:



If you try to insert a new movie without providing the required parameters, a bad request message error will be returned:



You can override the default 400 message from the “Gateway Responses” as follows:



As a result, the user defined error message will be returned:



Great! Try implementing the PUT and DELETE methods:



Authentication

The serverless API that we have built so far works like a charm. However, it’s open to the public: anyone who has the API Gateway invocation URL can insert data into the DynamoDB table. Luckily, API Gateway offers two ways to handle authentication:



API Gateway Authentication with Cognito and Lambda Authorizer

AMAZON COGNITO

Create a new user pool, click on “Review defaults” to create a pool with default settings. A success message should be displayed at the end of the creation process:



After creating your first user pool, register your serverless API from “App clients” under “General settings” and select “Add an app client”. Give the application a name and check the server-based authentication ADMIN_NO_SRP_AUTH option:



Create a new user using the AWS command line:

# Create a user
aws cognito-idp sign-up --region AWS_REGION --client-id CLIENT_ID \
  --username USERNAME --password PASSWORD --user-attributes Name=email,Value=EMAIL

# Confirm sign up
aws cognito-idp admin-confirm-sign-up --region AWS_REGION --user-pool-id USER_POOL \
  --username USERNAME

Now that the user pool has been created, we can configure the API Gateway to validate access tokens from a successful user pool authentication before granting access to DynamoDB.



To begin securing API access, go to the API Gateway console, choose the RESTful API that we built previously, and click on “Authorizers” in the navigation bar. Click on the “Create New Authorizer” button and select “Cognito”. Then, select the user pool that we created earlier and set the token source field to Authorization. This defines the name of the incoming request header that contains the API caller’s identity token:



You can now secure all of the endpoints. For instance, to secure the endpoint responsible for creating a new movie, click on the corresponding POST method under the /movies resource. Click on the “Method Request” box, then on “Authorization”, and select the user pool we created previously:



Once done, redeploy the API and try to insert a new movie using the API invocation URL. This time, the endpoint is secured and requires authentication:



In order to authenticate, we need to obtain an identity token for the signed-in user from the user pool and include it in the Authorization header of the API Gateway requests. Issue the following AWS CLI command to get a new token:

aws cognito-idp admin-initiate-auth --region AWS_REGION --cli-input-json file://input.json

The command above takes a JSON file with the following attributes:

{
"UserPoolId": "USER_POOL",
"ClientId": "CLIENT_ID",
"AuthFlow": "ADMIN_NO_SRP_AUTH",
"AuthParameters": {
"USERNAME": "USERNAME",
"PASSWORD": "PASSWORD"
}
}

Once executed, the preceding command will return the following JSON response:

{
"AuthenticationResult": {
"ExpiresIn": 3600,
"IdToken": "ID_TOKEN",
"RefreshToken": "REFRESH_TOKEN",
"TokenType": "Bearer",
"AccessToken": "ACCESS_TOKEN"
},
"ChallengeParameters": {}
}

Copy the ID token and add it to the Authorization header of your request:
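
If you script this step, the ID token can be pulled out of the CLI output instead of copied by hand. The Python sketch below assumes the response shape shown above; `authorization_header` is a hypothetical helper and the token values are the placeholders from the sample response.

```python
import json


def authorization_header(cli_output: str) -> dict:
    """Extract the ID token from admin-initiate-auth output and build the request header."""
    result = json.loads(cli_output)["AuthenticationResult"]
    return {"Authorization": result["IdToken"]}


# Placeholder response mirroring the structure shown above.
sample_output = json.dumps({
    "AuthenticationResult": {
        "ExpiresIn": 3600,
        "IdToken": "ID_TOKEN",
        "RefreshToken": "REFRESH_TOKEN",
        "TokenType": "Bearer",
        "AccessToken": "ACCESS_TOKEN",
    },
    "ChallengeParameters": {},
})
print(authorization_header(sample_output))
```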



The API Gateway will verify the token and will invoke the PutItem operation on the movies table, which will insert a new movie into the table:



LAMBDA AUTHORIZER

When a client sends a request to your API, it goes through API Gateway, which extracts the token from the request and calls your Lambda authorizer function with it. The function evaluates the token, generates a policy, and sends it back to API Gateway. API Gateway then evaluates the policy and, if access is allowed, invokes the DynamoDB action registered for the API endpoint.

For the sake of simplicity, our function will check whether the token provided by the client equals our secret (stored in an environment variable) and return a policy document based on the result. The following is the function handler source code written in Node.js:

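const TOKEN = process.env.TOKEN;

// Build the authorizer response: a principal identifier plus an IAM
// policy for the requested method.
const generatePolicy = (effect, methodArn) => {
  return {
    // 'user' is a placeholder principal; use a real caller identifier if you have one.
    'principalId': 'user',
    'policyDocument': {
      'Version': '2012-10-17',
      'Statement': [
        {
          'Sid': '1',
          'Action': 'execute-api:Invoke',
          'Effect': effect,
          'Resource': methodArn
        }
      ]
    }
  }
}

exports.handler = async (event, context) => {
  // IAM policy effects are case-sensitive: "Allow" or "Deny".
  if (event.authorizationToken === TOKEN) {
    return generatePolicy('Allow', event.methodArn)
  }
  return generatePolicy('Deny', event.methodArn)
}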

Head back to API Gateway, create a new “Lambda Authorizer”, and set Authorization as the header from which API Gateway will extract the token:



Choose the method you want to secure; let’s say it is the endpoint responsible for deleting a movie from the table. Click on “Method Request” and, under Authorization, select your new authorizer:



Let’s try calling the endpoint. As expected, we’re not getting through to our real endpoint:



If you include the secret token in the Authorization header of your request, you should be able to delete an item:



Looks good!

API Throttling

You can use usage plans combined with API keys to set method-level throttling limits for your API and define how much and how fast clients can access your API (request rates and quotas).

The following procedure describes how to create a usage plan:

API USAGE

Create a usage plan called basic, with a throttling limit of 1 request per second and quota limit of 10000 requests per day:



Create a second usage plan called premium, with a throttling limit of 10 requests per second and a quota limit of 1 million requests per day:



API KEYS

Next, create two API keys:



Assign the first API key to basic usage plan and second key to premium usage plan:



Associate the usage plans we created to the API deployment stage:



Configure an API method to require an API key:



Deploy or redeploy the API for the requirement to take effect:



Now call the API with the x-api-key header set to one of your API keys. If all goes well, you will receive output like this:



If you exceed the rate limit or quota limit associated with your API key, a “Too many requests” HTTP error will be returned:
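
API Gateway documents its throttling as a token bucket algorithm. The Python sketch below is a simplified, single-process illustration of that idea, not API Gateway’s implementation. The burst sizes are assumptions, since the usage plans above only specify request rates, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Simplified token bucket: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate: float, burst: int, now: float = 0.0):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)
        self.last = now

    def allow(self, now: float) -> bool:
        """Return True if a request at time `now` is admitted, False if throttled."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the caller would answer with HTTP 429 Too Many Requests


# Rates taken from the two usage plans; the burst values are assumed.
basic = TokenBucket(rate=1, burst=1)
premium = TokenBucket(rate=10, burst=10)
```

With these settings, a basic client that sends two requests within the same second has the second one rejected, while a premium client can send ten before being throttled.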



Custom Domains

You can use your own domain name for an API and deployment stage. To do so, create a custom domain name backed by an ACM (AWS Certificate Manager) certificate:



Create a new custom domain name from API Gateway Console:



Add a path mapping to map your domain name to your API deployment stage:



Once configured, you can query your API using your custom domain name as follows: https://api.serverlessmovies.com/movies

Documentation

Before finishing this guide, we will go through how to create documentation for the serverless API we’ve built so far.

On the API Gateway console, select the deployment stage that you’re interested in generating documentation for. In the following example, I chose the sandbox environment. Then, click on the Export tab and click on the Export as Swagger section:



Swagger is an implementation of the OpenAPI Specification, a standard hosted by the Linux Foundation for describing and defining APIs. The resulting definition is called the OpenAPI specification document.

You can save the document in either a JSON or YAML file. Then, navigate to https://editor.swagger.io/ and paste the content on the website editor, it will be compiled and an HTML page will be generated as follows:




Build real-world, production-ready applications with AWS Lambda

Serverless architecture is popular in the tech community due to AWS Lambda. Go is simple to learn, straightforward to work with, and easy to read for other developers; and now it’s been heralded as a supported language for AWS Lambda. This book is your optimal guide to designing a Go serverless application and deploying it to Lambda.



This book starts with a quick introduction to the world of serverless architecture and its benefits, and then delves into AWS Lambda using practical examples. You’ll then learn how to design and build a production-ready application in Go using AWS serverless services with zero upfront infrastructure investment. The book will help you learn how to scale up serverless applications and handle distributed serverless systems in production. You will also learn how to log and test your application.

Along the way, you’ll also discover how to set up a CI/CD pipeline to automate the deployment process of your Lambda functions. Moreover, you’ll learn how to troubleshoot and monitor your apps in near real-time with services such as AWS CloudWatch and X-ray. This book will also teach you how to secure the access with AWS Cognito.

By the end of this book, you will have mastered designing, building, and deploying a Go serverless application.

Hands-On Serverless Applications with Go is available at the online stores below:







CI/CD for Lambda Functions with Jenkins

The following post will walk you through how to build a CI/CD pipeline to automate the deployment process of your Serverless applications and how to use features like code promotion, rollbacks, versions, aliases and blue/green deployment. At the end of this post, you will be able to build a pipeline similar to the following figure:



For the sake of simplicity, I wrote a simple Go based Lambda function that calculates the Fibonacci number:

package main

import (
"errors"

"github.com/aws/aws-lambda-go/lambda"
)

func fibonacci(n int) int {
if n <= 1 {
return n
}
return fibonacci(n-1) + fibonacci(n-2)
}

func handler(n int) (int, error) {
if n < 0 {
return -1, errors.New("Input must be a positive number")
}
return fibonacci(n), nil
}

func main() {
lambda.Start(handler)
}

I also implemented a couple of unit tests for both the recursive Fibonacci function and the Lambda handler:

package main

import (
"errors"
"testing"

"github.com/stretchr/testify/assert"
)

func TestFibonnaciInputLessOrEqualToOne(t *testing.T) {
assert.Equal(t, 1, fibonacci(1))
}

func TestFibonnaciInputGreatherThanOne(t *testing.T) {
assert.Equal(t, 13, fibonacci(7))
}

func TestHandlerNegativeNumber(t *testing.T) {
responseNumber, responseError := handler(-1)
assert.Equal(t, -1, responseNumber)
assert.Equal(t, errors.New("Input must be a positive number"), responseError)
}

func TestHandlerPositiveNumber(t *testing.T) {
responseNumber, responseError := handler(5)
assert.Equal(t, 5, responseNumber)
assert.Nil(t, responseError)
}

To create the function in AWS Lambda and all the necessary AWS services, I used Terraform. An S3 bucket is needed to store all the deployment packages generated through the development lifecycle of the Lambda function:

// S3 bucket
resource "aws_s3_bucket" "bucket" {
bucket = "${var.bucket}"
acl = "private"
}

The build server needs to interact with S3 bucket and Lambda functions. Therefore, an IAM instance role must be created with S3 and Lambda permissions:

// Jenkins slave instance profile
resource "aws_iam_instance_profile" "worker_profile" {
name = "JenkinsWorkerProfile"
role = "${aws_iam_role.worker_role.name}"
}

resource "aws_iam_role" "worker_role" {
name = "JenkinsBuildRole"
path = "/"

assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "ec2.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}

resource "aws_iam_policy" "s3_policy" {
name = "PushToS3Policy"
path = "/"

policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"s3:PutObject",
"s3:GetObject"
],
"Effect": "Allow",
"Resource": "${aws_s3_bucket.bucket.arn}/*"
}
]
}
EOF
}

resource "aws_iam_policy" "lambda_policy" {
name = "DeployLambdaPolicy"
path = "/"

policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"lambda:UpdateFunctionCode",
"lambda:PublishVersion",
"lambda:UpdateAlias"
],
"Effect": "Allow",
"Resource": "*"
}
]
}
EOF
}

resource "aws_iam_role_policy_attachment" "worker_s3_attachment" {
role = "${aws_iam_role.worker_role.name}"
policy_arn = "${aws_iam_policy.s3_policy.arn}"
}

resource "aws_iam_role_policy_attachment" "worker_lambda_attachment" {
role = "${aws_iam_role.worker_role.name}"
policy_arn = "${aws_iam_policy.lambda_policy.arn}"
}

An IAM role is needed for the Lambda function as well:

// Lambda IAM role
resource "aws_iam_role" "lambda_role" {
name = "FibonacciFunctionRole"
path = "/"

assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}

Finally, a Go-based Lambda function will be created with the following properties:

// Lambda function
resource "aws_lambda_function" "function" {
filename = "deployment.zip"
function_name = "Fibonacci"
role = "${aws_iam_role.lambda_role.arn}"
handler = "main"
runtime = "go1.x"
}

Next, build the deployment package with the following commands:

# Build linux binary
GOOS=linux go build -o main main.go
# Create a zip file
zip deployment.zip main

Then, issue the terraform apply command to create the resources:



Sign in to AWS Management Console and navigate to Lambda Console, a new function called “Fibonacci” should be created:



You can test it out, by mocking the input from the “Select a test event” dropdown list:



If you click on “Test” button the Fibonacci number of 7 will be returned:



So far, our function is working as expected. However, how can we ensure that each change to our codebase doesn’t break things? That’s where CI/CD comes into play: the idea is to make all code changes and features go through a complex pipeline before integrating them into the master branch and deploying them to production.

You need a Jenkins cluster with at least one worker (with Go preinstalled). You can follow my previous post for a step-by-step guide on how to build a Jenkins cluster on AWS from scratch.

Prior to the build, the IAM instance role (created with Terraform) with the write access to S3 and the update operations to Lambda must be configured on the Jenkins workers:



Jump back to the Jenkins dashboard, create a new multibranch pipeline project, and configure the GitHub repository where the source code is versioned, as follows:



Create a new file called Jenkinsfile, it defines a set of steps that will be executed on Jenkins (This definition file must be committed to the Lambda function’s code repository):

def bucket = 'deployment-packages-mlabouardy'
def functionName = 'Fibonacci'
def region = 'eu-west-3'

node('slaves'){
stage('Checkout'){
checkout scm
}

stage('Test'){
sh 'go get -u golang.org/x/lint/golint'
sh 'go get -t ./...'
sh 'golint -set_exit_status'
sh 'go vet .'
sh 'go test .'
}

stage('Build'){
sh 'GOOS=linux go build -o main main.go'
sh "zip ${commitID()}.zip main"
}

stage('Push'){
sh "aws s3 cp ${commitID()}.zip s3://${bucket}"
}

stage('Deploy'){
sh "aws lambda update-function-code --function-name ${functionName} \
--s3-bucket ${bucket} \
--s3-key ${commitID()}.zip \
--region ${region}"
}
}

def commitID() {
sh 'git rev-parse HEAD > .git/commitID'
def commitID = readFile('.git/commitID').trim()
sh 'rm .git/commitID'
commitID
}

The pipeline is divided into 5 stages:

  • Checkout: clone the GitHub repository.
  • Test: check that our code is well formatted and follows Go best practices, then run the unit tests.
  • Build: build a binary and create the deployment package.
  • Push: store the deployment package (.zip file) to an S3 bucket.
  • Deploy: update the Lambda function’s code with the new artifact.

Note the use of the git commit ID as the name of the deployment package: it gives each release a meaningful name and lets you roll back to a specific commit if things go wrong.
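To make the rollback idea concrete, here is a minimal sketch in Go. It only assembles the arguments of the same `aws lambda update-function-code` call the Deploy stage uses, pointing it at the package of an earlier commit. The helper names `packageKey` and `rollbackArgs`, and the commit ID `0a1b2c3`, are hypothetical and only serve the illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// packageKey returns the S3 object key the pipeline uses for a given commit:
// the commit ID followed by ".zip".
func packageKey(commitID string) string {
	return commitID + ".zip"
}

// rollbackArgs (hypothetical helper) builds the AWS CLI arguments that would
// redeploy the package built from an earlier commit. It uses the same flags
// as the Deploy stage of the Jenkinsfile.
func rollbackArgs(functionName, bucket, commitID, region string) []string {
	return []string{
		"lambda", "update-function-code",
		"--function-name", functionName,
		"--s3-bucket", bucket,
		"--s3-key", packageKey(commitID),
		"--region", region,
	}
}

func main() {
	// "0a1b2c3" is a made-up commit ID used only for this example.
	args := rollbackArgs("Fibonacci", "deployment-packages-mlabouardy", "0a1b2c3", "eu-west-3")
	fmt.Println("aws " + strings.Join(args, " "))
}
```

Because every package stays in the bucket under its commit ID, rolling back is just a redeploy of an older key; nothing has to be rebuilt.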

Once the project is saved, a new pipeline should be created as follows:



Once the pipeline is completed, all stages should have passed, as shown in the next screenshot:



At the end, Jenkins will update the Lambda function’s code with the update-function-code command:



If you open the S3 Console, then click on the bucket used by the pipeline, a new deployment package should be stored with a key name identical to the commit ID:



Finally, to make Jenkins trigger the build when you push to the code repository, click on “Settings” from your GitHub repository, then create a new webhook from “Webhooks”, and fill it in with a URL similar to the following:



If you’re using a Git branching workflow (you should), Jenkins will automatically discover the new branches:



To test new changes without impacting production, you should keep your deployment environments separate, which is why having multiple versions of your Lambda function makes sense.

Update the Jenkinsfile to add a new stage that publishes a new version of the Lambda function every time you push (or merge) to the master branch:

def bucket = 'deployment-packages-mlabouardy'
def functionName = 'Fibonacci'
def region = 'eu-west-3'

node('slaves'){
stage('Checkout'){
checkout scm
}

stage('Test'){
sh 'go get -u golang.org/x/lint/golint'
sh 'go get -t ./...'
sh 'golint -set_exit_status'
sh 'go vet .'
sh 'go test .'
}

stage('Build'){
sh 'GOOS=linux go build -o main main.go'
sh "zip ${commitID()}.zip main"
}

stage('Push'){
sh "aws s3 cp ${commitID()}.zip s3://${bucket}"
}

stage('Deploy'){
sh "aws lambda update-function-code --function-name ${functionName} \
--s3-bucket ${bucket} \
--s3-key ${commitID()}.zip \
--region ${region}"
}

if (env.BRANCH_NAME == 'master') {
stage('Publish') {
sh "aws lambda publish-version --function-name ${functionName} \
--region ${region}"
}
}
}

def commitID() {
sh 'git rev-parse HEAD > .git/commitID'
def commitID = readFile('.git/commitID').trim()
sh 'rm .git/commitID'
commitID
}

On the master branch, a new stage called “Publish” will be added:



As a result, a new version will be published based on the master branch source code:



However, in an agile environment (Extreme Programming, for example), the development team needs to release iterative versions of the system often to help the customer gain confidence in the progress of the project, receive feedback and detect bugs at an earlier stage of development. As a result, small releases can be frequent:



AWS services that use Lambda functions as downstream resources (API Gateway, for example) need to be updated every time a new version is published, which means operational overhead and downtime. Use aliases instead!

An alias is a pointer to a specific version; it allows you to promote a function from one environment to another (such as staging to production). Aliases are mutable, unlike versions, which are immutable.
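The difference between versions and aliases can be sketched with a toy in-memory model in Go. This is not the AWS SDK: the `Function` type and its methods are hypothetical and only mimic the behaviour described above, where published versions never change and an alias can be moved from one version to another.

```go
package main

import (
	"errors"
	"fmt"
)

// Function is a toy model of a Lambda function (hypothetical type, not part
// of the AWS SDK). Published versions are immutable snapshots; aliases are
// named pointers that can be moved between versions.
type Function struct {
	versions []string       // versions[i] holds the code identifier of version i+1
	aliases  map[string]int // alias name -> version number
}

// Publish freezes the given code identifier as a new version and returns
// its version number (1, 2, 3, ...).
func (f *Function) Publish(code string) int {
	f.versions = append(f.versions, code)
	return len(f.versions)
}

// PointAlias creates the alias, or moves it if it already exists.
// It fails when the target version has not been published.
func (f *Function) PointAlias(name string, version int) error {
	if version < 1 || version > len(f.versions) {
		return errors.New("unknown version")
	}
	if f.aliases == nil {
		f.aliases = make(map[string]int)
	}
	f.aliases[name] = version
	return nil
}

// Resolve returns the code a caller of the alias would run.
func (f *Function) Resolve(name string) (string, bool) {
	v, ok := f.aliases[name]
	if !ok {
		return "", false
	}
	return f.versions[v-1], true
}

func main() {
	fn := &Function{}
	fn.Publish("commit-a")
	v2 := fn.Publish("commit-b")

	_ = fn.PointAlias("production", 1)
	_ = fn.PointAlias("production", v2) // promote: callers still use "production"

	code, _ := fn.Resolve("production")
	fmt.Println(code)
}
```

Callers that reference the alias never need to change; only the alias is moved when a new version is promoted.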

That being said, create an alias for the production environment that points to the latest version published using the AWS command line:

aws lambda create-alias --function-name Fibonacci \
--name production --function-version 2 \
--region eu-west-3

You can now easily promote the latest version published into production by updating the production alias pointer’s value:

def bucket = 'deployment-packages-mlabouardy'
def functionName = 'Fibonacci'
def region = 'eu-west-3'

node('slaves'){
stage('Checkout'){
checkout scm
}

stage('Test'){
sh 'go get -u golang.org/x/lint/golint'
sh 'go get -t ./...'
sh 'golint -set_exit_status'
sh 'go vet .'
sh 'go test .'
}

stage('Build'){
sh 'GOOS=linux go build -o main main.go'
sh "zip ${commitID()}.zip main"
}

stage('Push'){
sh "aws s3 cp ${commitID()}.zip s3://${bucket}"
}

stage('Deploy'){
sh "aws lambda update-function-code --function-name ${functionName} \
--s3-bucket ${bucket} \
--s3-key ${commitID()}.zip \
--region ${region}"
}

if (env.BRANCH_NAME == 'master') {
stage('Publish') {
def lambdaVersion = sh(
script: "aws lambda publish-version --function-name ${functionName} --region ${region} | jq -r '.Version'",
returnStdout: true
).trim() // returnStdout keeps the trailing newline, which would break the next command
sh "aws lambda update-alias --function-name ${functionName} --name production --region ${region} --function-version ${lambdaVersion}"
}
}
}

def commitID() {
sh 'git rev-parse HEAD > .git/commitID'
def commitID = readFile('.git/commitID').trim()
sh 'rm .git/commitID'
commitID
}

Like what you’re reading? Check out my book and learn how to build, secure, deploy and manage production-ready Serverless applications in Golang with AWS Lambda.

Drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

Deploy a Jenkins Cluster on AWS

A few months ago, I gave a talk at Nexus User Conference 2018 on how to build a fully automated CI/CD platform on AWS using Terraform, Packer & Ansible. I illustrated how concepts like infrastructure as code, immutable infrastructure, serverless and cluster discovery can be used to build a highly available and cost-effective pipeline. The platform I built is shown in the following diagram:



The platform has a Jenkins cluster with a dedicated Jenkins master and workers inside an autoscaling group. Each push event to the code repository will trigger the Jenkins master, which will schedule a new build on one of the available slaves. The slave will be responsible for running the unit and pre-integration tests, building the Docker image, storing the image in a private registry and deploying a container based on that image to a Docker Swarm cluster.



In this post, I will walk through how to deploy the Jenkins cluster on AWS using popular automation tools.

The cluster will be deployed into a VPC with 2 public and 2 private subnets across 2 availability zones. The stack will consist of an autoscaling group of Jenkins workers in the private subnets and a private instance for the Jenkins master sitting behind an Elastic Load Balancer. To add or remove Jenkins workers on demand, the CPU utilisation of the ASG will be used to trigger a scale-out (CPU > 80%) or scale-in (CPU < 20%) event (see the figure below).



To get started, we will create 2 AMIs (Amazon Machine Images) for our instances. To do so, we will use Packer, which allows you to bake your own image.

The first AMI will be used to create the Jenkins master instance. It uses the Amazon Linux image as a base image and a simple shell script for the provisioning part:

{
"variables" : {
"region" : "eu-west-3",
"source_ami" : "ami-0ebc281c20e89ba4b"
},
"builders" : [
{
"type" : "amazon-ebs",
"profile" : "default",
"region" : "{{user `region`}}",
"instance_type" : "t2.micro",
"source_ami" : "{{user `source_ami`}}",
"ssh_username" : "ec2-user",
"ami_name" : "jenkins-master-2.107.2",
"ami_description" : "Amazon Linux Image with Jenkins Server",
"run_tags" : {
"Name" : "packer-builder-docker"
},
"tags" : {
"Tool" : "Packer",
"Author" : "mlabouardy"
}
}
],
"provisioners" : [
{
"type" : "file",
"source" : "COPY FILES",
"destination" : "COPY FILES"
},
{
"type" : "shell",
"script" : "./setup.sh",
"execute_command" : "sudo -E -S sh '{{ .Path }}'"
}
]
}

The shell script will be used to install the necessary dependencies, packages and security patches:

#!/bin/bash

echo "Install Jenkins stable release"
yum remove -y java
yum install -y java-1.8.0-openjdk
wget -O /etc/yum.repos.d/jenkins.repo http://pkg.jenkins-ci.org/redhat-stable/jenkins.repo
rpm --import https://jenkins-ci.org/redhat/jenkins-ci.org.key
yum install -y jenkins
chkconfig jenkins on

echo "Install Telegraf"
wget https://dl.influxdata.com/telegraf/releases/telegraf-1.6.0-1.x86_64.rpm -O /tmp/telegraf.rpm
yum localinstall -y /tmp/telegraf.rpm
rm /tmp/telegraf.rpm
chkconfig telegraf on
mv /tmp/telegraf.conf /etc/telegraf/telegraf.conf
service telegraf start

echo "Install git"
yum install -y git

echo "Setup SSH key"
mkdir /var/lib/jenkins/.ssh
touch /var/lib/jenkins/.ssh/known_hosts
mv /tmp/id_rsa /var/lib/jenkins/.ssh/id_rsa
# set ownership after the key is in place so the jenkins user can read it
chown -R jenkins:jenkins /var/lib/jenkins/.ssh
chmod 700 /var/lib/jenkins/.ssh
chmod 600 /var/lib/jenkins/.ssh/id_rsa

echo "Configure Jenkins"
mkdir -p /var/lib/jenkins/init.groovy.d
mv /tmp/basic-security.groovy /var/lib/jenkins/init.groovy.d/basic-security.groovy
mv /tmp/disable-cli.groovy /var/lib/jenkins/init.groovy.d/disable-cli.groovy
mv /tmp/csrf-protection.groovy /var/lib/jenkins/init.groovy.d/csrf-protection.groovy
mv /tmp/disable-jnlp.groovy /var/lib/jenkins/init.groovy.d/disable-jnlp.groovy
mv /tmp/jenkins.install.UpgradeWizard.state /var/lib/jenkins/jenkins.install.UpgradeWizard.state
mv /tmp/node-agent.groovy /var/lib/jenkins/init.groovy.d/node-agent.groovy
chown -R jenkins:jenkins /var/lib/jenkins/jenkins.install.UpgradeWizard.state
mv /tmp/jenkins /etc/sysconfig/jenkins
chmod +x /tmp/install-plugins.sh
bash /tmp/install-plugins.sh
service jenkins start

It will install the latest stable version of Jenkins and configure its settings:

  • Create a Jenkins admin user.
  • Create SSH, GitHub and Docker registry credentials.
  • Install all needed plugins (Pipeline, Git plugin, Multi-branch Project, etc).
  • Disable remote CLI, JNLP and unnecessary protocols.
  • Enable CSRF (Cross Site Request Forgery) protection.
  • Install Telegraf agent for collecting resource and Docker metrics.

The second AMI will be used to create the Jenkins workers. Like the first AMI, it uses the Amazon Linux image as a base image and a script to provision the instance:

#!/bin/bash

echo "Install Java JDK 8"
yum remove -y java
yum install -y java-1.8.0-openjdk

echo "Install Docker engine"
yum update -y
yum install docker -y
usermod -aG docker ec2-user
service docker start

echo "Install git"
yum install -y git

echo "Install Telegraf"
wget https://dl.influxdata.com/telegraf/releases/telegraf-1.6.0-1.x86_64.rpm -O /tmp/telegraf.rpm
yum localinstall -y /tmp/telegraf.rpm
rm /tmp/telegraf.rpm
chkconfig telegraf on
usermod -aG docker telegraf
mv /tmp/telegraf.conf /etc/telegraf/telegraf.conf
service telegraf start

A Jenkins worker requires the Java JDK environment and Git to be installed. In addition, Docker (for building Docker images) and a data collector (for monitoring) will be installed.

Now that our Packer template files are defined, issue the following commands to start baking the AMIs:

# validate packer template
packer validate ami.json

# build ami
packer build ami.json

Packer will launch a temporary EC2 instance from the base image specified in the template file and provision the instance with the given shell script. Finally, it will create an image from the instance. The following is an example of the output:



Sign in to the AWS Management Console, navigate to “EC2 Dashboard” and click on “AMI”; 2 new AMIs should have been created, as below:



Now that our AMIs are ready to use, let’s deploy our Jenkins cluster to AWS. To achieve that, we will use an infrastructure as code tool called Terraform, which allows you to describe your entire infrastructure in template files.

I have put each component of my infrastructure in its own template file. The following template file is responsible for creating an EC2 instance from the Jenkins master’s AMI built earlier:

resource "aws_instance" "jenkins_master" {
ami = "${data.aws_ami.jenkins-master.id}"
instance_type = "${var.jenkins_master_instance_type}"
key_name = "${var.key_name}"
vpc_security_group_ids = ["${aws_security_group.jenkins_master_sg.id}"]
subnet_id = "${element(var.vpc_private_subnets, 0)}"

root_block_device {
volume_type = "gp2"
volume_size = 30
delete_on_termination = false
}

tags {
Name = "jenkins_master"
Author = "mlabouardy"
Tool = "Terraform"
}
}

Another template file holds the data sources that reference each AMI built with Packer:

data "aws_ami" "jenkins-master" {
most_recent = true
owners = ["self"]

filter {
name = "name"
values = ["jenkins-master-2.107.2"]
}
}

data "aws_ami" "jenkins-slave" {
most_recent = true
owners = ["self"]

filter {
name = "name"
values = ["jenkins-slave"]
}
}

The Jenkins workers (aka slaves) will be inside an autoscaling group of a minimum of 3 instances. The instances will be created from a launch configuration based on the Jenkins slave’s AMI:

// Jenkins slaves launch configuration
resource "aws_launch_configuration" "jenkins_slave_launch_conf" {
name = "jenkins_slaves_config"
image_id = "${data.aws_ami.jenkins-slave.id}"
instance_type = "${var.jenkins_slave_instance_type}"
key_name = "${var.key_name}"
security_groups = ["${aws_security_group.jenkins_slaves_sg.id}"]
user_data = "${data.template_file.user_data_slave.rendered}"

root_block_device {
volume_type = "gp2"
volume_size = 30
delete_on_termination = false
}

lifecycle {
create_before_destroy = true
}
}

// ASG Jenkins slaves
resource "aws_autoscaling_group" "jenkins_slaves" {
name = "jenkins_slaves_asg"
launch_configuration = "${aws_launch_configuration.jenkins_slave_launch_conf.name}"
vpc_zone_identifier = "${var.vpc_private_subnets}"
min_size = "${var.min_jenkins_slaves}"
max_size = "${var.max_jenkins_slaves}"

depends_on = ["aws_instance.jenkins_master", "aws_elb.jenkins_elb"]

lifecycle {
create_before_destroy = true
}

tag {
key = "Name"
value = "jenkins_slave"
propagate_at_launch = true
}

tag {
key = "Author"
value = "mlabouardy"
propagate_at_launch = true
}

tag {
key = "Tool"
value = "Terraform"
propagate_at_launch = true
}
}

To leverage the power of automation, we will make each worker instance join the cluster automatically (cluster discovery) using the Jenkins RESTful API:

#!/bin/bash

JENKINS_URL="${jenkins_url}"
JENKINS_USERNAME="${jenkins_username}"
JENKINS_PASSWORD="${jenkins_password}"
TOKEN=$(curl -u $JENKINS_USERNAME:$JENKINS_PASSWORD ''$JENKINS_URL'/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')
INSTANCE_NAME=$(curl -s 169.254.169.254/latest/meta-data/local-hostname)
INSTANCE_IP=$(curl -s 169.254.169.254/latest/meta-data/local-ipv4)
JENKINS_CREDENTIALS_ID="${jenkins_credentials_id}"

sleep 60

curl -v -u $JENKINS_USERNAME:$JENKINS_PASSWORD -H "$TOKEN" -d 'script=
import hudson.model.Node.Mode
import hudson.slaves.*
import jenkins.model.Jenkins
import hudson.plugins.sshslaves.SSHLauncher
DumbSlave dumb = new DumbSlave("'$INSTANCE_NAME'",
"'$INSTANCE_NAME'",
"/home/ec2-user",
"3",
Mode.NORMAL,
"slaves",
new SSHLauncher("'$INSTANCE_IP'", 22, SSHLauncher.lookupSystemCredentials("'$JENKINS_CREDENTIALS_ID'"), "", null, null, "", "", 60, 3, 15),
RetentionStrategy.INSTANCE)
Jenkins.instance.addNode(dumb)
' $JENKINS_URL/script

At boot time, the user-data script above will be invoked: it retrieves the instance’s private IP address from the instance metadata and executes a Groovy script that makes the node join the cluster:

data "template_file" "user_data_slave" {
template = "${file("scripts/join-cluster.tpl")}"

vars {
jenkins_url = "http://${aws_instance.jenkins_master.private_ip}:8080"
jenkins_username = "${var.jenkins_username}"
jenkins_password = "${var.jenkins_password}"
jenkins_credentials_id = "${var.jenkins_credentials_id}"
}
}

Moreover, to be able to scale out and scale in instances on demand, I have defined 2 CloudWatch metric alarms based on the CPU utilisation of the autoscaling group:

// Scale out
resource "aws_cloudwatch_metric_alarm" "high-cpu-jenkins-slaves-alarm" {
alarm_name = "high-cpu-jenkins-slaves-alarm"
comparison_operator = "GreaterThanOrEqualToThreshold"
evaluation_periods = "2"
metric_name = "CPUUtilization"
namespace = "AWS/EC2"
period = "120"
statistic = "Average"
threshold = "80"

dimensions {
AutoScalingGroupName = "${aws_autoscaling_group.jenkins_slaves.name}"
}

alarm_description = "This metric monitors ec2 cpu utilization"
alarm_actions = ["${aws_autoscaling_policy.scale-out.arn}"]
}

resource "aws_autoscaling_policy" "scale-out" {
name = "scale-out-jenkins-slaves"
scaling_adjustment = 1
adjustment_type = "ChangeInCapacity"
cooldown = 300
autoscaling_group_name = "${aws_autoscaling_group.jenkins_slaves.name}"
}

// Scale In
resource "aws_cloudwatch_metric_alarm" "low-cpu-jenkins-slaves-alarm" {
alarm_name = "low-cpu-jenkins-slaves-alarm"
comparison_operator = "LessThanOrEqualToThreshold"
evaluation_periods = "2"
metric_name = "CPUUtilization"
namespace = "AWS/EC2"
period = "120"
statistic = "Average"
threshold = "20"

dimensions {
AutoScalingGroupName = "${aws_autoscaling_group.jenkins_slaves.name}"
}

alarm_description = "This metric monitors ec2 cpu utilization"
alarm_actions = ["${aws_autoscaling_policy.scale-in.arn}"]
}

resource "aws_autoscaling_policy" "scale-in" {
name = "scale-in-jenkins-slaves"
scaling_adjustment = -1
adjustment_type = "ChangeInCapacity"
cooldown = 300
autoscaling_group_name = "${aws_autoscaling_group.jenkins_slaves.name}"
}
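
The two alarms can be read as a simple decision rule. The Go sketch below is a simplification under stated assumptions: it treats an alarm as firing only when the last 2 consecutive 120-second averages all breach the threshold, and it ignores the 300-second cooldown and the min/max size of the autoscaling group. The function name `decide` is hypothetical.

```go
package main

import "fmt"

const (
	scaleOutThreshold = 80.0 // percent, from the high-CPU alarm
	scaleInThreshold  = 20.0 // percent, from the low-CPU alarm
	evaluationPeriods = 2    // consecutive 120-second periods that must breach
)

// decide returns the capacity change suggested by the most recent average
// CPU readings (one value per period, oldest first): +1 to add a worker,
// -1 to remove one, 0 to leave the group unchanged.
func decide(avgCPU []float64) int {
	if len(avgCPU) < evaluationPeriods {
		return 0 // not enough data to evaluate either alarm
	}
	recent := avgCPU[len(avgCPU)-evaluationPeriods:]

	high, low := true, true
	for _, v := range recent {
		if v < scaleOutThreshold {
			high = false
		}
		if v > scaleInThreshold {
			low = false
		}
	}

	switch {
	case high:
		return 1
	case low:
		return -1
	default:
		return 0
	}
}

func main() {
	fmt.Println(decide([]float64{85, 92})) // both periods at or above 80%
	fmt.Println(decide([]float64{85, 50})) // only one period breaches
	fmt.Println(decide([]float64{12, 18})) // both periods at or below 20%
}
```

Requiring two consecutive breaches keeps a short CPU spike from a single build from adding a worker that would be idle a few minutes later.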

Finally, an Elastic Load Balancer will be created in front of the Jenkins master’s instance and a new DNS record pointing to the ELB domain will be added to Route 53:

resource "aws_route53_record" "jenkins_master" {
zone_id = "${var.hosted_zone_id}"
name = "jenkins.slowcoder.com"
type = "A"

alias {
name = "${aws_elb.jenkins_elb.dns_name}"
zone_id = "${aws_elb.jenkins_elb.zone_id}"
evaluate_target_health = true
}
}

Once the stack is defined, provision the infrastructure with the terraform apply command:

# Install the AWS provider plugin
terraform init

# Dry-run check
terraform plan

# Provision the infrastructure
terraform apply --var-file=variables.tfvars

The command takes an additional parameter, a variables file with the AWS credentials and VPC settings (You can create a new VPC with Terraform from here):

region = ""

aws_profile = ""

shared_credentials_file = ""

key_name = ""

hosted_zone_id = ""

bastion_sg_id = ""

jenkins_username = ""

jenkins_password = ""

jenkins_credentials_id = ""

vpc_id = ""

vpc_private_subnets = []

vpc_public_subnets = []

ssl_arn = ""

Terraform will display an execution plan (the list of resources that will be created). Type yes to confirm, and the stack will be created in a few seconds:



Jump back to the EC2 dashboard; a list of EC2 instances will have been created:



In the terminal session, under the Outputs section, the Jenkins URL will be displayed:



Point your favorite browser to that URL and the Jenkins login screen will appear. Sign in using the credentials provided while baking the Jenkins master’s AMI:



If you click on “Credentials” from the navigation pane, a set of credentials should be created out of the box:



The same goes for “Plugins”, where the needed plugins will already be installed:



Once the autoscaling group has finished creating the EC2 instances, they will join the cluster automatically, as you can see in the following screenshot:



You should now be ready to create your own CI/CD pipeline!



You can take this further and build a dynamic dashboard in your favorite visualisation tool like Grafana to monitor your cluster resource usage based on the metrics collected by the agent installed on each EC2 instance:




AWS Events Analysis with ELK

Recording your AWS environment activity is a must-have. It can help you monitor your environment’s security continuously and detect suspicious or undesirable activity in real time, potentially saving thousands of dollars. Luckily, AWS offers a solution called CloudTrail that allows you to achieve that. It records events in all AWS regions and logs every API call to a single S3 bucket.



From there, you can set up an analysis pipeline using the popular logging stack ELK (Elasticsearch, Logstash & Kibana) to read those logs, parse, index and visualise them in a single dynamic dashboard, and even take actions accordingly:



To get started, create an AMI with the ELK components installed and preconfigured. The AMI will be based on an Ubuntu image:

{
"variables" : {
"region" : "us-east-1"
},
"builders" : [
{
"type" : "amazon-ebs",
"profile" : "default",
"region" : "{{user `region`}}",
"instance_type" : "t2.xlarge",
"source_ami" : "ami-759bc50a",
"ssh_username" : "ubuntu",
"ami_name" : "elk-stack-6.2.4",
"ami_description" : "ELK Stack",
"run_tags" : {
"Name" : "packer-builder-docker",
"Tool" : "Packer",
"Author" : "mlabouardy"
}
}
],
"provisioners" : [
{
"type" : "file",
"source" : "./elasticsearch.yml",
"destination" : "/tmp/elasticsearch.yml"
},
{
"type" : "file",
"source" : "./cloudtrail.conf",
"destination" : "/tmp/cloudtrail.conf"
},
{
"type" : "file",
"source" : "./kibana.yml",
"destination" : "/tmp/kibana.yml"
},
{
"type" : "shell",
"script" : "./setup.sh",
"execute_command" : "sudo -E -S sh '{{ .Path }}'"
}
]
}

To provision the AMI, we will use the following shell script:

#!/bin/bash

echo "Install Java JDK 8"
apt-get update
apt-get install openjdk-8-jre -y

echo "Install ElasticSearch 6"
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | apt-key add -
echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | tee -a /etc/apt/sources.list.d/elastic-6.x.list
apt-get update
apt-get install -y elasticsearch
chown -R elasticsearch:elasticsearch /usr/share/elasticsearch
mv /tmp/elasticsearch.yml /etc/elasticsearch/elasticsearch.yml

echo "Start ElasticSearch"
systemctl enable elasticsearch.service
systemctl start elasticsearch.service

echo "Install Logstash"
apt-get install -y apt-transport-https logstash
mv /tmp/cloudtrail.conf /etc/logstash/conf.d/cloudtrail.conf

echo "Start Logstash"
systemctl enable logstash
systemctl start logstash

echo "Install Kibana"
apt-get install -y kibana
mv /tmp/kibana.yml /etc/kibana/kibana.yml

echo "Start Kibana"
systemctl enable kibana
systemctl start kibana

Now that the template is defined, bake a new AMI with Packer:

packer build ami.json

Once the AMI is created, create a new EC2 instance based on the AMI with Terraform. Make sure to grant S3 permissions to the instance to be able to read CloudTrail logs from the bucket:

provider "aws" {
region = "${var.aws_region}"
}

data "aws_ami" "elk" {
most_recent = true
owners = ["self"]

filter {
name = "state"
values = ["available"]
}

filter {
name = "name"
values = ["elk-stack-6.2.4"]
}
}

resource "aws_security_group" "elk_sg" {
name = "elk_sg"
description = "Allow traffic on elasticsearch & kibana ports"

ingress {
from_port = "22"
to_port = "22"
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}

ingress {
from_port = "9200"
to_port = "9200"
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}

ingress {
from_port = "5601"
to_port = "5601"
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}

egress {
from_port = "0"
to_port = "0"
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}

tags {
Name = "elk_sg"
Author = "mlabouardy"
Tool = "Terraform"
}
}

resource "aws_iam_role_policy" "cloudtrail_bucket_access_policy" {
name = "CloudTrailEventsBucketFullAccessPolicy"
role = "${aws_iam_role.cloudtrail_bucket_access_role.id}"

policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"s3:*"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::cloudtrail-demo-2018",
"arn:aws:s3:::cloudtrail-demo-2018/*"
]
}
]
}
EOF
}

resource "aws_iam_role" "cloudtrail_bucket_access_role" {
name = "CloudTrailEventsBucketFullAccessRole"

assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "ec2.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}

resource "aws_iam_instance_profile" "cloudtrail_bucket_access_profile" {
name = "cloudtrail_bucket_access_profile"
role = "${aws_iam_role.cloudtrail_bucket_access_role.name}"
}

resource "aws_instance" "elk" {
key_name = "${var.key_name}"
instance_type = "${var.instance_type}"
ami = "${data.aws_ami.elk.id}"
security_groups = ["${aws_security_group.elk_sg.name}"]
iam_instance_profile = "${aws_iam_instance_profile.cloudtrail_bucket_access_profile.name}"

root_block_device {
volume_size = 100
}

tags {
Name = "elk"
Author = "mlabouardy"
Tool = "Terraform"
}
}

Issue the following command to provision the infrastructure:

terraform apply

Head back to the AWS Management Console, navigate to CloudTrail, and click on the “Create Trail” button:



Give it a name and apply the trail to all AWS regions:



Next, create a new S3 bucket in which the events will be stored:



Click on “Create”, and the trail should be created as follows:



Next, configure Logstash to read the CloudTrail logs from the bucket at regular intervals. The json and split filters turn each entry of the Records array into its own event, and the geoip filter adds information about the geographical location of the caller based on the sourceIPAddress field. The events are then stored in Elasticsearch automatically:

input {
s3 {
"bucket" => "cloudtrail-demo-2018"
}
}

filter {
json {
source => "message"
}

split {
field => "Records"
}

geoip {
source => "[Records][sourceIPAddress]"
target => "geoip"
add_tag => [ "cloudtrail-geoip" ]
}
}

output {
elasticsearch {
hosts => 'http://localhost:9200'
index => 'cloudtrail-%{+YYYY.MM.dd}'
}
}
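
To make the filter stage more concrete, here is a rough Go sketch of what the json and split filters do to one CloudTrail log file: parse the document, then emit one event per entry of the `Records` array. It is only an illustration under assumptions. The type and function names are hypothetical, the sample payload is made up, and unlike Logstash (which keeps each event nested under `Records`) the sketch flattens the two fields it cares about.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cloudTrailFile mirrors the top-level shape of a CloudTrail log file:
// a single "Records" array holding one object per API call.
type cloudTrailFile struct {
	Records []json.RawMessage `json:"Records"`
}

// event keeps only the fields used in this sketch.
type event struct {
	EventName       string `json:"eventName"`
	SourceIPAddress string `json:"sourceIPAddress"`
}

// splitRecords parses a log file and returns one event per record,
// which is roughly what the json + split filters achieve in Logstash.
func splitRecords(raw []byte) ([]event, error) {
	var file cloudTrailFile
	if err := json.Unmarshal(raw, &file); err != nil {
		return nil, err
	}

	events := make([]event, 0, len(file.Records))
	for _, rec := range file.Records {
		var e event
		if err := json.Unmarshal(rec, &e); err != nil {
			return nil, err
		}
		events = append(events, e)
	}
	return events, nil
}

func main() {
	// Made-up sample using documentation-reserved IP addresses.
	sample := []byte(`{"Records":[
		{"eventName":"ConsoleLogin","sourceIPAddress":"203.0.113.10"},
		{"eventName":"ListBuckets","sourceIPAddress":"198.51.100.7"}]}`)

	events, err := splitRecords(sample)
	if err != nil {
		panic(err)
	}
	for _, e := range events {
		fmt.Printf("%s from %s\n", e.EventName, e.SourceIPAddress)
	}
}
```

Splitting first matters because the geoip lookup needs a single source IP per event, not an array of records.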

In order for the changes to take effect, restart Logstash with the command below:

service logstash restart

A new index should be created on Elasticsearch (http://IP:9200/_cat/indices?v):



In Kibana, create a new index pattern that matches the index format used to store the logs:



After creating the index pattern, we can start exploring our CloudTrail events:



Now that we have processed data inside Elasticsearch, let’s build some graphs. We will use the Map visualization in Kibana to monitor geo access to our AWS environment:



You can now see where the environment is being accessed from:



Next, create more widgets to display information about the identity of the user, the user agent and the actions taken by the user. The result will look something like this:



You can take this further and set up alerts based on specific events (for example, someone accessing your environment from an unexpected location) so that you are notified in near real time.

Full code can be found on my GitHub. Make sure to drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

One-shot containers with Serverless

Have you ever had short-lived containers for use cases like the following:

  • Batch and ETL (Extract, Transform & Load) Jobs.
  • Database backups and synchronisation.
  • Machine Learning algorithms for generation of learning and training models.
  • Integration & Sanity tests.
  • Web scrapers & crawlers.

And were you wondering how you can deploy your container periodically or in response to an event? The answer is to use Lambda itself: the idea is to make a Lambda function trigger a deployment of your container from the build server. The following figure illustrates how this process can be implemented:



I have written a simple application in Go that simulates a short-lived process using the Sleep function:

package main

import (
"fmt"
"time"
)

func main() {
fmt.Println("Start working ...")
time.Sleep(10 * time.Second)
fmt.Println("Done")
}

As Go is a compiled language, I have used Docker’s multi-stage build feature to build a lightweight Docker image with the following Dockerfile:

FROM golang:1.10
WORKDIR /go/src/github.com/mlabouardy/lambda-oneshot-container
COPY main.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/mlabouardy/lambda-oneshot-container/app .
CMD ["./app"]

Next, I have a simple CI/CD workflow in Jenkins. The following is the Jenkinsfile used to build the pipeline:

node('slaves'){
stage('Checkout'){
checkout scm
}

stage('Build'){
docker.build(image)
}

stage('Push'){
docker.withRegistry(registry, 'registry') {
docker.image(image).push("${commitID()}")

if (env.BRANCH_NAME == 'master') {
docker.image(image).push('latest')
}
}
}

stage('Deploy'){
build job: "oneshot-app-deployment"
}
}

An example of the pipeline execution is given as follows:



Now, all changes to the application will trigger a new build on Jenkins which will build the new Docker image, push the image to a private registry and deploy the new Docker image to the Swarm cluster:



If you run the “docker service logs APP_NAME” command on one of the cluster managers, you should see that your application is working as expected:



Now that our application is ready, let’s make it execute every day at 8am using a Lambda function. The following is the entrypoint (handler) that will be executed on each invocation of the function:

func triggerJob() error {
url := fmt.Sprintf(`%s/job/%s/build`, os.Getenv("JENKINS_HOST"), os.Getenv("JENKINS_JOB"))

client := http.Client{}
req, err := http.NewRequest("POST", url, nil)
if err != nil {
return err
}

crumb, err := getToken()
if err != nil {
return err
}

req.Header.Set(crumb[0], crumb[1])
req.SetBasicAuth(os.Getenv("JENKINS_USERNAME"), os.Getenv("JENKINS_PASSWORD"))

resp, err := client.Do(req)
if err != nil {
return err
}
defer resp.Body.Close()

if resp.StatusCode != 201 {
return errors.New("Cannot trigger job")
}

return nil
}

It uses the Jenkins API to trigger the deployment process job.
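
The getToken helper called above is not shown in this post. As a rough sketch only, assuming Jenkins has CSRF protection enabled and exposes its crumb issuer at /crumbIssuer/api/json, it could look like the following. The crumbResponse type and the parseCrumb function are names I made up for illustration, and this version of getToken is a hypothetical implementation, not necessarily the one in my repository:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"os"
)

// crumbResponse mirrors the JSON document returned by the Jenkins
// crumb issuer endpoint (/crumbIssuer/api/json).
type crumbResponse struct {
	Crumb             string `json:"crumb"`
	CrumbRequestField string `json:"crumbRequestField"`
}

// parseCrumb converts the crumb issuer response into a
// [header name, header value] pair, the shape triggerJob expects.
func parseCrumb(body []byte) ([]string, error) {
	var c crumbResponse
	if err := json.Unmarshal(body, &c); err != nil {
		return nil, err
	}
	if c.Crumb == "" || c.CrumbRequestField == "" {
		return nil, errors.New("crumb issuer returned an empty crumb")
	}
	return []string{c.CrumbRequestField, c.Crumb}, nil
}

// getToken (hypothetical implementation) requests a CSRF crumb from Jenkins
// using the same environment variables as triggerJob.
func getToken() ([]string, error) {
	url := fmt.Sprintf("%s/crumbIssuer/api/json", os.Getenv("JENKINS_HOST"))
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(os.Getenv("JENKINS_USERNAME"), os.Getenv("JENKINS_PASSWORD"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("crumb issuer returned status %d", resp.StatusCode)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseCrumb(body)
}

func main() {
	// Parse a sample response instead of calling a real Jenkins server.
	sample := []byte(`{"crumb":"abc123","crumbRequestField":"Jenkins-Crumb"}`)
	crumb, err := parseCrumb(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(crumb[0], crumb[1])
}
```

Keeping the parsing separate from the HTTP call makes it possible to check the crumb handling without a running Jenkins instance.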

Now that the function is defined, use the shell script below to do the following:

  • Build a deployment package (.zip file).
  • Create an IAM role with permissions to push logs to CloudWatch.
  • Create a Go based Lambda function from the deployment package.
  • Create a CloudWatch Event rule that will be executed every day at 8am.
  • Make the CloudWatch Event invoke the Lambda function.
#!/bin/bash

## Override
JENKINS_HOST=""
JENKINS_USERNAME=""
JENKINS_PASSWORD=""
JENKINS_JOB=""
CRON_EXPRESSION="cron(0 8 * * ? *)"
## Global variables
AWS_REGION="us-east-1"
FUNCTION_NAME="RestartJob"

echo "Building binary"
GOOS=linux go build -o main main.go

echo "Generating deployment package"
zip deployment.zip main

echo "Creating IAM Role"
POLICY_ARN=$(aws iam create-policy --policy-name $FUNCTION_NAME --policy-document file://policy.json | jq -r '.Policy.Arn')
ROLE_ARN=$(aws iam create-role --role-name $FUNCTION_NAME --assume-role-policy-document file://role.json | jq -r '.Role.Arn')
aws iam attach-role-policy --role-name $FUNCTION_NAME --policy-arn $POLICY_ARN

echo "Creating Lambda function"
FUNCTION_ARN=$(aws lambda create-function --function-name $FUNCTION_NAME --runtime go1.x \
--handler main --role $ROLE_ARN \
--zip-file fileb://./deployment.zip \
--environment Variables="{JENKINS_HOST=$JENKINS_HOST,JENKINS_USERNAME=$JENKINS_USERNAME,JENKINS_PASSWORD=$JENKINS_PASSWORD,JENKINS_JOB=$JENKINS_JOB}" \
--region $AWS_REGION | jq -r '.FunctionArn')

echo "Creating CloudWatch Event rule"
RULE_ARN=$(aws events put-rule --name launch-container-daily --schedule-expression "$CRON_EXPRESSION" --region $AWS_REGION | jq -r '.RuleArn')
aws lambda add-permission --function-name $FUNCTION_NAME \
--statement-id 1 \
--action 'lambda:InvokeFunction' \
--principal events.amazonaws.com \
--source-arn $RULE_ARN \
--region $AWS_REGION
# -i.bak (without a space) works with both GNU and BSD/macOS sed
sed -i.bak 's/FUNCTION_ARN/'"$FUNCTION_ARN"'/g' targets.json
aws events put-targets --rule launch-container-daily --targets file://targets.json --region $AWS_REGION


echo "Cleaning up"
rm main deployment.zip *.bak

As a result, a Lambda function will be created. To test it, invoke the function manually:

aws lambda invoke --function-name RestartJob output

A new deployment should be triggered in Jenkins and your application should be deployed once again:



That’s it. This was a quick example of how you can combine Serverless with containers. You can go further and use Lambda functions to scale your services out or in on your Swarm/Kubernetes cluster, triggered either by CloudWatch Events for expected traffic increases (holidays, Black Friday …) or by other AWS managed services such as API Gateway in response to incoming client requests.

Full code can be found on my GitHub. Make sure to drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

Docker on Elastic Beanstalk Tips

AWS Elastic Beanstalk is one of the most widely used PaaS offerings today. It allows you to deploy your application without provisioning the underlying infrastructure while maintaining the high availability of your application. However, it can be painful to use due to the lack of documentation and real-world scenarios. In this post, I will walk you through how to use Elastic Beanstalk to deploy Docker containers from scratch, followed by how to automate your deployment process with a Continuous Integration pipeline. By the end of this post, you should be familiar with advanced topics like debugging and monitoring of your applications in EB.



1 – Environment Setup

To get started, create a new Application using the following AWS CLI command:

aws elasticbeanstalk create-application --application-name avengers \
--region eu-west-3

Create a new environment. Let’s call it “staging”:

aws elasticbeanstalk create-environment --application-name avengers \
--environment-name staging \
--solution-stack-name "64bit Amazon Linux 2017.09 v2.9.2 running Docker 17.12.0-ce" \
--option-settings file://options.json \
--region eu-west-3

Head back to the AWS Elastic Beanstalk Console, where your new environment should have been created:



Point your browser to the environment URL, and a sample Docker application should be displayed:



Let’s deploy our application. I wrote a small web application in Go to return a list of Marvel Avengers (I see you Thanos 😉 )

package main

import (
	"encoding/json"
	"io/ioutil"
	"log"
	"net/http"
)

type Avenger struct {
	Character string `json:"character"`
	Name      string `json:"name"`
}

var avengers []Avenger

// init loads the list of Avengers from disk once, at startup.
func init() {
	data, err := ioutil.ReadFile("avengers.json")
	if err != nil {
		log.Fatal(err)
	}
	if err := json.Unmarshal(data, &avengers); err != nil {
		log.Fatal(err)
	}
}

// IndexHandler returns the list of Avengers as JSON.
func IndexHandler(w http.ResponseWriter, r *http.Request) {
	response, err := json.Marshal(avengers)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	w.Write(response)
}

func main() {
	http.HandleFunc("/", IndexHandler)
	if err := http.ListenAndServe(":3000", nil); err != nil {
		log.Fatal(err)
	}
}

Next, we will create a Dockerfile to build the Docker image. Go is a compiled language, therefore we can use the Docker multi-stage feature to build a lightweight Docker image:

FROM golang:1.10 as builder
WORKDIR /go/src/github.com/mlabouardy/docker-eb-ci-mon
COPY main.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
MAINTAINER mlabouardy
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /go/src/github.com/mlabouardy/docker-eb-ci-mon/app .
COPY avengers.json .
EXPOSE 3000
CMD ["./app"]

Next, we create a Dockerrun.aws.json that describes how the container will be deployed in Elastic Beanstalk:

{
"AWSEBDockerrunVersion": "1",
"Image": {
"Name": "mlabouardy/avengers",
"Update": "true"
},
"Ports": [
{
"ContainerPort": "3000"
}
]
}

Now that the application is defined, create an application bundle by creating a ZIP package:

zip -r deployment.zip .

Then, create an S3 bucket to store the different versions of your application bundles, and upload the bundle to it:

aws s3 mb s3://avengers-docker-eb --region AWS_REGION
aws s3 cp deployment.zip s3://avengers-docker-eb --region AWS_REGION

And create a new application version from the application bundle:

aws elasticbeanstalk create-application-version --application-name avengers \
--version-label v1 \
--source-bundle S3Bucket="avengers-docker-eb",S3Key="deployment.zip" \
--auto-create-application \
--region AWS_REGION


Finally, deploy the version to the staging environment:

aws elasticbeanstalk update-environment --application-name avengers \
--environment-name staging \
--version-label v1 --region AWS_REGION

Give it a few seconds while it’s deploying the new version:



Then, point your browser to the environment URL again. A list of Avengers will be returned in JSON format as follows:



Now that our Docker application is deployed, let’s automate this process by setting up a CI/CD pipeline.

2 – CI/CD Pipeline

I opted for CircleCI, but you’re free to use whatever CI server you’re familiar with. The same steps can be applied.

Create a .circleci/config.yml file (the location CircleCI 2.0 expects for a “version: 2” configuration) with the following content:

version: 2
jobs:
  build:
    docker:
      - image: circleci/golang:1.10

    working_directory: /go/src/github.com/mlabouardy/docker-eb-ci-mon

    steps:
      - checkout

      - setup_remote_docker

      - run:
          name: Install AWS CLI
          command: |
            sudo apt-get update
            sudo apt-get install -y awscli

      - run:
          name: Test
          command: go test

      - run:
          name: Build
          command: docker build -t mlabouardy/avengers:latest .

      - run:
          name: Push
          command: |
            docker login -u$DOCKERHUB_LOGIN -p$DOCKERHUB_PASSWORD
            docker tag mlabouardy/avengers:latest mlabouardy/avengers:${CIRCLE_SHA1}
            docker push mlabouardy/avengers:latest
            docker push mlabouardy/avengers:${CIRCLE_SHA1}

      - run:
          name: Deploy
          command: |
            zip -r deployment-${CIRCLE_SHA1}.zip .
            aws s3 cp deployment-${CIRCLE_SHA1}.zip s3://avengers-docker-eb --region eu-west-3
            aws elasticbeanstalk create-application-version --application-name avengers \
              --version-label ${CIRCLE_SHA1} --source-bundle S3Bucket="avengers-docker-eb",S3Key="deployment-${CIRCLE_SHA1}.zip" --region eu-west-3
            aws elasticbeanstalk update-environment --application-name avengers \
              --environment-name staging --version-label ${CIRCLE_SHA1} --region eu-west-3

The pipeline first prepares the environment by installing the AWS CLI and then runs the unit tests. Next, a Docker image is built and pushed to DockerHub. The last step creates a new application bundle and deploys it to Elastic Beanstalk.
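
The Test step simply runs go test, and the tests themselves are not listed in this post. As a minimal sketch of the kind of check such a test could perform, the program below calls the handler in-process with net/http/httptest and verifies the status code, the Content-Type header and the JSON body. The checkIndexHandler function and the sample data are assumptions made for illustration, and the handler is repeated so that the example is self-contained:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type Avenger struct {
	Character string `json:"character"`
	Name      string `json:"name"`
}

var avengers []Avenger

// IndexHandler is a copy of the handler from the application.
func IndexHandler(w http.ResponseWriter, r *http.Request) {
	response, _ := json.Marshal(avengers)
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	w.Write(response)
}

// checkIndexHandler (hypothetical helper) exercises the handler without
// opening a network port and reports the first mismatch it finds.
func checkIndexHandler() error {
	// Sample data used instead of avengers.json.
	avengers = []Avenger{{Character: "Iron Man", Name: "Tony Stark"}}

	req := httptest.NewRequest("GET", "/", nil)
	rec := httptest.NewRecorder()
	IndexHandler(rec, req)

	if rec.Code != http.StatusOK {
		return fmt.Errorf("status = %d, want %d", rec.Code, http.StatusOK)
	}
	if ct := rec.Header().Get("Content-Type"); ct != "application/json" {
		return fmt.Errorf("Content-Type = %q, want application/json", ct)
	}
	var got []Avenger
	if err := json.Unmarshal(rec.Body.Bytes(), &got); err != nil {
		return err
	}
	if len(got) != 1 || got[0].Name != "Tony Stark" {
		return fmt.Errorf("unexpected body: %s", rec.Body.String())
	}
	return nil
}

func main() {
	if err := checkIndexHandler(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("PASS")
}
```

In a real project the same checks would normally live in a main_test.go file as a func TestIndexHandler(t *testing.T), so that go test picks them up.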

In order to grant CircleCI permissions to call AWS operations, we need to create a new IAM user with the following IAM policy:

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"elasticbeanstalk:*",
"s3:*",
"ec2:*",
"cloudformation:*",
"autoscaling:*",
"elasticloadbalancing:*"
],
"Resource": "*"
}
]
}

Generate AWS access and secret keys. Then, head back to CircleCI, open the project settings and paste the credentials:



Now, every time you push a change to your code repository, a build will be triggered:



And a new version will be deployed automatically to Elastic Beanstalk:



3 – Monitoring

Monitoring your applications is mandatory. Unfortunately, CloudWatch doesn’t expose some useful metrics, such as the memory usage of your applications in Elastic Beanstalk. In this part, we will solve this issue by creating our own custom metrics.

I will install a data collector agent on the instance. The agent will collect metrics and push them to a time-series database.

To install the agent, we will use the .ebextensions folder, in which we will create three configuration files:

  • 01-install-telegraf.config: install Telegraf on the instance
container_commands:
  01downloadpackage:
    command: "wget https://dl.influxdata.com/telegraf/releases/telegraf-1.6.0-1.x86_64.rpm -O /tmp/telegraf.rpm"
    ignoreErrors: true
  02installpackage:
    command: "yum localinstall -y /tmp/telegraf.rpm"
    ignoreErrors: true
  03removepackage:
    command: "rm /tmp/telegraf.rpm"
    ignoreErrors: true
  04enablereboot:
    command: "chkconfig telegraf on"
    ignoreErrors: true
  05fixpermission:
    command: "usermod -a -G docker telegraf"
    ignoreErrors: true
  • 02-config-file.config: create a Telegraf configuration file to collect system usage & docker containers metrics.
files:
  "/etc/telegraf/telegraf.conf":
    mode: "000666"
    owner: root
    group: root
    content: |
      [global_tags]
        hostname="Avengers"

      # Read metrics about CPU usage
      [[inputs.cpu]]
        percpu = false
        totalcpu = true
        fieldpass = [ "usage*" ]
        name_suffix = "_vm"

      # Read metrics about disk usage
      [[inputs.disk]]
        fielddrop = [ "inodes*" ]
        mount_points=["/"]
        name_suffix = "_vm"

      # Read metrics about network usage
      [[inputs.net]]
        interfaces = [ "eth0", "eth1" ]
        fielddrop = [ "icmp*", "ip*", "tcp*", "udp*" ]
        name_suffix = "_vm"

      # Read metrics about memory usage
      [[inputs.mem]]
        name_suffix = "_vm"

      # Read metrics about swap memory usage
      [[inputs.swap]]
        name_suffix = "_vm"

      # Read metrics about system load and uptime
      [[inputs.system]]
        name_suffix = "_vm"

      # Read metrics from docker socket api
      [[inputs.docker]]
        endpoint = "unix:///var/run/docker.sock"
        container_names = []
        name_suffix = "_docker"

      [[outputs.influxdb]]
        database = "instances"
        urls = ["http://172.31.38.51:8086"]
        namepass = ["*_vm"]

      [[outputs.influxdb]]
        database = "containers"
        urls = ["http://172.31.38.51:8086"]
        namepass = ["*_docker"]
  • 03-start-telegraf.config: start Telegraf agent.
container_commands:
  01starttelegraf:
    command: "service telegraf start"
    ignoreErrors: true

Once the application version is deployed to Elastic Beanstalk, metrics will be pushed to your time-series database. In this example, I used InfluxDB as the data store and created some dynamic dashboards in Grafana to visualize metrics in real time:
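
To make the routing in the Telegraf configuration more concrete: every input appends a name_suffix (“_vm” or “_docker”) to its measurement names, and each InfluxDB output only accepts measurements whose name matches its namepass pattern. The sketch below imitates that behaviour with Go's path.Match. This is an approximation for illustration only (Telegraf has its own glob matcher), and the output type and route function are hypothetical names:

```go
package main

import (
	"fmt"
	"path"
)

// output is a simplified stand-in for one [[outputs.influxdb]] section.
type output struct {
	Database string
	Namepass []string
}

// route returns the databases whose namepass patterns match the
// given measurement name.
func route(measurement string, outputs []output) []string {
	var dbs []string
	for _, o := range outputs {
		for _, pattern := range o.Namepass {
			// A malformed pattern is treated as "no match" in this sketch.
			if ok, _ := path.Match(pattern, measurement); ok {
				dbs = append(dbs, o.Database)
				break
			}
		}
	}
	return dbs
}

func main() {
	outputs := []output{
		{Database: "instances", Namepass: []string{"*_vm"}},
		{Database: "containers", Namepass: []string{"*_docker"}},
	}
	// Measurement names as they look after name_suffix has been applied.
	for _, m := range []string{"cpu_vm", "mem_vm", "docker_container_cpu_docker"} {
		fmt.Println(m, "->", route(m, outputs))
	}
}
```

With this setup, host metrics end up in the “instances” database and container metrics in the “containers” database, which is what the two Grafana dashboards rely on.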

Containers:



Hosts:



Note: for an in-depth explanation of how to configure Telegraf, InfluxDB & Grafana, read my previous article.

Full code can be found on my GitHub. Make sure to drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy

Infrastructure Cost Optimization with Lambda

Having multiple environments is important for building a continuous integration/deployment pipeline and for reproducing production bugs with ease, but this comes at a price. To reduce the cost of the AWS infrastructure, instances that run 24/7 unnecessarily (sandbox & staging environments) should be shut down outside of regular business hours.

The figure below describes an automated process to schedule, stop and start instances to help cut costs. The solution is a perfect example of using Serverless computing.



Note: full code is available on my GitHub.

Two Lambda functions will be created. They scan all environments looking for a specific tag, which we named ‘Environment’. Instances without an Environment tag will not be affected:

func getInstances(cfg aws.Config) ([]Instance, error) {
instances := make([]Instance, 0)

svc := ec2.New(cfg)
req := svc.DescribeInstancesRequest(&ec2.DescribeInstancesInput{
Filters: []ec2.Filter{
ec2.Filter{
Name: aws.String("tag:Environment"),
Values: []string{os.Getenv("ENVIRONMENT")},
},
},
})
res, err := req.Send()
if err != nil {
return instances, err
}

for _, reservation := range res.Reservations {
for _, instance := range reservation.Instances {
for _, tag := range instance.Tags {
if *tag.Key == "Name" {
instances = append(instances, Instance{
ID: *instance.InstanceId,
Name: *tag.Value,
})
}
}
}
}

return instances, nil
}
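
One detail of getInstances is easy to miss: an instance is only added to the result when it also carries a Name tag, so an instance that has the right Environment tag but no Name tag is silently skipped. The sketch below reproduces that selection logic on plain Go values, without the AWS SDK. The taggedInstance type and the selectInstances function are hypothetical, and the Instance definition is an assumption since it is not shown in this post:

```go
package main

import "fmt"

// Instance matches the fields getInstances fills in; the actual type
// definition is not shown in the post, so this one is an assumption.
type Instance struct {
	ID   string
	Name string
}

// taggedInstance is a hypothetical, simplified stand-in for the SDK's
// instance type: an ID and a flat map of tags.
type taggedInstance struct {
	ID   string
	Tags map[string]string
}

// selectInstances keeps the instances whose Environment tag equals env
// and that also have a Name tag.
func selectInstances(all []taggedInstance, env string) []Instance {
	selected := make([]Instance, 0)
	for _, inst := range all {
		// In getInstances this part is done server-side by the
		// "tag:Environment" filter of DescribeInstances.
		if value, ok := inst.Tags["Environment"]; !ok || value != env {
			continue
		}
		// No Name tag: the instance is skipped, as in getInstances.
		name, ok := inst.Tags["Name"]
		if !ok {
			continue
		}
		selected = append(selected, Instance{ID: inst.ID, Name: name})
	}
	return selected
}

func main() {
	all := []taggedInstance{
		{ID: "i-aaa", Tags: map[string]string{"Environment": "sandbox", "Name": "api"}},
		{ID: "i-bbb", Tags: map[string]string{"Environment": "sandbox"}},
		{ID: "i-ccc", Tags: map[string]string{"Name": "untagged"}},
		{ID: "i-ddd", Tags: map[string]string{"Environment": "production", "Name": "web"}},
	}
	for _, inst := range selectInstances(all, "sandbox") {
		fmt.Println(inst.ID, inst.Name)
	}
}
```

If you want every instance of an environment to be stopped and started, make sure each one has a Name tag, or adapt the loop to fall back to the instance ID.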

The StartEnvironment function will call the StartInstances method with the list of instance IDs returned by the previous function:

func startInstances(cfg aws.Config, instances []Instance) error {
instanceIds := make([]string, 0, len(instances))
for _, instance := range instances {
instanceIds = append(instanceIds, instance.ID)
}

svc := ec2.New(cfg)
req := svc.StartInstancesRequest(&ec2.StartInstancesInput{
InstanceIds: instanceIds,
})
_, err := req.Send()
if err != nil {
return err
}
return nil
}

Similarly, the StopEnvironment function will call the StopInstances method:

func stopInstances(cfg aws.Config, instances []Instance) error {
instanceIds := make([]string, 0, len(instances))
for _, instance := range instances {
instanceIds = append(instanceIds, instance.ID)
}

svc := ec2.New(cfg)
req := svc.StopInstancesRequest(&ec2.StopInstancesInput{
InstanceIds: instanceIds,
})
_, err := req.Send()
if err != nil {
return err
}
return nil
}

Finally, both functions will post a message to a Slack channel for real-time notification:

func postToSlack(color string, title string, instances string) error {
	message := SlackMessage{
		Text: title,
		Attachments: []Attachment{
			Attachment{
				Text:  instances,
				Color: color,
			},
		},
	}

	client := &http.Client{}
	data, err := json.Marshal(message)
	if err != nil {
		return err
	}

	req, err := http.NewRequest("POST", os.Getenv("SLACK_WEBHOOK"), bytes.NewBuffer(data))
	if err != nil {
		return err
	}

	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	// Close the body so the connection can be released. Without any use of
	// resp, the compiler rejects the function ("declared but not used").
	defer resp.Body.Close()

	return nil
}
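
The SlackMessage and Attachment types used by postToSlack are not listed in this post. As a sketch, assuming the standard payload of Slack incoming webhooks (a top-level “text” plus a list of “attachments”, each with its own “text” and “color”), they could be declared as follows. The sample title and instance text are made up for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Attachment is one coloured block shown under the main message.
type Attachment struct {
	Text  string `json:"text"`
	Color string `json:"color"`
}

// SlackMessage is the JSON body sent to the incoming webhook.
type SlackMessage struct {
	Text        string       `json:"text"`
	Attachments []Attachment `json:"attachments"`
}

// buildPayload serialises a message the same way postToSlack does.
func buildPayload(color, title, instances string) ([]byte, error) {
	return json.Marshal(SlackMessage{
		Text: title,
		Attachments: []Attachment{
			{Text: instances, Color: color},
		},
	})
}

func main() {
	// "good" is one of Slack's predefined attachment colours (green).
	data, err := buildPayload("good", "Sandbox environment started", "api (i-0123456789abcdef0)")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(data))
}
```

Printing the payload locally is a quick way to check the message format before sending anything to a real channel.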

Now that our functions are defined, let’s build the deployment packages (zip files) using the following Bash script:

#!/bin/bash

echo "Building StartEnvironment binary"
GOOS=linux GOARCH=amd64 go build -o main start/*.go

echo "Creating deployment package"
zip start-environment.zip main
rm main

echo "Building StopEnvironment binary"
GOOS=linux GOARCH=amd64 go build -o main stop/*.go

echo "Creating deployment package"
zip stop-environment.zip main
rm main

The functions require an IAM role to be able to interact with EC2. The StartEnvironment function has to be able to describe and start EC2 instances:

{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:CreateLogGroup",
"ec2:DescribeInstances",
"ec2:StartInstances"
],
"Resource": [
"*"
]
}
]
}

The StopEnvironment function has to be able to describe and stop EC2 instances:

{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:CreateLogGroup",
"ec2:DescribeInstances",
"ec2:StopInstances"
],
"Resource": [
"*"
]
}
]
}

Finally, create an IAM role for each function and attach the above policies:

#!/bin/bash

echo "IAM role for StartEnvironment"
arn=$(aws iam create-policy --policy-name StartEnvironment --policy-document file://start/policy.json | jq -r '.Policy.Arn')
result=$(aws iam create-role --role-name StartEnvironmentRole --assume-role-policy-document file://role.json | jq -r '.Role.Arn')
aws iam attach-role-policy --role-name StartEnvironmentRole --policy-arn $arn
echo "ARN: $result"

echo "IAM role for StopEnvironment"
arn=$(aws iam create-policy --policy-name StopEnvironment --policy-document file://stop/policy.json | jq -r '.Policy.Arn')
result=$(aws iam create-role --role-name StopEnvironmentRole --assume-role-policy-document file://role.json | jq -r '.Role.Arn')
aws iam attach-role-policy --role-name StopEnvironmentRole --policy-arn $arn
echo "ARN: $result"

The script will output the ARN for each IAM role:



Before jumping to the deployment part, we need to create a Slack WebHook to be able to post messages to a Slack channel:



Next, use the following script to deploy your functions to AWS Lambda (make sure to replace the IAM roles, Slack WebHook token & the target environment):

#!/bin/bash

START_IAM_ROLE="arn:aws:iam::ACCOUNT_ID:role/StartEnvironmentRole"
STOP_IAM_ROLE="arn:aws:iam::ACCOUNT_ID:role/StopEnvironmentRole"
AWS_REGION="us-east-1"
SLACK_WEBHOOK="https://hooks.slack.com/services/TOKEN"
ENVIRONMENT="sandbox"

echo "Deploying StartEnvironment to Lambda"
aws lambda create-function --function-name StartEnvironment \
--zip-file fileb://./start-environment.zip \
--runtime go1.x --handler main \
--role $START_IAM_ROLE \
--environment Variables="{SLACK_WEBHOOK=$SLACK_WEBHOOK,ENVIRONMENT=$ENVIRONMENT}" \
--region $AWS_REGION


echo "Deploying StopEnvironment to Lambda"
aws lambda create-function --function-name StopEnvironment \
--zip-file fileb://./stop-environment.zip \
--runtime go1.x --handler main \
--role $STOP_IAM_ROLE \
--environment Variables="{SLACK_WEBHOOK=$SLACK_WEBHOOK,ENVIRONMENT=$ENVIRONMENT}" \
--region $AWS_REGION


rm *-environment.zip

Once deployed, sign in to the AWS Management Console and navigate to the Lambda Console. You should see that both functions have been deployed successfully:

StartEnvironment:



StopEnvironment:



To further automate the process and invoke the Lambda functions at the right time, AWS CloudWatch Scheduled Events will be used.

Create a new CloudWatch rule with the cron expression below (it will be invoked every day at 9 AM):



And another rule to stop the environment at 6 PM:



Note: all times are in GMT.
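
For reference, CloudWatch schedule expressions use six fields: minutes, hours, day of month, month, day of week and year. The small sketch below builds daily expressions of that shape for the two schedules described above. The dailyCron helper is a hypothetical name, and the resulting expressions are my reconstruction of a plain daily schedule, so adjust them if you only want the rules to fire on weekdays:

```go
package main

import "fmt"

// dailyCron builds a CloudWatch Events schedule expression that fires
// once a day at the given time (GMT).
// Field order: minutes hours day-of-month month day-of-week year.
func dailyCron(hour, minute int) (string, error) {
	if hour < 0 || hour > 23 || minute < 0 || minute > 59 {
		return "", fmt.Errorf("invalid time %02d:%02d", hour, minute)
	}
	return fmt.Sprintf("cron(%d %d * * ? *)", minute, hour), nil
}

func main() {
	start, _ := dailyCron(9, 0) // start the environment at 9 AM
	stop, _ := dailyCron(18, 0) // stop the environment at 6 PM
	fmt.Println("StartEnvironment:", start)
	fmt.Println("StopEnvironment: ", stop)
}
```

The “?” in the day-of-week position is required because CloudWatch does not allow both the day-of-month and day-of-week fields to be set to “*” at the same time.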

Testing:

a – Stop Environment



Result:



b – Start Environment



Result:



The solution is easy to deploy and can help reduce operational costs.

Full code can be found on my GitHub. Make sure to drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.
