Running Docker Swarm in AWS

Swarm cluster for AWS

Setting up a production-ready Swarm cluster has never been so easy!

This is a bootstrap project that creates 6 Swarm nodes: 3 managers and 3 workers.

All Swarm nodes live in a single private subnet and have access to the internet via a NAT Gateway in the public subnet.

There is also a bastion host used to configure and SSH into the Swarm nodes. A default Elastic Load Balancer is also created; it receives TCP traffic from the internet on port 80 and forwards it to TCP:3543 on all Swarm nodes.

You can find a more detailed view of the architecture here: Cloud visual description
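
For example, once the stack is up, you can reach a Swarm node through the bastion with an SSH jump. This is only a sketch: the bastion address and the node IP below are placeholders for your own values (the default user on CoreOS is core).

# replace <bastion-public-ip> and <swarm-node-private-ip> with your own values
ssh -J core@<bastion-public-ip> core@<swarm-node-private-ip>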

Get started

Clone the repo:

git clone https://github.com/markthebault/aws-swarm-cluster-for-production.git

Bootstrapping a Swarm cluster in AWS

This project uses CoreOS alpha-channel images (to get the latest Docker updates).

1/ Provisioning the infrastructure

Make sure your environment contains the following variables:

export AWS_ACCESS_KEY_ID=<your access key>
export AWS_SECRET_ACCESS_KEY=<your secret key>

Create a file terraform.tfvars in ./terraform

Example:

control_cidr = "10.234.231.21/32"
owner = "Mark"
default_keypair_name = "swarm-clstr-kp"
default_keypair_path = "~/.ssh/swarm-clstr-kp.pem"

Execute terraform plan to see what will be created and terraform apply to start terraforming ;)
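
For reference, the full sequence from the repository root looks roughly like this (terraform init downloads the AWS provider and is only needed on the first run):

cd ./terraform
terraform init     # first run only
terraform plan     # review what will be created
terraform apply    # create the infrastructure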

Quick start

To start even faster, just run make up; this will create the infrastructure and provision the VMs. To destroy everything, just run make down.

2/ Init the Swarm cluster

All operations are executed by Ansible, so make sure your private key is loaded in your SSH agent: ssh-add -K ~/.ssh/swarm-clstr-kp.pem (on macOS)
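
On Linux the -K flag (which stores the passphrase in the macOS keychain) is not needed; a plain ssh-add ~/.ssh/swarm-clstr-kp.pem is enough.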

Install the Ansible requirements on CoreOS (from the ansible folder ./ansible): ansible-playbook bootstrap.yml

Start the cluster: ansible-playbook init-swarm.yml

If you want Portainer (a web UI for Docker), start it with: ansible-playbook docker-ui.yml
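
Once the playbooks have run, you can sanity-check the cluster from a manager node. A minimal sketch, assuming you SSH through the bastion (addresses are placeholders):

# list the nodes and their roles as seen by the Swarm
ssh -J core@<bastion-public-ip> core@<manager-private-ip> docker node ls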

3/ Run new services

Starting a new service is very easy; you can follow Docker's tutorials.

Be aware that the load balancer is only configured to forward traffic incoming on TCP:80 to TCP:3543 on the Swarm instances.
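
As a minimal sketch, here is a hypothetical service published on port 3543 so that the default load balancer can reach it (the service name and the nginx:alpine image are placeholders):

docker service create \
  --name web \
  --replicas 3 \
  --publish published=3543,target=80 \
  nginx:alpine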

To change the load balancer configuration, edit ./terraform/elb.tf; don't forget to update the attached security group as well.

4/ Connect with OpenVPN

You can connect to your Swarm cluster directly with OpenVPN and access the services you have deployed on it. First you need to add the OpenVPN service to the bastion host.

The OpenVPN setup is based on this GitHub project.

Run the Ansible playbook: cd ansible && ansible-playbook bastion.yml

This will automatically create the file /tmp/CLIENTADMIN.conf containing the OpenVPN client configuration.

If it fails, you can run the following script manually: ./scripts/get-admin-vpn-cert.sh > myconf.conf

To create more configurations for your users, run the script: ./scripts/create-client-vpnconf.sh client-name file-name.conf
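
To actually connect, feed the generated file to an OpenVPN client (a sketch, assuming the openvpn client is installed on your machine):

sudo openvpn --config /tmp/CLIENTADMIN.conf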

5/ Monitoring

This monitoring setup is experimental.

The monitoring stack is based on this stack. To run the monitoring, execute cd ./ansible && ansible-playbook docker-monitoring.yml

You can also find an example Grafana dashboard in ./monitoring/grafana-dashboard/docker-swarm-container-overview.json. Grafana is accessible at http://SWARM_NODE:3543 (you need to be connected to the VPN to reach this service).

Optional

Docker API Accessible via TLS

You can also configure the Docker daemon to be reachable from a remote CLI over TLS.

To enable that feature, follow these steps:

# After Terraform has been applied
$ cd ./certificate

# Create the CA certificate and the server certificates
$ make

# Create the client certificate
$ make gen-client-certs CLIENT=client

# Execute the Ansible playbook to enable TLS
$ cd ../ansible/
$ ansible-playbook docker-certificates.yml

# All nodes of your Swarm are then accessible via TLS like this:
$ export IP_OF_ONE_NODE=10.43.1.20
$ docker -H tcp://IP_OF_ONE_NODE:2376 --tlsverify  --tlscacert ../certificates/ca.pem --tlscert ../certificates/clients/client.pem --tlskey ../certificates/clients/client-key.pem info
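
As an alternative to passing the TLS flags on every call, you can export the standard Docker client environment variables. A sketch only: DOCKER_CERT_PATH expects files named ca.pem, cert.pem and key.pem, so you may need to copy the client certificates under those names first.

$ export DOCKER_HOST=tcp://$IP_OF_ONE_NODE:2376
$ export DOCKER_TLS_VERIFY=1
$ export DOCKER_CERT_PATH=~/.docker/swarm-certs   # directory containing ca.pem, cert.pem, key.pem
$ docker info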

Optimisations

  • Currently the project works in only one AZ, which is not great for high availability
  • The project only supports AWS