A POC framework to create a WordPress Docker image with Ansible/Packer and deploy it to AWS ECS
First, I implemented the Ansible role that installs WordPress in the Docker image. The role is testable with Molecule, which improves development velocity. Molecule runs its test workflow against the local Docker daemon, and the unit tests use Testinfra to check the configuration applied by Ansible. I ran into some issues with Molecule on macOS, so I created a Vagrantfile to run it inside an Ubuntu virtual machine for development.
I use php:7.2-apache as the base image in order to focus on the big picture of the project. The Ansible role installs wp-cli, which is used to download and configure WordPress. It also configures support for media and MySQL. Finally, it copies an entrypoint file that reads environment variables to customize the configuration. The entrypoint calls wp-cli to download and configure WordPress and the database. It is idempotent.
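To make the idempotency of the entrypoint concrete, here is a minimal sketch of how such a script could be structured. It is not the entrypoint shipped in this repository: the `WP_PATH`, `WP_BIN` and `WORDPRESS_*` variable names and their defaults are assumptions chosen for illustration. Each step is guarded by a check, so running it a second time changes nothing.

```shell
#!/usr/bin/env bash
# Sketch of an idempotent WordPress entrypoint built on wp-cli.
# All WP_* and WORDPRESS_* names below are hypothetical; the real
# entrypoint may read different environment variables.

WP_PATH="${WP_PATH:-/var/www/html}"

# Run wp-cli against WP_PATH. WP_BIN allows substituting a stub in tests.
wp_cli() {
  "${WP_BIN:-wp}" "$@" --path="$WP_PATH" --allow-root
}

setup_wordpress() {
  # Step 1: download the core files only if they are not there yet.
  if [ ! -f "$WP_PATH/wp-load.php" ]; then
    wp_cli core download || return 1
  fi

  # Step 2: generate wp-config.php from the environment only once.
  if [ ! -f "$WP_PATH/wp-config.php" ]; then
    wp_cli config create \
      --dbname="${WORDPRESS_DB_NAME:-wordpress}" \
      --dbuser="${WORDPRESS_DB_USER:-wordpress}" \
      --dbpass="${WORDPRESS_DB_PASSWORD:-}" \
      --dbhost="${WORDPRESS_DB_HOST:-localhost}" || return 1
  fi

  # Step 3: create the tables and the admin user only if the site
  # is not installed in the database yet.
  if ! wp_cli core is-installed; then
    wp_cli core install \
      --url="${WORDPRESS_URL:-http://localhost}" \
      --title="${WORDPRESS_TITLE:-WordPress}" \
      --admin_user="${WORDPRESS_ADMIN_USER:-admin}" \
      --admin_password="${WORDPRESS_ADMIN_PASSWORD:-change-me}" \
      --admin_email="${WORDPRESS_ADMIN_EMAIL:-admin@example.com}" \
      --skip-email || return 1
  fi
}
```

A real entrypoint would typically call `setup_wordpress` and then hand over to the web server, for example with `exec apache2-foreground` in an image based on php:7.2-apache.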
Molecule is only for testing and does not build the final Docker image. That is Packer's role, through two different approaches:
- Local build with the wordpress-local.json template, which builds the image locally
- Remote build with the wordpress.json template, which builds the image locally, tags it and pushes it to AWS ECR
There is a docker-compose file that includes a MySQL database for functional tests. It uses the image generated by the wordpress-local.json template. I used it to validate that the WordPress image works with a MySQL database. Once that was done, I moved on to Terraform to provision the required AWS infrastructure. I split the required components into modules and organized Terraform by environment, with layers that reuse the modules.
Each layer has its own Terraform state stored in the S3 bucket. The dependencies between modules are passed through the remote states.
The different components are:
- AWS S3 to configure an S3 bucket that stores the Terraform states
- AWS VPC to reuse the default VPC, subnet and security group
- AWS ECR to store the Docker image
- AWS ECS to configure the ECS cluster
- AWS RDS to provide MySQL database for Wordpress
- AWS ECS Wordpress to configure Wordpress service and task in the ECS cluster
After implementing ECR, I built and pushed the Docker image with Packer to continue working on the ECS Wordpress service and task.
- Create or use an existing IAM user with API access. If you don't have an AWS account yet, sign up for one and create a user with API access.
- Clone this repository
- Install packer, terraform, awscli and vagrant.
When I built this project, I was using the following versions:
- packer: 1.7.2
- terraform: 0.15.4
- ansible: 2.9
- vagrant: 2.2.14
- awscli: 1.19.15
- docker client: 20.10.6
I installed them using brew on macOS:
brew install packer terraform ansible awscli docker
- Install Docker Desktop
- Check everything is ready:
make check
- Configure your AWS credentials in $HOME/.aws/credentials:
[default]
aws_access_key_id = <your_aws_access_key_id>
aws_secret_access_key = <your_aws_secret_access_key>
- Run the Molecule tests in Vagrant:
$ cd ansible
$ vagrant up
$ vagrant ssh
vagrant@ubuntu-xenial:~$ cd /vagrant/roles/wordpress
vagrant@ubuntu-xenial:/vagrant/roles/wordpress$ molecule test
vagrant@ubuntu-xenial:/vagrant/roles/wordpress$ exit
- Build the WordPress Docker image locally:
$ cd ..
$ make build-local
- Run docker-compose :
$ docker-compose up
- Access the local WordPress on http://localhost once it is ready
- Stop docker-compose with CTRL-C
- Replace the bucket name in the different layers:
$ find terraform/environments/dev -type f -name "main.tf" -exec sed -i -e 's/guivin-terraform-states/<your_bucket>/g' {} \;
- Deploy aws-s3-bucket layer :
$ cd terraform/environments/dev/aws-s3-bucket
$ terraform init
$ terraform apply
- Deploy aws-ecr layer :
$ cd ../aws-ecr
$ terraform init
$ terraform apply
- Deploy aws-vpc layer :
$ cd ../aws-vpc
$ terraform init
$ terraform apply
- Deploy aws-ecs layer :
$ cd ../aws-ecs
$ terraform init
$ terraform apply
- Deploy aws-rds layer :
$ cd ../aws-rds
$ terraform init
$ terraform apply
- Deploy aws-ecs-wordpress layer :
$ cd ../aws-ecs-wordpress
$ terraform init
$ terraform apply
- Run terraform destroy on each layer in reverse order:
- aws-ecs-wordpress
- aws-rds
- aws-ecs
- aws-vpc
- aws-ecr
- aws-s3-bucket
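The layer-by-layer deployment and the reverse-order teardown described above can be scripted. The following is only a sketch, not a wrapper provided by this repository: the function names, the `LAYERS_DIR` variable and the `DRY_RUN` switch are assumptions made for illustration. It relies on the `-chdir` global option available in the Terraform version listed above. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper around the Terraform layers of this README.
# LAYERS_DIR, DRY_RUN and the function names are hypothetical.

LAYERS_DIR="${LAYERS_DIR:-terraform/environments/dev}"

# Layers in deployment order, as listed in the steps above.
LAYERS="aws-s3-bucket aws-ecr aws-vpc aws-ecs aws-rds aws-ecs-wordpress"

# Print the layers in reverse (destroy) order, one per line.
destroy_order() {
  reversed=""
  for layer in $LAYERS; do
    reversed="$layer $reversed"
  done
  # Intentionally unquoted so that each layer becomes its own argument.
  printf '%s\n' $reversed
}

# Print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run init and apply on every layer, in deployment order.
deploy_all() {
  for layer in $LAYERS; do
    run terraform -chdir="$LAYERS_DIR/$layer" init || return 1
    run terraform -chdir="$LAYERS_DIR/$layer" apply || return 1
  done
}

# Run destroy on every layer, in reverse order.
destroy_all() {
  for layer in $(destroy_order); do
    run terraform -chdir="$LAYERS_DIR/$layer" destroy || return 1
  done
}
```

After sourcing the script from the repository root, `deploy_all` shows the planned commands and `DRY_RUN=0 deploy_all` actually runs them; Terraform still asks for confirmation on each apply and destroy.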
- ECR stores the images in the AWS account so that ECS can use them.
- Packer has two templates. Both use 4 provisioners: a shell-local provisioner to install the Ansible dependencies, a shell provisioner to install Ansible, an ansible-local provisioner to apply the wordpress role, and a shell provisioner to clean up Ansible as the last step. In wordpress.json, the post-processor generates a tagged image and uploads it to the ECR registry. In wordpress-local.json, the post-processor builds the image in the local Docker daemon.
- For cost reasons, the project uses the default VPC with its associated security group and only one availability zone. The aws-rds and aws-ecs-wordpress modules each use a dedicated security group linked to the default security group. aws-rds opens TCP/3306 to the default security group, and aws-ecs-wordpress opens TCP/80 to the Internet.
- The infrastructure is in a single AZ to avoid the costs of cross-AZ data transfer and RDS Multi-AZ.
- Wordpress is deployed on ECS using Fargate meaning there is no EC2 instance needed for the cluster.
- There is an IAM role for the wordpress ECS service to manage permissions.
- A public IP is automatically assigned to the Wordpress ECS service.
- There is a CloudWatch log group to store the ECS WordPress logs and investigate them.
- I had issues with Molecule on MacOSX. I created a Vagrant image as a workaround.
- I tried to use alpine as the base image to get an image as lightweight as possible. I usually use it when writing classic Docker images, but with Ansible it took me more time. In the end, I switched to php:7.2-apache so as not to get stuck on details and to move forward on the overall project structure. This can be revisited later.
- The Classic Load Balancer is not compatible with Fargate. An Application Load Balancer works, provided you have at least two availability zones.
- I tried to implement custom networking with 3 availability zones and subnets and 1 NAT gateway to reduce costs, but I reverted it because this setup was not covered by the AWS free tier.
- You cannot place an aws_db_instance into specific subnets unless you specify at least two subnets in two different availability zones. If you use only one availability zone, you need to specify the availability zone rather than the subnets.
- There are a lot of Terraform layers to deploy, but this can be automated with a custom wrapper or an existing tool (e.g. Terragrunt).
- Dedicated VPC with /16 ip address range (eg: 10.1.0.0/16).
- 3 private subnets (eg: 10.1.0.0/24, 10.1.1.0/24, 10.1.2.0/24).
- 3 public subnets (eg: 10.1.3.0/24, 10.1.4.0/24, 10.1.5.0/24).
- A private and public subnet on each availability zone (eg: us-east-1a, us-east-1b, us-east-1c).
- 3 NAT gateways for the 3 public subnets.
- An Elastic IP for each NAT gateway.
- Route tables and associations for all subnets.
- Separate security groups for each component.
- Internet gateway to interface the VPC with the Internet.
- Configure EFS to ensure persistent container volumes. EFS is already an HA service.
- Configure Cloudfront for CDN.
- Configure an application load-balancer to distribute load to containers located in different availability zones.
- Configure RDS cluster instead of a simple database instance. Ensure there is replication between the primary and read replicas. There should be at least one primary and one replica. The primary and replica(s) must be located in different availability zones.
- Configure auto-scaling group for ECS wordpress service to scale up/down following the load.
- Configure and use HashiCorp Vault or AWS Secrets Manager to store and share sensitive data such as credentials.
- Configure HashiCorp Consul for auto-discovery between micro-services.
- Configure DynamoDB table to lock Terraform to avoid concurrent changes.
- Configure alerting and observability using Cloudwatch or Prometheus for example.
- Configure log alerting for containers.
- Configure Route53 public domain.
- Configure Route53 private record to communicate between services.
- Configure HTTPS on the Wordpress service.
- Configure TLS (or, failing that, SSL) between WordPress and RDS.
- Configure CDN for wordpress.
- Store media on S3 and distribute them with CloudFront.
- Configure SES to send emails.
- Refine container resource assignments.
- Store user sessions using Redis via AWS Elasticache to easily manage them with TTL expiration.
- Split the Terraform modules into separate git repositories with unit tests (kitchen) and CI, and version them with git tags.
- Move the WordPress Ansible role to its own git repository and configure CI to trigger the Molecule tests on changes.
- Follow 12factor manifesto as much as possible.
- Use alpine as the base image to get a lightweight image.
- Use Terragrunt to deploy all the layers in one command and facilitate value exchange between layers.
Tomorrow we want to put this project in production. What would be your advice and choices to achieve that ?
- Only one WordPress container is deployed; multiple WordPress containers are expected in production.
- Encrypt RDS EBS volumes and snapshots
- Configure TLS/SSL between load-balancer and containers.
- Configure HTTPS for Wordpress.
- Configure RDS cluster with primary and read replica(s).
- Configure a custom domain for Wordpress.
- Save all sensitive values in HashiCorp Vault or AWS Secrets Manager.
- Store logs in Cloudwatch or in ELK stack.
- Configure log alerting.
- Monitor services by collecting time-series metrics with, for example, CloudWatch, Datadog or Prometheus/Grafana. Monitoring must include listening to ECS events to track container crash loops.
- Configure backups/snapshots of the database. Backup restoration must be tested automatically to ensure the backups are valid. They can be stored in an S3 bucket.
- Collect logs from VPC and ELBs for security audit.
- Configure CI/CD pipeline to automate tests and delivery.