How to install a scalable, production-ready WordPress with Ansible – Part 1


WordPress is one of the most popular content management systems (CMS) out there. According to w3techs.com, about 25% of the whole web is using WordPress. It is based on PHP and MySQL and supports a nearly limitless number of plugins. You can find a lot of tutorials on how to install and set up a WordPress application in different variations. Most of them only deal with a very basic setup on a single node – which can be a good start – but that is definitely not enough for running your online business on it.

In this series of posts, I will try to explain how to set up a WordPress infrastructure for a real production environment. This includes:

  • hardening of the server settings
  • implementing a backup strategy
  • automatically applying security updates (both OS and application)
  • setting up a monitoring service
  • and all that on a scalable, highly available, load-balanced infrastructure.
  • … everything that you need to run and operate your business application on WordPress …

Anyway – we will start small with a low price tag and build an architecture that can easily scale up.

I will be using Ansible as an automation tool to install and configure all the necessary components. If you have never heard about Ansible, you might want to check out this “What is Ansible” video first.

In the first part of the series, I will focus on setting up the servers and running the common tasks. This includes the following:

  • Using Ansible to create the droplets on DigitalOcean.
  • Using Ansible to run the common tasks role.
  • Using Ansible to install the necessary tools and set up the firewall (fail2ban, iptables, etc.)

WordPress infrastructure

WordPress is a lightweight application that can be installed with all its components on a minimal node. But this minimal node will not support a large number of visitors and will crash once traffic exceeds what it can handle, especially if you are running more than a blog, for example an e-commerce site using WooCommerce.
In that case, the appropriate solution is to set up a multi-tier architecture that divides the application into layers of functional components. A simple diagram of this architecture looks like the following:

scalable WordPress architecture

As shown in the figure, the setup will consist of three layers:

 

Layer 1 – Load Balancer
This will be a reverse proxy and load balancer in front of the backend application. HAProxy or nginx can be used for this purpose, and both do the job well. This layer also makes it possible to scale the application by adding more application servers to the backend.

Layer 2 – Application Servers
Each application server will be configured with all the components needed to run the WordPress application except the database. The application files will live in a private Git repository, and I will use Amazon S3 to host the static files.

Layer 3 – Database Backend
The last layer will be the database backend: a MySQL database instance, configured so that it can be used later in a MySQL replication scenario.

Dynamic Inventory and Creating Droplets

You probably guessed it: I am going to use DigitalOcean as our host. The droplets will be in the same datacenter and they will use the private network to communicate.
Ansible provides a basic way to specify remote hosts by using an inventory file. This is just a normal text file that contains information about each host:

node1

[west]
node2
node3

In this example, we have 3 nodes and two of them are grouped in the “west” group. Using this method will enable us to apply several tasks on all hosts in a specific group.
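To make the grouping concrete, here is a minimal Python sketch of how such a file maps hosts to groups. It is not Ansible's real parser: the `parse_inventory` helper is hypothetical and ignores host variables, child groups, and host ranges. Hosts listed before any `[group]` header are collected under `ungrouped`, which is the name Ansible uses for such hosts.

```python
from collections import OrderedDict


def parse_inventory(text):
    """Map group names to host lists for a simple INI-style inventory."""
    groups = OrderedDict()
    current = "ungrouped"  # hosts listed before any [group] header
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups.setdefault(current, [])
        else:
            # Keep only the host name, drop any trailing key=value pairs.
            groups.setdefault(current, []).append(line.split()[0])
    return groups


INVENTORY = """
node1

[west]
node2
node3
"""

groups = parse_inventory(INVENTORY)
print(dict(groups))
```

A play that targets `west` would then run only against node2 and node3.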
Maintaining a static inventory file might not be the best approach when you manage several servers in the cloud. That’s why Ansible also provides a way to pull the inventory dynamically from a cloud service provider (like AWS, Linode, or DigitalOcean).

The dynamic inventory (or inventory script) is a script that communicates with the remote cloud service provider’s API and pulls information about the hosts.
Ansible’s DigitalOcean support relies on a Python wrapper called dopy to communicate with DigitalOcean’s API.
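As a rough illustration of what an inventory script hands back to Ansible: when called with `--list`, it prints a JSON document that maps group names to hosts, optionally with per-host variables under `_meta.hostvars`. The sketch below builds that structure from a hard-coded droplet list. The `build_inventory` helper, the `group` field, and the IP addresses are assumptions for illustration only; a real script would fetch the droplets from the DigitalOcean API and may group them differently.

```python
import json


def build_inventory(droplets):
    """Arrange droplets in the JSON shape Ansible expects from `--list`."""
    inventory = {"_meta": {"hostvars": {}}}
    for droplet in droplets:
        group = inventory.setdefault(droplet["group"], {"hosts": []})
        group["hosts"].append(droplet["name"])
        # Per-host variables live under _meta.hostvars.
        inventory["_meta"]["hostvars"][droplet["name"]] = {
            "ansible_ssh_host": droplet["ip_address"],
        }
    return inventory


# Hypothetical data standing in for an API response.
DROPLETS = [
    {"name": "lb1", "group": "lbs", "ip_address": "203.0.113.10"},
    {"name": "app1", "group": "apps", "ip_address": "203.0.113.11"},
    {"name": "db1", "group": "dbs", "ip_address": "203.0.113.12"},
]

inventory = build_inventory(DROPLETS)
print(json.dumps(inventory, indent=2, sort_keys=True))
```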

I will be using Jeff Geerling’s inventory script. To start using the script with Ansible, you have to install the dopy Python package on your local machine:

$ sudo pip install dopy

Then set the two variables which define the client ID and the API key (you get them from your DigitalOcean account):

$ export DO_CLIENT_ID=xxxxx
$ export DO_API_KEY=xxxxxx

The droplets.yml file will contain a set of tasks to create the droplets and add them to the inventory. The file looks like the following:

---
- hosts: localhost
  connection: local
  gather_facts: false

 # Create load balancer droplets
  tasks:

    - name: Create Load Balancer Droplet
      digital_ocean:
        state: present
        command: droplet
        name: lb1
        private_networking: yes
        size_id: 66
        image_id: 13089493
        region_id: 7
        ssh_key_ids: 430781
        unique_name: yes
        wait_timeout: 600
      register: lb1

    - name: Add lb1 to the inventory
      add_host:
        ansible_ssh_host: "{{ lb1.droplet.ip_address }}"
        ansible_ssh_port: 22
        name: lb1
        groups: lbs, all_droplets
      when: lb1.droplet is defined

 # Create Application backend droplets
 # app1
    - name: Create Application server
      digital_ocean:
        state: present
        command: droplet
        name: app1
        private_networking: yes
        size_id: 66
        image_id: 13089493
        region_id: 7
        ssh_key_ids: 430781
        unique_name: yes
        wait_timeout: 600
      register: app1

    - name: Add App server to the Inventory
      add_host:
        ansible_ssh_host: "{{ app1.droplet.ip_address }}"
        ansible_ssh_port: 22
        name: app1
        groups: apps, all_droplets
      when: app1.droplet is defined

# Create Database servers backend
    - name: Create Database servers
      digital_ocean:
        state: present
        command: droplet
        name: db1
        private_networking: yes
        size_id: 66
        image_id: 13089493
        region_id: 7
        ssh_key_ids: 430781
        unique_name: yes
        wait_timeout: 600
      register: db1

    - name: Add database server to the inventory
      add_host:
        ansible_ssh_host: "{{ db1.droplet.ip_address }}"
        ansible_ssh_port: 22
        name: db1
        groups: dbs, all_droplets
      when: db1.droplet is defined

- hosts:
    - lbs
    - dbs
    - apps
  remote_user: root
  tasks:
    - name: Wait for port 22 to become available
      local_action: "wait_for port=22 host={{ ansible_eth0.ipv4.address }}"

The previous file will create 3 droplets (lb1, app1, db1) and add each of them to a separate group for load balancers, application servers, and database servers respectively. Each of them is a 512MB droplet in the same datacenter (region) with private networking enabled.

Setting up the Playbook

The playbook will consist of several plays. This post discusses the first play, which contains the common tools and tasks to install on all hosts:

---
# Provision Droplets
- include: droplets.yml

# Play 1 - Common tasks
- hosts: all_droplets
  remote_user: root
  gather_facts: force
  roles:
    - { role: common, tags: ["common"] }
    - { role: openssh, tags: ["openssh"] }
    - { role: fail2ban, tags: ["fail2ban"] }
    - { role: rkhunter, tags: ["rkhunter"] }
    - { role: iptables, tags: ["iptables"] }

The first play consists of 5 roles that will be applied to all servers. The next sections discuss the main idea behind each role and how it works.

Note:
The post will not discuss each line of each role, but it will briefly explain the major points of each one. If you want to see the whole thing, you can check out the playbook on GitHub.

Common role

The common role installs several packages (vim, sudo, screen, git, etc.) and enables scrolling inside screen, which is a useful thing to have when troubleshooting. It then installs ntp:

---
- name: Install needed packages
  apt: name={{ item }} state=present update_cache=yes
  with_items: "{{ common_packages }}"

- name: Enable Screen Scrolling
  lineinfile: dest=/etc/screenrc line="termcapinfo xterm* ti@:te@" insertafter=EOF

- name: Install NTP
  apt: name=ntp state=present update_cache=yes

- name: Adding Swapfile
  include: swap.yml
  when: common_swap == True

- name: Adding user
  include: user.yml
  when: common_user == True

The role will also add a swap file using the swap.yml file and add an administrator user with sudo capabilities to the servers.

OpenSSH role

I will be using ANXS’s role with some changes. This role makes sure that the OpenSSH client and server packages are installed, and it also adds the configuration for the client and the server:

---

- name: Install client and server packages
  apt: name={{ item }} state=present update_cache=yes
  with_items:
    - openssh-client
    - openssh-server

- name: Add openssh client settings
  template: src=openssh_client.config dest=/etc/ssh/ssh_config owner=root group=root mode=644
  notify: Restart Openssh

- name: Add openssh server settings
  template: src=openssh_server.config dest=/etc/ssh/sshd_config owner=root group=root mode=644
  notify: Restart Openssh

Later I will discuss how to set and manage the variables for each role and for the hosts too.

 

Fail2ban role

fail2ban monitors log files, matches them against patterns (filters), and takes a corresponding action when it finds a match. For example, it will scan /var/log/auth.log for repeated failed login attempts and take action against the offending IP. The action may include updating the firewall to block this IP or sending a warning mail to the administrator.

The role will install fail2ban and configure multiple sections for common attacks:

---
- name: Install fail2ban package
  apt: name=fail2ban state=present update_cache=yes

- name: Add fail2ban template configuration
  template: src=jail.local.j2 dest=/etc/fail2ban/jail.local mode=644 owner=root group=root
  notify: Restart fail2ban

The sections are defined in the jail.local file, which includes two sections that protect SSH against brute-force login attempts and DDoS attacks:

[ssh]
enabled = true
port = ssh
filter = sshd
logpath = /var/log/auth.log
maxretry = 6

[ssh-ddos]
enabled = true
port = ssh
filter = sshd-ddos
logpath = /var/log/auth.log
maxretry = 6
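The effect of `maxretry` can be sketched in a few lines of Python. This is only an approximation of the idea, not fail2ban's implementation: the regular expression is a simplified stand-in for the real sshd filter, the log lines are made up, and the time window (`findtime`) that fail2ban also applies is ignored.

```python
import re
from collections import Counter

MAXRETRY = 6  # same threshold as in the jails above

# Simplified pattern; the real sshd filter ships many more expressions.
FAILED_LOGIN = re.compile(r"Failed password for .* from (\d+\.\d+\.\d+\.\d+)")


def hosts_to_ban(log_lines, maxretry=MAXRETRY):
    """Return the IPs that reached `maxretry` failed logins."""
    failures = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group(1)] += 1
    return sorted(ip for ip, count in failures.items() if count >= maxretry)


# Made-up log excerpt: one noisy host, one with a couple of typos, one success.
SAMPLE_LOG = (
    ["sshd[101]: Failed password for root from 198.51.100.7 port 4242 ssh2"] * 6
    + ["sshd[102]: Failed password for invalid user bob from 198.51.100.8 port 4243 ssh2"] * 2
    + ["sshd[103]: Accepted publickey for myadmin from 192.0.2.5 port 4244 ssh2"]
)

banned = hosts_to_ban(SAMPLE_LOG)
print(banned)
```

Only the host with six failures crosses the threshold; the other two are left alone.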

Rkhunter role

Rkhunter is a tool that scans the system for rootkits, exploits, and backdoors. It does that by comparing the hashes of certain files against a database of known-good values. It also scans for common rootkit locations on the system, suspicious hidden files, and wrong permissions.

The role will install rkhunter, add a configuration file that defines some rules which can be overridden to suit your requirements, and then create the baseline for rkhunter:

---

- name: Install rkhunter
  apt: name=rkhunter state=present update_cache=yes install_recommends=no

- name: Add rkhunter configuration file
  template: src=rkhunter.conf.j2 dest=/etc/rkhunter.conf owner=root group=root
  notify: update rkhunter
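The baseline idea behind the hash comparison can be illustrated with a small Python sketch: record a hash for each watched file once, then report every file whose current hash no longer matches. This is not how rkhunter is implemented (rkhunter keeps its own file properties database, refreshed with `rkhunter --propupd`); the helper names are hypothetical, SHA-256 is an arbitrary choice, and temporary files stand in for system binaries.

```python
import hashlib
import os
import tempfile


def file_hash(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def build_baseline(paths):
    """Record the current hash of every watched file."""
    return {path: file_hash(path) for path in paths}


def changed_files(baseline):
    """List watched files whose hash differs from the baseline."""
    return sorted(p for p, digest in baseline.items() if file_hash(p) != digest)


# Temporary files stand in for binaries such as /bin/ls or /bin/ps.
workdir = tempfile.mkdtemp()
watched = []
for name in ("ls", "ps"):
    path = os.path.join(workdir, name)
    with open(path, "wb") as handle:
        handle.write(b"original " + name.encode())
    watched.append(path)

baseline = build_baseline(watched)
unchanged = changed_files(baseline)  # nothing modified yet

with open(watched[0], "wb") as handle:
    handle.write(b"tampered")  # simulate a replaced binary
modified = changed_files(baseline)
print([os.path.basename(p) for p in modified])
```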

Iptables role

Configuring the firewall is a crucial step when setting up production servers. Each server will run different services and therefore has different ports that need to be configured with iptables.

I will be using Spirula’s role for iptables with some changes. The role will add an init script for iptables, and configure the iptables script which will add rules for each port and configure policies. It will also open the firewall to the private network:

 

---
- name: Copy iptables init script
  template: src=iptables.init dest=/etc/init.d/iptables mode=0700

- name: Enable iptables init script
  command: update-rc.d iptables defaults

- name: Add iptables script
  template: src=iptables.sh dest=/root/iptables_script.sh mode=0700
  ignore_errors: yes

- name: Create /var/lib/iptables
  file: path=/var/lib/iptables state=directory

- name: Run iptables script
  command: /root/iptables_script.sh

As described in the role, the iptables script will iterate over the ports defined by the incoming and outgoing variables to allow either incoming or outgoing traffic on those ports.

For example, the incoming ports for the application server will be:

incoming:
  - name: SSH
    port: 22
  - name: HTTP
    port: 80
  - name: HTTPS
    port: 443

In the iptables script, the template loops over the incoming variable to define each rule, as follows:

# Incoming
{% for service in incoming %}
{{'##'|e }} {{ service.name }}
{{'##'|e }} {{'=' * service.name|length }}

iptables -A INPUT  -p {{ service.protocol | default('tcp') }} {{ '-s '+service.source if service.source is defined else   '' }} --dport {{ service.port }}  -j ACCEPT

iptables -A OUTPUT -p {{ service.protocol | default('tcp') }} {{ '-d '+service.source if service.source is defined else   '' }} --sport {{ service.port }}  {{ '' if service.protocol is defined and service.protocol == 'udp' else '! --syn ' }} -j ACCEPT
{% endfor %}
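To see roughly what the loop produces, the following Python sketch mirrors the template's logic with plain string formatting instead of Jinja2. The `render_rules` function is a hypothetical stand-in, the comment header lines are left out, and whitespace is normalized, so the real rendered script will differ in spacing.

```python
def render_rules(services):
    """Build the INPUT/OUTPUT rule pair for each service, like the template."""
    rules = []
    for service in services:
        protocol = service.get("protocol", "tcp")  # template default is tcp
        source = service.get("source")
        in_src = "-s {} ".format(source) if source else ""
        out_dst = "-d {} ".format(source) if source else ""
        # Replies must not open new connections; UDP has no SYN flag.
        syn = "" if protocol == "udp" else "! --syn "
        rules.append(
            "iptables -A INPUT -p {} {}--dport {} -j ACCEPT".format(
                protocol, in_src, service["port"]
            )
        )
        rules.append(
            "iptables -A OUTPUT -p {} {}--sport {} {}-j ACCEPT".format(
                protocol, out_dst, service["port"], syn
            )
        )
    return rules


# Same values as the incoming variable for the application server.
INCOMING = [
    {"name": "SSH", "port": 22},
    {"name": "HTTP", "port": 80},
    {"name": "HTTPS", "port": 443},
]

rules = render_rules(INCOMING)
for rule in rules:
    print(rule)
```

Each service yields one rule that accepts incoming packets on its port and one that allows the matching replies out.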

Managing Variables

Each role has two places to define variables: the defaults and vars directories. I used defaults to list the default variables for each role. For example, the common role has its own defaults:

common_packages:
    - vim
    - screen
    - sudo 
    - htop
    - strace
    - curl
    - wget
    - git-core

common_user: True
common_swap: True

# common_user variables
common_users: 
  - name: myadmin
    key: "{{ lookup('file', 'myadmin.pub') }}"

# common_swap variables
swap_size: 512
swap_path: /swapfile

The default variables can be overridden from other places where you can define variables, like group_vars or the command line. In our case, I created a group_vars directory which has 4 files:

  • all.yml: This file contains all variables common to all servers (load balancers, apps, databases).
  • dbs.yml: This file contains all variables that affect database servers. The naming is important: dbs.yml must match the dbs group in the inventory.
  • apps.yml: This file contains all variables that affect app servers.
  • lbs.yml: This file contains all variables that affect load balancers.
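The override order can be pictured as layered dictionaries where later layers win: role defaults have the lowest priority, then group_vars/all.yml, then the file for the host's own group, and variables passed on the command line beat everything. The Python sketch below only models that ordering for the layers mentioned here; the override values are made up for illustration and are not taken from the actual group_vars files.

```python
def resolve_vars(*layers):
    """Merge variable layers; later layers override earlier ones."""
    resolved = {}
    for layer in layers:
        resolved.update(layer)
    return resolved


# Defaults from the common role (values from the listing above).
role_defaults = {"common_swap": True, "swap_size": 512, "swap_path": "/swapfile"}

# Hypothetical overrides, for illustration only.
group_all = {"swap_size": 1024}  # would live in group_vars/all.yml
group_dbs = {"swap_size": 2048}  # would live in group_vars/dbs.yml
extra_vars = {}                  # e.g. passed with -e on the command line

db_vars = resolve_vars(role_defaults, group_all, group_dbs, extra_vars)
app_vars = resolve_vars(role_defaults, group_all, {}, extra_vars)
print(db_vars["swap_size"], app_vars["swap_size"])
```

With these made-up values, a database server ends up with the dbs override while an app server falls back to the value shared by all groups.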

 

Running the Playbook

Before running the Ansible playbook against the three nodes, let’s take a look at the playbook’s structure:

.
├── digital_ocean.py
├── droplets.yml
├── group_vars
│   ├── all.yml
│   ├── apps.yml
│   ├── dbs.yml
│   └── lbs.yml
├── playbook.yml
└── roles
    ├── common
    ├── fail2ban
    ├── iptables
    ├── openssh
    └── rkhunter

You can clone the full code from https://github.com/slash4-de/install-wordpress-ansible

To run the playbook, execute the following command:

$ ansible-playbook -i digital_ocean.py playbook.yml

What’s Next

In the next post, I will add the roles for installing and configuring the servers’ main components, including mysql, php5, nginx, etc. I will also use Ansible Vault to secure passwords. Stay tuned.