In Part 2 of this series, we created one WordPress instance connected to a MySQL server on top of Red Hat OpenShift Container Platform. Now it’s time to scale from one deployment to more by better using our available resources. Once we’ve done that, we’ll show a failure scenario again, illustrating what effect different storage backends can have.
OpenShift on AWS test environment
All posts in this series use a Red Hat OpenShift Container Platform on AWS setup that includes 8 EC2 instances deployed as 1 master node, 1 infra node, and 6 worker nodes that also run Red Hat OpenShift Container Storage Gluster and Heketi pods.
The 6 worker nodes act as both the storage provider and the persistent storage consumers (MySQL). As shown below, the OpenShift Container Storage worker nodes are of instance type m5.2xlarge with 8 vCPUs and 32 GB of memory. Each node has 3x100GB gp2 volumes attached for OCP and one 1TB gp2 volume for the OCS storage cluster.
The AWS region us-west-2 has availability zones (AZs) us-west-2a, us-west-2b, and us-west-2c, and the 6 worker nodes are spread across the 3 AZs, 2 nodes in each AZ. This means the OCS storage cluster is stretched across these 3 AZs. Below is a view from the AWS console showing the EC2 instances and how they are placed in the us-west-2 AZs.
WordPress/MySQL setup
In Part 2 of this series, we showed how to use a StatefulSet to create one WordPress/MySQL project. One deployment on a 6-node cluster is not a typical use case, however. To take our example to the next level, we will now create 60 identical projects, each running one WordPress and one MySQL pod.
We chose 60 deployments because of the RAM available in our cluster. Every compute node is equipped with 32 GB of RAM, so the 6 nodes provide 192 GB overall. If we deploy 60 instances, each of which uses 2 GB for the MySQL pod, they will use 120 GB of that. That will leave enough memory available for the OpenShift cluster and the WordPress pods.
oc get projects | grep wp | wc -l
60
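The memory budget described above can be checked with a few lines of shell arithmetic. This is a minimal sketch that uses only the figures quoted in the text (6 compute nodes, 32 GB each, 60 MySQL pods at 2 GB each); the variable names are illustrative.

```shell
#!/bin/bash
# Figures taken from the text; variable names are illustrative.
NODES=6              # compute nodes
RAM_PER_NODE_GB=32   # RAM per compute node
MYSQL_PODS=60        # one MySQL pod per project
RAM_PER_MYSQL_GB=2   # RAM used by each MySQL pod

TOTAL_GB=$(( NODES * RAM_PER_NODE_GB ))
MYSQL_GB=$(( MYSQL_PODS * RAM_PER_MYSQL_GB ))
REMAINING_GB=$(( TOTAL_GB - MYSQL_GB ))

echo "Total RAM:         ${TOTAL_GB} GB"
echo "Used by MySQL:     ${MYSQL_GB} GB"
echo "Left for the rest: ${REMAINING_GB} GB"
```

The remainder is what the OpenShift cluster services and the 60 WordPress pods have to share.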
A closer look at project wp-1 shows us that it’s identical to what we used earlier:
oc project wp-1
oc get all
NAME                    READY     STATUS      RESTARTS   AGE
pod/mysql-ocs-0         1/1       Running     0          10m
pod/wordpress-1-6jmkt   1/1       Running     0          10m
pod/wordpress-1-build   0/1       Completed   0          10m

NAME                                DESIRED   CURRENT   READY     AGE
replicationcontroller/wordpress-1   1         1         1         10m

NAME                                                             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/glusterfs-dynamic-81b7b6cf-3f46-11e9-a504-02e7350e98d2   ClusterIP   172.30.176.74   <none>        1/TCP               10m
service/mysql-ocs                                                ClusterIP   172.30.27.4     <none>        3306/TCP            10m
service/wordpress                                                ClusterIP   172.30.23.152   <none>        8080/TCP,8443/TCP   10m

NAME                         DESIRED   CURRENT   AGE
statefulset.apps/mysql-ocs   1         1         10m

NAME                                           REVISION   DESIRED   CURRENT   TRIGGERED BY
deploymentconfig.apps.openshift.io/wordpress   1          1         1         config,image(wordpress:latest)

NAME                                       TYPE      FROM      LATEST
buildconfig.build.openshift.io/wordpress   Source    Git       1

NAME                                   TYPE      FROM          STATUS     STARTED          DURATION
build.build.openshift.io/wordpress-1   Source    Git@4094d36   Complete   10 minutes ago   20s

NAME                                       DOCKER REPO                                       TAGS      UPDATED
imagestream.image.openshift.io/wordpress   docker-registry.default.svc:5000/wp-1/wordpress   latest    10 minutes ago

NAME                                 HOST/PORT                                           PATH      SERVICES    PORT       TERMINATION   WILDCARD
route.route.openshift.io/wordpress   wordpress-wp-1.apps.ocpocs311.sagyocpocsonaws.com             wordpress   8080-tcp                 None
Because configuring 60 WordPress instances can be tedious, we automated the process, using curl in a bash script:
#!/bin/bash
START=1
END=60 # number of WordPress / MySQL projects

# set up the WordPress instances to attach to the corresponding MySQL DBs
function configure {
  echo "Configuring host: $1"
  # fetch the setup page and store the session cookie
  curl -c /tmp/cookie "$1/wp-admin/setup-config.php?step=1" > /dev/null 2>&1
  # submit the database connection settings
  curl -b /tmp/cookie --data "dbname=wordpress&uname=admin&pwd=secret&dbhost=mysql-ocs&prefix=wp_&submit=Submit" "$1/wp-admin/setup-config.php?step=2" > /dev/null 2>&1
  # run the WordPress installer
  curl -b /tmp/cookie --data "weblog_title=Title&user_name=admin&admin_password=secret&pass1-text=secret&admin_password2=secret&pw_weak=on&admin_email=admin%40somewhere.com&Submit=Install+WordPress&language=en_US" "$1/wp-admin/install.php?step=2" > /dev/null 2>&1
}

# get all the hosts we need to configure
for (( i=$START; i<=$END; i++ ))
do
  echo "Sleeping for 2 minutes to allow pods to come up..."
  sleep 120
  # read the route host directly instead of cutting it out of the table output
  HOST=$(oc get route wordpress -n "wp-$i" -o jsonpath='{.spec.host}')
  configure "$HOST"
done
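The script above waits a fixed two minutes before configuring each instance. One possible alternative, which is not part of the original setup, is to poll until a check succeeds and continue right away. The following is a minimal sketch: `wait_until` is a hypothetical helper, and any timeout and interval values passed to it are assumptions.

```shell
#!/bin/bash
# wait_until TIMEOUT INTERVAL CMD [ARGS...]
# Hypothetical helper: re-runs CMD every INTERVAL seconds until it succeeds
# or TIMEOUT seconds have passed. Returns 0 on success, 1 on timeout.
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  local deadline=$(( SECONDS + timeout ))
  until "$@" > /dev/null 2>&1; do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep "$interval"
  done
  return 0
}
```

Inside the loop, once `HOST` is known, a call such as `wait_until 120 5 curl -sf "$HOST/wp-admin/setup-config.php"` could take the place of `sleep 120`. It returns as soon as the setup page responds and gives up after two minutes.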
We now have our 60 projects running on Gluster-backed storage, one GlusterFS volume per deployment. Each of these 60 projects corresponds to one namespace (in OpenShift terminology, project is a synonym for namespace). Each namespace contains 2 pods (one WordPress pod, one MySQL pod) and one Persistent Volume Claim (PVC). This PVC is the storage on which MySQL keeps the database contents.
oc get project | grep wp | wc -l
60
Therefore, we have 60 projects up and running, each of which has its own PVC:
oc get pvc
NAME                         STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
mysql-ocs-data-mysql-ocs-0   Bound     pvc-81b7b6cf-3f46-11e9-a504-02e7350e98d2   8Gi        RWO            glusterfs-storage   15m
Failure scenario 1: WordPress/MySQL backed by OpenShift Container Storage
Now we want to see how long it takes a higher number of our WordPress/MySQL deployments to be restarted after a simulated node/instance failure. To do that, we will, again, cordon one of our nodes and delete the pods running on that node:
oc get nodes | grep compute
NAME                                          STATUS    ROLES     AGE       VERSION
ip-172-16-26-120.us-west-2.compute.internal   Ready     compute   28d       v1.11.0+d4cacc0
ip-172-16-27-161.us-west-2.compute.internal   Ready     compute   28d       v1.11.0+d4cacc0
ip-172-16-39-190.us-west-2.compute.internal   Ready     compute   28d       v1.11.0+d4cacc0
ip-172-16-44-7.us-west-2.compute.internal     Ready     compute   28d       v1.11.0+d4cacc0
ip-172-16-53-212.us-west-2.compute.internal   Ready     compute   28d       v1.11.0+d4cacc0
ip-172-16-56-45.us-west-2.compute.internal    Ready     compute   28d       v1.11.0+d4cacc0
Next, let’s find all MySQL pods on one of the compute nodes listed above (there should be 10 pods on each compute node):
oc adm manage-node ip-172-16-26-120.us-west-2.compute.internal --list-pods | grep -i mysql
Listing matched pods on node: ip-172-16-26-120.us-west-2.compute.internal
wp-1    mysql-ocs-0   1/1       Running   0         13m
wp-13   mysql-ocs-0   1/1       Running   0         12m
wp-20   mysql-ocs-0   1/1       Running   0         12m
wp-22   mysql-ocs-0   1/1       Running   0         12m
wp-27   mysql-ocs-0   1/1       Running   0         12m
...omitted
So these are the pods running on the node we will cordon. Similar to the method we used in Part 2, we’ve set up a monitoring routine continuously retrieving the HTTP status for the start page of the WordPress site. Now we cordon the node ip-172-16-26-120.us-west-2.compute.internal and then delete all MySQL pods on it, using the following script:
#!/bin/bash
TARGET=$1

# get the namespaces in which the mysql pods live
NAMESPACES=$(oc adm manage-node "$TARGET" --list-pods 2>&1 | grep -i mysql | awk '{print $1}')

# cordon the node
echo "Cordoning $TARGET"
oc adm cordon "$TARGET"

# force delete the pods to simulate a node failure
echo "Deleting mysql pods on $TARGET"
for NAME in $NAMESPACES
do
  oc delete pod mysql-ocs-0 -n "$NAME" --force --grace-period=0
done
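The monitoring routine mentioned before the cordon script is not shown in this post. As a rough idea of what such a routine could look like, here is a minimal sketch built on curl. The function names, the log format, and the rule that only a 200 response counts as "up" are all assumptions of this sketch, not the routine used for the tests.

```shell
#!/bin/bash
# Hypothetical monitoring helpers; names and log format are assumptions.

# Print the HTTP status code for a URL (curl reports 000 if no response arrives).
http_status() {
  curl -s -o /dev/null -m 5 -w '%{http_code}' "$1" || true
}

# Map a status code to UP or DOWN; only 200 counts as UP (assumption).
state_of() {
  if [ "$1" = "200" ]; then
    echo UP
  else
    echo DOWN
  fi
}

# monitor LOGFILE URL [URL...]
# Appends "<epoch seconds> <url> <status> <UP|DOWN>" to LOGFILE once per
# second for each URL. Runs until interrupted with Ctrl-C.
monitor() {
  local log=$1 url status
  shift
  while true; do
    for url in "$@"; do
      status=$(http_status "$url")
      echo "$(date +%s) $url $status $(state_of "$status")" >> "$log"
    done
    sleep 1
  done
}
```

It could be started in a second terminal before the node is cordoned, for example with `monitor /tmp/wp-monitor.log` followed by the route URLs of the 10 WordPress instances on the target node (the log path is illustrative).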
In the tests we performed, we used 60 WordPress/MySQL instances with a distribution of 10 per compute node. Our monitoring script gave us the time between the first failure noticed on any of those 10 WordPress instances and the last one. We ran 5 identical tests and averaged the time it took all 10 instances to become fully functional again. That average time was 20 seconds.
In other words, the time from the first pod failure to the last pod recovery was as short as 20 seconds. This is not the time one MySQL pod takes to restart but rather the total recovery time for all the failed pods. After this time, all the MySQL pods using GlusterFS storage are back up and running, and the start pages of the WordPress sites return a successful HTTP status again.
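The measurement described above (the time between the first and the last observed failure across all instances) can be derived from a monitoring log with a short awk program. This is a minimal sketch that assumes one log line per check in the form `<epoch seconds> <url> <HTTP status>`; `outage_window` is a hypothetical name, and counting every status other than 200 as a failure is an assumption.

```shell
#!/bin/bash
# outage_window LOGFILE
# Prints the number of seconds between the first and the last failed check
# in a log with lines of the form "<epoch seconds> <url> <HTTP status>".
# Any status other than 200 counts as a failure (assumption); prints 0 if
# the log contains no failures.
outage_window() {
  awk '
    $3 != 200 {
      if (!seen) { first = $1; seen = 1 }
      last = $1
    }
    END { print (seen ? last - first : 0) }
  ' "$1"
}
```

Because the result ends at the last failed check rather than the first successful one, it underestimates the real recovery time by up to one polling interval.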
Note: For a higher number of MySQL pods, it may be necessary to increase fs.aio-max-nr on the compute nodes. Details about the reason and the solution can be found here.
Failure scenario 2: WordPress/MySQL backed by Amazon’s EBS volumes
The next step is to redo the tests on a different storage backend. We’ve chosen Amazon EBS storage. This type of storage comes with a few limitations compared to the Gluster-based backend we used earlier:
- First and most important to our tests is that EBS volumes cannot migrate between AZs. For our testing, that means pods can only migrate between two nodes, as we only have two OCP nodes per AZ.
- Furthermore, we can only attach up to 40 EBS volumes to one node because of Linux-specific volume limits. Beyond this limit, it may not work as expected (AWS Instance Volume Limits).
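Taken together, the two limitations above mean that after a node failure the surviving node in the same AZ has to attach the volumes of both nodes. The following sketch checks that against the 40-volume limit. The per-node figures come from the text, but the assumption that each node still carries its 4 base volumes (3x100GB plus one 1TB) in the EBS test is ours.

```shell
#!/bin/bash
# Figures from the text; the failover pattern is an assumption of this sketch.
VOLUME_LIMIT=40       # attachable EBS volumes per node quoted above
NODE_OWN_VOLUMES=4    # 3x100GB + one 1TB volume per node (assumed still attached)
MYSQL_PER_NODE=10     # MySQL pods per node, one EBS-backed PVC each
NODES_PER_AZ=2

# If one node in an AZ fails, its pods can only land on the other node in that AZ.
AFTER_FAILOVER=$(( NODE_OWN_VOLUMES + MYSQL_PER_NODE * NODES_PER_AZ ))

echo "Volumes on the surviving node after failover: ${AFTER_FAILOVER} of ${VOLUME_LIMIT}"
if (( AFTER_FAILOVER > VOLUME_LIMIT )); then
  echo "Over the limit"
else
  echo "Within the limit"
fi
```

Raising `MYSQL_PER_NODE` in this sketch shows how quickly a denser EBS-only layout would run into the limit.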
To keep the setup as comparable as possible to the OCS-backed test in Scenario 1, we decided to stick to 10 WordPress/MySQL instances per node. Additionally, our testing followed the same steps as in Scenario 1:
- Set up monitoring for the WordPress instances.
- Cordon the node that runs the pods.
- Delete the MySQL pods on the node.
- Record the time it takes for all WordPress instances to be functional again.
- Un-cordon the node.
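The steps above can be strung together in one script. This is a minimal sketch rather than the script used for the tests: `run`, `failover_test`, `DRY_RUN`, and `WAIT_SECONDS` are hypothetical names, and the default wait of 600 seconds is an assumption. With `DRY_RUN=1` (the default here) the commands are only printed, so the sequence can be reviewed before it touches a cluster.

```shell
#!/bin/bash
# With DRY_RUN=1 the commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}
# How long to leave the node cordoned while recovery is recorded (assumption).
WAIT_SECONDS=${WAIT_SECONDS:-600}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# failover_test NODE NAMESPACE [NAMESPACE...]
# NAMESPACE arguments are the projects whose MySQL pod runs on NODE.
failover_test() {
  local target=$1 ns
  shift
  # Step 1 (monitoring) is expected to be running already.
  run oc adm cordon "$target"                                        # step 2
  for ns in "$@"; do
    run oc delete pod mysql-ocs-0 -n "$ns" --force --grace-period=0  # step 3
  done
  run sleep "$WAIT_SECONDS"                                          # step 4: recovery is recorded by the monitor
  run oc adm uncordon "$target"                                      # step 5
}
```

A dry run for one node and two projects would be `failover_test ip-172-16-26-120.us-west-2.compute.internal wp-1 wp-13`; setting `DRY_RUN=0` executes the same sequence for real.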
Re-instantiating all 10 MySQL pods took about 20 seconds on OCS. Across the 5 identical tests we ran on EBS storage, the same recovery took an average of 378 seconds for 10 similar pods.
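To put the two averages in proportion, the ratio can be computed directly from the numbers reported above. This is a small sketch; the variable names are illustrative.

```shell
#!/bin/bash
# Average recovery times reported in the text, in seconds.
OCS_SECONDS=20
EBS_SECONDS=378

# bash arithmetic is integer-only, so awk does the division.
FACTOR=$(awk -v ebs="$EBS_SECONDS" -v ocs="$OCS_SECONDS" 'BEGIN { printf "%.1f", ebs / ocs }')
echo "EBS recovery took ${FACTOR}x as long as OCS"
```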
Conclusion
The tests in this post show a few things:
- OpenShift Container Storage can provide faster failover and recovery times compared to native EBS storage in case of a node failure, which can result in higher availability for your application.
Additionally, if we configured only one node per AZ (which is quite common), there would be an outage with the EBS-only setup, which is not the case with OpenShift Container Storage because the latter can provide high availability across AZs. It’s important to note that OpenShift Container Storage is also backed by EBS volumes, but the abstraction layer that GlusterFS introduces can reduce the time for re-attaching the volume inside the MySQL pod following a failure.
Storage backend | Time to failover 10 MySQL pods |
---|---|
OpenShift Container Storage | 20 seconds |
EBS volumes | 378 seconds |
- OpenShift Container Storage spans the storage availability over different AZs and helps increase the reliability of the OpenShift cluster.
- The usage of OpenShift Container Storage makes the instance volume limit for EBS less problematic, as a lower number of larger volumes can be used to host the required persistent volumes. Again, this is a benefit of the GlusterFS abstraction layer introduced through deploying OpenShift Container Storage.
The next blog post in this series will be about using the SysBench 0.5 database testing tool to measure MySQL read/write performance on OCS. Since real tuning is scale driven, that post will feature many (60) small MySQL databases (10GB), and the results will be published for RWX (GlusterFS volumes) and RWO (GlusterBlock volumes). The failure scenario in this blog post will also be repeated, but this time with a SysBench read/write load.