Deploying HA MongoDB on OpenShift using Portworx

February 15, 2019 | Jason Dobies | 9-minute read

This is a guest post written by Gou Rao, CTO and co-founder of Portworx, where he leads the company’s technology, market, and solution execution strategy. Previously, Gou was the CTO of Data Protection at Dell, in charge of technical direction, strategy, and architecture.

Portworx is a cloud-native storage platform for running persistent workloads on a variety of orchestration engines, including Kubernetes and Red Hat OpenShift. With Portworx, customers can manage the database of their choice on any infrastructure using Red Hat OpenShift. It provides a single Kubernetes storage and data management layer for all stateful services, wherever they run, and is optimized for low-latency, high-throughput workloads like Cassandra, Kafka, PostgreSQL, Elasticsearch, and the subject of today’s post, MongoDB.

Portworx recently achieved Red Hat certification for Red Hat OpenShift Container Platform, and PX-Enterprise is available in the Red Hat Container Catalog. Learn more about Portworx & OpenShift in our Product Brief.

This tutorial is a walk-through of the steps involved in deploying and managing a highly available MongoDB database on OpenShift. In this tutorial, we are using a cluster running OKD.

In summary, to run HA MongoDB on OpenShift you need to:

Create an OpenShift cluster running at least three nodes
Install a cloud-native storage solution like Portworx as a daemon set on OpenShift
Create a storage class defining your storage requirements, such as replication factor, snapshot policy, and performance profile
Deploy MongoDB using Kubernetes
Test failover by killing or cordoning a node in your cluster and confirming that data is still accessible
Dynamically resize the MongoDB volume
Take a snapshot and back up MongoDB to object storage

Installing Portworx on OpenShift

Since OpenShift is based on Kubernetes, the steps involved in installing Portworx are not very different from the standard Kubernetes installation. Portworx documentation has a detailed guide with the prerequisites and all the steps to install on OpenShift.

Before proceeding further, ensure that Portworx is up and running on OpenShift.

We can check the status of Portworx from inside one of its pods. The following command captures the name of a Portworx pod:

$ PX_POD=$(oc get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
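Once the pod name is captured, it can be used to run pxctl inside that pod. The following is a minimal sketch rather than the author's exact command: px_status is a hypothetical helper name, and it assumes the pxctl binary sits at /opt/pwx/bin/pxctl inside the Portworx pod, the same path this walkthrough uses on the nodes later on.

```shell
# Hypothetical helper: print Portworx cluster status from inside a Portworx pod.
# Assumption: pxctl is installed at /opt/pwx/bin/pxctl in that pod.
px_status() {
  # Look up the first pod carrying the name=portworx label.
  px_pod=$(oc get pods -l name=portworx -n kube-system \
    -o jsonpath='{.items[0].metadata.name}') || return 1
  # Run "pxctl status" inside that pod.
  oc exec "$px_pod" -n kube-system -- /opt/pwx/bin/pxctl status
}

# Usage (requires a logged-in oc session):
#   px_status
```

Wrapping the two steps in a function keeps the pod lookup and the status call together, so the check can be repeated after each change to the cluster.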

Once the OKD cluster is up and running, and Portworx is installed and configured, we will deploy a highly available MongoDB database.

Creating a storage class for MongoDB

Through storage class objects, an admin can define different classes of Portworx volumes that are offered in a cluster. These classes will be used during the dynamic provisioning of volumes. The storage class defines the replication factor, IO profile (e.g. for a database or a CMS), and priority (e.g. SSD or HDD). These parameters impact the availability and throughput of workloads and can be specified for each volume. This is important because a production database will have different requirements than a development Jenkins cluster.

In this example, the storage class that we deploy has a replication factor of 3, with the I/O profile set to “db” and the priority set to “high.” This means that the storage will be optimized for low-latency database workloads like MongoDB and automatically placed on the highest-performance storage available in the cluster. Notice that we also specify the filesystem, XFS, in the storage class.

$ cat > px-mongo-sc.yaml << EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: px-ha-sc
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "3"
  io_profile: "db"
  io_priority: "high"
  fs: "xfs"
EOF

Create the storage class and verify that it is available.

$ oc create -f px-mongo-sc.yaml
storageclass.storage.k8s.io "px-ha-sc" created

$ oc get sc
NAME                PROVISIONER                     AGE
generic (default)   kubernetes.io/azure-disk        52m
px-ha-sc            kubernetes.io/portworx-volume   13s
stork-snapshot-sc   stork-snapshot                  17m

Creating a MongoDB PVC on OpenShift

We can now create a Persistent Volume Claim (PVC) based on the Storage Class. Thanks to dynamic provisioning, the claims will be created without explicitly provisioning a persistent volume (PV).

$ cat > px-mongo-pvc.yaml << EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: px-mongo-pvc
  annotations:
    volume.beta.kubernetes.io/storage-class: px-ha-sc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF

$ oc create -f px-mongo-pvc.yaml
persistentvolumeclaim "px-mongo-pvc" created

$ oc get pvc
NAME           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
px-mongo-pvc   Bound    pvc-4a43eaca-999f-11e8-9135-000d3a1a1cdf   1Gi        RWO            px-ha-sc       15s
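Dynamic provisioning can take a few seconds, so a script may want to wait until the claim reports Bound before deploying anything that mounts it. A minimal sketch, assuming a logged-in oc session; wait_for_pvc_bound is a hypothetical helper name, and the default of 30 one-second attempts is an arbitrary choice:

```shell
# Hypothetical helper: poll a PVC until its phase is Bound, or give up.
# Arguments: PVC name, optional number of attempts (default 30, one second apart).
wait_for_pvc_bound() {
  pvc_name=$1
  attempts=${2:-30}
  while [ "$attempts" -gt 0 ]; do
    # .status.phase is Pending, Bound, or Lost for a PersistentVolumeClaim.
    phase=$(oc get pvc "$pvc_name" -o jsonpath='{.status.phase}' 2>/dev/null)
    if [ "$phase" = "Bound" ]; then
      echo "$pvc_name is Bound"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  echo "$pvc_name did not reach Bound" >&2
  return 1
}

# Usage:
#   wait_for_pvc_bound px-mongo-pvc
```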

Deploying MongoDB on OpenShift

Finally, let’s create a MongoDB instance as a Kubernetes deployment object. For simplicity’s sake, we will just be deploying a single Mongo pod. Because Portworx provides synchronous replication for High Availability, a single MongoDB instance might be the best deployment option for your MongoDB database. Portworx can also provide backing volumes for multi-node MongoDB replica sets. The choice is yours.

$ cat > px-mongo-app.yaml << EOF
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mongo
spec:
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  replicas: 1
  template:
    metadata:
      labels:
        app: mongo
    spec:
      schedulerName: stork
      containers:
      - name: mongo
        image: mongo
        imagePullPolicy: "Always"
        ports:
        - containerPort: 27017
        volumeMounts:
        - mountPath: /data/db
          name: mongodb
      volumes:
      - name: mongodb
        persistentVolumeClaim:
          claimName: px-mongo-pvc
EOF

$ oc create -f px-mongo-app.yaml

deployment.extensions "mongo" created

The MongoDB deployment defined above is explicitly associated with the PVC px-mongo-pvc, created in the previous step.

This deployment creates a single pod running MongoDB backed by Portworx.

$ oc get pods -l app=mongo
NAME                    READY   STATUS    RESTARTS   AGE
mongo-97b758c4c-d845d   1/1     Running   0          1m

We can inspect the Portworx volume behind the Mongo pod with the pxctl tool that ships with Portworx.

The output of pxctl volume inspect for the PVC’s volume confirms the creation of the volume that backs the MongoDB database instance.
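The inspect command itself is not shown in this post. One way to run it might look like the sketch below, which is an assumption rather than the author's exact invocation: px_inspect_pvc is a hypothetical helper name, and pxctl is assumed to be at /opt/pwx/bin/pxctl inside the Portworx pod.

```shell
# Hypothetical helper: inspect the Portworx volume behind a PVC.
# Assumption: pxctl is installed at /opt/pwx/bin/pxctl inside the Portworx pod.
px_inspect_pvc() {
  pvc_name=$1
  # .spec.volumeName holds the name of the PV bound to the claim.
  vol=$(oc get pvc "$pvc_name" -o jsonpath='{.spec.volumeName}') || return 1
  # Pick the first Portworx pod to run pxctl in.
  px_pod=$(oc get pods -l name=portworx -n kube-system \
    -o jsonpath='{.items[0].metadata.name}') || return 1
  oc exec "$px_pod" -n kube-system -- /opt/pwx/bin/pxctl volume inspect "$vol"
}

# Usage:
#   px_inspect_pvc px-mongo-pvc
```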

Failing over MongoDB pod on OpenShift

Populating sample data

Let’s populate the database with some sample data.

We will first find the pod that’s running MongoDB to access the shell.

$ POD=`oc get pods -l app=mongo | grep Running | grep 1/1 | awk '{print $1}'`
$ oc exec -it $POD -- mongo
MongoDB shell version v4.0.0
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.0
Welcome to the MongoDB shell.
…..

Now that we are inside the shell, we can populate a collection.

db.ships.insert({name:'USS Enterprise-D',operator:'Starfleet',type:'Explorer',class:'Galaxy',crew:750,codes:[10,11,12]})

db.ships.insert({name:'USS Prometheus',operator:'Starfleet',class:'Prometheus',crew:4,codes:[1,14,17]})

db.ships.insert({name:'USS Defiant',operator:'Starfleet',class:'Defiant',crew:50,codes:[10,17,19]})

db.ships.insert({name:'IKS Buruk',operator:' Klingon Empire',class:'Warship',crew:40,codes:[100,110,120]})

db.ships.insert({name:'IKS Somraw',operator:' Klingon Empire',class:'Raptor',crew:50,codes:[101,111,120]})

db.ships.insert({name:'Scimitar',operator:'Romulan Star Empire',type:'Warbird',class:'Warbird',crew:25,codes:[201,211,220]})

db.ships.insert({name:'Narada',operator:'Romulan Star Empire',type:'Warbird',class:'Warbird',crew:65,codes:[251,251,220]})

Let’s run a few queries on the Mongo collection.

Find one arbitrary document:

db.ships.findOne()

{
    "_id" : ObjectId("5b5c16221108c314d4c000cd"),
    "name" : "USS Enterprise-D",
    "operator" : "Starfleet",
    "type" : "Explorer",
    "class" : "Galaxy",
    "crew" : 750,
    "codes" : [
        10,
        11,
        12
    ]
}

Find all documents with pretty-printed output:

db.ships.find().pretty()
…..
{
    "_id" : ObjectId("5b5c16221108c314d4c000d1"),
    "name" : "IKS Somraw",
    "operator" : " Klingon Empire",
    "class" : "Raptor",
    "crew" : 50,
    "codes" : [
        101,
        111,
        120
    ]
}
{
    "_id" : ObjectId("5b5c16221108c314d4c000d2"),
    "name" : "Scimitar",
    "operator" : "Romulan Star Empire",
    "type" : "Warbird",
    "class" : "Warbird",
    "crew" : 25,
    "codes" : [
        201,
        211,
        220
    ]
}
…..

Show only the names of the ships:

db.ships.find({}, {name:true, _id:false})

Find one document by attribute:

db.ships.findOne({'name':'USS Defiant'})

{
    "_id" : ObjectId("5b5c16221108c314d4c000cf"),
    "name" : "USS Defiant",
    "operator" : "Starfleet",
    "class" : "Defiant",
    "crew" : 50,
    "codes" : [
        10,
        17,
        19
    ]
}

Exit from the client shell to return to the host.

Simulating node failure

Now, let’s simulate node failure by cordoning off the OpenShift node on which MongoDB is running.

$ NODE=`oc get pods -l app=mongo -o wide | grep -v NAME | awk '{print $7}'`
$ oc adm cordon ${NODE}

node "mycluster-node-1" cordoned

The above command disabled scheduling on one of the nodes.
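The awk column index used above depends on the layout of the -o wide table, which can differ between oc versions. As an alternative sketch, the node name can be read directly from the pod spec with JSONPath; pod_node is a hypothetical helper name:

```shell
# Hypothetical helper: print the node hosting the first pod that matches a label.
# Argument: a label selector such as app=mongo.
pod_node() {
  # .spec.nodeName is set once the scheduler has placed the pod.
  oc get pods -l "$1" -o jsonpath='{.items[0].spec.nodeName}'
}

# Usage:
#   NODE=$(pod_node app=mongo) && oc adm cordon "$NODE"
```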

$ oc get nodes

NAME                 STATUS                     ROLES     AGE   VERSION
mycluster-infra-0    Ready                      <none>    1h    v1.9.1+a0ce1bc657
mycluster-master-0   Ready                      master    1h    v1.9.1+a0ce1bc657
mycluster-node-0     Ready                      compute   1h    v1.9.1+a0ce1bc657
mycluster-node-1     Ready,SchedulingDisabled   compute   1h    v1.9.1+a0ce1bc657
mycluster-node-2     Ready                      compute   1h    v1.9.1+a0ce1bc657

Now, let’s go ahead and delete the MongoDB pod.

$ POD=`oc get pods -l app=mongo -o wide | grep -v NAME | awk '{print $1}'`

$ oc delete pod ${POD}

pod "mongo-97b758c4c-d845d" deleted

As soon as the pod is deleted, the deployment creates a replacement pod, which is scheduled on a node that holds a replica of the data. STorage ORchestrator for Kubernetes (STORK), a Portworx-contributed open source storage scheduler, co-locates the pod with its data by ensuring that an appropriate node is selected for scheduling.

Let’s verify this by running the command below. We will notice that a new pod has been created and scheduled on a different node.

$ oc get pods -l app=mongo -o wide

NAME                    READY   STATUS    RESTARTS   AGE   IP           NODE
mongo-97b758c4c-sssfg   1/1     Running   0          18s   10.129.0.7   mycluster-node-2

Let’s uncordon the node to bring it back to action.

$ oc adm uncordon ${NODE}

node "mycluster-node-1" uncordoned
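To confirm from a script that the node accepts pods again, the cordon flag on the node object can be checked. A minimal sketch, assuming a logged-in oc session; node_is_schedulable is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if the node accepts new pods.
# .spec.unschedulable is "true" while a node is cordoned and empty otherwise.
node_is_schedulable() {
  flag=$(oc get node "$1" -o jsonpath='{.spec.unschedulable}') || return 1
  [ "$flag" != "true" ]
}

# Usage:
#   node_is_schedulable "$NODE" && echo "$NODE is back in service"
```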

Finally, let’s verify that the data is still available.

Verifying that the data is intact

Let’s find the pod name and run the ‘exec’ command, and then access the Mongo shell.

$ POD=`oc get pods -l app=mongo | grep Running | grep 1/1 | awk '{print $1}'`
$ oc exec -it $POD -- mongo
MongoDB shell version v4.0.0
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.0
Welcome to the MongoDB shell.
…..

We will query the collection to verify that the data is intact.

Find one arbitrary document:

db.ships.findOne()

{
    "_id" : ObjectId("5b5c16221108c314d4c000cd"),
    "name" : "USS Enterprise-D",
    "operator" : "Starfleet",
    "type" : "Explorer",
    "class" : "Galaxy",
    "crew" : 750,
    "codes" : [
        10,
        11,
        12
    ]
}

Find all documents with pretty-printed output:

db.ships.find().pretty()

…..
{
    "_id" : ObjectId("5b5c16221108c314d4c000d1"),
    "name" : "IKS Somraw",
    "operator" : " Klingon Empire",
    "class" : "Raptor",
    "crew" : 50,
    "codes" : [
        101,
        111,
        120
    ]
}
{
    "_id" : ObjectId("5b5c16221108c314d4c000d2"),
    "name" : "Scimitar",
    "operator" : "Romulan Star Empire",
    "type" : "Warbird",
    "class" : "Warbird",
    "crew" : 25,
    "codes" : [
        201,
        211,
        220
    ]
}

…..

Shows only the names of the ships:

db.ships.find({}, {name:true, _id:false})

{ "name" : "USS Enterprise-D" }
{ "name" : "USS Prometheus" }
{ "name" : "USS Defiant" }
{ "name" : "IKS Buruk" }
{ "name" : "IKS Somraw" }
{ "name" : "Scimitar" }
{ "name" : "Narada" }

Finds one document by attribute:

db.ships.findOne({'name':'Narada'})

{
    "_id" : ObjectId("5b5c16221108c314d4c000d3"),
    "name" : "Narada",
    "operator" : "Romulan Star Empire",
    "type" : "Warbird",
    "class" : "Warbird",
    "crew" : 65,
    "codes" : [
        251,
        251,
        220
    ]
}

Observe that the MongoDB collection is still there and all the content is intact! Exit from the client shell to return to the host.
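The same check can be scripted without opening an interactive shell. The sketch below counts the documents in a collection in one shot. It assumes the legacy mongo shell is present in the image, as in the 4.0.0 sessions shown here; mongo_count is a hypothetical helper name.

```shell
# Hypothetical helper: count documents in a collection non-interactively.
# Arguments: pod name, collection name.
# Assumption: the image provides the legacy "mongo" shell binary.
mongo_count() {
  pod=$1
  collection=$2
  # --quiet suppresses the banner; --eval runs one expression and prints its result.
  oc exec "$pod" -- mongo --quiet --eval "db.${collection}.count()"
}

# Usage (seven ships were inserted earlier in this walkthrough):
#   [ "$(mongo_count "$POD" ships)" = "7" ] && echo "all documents present"
```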

Performing Storage Operations on MongoDB

After testing end-to-end failover of the database, let’s perform StorageOps on our OpenShift cluster.

Expanding the OpenShift Volume with no downtime

Currently, the Portworx volume that we created at the beginning is 1 GiB in size. We will now expand it to double the storage capacity.

First, let’s get the volume name and inspect it through the pxctl tool.

If you have access, SSH into one of the nodes and run the following command.

$ POD=`/opt/pwx/bin/pxctl volume list --label pvc=px-mongo-pvc | grep -v ID | awk '{print $1}'`

$ /opt/pwx/bin/pxctl v i $POD

Volume :  270718527425014856
Name              :  pvc-4a43eaca-999f-11e8-9135-000d3a1a1cdf
Size              :  1.0 GiB
Format            :  xfs
HA                :  3
IO Priority       :  LOW
Creation time     :  Aug 6 17:36:46 UTC 2018
Shared            :  no
Status            :  up
State             :  Attached: mycluster-node-2 (10.2.0.4)
Device Path       :  /dev/pxd/pxd270718527425014856
Labels            :  pvc=px-mongo-pvc
Reads             :  130
Reads MS          :  249
Bytes Read        :  2326528
Writes            :  108
Writes MS         :  189
Bytes Written     :  2453504
IOs in progress   :  0
Bytes used        :  10 MiB
Replica sets on nodes:
Set 0
  Node   : 10.2.0.6 (Pool 0)
  Node   : 10.2.0.5 (Pool 0)
  Node   : 10.2.0.4 (Pool 0)
Replication Status  :  Up
Volume consumers  :
- Name           : mongo-97b758c4c-vlhrr (5ff99af2-999f-11e8-9135-000d3a1a1cdf) (Pod)
  Namespace      : default
  Running on     : mycluster-node-2
  Controlled by  : mongo-97b758c4c (ReplicaSet)

Notice the current Portworx volume. It is 1GiB. Let’s expand it to 2GiB.

$ /opt/pwx/bin/pxctl volume update $POD --size=2

Update Volume: Volume update successful for volume 270718527425014856s

Check the new volume size. It is expanded to 2GiB.
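One way to check the size is to rerun the inspect command and pull out the Size field. The sketch below filters inspect output read from standard input. It assumes the output has the same "Field : value" layout as the sample above; px_volume_size is a hypothetical helper name.

```shell
# Hypothetical helper: extract the Size field from "pxctl volume inspect"
# output supplied on stdin. Assumes "Field : value" lines as shown earlier.
px_volume_size() {
  awk -F':' '$1 ~ /^[ \t]*Size[ \t]*$/ {
    # Trim the padding around the value, e.g. "  2.0 GiB" becomes "2.0 GiB".
    sub(/^[ \t]+/, "", $2); sub(/[ \t]+$/, "", $2)
    print $2
    exit
  }'
}

# Usage on a Portworx node, with $POD holding the volume ID as above:
#   /opt/pwx/bin/pxctl volume inspect $POD | px_volume_size
```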

Taking Snapshots of an OpenShift volume and restoring the database

Portworx supports creating snapshots for OpenShift PVCs.

Let’s create a snapshot for the PVC we created for MongoDB.

$ cat > px-mongo-snap.yaml << EOF
apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: px-mongo-snapshot
  namespace: default
spec:
  persistentVolumeClaimName: px-mongo-pvc
EOF

$ oc create -f px-mongo-snap.yaml
volumesnapshot.volumesnapshot.external-storage.k8s.io "px-mongo-snapshot" created

Verify the creation of volume snapshot.

$ oc get volumesnapshot
NAME                AGE
px-mongo-snapshot   7s

$ oc get volumesnapshotdatas
NAME                                                       AGE
k8s-volume-snapshot-0cb1c49f-9325-11e8-bae2-0a580a800005   7s

With the snapshot in place, let’s go ahead and delete the data by dropping the collection.

$ POD=`oc get pods -l app=mongo | grep Running | grep 1/1 | awk '{print $1}'`
$ oc exec -it $POD -- mongo
db.ships.drop()

Since snapshots are just like volumes, we can use one to start a new instance of MongoDB. Let’s create a new instance of MongoDB by restoring the snapshot data.

$ cat > px-mongo-snap-pvc.yaml << EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: px-mongo-snap-clone
  annotations:
    snapshot.alpha.kubernetes.io/snapshot: px-mongo-snapshot
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: stork-snapshot-sc
  resources:
    requests:
      storage: 2Gi
EOF

$ oc create -f px-mongo-snap-pvc.yaml
persistentvolumeclaim "px-mongo-snap-clone" created

From the new PVC, we will create a MongoDB pod.

$ cat > px-mongo-snap-restore.yaml << EOF
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mongo-snap
spec:
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  replicas: 1
  template:
    metadata:
      labels:
        app: mongo-snap
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: px/running
                operator: NotIn
                values:
                - "false"
              - key: px/enabled
                operator: NotIn
                values:
                - "false"
      containers:
      - name: mongo
        image: mongo
        imagePullPolicy: "Always"
        ports:
        - containerPort: 27017
        volumeMounts:
        - mountPath: /data/db
          name: mongodb
      volumes:
      - name: mongodb
        persistentVolumeClaim:
          claimName: px-mongo-snap-clone
EOF

$ oc create -f px-mongo-snap-restore.yaml
deployment.extensions "mongo-snap" created

Verify that the new pod is in the Running state.

$ oc get pods -l app=mongo-snap
NAME                         READY   STATUS    RESTARTS   AGE
mongo-snap-85474d56c-f2ff7   1/1     Running   0          3m

Finally, let’s access the sample data created earlier in the walkthrough.

$ POD=`oc get pods -l app=mongo-snap | grep Running | grep 1/1 | awk '{print $1}'`
$ oc exec -it $POD -- mongo
MongoDB shell version v4.0.0
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.0
Welcome to the MongoDB shell.
…..

db.ships.find({}, {name:true, _id:false})

{ "name" : "USS Enterprise-D" }
{ "name" : "USS Prometheus" }
{ "name" : "USS Defiant" }
{ "name" : "IKS Buruk" }
{ "name" : "IKS Somraw" }
{ "name" : "Scimitar" }
{ "name" : "Narada" }

Notice that the collection is still there with the data intact. We can also push the snapshot to Amazon S3 if we want to create a Disaster Recovery backup in another Amazon region. Portworx snapshots also work with any S3 compatible object storage, so the backup can go to a different cloud or even an on-premises data center.