
One of Podman’s features is the ability to checkpoint and restore running containers. This means that Podman can "pause" a running container indefinitely and later resume it from where it left off, on the same or another system. In this post, we'll look at how you can do this with a container running Microsoft's SQL Server.

Podman uses CRIU (Checkpoint/Restore In Userspace) to do the actual checkpointing and restoring of the processes inside the container. For this post I am using Podman 1.6.4 on Red Hat Enterprise Linux 8.2 with Microsoft’s SQL Server 2019. The documentation on how to set up SQL Server 2019 using Podman is on Microsoft's site.

To start a container with SQL Server 2019 I am using this command on my RHEL 8 system:

# podman run -e 'ACCEPT_EULA=Y' -e 'MSSQL_SA_PASSWORD=Pass0..0worD' --cap-add cap_net_bind_service -p 1433:1433 -d mcr.microsoft.com/mssql/rhel/server:2019-CU1-rhel-8

If you want your data to persist even after running podman rm, you need to mount a host directory as a data volume into your container, as described in the documentation, using:

-v <host directory>:/var/opt/mssql

To access the database running in the container, I install the sqlcmd tool by fetching Microsoft's repository file with curl and then installing the packages with yum:

# curl https://packages.microsoft.com/config/rhel/7/prod.repo > /etc/yum.repos.d/msprod.repo
# yum -y install mssql-tools unixODBC-devel

Once the container is running, I can connect to the server process using the newly installed sqlcmd:

# /opt/mssql-tools/bin/sqlcmd -S 127.0.0.1 -U SA -P Pass0..0worD
1> SELECT Name from sys.Databases
2> go
Name                                                                                                                         
----------------------------------------------------------------
master 
tempdb
model 
msdb 

(4 rows affected)
1>

To add some data to the database, I created a few small SQL scripts:

# cat create-table.sql
CREATE TABLE things (id INT, name NVARCHAR(50))
GO
# for i in $(seq 1 10000); do echo "insert into things values($i, 'thing $i')"; done > insert.sql ; echo go >> insert.sql

Running those two SQL scripts (create-table.sql, insert.sql) enables me to query the database:

# /opt/mssql-tools/bin/sqlcmd -S 127.0.0.1 -U SA -P Pass0..0worD -i create-table.sql
# /opt/mssql-tools/bin/sqlcmd -S 127.0.0.1 -U SA -P Pass0..0worD -i insert.sql
# /opt/mssql-tools/bin/sqlcmd -S 127.0.0.1 -U SA -P Pass0..0worD
1> select count(*) from things
2> go
        
-----------
   10000

(1 rows affected)
1>

At this point SQL Server 2019 is running and has data to answer our queries. Now I would like to reboot the system to install a new kernel, and Podman’s checkpoint/restore feature can help. First I checkpoint the container, then I reboot the system, and once the system is back up, I restore the container from the checkpoint.

# podman container checkpoint -l --tcp-established
# reboot

Without telling Podman (and thus CRIU) to keep established TCP connections intact while checkpointing (--tcp-established), the checkpointing will fail. The container has now been checkpointed with all its data, and once the system has been rebooted, I can restore the container and query the database again:

# podman container restore -l --tcp-established
# /opt/mssql-tools/bin/sqlcmd -S 127.0.0.1 -U SA -P Pass0..0worD
1> select count(*) from things
2> go
        
-----------
   10000

(1 rows affected)
1>

Restoring the container from the checkpoint after the reboot gives me back the database in the same state it was in before doing the checkpoint. All my data is still there even after having rebooted my system, which means I can reboot into a new kernel without losing the state of my database.
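To make the before/after comparison repeatable, the count query can be wrapped in a small helper. This is only a sketch under a few assumptions: the password is read from a MSSQL_SA_PASSWORD environment variable, the function names are made up for illustration, and sqlcmd is called with -h -1, -W and -Q to get a bare number without headers or padding.

```shell
# Path to sqlcmd; override SQLCMD to point at a different binary.
SQLCMD=${SQLCMD:-/opt/mssql-tools/bin/sqlcmd}

# Print the number of rows in the "things" table as a bare integer.
count_things() {
    # -h -1 suppresses column headers, -W trims padding,
    # -Q runs a single query and exits.
    "$SQLCMD" -S 127.0.0.1 -U SA -P "$MSSQL_SA_PASSWORD" -h -1 -W \
        -Q "SET NOCOUNT ON; SELECT COUNT(*) FROM things" | tr -d '[:space:]'
}

# Compare a count recorded before the checkpoint with the current count.
verify_count() {
    # $1: row count recorded before the checkpoint
    now=$(count_things)
    if [ "$now" = "$1" ]; then
        echo "OK: $now rows"
    else
        echo "MISMATCH: expected $1 rows, found $now" >&2
        return 1
    fi
}
```

Before the checkpoint I would store the output of count_things in a file that survives the reboot, and after the restore pass that number to verify_count.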

To make this work I had to actually change one of CRIU’s configuration files to handle open but deleted files correctly:

# cat /etc/criu/runc.conf
ghost-limit 40000000

In its default configuration, CRIU fails with an error message because it enforces a size limit for this type of file, so this configuration file change is necessary.

The value for ghost-limit does not depend on the database size. To identify the right value for ghost-limit, I ran the command podman container checkpoint -l, which led to the following error message:

# podman container checkpoint -l
ERRO[0000] container is not destroyed                
ERRO[0000] criu failed: type NOTIFY errno 0
log file: /var/lib/containers/storage/overlay-containers/<ID>/userdata/dump.log

Looking at the log file dump.log I saw:

Error (criu/files-reg.c:899): Can't dump ghost file /var/opt/mssql/.system/profiles/Temp/82ff05c0df15b61e96c09a878c06ed07 of 20975616 size, increase limit

Increasing the ghost-limit to 40000000 as mentioned above resolves this error message.
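If dump.log contains several of these errors, the largest reported size is the one that matters. The following is a minimal sketch for pulling that number out of the log; the function name is my own, and it assumes the error lines have the same format as the one shown above.

```shell
# Print the largest ghost-file size (in bytes) reported in a CRIU dump log.
# Prints nothing if the log has no "Can't dump ghost file" errors.
largest_ghost_size() {
    # $1: path to dump.log
    sed -n "s/.*Can't dump ghost file .* of \([0-9][0-9]*\) size.*/\1/p" "$1" |
        sort -n | tail -n 1
}

# Usage: pass the dump.log path printed by Podman, then pick a
# ghost-limit above the reported value:
#   largest_ghost_size /path/to/dump.log
```

For the log line above this reports 20975616, so the 40000000 I configured leaves comfortable headroom.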

Podman also offers the possibility to take multiple checkpoints of a container. Using the option --export Podman can be told to create external checkpoints:

# podman container checkpoint -l --leave-running --tcp-established --export=/tmp/checkpoint1.tar.gz

This tells Podman to create a checkpoint and store all the information about this checkpoint in the file /tmp/checkpoint1.tar.gz while the container keeps on running (--leave-running). By using a different file name with the --export option each time, I can create as many checkpoints as necessary at different points in time. An exported checkpoint can then be transferred to another system, and I can migrate the running database from one system to another using the podman container restore command like this:

# podman container restore --import=/tmp/checkpoint1.tar.gz --tcp-established

If I want to migrate the container to another system and the data is persisted using a host directory as a data volume (-v <host directory>:/var/opt/mssql), I also need to transfer that host directory to the destination host of the migration. If no host directory is mounted as a data volume, the checkpoint archive created with the --export option contains all relevant information and data to migrate the container from one host to another.
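For the case without a mounted data volume, the whole migration could be scripted roughly as follows. This is a sketch, not a tested procedure: the function name, the use of scp and ssh for the transfer, and the arguments are my assumptions, and the destination host needs Podman and CRIU set up in a compatible way.

```shell
# Checkpoint a container into an archive, copy the archive to another
# host, and restore the container there.
migrate_container() {
    # $1: container name or ID
    # $2: destination host (hypothetical, reachable via ssh)
    # $3: archive path, used on both hosts
    podman container checkpoint --tcp-established --export="$3" "$1" &&
        scp "$3" "$2:$3" &&
        ssh "$2" podman container restore --import="$3" --tcp-established
}

# Usage (names are placeholders):
#   migrate_container mssql dest.example.com /tmp/checkpoint1.tar.gz
```

Because --leave-running is not passed here, the container stops on the source host once the checkpoint is written, which avoids two copies of the database running at the same time.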

Podman’s checkpoint and restore features enable me to reboot my system without losing the state of my running database. One of the advantages of this technology is that data which has already been cached in memory stays cached, so queries can be answered just as fast as before the reboot. It also enables me to move my database to another host without losing its state.

