Using strace inside a SQL Server Container

So, if you’ve been following my blog, you know my love for internals. Well, I needed to find out exactly how something worked at the startup of a SQL Server process running inside a Docker container, and my primary tool for this is strace. Well, how do you run strace against processes running in a container? I hadn’t done this before and needed to figure it out…so let’s go through how I pulled this off.

The First (not so successful) Attempt

My initial attempt involved creating a second container image with strace installed and then starting that container in the same PID namespace as the SQL Server container. The benefit here is that I don’t need to do anything special to the SQL Server container…I can use an unmodified SQL Server image and create a separate container for running strace.

Create a dockerfile for a container and install strace inside the container

FROM ubuntu:16.04

RUN export DEBIAN_FRONTEND=noninteractive && \
    apt-get update && \
    apt-get install -yq curl gnupg apt-transport-https && \
    apt-get install -y strace && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists

CMD /bin/bash

Then build the container with docker build -t strace .

docker build -t strace .
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM ubuntu:16.04
 ---> a3551444fc85
Step 2/3 : RUN export DEBIAN_FRONTEND=noninteractive &&     apt-get update &&     apt-get install -yq curl gnupg apt-transport-https &&     apt-get install -y strace &&     apt-get clean &&     rm -rf /var/lib/apt/lists
 ---> Running in 2832df1c4921
Get:1 http://security.ubuntu.com/ubuntu xenial-security InRelease [109 kB]
Get:2 http://archive.ubuntu.com/ubuntu xenial InRelease [247 kB]
...output omitted...
Fetched 179 kB in 0s (218 kB/s)
Selecting previously unselected package strace.
(Reading database ... 5300 files and directories currently installed.)
Preparing to unpack .../strace_4.11-1ubuntu3_amd64.deb ...
Unpacking strace (4.11-1ubuntu3) ...
Setting up strace (4.11-1ubuntu3) ...
Removing intermediate container 2832df1c4921
 ---> 686bc74ddd24
Step 3/3 : CMD /bin/bash
 ---> Running in 1b1ca2bb04d7
Removing intermediate container 1b1ca2bb04d7
 ---> d89cfe1231c1
Successfully built d89cfe1231c1
Successfully tagged strace:latest

With the container built let’s use it to run strace against our SQL Server process running in another container. 

Startup a container running SQL Server

docker run \
    --name 'sql19' \
    -e 'ACCEPT_EULA=Y' -e "MSSQL_SA_PASSWORD=$PASSWORD" \
    -p 1434:1433 \
    -d mcr.microsoft.com/mssql/server:2019-latest

Then start up our strace container and attach it to the PID namespace of the sql19 container. 

docker run -it \
    --cap-add=SYS_PTRACE \
    --pid=container:sql19 strace /bin/bash -c '/usr/bin/strace -f -p 1' 

A lot is going on in this command so let’s expand out each of the parameters

  • -it – this runs the container interactively with a terminal attached, so its standard output shows up in our current shell. Basically, we’ll see the output of strace on our active console and can redirect it to a file if needed.
  • --cap-add=SYS_PTRACE – this adds the SYS_PTRACE capability to the container. This gives ptrace (the system call behind strace) the ability to attach to a process. If this is not specified, you will get an error saying ‘Operation not permitted’.
  • --pid=container:sql19 – specifies the container whose PID namespace we want to join. This will start up our strace container in the same PID namespace as the sql19 container. With this, there is one process namespace shared between the two containers, so they can see each other’s processes, which is what we want. We want the strace process to be able to see the sqlservr process.
  • strace – is the name of the container image we built above.
  • /bin/bash -c '/usr/bin/strace -f -p 1' – this is the command (CMD) we want to run inside the strace container. In this case, we’re starting a bash shell with the parameters to launch strace.
  • strace -f -p 1 – the option -p 1 will attach strace to PID 1, which is sqlservr, and the -f option will follow any processes forked from the traced process.

When we run this docker command, we get this output:

docker run -it    --cap-add=SYS_PTRACE    --pid=container:sql19 strace /bin/bash -c '/usr/bin/strace -f -p 1'
/usr/bin/strace: Process 1 attached with 2 threads
[pid     9] ppoll([{fd=14, events=POLLIN}], 1, NULL, NULL, 8
[pid     1] wait4(10, 
 
We’re attaching to an already running Docker container running SQL Server, so what we get is an idle SQL Server process. This is great if we have a running workload we want to analyze, but my goal for all of this is to see how SQL Server starts up, and this isn’t going to cut it.
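For the running-workload case, it can be handy to write the trace to a file instead of the console. Here’s a minimal sketch of that, assuming the strace image and the sql19 container from above. The function name trace_to_file, the host directory ./trace, and the /trace mount point are hypothetical names picked for illustration. It leans on strace’s -o option, which writes the trace to the named file, and on pgrep -P 1 to pick up the child of PID 1.

```shell
#!/usr/bin/env bash
# Sketch: attach strace to a running SQL Server container and save the trace
# to a file on the host. trace_to_file, ./trace and /trace are illustrative
# names, not something defined by Docker or strace.
trace_to_file() {
    local target="${1:-sql19}"        # container whose PID namespace we join
    local outdir="${2:-$PWD/trace}"   # host directory that receives the trace
    mkdir -p "$outdir"
    docker run -it --rm \
        --cap-add=SYS_PTRACE \
        --pid="container:${target}" \
        -v "${outdir}:/trace" \
        strace /bin/bash -c '/usr/bin/strace -f -o /trace/sqlservr.strace -p "$(pgrep -P 1)"'
}

# Usage: trace_to_file sql19 "$PWD/trace"    (Ctrl+C detaches strace)
```

The bind mount is just one way to get the file out of the container; redirecting the console output on the host would work too.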
 
My next attempt was to stop the sql19 container and quickly start the strace container but the strace container still missed events at the startup of the sql19 container. So I needed a better way.
 
UPDATE: David Barbarin, fellow Data Platform MVP and SQL Server and Container expert, pursued the idea of using a second container and came up with a very elegant solution to this! He is using the sleep command at the launch of the SQL Server container then attaching a second strace container to the PID namespace. Using this technique he’s able to catch the startup events and not have to build a custom SQL Server container…check out the details here! Exactly what I’m looking for!
 
Also, as David points out in his post, PID 1 is the watchdog process. I totally forgot about that in the code above. So when running the code above, swap -p 1 for the actual PID of the sqlservr process that is the child of PID 1. A better way is to use pgrep -P 1 to dynamically get the child process ID of PID 1.
 
So let’s use this technique to connect to the correct PID inside a running SQL Server container. This will attach to the child of PID 1, which will be the base sqlservr process that’s the database engine.
docker run -it \
    --cap-add=SYS_PTRACE \
    --pid=container:sql19 strace /bin/bash -c '/usr/bin/strace -f -p $(pgrep -P 1)'
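To make the sleep idea from the update a bit more concrete, here’s a rough sketch of how it might look. This is only an illustration of the idea, not David’s exact commands: the container name sql19delay, the 15 second delay, and the two function names are assumptions, and the delay just needs to be long enough to get strace attached. The port mapping is left out since it isn’t needed to watch startup.

```shell
#!/usr/bin/env bash
# Sketch of the "sleep first, attach second" idea. The names and the delay
# are illustrative assumptions, not taken from David's post.
start_delayed_sql() {
    local name="${1:-sql19delay}"
    local delay="${2:-15}"
    # bash runs as PID 1, sleeps, then exec replaces it with sqlservr,
    # so a tracer attached to PID 1 during the sleep sees the startup.
    docker run -d \
        --name "$name" \
        -e 'ACCEPT_EULA=Y' -e "MSSQL_SA_PASSWORD=${PASSWORD}" \
        mcr.microsoft.com/mssql/server:2019-latest \
        /bin/bash -c "sleep ${delay}; exec /opt/mssql/bin/sqlservr"
}

attach_strace() {
    local name="${1:-sql19delay}"
    # Join the PID namespace of the sleeping container and follow PID 1.
    docker run -it --rm \
        --cap-add=SYS_PTRACE \
        --pid="container:${name}" \
        strace /bin/bash -c '/usr/bin/strace -f -p 1'
}

# Usage: start_delayed_sql sql19delay 15 && attach_strace sql19delay
```

Here -p 1 is intentional: during the sleep, PID 1 is the shell that is about to become SQL Server, so strace follows it through the exec and into any processes it forks.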
 
The Second (and more successful) Attempt

I want to attach strace to the SQL Server process at startup, and the way that I can achieve that is by creating a custom container with both SQL Server and strace installed, then starting that container and telling strace to start up the SQL Server process.

So let’s start by creating our custom SQL Server container with strace installed. Here’s the dockerfile for that

FROM ubuntu:16.04

RUN export DEBIAN_FRONTEND=noninteractive && \
    apt-get update && \
    apt-get install -yq curl gnupg apt-transport-https && \
    curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - && \
    curl https://packages.microsoft.com/config/ubuntu/16.04/mssql-server-preview.list | tee /etc/apt/sources.list.d/mssql-server.list && \
    apt-get update && \
    apt-get install -y mssql-server && \
    apt-get install -y strace && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists

CMD /opt/mssql/bin/sqlservr

This is pretty standard for creating a SQL Server container. The key difference here is that we’re installing the strace package in addition to the mssql-server package. Good news is, we can leave the CMD of the container as sqlservr…which means we can use this as a general purpose database container as well as for strace use cases. We’re going to use another technique to override that CMD when we start the container so that it will start a strace’d sqlservr process for us.

Let’s go ahead and build that container with docker build -t sqlstrace .

Sending build context to Docker daemon  127.1MB
Step 1/3 : FROM ubuntu:16.04
 ---> a3551444fc85
Step 2/3 : RUN export DEBIAN_FRONTEND=noninteractive &&     apt-get update &&     apt-get install -yq curl gnupg apt-transport-https &&     curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - &&     curl https://packages.microsoft.com/config/ubuntu/16.04/mssql-server-preview.list | tee /etc/apt/sources.list.d/mssql-server.list &&     apt-get update &&     apt-get install -y mssql-server &&     apt-get install -y strace &&     apt-get clean &&     rm -rf /var/lib/apt/lists
 ---> Running in 806a3b4b9345
Get:1 http://security.ubuntu.com/ubuntu xenial-security InRelease [109 kB]
...output omitted...
Setting up mssql-server (15.0.1900.25-1) …
...output omitted...
 ---> 42a1ca28ae72
Step 3/3 : CMD /opt/mssql/bin/sqlservr
 ---> Running in 1e57d6759df6
Removing intermediate container 1e57d6759df6
 ---> 6e3f5e82a177
Successfully built 6e3f5e82a177
Successfully tagged sqlstrace:latest

Once that container is built we can override the CMD that’s used to start the container defined in the dockerfile with another executable inside the container…you guessed it, strace.

docker run \
    --name 'sql19strace'  -it  \
    -e 'ACCEPT_EULA=Y' -e 'MSSQL_SA_PASSWORD='$PASSWORD \
    -p 1433:1433 \
     sqlstrace /bin/bash -c "/usr/bin/strace -f /opt/mssql/bin/sqlservr"

The first four lines of the docker run command are standard for starting a SQL Server container. But that last line is a bit different: we’re starting our sqlstrace container. Inside that container we’re starting a bash shell and passing in the command (-c "/usr/bin/strace -f /opt/mssql/bin/sqlservr"), which will start strace, following any forked processes (-f), and then start SQL Server (sqlservr). From there SQL Server will start up and strace will have full visibility into the process execution. The cool thing about this technique is we can adjust our strace parameters as needed at the time we create the container.
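As a sketch of that last point, the docker run command can be wrapped in a small function that passes any extra arguments straight through to strace. The function name run_sqlstrace is hypothetical, and this version hands the strace command to the container directly instead of going through bash -c, which keeps the quoting simple. The -tt (timestamps) and -e trace=file (file-related system calls only) options in the example are standard strace options.

```shell
#!/usr/bin/env bash
# Sketch: start the sqlstrace image with caller-supplied strace options.
# run_sqlstrace is an illustrative name; its arguments go straight to strace.
run_sqlstrace() {
    docker run \
        --name 'sql19strace' -it \
        -e 'ACCEPT_EULA=Y' -e "MSSQL_SA_PASSWORD=${PASSWORD}" \
        -p 1433:1433 \
        sqlstrace /usr/bin/strace -f "$@" /opt/mssql/bin/sqlservr
}

# Examples:
#   run_sqlstrace                      # plain "strace -f", as in the command above
#   run_sqlstrace -tt -e trace=file    # timestamps, file-related syscalls only
```

Called with no arguments, it reduces to the same strace -f invocation used in this post.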

execve("/opt/mssql/bin/sqlservr", ["/opt/mssql/bin/sqlservr"], [/* 9 vars */]) = 0
brk(NULL)                               = 0x55b7bc77c000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
readlink("/proc/self/exe", "/opt/mssql/bin/sqlservr", 4096) = 23
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/opt/mssql/bin/tls/x86_64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/opt/mssql/bin/tls/x86_64", 0x7fffe9bc9510) = -1 ENOENT (No such file or directory)
open("/opt/mssql/bin/tls/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/opt/mssql/bin/tls", 0x7fffe9bc9510) = -1 ENOENT (No such file or directory)
open("/opt/mssql/bin/x86_64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
...output omitted... 

Above is the output of strace on SQL Server starting up. It kicks off with an execve, which is the system call used after a fork to replace the new process’s program image with the new program.
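Once a trace like this has been saved to a file, a quick tally of which system calls show up can help decide where to dig in first. Here’s a minimal sketch, assuming the line format shown above, optionally prefixed with "[pid N]" or a bare PID as strace does when following forks. The helper name count_syscalls is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count system calls by name in saved strace output.
# count_syscalls is an illustrative helper, not part of strace.
count_syscalls() {
    # Strip an optional "[pid N] " or "N " prefix, then take the text before
    # the first "(" as the syscall name. Lines that do not start with a
    # syscall name (for example "<... resumed>") are skipped.
    sed -E 's/^\[pid +[0-9]+\] +//; s/^[0-9]+ +//' "$@" |
        awk '/^[a-z_0-9]+\(/ {
                 name = substr($0, 1, index($0, "(") - 1)
                 n[name]++
             }
             END { for (s in n) print n[s], s }' |
        sort -k1,1nr -k2,2
}

# Usage: count_syscalls sqlservr.strace | head
```

strace has its own summary mode (-c) as well; the point of doing it this way is that it works on a trace that has already been captured.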

Hopefully, this can help you get into those deep dive debugging/troubleshooting/discovery scenarios you may find yourself working through with SQL Server inside a container.