SQL Server on Linux – How I think they did it!
OK, so everyone wants to know how Microsoft did it…how they got SQL Server running on Linux. In this article, I’m going to try to figure out how.
Update: Since the publication of this post, Microsoft has published a blog post detailing the implementation here.
There’s a couple of approaches they could take…a direct port or some abstraction layer…A direct port would have been hard, basically any OS interaction would have had to been looked at and that would have been time consuming and risk prone. Who comes along to save the day? Abstraction. The word you hear about a million times when you take Operating Systems classes in undergrad and grad computer science courses. :)
Well things are finally starting to come to light on how it was done. I had a Twitter conversation this weekend with Slava Oks, who is a leader on the project team and several other very active people in the SQL Community Klaus Aschenbrenner, Ewald Cress, and Lonny Niederstadt. This got my gears turning…to find out…how they did it!
What do we know so far?
So here’s what we know, there’s some level of abstraction going on using a layer called SQL Platform Abstraction Layer (SQLPAL) and also some directly ported code via SQLOSv2. From a design standpoint this it a pretty good compromise. Check out Figure 1, here you can see SQLPAL sits between the Database Engine and the underlying operating system. Whichever one it may be, Windows, Linux and oh yeah “other OS in Future” :)
Figure 1 – SQL Server on Linux – source @SQLRockstar
Background information
So to understand how we got here, it’s worth looking at the Drawbridge project from Microsoft Research. Drawbridge is basically application, or more specifically, process virtualization with a contained OS inside that process space. This is called a picoprocess. Since the process is abstracted away from the underlying operating system, the process will need some part of an OS inside its address space. This is called the Library OS. With that abstracted away…each process has a consistent view of it’s own operating environment. In figure 2, you can see the Library OS and it’s roots into ntoskrnl.dll, which is an NT user-mode kernel. This provides a consistent OS interface for the application. Essentially program code doesn’t need to change.
Now it’s up to the picoprocess as a whole to provide some abstraction back to the actual operating system and that’s where the Platform Abstraction Layer (PAL) comes in. All that’s left is to provide an application binary interface for the picoprocess and you have a completely self-contained process without the need to interact directly the host operating system. This is amazing stuff!
Figure 2 – Drawbridge Architecture – Source MS Research
SQLPAL – SQL Server Platform Abstraction Layer
So, I wanted to see this in action. In the Windows world, hard core SQL people are familiar with attaching a debugger to a SQL process and loading debug symbols to get a view into what’s going on inside of SQL Server. Well in Linux, we can do the same, and it’s a LOT easier. On Linux, there’s a tool called strace, which will give you a view into your programs execution and any interactions it has with the OS. So I launched SQL Server and strace and here’s what I found.
So to launch strace and SQL Server, we add the SQL Server binary as a parameter to strace. Caution, do not do this as root as it may cause a permission issue with log files generated by the sqlservr process. Use sudo to change to the msssql user.
[mssql@rhel1 ~]$ strace /opt/mssql/bin/sqlservr
execve(“/opt/mssql/bin/sqlservr”, [“/opt/mssql/bin/sqlservr”], [/* 24 vars */]) = 0
open(“/lib64/libstdc++.so.6”, O_RDONLY|O_CLOEXEC) = 3
open(“/opt/mssql/bin/../lib/libc++abi.so.1”, O_RDONLY|O_CLOEXEC) = 3
<div>
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>
open(“/opt/mssql/lib/system.sfp”, O_RDONLY) = 3
<div>
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>
<div>
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">During the load of system.sfp, we see several libraries, registry and DLL loads that look like they’re responsible for setting up the kernel level abstraction.</span>
</div>
<div>
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>
<div>
<p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">pread(3, “Win8.dbmanifest\0”, 16, 4704) = 16</span>
</p>
<div>
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>
<div>
<span style="font-size: 12px;">Reading in the registry? Man that’s never going away :)</span>
</div>
<div>
</div>
<div>
<p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">pread(3, “windows.hiv\0”, 12, 4753) = 12</span>
</p>
</div>
<div>
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>
<div>
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">Reading in the NtOsKrn.dll, the NT user-mode kernel</span>
</div>
<div>
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>
<div>
<p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">pread(3, “NtOsKrnl.dll\0”, 13, 5123) = 13</span>
</p>
<div>
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>
</div>
</div>
open(“/opt/mssql/lib/system.common.sfp”, O_RDONLY) = 4
pread(4, “kerberos.dll\0”, 13, 15055) = 13
open(“/opt/mssql/lib/sqlservr.sfp”, O_RDONLY) = 7
…omitted
pread(7, “sqlservr.exe\0”, 13, 13398) = 13
…omitted
pread(7, “SqlDK.dll\0”, 10, 14079) = 10
…omitted
pread(7, “sqllang.dll\0″, 12, 14382) = 12
…omitted
pread(7, “SQLOS.dll\0”, 10, 14418) = 10
…omitted
pread(7, “sqlmin.dll\0”, 11, 14511) = 11
nanosleep({999999999, 0}, 2016-11-17 14:11:37.53 Server Microsoft SQL Server vNext (CTP1) – 14.0.1.246 (X64)
<p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">Nov 1 2016 23:24:39 </span>
</p>
<p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">Copyright (c) Microsoft Corporation</span>
</p>
<p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">on Linux (Red Hat Enterprise Linux)</span>
</p>
<div>
<span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>
2016-11-17 14:11:37.53 Server Logging SQL Server messages in file ‘C:\var\opt\mssql\log\errorlog’.
<h2>
<span style="font-size: 12px;"><strong>SQLOSv2</strong></span>
</h2>
So in that Twitter conversation I had with Slava and others, we learned it’s not straight PAL, but a SQL Server specific PAL. This allows the product team to provide another path to the underlying OS for performance sensitive code. Look back at figure 1, you’ll see two paths from SQL Sever into SQLPAL. One uses the Win32 APIs, likely provided by Drawbridge (or some variant), and the other is perhaps natively compiled code…really that’s just a guess on my part.
Final thoughts
All, this is a pretty awesome time we’re getting into…Microsoft embracing Linux, SQL on Linux, PowerShell on Linux. I’ve said this many times…Windows, Linux…it’s just an OS. I would like to thank Slava for his insight and also the product team for a fantastic preview release. It’s amazing how seamless this really is.
In a sidebar conversation with Ewald, he made the point that as SQL Server professionals that our investment in the understanding of SQL Server’s internals will persist with this implementation. Which I think is a huge relief for those that have invested years into understanding it’s internals!
Please leave some comments on what your thoughts are on how this works. If you want to contact me directly, you can reach me at aen@centinosystems.com or @nocentino
Disclaimer
Well, if you made it this far…awesome! I want you to know, I don’t have any inside knowledge of how this was developed. I basically sat down and traced the code with the techniques I showed here.
References
https://www.microsoft.com/en-us/research/project/drawbridge/
https://blogs.msdn.microsoft.com/wsl/2016/05/23/pico-process-overview/