dbfs – command line access to SQL Server DMVs
With SQL Server on Linux, Microsoft has recognized that they’re opening up their products to a new set of users. People that aren’t used to Windows and it’s tools. In the Linux world we have a set of tools that work with our system performance data and present that to us as text. Specifically, the placeholder for nearly all of the Linux kernel’s performance and configuration data is the /proc virtual file system, procfs. Inside here you can find everything you need that represents the running state of your system. Processes, memory utilization, and disk performance data all of this is presented as files inside of directories inside /proc.
Now, let’s take this idea and extend it to SQL Server. In SQL Server we have DMVs, dynamic management views. These represent to current running state of our SQL Server. SQL Server exposes the data in DMVs as table data that we can query using T-SQL.
So, Microsoft saw the need to bring these two things together, we can expose the internals of SQL Server and its DMVs to the command line via a virtual file system. And that’s exactly what dbfs does, it exposes all of SQL Server’s DMVs as text files in a directory. When you access one of the text files…you’ll execute query against the SQL Server and the query output comes back to you via standard output to you Linux console. From there you can use any of your Linux command line fu…and do what you want with the data returned.
Setting up dbfs
So first, let’s go ahead and set this up. I already have the Microsoft SQL Server repo configured so I can install via yum. If you have SQL on Linux installed, you likely already have this repo too. If not, go ahead and follow the repo setup instructions here. To install dbfs we use yum on RHEL based distributions.
sudo yum install dbfs -y
sqlcmd -H localhost -U sa -p
CREATE LOGIN [dbfs_user] WITH PASSWORD=N'ThisSureIsntSecure', CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF GO
GRANT VIEW SERVER STATE TO [dbfs_user]
mkdir /tmp/dbfs
sudo vi /etc/dbfs.config
[server1] hostname=localhost username=dbfs_user password=ThisSureIsntSecure version=16
Running dbfs
dbfs -c /etc/dbfs.config -m /tmp/dbfs/
Using dbfs
cat dm_exec_connections | column -t
<p class="tab-convert:true lang:batch decode:true ">
And our output looks like this…
</p>
session_id most_recent_session_id connect_time net_transport protocol_type protocol_version endpoint_id encrypt_option auth_scheme node_affinity num_reads num_writes last_read last_write net_packet_size client_net_address client_tcp_port local_net_address local_tcp_port connection_id parent_connection_id most_recent_sql_handle 51 51 Jun 19 2017 09:46:33:660AM TCP TSQL 1946157060 4 FALSE NTLM 0 7 7 Jun 19 2017 09:46:34:103AM Jun 19 2017 09:46:34:107AM 8000 127.0.0.1 37641 127.0.0.1 1433 EDC82F4B-D333-4DCA-88BB-4AB2CE9 02000000a0c09f36765d6d3b8a15a90772d74e103ac8b653000000000000000
cat dm_exec_connections | column -t -s $'\t'
session_id most_recent_session_id connect_time net_transport protocol_type protocol_version endpoint_id encrypt_option auth_scheme node_affinity num_reads num_writes last_read last_write net_packet_size client_net_address client_tcp_port local_net_address local_tcp_port connection_id parent_connection_id most_recent_sql_handle 51 51 Jun 19 2017 09:29:36:340PM TCP TSQL 1895825409 4 FALSE SQL 0 7 7 Jun 19 2017 09:29:36:347PM Jun 19 2017 09:29:36:350PM 4096 127.0.0.1 42784 127.0.0.1 1433 EDFB041F-B319-4098-B4DE-80739A7 01000100f0e88f076013d837050000000000000000000000000000000000000
cat dm_exec_connections | awk '{ print $1,$3,$4,$7 }'
session_id connect_time net_transport endpoint_id 51 Jun 19 TCP
cat dm_exec_connections | awk -F $'\t' '{ print $1,$3,$4,$7 }'
session_id connect_time net_transport endpoint_id 51 Jun 19 2017 02:29:36:340PM TCP 4
cat dm_exec_connections | column -t -s $'\t' -o $'\t' | awk -F $'\t' '{ print $1,$3,$4,$7 }'
session_id connect_time net_transport endpoint_id 51 Jun 19 2017 02:29:36:340PM TCP 4
grep DAC dm_os_schedulers
00000005391c0040 64 1048576 0 VISIBLE ONLINE (DAC) 1 1 1 3 5 2 940180 0 00000005392aa160 000000053906e040 0000000539070040 4000 44 0 0
SQL folks…keep in mind, grep will only output lines matched, so we loose the column headers here since they’re part of the standard output stream when accessing the file/DMV data.
Moving forward with dbfs
We need the ability to execute more complex queries from the command line. Vin Yu mentions this here. As DBAs we already have our scripts that we use day to day to help us access, and more importantly make sense of, the data in the DMVs. So dbfs should allow us to execute those scripts somehow. I’m thinking we can have it read a folder on the local Linux system at runtime, create files for those scripts and throw them in the mounted directory and allow them to be accesses like any of the other DMVs. The other option is we place those scripts as views on the server and access them via dbfs. Pros and cons either way. Since it’s open source…I’m thinking about implementing this myself :)
Next is, somehow we need the ability to maintain column context throughout the output stream, for DBAs it’s going to be tough sell having to deal with that. I know JSON is available, but we’re talking about DBAs and sysadmins here as a target audience.
In closing is a great step forward…giving access into the DMVs from the command line opens up SQL Server to a set of people who are used to accessing performance data this way. Bravo!