Julien Marodon

Développeur indépendant

Spécialiste Programmation système, Bases de données et Embarqué

img/contenu_deco1.png

Deauville - Trouville - Caen - Bayonne

pgcputrack: an Open-source PostgreSQL CPU tracking tool

License: GPLv2, available here on GitHub

What:

A tool for detailed tracking and logging of PostgreSQL on Linux CPU use by origin, database and user.

installation:
Compile on a machine with same Debian version or distribution version and CPU type (instructions shown here for Debian).
Compilation has been tested on Wheezy and Jessie with both g++ and clang++.

user@devel:~$ sudo apt-get install build-essential libprocps-dev
user@devel:~$ git clone http://github.com/JMD-tech/pgcputrack
user@devel:~$ cd pgcputrack
user@devel:~/pgcputrack$ make
g++ -std=c++11 -c -o pgcputrack.o pgcputrack.cpp
cc -lprocps -lstdc++ -lrt -g pgcputrack.o -o pgcputrack
g++ -std=c++11 -c -o showpgcpuse.o showpgcpuse.cpp
cc -lprocps -lstdc++ -lrt -g showpgcpuse.o -o showpgcpuse

Then copy the binaries to the server:
note: runtime version of libprocps is already installed by default for ps/top etc. TODO: Debian package

user@devel:~/pgcputrack$ scp pgcputrack showpgcpuse root@db-server:/root/


usage:
First use the pgcputrack tool to record PostgreSQL CPU use:

root@db-server:~# ./pgcputrack > pgcpustats.txt

Use CTRL+C to stop recording.
Or use nohup+kill to capture without keeping console open:

root@db-server:~# nohup ./pgcputrack > pgcpustats.txt &
root@db-server:~# killall pgcputrack

TODO: add daemon mode

Then use the showpgcpuse tool to display statistics:

root@db-server:~# ./showpgcpuse pgcpustats.txt
CPU BY DB:
1869                  13320 ms  4%
2325                  13330 ms  4%
2809                  20620 ms  7%
7148                 149455 ms 54%
8016                  11310 ms  4%
collector              3805 ms  1%
global                30290 ms 10%
launcher                785 ms  0%
worker                 2825 ms  1%
writer                   65 ms  0%

CPU BY USER:
apiv1                 95875 ms 34%
apiv2                144985 ms 52%
autovacuum             3610 ms  1%
dataupdater           27035 ms  9%
stats                  3805 ms  1%
wal                      65 ms  0%

CPU BY ORIGIN:
10.45.8.79            10070 ms  3%
127.0.0.1              6965 ms  2%
172.19.203.101       134300 ms 48%
172.19.203.102       106165 ms 38%
[local]                 395 ms  0%
process                7480 ms  2%

Total CPU USED: 275.385 s (avg 59.1% during 465.504 s)

note: The totals doesn't add-up as for readability i've removed most of the less significant data (30 subsidiaries) and also changed users, IP and DB names. Doesn't add-up with actual data anyway (about 85%), the rest being connection overhead and very short queries that fall under the polling interval. But still shows which subsidiary had a performance problem.
TODO: make some Lazarus frontend to display nice shiny graphics charts... useless but more marketing value.

Why:

While faced with increasing load on 2 database servers at one of our main customers, we needed to identify where was the most CPU used.
As PostgreSQL forks a client handler per connection, and sets its process title to include database and user, it would not be too difficult to track this by process information tools and APIs, right?
Obviously top and htop only show an instant view, and with about 100+ concurrent connections it isn't practical at all.
After a quick tour of existing PostgreSQL monitoring solutions, none was providing this accurate and persistent picture of DB CPU load.
Also as it was to run on production servers that experienced high load, it was mandatory that it was an external tool with no interaction on the database server process, ie not requiring an active DB connection nor injection of some debugging library or additionnal module into postgres process.
So i decided it was time to write the highly needed but not yet existing tool :)

How:

So the idea was to write a small tool that does exactly 3 things:

This sounds simple, however i stumbled upon many annoyances, each was addressable with various workarounds or switching of API/libs but the process was less straightforward that i first thought.

While searching for process operations notifications, i quickly came across an excellent piece of code by BewareOfGeek that shows an example of netlink sockets for process notification events.
A sample code self contained in a single c file that compiles out of the box on default options, produces a minimalist tool that can already be useful on its own, and as the code is very neat and readable it makes a perfect starting block to build upon for everything process-tracking based.
(another unrelated idea came to mind for a reliable and tamper-proof way to keep track of all commands issued by sudoers even when they would sudo su out of lazy habit or maliciously spawn a shell from some userland app, maybe i'll code this one too if i have the need)

This was easy, straightforward and relaxing.
Handle fork and identify postgres forked processes, handle exit and get cmdline, stime+utime

Next, on to extracting process "title" and resource usage. getrusage sounded good, if only it could retrieve information on any process instead of caller and its child processes.
Some searches and digging into ps source code directed me at libprocps.
Had some doubt on the ability to extract process information on exit, as we are notified by a socket, so the process information we need might as well have been cleaned by the kernel by the time we get the event.
Working around this would simply require storing information we got at start in an associative array by pid and use those data when notified of process exit. Oh, except that stime+utime actually change during process life, so we would need periodic updates too... sounds worse both for implementation cleanliness and CPU overhead.

Decided to first ignore this concern and implement reading data at exit anyway, and see how that'll do.
Worked half-good:
utime and stime data were present, and some tests with a dummy-controlled-load process showed those values to be consistent and accurate. good.
The process name was there in proc_t.cmd but with only 'postgres' name, proc_t.cmdline with full process title was definately empty at exit.
And trying to get it at fork time... obviously at that time the cmdline is simply 'postgres' because the just forked child process has not yet changed its title to include origin, database and user name.
2 options left:

Finally tested various polling interval, and settled to 10ms as it used about 2% CPU use on peak-load while achieving maximal accuracy (utime and stime provided by libprocps are in 0.01s units)