Linux Process Snapper

Linux Process Snapper (pSnapper, psn) is part of my open source 0x.tools suite of tools for Linux performance troubleshooting and always-on profiling of production systems.

pSnapper is a Linux /proc profiler that works by sampling Linux task states and other metrics from /proc/PID/task/TID pseudofiles. pSnapper is designed to be a low-tech passive profiler, it does not attach to your program to slow it down, nor alter your program execution path or signal handling (like strace may inadvertently do).

As pSnapper is just a python script reading /proc files, it does not require software installation, nor install any kernel modules. pSnapper does not even require root access in most cases. The exception is if you want to sample some “private” /proc files (like syscall, and kernel stack) of processes running under other users.

The current release v0.15 is a beta right now. I have many more features to add, some known issues to fix and the output & command line options may change.

You can download it from my GitHub repo (make sure to “*Star” the repo too :-)

Example output

MySQL XFS fsync() metadata syncing bottleneck & inode contention:

$ sudo psn -p "mysqld|kwork" -G syscall,wchan

Linux Process Snapper v0.14 by Tanel Poder [https://0x.tools]
Sampling /proc/syscall, stat, wchan for 5 seconds... finished.


=== Active Threads ========================================================================================

 samples | avg_threads | comm          | state                  | syscall   | wchan                        
-----------------------------------------------------------------------------------------------------------
      25 |        3.12 | (mysqld)      | Disk (Uninterruptible) | fsync     | _xfs_log_force_lsn
      16 |        2.00 | (mysqld)      | Running (ON CPU)       | [running] | 0                            
      14 |        1.75 | (mysqld)      | Disk (Uninterruptible) | pwrite64  | call_rwsem_down_write_failed
       8 |        1.00 | (mysqld)      | Disk (Uninterruptible) | fsync     | submit_bio_wait              
       4 |        0.50 | (mysqld)      | Disk (Uninterruptible) | pread64   | io_schedule                  
       4 |        0.50 | (mysqld)      | Disk (Uninterruptible) | pwrite64  | io_schedule                  
       3 |        0.38 | (mysqld)      | Disk (Uninterruptible) | pread64   | 0                            
       3 |        0.38 | (mysqld)      | Running (ON CPU)       | [running] | io_schedule                  
       3 |        0.38 | (mysqld)      | Running (ON CPU)       | pread64   | 0                            
       2 |        0.25 | (mysqld)      | Disk (Uninterruptible) | [running] | 0                            
       1 |        0.12 | (kworker/*:*) | Running (ON CPU)       | read      | worker_thread                
       1 |        0.12 | (mysqld)      | Disk (Uninterruptible) | fsync     | io_schedule                  
       1 |        0.12 | (mysqld)      | Disk (Uninterruptible) | futex     | call_rwsem_down_write_failed 
       1 |        0.12 | (mysqld)      | Disk (Uninterruptible) | poll      | 0                            
       1 |        0.12 | (mysqld)      | Disk (Uninterruptible) | pwrite64  | _xfs_log_force_lsn           
       1 |        0.12 | (mysqld)      | Running (ON CPU)       | fsync     | submit_bio_wait              
       1 |        0.12 | (mysqld)      | Running (ON CPU)       | futex     | futex_wait_queue_me      

Usage

Here’s the basic syntax:

$ psn -h                              # get help
$ psn --list                          # show all possible reporting columns
$ psn                                 # run pSnapper for 5 seconds, report task comm,state profile (default)
$ psn -d 10                           # same as above, but for 10 seconds
$ psn -p 1234                         # run pSnapper for PID 1234
$ psn -p 1234 -r                      # run pSnapper for PID 1234 and its children
$ psn -p 1234 -a                      # as above, but show all thread states (including idle, interruptible sleep)
$ psn -p "post|kwork"                 # run psnapper on processes  matching regex "post|kwork"
$ sudo psn -g comm,state,syscall      # customize sampled /proc columns (syscall is "private", so need sudo if sampling all procs)
$ psn -g cmdline,state,syscall,wchan  # choose exact sampled /proc columns
$ psn -G syscall,wchan                # append sampled columns in addition to the default "comm,state"
$ psn -G kstack                       # show kernel stack backtraces in addition to the default columns"

Notes

  1. By default, pSnapper shows only threads in state R or D (on CPU & in disk sleep essentially). If you want to see idle threads too, use -a to show all thread states (I, S, etc)
  2. pSnapper version v0.10 silently ignores processes/proc files that it can’t access. So it you see no output or some processes seem to be missing, run psn with default columns (so it doesn’t try to access syscall, stack of other “private” proc files) or run psn with sudo. I’ll improve & add warnings in a future version.

Demo Videos

Here’s a video introducing pSnapper and 0x.tools. If you want to learn more about the /proc filesystem and system calls in general, you can watch the two hacking session videos listed below too.

If you want to read about an example of using /proc filesystem for troubleshooting a Linux process hang, read this blog entry.

Update 2019-04-22: Does my program have an input bottleneck or output bottleneck?

pSnapper 0.14 can also show you the file names of accessed files, so when combined with the syscall sampling, you can easily check whether your program (like tar, dd or some other batch process) is bottlenecked by its input speed, output speed or mostly just CPU usage within the program itself.

Here’s an example of dd reading from a fast disk, writing to the slow disk (most of tie spent in write() syscall to the slow backup device:

$ psn -p 18286 -G syscall,filename

Linux Process Snapper v0.14 by Tanel Poder [https://0x.tools]
Sampling /proc/stat, syscall for 5 seconds... finished.


=== Active Threads ==================================================================================

 samples | avg_threads | comm | state                  | syscall   | filename                        
-----------------------------------------------------------------------------------------------------
      79 |        0.79 | (dd) | Disk (Uninterruptible) | write     | /backup/tanel/test (stdout)
       7 |        0.07 | (dd) | Disk (Uninterruptible) | [running] |                                 
       5 |        0.05 | (dd) | Running (ON CPU)       | write     | /backup/tanel/test (stdout)     
       4 |        0.04 | (dd) | Disk (Uninterruptible) | read      | /reco/fio/mmapfile.0.0 (stdin)  
       3 |        0.03 | (dd) | Running (ON CPU)       | [running] |                                 
       2 |        0.02 | (dd) | Running (ON CPU)       | read      | /reco/fio/mmapfile.0.0 (stdin)  

And in this case the copy flow is reversed, reading from the slow backup device, writing to the fast SSH storage. Most of the time is spent in read() syscalls, reading from the /backup/tanel/test file:

$ psn -p 18859 -G syscall,filename

Linux Process Snapper v0.14 by Tanel Poder [https://0x.tools]
Sampling /proc/stat, syscall for 5 seconds... finished.


=== Active Threads ==============================================================================

 samples | avg_threads | comm | state                  | syscall   | filename                    
-------------------------------------------------------------------------------------------------
      60 |        0.60 | (dd) | Disk (Uninterruptible) | read      | /backup/tanel/test (stdin)
      17 |        0.17 | (dd) | Disk (Uninterruptible) | [running] |                             
      11 |        0.11 | (dd) | Running (ON CPU)       | read      | /backup/tanel/test (stdin)  
       6 |        0.06 | (dd) | Disk (Uninterruptible) | write     | /reco/fio/test (stdout)     
       6 |        0.06 | (dd) | Running (ON CPU)       | [running] |                             

Update 2019-04-07: Unexplained Disk I/O Example

In my lab environment I noticed that some disks were doing pretty heavy I/O and accessing files & directories on the filesystem was pretty slow. That was weird as I knew that no serious workload was running on my machine at that time.

The iostat output is below:

$ iostat -xmd 5.
Linux 4.14.35-1844.3.2.el7uek.x86_64 (linux01.localdomain) 	04/06/2019 	_x86_64_	(32 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme1n1           0.00     0.00    0.20    2.80     0.00     0.02    15.07     0.00    3.47    2.00    3.57   0.00   0.00
nvme0n1           0.00     2.40    4.20   15.60     0.07     0.11    17.79     0.02    3.62    0.86    4.36   0.15   0.30
sdc             773.60   812.00  169.00  128.20    57.11    58.73   798.25     8.78   30.59   40.33   17.75   2.69  80.06
sdd             774.80     0.00  172.60    0.60    59.24     0.00   700.48     1.01    7.46    7.26   66.00   0.68  11.76
sde             775.00   823.40  174.20  130.60    59.76    59.68   802.54     9.74   35.03   46.97   19.11   2.65  80.84
sdb             773.60     0.00  173.60    0.60    59.20     0.00   696.02     0.84    6.30    6.20   35.33   0.61  10.56
md10              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

The sd[b-e] disks had up to 180MB/s read/write I/O each. Interestingly only 2 disks out of 4 had write I/Os against them. These 4 disks actually belong to a software RAID device md10, but that RAID device has absolutely no I/O done against it. How is this possible, where do these extra I/Os come from?

Instead of guessing or hoping to catch some usual suspects, let’s get systematic - pSnapper to the rescue!

Since the sync command was hung and there was I/O going on, I decided to run pSnapper, searching for active tasks with sync or kworker in their name (comm). The kworker threads do all kinds of kernel stuff, including asynchronous I/O. I’m also using sudo here as I want to sample the private /proc/PID/syscall file of root-owned threads. I’m using a new -G command line option (v0.11) to append sampled fields to the existing default ones (comm,state).

$ sudo psn -G syscall,wchan -r -p "sync|kworker"

Linux Process Snapper v0.11 by Tanel Poder [https://0x.tools]
Sampling /proc/stat, syscall, wchan for 5 seconds... finished.


=== Active Threads =====================================================================================

 samples | avg_threads | comm            | state                  | syscall    | wchan                  
--------------------------------------------------------------------------------------------------------
     100 |        1.00 | (sync)          | Disk (Uninterruptible) | sync       | wb_wait_for_completion
      98 |        0.98 | (kworker/u66:0) | Disk (Uninterruptible) | read       | wait_barrier           
      82 |        0.82 | (md10_resync)   | Disk (Uninterruptible) | read       | raise_barrier          
      15 |        0.15 | (md10_resync)   | Disk (Uninterruptible) | read       | md_do_sync             
       3 |        0.03 | (kworker/29:2)  | Disk (Uninterruptible) | read       | rpm_resume             
       3 |        0.03 | (md10_resync)   | Disk (Uninterruptible) | read       | raid10_sync_request    
       2 |        0.02 | (kworker/1:0)   | Disk (Uninterruptible) | read       | hub_event              
       2 |        0.02 | (kworker/29:2)  | Disk (Uninterruptible) | read       | msleep                 
       1 |        0.01 | (kworker/20:1H) | Running (ON CPU)       | read       | worker_thread          
       1 |        0.01 | (kworker/30:0)  | Running (ON CPU)       | [userland] | 0                      
       1 |        0.01 | (kworker/6:0)   | Running (ON CPU)       | [userland] | 0                      
       1 |        0.01 | (kworker/u66:0) | Running (ON CPU)       | [userland] | 0                      
       1 |        0.01 | (kworker/u66:0) | Running (ON CPU)       | read       | wait_barrier           

From the first line of this output table “(sync)” we see that this process is 100% waiting for some wb_wait_for_completion in the kernel while executing the sync syscall. A quick source code search shows that wb_ means WriteBack in the file system layer. But the sync command is not actually doing any I/O directly itself nor using any noticeable CPU. The other tasks that pop up near the top are more interesting though - the md10_resync kernel thread is apparently trying to do disk reads and is waiting for various kernel events like raise_barrier, md_do_sync and raid10_sync_request.

Note that I got lucky here with thread naming, I was looking for my program called sync and the md10_resync also happened to have sync in it, so my regular expression matched it. Had I not gotten lucky and seen the troublemakers above, I would have ran pSnapper without the -p argument next, to see all threads’ activity regardless of their name.

The events above provide already enough hints of what these threads are doing so that I don’t even have to search the source code. Some sort of RAID10 syncing perhaps? Indeed when checking dmesg output, I see RAID (mirror) syncing going on at that time:

$ dmesg -H

[Apr 6 13:07] md/raid10:md10: not clean -- starting background reconstruction
[  +0.000003] md/raid10:md10: active with 4 out of 4 devices
[  +0.079751] md10: detected capacity change from 0 to 8001301774336
[  +0.000233] md: resync of RAID array md10

And that explains it. The RAID manager realized that the disks that were supposed to be mirrored were not in sync (I had just created that software RAID array) and started syncing them in background. That’s why there was no I/O shown against the /dev/md10 device in iostat as it wasn’t userland applications generating the I/O against the RAID device, but the kernel threads accessing the underlying disk devices directly. That also explains why iotop didn’t show these threads showing up. Iotop reads /proc/PID/io as its input, but most kernel threads don’t record any activity in there (as they are issuing their I/Os at much lower level compared to userspace application system calls). Here’s an example from the md10_raid10 kernel thread:

$ sudo cat /proc/`pgrep md10_raid10`/io
rchar: 0
wchar: 0
syscr: 0
syscw: 0
read_bytes: 0
write_bytes: 0
cancelled_write_bytes: 0

So in the case of troubleshooting unexpected I/O caused by kernel threads, you would need to either use some in-kernel tracing facility or sample /proc filesystem using pSnapper for getting a quick overview first.

Early warning signs of memory shortage (how busy is kswapd?)

kswapd threads are idle:

sudo psn -G syscall,wchan -p kswapd –a

=== Active Threads ============================================================

 samples | avg_threads | comm        | state                 | syscall |wchan  
-------------------------------------------------------------------------------
     100 |        1.00 | (kswapd0:0) | Sleep (Interruptible) | read    |kswapd 
     100 |        1.00 | (kswapd1:0) | Sleep (Interruptible) | read    |kswapd 

kswapd threads are using CPU (likely page scanning), doing I/O (paging) and also freeing reclaimable objects (looks like from XFS filesystem inode cache this time, can also free reclaimable kernel SLAB pages etc):

$ sudo psn -G syscall,wchan -p kswapd –a

=== Active Threads ===============================================================================

 samples | avg_threads | comm        | state                  | syscall   | wchan                 
--------------------------------------------------------------------------------------------------
      77 |        0.77 | (kswapd0:0) | Sleep (Interruptible)  | read      | kswapd                
      75 |        0.75 | (kswapd1:0) | Sleep (Interruptible)  | read      | kswapd                
      24 |        0.24 | (kswapd1:0) | Running (ON CPU)       | [running] | 0                     
      22 |        0.22 | (kswapd0:0) | Running (ON CPU)       | [running] | 0                     
       1 |        0.01 | (kswapd0:0) | Disk (Uninterruptible) | read      | io_schedule        
       1 |        0.01 | (kswapd1:0) | Disk (Uninterruptible) | read      | xfs_reclaim_inodes_ag

Kernel Stack Trace Breakdown

Scroll right to see the full picture:

$ sudo psn -G kstack

Linux Process Snapper v0.13 by Tanel Poder [https://0x.tools]
Sampling /proc/stat, stack for 5 seconds... finished.


=== Active Threads =====================================================================================================================================================================================================================================================================================================================================================================================================================================

 samples | avg_threads | comm              | state                  | kstack                                                                                                                                                                                                                                                                                                                                                                            
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      55 |        2.75 | (mysqld)          | Disk (Uninterruptible) | SyS_fsync()->do_fsync()->vfs_fsync_range()->xfs_file_fsync()->_xfs_log_force_lsn()                                                                                                                                                                                                                                                                                                
      41 |        2.05 | (mysqld)          | Running (ON CPU)       | -                                                                                                                                                                                                                                                                                                                                                                                 
      12 |        0.60 | (sysbench)        | Running (ON CPU)       | -                                                                                                                                                                                                                                                                                                                                                                                 
       8 |        0.40 | (ksoftirqd/*)     | Running (ON CPU)       | ret_from_fork()->kthread()->smpboot_thread_fn()                                                                                                                                                                                                                                                                                                                                   
       7 |        0.35 | (mysqld)          | Disk (Uninterruptible) | SyS_fsync()->do_fsync()->vfs_fsync_range()->xfs_file_fsync()->xfs_blkdev_issue_flush()->blkdev_issue_flush()->submit_bio_wait()                                                                                                                                                                                                                                                   
       6 |        0.30 | (mysqld)          | Disk (Uninterruptible) | SyS_fsync()->do_fsync()->vfs_fsync_range()->xfs_file_fsync()->file_write_and_wait_range()->__filemap_fdatawait_range()->wait_on_page_bit_common()->io_schedule()                                                                                                                                                                                                                  
       6 |        0.30 | (mysqld)          | Running (ON CPU)       | do_futex()->futex_wait()->futex_wait_queue_me()                                                                                                                                                                                                                                                                                                                                   
       4 |        0.20 | (mysqld)          | Running (ON CPU)       | retint_user()->prepare_exit_to_usermode()->exit_to_usermode_loop()                                                                                                                                                                                                                                                                                                                
       3 |        0.15 | (mysqld)          | Disk (Uninterruptible) | SyS_pwrite64()->vfs_write()->__vfs_write()->xfs_file_write_iter()->xfs_file_buffered_aio_write()->xfs_file_aio_write_checks()->file_update_time()->xfs_vn_update_time()->xfs_ilock()->call_rwsem_down_write_failed()                                                                                                                                                              
       3 |        0.15 | (mysqld)          | Disk (Uninterruptible) | SyS_pwrite64()->vfs_write()->__vfs_write()->xfs_file_write_iter()->xfs_file_buffered_aio_write()->xfs_ilock()->call_rwsem_down_write_failed()                                                                                                                                                                                                                                     
       3 |        0.15 | (mysqld)          | Running (ON CPU)       | SyS_fsync()->do_fsync()->vfs_fsync_range()->xfs_file_fsync()->_xfs_log_force_lsn()                                                                                                                                                                                                                                                                                                
       2 |        0.10 | (collectl)        | Running (ON CPU)       | -                                                                                                                                                                                                                                                                                                                                                                                 
       2 |        0.10 | (mysqld)          | Disk (Uninterruptible) | -                                                                                                                                                                                                                                                                                                                                                                                 
       1 |        0.05 | (kworker/*:*)     | Disk (Uninterruptible) | ret_from_fork()->kthread()->worker_thread()->process_one_work()->pm_runtime_work()->rpm_suspend()->rpm_callback()->__rpm_callback()->usb_runtime_suspend()->usb_suspend_both()->generic_suspend()->usb_port_suspend()->set_port_feature()->usb_control_msg()->usb_start_wait_urb()->usb_submit_urb()->usb_submit_urb.part.6()->usb_hcd_submit_urb()->xhci_hub_control()->msleep() 
       1 |        0.05 | (kworker/*:*)     | Disk (Uninterruptible) | ret_from_fork()->kthread()->worker_thread()->process_one_work()->xfs_end_io()->xfs_iomap_write_unwritten()->xfs_ilock()->call_rwsem_down_write_failed()                                                                                                                                                                                                                           
       1 |        0.05 | (mysqld)          | Disk (Uninterruptible) | SyS_io_submit()->do_io_submit()->aio_write()->xfs_file_write_iter()->xfs_file_buffered_aio_write()->iomap_file_buffered_write()->iomap_apply()->xfs_file_iomap_begin()->xfs_ilock()->call_rwsem_down_write_failed()                                                                                                                                                               
       1 |        0.05 | (mysqld)          | Disk (Uninterruptible) | SyS_io_submit()->do_io_submit()->aio_write()->xfs_file_write_iter()->xfs_file_buffered_aio_write()->xfs_file_aio_write_checks()->file_update_time()->xfs_vn_update_time()->xfs_ilock()->call_rwsem_down_write_failed()                                                                                                                                                            
       1 |        0.05 | (mysqld)          | Running (ON CPU)       | SyS_fsync()->do_fsync()->vfs_fsync_range()->xfs_file_fsync()->file_write_and_wait_range()->__filemap_fdatawait_range()->wait_on_page_bit_common()->io_schedule()                                                                                                                                                                                                                  
       1 |        0.05 | (mysqld)          | Running (ON CPU)       | xfs_trim_extent_eof()                                                                                                                                                                                                                                                                                                                                                             
       1 |        0.05 | (ora_vkrm_linprd) | Running (ON CPU)       | -                                                                                                                                                                                                                                                                                                                                                                                 

Other Usage Examples

Report cmdline,state summary of all “active” threads over 5 seconds:

$ psn

Process Snapper sampling cmdline, stat for 5 seconds...
finished sampling

=== Active Threads ===============================================

 samples | avg_threads | cmdline         | state                  
------------------------------------------------------------------
     316 |        9.58 | fio             | Disk (Uninterruptible) 
     212 |        6.42 | fio             | Running (ON CPU)       
      33 |        1.00 | python          | Running (ON CPU)       
      30 |        0.91 |                 | Running (ON CPU)       
       3 |        0.09 |                 | Disk (Uninterruptible) 
       2 |        0.06 | /usr/bin/perl   | Running (ON CPU)       
       1 |        0.03 | ora_vktm_LINPRD | Running (ON CPU)       
       1 |        0.03 | top             | Running (ON CPU)       

Report all thread states (including sleeping) for processes matching regex ^fio. Group by system call names too:

$ psn -p ^fio -g cmdline,state,syscall -a

Process Snapper sampling cmdline, syscall, stat for 5 seconds...
finished sampling


=== Active Threads ====================================================

 samples | avg_threads | cmdline | state                  | syscall    
-----------------------------------------------------------------------
     759 |        7.59 | fio     | Disk (Uninterruptible) | pread64    
     701 |        7.01 | fio     | Disk (Uninterruptible) | pwrite64   
     100 |        1.00 | fio     | Sleep (Interruptible)  | futex      
      99 |        0.99 | fio     | Sleep (Interruptible)  | nanosleep  
      45 |        0.45 | fio     | Disk (Uninterruptible) | [userland] 
      41 |        0.41 | fio     | Running (ON CPU)       | [userland] 
      37 |        0.37 | fio     | Running (ON CPU)       | pread64    
      17 |        0.17 | fio     | Running (ON CPU)       | pwrite64   
       1 |        0.01 | fio     | Running (ON CPU)       | nanosleep  

Same as above, but include kernel wait event wchan in the profile too:

$ psn -p ^fio -g cmdline,state,syscall,wchan -a

Process Snapper sampling cmdline, syscall, wchan, stat for 5 seconds...
finished sampling


=== Active Threads ==========================================================================

 samples | avg_threads | cmdline | state                  | syscall    | wchan               
---------------------------------------------------------------------------------------------
     705 |        7.05 | fio     | Disk (Uninterruptible) | pread64    | io_schedule         
     661 |        6.61 | fio     | Disk (Uninterruptible) | pwrite64   | io_schedule         
     100 |        1.00 | fio     | Sleep (Interruptible)  | futex      | futex_wait_queue_me 
     100 |        1.00 | fio     | Sleep (Interruptible)  | nanosleep  | hrtimer_nanosleep   
      42 |        0.42 | fio     | Running (ON CPU)       | [userland] | 0                   
      31 |        0.31 | fio     | Disk (Uninterruptible) | [userland] | 0                   
      30 |        0.30 | fio     | Disk (Uninterruptible) | [userland] | io_schedule         
      25 |        0.25 | fio     | Running (ON CPU)       | pread64    | 0                   
      25 |        0.25 | fio     | Running (ON CPU)       | pread64    | io_schedule         
      24 |        0.24 | fio     | Disk (Uninterruptible) | pread64    | 0                   
      20 |        0.20 | fio     | Running (ON CPU)       | [userland] | io_schedule         
      14 |        0.14 | fio     | Running (ON CPU)       | pwrite64   | 0                   
      12 |        0.12 | fio     | Running (ON CPU)       | pwrite64   | io_schedule         
      11 |        0.11 | fio     | Disk (Uninterruptible) | pwrite64   | 0                   

MySQL daemon profile (including Linux idle states that may not be idle in the RDBMS):

$ sudo psn -p mysql -g cmdline,state,syscall,wchan -a

Process Snapper sampling cmdline, syscall, wchan, stat for 5 seconds...
finished sampling


=== Active Threads ==========================================================================================

 samples | avg_threads | cmdline          | state                  | syscall         | wchan                 
-------------------------------------------------------------------------------------------------------------
    1728 |       17.28 | /usr/sbin/mysqld | Sleep (Interruptible)  | futex           | futex_wait_queue_me   
    1000 |       10.00 | /usr/sbin/mysqld | Sleep (Interruptible)  | io_getevents    | read_events           
     200 |        2.00 | /usr/sbin/mysqld | Sleep (Interruptible)  | nanosleep       | hrtimer_nanosleep     
     120 |        1.20 | /usr/sbin/mysqld | Disk (Uninterruptible) | pread64         | io_schedule           
     100 |        1.00 | /usr/sbin/mysqld | Sleep (Interruptible)  | poll            | poll_schedule_timeout 
     100 |        1.00 | /usr/sbin/mysqld | Sleep (Interruptible)  | rt_sigtimedwait | do_sigtimedwait       
      98 |        0.98 | /usr/sbin/mysqld | Running (ON CPU)       | [userland]      | 0                     
      12 |        0.12 | /usr/sbin/mysqld | Running (ON CPU)       | pread64         | 0                     
      10 |        0.10 | /usr/sbin/mysqld | Running (ON CPU)       | [userland]      | poll_schedule_timeout 
       8 |        0.08 | /usr/sbin/mysqld | Running (ON CPU)       | pread64         | io_schedule           
       7 |        0.07 | /usr/sbin/mysqld | Sleep (Interruptible)  | [userland]      | 0                     
       6 |        0.06 | /usr/sbin/mysqld | Disk (Uninterruptible) | [userland]      | io_schedule           
       4 |        0.04 | /usr/sbin/mysqld | Running (ON CPU)       | poll            | 0                     
       3 |        0.03 | /usr/sbin/mysqld | Disk (Uninterruptible) | [userland]      | 0                     
       1 |        0.01 | /usr/sbin/mysqld | Disk (Uninterruptible) | pread64         | 0                     
       1 |        0.01 | /usr/sbin/mysqld | Running (ON CPU)       | [userland]      | io_schedule           
       1 |        0.01 | /usr/sbin/mysqld | Sleep (Interruptible)  | [userland]      | poll_schedule_timeout 
       1 |        0.01 | /usr/sbin/mysqld | Sleep (Interruptible)  | poll            | 0                     

Sample only processes containing “post”. Use Postgres built-in cmdline instrumentation to see “inside” DB activity:

$ psn -p post -g cmdline

Process Snapper sampling cmdline, stat for 5 seconds...
finished sampling


=== Active Threads =============================================================

 samples | avg_threads | cmdline                                                
--------------------------------------------------------------------------------
     106 |        1.06 | postgres: postgres pgbench [local] COMMIT              
      40 |        0.40 | postgres: postgres pgbench [local] UPDATE              
      17 |        0.17 | postgres: postgres pgbench [local] idle in transaction 
       4 |        0.04 | postgres: postgres pgbench [local] SELECT              
       4 |        0.04 | postgres: postgres pgbench [local] idle                
       2 |        0.02 | postgres: postgres pgbench [local] UPDATE waiting      
       2 |        0.02 | postgres: postgres pgbench [local] idle in tran        
       1 |        0.01 | postgres: background writer                            
       1 |        0.01 | postgres: postgres pgbench [local] INSERT              

Profile Postgres active thread states only (sampling /proc/PID/syscall of a different user requires root access):

$ sudo psn -p post -g state,syscall

Process Snapper sampling syscall, stat for 5 seconds...
finished sampling


=== Active Threads ==========================================

 samples | avg_threads | state                  | syscall    
-------------------------------------------------------------
      93 |        0.93 | Disk (Uninterruptible) | fdatasync  
      64 |        0.64 | Running (ON CPU)       | [userland] 
      22 |        0.22 | Running (ON CPU)       | epoll_wait 
       7 |        0.07 | Running (ON CPU)       | futex      
       2 |        0.02 | Disk (Uninterruptible) | [userland] 
       2 |        0.02 | Running (ON CPU)       | fdatasync  

Full details of Postgres processes (including idle thread states) during pgbench startup vacuum:

$ sudo psn -p post -g cmdline,state,syscall,wchan -a

Process Snapper sampling cmdline, syscall, wchan, stat for 5 seconds...
finished sampling


=== Active Threads ==============================================================================================================

 samples | avg_threads | cmdline                                   | state                  | syscall    | wchan                 
---------------------------------------------------------------------------------------------------------------------------------
     100 |        1.00 | /usr/pgsql-11/bin/postmaster              | Sleep (Interruptible)  | select     | poll_schedule_timeout 
     100 |        1.00 | postgres: autovacuum launcher             | Sleep (Interruptible)  | epoll_wait | ep_poll               
     100 |        1.00 | postgres: background writer               | Sleep (Interruptible)  | epoll_wait | ep_poll               
     100 |        1.00 | postgres: checkpointer                    | Sleep (Interruptible)  | epoll_wait | ep_poll               
     100 |        1.00 | postgres: logger                          | Sleep (Interruptible)  | epoll_wait | ep_poll               
     100 |        1.00 | postgres: logical replication launcher    | Sleep (Interruptible)  | epoll_wait | ep_poll               
     100 |        1.00 | postgres: stats collector                 | Sleep (Interruptible)  | epoll_wait | ep_poll               
      99 |        0.99 | postgres: walwriter                       | Sleep (Interruptible)  | epoll_wait | ep_poll               
      51 |        0.51 | postgres: postgres pgbench [local] VACUUM | Disk (Uninterruptible) | fdatasync  | submit_bio_wait       
      23 |        0.23 | postgres: postgres pgbench [local] VACUUM | Running (ON CPU)       | [userland] | 0                     
      15 |        0.15 | postgres: postgres pgbench [local] VACUUM | Disk (Uninterruptible) | fdatasync  | io_schedule           
       3 |        0.03 | postgres: postgres pgbench [local] VACUUM | Running (ON CPU)       | fdatasync  | 0                     
       2 |        0.02 | postgres: postgres pgbench [local] VACUUM | Disk (Uninterruptible) | fdatasync  | 0                     
       2 |        0.02 | postgres: postgres pgbench [local] VACUUM | Running (ON CPU)       | fdatasync  | io_schedule           
       1 |        0.01 | postgres: postgres pgbench [local] VACUUM | Disk (Uninterruptible) | [userland] | 0                     
       1 |        0.01 | postgres: postgres pgbench [local] VACUUM | Disk (Uninterruptible) | [userland] | io_schedule           
       1 |        0.01 | postgres: postgres pgbench [local] VACUUM | Disk (Uninterruptible) | [userland] | submit_bio_wait       
       1 |        0.01 | postgres: postgres pgbench [local] VACUUM | Running (ON CPU)       | fdatasync  | submit_bio_wait       
       1 |        0.01 | postgres: walwriter                       | Sleep (Interruptible)  | futex      | futex_wait_queue_me   

Full details of Postgres processes (including idle thread states) during pgbench run:

$ sudo psn -p post -g cmdline,state,syscall,state -a

Process Snapper sampling cmdline, syscall, stat for 5 seconds...
finished sampling


=== Active Threads ============================================================================================================================

 samples | avg_threads | cmdline                                                | state                  | syscall    | state                  
-----------------------------------------------------------------------------------------------------------------------------------------------
     523 |        5.23 | postgres: postgres pgbench [local] COMMIT              | Sleep (Interruptible)  | futex      | Sleep (Interruptible)  
     100 |        1.00 | /usr/pgsql-11/bin/postmaster                           | Sleep (Interruptible)  | select     | Sleep (Interruptible)  
     100 |        1.00 | postgres: autovacuum launcher                          | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
     100 |        1.00 | postgres: logger                                       | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
     100 |        1.00 | postgres: logical replication launcher                 | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
     100 |        1.00 | postgres: postgres pgbench [local] COMMIT              | Disk (Uninterruptible) | fdatasync  | Disk (Uninterruptible) 
     100 |        1.00 | postgres: stats collector                              | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
      99 |        0.99 | postgres: background writer                            | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
      71 |        0.71 | postgres: checkpointer                                 | Sleep (Interruptible)  | select     | Sleep (Interruptible)  
      57 |        0.57 | postgres: walwriter                                    | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
      40 |        0.40 | postgres: walwriter                                    | Sleep (Interruptible)  | futex      | Sleep (Interruptible)  
      30 |        0.30 | postgres: postgres pgbench [local] idle                | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
      23 |        0.23 | postgres: postgres pgbench [local] idle in transaction | Running (ON CPU)       | [userland] | Running (ON CPU)       
      20 |        0.20 | postgres: postgres pgbench [local] UPDATE              | Running (ON CPU)       | [userland] | Running (ON CPU)       
      20 |        0.20 | postgres: postgres pgbench [local] UPDATE waiting      | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
      18 |        0.18 | postgres: checkpointer                                 | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
      12 |        0.12 | postgres: postgres pgbench [local] idle in transaction | Sleep (Interruptible)  | [userland] | Sleep (Interruptible)  
       9 |        0.09 | postgres: checkpointer                                 | Disk (Uninterruptible) | fsync      | Disk (Uninterruptible) 
       9 |        0.09 | postgres: postgres pgbench [local] UPDATE              | Running (ON CPU)       | epoll_wait | Running (ON CPU)       
       9 |        0.09 | postgres: postgres pgbench [local] idle in transaction | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
       8 |        0.08 | postgres: postgres pgbench [local] COMMIT              | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
       5 |        0.05 | postgres: postgres pgbench [local] SELECT              | Running (ON CPU)       | [userland] | Running (ON CPU)       
       5 |        0.05 | postgres: postgres pgbench [local] idle in transaction | Running (ON CPU)       | epoll_wait | Running (ON CPU)       
       5 |        0.05 | postgres: postgres pgbench [local] idle in transaction | Sleep (Interruptible)  | futex      | Sleep (Interruptible)  
       4 |        0.04 | postgres: postgres pgbench [local] COMMIT              | Sleep (Interruptible)  | [userland] | Sleep (Interruptible)  
       4 |        0.04 | postgres: postgres pgbench [local] UPDATE              | Sleep (Interruptible)  | [userland] | Sleep (Interruptible)  
       3 |        0.03 | postgres: postgres pgbench [local] COMMIT              | Running (ON CPU)       | futex      | Running (ON CPU)       
       3 |        0.03 | postgres: postgres pgbench [local] SELECT              | Running (ON CPU)       | epoll_wait | Running (ON CPU)       
       3 |        0.03 | postgres: postgres pgbench [local] idle                | Running (ON CPU)       | epoll_wait | Running (ON CPU)       
       3 |        0.03 | postgres: postgres pgbench [local] idle                | Sleep (Interruptible)  | [userland] | Sleep (Interruptible)  
       2 |        0.02 | postgres: checkpointer                                 | Running (ON CPU)       | [userland] | Running (ON CPU)       
       2 |        0.02 | postgres: postgres pgbench [local] INSERT              | Sleep (Interruptible)  | [userland] | Sleep (Interruptible)  
       2 |        0.02 | postgres: postgres pgbench [local] INSERT              | Sleep (Interruptible)  | futex      | Sleep (Interruptible)  
       2 |        0.02 | postgres: walwriter                                    | Sleep (Interruptible)  | [userland] | Sleep (Interruptible)  
       1 |        0.01 | postgres: background writer                            | Running (ON CPU)       | [userland] | Running (ON CPU)       
       1 |        0.01 | postgres: postgres pgbench [local] COMMIT              | Running (ON CPU)       | [userland] | Running (ON CPU)       
       1 |        0.01 | postgres: postgres pgbench [local] COMMIT              | Running (ON CPU)       | fdatasync  | Running (ON CPU)       
       1 |        0.01 | postgres: postgres pgbench [local] INSERT              | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
       1 |        0.01 | postgres: postgres pgbench [local] SELECT              | Sleep (Interruptible)  | epoll_wait | Sleep (Interruptible)  
       1 |        0.01 | postgres: postgres pgbench [local] UPDATE waiting      | Running (ON CPU)       | epoll_wait | Running (ON CPU)       
       1 |        0.01 | postgres: postgres pgbench [local] idle                | Running (ON CPU)       | [userland] | Running (ON CPU)       
       1 |        0.01 | postgres: postgres pgbench [local] idle in tran        | Running (ON CPU)       | [userland] | Running (ON CPU)       
       1 |        0.01 | postgres: walwriter                                    | Running (ON CPU)       | futex      | Running (ON CPU)       



samples: 100 (expected: 100)
total processes: 16, threads: 16
runtime: 5.01, measure time: 0.76

That’s it!