Getting to Know TGID and PID in eBPF: Essential for Observability

When working with eBPF, retrieving process and thread information is essential for monitoring and observability. One commonly used helper function for this purpose is bpf_get_current_pid_tgid(). It returns both the Thread Group ID (TGID) and the Process ID (PID) of the current task in a single call.

But what exactly are TGID and PID, and how do they differ🤔?

  1. Extracting TGID and PID with eBPF
  2. Understanding TGID vs. PID
  3. Common Pitfall: The Misleading ps Output
  4. Experimenting with the Aya Framework
    1. Step0: Set up the environment for Aya
    2. Step1: Run the eBPF programs
    3. Step2: Build a multi-threaded sample program
    4. Step3: Run the program and check IDs with ps
    5. Step4: Compare IDs between the eBPF Program and ps
  5. Wrap-up

Extracting TGID and PID with eBPF

Let’s take a look at how TGID and PID can be retrieved in an eBPF program.

The bpf_get_current_pid_tgid() function returns a 64-bit unsigned integer where:

  • The upper 32 bits contain the TGID (Thread Group ID).
  • The lower 32 bits contain the PID (Process ID).

Here’s how you can extract these values in an eBPF program written in C:

u64 pid_tgid = bpf_get_current_pid_tgid();
u32 tgid = pid_tgid >> 32;            // Extract TGID from the upper 32 bits
u32 pid = pid_tgid & 0xFFFFFFFF;      // Extract PID from the lower 32 bits
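To check the bit arithmetic outside the kernel, the same layout can be mimicked in plain user-space C. This is only a minimal sketch: pack_pid_tgid() and the other function names are hypothetical stand-ins for illustration, not kernel or libbpf APIs, and the IDs are borrowed from the example output later in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the value bpf_get_current_pid_tgid() returns:
 * TGID in the upper 32 bits, PID (thread ID) in the lower 32 bits. */
static inline uint64_t pack_pid_tgid(uint32_t tgid, uint32_t pid)
{
    return ((uint64_t)tgid << 32) | pid;
}

/* Upper 32 bits: the TGID. */
static inline uint32_t extract_tgid(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: the PID. The cast truncates, same effect as & 0xFFFFFFFF. */
static inline uint32_t extract_pid(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}

/* A thread whose PID equals its TGID is the thread group leader. */
static inline int is_group_leader(uint64_t pid_tgid)
{
    return extract_tgid(pid_tgid) == extract_pid(pid_tgid);
}

/* Round-trip check using IDs from the example output in this post. */
static inline void check_roundtrip(void)
{
    uint64_t worker = pack_pid_tgid(32125, 32131);
    assert(extract_tgid(worker) == 32125);
    assert(extract_pid(worker) == 32131);
    assert(!is_group_leader(worker));
    assert(is_group_leader(pack_pid_tgid(32125, 32125)));
}
```

Calling check_roundtrip() from a small main() is enough to confirm that the shift and the mask recover both halves.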

Understanding TGID vs. PID

ID | Meaning | Characteristics | Representation in ps Command
TGID | Thread Group ID | • Identifies a process. • All threads in a multi-threaded process share the same TGID. • Typically, the TGID matches the PID of the first thread in the process. | PID (This is confusing😵‍💫)
PID | Process ID | • Unique identifier for each thread. • In a single-threaded process, the TGID and PID are the same. • In a multi-threaded process, each thread has a different PID while sharing the same TGID. | SPID (or LWP)
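The same distinction is visible from user space, where the names shift by one level: getpid() returns what the kernel calls the TGID, and the gettid system call returns what the kernel calls the PID. The following is a minimal sketch, assuming Linux; struct task_ids and the function names are hypothetical and chosen to mirror the kernel's naming. Note that these calls report IDs as seen in the caller's PID namespace, so inside a container they may differ from the values an eBPF program sees.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() under strict -std= modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Field names follow the kernel's terminology, not user space's. */
struct task_ids {
    uint32_t tgid; /* user space calls this the "PID" (getpid) */
    uint32_t pid;  /* user space calls this the "TID" (gettid) */
};

/* Read both IDs for the calling thread. */
static inline struct task_ids current_task_ids(void)
{
    struct task_ids ids;
    ids.tgid = (uint32_t)getpid();
    ids.pid = (uint32_t)syscall(SYS_gettid);
    return ids;
}

/* Only valid when called from the main thread, where both IDs coincide. */
static inline void check_main_thread(void)
{
    struct task_ids ids = current_task_ids();
    assert(ids.tgid == ids.pid);
}
```

Called from any thread other than the main one, current_task_ids() would return the same tgid but a different pid.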

Common Pitfall: The Misleading ps Output

A common source of confusion is the ps command. What ps shows in its “PID” column is the kernel's TGID, not the kernel's PID, which is misleading. The kernel's PID, which is the per-thread identifier, appears only when threads are listed: as SPID with ps -T, or as LWP with ps -L.
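The per-thread identifiers are also exposed by the kernel as directory names under /proc/<TGID>/task, which, as far as I know, is where thread-aware tools get them. Here is a minimal sketch that lists them for the calling process, assuming a Linux system with /proc mounted; list_thread_ids() and contains_id() are hypothetical helper names.

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Collect up to `max` thread IDs of the calling process from /proc/self/task.
 * Returns the number of IDs stored, or -1 if the directory cannot be opened. */
static inline int list_thread_ids(pid_t *out, int max)
{
    DIR *dir = opendir("/proc/self/task");
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while (count < max && (entry = readdir(dir)) != NULL) {
        char *end;
        long id = strtol(entry->d_name, &end, 10);
        if (end == entry->d_name || *end != '\0')
            continue; /* skip "." and ".." */
        out[count++] = (pid_t)id;
    }
    closedir(dir);
    return count;
}

/* Returns 1 if `id` appears in the first `n` entries of `ids`. */
static inline int contains_id(const pid_t *ids, int n, pid_t id)
{
    for (int i = 0; i < n; i++)
        if (ids[i] == id)
            return 1;
    return 0;
}

/* The main thread's ID equals getpid(), so it should always be listed. */
static inline void check_leader_is_listed(void)
{
    pid_t ids[64];
    int n = list_thread_ids(ids, 64);
    assert(n >= 1);
    assert(contains_id(ids, n, getpid()));
}
```

In a single-threaded program the list holds one entry, equal to getpid(); in a multi-threaded one it holds one entry per thread.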

Experimenting with the Aya Framework

Let’s use the Aya framework to experiment with bpf_get_current_pid_tgid(). The code for this experiment is available in my GitHub Repository: yukinakanaka/aya-lab.

Step0: Set up the environment for Aya

Prerequisites

  • Linux
  • Rust nightly
  • bpf-linker
  • bindgen-cli
  • bpftool

If you can use lima, please use my configuration here.

Step1: Run the eBPF programs

Run the eBPF program that traces fork, execve, and exit, printing the TGID and PID:

cd tgid-pid && cargo xtask codegen && cargo xtask run

Step2: Build a multi-threaded sample program

Build a sample program written in Rust.

cd tools && cargo build --bin multithread

Step3: Run the program and check IDs with ps

The program will create 11 threads. Run it and observe the process details using ps:

./target/debug/multithread 1>/dev/null 2>&1 & ps -efL | grep -e 'UI[D]' -e 'multithrea[d]'

Step4: Compare IDs between the eBPF Program and ps

Here are examples:

Example Output from eBPF Program (TGID = 32125)

 INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32131      uid: 502    
 INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32130      uid: 502    
 INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32137      uid: 502    
 INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32134      uid: 502    
 INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32129      uid: 502    
 INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32132      uid: 502    
 INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32135      uid: 502    
 INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32133      uid: 502    
 INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32136      uid: 502    
 INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32125      uid: 502  

Example Output from ps (PID = 32125)

UID          PID    PPID     LWP  C NLWP STIME TTY          TIME CMD
yukinak+   32125   10466   32125  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread
yukinak+   32125   10466   32128  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread
yukinak+   32125   10466   32129  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread
yukinak+   32125   10466   32130  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread
yukinak+   32125   10466   32131  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread
yukinak+   32125   10466   32132  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread
yukinak+   32125   10466   32133  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread
yukinak+   32125   10466   32134  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread
yukinak+   32125   10466   32135  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread
yukinak+   32125   10466   32136  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread
yukinak+   32125   10466   32137  0   11 22:20 pts/3    00:00:00 ./target/debug/multithread

The TGID from the eBPF logs matches the PID column in ps. The LWP values correspond to the thread-specific PIDs observed in the eBPF logs. This experiment highlights the differences between TGID and PID in multi-threaded applications.
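The comparison can also be done mechanically. The sketch below parses one eBPF log line in the format shown above and checks it against the LWP column copied from the ps output. It assumes the log layout stays "tgid: <n> pid: <m>"; parse_ids(), lwp_listed(), and check_example_line() are hypothetical names, not part of the sample repository.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* LWP column copied from the ps output above. */
static const unsigned ps_lwps[] = {
    32125, 32128, 32129, 32130, 32131, 32132,
    32133, 32134, 32135, 32136, 32137,
};

/* Parse "tgid: <n> pid: <m>" out of one log line.
 * Returns 1 on success, 0 if the fields are not found. */
static inline int parse_ids(const char *line, unsigned *tgid, unsigned *pid)
{
    const char *p = strstr(line, "tgid:");
    if (p == NULL)
        return 0;
    /* A space in the format matches any run of whitespace. */
    return sscanf(p, "tgid: %u pid: %u", tgid, pid) == 2;
}

/* Returns 1 if `pid` appears in the LWP column of the ps output. */
static inline int lwp_listed(unsigned pid)
{
    for (size_t i = 0; i < sizeof ps_lwps / sizeof ps_lwps[0]; i++)
        if (ps_lwps[i] == pid)
            return 1;
    return 0;
}

/* Check the first eBPF log line from the example against ps. */
static inline void check_example_line(void)
{
    const char *line =
        " INFO ebpf::exit: comm: multithread     tgid: 32125     pid: 32131      uid: 502";
    unsigned tgid = 0, pid = 0;
    int ok = parse_ids(line, &tgid, &pid);
    assert(ok);
    assert(tgid == 32125);   /* matches the PID column of ps */
    assert(lwp_listed(pid)); /* matches one of the LWP values */
}
```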

Wrap-up

In this post, I explained the difference between TGID and PID and used a multi-threaded sample program to show how the ps command displays each of them. When writing eBPF programs, it’s important to remember that the TGID identifies the process while the PID identifies the thread, and to use the one that matches what you want to track.