This document provides step-by-step instructions for identifying the process and/or the native code stack that is sending or generating a kill signal (e.g., kill -3 or kill -9) to an application using IBM Java for AIX.


Overview


When a Java process (or any other process) is unexpectedly sent a kill signal (e.g., kill -3 or kill -9) and the process or user sending that signal cannot be identified, follow these steps to identify the process sending the signal.

By using an AIX-specific trace utility called probevue, it is possible to quickly identify the process(es) sending kill signals, whether they use the kill command or the kill() system call.


Note:

All of the steps in this document must be performed by the root user or a user with system authority.

This document and the files on the download site are provided as a courtesy to our AIX and IBM Java for AIX users.

The information in this document and all files provided on the download site are provided AS-IS and without warranty. That is, there will be no support for the downloaded files.

It is expected that users will review and test all procedures and scripts prior to implementing in production.


Step A

Download Probevue Script


1. Download the two sample probevue scripts: trace_kill.pv and trace_kill_by_pid_only.pv

2. Save the scripts as /tmp/trace_kill.pv and /tmp/trace_kill_by_pid_only.pv on the system (LPAR) to be monitored.

3. Read and agree to the conditions of use and license information in /tmp/trace_kill.pv and /tmp/trace_kill_by_pid_only.pv by executing the following commands in a command window:

# pg /tmp/trace_kill.pv
# pg /tmp/trace_kill_by_pid_only.pv


Step B

Choose Monitoring Script and Mode


1. Choose a script based on the symptoms of the issue:

a. When a signal is being generated from within an existing process, proceed to Step B.2: Working with the trace_kill_by_pid_only.pv script.

b. When a signal is being sent to a process from another process, proceed to Step B.3: Working with the trace_kill.pv script.

Please make sure to read the "Note:" areas for each step and at the bottom of this section.


2. Working with the trace_kill_by_pid_only.pv script

The trace_kill_by_pid_only.pv script can be used to monitor signals being generated within a single process. The script can monitor all signals or a single signal.
This script includes trace information for the pthread_kill() and abort() calls, which is not available with trace_kill.pv.

The script syntax is:


# probevue /tmp/trace_kill_by_pid_only.pv PID SIGNAL

where: (all parameters are required)

PID is the process id (pid) of an active process (to monitor a specific process receiving the signal(s))
SIGNAL is 0 (to monitor all signals) or a specific signal (to monitor a specific signal) ** Use only positive signal values (e.g., use "3" not "-3" )


Note: The current version supports monitoring only a single process, and either all signals or a single signal.


Examples


a. Monitor any kill signal sent to a specific process receiving signals:


{12345678 would be replaced with the actual process id receiving signals}

# probevue /tmp/trace_kill_by_pid_only.pv 12345678 0



b. Monitor a specific signal for a specific process receiving signals:

{12345678 would be replaced with the actual process id and 3 could be replaced with any valid signal number}

# probevue /tmp/trace_kill_by_pid_only.pv 12345678 3



3. Working with the trace_kill.pv script

The trace_kill.pv script supports four monitoring modes. The modes are determined by the values passed to the probevue script when started.

The script syntax is:


# probevue /tmp/trace_kill.pv PID SIGNAL

where: (all parameters are required)

PID is 0 (to monitor all processes receiving signals) or the process id (pid) of an active process (to monitor a specific process receiving the signal(s))
SIGNAL is 0 (to monitor all signals) or a specific signal (to monitor a specific signal) ** Use only positive signal values (e.g., use "3" not "-3" )


Note: The current version supports monitoring either all processes or a single process, and either all signals or a single signal.


Examples


a. Monitor any kill signal being sent to any processes receiving signals:

# probevue /tmp/trace_kill.pv 0 0


b. Monitor any kill signal sent to a specific process receiving signals:


{12345678 would be replaced with the actual process id receiving signals}

# probevue /tmp/trace_kill.pv 12345678 0



c. Monitor a specific signal for any process receiving signals:

{3 could be replaced with any valid signal number - see /usr/include/sys/signal.h for a list of commonly used signals}

# probevue /tmp/trace_kill.pv 0 3



d. Monitor a specific signal for a specific process receiving signals:

{12345678 would be replaced with the actual process id and 3 could be replaced with any valid signal number}

# probevue /tmp/trace_kill.pv 12345678 3
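
The examples above pass signals as plain numbers. As a convenience, the following sketch maps a signal number to its symbolic name before a trace is started, so the value can be double-checked without opening signal.h. It relies on the POSIX form of kill -l; the function name signal_name is illustrative and is not part of the downloaded scripts.

```shell
# Print the symbolic name for a numeric signal (for example, 3 -> QUIT).
# Uses the POSIX "kill -l <number>" form; rejects anything that is not a
# plain positive number, such as "-3".
signal_name() {
    case "$1" in
        ''|*[!0-9]*)
            echo "signal must be a positive number (use 3, not -3)" >&2
            return 1
            ;;
    esac
    kill -l "$1"
}
```

For example, signal_name 3 prints QUIT and signal_name 9 prints KILL.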



Note:

The probevue scripts provide no error checking or debug messages.

Use positive values for the signals. For example, use "3", not "-3".

By default, the scripts are configured to show native code stacks from the processes that are sending and/or generating the signals. To disable this feature, modify the probevue script being used by changing the line:

bSHOW_STACK = 1;

to

bSHOW_STACK = 0;

then saving the file. To re-enable the feature to display the code stacks, simply change the value of bSHOW_STACK back to 1 and save the file.
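
The edit described above can also be scripted. The following is a minimal sketch that assumes the assignment appears in the script exactly in the form shown (bSHOW_STACK = 1; or bSHOW_STACK = 0;). The function name set_show_stack and the .bak backup suffix are illustrative choices, not part of the downloaded files.

```shell
# Set bSHOW_STACK to 0 or 1 in a probevue script, keeping a backup copy.
# Assumes the assignment looks like "bSHOW_STACK = 1;" as shown in this document.
set_show_stack() {
    script=$1
    value=$2
    case "$value" in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 1 ;;
    esac
    # Keep the original so the change can be reviewed or reverted.
    cp "$script" "$script.bak" || return 1
    sed "s/bSHOW_STACK[ ]*=[ ]*[01][ ]*;/bSHOW_STACK = $value;/" \
        "$script.bak" > "$script"
}

# Example: disable native stack output in the sample script.
#   set_show_stack /tmp/trace_kill.pv 0
```

Review the modified file (for example, with pg) before starting a trace, since the sketch does not verify that a matching line was found.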


Step C

Start Probevue Script


1. Start the chosen script:

a. To start the trace_kill_by_pid_only.pv script, execute the following commands (as the root user) from a command prompt:

{replace PID with the process id of the active process receiving signals}
{replace SIGNAL with 0 to monitor all signals or the number of the signal to monitor}


# cd /tmp
# nohup probevue /tmp/trace_kill_by_pid_only.pv PID SIGNAL > /tmp/monitor_kill.out 2>&1 &


The following example would monitor the process 1234567 generating a kill -3 (SIGQUIT) signal.

# cd /tmp
# nohup probevue /tmp/trace_kill_by_pid_only.pv 1234567 3 > /tmp/monitor_kill.out 2>&1 &


b. To start the trace_kill.pv script, execute the following commands (as the root user) from a command prompt:

{replace PID with 0 to monitor all processes receiving signals or the process id of the active process receiving signals}
{replace SIGNAL with 0 to monitor all signals or the number of the signal to monitor}


# cd /tmp
# nohup probevue /tmp/trace_kill.pv PID SIGNAL > /tmp/monitor_kill.out 2>&1 &


The following example would monitor all processes receiving a kill -3 (SIGQUIT) signal.

# cd /tmp
# nohup probevue /tmp/trace_kill.pv 0 3 > /tmp/monitor_kill.out 2>&1 &
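
Because the scripts perform no error checking, it may help to validate the PID and SIGNAL values before running the nohup commands above. The following sketch is one possible pre-flight check. The function name check_trace_args and its "any"/"single" modes are hypothetical helpers, with "single" reflecting that trace_kill_by_pid_only.pv monitors one specific process.

```shell
# Return 0 when PID and SIGNAL look like acceptable script arguments.
#   mode "any"    : PID may be 0 (all processes), as for trace_kill.pv
#   mode "single" : PID must be a specific process id, as for
#                   trace_kill_by_pid_only.pv
check_trace_args() {
    mode=$1
    pid=$2
    sig=$3
    case "$pid" in
        ''|*[!0-9]*) echo "PID must be 0 or a process id" >&2; return 1 ;;
    esac
    case "$sig" in
        ''|*[!0-9]*) echo "SIGNAL must be 0 or a positive number (use 3, not -3)" >&2; return 1 ;;
    esac
    if [ "$mode" = single ] && [ "$pid" -eq 0 ]; then
        echo "a specific PID is required in single mode" >&2
        return 1
    fi
    return 0
}

# Example (as root on the system to be monitored):
#   check_trace_args any 0 3 &&
#       nohup probevue /tmp/trace_kill.pv 0 3 > /tmp/monitor_kill.out 2>&1 &
```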



2. To confirm that the probevue script has started:

Execute the following commands from a command prompt:

# ps -ef | grep trace_kill

root ##### #### 0 14:53:02 pts/1 0:00 probevue /tmp/trace_kill ......


# head -2 /tmp/monitor_kill.out

[INIT] pid= 1234567 sig= 3
or
[INIT] pid= 0 sig= 3
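
The check of the first log lines can be automated. The sketch below assumes the [INIT] record has exactly the layout shown above ("[INIT] pid= N sig= N") and prints the PID and SIGNAL values the script reported at startup. The function name init_params is illustrative.

```shell
# Print "PID SIGNAL" from the [INIT] record in the first two lines of the
# log file; return non-zero when no such record is found.
init_params() {
    head -n 2 "$1" | awk '
        $1 == "[INIT]" && $2 == "pid=" && $4 == "sig=" {
            print $3, $5
            found = 1
            exit
        }
        END { exit !found }'
}

# Example:
#   init_params /tmp/monitor_kill.out || echo "probevue script did not start"
```

Comparing the printed values with the arguments that were passed on the command line is a quick way to confirm that the intended mode is active.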


Step D

Monitor Probevue Script Log


With the script started as described above, monitor the kill signal(s) being sent to the process(es) by executing the following command from a command prompt:


# tail -f /tmp/monitor_kill.out

The log messages reported in the output file will be similar to the following examples:

{the values for names, process ids (pids), thread ids (tids), user ids (uids), and signals will be different than the ones shown}

[MATCH] kill(): SOURCE( NAME= ksh PID= 5111900 TID= 28835983 UID= 0 ) TARGET( PID= 3866818 SIGNAL= 3 )
[MATCH] kill(): SOURCE( NAME= java PID= 21341233 TID= 51314123 UID= 0 ) TARGET( PID= 4128932 SIGNAL= 3 )


where:

SOURCE-NAME is the name of the process sending the matching kill signal(s)
SOURCE-PID is the process id of the process sending the matching kill signal(s)
SOURCE-TID is the thread id of the thread sending the matching kill signal(s)
SOURCE-UID is the user id of the process sending the matching kill signal(s)

TARGET-PID is the process id of the process receiving the matching kill signal(s)
TARGET-SIGNAL is the signal that was received by the process TARGET-PID
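
When the log grows large, a summary per sender can be easier to read than the raw records. The following sketch counts [MATCH] records by source name, source PID, target PID, and signal. It assumes every record follows the kill() layout shown above; records in another layout would be counted with "?" placeholders. The function name summarize_senders is illustrative.

```shell
# Summarize [MATCH] records as: COUNT NAME SOURCE-PID TARGET-PID SIGNAL,
# most frequent sender first.
summarize_senders() {
    awk '$1 == "[MATCH]" {
        name = spid = tpid = sig = "?"
        side = ""
        for (i = 2; i < NF; i++) {
            if ($i == "SOURCE(") side = "S"
            else if ($i == "TARGET(") side = "T"
            else if ($i == "NAME=") name = $(i + 1)
            else if ($i == "PID=" && side == "S") spid = $(i + 1)
            else if ($i == "PID=" && side == "T") tpid = $(i + 1)
            else if ($i == "SIGNAL=") sig = $(i + 1)
        }
        count[name " " spid " " tpid " " sig]++
    }
    END { for (k in count) print count[k], k }' "$1" | sort -rn
}

# Example:
#   summarize_senders /tmp/monitor_kill.out
```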


Once the process sending the signal(s) to another process has been identified, proceed to Step E: Stop Probevue Script.


Step E

Stop Probevue Script


To stop the probevue script, execute the following commands from a command prompt:

{if a name other than trace_kill was used to name the script, then replace trace_kill in the following commands with the name actually used}

# kill -9 $( ps -ef | grep trace_kill | grep -v grep | awk '{print $2}' )
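
The one-line command above matches any process whose command line contains the script name (for example, a pg session viewing the script), and kill reports a usage error when nothing matches. The following sketch is a more defensive variant under the same assumptions about ps -ef output, where the second column is the process id. The function names probevue_pids and stop_probevue are illustrative.

```shell
# Read "ps -ef" output on stdin and print the PIDs of probevue processes
# whose command line mentions the given script name.
probevue_pids() {
    awk -v name="$1" '{
        for (i = 3; i <= NF; i++)
            if (($i == "probevue" || $i ~ /\/probevue$/) && index($0, name)) {
                print $2
                next
            }
    }'
}

# Stop the matching probevue process(es); the script name defaults to trace_kill.
stop_probevue() {
    # Capture the process list first so the filter cannot match itself.
    ps_out=$(ps -ef)
    pids=$(printf '%s\n' "$ps_out" | probevue_pids "${1:-trace_kill}")
    if [ -z "$pids" ]; then
        echo "no matching probevue process found" >&2
        return 1
    fi
    # $pids is intentionally unquoted so that each PID becomes one argument.
    kill -9 $pids
}
```

Running stop_probevue with no argument targets probevue processes started with a script whose name contains trace_kill, as in the command above.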



Contact IBM Support


If, after reading and following the above instructions, further assistance is required, please complete the following steps:

1. Confirm that you have reviewed and completed all of the above steps.

2. Contact IBM and open a new IBM service request (i.e., a new IBM PMR).

3. Collect and upload data as per the data collection procedures noted in the above sections or package and upload the current data and details by following the instructions on this web page:


IBM Java for AIX MustGather: How to upload diagnostic data and testcases to IBM

Document Type: Technical Document
Content Type: General
Hardware: all Power
Operating System: all AIX Versions
IBM Java: all Java Versions
Author(s): Roger Leuckie
Reviewer(s): NA