If a JVM is experiencing degraded performance, first determine whether the system is resource constrained by checking the CPU, memory, and swap resources of the server/LPAR on which the JVM resides.
If the system is resource constrained, running the automated 'perfpmr' or 'pdump' scripts may do more harm than good.
In some situations, the 'perfpmr' or 'pdump' scripts can hang the whole operating system.
In a resource constrained situation you still need to gather and upload a minimum amount of data, so that the IBM representative can get as full a picture as possible of what is going on with the OS and JVM.
This requires you to manually run commands that produce output files, which can then be packaged and uploaded for analysis by an IBM representative.
Jump to section: (Prepare) (High Level View of Resources) (Data Gather) (Additional Data) (Checklist) (Package Data) (Upload Data)
The instructions in this document make references to generic terms in Italics that will need to be replaced with information specific to the support call and the environment. It is very important that consistent and accurate values be used in place of the Italicized generic terms when collecting the data to ensure the prompt and correct delivery of the data when uploaded.
Generic Term    Replace with
TMP_PATH        A temporary directory with a minimum of 10 GB of free space (e.g., /large_fs).
MM-DD           The current month and day (e.g., 01-31).
PMR             The full IBM PMR number (e.g., PMR12345.b678.c000).
JAVA_PID        The process id of the active Java process (e.g., use the "ps" command and check the PID column to identify the process).
LOG_FILE        The actual name of the log file.
NEW_PATH        An actual path to a new location for the log files.
Overview
Step-by-Step Instructions
Examples / Tips / Hints / Comments / Descriptions

Step 1:

Prepare

A. Run the following command to determine whether you should manually run the commands that gather the minimal set of data for the IBM representative to analyze:

# vmstat -lI 1 25

*The '-l' (lower case l) flag displays an extra "large-page" section

*The '-I' (capital i) flag displays I/O oriented view columns


B. Enable Java verbose garbage collection (GC)

Add the Java command line options:

-verbose:gc

-Xverbosegclog:/TMP_PATH/gc.log (e.g., /tmp/gc.log)

to your Java command line or process startup profile/script. This requires the process to be restarted.
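As a minimal sketch, the two options can be assembled in a startup script as shown here. The function name gc_options, the variable JAVA_OPTS, and the class name MyApp are hypothetical; how options reach the JVM depends on your application's launcher.

```shell
# Hypothetical helper: print the verbose GC options for a given log directory.
# $1 is the directory that stands in for /TMP_PATH.
gc_options() {
    printf '%s %s\n' "-verbose:gc" "-Xverbosegclog:$1/gc.log"
}

# Example use in a startup script (JAVA_OPTS and MyApp are placeholders):
#   JAVA_OPTS="$(gc_options /tmp)"
#   java $JAVA_OPTS MyApp
```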


C. Redirect or save standard error (stderr) messages to a file

Commonly used application servers may already save standard out and standard error messages to a log file (e.g., SystemOut.log, native_stdout.log, SystemErr.log, native_stderr.log) or to the application log file.

For custom applications, redirect the standard error messages by appending "2>/TMP_PATH/LOG_FILE", or redirect both stdout and stderr to a file by appending ">/TMP_PATH/LOG_FILE 2>&1".
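For a custom application, the combined redirection might be wrapped as in the sketch below. The function name run_logged and the example command line are hypothetical and are not part of any IBM tooling.

```shell
# Hypothetical wrapper: run a command with stdout and stderr sent to one file.
# $1 is the log file (for example /TMP_PATH/LOG_FILE); the rest is the command.
run_logged() {
    log_file=$1
    shift
    "$@" >"$log_file" 2>&1
}

# Example (the java command line is a placeholder):
#   run_logged /tmp/app.log java -verbose:gc MyApp &
```

The order matters: the file redirection comes first and "2>&1" second, so that stderr follows stdout into the file.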


D. Relogin, then restart your application

Perform the following actions in order for the changes to take effect:

- Stop the application (and node agent/manager, if applicable)

- Relogin as the USERID used in Step 1.B

- Confirm that full core is enabled and the new ulimits are in effect by executing the commands:

# ulimit -a

# lsattr -D -c sys -a fullcore -H

- Restart the application (e.g., node agent/manager) from the new login session

A. Example:

# vmstat -lI 1 25

System Configuration: lcpu=56 mem=65536MB

kthr  memory           page              faults        cpu         large-page
----- ---------------- ----------------- ------------- ----------- -----------
 r  b     avm      fre re pi po fr sr cy  in   sy   cs us sy id wa   alp   flp
 0  0 1415347 13044604  0  0  0  0  0  0  46  193  153 45 44  0  0     0     0
 0  0 1415350 13044601  0  0  0  0  0  0  25  411  311 77 18  0  4     0     0
 0  0 1415350 13044601  0  0  0  0  0  0  43  281  298 25 74  0  1     0     0
 0  0 1415350 13044601  0  0  0  0  0  0  30   62  262 33 66  0  3     0     0
 0  0 1415350 13044601  0  0  0  0  0  0  32  403  324 62 37  0  1     0     0
 0  0 1415350 13044601  0  0  0  0  0  0  35  193  275 40 59  0  2     0     0
 0  0 1415350 13044601  0  0  0  0  0  0  40  145  265 79 20  0  0     0     0
 0  0 1415350 13044601  0  0  0  0  0  0  28   59  262 65 34  0  1     0     0
 0  0 1415350 13044601  0  0  0  0  0  0  55   61  280 23 76  0  1     0     0
 0  0 1415350 13044601  0  0  0  0  0  0  48   63  261 99  0  0  1     0     0
........


NOTES:

- If the idle and wait ('id' and 'wa') columns have very low numbers, your system is in a high CPU utilization situation.
(The example above shows a CPU constrained system)

- Look at the largest value for avm as reported by the vmstat command. avm is reported in 4 KB pages, so multiply it by 4096 to get the number of bytes, then compare that to the number of bytes of RAM on the system. Ideally, avm should be smaller than total RAM. If not, some amount of virtual memory paging will occur; how much depends on the difference between the two values.
(The example above shows the system is not constrained on memory/paging)
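The avm comparison can be worked through with shell arithmetic. This is a rough sketch, assuming avm counts 4 KB pages and that RAM is taken in MB from the "System Configuration" line; the function names are hypothetical.

```shell
# Convert an avm value (a count of 4 KB pages) to whole megabytes.
avm_to_mb() {
    echo $(( $1 * 4 / 1024 ))
}

# Succeed (exit status 0) when avm exceeds installed RAM, i.e. paging is likely.
# $1 = largest avm value seen, $2 = RAM in MB.
avm_exceeds_ram() {
    [ "$(avm_to_mb "$1")" -gt "$2" ]
}

# Values from the example above (avm=1415350, mem=65536MB):
avm_to_mb 1415350                      # prints 5528
if avm_exceeds_ram 1415350 65536; then
    echo "memory constrained"
else
    echo "not memory constrained"      # this branch is taken
fi
```

Roughly 5.4 GB of active virtual memory against 64 GB of RAM agrees with the note that the example system is not constrained on memory.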



B. When the full core dump options are not enabled and the core dumps are uploaded, in most cases, the core dumps will be incomplete or truncated. Not setting these options will prevent the support specialist from analyzing the data and will also delay the resolution of the reported issue.

When using Java EE (J2EE) application servers such as IBM WebSphere or Oracle WebLogic, both the node agent (manager) and the application (manager) servers have to be stopped and restarted for the changes to take effect (log in again before restarting).

Examples of commands to be executed:

# chdev -l sys0 -a fullcore=true

# chuser fsize=-1 data=-1 core=-1 wasadmin

The AIX core file is generated in the current working directory of the process. Use the environment variable:

IBM_COREDIR=NEW_PATH

to specify an alternate location for the AIX (process) core dump. Likewise, use the AIX environment variable:

IBM_JAVACOREDIR=NEW_PATH

to specify an alternate location for the javacore.*.txt files.

Both the IBM_COREDIR and IBM_JAVACOREDIR variables have to be set in the environment of the process before it is started (i.e., as part of its startup procedure), so the process has to be restarted for them to take effect.
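A startup script could set both variables as sketched below. The function name set_dump_dirs, the example directory, and the class name MyApp are hypothetical; the directory standing in for NEW_PATH is assumed to exist already and to be writable by the process owner.

```shell
# Hypothetical startup-script fragment: direct core dumps and javacores to
# one directory. $1 stands in for NEW_PATH.
set_dump_dirs() {
    IBM_COREDIR=$1
    IBM_JAVACOREDIR=$1
    export IBM_COREDIR IBM_JAVACOREDIR
}

# Example, before launching the JVM from the same script:
#   set_dump_dirs /large_fs/dumps
#   java MyApp
```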

Step 2:

High Level View of Resources

If the 'vmstat -lI 1 25' command does NOT report a resource constrained system, then you can use the technote titled "IBM Java for AIX MustGather: Data collection procedure for high CPU utilization with Java applications" for a less manual and more automated data gathering process.

http://www-01.ibm.com/support/docview.wss?uid=isg3T1022749

Example:

# vmstat -lI 1 25

System Configuration: lcpu=56 mem=65536MB

kthr     memory           page              faults        cpu         large-page
-------- ---------------- ----------------- ------------- ----------- -----------
 r  b  p     avm      fre fi fo pi po fr sr  in   sy   cs us sy id wa   alp   flp
 0  0  0 1426827 13032304  0  0  0  0  0  0  38 1987 1334  0  0 99  0     0     0
 0  0  0 1426827 13032304  0  0  0  0  0  0  28  391  310  0  0 99  0     0     0
 0  0  0 1426827 13032304  0  0  0  0  0  0  25  447  337  0  0 99  0     0     0
 0  0  0 1426827 13032304  0  0  0  0  0  0  19  225  285  0  0 99  0     0     0
 0  0  0 1426827 13032303  0  0  0  0  0  0  48  346  312  0  0 99  0     0     0
 0  0  0 1426827 13032304  0  0  0  0  0  0  50  401  318  0  0 99  0     0     0
 0  0  0 1426827 13032304  0  0  0  0  0  0  49  487  436  0  0 99  0     0     0
 0  0  0 1426827 13032304  0  0  0  0  0  0  72 1752 1356  0  0 99  0     0     0
 0  0  0 1426843 13032288  0  0  0  0  0  0  57 2390  314  0  0 99  0     0     0
 0  0  0 1426843 13032288  0  0  0  0  0  0  55  116  267  0  0 99  0     0     0
........


NOTE: The example above does NOT show any resource constraints.
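Rather than reading the idle column by eye, the vmstat output can be summarized with a small awk filter. This is an illustrative sketch only: the function name avg_idle is hypothetical, and the judgment of what counts as "very low" idle time remains with the reader.

```shell
# Print the average of the 'id' (idle) column from vmstat output on stdin.
# The column position is read from the header row, so the same function
# handles output with or without the -I flag.
avg_idle() {
    awk '
        # Header row: remember which field holds "id".
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) if ($i == "id") col = i
            next
        }
        # Data rows start with a number; other lines are skipped.
        col && $1 ~ /^[0-9]+$/ { sum += $col; n++ }
        END { if (n) printf "%d\n", sum / n; else print "n/a" }
    '
}

# Example:
#   vmstat -lI 1 25 | avg_idle
```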

Step 3:

Data Gather

If the 'vmstat -lI 1 25' command DOES report a resource constrained system, then using the more automated data gathering process could make the problem worse.
The 'perfpmr' and 'pdump' scripts contain many of the commands in this document, but they also contain many other commands, such as system / kernel traces, which can degrade performance on your system even further.
In these cases, you can use the following commands to gather the minimal, yet pertinent, information needed to begin troubleshooting the JVM's issue on the resource constrained system.

A. Run the following commands in the order shown, prior to experiencing the issue.

# mkdir -p /TMP_PATH/PMR/MM-DD/data/perf
# cd /TMP_PATH/PMR/MM-DD/data/perf

# netstat -Aan > netstat.out 2>&1
# vmstat -tl 1 >> vmstat.out 2>&1 &
*Make sure to use the '&' at the end of the 'vmstat' command, as we need this command to continuously run in the background.
The 'vmstat' command is not resource intensive and has little, if any, effect on the system.
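The directory setup in step A could be scripted so that MM-DD is filled in from the system date. This is a sketch under that assumption; the function name make_perf_dir is hypothetical, and the PMR number shown is the placeholder value from the table of generic terms.

```shell
# Hypothetical helper: create the collection directory and print its path.
# $1 stands in for /TMP_PATH, $2 for PMR; MM-DD is taken from the system date.
make_perf_dir() {
    dir="$1/$2/$(date +%m-%d)/data/perf"
    mkdir -p "$dir" && echo "$dir"
}

# Example:
#   cd "$(make_perf_dir /large_fs PMR12345.b678.c000)"
```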


B. Run the following commands in the order shown, at the time you are experiencing the issue:

# tprof -skeuj -x sleep 10
# gencore JAVA_PID core.001
# kill -3 JAVA_PID
# sleep 5
# kill -3 JAVA_PID
# sleep 5
# kill -3 JAVA_PID

# prtconf > prtconf.out 2>&1
# lslpp -hac > lslpp-hac.out 2>&1
# emgr -lv3 > emgr-lv3.out 2>&1
# ipcs -saPrX > ipcs.out 2>&1
# errpt -a > errpt-a.out 2>&1
# JAVA_HOME/bin/java -version > java-version.out 2>&1

# lparstat -me > lparstat-me.out 2>&1
# lparstat -th > lparstat-h.out 2>&1
# lparstat -tH > lparstat-H.out 2>&1
# lparstat -ti > lparstat-i.out 2>&1
# mpstat -wh > mpstat-h.out 2>&1
# mpstat -ws > mpstat-s.out 2>&1
# mpstat -wd > mpstat-d.out 2>&1
# mpstat -wv > mpstat-v.out 2>&1
Note: The -v flag is available only for POWER8 processors, and later.

# ps avwwwg > ps-all.out 2>&1
# ps -Xemo THREAD > ps-THREAD.out 2>&1
# svmon -P > svmon-P.pid.out 2>&1
# svmon -G -O unit=auto,timestamp=on,pgsz=on,affinity=detail > svmon-G.out 2>&1

*Running the 'kill -3 JAVA_PID' command will result in javacore.*.txt files located in the current working directory of the process.

# cd {location of javacore.*.txt files}
# cp javacore.*.txt /TMP_PATH/PMR/MM-DD/data



Step 4:

Additional Data

Other files to place in '/TMP_PATH/PMR/MM-DD/data' for upload:

- Standard error (stderr)
- Standard output (stdout)
- SystemOut
- SystemErr
- Application logs
- GC log
- Verbose JIT log (if available)
- Any other logs generated

Step 5:

Checklist

The following files are mandatory when uploading the package of data:

- prtconf.out
- lslpp-hac.out
- emgr-lv3.out
- ipcs.out
- errpt-a.out
- java-version.out
- vmstat.out
- lparstat-d.out
- lparstat-me.out
- lparstat-h.out
- lparstat-H.out
- lparstat-i.out
- mpstat-h.out
- mpstat-s.out
- mpstat-d.out
- mpstat-v.out (only on POWER8 and later processors)
- ps-all.out
- ps-THREAD.out
- svmon-P.pid.out
- svmon-G.out
- sleep.prof
- javacore.*.txt files
- Standard error (stderr)
- Standard output (stdout)
- SystemOut
- SystemErr
- Application logs
- GC log
- Any other logs generated

NOTE: These output files are mandatory to troubleshoot the issue
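Before packaging, the command output files on the checklist can be verified with a short loop. This is a sketch only: the function name missing_outputs is hypothetical, and the list covers only the files written by the Step 3 commands. Log files and javacore files still need to be checked by hand.

```shell
# Print the name of each expected output file that is absent from directory $1.
# mpstat-v.out is left out because it applies only to POWER8 and later;
# lparstat-d.out is left out because no Step 3 command writes it.
missing_outputs() {
    for f in prtconf.out lslpp-hac.out emgr-lv3.out ipcs.out errpt-a.out \
             java-version.out vmstat.out lparstat-me.out lparstat-h.out \
             lparstat-H.out lparstat-i.out mpstat-h.out mpstat-s.out \
             mpstat-d.out ps-all.out ps-THREAD.out svmon-P.pid.out \
             svmon-G.out sleep.prof
    do
        [ -f "$1/$f" ] || echo "$f"
    done
}

# Example:
#   missing_outputs /TMP_PATH/PMR/MM-DD/data/perf
```

No output means every listed file is present.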

Step 6:

Package Data

Package all of the data that has been gathered:

# cd /TMP_PATH/PMR/MM-DD
# tar -cvf - data | gzip -c > PMR.MM-DD.tgz
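The two packaging commands can be combined into one helper, sketched here with a hypothetical function name (package_data). The body runs in a subshell so the caller's working directory is unchanged, and the -v flag is dropped to keep the output quiet.

```shell
# Hypothetical helper: archive the 'data' directory under $1 into
# $1/$2.tgz, where $2 stands in for PMR.MM-DD.
package_data() (
    cd "$1" || exit 1
    tar -cf - data | gzip -c > "$2.tgz"
)

# Example:
#   package_data /large_fs/PMR12345.b678.c000/01-31 PMR12345.b678.c000.01-31
```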

Step 7:

Upload Data

Upload the packaged data to IBM secured servers using one of the upload options provided on the "IBM Java for AIX MustGather: How to upload diagnostic data and testcases to IBM" web page:

http://www-01.ibm.com/support/docview.wss?uid=isg3T1022619

Document Type: Instruction
Content Type: Troubleshooting
Hardware: all Power
Operating System: AIX 6 | AIX 7
IBM Java: all Java Versions
Author(s): Christopher C. D. Peters
Reviewer(s): Rama Tenjarla