Use this page to define health policy condition properties while
creating a new health policy. To view this administrative console page, click Operational
Policies > Health Policies > New.
Extended information about the health policy properties follows:
Age-based condition:
Maximum age |
This field sets the age value so that the policy restarts the associated
members when their age reaches that value. Acceptable values are positive
whole numbers in days or hours between 1 hour and 365 days. Decimal numbers
are not supported. To use fractions of days, convert to hours. For example,
for 1.5 days, use 36 hours. |
Reaction mode |
- Supervise: Indicates the health policies are active and recommendations
for appropriate actions are being sent to the administrator, who can accept
or decline the recommendations on the Runtime tasks page.
- Automatic: Indicates the health policies are active, and the system
is both logging data and taking action.
|
Select actions to take on health condition breach. |
Restart server: Restarts the server. For the age-based condition
policy, the action must be Restart server. |
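
The unit conversion described for the Maximum age field can be sketched as follows. This is a minimal illustration, not part of the product: the function name `age_to_hours` is hypothetical, and the bounds are taken from the range stated for the field (1 hour to 365 days).

```python
from fractions import Fraction

MIN_HOURS = 1          # lower bound stated for Maximum age: 1 hour
MAX_HOURS = 365 * 24   # upper bound stated for Maximum age: 365 days


def age_to_hours(days):
    """Convert a possibly fractional number of days to whole hours.

    Hypothetical helper. Raises ValueError if the result is not a whole
    number of hours or falls outside the 1 hour to 365 days range.
    """
    # Going through str() keeps 1.5 exact instead of a binary float.
    hours = Fraction(str(days)) * 24
    if hours.denominator != 1:
        raise ValueError(f"{days} days is not a whole number of hours")
    hours = int(hours)
    if not MIN_HOURS <= hours <= MAX_HOURS:
        raise ValueError(f"{hours} hours is outside the accepted range")
    return hours
```

For example, `age_to_hours(1.5)` gives the 36 hours mentioned for the Maximum age field.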
Excessive response time condition:
Response time |
This field is available for the excessive response time condition
health policy. The excessive response time policy restarts members when the
average response time exceeds this period of time. Acceptable
values for this field are between and including 1 millisecond and 60 minutes. |
Reaction mode |
- Supervise: Indicates the health policies are active and recommendations
for appropriate actions are being sent to the administrator, who can accept
or decline the recommendations.
- Automatic: Indicates the health policies are active, and the system
is both logging data and taking action.
|
Select actions to take on health condition breach. |
Restart server: Restarts the server. For the excessive response
time condition policy, the action must be Restart server. |
Excessive request timeout condition:
Total memory used |
The excessive memory policy restarts members when the memory usage
exceeds a percentage of your heap size over a period of time. The total memory
used percentage is used with the time over memory threshold value to determine
when to restart members. Acceptable values for this field are whole numbers
from 1 to 99. |
Reaction mode |
- Supervise: Indicates the health policies are active and recommendations
for appropriate actions are being sent to the administrator, who can accept
or decline the recommendations.
- Automatic: Indicates the health policies are active, and the system
is both logging data and taking action.
|
Select actions to take on health condition breach. |
- Take thread dumps: Takes thread dumps on IBM Java Development Kit
(JDK).
- Restart server: Restarts the server.
|
Memory condition: excessive memory:
JVM heap size |
Threshold value for the percentage of the maximum heap size used
for the Java Virtual Machine process. Acceptable values for this field are
whole numbers from 1 to 99. |
Offending time period |
The length of time that JVM heap usage must stay over the threshold
value before corrective action is taken. Acceptable values for this field
are between, and including, 1 second and 60 minutes. |
Reaction mode |
- Supervise: Indicates the health policies are active and recommendations
for appropriate actions are being sent to the administrator, who can accept
or decline the recommendations.
- Automatic: Indicates the health policies are active, and the system
is both logging data and taking action.
|
Select actions to take on health condition breach. |
Restart server: Restarts the server. For the excessive memory
condition policy, the action must be Restart server. |
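
How the JVM heap size threshold and the offending time period combine can be sketched as follows. This is an illustration under stated assumptions, not the product's logic: the function `heap_breach` and the sample format are hypothetical, "over the threshold" is assumed to mean strictly greater than, and the breach is assumed to be continuous.

```python
def heap_breach(samples, threshold_pct, period_s):
    """Return True if heap usage stays over threshold_pct for period_s seconds.

    samples: iterable of (time_in_seconds, heap_used_pct), ordered by time.
    Hypothetical sketch; the real sampling scheme is not documented here.
    """
    if not 1 <= threshold_pct <= 99:
        raise ValueError("threshold must be a whole number from 1 to 99")
    if not 1 <= period_s <= 60 * 60:
        raise ValueError("period must be between 1 second and 60 minutes")

    breach_start = None  # time of the first sample in the current breach
    for t, pct in samples:
        if pct > threshold_pct:
            if breach_start is None:
                breach_start = t
            if t - breach_start >= period_s:
                return True
        else:
            # Usage dropped back under the threshold: start over.
            breach_start = None
    return False
```

In this sketch, a single dip below the threshold resets the timer, so short spikes do not trigger a restart.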
Memory condition: memory leak:
Detection level for condition |
You can choose from the following detection levels. For each level,
there is a trade-off between the speed and accuracy of detecting suspected
memory leaks.
- Faster detection, higher probability of false alarms: A faster
detection policy detects a potential memory leak quickly, but it has a
greater chance of falsely identifying a memory leak than a slower detection
policy because it analyzes before the Java heap has expanded to its maximum
configured size.
- Standard detection, standard probability of false alarms: A standard
detection policy is more accurate than a faster one, but not as quick to identify
a potential memory leak. The standard and faster settings require the same
amount of historical data, but the standard setting analyzes after the Java
heap has expanded to its maximum configured size.
- Slower detection, lower probability of false alarms: A slower detection
policy is the most accurate, but it does not detect a potential memory
leak as quickly as the faster detection policy does. The slower setting requires
the most historical data.
|
Reaction mode |
- Supervise: Indicates the health policies are active and recommendations
for appropriate actions are being sent to the administrator, who can accept
or decline the recommendations.
- Automatic: Indicates the health policies are active, and the system
is both logging data and taking action.
|
Select actions to take on health condition breach. |
- Take JVM heap dumps on IBM Java Development Kit (JDK) only: Takes
heap dumps on IBM JDK.
- Restart server: Restarts the server.
|
Storm drain condition:
Detection level for condition |
You can choose from the following detection levels. For each level,
there is a trade-off between the speed and accuracy of detecting suspected
storm drain conditions.
- Standard detection, normal probability of false alarms: A standard
detection policy is less accurate than a slower one, but quicker to identify
a potential storm drain. This policy uses fewer samples (N=10) for both response
times and deployment workload manager weights and tries to detect a change
point in each of the metrics based on the sample set. It reaches a conclusion
faster because it waits for only 20 samples (10 for the left mean and 10 for the
right mean) to calculate a difference of means and look for a local
maximum. The samples are collected at intervals of 15 seconds, so a storm drain
can be detected within five minutes of its occurrence. Because the number
of samples is smaller, if the samples have a lot of transient peaks or dips,
there is a higher probability of false alarms.
- Slower detection, lower probability of false alarms: A slower detection
policy is the most accurate, but it does not detect a potential storm
drain as quickly as the standard detection policy does. This policy uses more
samples (N=15) for both response times and deployment workload manager weights.
It reaches a conclusion more slowly because it has to wait for 30 samples (15 for
the left mean and 15 for the right mean) to calculate a difference of means.
The detection time is seven minutes and 30 seconds. Because the number of
samples is higher, the presence of a few samples with transient peaks or dips
does not unduly affect the means, and the probability of false alarms is lower.
|
Reaction mode |
- Supervise: Indicates the health policies are active and recommendations
for appropriate actions are being sent to the administrator, who can accept
or decline the recommendations.
- Automatic: Indicates the health policies are active, and the system
is both logging data and taking action.
|
Select actions to take on health condition breach. |
Restart server: Restarts the server. For the storm drain condition
policy, the action must be Restart server. |
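
The detection times quoted for the two storm drain detection levels follow from the sample counts and the 15-second sampling interval. The sketch below reproduces that arithmetic and shows one plausible way to compute a difference of means over the left and right windows. The function names are hypothetical, and the local-maximum search and any decision thresholds are not described on this page, so they are deliberately left out.

```python
from statistics import mean

SAMPLE_INTERVAL_S = 15  # sampling interval stated for storm drain detection


def detection_time_s(n):
    """Seconds needed to collect 2*n samples (n for each mean)."""
    return 2 * n * SAMPLE_INTERVAL_S


def difference_of_means(samples, n):
    """Right-window mean minus left-window mean over the last 2*n samples.

    Hypothetical illustration of the difference-of-means idea only.
    """
    if len(samples) < 2 * n:
        raise ValueError("need at least 2*n samples")
    window = samples[-2 * n:]
    left, right = window[:n], window[n:]
    return mean(right) - mean(left)
```

With N=10, `detection_time_s(10)` is 300 seconds (five minutes); with N=15, `detection_time_s(15)` is 450 seconds (seven minutes and 30 seconds).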
Workload condition:
Total requests |
In this field you can assign a numerical request value to your workload
policy. The workload condition policy restarts members when this number of
requests is serviced. An acceptable request value must be a whole number between
1000 and 9223372036854775807. |
Reaction mode |
- Supervise: Indicates the health policies are active and recommendations
for appropriate actions are being sent to the administrator, who can accept
or decline the recommendations.
- Automatic: Indicates the health policies are active, and the system
is both logging data and taking action.
|
Select actions to take on health condition breach. |
Restart server: Restarts the server. For the workload condition
policy, the action must be Restart server. |
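
The accepted range for the Total requests field can be checked with a short sketch like the one below. The function name `is_valid_total_requests` is hypothetical; the bounds are the ones stated for the field, and the upper bound equals 2^63 - 1.

```python
MIN_TOTAL_REQUESTS = 1000
MAX_TOTAL_REQUESTS = 2**63 - 1  # 9223372036854775807


def is_valid_total_requests(value):
    """Return True if value is a whole number within the accepted range."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        return False
    return MIN_TOTAL_REQUESTS <= value <= MAX_TOTAL_REQUESTS
```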
When you complete the fields, click Next.