NOTE: The features and functionality of this beta software may not match the documentation exactly. Please contact us or visit our web site if you have any questions.
This is an analysis of the data contained in the file sample.nmon. The data was collected on 11/04/2008, from 09:01:24 to 16:56:28, from partition name testsystem (partition number 1) on the 9119-595 system 'testsystem'. There were 95 data records used to produce this analysis. The operating system used to produce the nmon data was Release 5.3.0.61 of AIX. Nmon reports that 2 processors were configured and 2 were active. 15 gigabytes of memory were seen in the system configuration data.
When the data was collected, no CPU bottleneck could be detected. No memory bottleneck was seen. No significant I/O bottleneck was seen. A change to at least one tunable parameter has been recommended.
All recommendations contained in this report are based on the conditions which were present when the performance data was collected. It is possible that conditions which were not present at that time may cause some of these recommendations to result in worse performance. To minimize this risk, analyze data from several different days, implement only regularly occurring recommendations, and implement them one at a time or as groups of related parameters.
NOTE: The following 4 vmo changes should be made all at once. These are IBM's recommendations for using the improved memory management algorithm. The older algorithm (lru_file_repage = 1) is used as the default on pre-6.1 operating systems.
Change the value of the lru_file_repage parameter from 1 to 0 with the command 'vmo -o lru_file_repage=0'. The -o flag changes the value of a parameter only until the next reboot. To make the change permanent, use the command 'vmo -p -o lru_file_repage=0'. The lru_file_repage parameter is used to change the algorithms used by the LRUD (page stealing daemon).
Change the value of the maxclient% parameter from 20 to 90 with the command 'vmo -o maxclient%=90'. The -o flag changes the value of a parameter only until the next reboot. To make the change permanent, use the command 'vmo -p -o maxclient%=90'.
Change the value of the maxperm% parameter from 20 to 90 with the command 'vmo -o maxperm%=90'. The -o flag changes the value of a parameter only until the next reboot. To make the change permanent, use the command 'vmo -p -o maxperm%=90'.
Change the value of the minperm% parameter from 10 to 3 with the command 'vmo -o minperm%=3'. The -o flag changes the value of a parameter only until the next reboot. To make the change permanent, use the command 'vmo -p -o minperm%=3'.
This is the end of this set of vmo parameter changes that should be implemented together.
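The four grouped vmo changes above can be collected into one list so that they are reviewed and applied as a unit. The following is a minimal sketch, not part of NmonCheck: the helper names `tunable_commands` and `group_commands` are hypothetical, and the code only formats the command strings quoted in this report without running anything on the system.

```python
# Hypothetical helpers: they only build the command strings quoted in this
# report. Nothing here executes vmo or changes any system setting.

def tunable_commands(tool, name, value):
    """Return (until_reboot, permanent) command strings for one tunable."""
    setting = f"{name}={value}"
    return f"{tool} -o {setting}", f"{tool} -p -o {setting}"

# The four vmo changes that this report says should be made all at once.
VMO_GROUP = [
    ("lru_file_repage", 0),
    ("maxclient%", 90),
    ("maxperm%", 90),
    ("minperm%", 3),
]

def group_commands(permanent=False):
    """List the commands for the whole group, in the order given above."""
    index = 1 if permanent else 0
    return [tunable_commands("vmo", name, value)[index]
            for name, value in VMO_GROUP]

if __name__ == "__main__":
    for line in group_commands():
        print(line)
```

Calling `group_commands(permanent=True)` returns the `-p -o` forms of the same four commands, which keeps the temporary and permanent variants in step with each other.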
Change the value of maxfree from 2048 to 1216 with the command 'vmo -o maxfree=1216'. The -o flag changes the value of a parameter only until the next reboot. To make the change permanent, use the command 'vmo -p -o maxfree=1216'. This change is recommended based on formulas discussed at IBM's pSeries Technical University. The formula used to generate the maxfree recommendation was maxfree = (maxreadahead * actvcpus / mempoolsval) + minfree. The number of lcpus reported by nmon was 2. The number of memory pools seen was 1. The j2_maxPageReadAhead value used was 128. The value for minfree was 960 and the value for maxpgahead was 8. The maxreadahead value used was 128.
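The maxfree arithmetic can be checked by hand with the values listed above. The sketch below assumes that the maxreadahead term is the larger of j2_maxPageReadAhead and maxpgahead, which is consistent with the values reported (128 and 8 giving 128) but is not stated explicitly; the function name `recommended_maxfree` is hypothetical.

```python
def recommended_maxfree(maxreadahead, actvcpus, mempools, minfree):
    """maxfree = (maxreadahead * actvcpus / mempoolsval) + minfree.

    Integer division is an assumption; the values in this report divide evenly.
    """
    return (maxreadahead * actvcpus) // mempools + minfree

# Values taken from the report text.
j2_max_page_read_ahead = 128
maxpgahead = 8
lcpus = 2
memory_pools = 1
minfree = 960

# Assumption: the larger of the two read-ahead settings is used.
maxreadahead = max(j2_max_page_read_ahead, maxpgahead)

maxfree = recommended_maxfree(maxreadahead, lcpus, memory_pools, minfree)

if __name__ == "__main__":
    print(maxfree)
```

With these inputs the read-ahead term is 128 * 2 / 1 = 256, and adding minfree (960) gives the recommended value of 1216.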
Change the value of the j2_nRandomCluster parameter from 0 to 32 with the command 'ioo -o j2_nRandomCluster=32'. The -o flag changes the value of a parameter only until the next reboot. To make the change permanent, use the command 'ioo -p -o j2_nRandomCluster=32'. This recommendation will increase how far apart writes need to be from each other to be considered random. This is likely to increase the amount of data kept in RAM between writes and may improve performance. This recommendation is being made because a significant amount of random write activity was likely and this change may help to optimize that I/O.
Change the value of the j2_maxRandomWrite parameter from 0 to 128 with the command 'ioo -o j2_maxRandomWrite=128'. The -o flag changes the value of a parameter only until the next reboot. To make the change permanent, use the command 'ioo -p -o j2_maxRandomWrite=128'. This recommendation will reduce the number of pages of random writes that are allowed to collect in memory before they are flushed to disk by the write behind algorithm. This recommendation is being made because a significant amount of random write activity was likely and this change may help to optimize that I/O. The performance impact of changes to this parameter is very dependent on the application. In some cases it will hurt performance and should not be permanently implemented.
Set the sys0 attribute iostat to "true" with the command 'chdev -l sys0 -a iostat=true'. This will cause AIX to produce more I/O statistics that can be read by nmon. There is a small increase in overhead associated with this attribute but the benefit should outweigh the cost.
An average of 15.42 percent of this partition's entitled CPU capacity (%entc) was used during the monitoring period. The percentage peaked at 23.93 from 14:41:26 to 14:46:26. There were 5.50 physical processors in use when the percentage of entitled CPU capacity was at its peak.

The average number of physical processors consumed by this partition (physc) was 3.55. The peak number of physical processors consumed was 5.50 from 14:41:26 to 14:46:26.
Information in this paragraph is taken from nmon. This information may not be completely accurate on a micropartitioned system and is provided because people are used to seeing it. Average CPU utilization (User% + Sys%) was only 52.5 percent. This indicates that spare CPU capacity exists. If any performance problems were seen during the entire monitoring period, they were not caused by a lack of CPU power. CPU utilization peaked at 62.7 percent from 09:01:24 to 09:06:24. The CPU was waiting for I/O (Wait%) an average of 0.1 percent of the time.

The preceding graph shows the relationship between %entc data and the sum of User% and Sys%. The %entc data is more accurate and should be used instead of the traditional User% and Sys% metrics. The Wait% column is probably not very accurate but higher values are likely to indicate times of greater I/O activity. Because the User%, Sys%, and Wait% data is not accurate on micropartitioned systems, it has not been used to calculate the percent of time that the system was idle.
Both of the system's processors were active.
The minimum multiprogramming level (v_min_process in schedo) has been set to 2. This is a safe value for small configurations and may be low for larger configurations. This parameter is very dependent on workload and the correct value cannot be determined with nmon data. Since no significant memory bottleneck was seen, no changes are needed. More information can be found on the web by using your favorite search engine.
The average rate of page outs to the paging spaces was either zero or was too small for nmon to measure. Paging activity to the paging spaces is unlikely on a system with a memory surplus.
The recorded setting for maxpin% leaves 3270.20 megabytes of memory unpinnable. No recommendation was made because no problem was seen.

NOTE: In the above graph, the values of maxpin and %pinned were measured in different ways. To see the interaction between maxpin and the percent of memory actually pinned, use a maxpin value of 76.67 percent instead of 80 percent.
The following graph and table show the relationship between used memory, maxperm%, maxclient%, numperm, numclient, and minperm%. Because the values of maxperm% and maxclient% are the same, only one can be seen in the graph.

VMM Statistics

| Metric | Average | Range |
|---|---|---|
| Memory in use: The percentage of memory being used for either file or non-file pages | 45.5% | 44.7 - 46.7 |
| Non-file: IBM frequently calls this 'computational memory'. | 28.3% | 27.5 - 29.4 |
| numperm: Memory which holds all cached file pages (JFS, JFS2, GPFS, NFS, etc.) | 17.3% | 17.2 - 17.8 |
| numclient: Memory which holds all cached file pages except JFS. | 17.3% | 17.2 - 17.8 |

| Parameter | Value |
|---|---|
| maxperm% | 19.2 |
| maxclient% | 19.2 |
| minperm% | 9.6 |
There was a discrepancy between the values reported by vmstat -v and nmon's MEMUSE. This discrepancy exists because not all memory pages are "lruable" and nmon cannot take this into account. In the table and graph above, the values calculated by nmon are used consistently even though these values are not the ones used to formulate recommendations. This does not indicate a problem but it points out the complexity of the underlying statistics.
A recommendation was made to change the value of maxfree from 2048 to 1216. This change is recommended based on formulas discussed at IBM's pSeries Technical University. The formula used to generate the maxfree recommendation was maxfree = (maxreadahead * actvcpus / mempoolsval) + minfree. The number of lcpus reported by nmon was 2. The number of memory pools seen was 1. The j2_maxPageReadAhead value used was 128. The value for maxpgahead was 8. The maxreadahead value used was 128.
No I/O bottleneck was seen in the statistics, therefore no changes are recommended for maxpgahead. The value of minpgahead was set to 2. This is the kind of small value that typically works best in most environments.
No I/O bottleneck was seen in the nmon data, therefore no changes are recommended for j2_maxPageReadAhead. The value of j2_minPageReadAhead was set to 2. This is the kind of small value that typically works best in most environments.
The value of numclust is 1. If fast disk devices, disk arrays, or striped logical volumes are in use, the performance of disk writes could be improved by increasing this value. NmonCheck does not have access to enough information about the system's disk devices to make any specific recommendation for tuning numclust.
The value of maxrandwrt was 0. This value causes random JFS writes to stay in RAM until a sync operation.
The value of j2_nPagesPerWriteBehindCluster was 32. This value determines the number of additional pages to be kept in RAM before scheduling them for I/O when the pattern is sequential.
The value of j2_maxRandomWrite was 0. This value causes random JFS2 writes to stay in RAM until a sync operation. A change to the value of j2_maxRandomWrite has been recommended to ensure that too many writes do not accumulate and cause performance problems during a sync operation. The performance impact of changes to this parameter is very dependent on the application. In some cases it will hurt performance and should not be permanently implemented.
The value of j2_nRandomCluster was 0. This value determines how far apart writes need to be to be considered random by the JFS2 write behind algorithm. A change to the value of j2_nRandomCluster has been recommended because the values for j2_nRandomCluster and j2_maxRandomWrite should both be set to non-zero values for the change to be effective.
The average system-wide local I/O rate as measured by the DISKXFER data in nmon was 42.0 per second. This I/O rate peaked at 158.4 per second from 10:16:24 to 10:21:24. The average size of an I/O based on the r+w/s and kb/s columns was 9.8 kilobytes, or 2.4 pages. The iostat utility reports that 43.0 percent of the disk data transferred was written and the rest was read. Most of the I/O seen on this system was random. None of the filesystem activity seen on this system was JFS. Because a significant amount of this system's I/O seemed to be random, mounting filesystems with the DIO/CIO option would probably hurt performance because filesystem caching would be disabled.
A graph of total disk I/O data was not created because there was no data to graph.
The following graph shows the average/peak percent busy and average service time for 2 disks.

The -dtoo switch has been used to format disk statistics into the following table.
Please note that if RAID devices were present, %busy statistics reported for them are likely to be inaccurate and should be viewed skeptically. The presence of a RAID device is frequently invisible to the operating system and therefore invisible to this program.
The disk device hdisk1 was busy an average of 8.60 percent of the time. This indicates that the device is not a performance bottleneck. During the peak interval from 10:16:24 to 10:21:24, the disk was 41.3 percent busy. Peak disk busy statistics can be used to help understand performance problems. If performance was worst when the disk was busiest, then a performance bottleneck may be that disk. The average service time reported for this device and its accompanying disk subsystem was 5.1 milliseconds. This is indicative of a very fast disk or a disk controller with cache. Service time is the delay between the time a request was sent to a device and the time that the device signaled completion of the request. Nmon reports that the capacity of this disk was 73.40 gigabytes and it was attached by SCSI.
The disk device hdisk0 was busy an average of 9.62 percent of the time. This indicates that the device is not a performance bottleneck. During the peak interval from 10:16:24 to 10:21:24, the disk was 43.4 percent busy. The average service time reported for this device and its accompanying disk subsystem was 5.5 milliseconds. This is indicative of a very fast disk or a disk controller with cache. Nmon reports that the capacity of this disk was 73.40 gigabytes and it was attached by SCSI.
This section is designed to provide the user with a rudimentary linear capacity planning model and should be used for rough approximations only. These estimates assume that an increase in workload will affect the usage of all resources equally. These estimates should be used on days when the load is heaviest to determine approximately how much spare capacity remains at peak times.
WARNING: Data in this section may be inaccurate because the length of the average sampling interval was only 5.00 minutes. When the interval is less than 10 minutes, peak statistics are likely to underestimate the remaining amount of CPU or disk capacity.
Based on the data available, the system should be able to support approximately a 73 percent increase in workload at peak times before the first resource bottleneck affects performance or reliability, and that bottleneck is likely to be disk I/O. See the following paragraphs for additional information.

The CPU can support an increase in workload of at least 100 percent at peak times. Due to the lack of page outs or swapping activity, the amount of memory present should be able to support a significantly greater load. The busiest disk can support a workload increase of approximately 73 percent at peak times. For more information on peak resource utilization, refer to the Resource Analysis section of this report.
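The linear model behind the disk estimate can be sketched as follows. This is an illustration only: the report does not state the utilization ceiling it uses, and the 75 percent busy ceiling below is an assumption chosen because it reproduces the reported figure of roughly 73 percent for the busiest disk. The function name `headroom_percent` is hypothetical.

```python
def headroom_percent(peak_utilization, ceiling):
    """Linear model: percent workload growth before utilization hits ceiling."""
    return (ceiling / peak_utilization - 1.0) * 100.0

# Assumption: a disk is treated as a bottleneck at 75 percent busy.
# The report does not state this threshold.
DISK_BUSY_CEILING = 75.0

# Peak percent busy for each disk, taken from the report text.
peak_busy = {"hdisk0": 43.4, "hdisk1": 41.3}

# The busiest disk limits growth, so it determines the estimate.
busiest = max(peak_busy, key=peak_busy.get)
disk_headroom = headroom_percent(peak_busy[busiest], DISK_BUSY_CEILING)

if __name__ == "__main__":
    print(busiest, round(disk_headroom, 1))
```

Because the model is linear, doubling the workload is assumed to double the peak busy percentage, which is why these figures should be treated as rough approximations.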
Please note: In no event can Aptitune Corporation be held responsible for any damages, including incidental or consequent damages, in connection with or arising out of the use or inability to use this software. All trademarks belong to their respective owners.
This is beta quality software and is to be used only in conjunction with a beta test program. This software is likely to contain defects and its recommendations should be regarded skeptically. This software is provided for the exclusive use of: test. This software expires on 12/29/2009 (mm/dd/yyyy). Code version: 1.01.00. Serial number: 49493939.
(c) copyright 1995-2009 by Aptitune Corporation, Portsmouth NH, USA, All Rights Reserved. http://www.sarcheck.com
Statistics for system, testsystem

| Statistic | Value | Start of peak interval | End of peak interval | Date of peak interval |
|---|---|---|---|---|
| System ID in nmon data | 006789ABC00 | | | |
| LPAR ID | 1 | | | |
| LPAR Name | testsystem | | | |
| System model number | 9119-595 | | | |
| Statistics collected on | 11/04/2008 | | | |
| Average phys processors consumed | 3.55 | | | |
| Peak phys processors consumed | 5.50 | 14:41:26 | 14:46:26 | 11/04/2008 |
| Average entitled capacity consumed | 15.42% | | | |
| Peak entitled capacity consumed | 23.93% | 14:41:26 | 14:46:26 | 11/04/2008 |
| Average CPU utilization | 52.5% | | | |
| Peak CPU utilization | 63% | 09:01:24 | 09:06:24 | 11/04/2008 |
| Average user CPU utilization | 36.8% | | | |
| Average sys CPU utilization | 15.7% | | | |
| Average waiting for I/O | 0.1% | | | |
| Peak page replacement cycle rate | 0.00 / sec | | | |
| Average page stealer scan rate | 0.0 MB/sec | | | |
| Peak page stealer scan rate | 0.0 MB/sec | | | |
| Average % memory in use | 45.5% | | | |
| Average % non-file pages | 28.3% | | | |
| Average numperm value | 17.3% | | | |
| Average numclient value | 17.3% | | | |
| Average context switch rate | 0.00 / sec | | | |
| Disk device w/highest peak | hdisk0 | | | |
| Avg pct busy for that disk | 9.6% | | | |
| Peak pct busy for that disk | 43% | 10:16:24 | 10:21:24 | |
| Approx CPU capacity remaining | 100%+ | | | |
| Approx I/O bandwidth remaining | 72.8% | | | |
| Can memory support add'l load | Yes | | | |