
                    OPTIMIZATIONS FOR CONGO TESTS

This document describes potential optimizations used by PST tests in the Congo Release.
________________________________________________________________________________
________________________________________________________________________________

MULTICAST

Selects reliable multicast or point-to-point messaging.  Multicast should only be used at sufficient scale, since point-to-point is more efficient for a small number of nodes.
________________________________________________________________________________

SERIALIZATION

Reduce serialization overhead with DataSerializable.
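
As a rough illustration of where the savings come from, the sketch below compares the byte count of standard Java serialization with explicit field-by-field writes.  It assumes the DataSerializable callbacks have the toData(DataOutput)/fromData(DataInput) shape; the Quote class and its fields are hypothetical, and the example deliberately avoids the product jar so that it runs standalone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical value class. toData/fromData only mirror the assumed shape of
// the DataSerializable callbacks; no product interface is implemented here.
class Quote implements Serializable {
    private static final long serialVersionUID = 1L;
    String symbol;
    double price;
    int volume;

    Quote() {}

    Quote(String symbol, double price, int volume) {
        this.symbol = symbol;
        this.price = price;
        this.volume = volume;
    }

    // Writes only the field values, with no class metadata.
    void toData(DataOutput out) throws IOException {
        out.writeUTF(symbol);
        out.writeDouble(price);
        out.writeInt(volume);
    }

    // Reads the fields back in the order they were written.
    void fromData(DataInput in) throws IOException {
        symbol = in.readUTF();
        price = in.readDouble();
        volume = in.readInt();
    }
}

class SerializationSizeDemo {
    // Standard Java serialization, which also writes a class descriptor.
    static byte[] javaSerialize(Quote q) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(q);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Explicit serialization through the toData callback.
    static byte[] explicitSerialize(Quote q) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            q.toData(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Quote explicitDeserialize(byte[] data) {
        Quote q = new Quote();
        try {
            q.fromData(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return q;
    }

    public static void main(String[] args) {
        Quote q = new Quote("GEM", 42.5, 1000);
        System.out.println("java.io.Serializable bytes: " + javaSerialize(q).length);
        System.out.println("explicit field bytes:       " + explicitSerialize(q).length);
    }
}
```

The explicit form writes nothing but the field values, so the gap widens for small objects where class metadata dominates the payload.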
________________________________________________________________________________

SOCKETS conserveSockets=false

Instructs threads not to share sockets with other threads.  It can be used to improve performance in small-scale tests.  Any test setting this to false must make sure that the system does not automatically switch over to shared sockets during the test run due to heavy load, since this hurts performance more than sharing sockets from the outset.
________________________________________________________________________________

TCP p2p.nodirectBuffers=true

Instructs the TCP conduit to use heap byte buffers for NIO operations instead of direct byte buffers.  Given a direct byte buffer, the Java virtual machine will make a best effort to perform native I/O operations directly upon it.  That is, it will attempt to avoid copying the buffer's content to (or from) an intermediate buffer before (or after) each invocation of one of the underlying operating system's native I/O operations.  Direct byte buffers typically have somewhat higher allocation and de-allocation costs than non-direct buffers.  The contents of direct buffers may reside outside of the normal garbage-collected heap, and so their impact upon the memory footprint of an application might not be obvious.  It is therefore recommended that direct buffers be allocated primarily for large, long-lived buffers that are subject to the underlying system's native I/O operations. In general it is best to allocate direct buffers only when they yield a measurable gain in program performance.
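
The two buffer kinds discussed above map onto two java.nio.ByteBuffer factory methods.  The sketch below shows the distinction only; the allocate helper and its noDirectBuffers parameter are hypothetical names chosen to echo the switch, not the conduit's actual code.

```java
import java.nio.ByteBuffer;

class BufferKindDemo {
    // Hypothetical helper: true selects a heap buffer, false a direct buffer,
    // echoing the meaning of the nodirectBuffers switch.
    static ByteBuffer allocate(int capacity, boolean noDirectBuffers) {
        return noDirectBuffers
                ? ByteBuffer.allocate(capacity)        // backed by a byte[] on the Java heap
                : ByteBuffer.allocateDirect(capacity); // may reside outside the garbage-collected heap
    }

    public static void main(String[] args) {
        ByteBuffer heap = allocate(32 * 1024, true);
        ByteBuffer direct = allocate(32 * 1024, false);
        System.out.println("heap buffer:   isDirect=" + heap.isDirect());
        System.out.println("direct buffer: isDirect=" + direct.isDirect());
    }
}
```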
________________________________________________________________________________

NIO -XX:MaxDirectMemorySize=256m

Specifies the maximum amount of memory, in bytes, that the Java NIO library can allocate for direct memory buffers.  The default is 64 megabytes (64m).  Direct memory buffers can reduce the copying cost of I/O operations.

Darrel said this switch is no longer pertinent.
________________________________________________________________________________

NIO DistributionManager.enqueueOrderedMessages=false

Inlines incoming message processing.
________________________________________________________________________________

EVICTION CONTROL lru.maxSearchEntries=10

Limits the LRU eviction search to the specified number of entries.  Eviction gets faster as the number of entries searched decreases, but the semantics are no longer strictly LRU.  See LRUStatistics.greedyReturns for a count of how many evictions were carried out in a not-completely-LRU manner.
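
The trade-off can be pictured with a small bounded-search sketch.  This is an illustration under assumptions, not the product's eviction code: the class, the recentlyUsed mark, and the second-chance behavior are all hypothetical, and only the names maxSearchEntries and greedyReturns are taken from the text above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical bounded-search eviction list; not the product implementation.
class BoundedLruSketch {
    static final class Entry {
        final String key;
        boolean recentlyUsed;

        Entry(String key, boolean recentlyUsed) {
            this.key = key;
            this.recentlyUsed = recentlyUsed;
        }
    }

    private final Deque<Entry> entries = new ArrayDeque<>(); // head is the oldest entry
    private final int maxSearchEntries;
    int greedyReturns; // evictions that gave up searching for a better victim

    BoundedLruSketch(int maxSearchEntries) {
        if (maxSearchEntries < 1) {
            throw new IllegalArgumentException("maxSearchEntries must be at least 1");
        }
        this.maxSearchEntries = maxSearchEntries;
    }

    void add(String key, boolean recentlyUsed) {
        entries.addLast(new Entry(key, recentlyUsed));
    }

    // Examines at most maxSearchEntries entries starting from the head.
    // Returns the evicted key, or null when the list is empty.
    String evictOne() {
        int examined = 0;
        while (!entries.isEmpty()) {
            Entry e = entries.pollFirst();
            examined++;
            if (!e.recentlyUsed) {
                return e.key;               // a victim that was not recently used
            }
            if (examined >= maxSearchEntries) {
                greedyReturns++;            // search budget spent: evict anyway
                return e.key;
            }
            e.recentlyUsed = false;         // give the entry a second chance
            entries.addLast(e);
        }
        return null;
    }

    public static void main(String[] args) {
        BoundedLruSketch lru = new BoundedLruSketch(2);
        lru.add("a", true);
        lru.add("b", true);
        lru.add("c", false);
        System.out.println("evicted " + lru.evictOne() + ", greedyReturns=" + lru.greedyReturns);
    }
}
```

With a small search limit the sketch stops before it reaches an entry that was not recently used, which is the kind of eviction a greedy-returns counter would record.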
________________________________________________________________________________

JAVA VM -Xms, -Xmx

Try not to exceed 80% of available RAM.  Exceeding available RAM results in swapping to disk.
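
A minimal sketch of the arithmetic follows.  It assumes the 80% budget is split evenly among the VMs on one host and that -Xms is set equal to -Xmx; both are illustrative choices, and the class and method names are hypothetical.

```java
class HeapBudget {
    // Per-VM heap in megabytes when 80% of RAM is shared evenly by vmCount VMs.
    static long maxHeapMegabytes(long ramMegabytes, int vmCount) {
        if (ramMegabytes <= 0 || vmCount <= 0) {
            throw new IllegalArgumentException("RAM size and VM count must be positive");
        }
        return ramMegabytes * 80 / 100 / vmCount;
    }

    // Formats the result as VM arguments with equal initial and maximum heap.
    static String heapFlags(long heapMegabytes) {
        return "-Xms" + heapMegabytes + "m -Xmx" + heapMegabytes + "m";
    }

    public static void main(String[] args) {
        // Example: 8 GB of RAM shared by four VMs.
        System.out.println(heapFlags(maxHeapMegabytes(8192, 4)));
    }
}
```

The heap is only part of each VM's footprint, so the real total will be somewhat higher than the sum of the -Xmx values.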
________________________________________________________________________________

JAVA VM -XX:CompileThreshold=100

Shortens the HotSpot compilation warm-up period.  The default setting for the server VM is 10,000, which causes warm-up to take minutes.  The default for the client VM is only 1500, but hydra defaults to using server VMs.  Too low a value delays application startup, because HotSpot compiles just about every method it discovers into native code.
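
When a test depends on a switch like this one, it can help to confirm that the VM actually received it.  The sketch below reads the VM's input arguments through the standard RuntimeMXBean; the VmFlagCheck class and findFlag helper are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class VmFlagCheck {
    // Returns the first VM argument that starts with the prefix, or null if absent.
    static String findFlag(List<String> vmArgs, String prefix) {
        for (String arg : vmArgs) {
            if (arg.startsWith(prefix)) {
                return arg;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Arguments passed to this VM, excluding the arguments to main().
        List<String> vmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String flag = findFlag(vmArgs, "-XX:CompileThreshold=");
        System.out.println(flag != null
                ? "found " + flag
                : "CompileThreshold was not set on the command line");
    }
}
```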
________________________________________________________________________________

JAVA GC

The default garbage collector should be the first choice for garbage collection and will be adequate for the majority of applications.  Each of the other collectors has some added overhead and/or complexity, which is the price for specialized behavior.  If the application doesn't need the specialized behavior of the alternate collectors, use the default collector.
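
To see which collectors a VM is actually running, and how much work they have done, the standard management beans can be queried.  This is a small diagnostic sketch; the ActiveCollectors class name is hypothetical, and the collector names printed vary by VM version and by the GC switches in effect.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Names of the collectors registered in this VM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time since VM start.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMillis=" + gc.getCollectionTime());
        }
    }
}
```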
________________________________________________________________________________

JAVA GC -XX:+UseConcMarkSweepGC

The concurrent low pause collector.  Collects tenured generation concurrently with the application, thereby reducing the duration of pauses.  Useful for applications with large sets of long-lived data.
________________________________________________________________________________

JAVA GC -XX:+UseParNewGC

Enables multithreaded young generation collection.
________________________________________________________________________________

JAVA GC -XX:NewSize=16m and -XX:MaxNewSize=16m

Initial and maximum size of the young generation, that is, the amount of heap used for object creation.  A larger young generation yields fewer minor GCs, but also shrinks the tenured generation, increasing the frequency of major collections.  A smaller young generation has the opposite effect.
________________________________________________________________________________

JAVA GC -XX:NewRatio=1

Sets the ratio of the tenured generation size to the young generation size.  By default, the young generation size is controlled by NewRatio.  For example, setting -XX:NewRatio=3 means that the ratio between the young and tenured generation is 1:3.  In other words, the combined size of the eden and survivor spaces will be one fourth of the total heap size.
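
The ratio translates into sizes as young = heap / (NewRatio + 1).  The sketch below works through that arithmetic; the class and method names are hypothetical, and a real VM may round the results to its own alignment boundaries.

```java
class GenerationSizes {
    // Young generation (eden plus survivor spaces) in megabytes.
    static long youngMegabytes(long heapMegabytes, int newRatio) {
        if (heapMegabytes <= 0 || newRatio < 1) {
            throw new IllegalArgumentException("heap must be positive and NewRatio at least 1");
        }
        return heapMegabytes / (newRatio + 1);
    }

    // Tenured generation is whatever the young generation leaves over.
    static long tenuredMegabytes(long heapMegabytes, int newRatio) {
        return heapMegabytes - youngMegabytes(heapMegabytes, newRatio);
    }

    public static void main(String[] args) {
        long heap = 1024;
        for (int ratio : new int[] {1, 3}) {
            System.out.println("NewRatio=" + ratio
                    + " young=" + youngMegabytes(heap, ratio) + "m"
                    + " tenured=" + tenuredMegabytes(heap, ratio) + "m");
        }
    }
}
```

With a 1024m heap, NewRatio=3 gives a 256m young generation, while the NewRatio=1 setting named in this section's heading splits the heap evenly.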
________________________________________________________________________________

JAVA GC -XX:+AggressiveHeap

For large applications that are heavily threaded and run on hardware with a large amount of memory and a large number of processors, first try the aggressive heap option.  This option inspects the machine resources (size of memory and number of processors) and attempts to set various parameters to be optimal for long-running, memory allocation-intensive jobs.  It was originally intended for machines with large amounts of memory and a large number of CPUs, but in the J2SE platform, version 1.4.1 and later, it has shown itself to be useful even on four-processor machines.

With this option the throughput collector (-XX:+UseParallelGC) is used along with adaptive sizing (-XX:+UseAdaptiveSizePolicy). The physical memory on the machines must be at least 256MB before AggressiveHeap can be used. The size of the initial heap is calculated based on the size of the physical memory and attempts to make maximal use of the physical memory for the heap (i.e., the algorithms attempt to use heaps nearly as large as the total physical memory).
________________________________________________________________________________

JAVA GC -XX:+DisableExplicitGC

Turns off explicit GC, so that calls to System.gc() are ignored.
________________________________________________________________________________

JAVA GC -XX:MaxGCPauseMillis=n

Not recommended.  Does not work well.
________________________________________________________________________________
