Optimizing the HPCC Randomaccess Benchmark on Blue Gene/L Supercomputer

Rahul Garg, IBM India Research Lab, Block-I, IIT Delhi, Hauz Khas-16, New Delhi, India (grahul@in.ibm.com)
Yogish Sabharwal, IBM India Research Lab, Block-I, IIT Delhi, Hauz Khas-16, New Delhi, India (ysabharwal@in.ibm.com)

Categories and Subject Descriptors: C.4 [Performance of Systems]: Measurement techniques
General Terms: Measurement, Performance.
Keywords: High Performance Computing, Benchmarks, Randomaccess

(say a_i), the most significant bits are selected to index into the distributed table T. The entry (which may reside on a remote node) is XORed with the random number a_i. The performance of the system is measured by the number of giga updates per second (GUPS) performed by the system.

Two types of performance numbers are reported. A baseline run is obtained by compiling and running the supplied code without making any changes (except for linking certain optimized libraries such as ESSL). In an optimized run, the function that implements the benchmark may be replaced by an optimized, system-specific implementation. This implementation must (a) use the same random number generator as in the baseline code; (b) ensure that at any stage, the number of pending updates stored at any node does not
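The update rule described above (take a random number a_i, use its most significant bits to pick an entry of T, and XOR that entry with a_i) can be sketched on a single node as follows. This is a minimal illustration, not the benchmark code: the function names, the shift-and-XOR generator with the constant POLY, the initialization T[i] = i, and the seed are all assumptions made for the example, and the distribution of T across nodes is omitted.

```python
MASK64 = (1 << 64) - 1
POLY = 0x7  # assumption: feedback constant for the illustrative generator


def next_random(a):
    """Advance a 64-bit shift-and-XOR generator by one step (illustrative)."""
    high_bit = a >> 63
    return ((a << 1) & MASK64) ^ (POLY if high_bit else 0)


def apply_updates(table, log2_size, num_updates, seed=1):
    """XOR num_updates random numbers into table, in place.

    The top log2_size bits of each random number select the entry,
    following the description in the text. table must have
    2**log2_size entries.
    """
    a = seed
    for _ in range(num_updates):
        a = next_random(a)
        index = a >> (64 - log2_size)  # most significant bits
        table[index] ^= a
    return table


def gups(num_updates, seconds):
    """Giga updates per second for a timed run."""
    return num_updates / seconds / 1e9


if __name__ == "__main__":
    log2_size = 4
    table = list(range(1 << log2_size))  # assumption: T[i] = i initially
    apply_updates(table, log2_size, 1000)
    # XOR is its own inverse, so replaying the same stream restores the table.
    apply_updates(table, log2_size, 1000)
    print(table == list(range(1 << log2_size)))
```

Because XOR is its own inverse, replaying the same update stream a second time returns the table to its initial state, which gives a simple way to check a run for lost or duplicated updates.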