Shared Memory Programming in Metacomputing Environment

As part of the Supercomputing'95 IWAY project, the Global Arrays (GA) shared-memory model was implemented in a metacomputing environment consisting of multiple MPPs connected via a WAN (Wide Area Network). The NUMA shared-memory paradigm can be used across the network of supercomputers, with global arrays distributed over multiple supercomputers but still accessed with the same asynchronous one-sided communication operations. Two versions of GA were implemented:

Metacomputing Environment

The metacomputing environment, in addition to providing potentially enormous computational power, poses some challenging problems that include:

Applications

Initial experiments with GA test programs and NWChem, a complex chemistry package that runs on top of GA, showed that the metacomputer cannot be programmed as a single homogeneous machine while neglecting the performance characteristics of the network connection.

A new GA capability, which allows the user to create and operate on global array objects mirrored on each MPP and to merge the results when required, was very successful in addressing the high latency and low bandwidth of the network.
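The mirrored approach can be illustrated with a toy model. The following is a minimal sketch in plain Python and not the GA API: the class `MirroredArray`, its methods `acc` and `merge`, and the `wan_messages` counter are hypothetical names introduced only for illustration. It assumes each site keeps a full private copy of the array, accumulates into it without any WAN traffic, and that the copies are summed only when a merge is requested.

```python
class MirroredArray:
    """Toy model of an array mirrored on several sites (hypothetical, not the GA API)."""

    def __init__(self, num_sites, length):
        # One full private replica per site, initialized to zero.
        self.replicas = [[0.0] * length for _ in range(num_sites)]
        # Counts simulated cross-site transfers (illustrative bookkeeping only).
        self.wan_messages = 0

    def acc(self, site, index, value):
        # Accumulate into the local replica only; no WAN traffic is simulated.
        self.replicas[site][index] += value

    def merge(self):
        # Element-wise sum of all replicas; every site then holds the merged result.
        total = [sum(column) for column in zip(*self.replicas)]
        self.replicas = [list(total) for _ in self.replicas]
        # Assumption: the merge costs one bulk transfer per site.
        self.wan_messages += len(self.replicas)
        return total


# Usage: two sites each contribute to all four elements, then merge once.
arr = MirroredArray(num_sites=2, length=4)
for i in range(4):
    arr.acc(0, i, 1.0)  # contribution computed at site 0
    arr.acc(1, i, 2.0)  # contribution computed at site 1
result = arr.merge()
```

In this sketch the eight accumulate calls stay local, and cross-site traffic is confined to the single merge step, which is the property the mirrored approach relies on.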

This figure shows the execution time for one iteration of the SCF calculation for the C4H10 molecule on a single Paragon and on two Paragons (at Caltech and the San Diego Supercomputer Center) connected by a WAN (30 ms latency and 70 kBytes/s bandwidth). The red bar corresponds to the fully distributed GA, which ignores the WAN level of the memory hierarchy, and the green bar corresponds to the mirrored approach.
This figure shows the performance of the SCF calculation for a larger problem (18-crown-ether) on the Intel Paragons at Caltech and in San Diego.
This figure shows the speedup for the same problem on the IBM SP1.5 at Argonne and the IBM SP-2 at Cornell connected by a WAN (150 ms latency, 50 kBytes/s bandwidth) using the mirrored approach. In addition, to simulate the effect of better WAN bandwidth, the program was executed on two partitions of the same machine connected by Ethernet (1 MB/s).
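
To put the quoted link parameters in perspective, the sketch below applies a simple linear cost model, time = latency + size / bandwidth. The model itself, the function name `transfer_time`, the 8 kB patch size, and the reading of kBytes as 1000 bytes are all assumptions made for illustration; only the latency and bandwidth figures come from the captions above.

```python
def transfer_time(num_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated seconds to move one message, using latency + size / bandwidth."""
    return latency_s + num_bytes / bandwidth_bytes_per_s


# (latency in seconds, bandwidth in bytes/s) as quoted in the captions.
# Assumption: 1 kByte = 1000 bytes.
PARAGON_WAN = (0.030, 70_000.0)  # Caltech <-> San Diego
SP_WAN = (0.150, 50_000.0)       # Argonne <-> Cornell

block = 8 * 1000  # hypothetical 8 kB array patch

paragon_one = transfer_time(block, *PARAGON_WAN)
sp_one = transfer_time(block, *SP_WAN)

# 100 separate 8-byte accesses versus one batched 800-byte transfer.
many_small = 100 * transfer_time(8, *SP_WAN)
one_batched = transfer_time(800, *SP_WAN)

print(f"8 kB over Paragon WAN: {paragon_one:.3f} s")
print(f"8 kB over SP WAN:      {sp_one:.3f} s")
print(f"100 small accesses:    {many_small:.3f} s")
print(f"one batched transfer:  {one_batched:.3f} s")
```

Under these assumptions the 100 fine-grained accesses cost roughly 90 times more than the single batched transfer, because each small message pays the full latency. This is consistent with the observation that fine-grained remote access across the WAN is expensive and that merging mirrored arrays in bulk is preferable.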

More information

The MPEG movie shows visualization of the geometry optimization based on results of these calculations.

A paper (HPDC-5 Best Paper Award) describes results of this work.

Jarek Nieplocha <j_nieplocha@pnl.gov> 12.13.96