From owner-nwchem-users@emsl.pnl.gov Wed Oct 27 16:22:15 2004
Date: Wed, 27 Oct 2004 16:22:13 -0700
From: Edoardo Apra
Subject: Re: problems with parallel performance
To: Kirk Peterson
Cc: nwchem-users@emsl.pnl.gov
Message-id: <41802DA5.7040800@pnl.gov>

Kirk,

could you give more details about the nwchem binaries you are using?
More precisely, how were these binaries compiled
(USE_MPI=? NWCHEM_TARGET=? ARMCI_NETWORK=?)

Edo

Kirk Peterson wrote:

> Hi,
>
> we've recently gotten NWChem (4.6) running on our Athlon Linux cluster
> (Red Hat, dual processor nodes), which uses myrinet for MPI
> communication. We observe two problems, which seem to be unrelated.
>
> The first is that when we run 2-way parallel across nodes, i.e., 1
> processor on each node, nwchem starts up 2 processes on each node for a
> total of 4 processes. Each of these seems to accumulate essentially
> the same amounts of CPU cycles. If I run a job 2-way on 1 node,
> everything looks normal. I seem to remember a similar problem when I
> linked up Molpro to version 3.3.1 of the global array library.
>
> The 2nd problem is in regards to parallel performance.
> I've only really tested this for one of the TCE coupled cluster test
> runs (H2O), but in general the parallel runs are much slower than a
> single processor run, e.g., 31 sec for 1 proc, 78 sec for 2 procs on
> 1 node, and 100 sec for 2 procs over 2 nodes (with the problems noted
> above). I see similar behavior with just a simple MP2 calculation
> (Au-H2O), 9 sec for 1 proc and 13 sec for 2 procs. Certainly I would
> like to test this for more nodes, but not until the issue above is
> resolved.
>
> thanks in advance,
>
> Kirk
>
> PS - is it really true that i functions are not supported yet?
>

--
Edoardo Apra` - PNNL - P.O. Box 999, MS K8-91 - Richland, WA 99352
Tel +1-509-376-1280 Fax +1-509-376-0420