From: "Jeff Hammond"
To: "Eric Sun"
Cc: "NWChem Users"
Subject: Re: [NWCHEM] Questions on BlueGene system
Date: Mon, 10 Mar 2008 18:18:08 -0500
Message-ID: <96f4bb620803101618w3df9a2a1yafd5069a7360863e@mail.gmail.com>
In-Reply-To: <22f693f60803101530m349f5826hb03012e700d6db63@mail.gmail.com>

Eric,

First, the job you refer to, "CCSD(T) calculations on systems with 1376 basis functions", was most likely done without disk on MPP2 using a huge number of processors. Detailed specs on MPP2 are given here: http://mscf.emsl.pnl.gov/hardware/config_mpp2.shtml

MPP2 is a very different machine from BGL because it has 10 times as much memory per node. Since the RPI BGL has only 2048 nodes, it is unlikely that you can reproduce the biggest NWChem jobs run on MPP2 on your machine.

I wonder if you're using the optimal settings in TCE for your benzene CCSD(T)/aug-cc-pVTZ job. If you send me the input file, I can figure out what might be changed. The memory bottleneck is the two-electron integral transformation, and Karol has added very efficient methods for this to NWChem 5.1, which you should definitely be using on BGL. The most efficient algorithms are available only for "in-core" calculations, and it is worth using the additional processors it takes to use this option, since the slowdown from disking is going to be substantial.

ARMCI_NETWORK and LARGE_FILES need only be set at compile time. I don't set LIB_DEFINES when I compile on BGL.

Your best bet is to run on BGL in coprocessor mode so that you maximize the memory per process. After working with the Argonne folks, it seems that it is good to run NWChem with "-e BGLMPIO_COMM=1" as a command line option when you submit your job.

Best,

Jeff

On Mon, Mar 10, 2008 at 5:30 PM, Eric Sun wrote:
> Dear NWChem developers,
>
> I just installed NWChem 5.1 on a BlueGene system here. This system is new
> and the load is still low at this time. So I'm trying to explore its limit
> by doing some "large" calculations. But quite frustrating that even a
> CCSD(T) calculation on benzene with aug-cc-pVTZ basis set didn't go through.
> It looks like a problem of disk quota. Probably I have a quota of 250GB.
> There is no scratch space on this system. After the crash, I checked the
> temporary files, total size is about 226 GB. It's about right. In a report
> written by Bert with title "NWChem Status and Future Directions", I noticed
> that CCSD(T) calculations on systems with 1376 basis functions have been
> done by NWChem. I'm wondering how big scratch space is required for such
> calculations. What's the detailed configuration of the computer system where
> the calculations were done? Here, it should be no problem to request 2048
> CPUs for a single job. Each CPU could have 512M or 1GB RAM. I'm wondering if
> it's possible to do similar (or at least half the size) calculations as in
> the 8 H2O cluster case.
>
> Two more questions:
> 1) The environment variables ARMCI_NETWORK and LARGE_FILES have been set
> when compiling. Do I still need to set them at run-time? In my testing jobs,
> I didn't set them because I still don't know how to pass the variables by
> LoadLeveler.
> 2) In the file INSTALL about the installation on BlueGene system, the
> environment variable LIB_DEFINES is not mentioned. I'm wondering if it's
> because this variable is not necessary for BlueGene system.
>
> Thanks
> --
> Yiyang Sun
> Department of Physics, Applied Physics, and Astronomy
> Rensselaer Polytechnic Institute
> Troy, NY 12180, USA

--
Jeff Hammond
The University of Chicago
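
The in-core versus disk trade-off discussed in this thread can be put in rough numbers. The Python sketch below is a back-of-envelope estimate only, not a description of how TCE actually stores or distributes its integrals. It assumes spherical aug-cc-pVTZ function counts of 46 per carbon and 23 per hydrogen, treats one unsymmetrized N^4 array of 8-byte doubles as a crude upper bound on integral storage, and uses the node count and per-node memory figures quoted in the thread. All function names are hypothetical.

```python
# Back-of-envelope memory estimate. Every number and name here is an
# assumption for illustration; NWChem/TCE storage is not modeled.

def n_basis_benzene_aug_cc_pvtz():
    """Basis functions for C6H6, assuming 46 per C and 23 per H (spherical)."""
    return 6 * 46 + 6 * 23


def full_four_index_bytes(n_basis, bytes_per_word=8):
    """Crude upper bound: one N^4 array of doubles, no symmetry exploited."""
    return n_basis ** 4 * bytes_per_word


def aggregate_memory_bytes(nodes, mem_per_node_mb):
    """Total memory across all nodes, with one process per node."""
    return nodes * mem_per_node_mb * 1024 ** 2


n = n_basis_benzene_aug_cc_pvtz()
need = full_four_index_bytes(n)
have_512 = aggregate_memory_bytes(2048, 512)
have_1024 = aggregate_memory_bytes(2048, 1024)

print("basis functions:", n)
print("N^4 doubles: %.1f GB" % (need / 1e9))
print("2048 nodes x 512 MB: %.1f GB" % (have_512 / 1e9))
print("2048 nodes x 1 GB:   %.1f GB" % (have_1024 / 1e9))
```

Under these assumptions the unsymmetrized array comes to roughly 235 GB, while 2048 nodes at 512 MB each hold about 1 TiB in aggregate. That is only an order-of-magnitude argument for why spreading the job over more nodes can keep it in core; intermediates, amplitudes, and per-process overhead are ignored.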