Date: Tue, 02 May 2006 12:21:37 +0530
From: Siv Chand Koripella
Subject: [NWCHEM] Prolem with MPI
To: nwchem-users@emsl.pnl.gov

I have tried using nwchem with lam-mpi.

OS: Rocks 4.0
CC: gcc
F77: g77

I compiled the code as per directions. When I submitted a job using
    mpirun -v -np 4 /export/apps/nwchem-mpi-4.7/bin/LINUX/nwchem input.inp

the output is:

22514 /export/apps/nwchem-mpi-4.7/bin/LINUX/nwchem running on n0 (o)
11129 /export/apps/nwchem-mpi-4.7/bin/LINUX/nwchem running on n1
11230 /export/apps/nwchem-mpi-4.7/bin/LINUX/nwchem running on n2
11354 /export/apps/nwchem-mpi-4.7/bin/LINUX/nwchem running on n3
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------

So I submitted the job using lamexec with the same options. The output is:

22626 /export/apps/nwchem-mpi-4.7/bin/LINUX/nwchem running on n0 (o)
11198 /export/apps/nwchem-mpi-4.7/bin/LINUX/nwchem running on n1
11301 /export/apps/nwchem-mpi-4.7/bin/LINUX/nwchem running on n2
11426 /export/apps/nwchem-mpi-4.7/bin/LINUX/nwchem running on n3
22626 (n0) exited due to signal 11

I understand that signal 11 is a segmentation fault. Can anybody help me with this?

-- 
Siv Chand Koripella,
Quot Homines Tot Sententiae - Terence