From owner-nwchem-users@emsl.pnl.gov Wed Mar 9 16:24:57 2005
Received: from odyssey.emsl.pnl.gov (localhost [127.0.0.1])
    by odyssey.emsl.pnl.gov (8.12.10/8.12.10) with ESMTP id j2A0Ovi0023058
    for ; Wed, 9 Mar 2005 16:24:57 -0800 (PST)
Received: (from majordom@localhost)
    by odyssey.emsl.pnl.gov (8.12.10/8.12.10/Submit) id j2A0OvCF023057
    for nwchem-users-outgoing; Wed, 9 Mar 2005 16:24:57 -0800 (PST)
Date: Wed, 09 Mar 2005 17:24:44 -0700
From: Waldemar Lysz
Subject: NWCHEM on multiprocessor SGI Origin
To: NWChem list
Message-id: <422F93CC.7BDF6AEB@ualberta.ca>
Organization: University of Alberta
MIME-version: 1.0
X-Mailer: Mozilla 4.8C-SGI [en] (X11; U; IRIX 6.5 IP32)
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7bit
X-Accept-Language: en, pl, sv
References: <1110405127.3155.21.camel@paul-adamsons-computer.local>
Sender: owner-nwchem-users@emsl.pnl.gov
Precedence: bulk

Hello,

I've been trying to run NWChem on a 256-CPU SGI Origin3000. I am able to run
small, single-processor test jobs, but when I try to run a 128-processor /
128 MB job, it crashes with the following cryptic error printed to the output
file. Could you suggest a possible solution? Thanks.

126:126:nga_put:cannot locate region: [1199871849:1199976622 ,1199871849:1199976622 ]:: -999
126: ARMCI aborting -999 (0xfffffffffffffc19).
96:SigIntHandler: interrupt signal was caught: 2
96: ARMCI aborting 2 (0x2).
123:SigIntHandler: interrupt signal was caught: 2
94:SigIntHandler: interrupt signal was caught: 2
98:SigIntHandler: interrupt signal was caught: 2
58:SigIntHandler: interrupt signal was caught: 2
48:SigIntHandler: interrupt signal was caught: 2
123: ARMCI aborting 2 (0x2).
94: ARMCI aborting 2 (0x2).
98: ARMCI aborting 2 (0x2).
58: ARMCI aborting 2 (0x2).
(more like this)

Can you help me sort this out?

Thanks,

PS. Here is the input file I used:

START phenol
TITLE "RHF-CISD(T)/aug-cc-pVDZ Phenol"
CHARGE 0
GEOMETRY
C -0.48746044 1.1296317376 0.
C -0.9223291325 -0.2003564861 0.
C 0.011610309 -1.2451721503 0.
C 1.3758284802 -0.9514041974 0.
C 1.8222228007 0.3774494266 0.
C 0.8841213298 1.4125410552 0.
H -1.2150635021 1.9397189938 0.
H -0.3496468217 -2.269193065 0.
H 2.0954780233 -1.7662430343 0.
H 2.8852721415 0.5997974606 0.
H 1.2134383547 2.4485473551 0.
H -2.7633989653 0.263700623 0.
O -2.2462549142 -0.5440580809 0.
END
BASIS
* library aug-cc-pVDZ
END
SCF
RHF
Singlet
THRESH 1.0e-10
TOL2E 1.0e-10
END
TCE
CCSD(T)
END
TASK TCE ENERGY
____________________________

126:126:nga_put:cannot locate region: [1199871849:1199976622 ,1199871849:1199976622 ]:: -999
Last System Error Message from Task 126:: No such file or directory
126: ARMCI aborting -999 (0xfffffffffffffc19).
system error message: No such file or directory
96:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 96:: No such file or directory
96: ARMCI aborting 2 (0x2).
system error message: Invalid argument
123:SigIntHandler: interrupt signal was caught: 2
94:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 123:: No such file or directory
98:SigIntHandler: interrupt signal was caught: 2
58:SigIntHandler: interrupt signal was caught: 2
123: ARMLast System Error Message from Task 94:CI aborting: 2 (0x2)No such file or directory.
system error message: Invalid argumentLast System Error Message from Task 98: : Last System Error Message from Task 58:No such file or directory:
94: ARMNo such file or directoryCI aborting 2 (0x2).
system error message: Invalid argument
98: ARMCI aborting 2 (0x2).
system error message: 58: ARMInvalid argumentCI aborting 2 (0x2).
system error message: Invalid argument
117:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 117:: No such file or directory
117: ARMCI aborting 2 (0x2).
system error message: Invalid argument
102:SigIntHandler: interrupt signal was caught: 2
99:SigIntHandler: interrupt signal was caught: 2
124:SigIntHandler: interrupt signal was caught: 2
44:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 124:: No such file or directory
35:SigIntHandler: interrupt signal was caught: 2
124: ARMLast System Error Message from Task 44:CI aborting: 55:SigIn120:SigI 2 (0x2)No such file or directoryLast System Error Message from Task 35:ntHandler: interrupt signal was caught. : : 2
system error message: Invalid argument
53:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 53:: No such file or directory
53: ARMCI aborting 2 (0x2).
system error message: Invalid argument
49:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 49:: No such file or directory
49: ARMCI aborting 2 (0x2).
system error message: Invalid argument
108:SigIntHandler: interrupt signal was caught: 2
80:SigIntHandler: interrupt signal was caught: 2
107:SigIntHandler: interrupt signal was caught: 2
76:SigIntHandler: interrupt signal was caught: 2
71:SigIntHandler: interrupt signal was caught: 2
50:SigIntHandler: interrupt signal was caught: 2
41:SigInLast System Error Message from Task 108:tHandler: interrupt signal was caught: Last System Error Message from Task 80:85:SigIn: 2 51:SigInNo such file or directoryLast System Error Message from Task 107:: tHandler: interrupt signal was caughtLast System Error Message from Task 76:tHandler: interrupt signal was caught Last System Error Message from Task 71:: No such file or directoryLast System Error Message from Task 50:: 2 : : 2 : No such file or directoryLast System Error Message from Task 41:: No such file or directory108: ARMNo such file or directory : No such file or directory CI aborting 80: ARMNo such file or directory Last System Error Message from Task 51:118:SigI 2 (0x2)CI aborting Last System Error Message from Task 85::
76: ARM. ntHandler: interrupt signal was caught107: ARM 2 (0x2) 71: ARM 50: ARMNo such file or directory: CI abortingsystem error message: 2 CI aborting. 125:SigICI aborting 41: ARMCI abortingNo such file or directory 2 (0x2): system error messagentHandler: interrupt signal was caught 2 (0x2)CI aborting 2 (0x2). Last System Error Message from Task 118: 2 (0x2)Invalid argument: : 2 . 2 (0x2) 51: ARM . system error message: . Invalid argumentsystem error message. system error message: No such file or directorysystem error message 85: ARMCI aborting : 10:SigInsystem error message: Invalid argumentLast System Error Message from Task 125: : 2 (0x2)CI abortingInvalid argumenttHandler: interrupt signal was caught: Invalid argument : 2 (0x2). : 2 Invalid argument118: ARMInvalid argument system error message. No such file or directory CI aborting system error message: 2 (0x2)Invalid argument: . Invalid argument Last System Error Message from Task 10:system error message: : Invalid argument
92:SigIntHandler: interrupt signal was caught : 2
24:SigIntHandler: interrupt signal was caught: 2
111:SigIntHandler: interrupt signal was caught: 2
37:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 111:: No such file or directory
125: ARMCI aborting 2 (0x2).
system error message: Invalid argument
111: ARMCI aborting 2 (0x2).
system error message: Invalid argument
115:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 115:: No such file or directory
115: ARMCI aborting 2 (0x2).
system error message: Invalid argument
77:SigIntHandler: interrupt signal was caught: 2
68:SigIntHandler: interrupt signal was caught: 2
18:SigIntHandler: interrupt signal was caught: 2
28:SigIntHandler: interrupt signal was caught: 2
112:SigIntHandler: interrupt signal was caught: 2
104:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 112:Last System Error Message from Task 104:: : No such file or directoryNo such file or directory 14:SigIntHandler: interrupt signal was caught17:SigIn: 2 tHandler: interrupt signal was caught20:SigIn: 2 112: ARMtHandler: interrupt signal was caught104: ARMCI aborting73:SigIn: 2 CI aborting 2 (0x2)tHandler: interrupt signal was caught 2 (0x2). : 2 . Last System Error Message from Task 14:system error messagesystem error message: 90:SigIn: Last System Error Message from Task 17:: No such file or directorytHandler: interrupt signal was caughtInvalid argument: Invalid argumentLast System Error Message from Task 20: : 2 Last System Error Message from Task 73: No such file or directory : : 14: ARMNo such file or directoryNo such file or directoryCI aborting 45:SigIn 2 (0x2) 17: ARMtHandler: interrupt signal was caught. CI aborting: 2 20: ARMsystem error message 73: ARM 2 (0x2)CI aborting: CI aborting. 2 (0x2)Invalid argument 2 (0x2)system error message. . : system error messagesystem error messageInvalid argument: : Invalid argumentInvalid argument
59:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 59:: No such file or directory
97:SigIntHandler: interrupt signal was caught: 2
7:SigInt 59: ARMHandler: interrupt signal was caughtCI aborting: 2 2 (0x2).
system error message: 69:SigInLast System Error Message from Task 97:Invalid argumenttHandler: interrupt signal was caught: : 2 No such file or directory 78:SigInLast System Error Message from Task 7:tHandler: interrupt signal was caught: : 2 97: ARMNo such file or directoryCI aborting 2 (0x2).
system error messageLast System Error Message from Task 69:70:SigIn: : tHandler: interrupt signal was caughtInvalid argument 7: ARMNo such file or directory: 2 Last System Error Message from Task 78: CI aborting : 2 (0x2)No such file or directory61:SigIn. tHandler: interrupt signal was caughtsystem error message 69: ARM: 2 Last System Error Message from Task 70:: CI abortingInvalid argument 2 (0x2) . system error message93:SigIn: tHandler: interrupt signal was caughtInvalid argument: 2
Last System Error Message from Task 93:: No such file or directory
82:SigIntHandler: interrupt signal was caught: 2
93: ARMCI aborting 2 (0x2).
system error message: Invalid argument
87:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 87:: No such file or directory
87: ARMCI aborting 2 (0x2).
system error message: Invalid argument
88:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 88:: No such file or directory
88: ARMCI aborting 2 (0x2).
system error message: Invalid argument
30:SigIntHandler: interrupt signal was caught: 2
2:SigIntHandler: interrupt signal was caught: 2
54:SigIntHandler: interrupt signal was caught113:SigI: 2 ntHandler: interrupt signal was caught: 2
Last System Error Message from Task 113:: No such file or directory
Last System Error Message from Task 54:: No such file or directory
113: ARMCI aborting 2 (0x2).
system error message: Invalid argument
54: ARMCI aborting 2 (0x2).
system error message: Invalid argument
0:Child process terminated prematurely, status=: 256
Last System Error Message from Task 0:109:SigI: ntHandler: interrupt signal was caughtNo such file or directory66:SigIn: 2 tHandler: interrupt signal was caught: 2 Last System Error Message from Task 109: 0: ARM: CI abortingNo such file or directory 256 (0x Last System Error Message from Task 66:100: ).
No such file or directorysystem error message43:SigIn 109: ARM: tHandler: interrupt signal was caughtCI abortingInvalid argument: 2 2 (0x2) 66: ARM. CI abortingsystem error message 2 (0x2): . Invalid argumentsystem error messageLast System Error Message from Task 43: : : Invalid argument34:SigInNo such file or directory tHandler: interrupt signal was caught : 2
116:SigIntHandler: interrupt signal was caught: 2
43: ARMCI aborting 2 (0x2).
system error message: Invalid argument
Last System Error Message from Task 116:: No such file or directory
116: ARMCI aborting 2 (0x2).
system error message: Invalid argument
95:SigIn110:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 110:84:SigIn: tHandler: interrupt signal was caughtNo such file or directory16:SigIn: 2 tHandler: interrupt signal was caught: 2
110: ARMCI aborting 2 (0x2).
system error message: Invalid argument
Last System Error Message from Task 84:: No such file or directory
Last System Error Message from Task 16:: No such file or directory
84: ARMCI aborting 2 (0x2).
system error message: Invalid argument
16: ARMCI aborting 2 (0x2).
system error message: Invalid argument
74:SigIn114:SigItHandler: interrupt signal was caughtntHandler: interrupt signal was caught: 2 : 2
Last System Error Message from Task 114:: No such file or directory
122:SigIntHandler: interrupt signal was caught: 2
114: ARMLast System Error Message from Task 74:CI aborting: 2 (0x2)No such file or directory.
system error messageLast System Error Message from Task 122:: : Invalid argumentNo such file or directory
74: ARMCI aborting 2 (0x2).
system error message: Invalid argument
122: ARMCI aborting 2 (0x2).
system error message: Invalid argument
13:SigIntHandler: interrupt signal was caught: 2
89:SigIntHandler: interrupt signal was caught: 2
32:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 89:: No such file or directory
89: ARMCI aborting 2 (0x2).
Last System Error Message from Task 32:system error message: 29:SigIn: No such file or directorytHandler: interrupt signal was caughtInvalid argument : 2
32: ARMCI aborting 2 (0x2).
system error message: Invalid argument
36:SigIn42:SigIntHandler: interrupt signal was caught: 2
121:SigIntHandler: interrupt signal was caught: 2
105:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 42:: No such file or directoryLast System Error Message from Task 121: : No such file or directory
Last System Error Message from Task 105:: No such file or directory
42: ARMCI aborting 2 (0x2)121: ARM. CI aborting105: ARMsystem error message 2 (0x2)CI aborting: . 2 (0x2)127:SigIInvalid argumentsystem error message. ntHandler: interrupt signal was caught 119:SigI: system error message: 2 ntHandler: interrupt signal was caughtInvalid argument: : 2 Invalid argument
101:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 127:: No such file or directory
12:SigIntHandler: interrupt signal was caught: 2
127: ARMCI aborting 2 (0x2).
system error message: Invalid argument
Last System Error Message from Task 101:1:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 12:: No such file or directory
103:SigIntHandler: interrupt signal was caught: 2
12: ARMCI aborting 2 (0x2).
system error messageLast System Error Message from Task 1:: : Invalid argumentNo such file or directory
Last System Error Message from Task 103:: No such file or directory
103: ARMCI aborting 2 (0x2).
system error message: Invalid argument
62:SigIntHandler: interrupt signal was caught: 2
5:SigIntHandler: interrupt signal was caught: 2
106:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 106:: No such file or directory
23:SigIntHandler: interrupt signal was caught: 2
106: ARMCI aborting 2 (0x2).
system error message: Invalid argument
75:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 23:: No such file or directory
9:SigIntHandler: interrupt signal was caught: 2
23: ARMCI aborting 2 (0x2).
system error message: Invalid argument
39:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 39:: No such file or directory
39: ARMCI aborting 2 (0x2).
system error message: Invalid argument
67:SigIntHandler: interrupt signal was caught: 2
64:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 67:: No such file or directory
Last System Error Message from Task 64:: No such file or directory
83:SigIntHandler: interrupt signal was caught: 2
67: ARMCI aborting 2 (0x2).
64: ARMsystem error messageCI aborting: 2 (0x2)Invalid argument.
Last System Error Message from Task 83:system error message: : No such file or directoryInvalid argument
15:SigIn 83: ARMtHandler: interrupt signal was caughtCI aborting 2 (0x2).
81:SigInsystem error messagetHandler: interrupt signal was caught: : 2 Invalid argument
38:SigIntHandler: interrupt signal was caught: 2
60:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 60:: No such file or directory
60: ARMCI aborting 2 (0x2).
system error message: Invalid argument
57:SigIntHandler: interrupt signal was caught: 2
40:SigIntHandler: interrupt signal was caught: 2
4:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 57:: Last System Error Message from Task 40:No such file or directory: No such file or directoryLast System Error Message from Task 4: : No such file or directory
72:SigIn 57: ARMtHandler: interrupt signal was caughtCI aborting 40: ARM: 2 4: ARM 2 (0x2)CI abortingCI aborting. 2 (0x2) 2 (0x2)system error message. .
: system error messagesystem error messageInvalid argument: Last System Error Message from Task 72:: Invalid argument: Invalid argument No such file or directory
72: ARMCI aborting 2 (0x2).
system error message: Invalid argument
21:SigIntHandler: interrupt signal was caught: 2
19:SigIntHandler: interrupt signal was caught: 2
3:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 21:: No such file or directory
Last System Error Message from Task 19:: No such file or directoryLast System Error Message from Task 3: : No such file or directory
21: ARMCI aborting 2 (0x2).
system error message: Invalid argument
19: ARMCI aborting 2 (0x2).
3: ARMsystem error messageCI aborting: 2 (0x2)Invalid argument.
system error message: Invalid argument
79:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 79:: No such file or directory
79: ARMCI aborting 2 (0x2).
system error message: Invalid argument
86:SigIntHandler: interrupt signal was caught: 2
65:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 86:: No such file or directory
86: ARMCI aborting 2 (0x2).
system error message: Invalid argument
Last System Error Message from Task 65:: No such file or directory
33:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 33:: No such file or directory
33: ARMCI aborting 2 (0x2).
system error message: Invalid argument
52:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 52:: No such file or directory
25:SigIn 52: ARMtHandler: interrupt signal was caughtCI aborting: 2 2 (0x2).
system error message: Invalid argument
27:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 27:: No such file or directory
27: ARMCI aborting 2 (0x2).
system error message: Invalid argument
128: interrupt(1)
WaitAll: No children or error in wait?

--
Waldemar D. Lysz                    | Computing & Network Services
wdl@ualberta.ca                     | Research Computing Support
Tel.780 492-9306 Fax.780 492-1729   | University of Alberta
http://www.ualberta.ca/CNS/RESEARCH | Edmonton, Alberta, CANADA
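
In the log above, task 126 aborts with code -999 after the nga_put message, while the other tasks shown abort with code 2 after catching an interrupt signal. A minimal sketch, assuming only the "<task>: ARMCI aborting <code> (0x...)" message format that appears in that log, of how one might separate the two kinds of abort when triaging similar output. The function name summarize_aborts is hypothetical, and messages garbled by interleaved output from several tasks are simply skipped, so the result may be incomplete. It only sorts messages; it does not diagnose the cause of the failure.

```python
import re

# Matches "<task>: ARMCI aborting <code> (0x" as printed in the log above.
# Messages garbled by interleaved output from several tasks will not match
# and are ignored by this sketch.
ABORT_RE = re.compile(r"(\d+):\s*ARMCI aborting\s+(-?\d+)\s*\(0x")


def summarize_aborts(log_text, signal_code=2):
    """Split ARMCI abort messages into two groups.

    Returns (other_aborts, signal_tasks):
      other_aborts -- list of (task, code) whose code differs from signal_code,
                      in order of first appearance
      signal_tasks -- sorted list of tasks that aborted with signal_code
    """
    other_aborts = []
    signal_tasks = set()
    for match in ABORT_RE.finditer(log_text):
        task, code = int(match.group(1)), int(match.group(2))
        if code == signal_code:
            signal_tasks.add(task)
        elif (task, code) not in other_aborts:
            other_aborts.append((task, code))
    return other_aborts, sorted(signal_tasks)


if __name__ == "__main__":
    # A few messages copied from the log above.
    sample = (
        "126: ARMCI aborting -999 (0xfffffffffffffc19).\n"
        "96:SigIntHandler: interrupt signal was caught: 2\n"
        "96: ARMCI aborting 2 (0x2).\n"
        "123: ARMCI aborting 2 (0x2).\n"
    )
    other_aborts, signal_tasks = summarize_aborts(sample)
    print("aborts with a non-signal code:", other_aborts)
    print("tasks stopped by signal 2:", signal_tasks)
```

Run against the full output file, the first list would point at the tasks whose messages are worth reading first; the second only counts tasks that were interrupted.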