[check] Error = error 11 encountered when sending messages to CRSD [message #617633]
Wed, 02 July 2014 06:59
juniordbanewbie
Messages: 250 Registered: April 2014
Senior Member
I've been trying to figure out why my crsd daemon has not been able to start.
grid@ORAC02:~> crsctl check cluster
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
grid@ORAC02:~>
grid@ORAC02:~> crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE orac02 Started
ora.crsd
1 ONLINE OFFLINE
ora.cssd
1 ONLINE ONLINE orac02
ora.cssdmonitor
1 ONLINE ONLINE orac02
ora.ctssd
1 ONLINE ONLINE orac02 OBSERVER
ora.diskmon
1 ONLINE ONLINE orac02
ora.evmd
1 ONLINE ONLINE orac02
ora.gipcd
1 ONLINE ONLINE orac02
ora.gpnpd
1 ONLINE ONLINE orac02
ora.mdnsd
1 ONLINE ONLINE orac02
I've followed both MOS notes, Troubleshoot Grid Infrastructure Startup Issues (Doc ID 1050908.1) and 11gR2 Clusterware and Grid Home - What You Need to Know (Doc ID 1053147.1).
I am most likely stuck at Level 1 (OHASD rootagent), since Level 3 (CRSD) is not spawned. Please correct me if I'm wrong.
When I looked at $GRID_HOME/log/orac02/agent/ohasd/orarootagent_root/orarootagent_root.log,
I found the following, which could be the problem:
2014-07-02 18:40:16.350: [ora.crsd][2944395008] [start] PID will be looked for in /u01/app/11.2.0/grid/crs/init/orac02.pid
2014-07-02 18:40:16.350: [ora.crsd][2944395008] [start] PID which will be monitored will be 11841
2014-07-02 18:40:16.350: [ora.crsd][2944395008] [start] }DaemonAgent::start
2014-07-02 18:40:16.351: [ora.crsd][2944395008] [start] clsn_agent::start }
2014-07-02 18:40:16.351: [ AGFW][2944395008] Command: start for resource: ora.crsd 1 1 completed with status: SUCCESS
2014-07-02 18:40:16.351: [ AGFW][2944395008] Executing command: check for resource: ora.crsd 1 1
2014-07-02 18:40:16.352: [ AGFW][2927609600] Agent sending reply for: RESOURCE_START[ora.crsd 1 1] ID 4098:629
2014-07-02 18:40:16.354: [ora.crsd][2944395008] [check] clsdmc_respget return: status=0, ecode=ffffff
2014-07-02 18:40:16.354: [ USRTHRD][2944395008] Thread:[DaemonCheck:crsd]start {
2014-07-02 18:40:16.354: [ USRTHRD][2944395008] Thread:[DaemonCheck:crsd]start }
2014-07-02 18:40:16.354: [ora.crsd][2944395008] [check] DaemonAgent::check returned 0
2014-07-02 18:40:16.355: [ AGFW][2944395008] check for resource: ora.crsd 1 1 completed with status: PARTIAL
2014-07-02 18:40:16.355: [ AGFW][2927609600] ora.crsd 1 1 state changed from: STARTING to: PARTIAL
2014-07-02 18:40:16.355: [ AGFW][2927609600] Started implicit monitor for:ora.crsd 1 1
2014-07-02 18:40:16.355: [ AGFW][2927609600] Agent sending last reply for: RESOURCE_START[ora.crsd 1 1] ID 4098:629
2014-07-02 18:40:19.416: [ USRTHRD][2793391872] Thread:[DaemonCheck:crsd]Thread exiting
2014-07-02 18:40:19.416: [ USRTHRD][2793391872] Thread:[DaemonCheck:crsd]Initiating a check action
2014-07-02 18:40:19.417: [ USRTHRD][2793391872] Check action requested by agent etnry point for ora.crsd
2014-07-02 18:40:19.417: [ AGFW][2927609600] Agent received the message: RESOURCE_PROBE[ora.crsd 1 1] ID 4097:81
2014-07-02 18:40:19.417: [ AGFW][2927609600] Preparing CHECK command for: ora.crsd 1 1
2014-07-02 18:40:19.417: [CLSFRAME][3030554368] TM [MultiThread] is changing desired thread # to 3. Current # is 2
2014-07-02 18:40:19.417: [ AGFW][2944395008] Executing command: check for resource: ora.crsd 1 1
2014-07-02 18:40:19.418: [ COMMCRS][2944395008]clscsendx: (0x7f4fb00453d0) Physical connection (0x7f4fb0044f30) not active

[ clsdmc][2944395008]Failed to send meta message to connection [(ADDRESS=(PROTOCOL=ipc)(KEY=orac02DBG_CRSD))][11]
2014-07-02 18:40:19.418: [ora.crsd][2944395008] [check] Error = error 11 encountered when sending messages to CRSD
2014-07-02 18:40:19.418: [ora.crsd][2944395008] [check] Calling PID check for daemon
2014-07-02 18:40:19.418: [ora.crsd][2944395008] [check] Trying to check PID = 11841
2014-07-02 18:40:19.418: [ COMMCRS][2944395008]clscsendx: (0x7f4fb00453d0) Connection not active

[ clsdmc][2944395008]Failed to send meta message to connection [(ADDRESS=(PROTOCOL=ipc)(KEY=orac02DBG_CRSD))][6]
2014-07-02 18:40:19.418: [ora.crsd][2944395008] [check] Error = error 6 encountered when sending messages to CRSD
2014-07-02 18:40:19.418: [ora.crsd][2944395008] [check] DaemonAgent::check returned 1
2014-07-02 18:40:19.418: [ AGFW][2944395008] check for resource: ora.crsd 1 1 completed with status: OFFLINE
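For anyone wanting to narrow the agent log down to just the ora.crsd status transitions and send errors, a filter along these lines might help. It is only a sketch: the function name crsd_status_lines is made up for illustration, and the patterns are taken solely from the log lines quoted above, so other log formats may need different expressions.

```shell
# Hypothetical helper (name is illustrative, not an Oracle tool):
# print only the lines of an agent log that show ora.crsd status
# results, state transitions, or "Error = error N" send failures.
crsd_status_lines() {
    log_file="$1"
    # -E enables extended regular expressions so the alternation works.
    grep -E 'ora\.crsd.*(completed with status|state changed from)|Error = error [0-9]+' "$log_file"
}
```

It would be called with the path to the log, for example crsd_status_lines orarootagent_root.log.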
I've done a PID check:
grid@ORAC02:~> cat /u01/app/11.2.0/grid/crs/init/orac02.pid
14270
grid@ORAC02:~> ps -ef | grep 14270
grid 27210 17918 0 19:40 pts/1 00:00:00 grep 14270
There is no process with PID 14270 (only the grep itself shows up). Also, if you observe, the PID in the pid file is different from the one reflected in orarootagent_root.log.
Neither could I find any crsd process with PID 11841 on the OS:
grid@ORAC02:~> ps -ef | grep 11841
grid 27913 17918 0 19:43 pts/1 00:00:00 grep 11841
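Since ps -ef | grep always matches the grep command itself, a more direct check of whether the PID recorded in the pid file is alive could look like the sketch below. The function name pid_file_status is made up for illustration; kill -0 and ps -p are standard, but the helper only reports process existence and says nothing about whether that process is actually crsd.

```shell
# Hypothetical helper (name is illustrative): report whether the PID
# stored in a pid file refers to a live process.
pid_file_status() {
    pid_file="$1"
    if [ ! -r "$pid_file" ]; then
        echo "unreadable"
        return 2
    fi
    # Strip whitespace and newlines from the file contents.
    pid=$(tr -d '[:space:]' < "$pid_file")
    case "$pid" in
        ''|*[!0-9]*) echo "invalid"; return 2 ;;
    esac
    # kill -0 sends no signal and only tests for existence; it can fail
    # with a permission error for another user's process, so fall back
    # to ps -p in that case.
    if kill -0 "$pid" 2>/dev/null || ps -p "$pid" >/dev/null 2>&1; then
        echo "alive $pid"
    else
        echo "dead $pid"
        return 1
    fi
}
```

For the case above it would be called as pid_file_status /u01/app/11.2.0/grid/crs/init/orac02.pid.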
I also did an ASM cluvfy check:
grid@ORAC02:~> cluvfy comp asm -n orac02 -verbose
Verifying ASM Integrity
Task ASM Integrity check started...
Starting check to see if ASM is running on all cluster nodes...
PRVF-5137 : Failure while checking ASM status on node "orac02"
Starting Disk Groups check to see if at least one Disk Group configured...
PRVF-5112 : An Exception occurred while checking for Disk Groups
PRVF-5114 : Disk Group check failed. No Disk Groups configured
Task ASM Integrity check failed...
Verification of ASM Integrity was unsuccessful on all the specified nodes.
The above error is something I expected, since according to 1053147.1 crsd spawns oraagent, which in turn starts the ASM resource (the ASM instance(s)).
However, what I find puzzling is that
the +ASM2 instance is not only up, but both the OCR and OCR mirror disk groups are mounted:
SYS@+ASM2>SELECT name, state FROM v$asm_diskgroup ORDER BY name;
NAME STATE
------------------------------ -----------
DATA DISMOUNTED
FRA DISMOUNTED
OCR_VOTE MOUNTED
OCR_VOTE_MIRROR MOUNTED
What I also do not understand is that, on the one hand, cluvfy says the ASM instance has an error, but on the other hand, SQL*Plus shows no error. Why the contradiction?
Most important of all, what should I execute to make sure crsd is started?
Thanks a lot!