I’m very excited to announce that I have tested MySQL 5.1.47 clustered as a service/application on Oracle Clusterware 11.2.0.1, using the new ACFS filesystem as a shared filesystem between 2 nodes. The trick is to create two virtual resources, mysql.rg and mysql.res, to enclose the dependencies of all the resources, like this (→ means "depends on"):
mysql.rg → mysql.db → acfs → vip → mysql.res
It works very well, performance is good on a SAN, and switching is very fast!
Recommended.
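As a rough illustration of the chain above, the sketch below prints the hard start dependency each resource would carry. The names "acfs" and "vip" are the shorthand used in the comment, not real resource names, and the `start_deps` helper is hypothetical; only the `START_DEPENDENCIES=hard(...)` attribute form is taken from the resource profile shown later in this thread.

```shell
#!/bin/sh
# Dependency chain from the comment; "A B" means A depends on B.
# "acfs" and "vip" are placeholders, not real Clusterware resource names.
chain="mysql.rg mysql.db acfs vip mysql.res"

# For each resource, print the START_DEPENDENCIES value that would encode
# a hard dependency on the next resource in the chain.
start_deps() {
    prev=""
    for res in "$@"; do
        if [ -n "$prev" ]; then
            echo "$prev: START_DEPENDENCIES=hard($res)"
        fi
        prev=$res
    done
    # The last resource in the chain depends on nothing, so it starts first.
    echo "$prev: no start dependency in this chain"
}

start_deps $chain
```

Read bottom-up, the output gives the start order: the last resource in the chain comes up first and mysql.rg last.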





Matteo,
I’m trying to cluster a third-party application on Oracle Clusterware 11.2.0.1, but it doesn’t work. The VIP resource works well, but when I try to start the resource (the third-party application) I get this result:
[root@jb00 bin]# ./crsctl start resource apptest
CRS-2672: Attempting to start 'apptest' on 'jb00'
CRS-2674: Start of 'apptest' on 'jb00' failed
CRS-2679: Attempting to clean 'apptest' on 'jb00'
CRS-2678: 'apptest' on 'jb00' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-4000: Command Start failed, or completed with errors.
Reading crsd.log, it seems Clusterware does not access the action script correctly. I get the same error on the AIX 6.1 system (the production environment) and on the Red Hat 5.3 system (the lab environment).
Could you please help me by sending your configuration for the resource, so I can find where the error in my configuration is?
Thanks in advance.
Marco
Hello,
you have to prepare the action script in Perl first, and then create the profile with the dependencies.
Could you post your Perl scripts? I have no problems with Apache, MySQL 5.1.47 and PostgreSQL 8.4.4.
Mat
Hello Mat,
thanks for your help.
Here is the shell script I use as the action script:
[root@jb00 TEST]# cat test.scr
echo "----------------------------------------------" 1>> /tmp/TEST/test.log
date 1>> /tmp/TEST/test.log
echo "Entro nello script" 1>> /tmp/TEST/test.log
case $1 in
'start')
echo "sono qui prima dopo start" 1>> /tmp/TEST/test.log
exit 0
;;
'stop')
echo "sono qui prima dopo stop " 1>> /tmp/TEST/test.log
exit 0
;;
'check')
echo "sono qui prima dopo checkk" 1>> /tmp/TEST/test.log
exit 0
;;
'clean')
echo "sono qui prima dopo clean" 1>> /tmp/TEST/test.log
exit 0
;;
*)
echo "il parametro "$1" non è ammesso" 1>> /tmp/TEST/test.log
;;
esac
[root@jb00 TEST]# ls -l
total 12
-rwxrwxrwx 1 root root 557 Jun 21 00:29 test.scr
If I try to start the resource:
[root@jb00 bin]# ./crsctl start resource apptest
CRS-2679: Attempting to clean 'apptest' on 'jb00'
CRS-2680: Clean of 'apptest' on 'jb00' failed
CRS-4000: Command Start failed, or completed with errors.
[root@jb00 bin]#
I have this in the crsd.log
2010-06-24 11:20:06.116: [UiServer][2572053392] Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-2679: Attempting to clean 'apptest' on 'jb00']
MSGTYPE:
TextMessage[3]
OBJID:
TextMessage[apptest]
WAIT:
TextMessage[0]
]
2010-06-24 11:20:06.121: [ AGFW][2586762128] Agfw Proxy Server received the message: RESOURCE_CLEAN[apptest 1 1] ID 4100:2699
2010-06-24 11:20:06.121: [ AGFW][2586762128] Starting the agent: /u01/app/grid/bin/scriptagent with user id: root and incarnation:3
2010-06-24 11:20:06.399: [ AGFW][2586762128] Starting the HB [Interval = 30000, misscount = 6kill allowed=1] for agent: /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:06.400: [ AGFW][2586762128] Could not forward message [RESOURCE_CLEAN[apptest 1 1] ID 4100:2699] to agent. /u01/app/grid/bin/scriptagent_root is not running
2010-06-24 11:20:06.401: [ AGFW][2586762128] Starting of the agent: /u01/app/grid/bin/scriptagent with user id root is already in progress.
2010-06-24 11:20:06.816: [CLSFRAME][2593065872] New IPC Member:{Relative|Node:0|Process:6|Type:3}:AGENT
2010-06-24 11:20:06.816: [CLSFRAME][2593065872] New process connected to us ID:{Relative|Node:0|Process:6|Type:3} Info:AGENT
2010-06-24 11:20:06.853: [ AGFW][2586762128] Agfw Proxy Server received the message: AGENT_HANDSHAKE[Proxy] ID 20484:14
2010-06-24 11:20:06.853: [ AGFW][2586762128] Agent /u01/app/grid/bin/scriptagent_root with pid:11356 connected to server.
2010-06-24 11:20:06.854: [ AGFW][2586762128] Agfw Proxy Server sending message: RESTYPE_ADD[cluster_resource] ID 8196:2715 to the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:06.889: [ AGFW][2586762128] Agfw Proxy Server sending message: RESTYPE_ADD[local_resource] ID 8196:2717 to the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:06.904: [ AGFW][2586762128] Agfw Proxy Server sending message: RESTYPE_ADD[ora.cluster_resource.type] ID 8196:2719 to the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:06.914: [ AGFW][2586762128] Agfw Proxy Server sending message: RESTYPE_ADD[ora.local_resource.type] ID 8196:2721 to the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:06.920: [ AGFW][2586762128] Agfw Proxy Server sending message: RESTYPE_ADD[ora.oc4j.type] ID 8196:2723 to the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:06.928: [ AGFW][2586762128] Agfw Proxy Server sending message: RESOURCE_ADD[apptest 1 1] ID 4356:2725 to the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:06.931: [ AGFW][2586762128] Agfw Proxy Server forwarding the message: RESOURCE_CLEAN[apptest 1 1] ID 4100:2699 to the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:06.934: [ AGFW][2586762128] Agfw Proxy Server replying to the message: AGENT_HANDSHAKE[Proxy] ID 20484:14
2010-06-24 11:20:07.074: [ AGFW][2586762128] Received the reply to the message: RESTYPE_ADD[cluster_resource] ID 8196:2715 from the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:07.075: [ AGFW][2586762128] Received the reply to the message: RESTYPE_ADD[local_resource] ID 8196:2717 from the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:07.076: [ AGFW][2586762128] Received the reply to the message: RESTYPE_ADD[ora.cluster_resource.type] ID 8196:2719 from the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:07.077: [ AGFW][2586762128] Received the reply to the message: RESTYPE_ADD[ora.local_resource.type] ID 8196:2721 from the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:07.078: [ AGFW][2586762128] Received the reply to the message: RESTYPE_ADD[ora.oc4j.type] ID 8196:2723 from the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:07.079: [ AGFW][2586762128] Received the reply to the message: RESOURCE_ADD[apptest 1 1] ID 4356:2725 from the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:07.082: [ AGFW][2586762128] Received the reply to the message: RESOURCE_CLEAN[apptest 1 1] ID 4100:2726 from the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:07.082: [ AGFW][2586762128] Agfw Proxy Server sending the reply to PE for message:RESOURCE_CLEAN[apptest 1 1] ID 4100:2699
2010-06-24 11:20:07.082: [ CRSPE][2576255888] Received reply to action [Clean] message ID: 2699
2010-06-24 11:20:07.082: [ CRSPE][2576255888] Clean action failed with error code: 0
2010-06-24 11:20:07.083: [ CRSRPT][2574154640] Publishing event: Cluster Resource Action Failed Event : 0xa210aa70
2010-06-24 11:20:07.083: [ CRSRPT][2574154640] Publish to eons buffered event : 0xa210aa70
2010-06-24 11:20:07.163: [ AGFW][2586762128] Received the reply to the message: RESOURCE_CLEAN[apptest 1 1] ID 4100:2726 from the agent /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:07.164: [ AGFW][2586762128] Agfw Proxy Server sending the last reply to PE for message:RESOURCE_CLEAN[apptest 1 1] ID 4100:2699
2010-06-24 11:20:07.164: [ CRSPE][2576255888] Received reply to action [Clean] message ID: 2699
2010-06-24 11:20:07.165: [ CRSPE][2576255888] CRS-2680: Clean of 'apptest' on 'jb00' failed
2010-06-24 11:20:07.166: [ CRSPE][2576255888] Sequencer for [apptest 1 1] has completed with error: CRS-0216: Could not stop resource 'apptest'.
2010-06-24 11:20:07.168: [ CRSPE][2576255888] PE Command [ Start Resource : 0xc4568d0 ] has completed
2010-06-24 11:20:07.168: [ CRSPE][2576255888] UI Command [Start Resource : 0xc4568d0] is replying to sender.
2010-06-24 11:20:07.170: [UiServer][2572053392] Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-2680: Clean of 'apptest' on 'jb00' failed]
MSGTYPE:
TextMessage[1]
OBJID:
TextMessage[apptest]
WAIT:
TextMessage[0]
]
2010-06-24 11:20:07.171: [UiServer][2572053392] Container [ Name: UI_DATA
apptest:
TextMessage[215]
]
2010-06-24 11:20:07.171: [UiServer][2572053392] Done for ctx=0xa2103450
2010-06-24 11:20:07.171: [ AGFW][2586762128] Agfw Proxy Server received the message: CMD_COMPLETED[Proxy] ID 20482:2746
2010-06-24 11:20:07.171: [ AGFW][2586762128] Agfw Proxy Server replying to the message: CMD_COMPLETED[Proxy] ID 20482:2746
2010-06-24 11:20:07.205: [ AGFW][2586762128] Agfw Proxy Server received the message: AGENT_SUICIDE[Proxy] ID 20486:42
2010-06-24 11:20:07.206: [ AGFW][2586762128] Suicide request received from /u01/app/grid/bin/scriptagent_root
2010-06-24 11:20:07.206: [ AGFW][2586762128] Agfw Proxy Server replying to the message: AGENT_SUICIDE[Proxy] ID 20486:42
2010-06-24 11:20:07.220: [ CRSCOMM][2593065872][FFAIL] Couldnt clscreceive message, no message: 11
2010-06-24 11:20:07.220: [ CRSCOMM][2593065872] Client disconnected.
2010-06-24 11:20:07.221: [ CRSCOMM][2593065872][FFAIL] Listener got clsc error 11 for memNum. 6
2010-06-24 11:20:07.221: [ CRSCOMM][2593065872] IPC listener connection to member 6 has been removed
2010-06-24 11:20:07.221: [CLSFRAME][2593065872] Removing IPC Member:{Relative|Node:0|Process:6|Type:3}
2010-06-24 11:20:07.221: [CLSFRAME][2593065872] Disconnected from AGENT process: {Relative|Node:0|Process:6|Type:3}
2010-06-24 11:20:07.222: [ CRSPE][2576255888] Disconnected from server:
2010-06-24 11:20:07.224: [ AGFW][2586762128] Agfw Proxy Server received process disconnected notification, count=1
2010-06-24 11:20:07.224: [ AGFW][2586762128] /u01/app/grid/bin/scriptagent_root disconnected.
2010-06-24 11:20:07.224: [ AGFW][2586762128] Agent /u01/app/grid/bin/scriptagent_root[11356] stopped!
2010-06-24 11:20:07.224: [ CRSCOMM][2586762128] removeConnection: Member 6 does not exist.
[root@jb00 crsd]#
And this in the cluster_resource_root.log
2010-06-24 11:20:06.981: [ AGFW][2810366864] AGFW assuming CLEAN entry point defined in script.
2010-06-24 11:20:06.981: [ AGFW][2810366864] Added new restype: ora.local_resource.type
2010-06-24 11:20:06.981: [ AGFW][2810366864] Agent sending last reply for: RESTYPE_ADD[ora.local_resource.type] ID 8196:2721
2010-06-24 11:20:06.981: [ AGFW][2810366864] Agent received the message: RESTYPE_ADD[ora.oc4j.type] ID 8196:2723
2010-06-24 11:20:06.982: [ AGFW][2810366864] Agent does not have the type: ora.oc4j.type
2010-06-24 11:20:06.982: [ AGFW][2810366864] Agent do not have any action entries defined for type: ora.oc4j.type
2010-06-24 11:20:06.982: [ AGFW][2810366864] Could not find the action entry: START
2010-06-24 11:20:06.982: [ AGFW][2810366864] AGFW assuming START entry point defined in script.
2010-06-24 11:20:06.982: [ AGFW][2810366864] Could not find the action entry: STOP
2010-06-24 11:20:06.982: [ AGFW][2810366864] AGFW assuming STOP entry point defined in script.
2010-06-24 11:20:06.982: [ AGFW][2810366864] Could not find the action entry: CHECK
2010-06-24 11:20:06.982: [ AGFW][2810366864] AGFW assuming CHECK entry point defined in script.
2010-06-24 11:20:06.982: [ AGFW][2810366864] Could not find the action entry: CLEAN
2010-06-24 11:20:06.982: [ AGFW][2810366864] AGFW assuming CLEAN entry point defined in script.
2010-06-24 11:20:06.982: [ AGFW][2810366864] Added new restype: ora.oc4j.type
2010-06-24 11:20:06.982: [ AGFW][2810366864] Agent sending last reply for: RESTYPE_ADD[ora.oc4j.type] ID 8196:2723
2010-06-24 11:20:06.983: [ AGFW][2810366864] Agent received the message: RESOURCE_ADD[apptest 1 1] ID 4356:2725
2010-06-24 11:20:06.983: [ AGFW][2810366864] Added new resource: apptest 1 1 to the agfw
2010-06-24 11:20:06.985: [ AGFW][2810366864] Agent sending last reply for: RESOURCE_ADD[apptest 1 1] ID 4356:2725
2010-06-24 11:20:06.985: [ AGFW][2810366864] Agent received the message: RESOURCE_CLEAN[apptest 1 1] ID 4100:2726
2010-06-24 11:20:06.985: [ AGFW][2810366864] Preparing CLEAN command for: apptest 1 1
2010-06-24 11:20:06.985: [ AGFW][2810366864] apptest 1 1 state changed from: UNKNOWN to: CLEANING
2010-06-24 11:20:07.017: [ AGFW][2877483920] Executing command: clean for resource: apptest 1 1
2010-06-24 11:20:07.017: [ AGFW][2877483920] Entering script entry point…
2010-06-24 11:20:07.017: [apptest][2877483920] [clean] Executing action script: /tmp/TEST/test.scr[clean]
2010-06-24 11:20:07.079: [ AGFW][2877483920] Command: clean for resource: apptest 1 1 completed with invalid status: 209
2010-06-24 11:20:07.080: [ AGFW][2810366864] Agent sending reply for: RESOURCE_CLEAN[apptest 1 1] ID 4100:2726
2010-06-24 11:20:07.083: [ AGFW][2843925392] Executing command: check for resource: apptest 1 1
2010-06-24 11:20:07.083: [ AGFW][2843925392] Entering script entry point…
2010-06-24 11:20:07.083: [apptest][2843925392] [check] Executing action script: /tmp/TEST/test.scr[check]
2010-06-24 11:20:07.091: [CRSTIMER][2743147408] Timer Thread Starting.
2010-06-24 11:20:07.160: [ AGFW][2843925392] Received unknown resource status code: 209
2010-06-24 11:20:07.160: [ AGFW][2843925392] check for resource: apptest 1 1 completed with status: UNKNOWN
2010-06-24 11:20:07.160: [ AGFW][2810366864] apptest 1 1 state changed from: CLEANING to: UNKNOWN
2010-06-24 11:20:07.161: [ AGFW][2810366864] Agent sending last reply for: RESOURCE_CLEAN[apptest 1 1] ID 4100:2726
2010-06-24 11:20:07.161: [ AGFW][2810366864] Agent has no resources to be monitored.Sending suicide request.
2010-06-24 11:20:07.161: [ AGFW][2810366864] Agent sending message to PE: AGENT_SUICIDE[Proxy] ID 20486:42
2010-06-24 11:20:07.207: [ AGFW][2810366864] Agent is commiting suicide.
2010-06-24 11:20:07.207: [ USRTHRD][2810366864] Script agent is exiting..
2010-06-24 11:20:07.208: [ AGFW][2810366864] Agent is exiting with exit code: 1
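Most of the agent log above is framework chatter; the lines that matter are the ones showing which entry point was executed and what status came back. A small filter along these lines can pull them out. This is only a sketch: the `action_outcomes` helper is hypothetical, the log file path is passed as an argument because the full path is not given here, and the patterns are simply the phrases that appear in the excerpt above.

```shell
#!/bin/sh
# Print only the agent log lines that show which action ran and how it ended.
# $1 is the path to the agent log file (location depends on the installation).
action_outcomes() {
    grep -E 'Executing action script|completed with|unknown resource status code' "$1"
}

# Hypothetical usage:
# action_outcomes /path/to/agent.log
```

Applied to the excerpt above, this leaves the two "Executing action script" lines for clean and check together with the lines reporting status 209 and UNKNOWN.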
The configuration of the resource is the following
[root@jb00 bin]# ./crsctl status resource apptest -f
NAME=apptest
TYPE=cluster_resource
STATE=UNKNOWN
TARGET=ONLINE
ACL=owner:root:rwx,pgrp:root:rwx,other::rwx,user:oragrp:rwx,group:asmadmin:rwx,group:dba:rwx,group:asmdba:rwx
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=/tmp/TEST/test.scr
ACTIVE_PLACEMENT=0
AGENT_FILENAME=%CRS_HOME%/bin/scriptagent
AUTO_START=restore
CARDINALITY=1
CARDINALITY_ID=0
CHECK_INTERVAL=30
CREATION_SEED=35
CURRENT_RCOUNT=0
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION=
ENABLED=1
FAILOVER_DELAY=0
FAILURE_COUNT=0
FAILURE_HISTORY=
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=
ID=apptest
INCARNATION=0
LAST_FAULT=0
LAST_RESTART=0
LAST_SERVER=
LOAD=1
LOGGING_LEVEL=5
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=restricted
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=2
SCRIPT_TIMEOUT=60
SERVER_POOLS=apptest_sp
START_DEPENDENCIES=hard(app.appvip)
START_TIMEOUT=0
STATE_CHANGE_TEMPLATE=
STATE_CHANGE_VERS=0
STATE_DETAILS=
STOP_DEPENDENCIES=hard(app.appvip)
STOP_TIMEOUT=0
UPTIME_THRESHOLD=1h
Sorry for being verbose, but I hope I am giving you all the information you need.
Thanks in advance for your attention.
Marco
Hi Marco,
as soon as possible I will publish a new article with the scripts and methods to correctly cluster third-party applications (at least MySQL, PostgreSQL and Apache) with the Perl scripts that I developed.
I don’t normally use shell scripts, but I think you should check the following:
1) the permissions of the script and of the directory that contains the script
2) whether the script runs correctly when started manually (for example, I don’t see the #!/usr/bin/sh line that identifies it as a shell script)
3) use "set -x" and redirect the errors to the log you already use with "2>&1"
The cluster seems to have correctly tried to initialize the resource, but the script returned an error, so the cluster tried to clean and check with the same script. Because the script also failed in clean and check, the status became unknown.
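Putting the three points together, a shell action script could look roughly like the sketch below. It is not the Perl script mentioned above and not Marco's final script: the log path, the state file standing in for "the application is running", and the `handle_action` function are all assumptions made for illustration. It adds the interpreter line, traces with "set -x", sends errors to the log with "2>&1", and returns a non-zero status for anything it does not recognise.

```shell
#!/bin/sh
# Sketch of a Clusterware action script; all paths here are hypothetical.
LOG=${LOG:-/tmp/apptest_action.log}
STATE=${STATE:-/tmp/apptest.running}   # stand-in for "the application is up"

# Runs in a subshell so the redirection and tracing stay local to one call.
handle_action() (
    exec >>"$LOG" 2>&1      # stdout and stderr both go to the log
    set -x                  # trace every command into the log
    case "$1" in
        start)      touch "$STATE" ;;     # replace with the real start command
        stop|clean) rm -f "$STATE" ;;     # replace with the real stop command
        check)      [ -f "$STATE" ] ;;    # zero status only if "running"
        *)          echo "unsupported action: $1"; exit 1 ;;
    esac
)

# Clusterware calls the script with the action as the first argument.
if [ $# -gt 0 ]; then
    handle_action "$1"
    exit $?
fi
```

The important design point is that every entry point ends with a well-defined exit status, so a failure in start, check or clean shows up in the log instead of leaving the resource in an unknown state.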
Kind Regards
Mat
by Miriade Spa
Matt,
thanks for your help; you confirm that it may be a permission problem. It works fine from the command line; the problem appears when Clusterware tries to run it.
I’m not skilled with Perl, but I can try "set -x" and changing the permissions and the script location.
Ciao Marco
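The permission and location checks discussed here could be scripted roughly as follows. This is a sketch under assumptions: the `preflight` helper is hypothetical, and the check for an interpreter line assumes the agent executes the file directly rather than through a shell (an interactive shell will usually run a file without "#!" anyway, which would explain a script that works from the command line only).

```shell
#!/bin/sh
# Pre-flight checks for an action script; prints one line per problem found
# and returns non-zero if anything looks wrong.
preflight() {
    script=$1
    problems=0
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    # The file must be executable by the user running this check.
    [ -x "$script" ] || { echo "not executable: $script"; problems=1; }
    # A file executed directly by another program needs an interpreter line.
    first=$(head -n 1 "$script")
    case $first in
        '#!'*) ;;
        *) echo "no #! interpreter line: $script"; problems=1 ;;
    esac
    # Every directory on the way to the script must be searchable.
    dir=$(dirname "$script")
    while [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        [ -x "$dir" ] || { echo "directory not searchable: $dir"; problems=1; }
        dir=$(dirname "$dir")
    done
    return $problems
}

# Hypothetical usage, run as the user that owns the resource:
# preflight /tmp/TEST/test.scr
```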
Matt,
it works now; I’m using Clusterware to manage CFT and the Beta 48 Agent.
Here is my experience: http://casaprocida.blogspot.com/
Thanks for your time.
It took me some time to read the whole article. The article is great, but the comments bring even more ideas to brainstorm. Thanks.
- Johnson