Unsolved
26 Posts
0
1324
March 29th, 2007 03:00
AutoStart 5.2 Solaris Oracle module failed to detect oracle database
Hi folks,
We have Oracle 10g (10.2.0.3) running on Solaris 9. After we put the database under AutoStart monitoring, it ran normally on one node with no issues, but once we moved it to the other node, the database tester detected the database as failed and tried to restart it (the restart always succeeds).
If we disable monitoring and check the database manually, it is still running with no issues. So I'm not sure what is wrong with the testing process that causes AutoStart to restart the database.
I checked the Oracle module log and saw messages saying the database is in "shutdown in progress". That is weird, and it happens on only one node of the cluster.
Any idea?
Regards,
Kawin.



tribicic
157 Posts
0
March 29th, 2007 03:00
treesk
26 Posts
0
March 29th, 2007 03:00
What I saw on the screen is that the Oracle process turns red (failed) and then brings the whole group down. I checked the Oracle log and the Oracle module log and found no errors in either. As I said, the database looks fine: I can connect to it and even run queries while the tester indicates that the database has failed (with the monitor disabled).
Regards,
Kawin.
treesk
26 Posts
0
March 29th, 2007 04:00
I just remembered: the failure does not occur during database startup; it occurs about 5 minutes after the database has started. So the existence test (checking the pmon and smon processes) might not be the issue.
Regards,
Kawin.
tribicic
157 Posts
0
March 29th, 2007 04:00
As I said above, first do a failover with monitoring on the group disabled. That will prevent the cluster from restarting the group if Oracle fails, and it will allow you to figure out what's going on and see whether all the processes that the existence monitor looks for exist.
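For the manual check with monitoring disabled, something like the following could confirm that the expected background processes are present on the node. This is only a sketch, not the AutoStart existence monitor itself: it assumes the monitor looks for pmon and smon (the two mentioned earlier in the thread), that Python 3.7 or later is available on the node, and the function names are made up for illustration. Oracle names its background processes ora_<name>_<SID>, which is what the match relies on.

```python
import subprocess

# Background processes assumed to be required. The real AutoStart
# existence monitor may check a different set.
REQUIRED = ("pmon", "smon")


def missing_processes(ps_output, sid, required=REQUIRED):
    """Return the required background processes that are absent from ps output."""
    running = set()
    for line in ps_output.splitlines():
        fields = line.split()
        if fields:
            # For Oracle background processes the command name
            # (e.g. ora_pmon_ORCL) is the last field of a `ps -ef` line.
            running.add(fields[-1])
    # Exact name match, so SID "ORCL" does not match "ora_pmon_ORCL2".
    return [name for name in required if f"ora_{name}_{sid}" not in running]


def check_local(sid):
    """Run `ps -ef` on this node and list missing processes for the given SID."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return missing_processes(result.stdout, sid)
```

Calling check_local with your own SID on the failing node every minute or so after the failover should show whether any process actually disappears around the 5-minute mark. An empty list means both processes were found.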
treesk
26 Posts
0
March 29th, 2007 04:00
Note: I have downtime to test it again tomorrow.
tribicic
157 Posts
0
March 29th, 2007 04:00
Happy troubleshooting.
tribicic
157 Posts
0
March 29th, 2007 05:00
treesk
26 Posts
0
March 29th, 2007 06:00
That's why I said it is so weird. If the problem were inside AutoStart, it should happen on both nodes, but it happens on only one node of the cluster.
tribicic
157 Posts
1
March 29th, 2007 06:00
You can actually check why the existence monitor script is failing:
- Redirect the standard output and standard error of the Oracle process to a file.
- Set the variable FT_TRACE_TESTS to 1 in the Oracle process configuration.
This way, the next time you start the process, the output of the existence monitor script will be redirected to the specified output file with an _exist suffix.
treesk
26 Posts
0
March 30th, 2007 06:00
I think it may be the response test rather than the existence test.
The error messages in oraproc.log show ORA-03135: connection lost contact.
Any idea?
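To see whether ORA-03135 is the only error in that log, or just the most recent one, a small script can count every ORA code that appears. This is a rough sketch: the function names are invented for illustration, and the path passed to scan_log is whatever your oraproc.log location is on the failing node.

```python
import re
from collections import Counter

# Oracle error codes have the form ORA- followed by five digits.
ORA_CODE = re.compile(r"\bORA-\d{5}\b")


def count_ora_errors(lines):
    """Count each ORA-NNNNN code found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        counts.update(ORA_CODE.findall(line))
    return counts


def scan_log(path):
    """Count ORA codes in a log file, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return count_ora_errors(handle)
```

Running scan_log on the log from each node and comparing the two results may show whether the connection loss is specific to the failing node.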
treesk
26 Posts
0
April 1st, 2007 20:00
Regards,
Kawin.