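Oracle database servers cloned with PlateSpin often run into assorted problems when the database instance is started. While testing the DR environment today I hit another unusual scenario; before this, I had already run into the two cases below:
As shown below, starting the database instance failed with "ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [], [], [], []".
The alert log shows the details (excerpt):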
Successful mount of redo thread 1, with mount id 4228022561
Tue Nov 22 15:11:33 CST 2016
Database mounted in Exclusive Mode
Completed: ALTER DATABASE MOUNT
Tue Nov 22 15:11:33 CST 2016
ALTER DATABASE OPEN
Tue Nov 22 15:11:35 CST 2016
Beginning crash recovery of 1 threads
Tue Nov 22 15:11:35 CST 2016
Started redo scan
Tue Nov 22 15:11:35 CST 2016
Errors in file /u01/app/oracle/admin/SCM2/udump/scm2_ora_4465.trc:
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [], [], [], []
Tue Nov 22 15:11:36 CST 2016
Aborting crash recovery due to error 600
Tue Nov 22 15:11:36 CST 2016
Errors in file /u01/app/oracle/admin/SCM2/udump/scm2_ora_4465.trc:
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...
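This error mostly appears when an instance fails to start after a disk failure has crashed the database (After a disk failure that caused the database to crash, the instance fails to start up with ORA-00600: arguments: [kcratr1_lastbwr].). This server was cloned with PlateSpin, which copies at the operating-system level, so it is hard to guarantee that the data blocks are fully consistent. A database server does I/O constantly, and unless the instance is shut down before the copy, blocks can end up inconsistent and writes are very likely lost, which is somewhat similar to a disk failure (I am not sure of the exact mechanism). All I could do was shut down the instance, startup mount, and then recover database, as shown below: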
SQL> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
Total System Global Area 3.4360E+10 bytes
Fixed Size 2159376 bytes
Variable Size 2.4931E+10 bytes
Database Buffers 9378463744 bytes
Redo Buffers 48168960 bytes
Database mounted.
SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [52], [280690], [218384498],
[], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 52, block# 280690)
ORA-10564: tablespace ESCMOWNER_DATA
ORA-01110: data file 52: '/u02/oradata/SCM2/escmowner_d07.dbf'
ORA-10560: block type 'LOB BLOCK'
SQL>
Recovery then failed with ORA-00600: internal error code, arguments: [3020] and ORA-10567: Redo is inconsistent with data block. The corresponding part of the alert log:
PMON started with pid=2, OS id=4508
PSP0 started with pid=3, OS id=4510
MMAN started with pid=4, OS id=4512
DBW0 started with pid=5, OS id=4514
LGWR started with pid=6, OS id=4516
CKPT started with pid=7, OS id=4518
SMON started with pid=8, OS id=4520
RECO started with pid=9, OS id=4522
CJQ0 started with pid=10, OS id=4524
MMON started with pid=11, OS id=4526
Tue Nov 22 15:14:52 CST 2016
starting up 8 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
MMNL started with pid=12, OS id=4528
Tue Nov 22 15:14:53 CST 2016
starting up 12 shared server(s) ...
Oracle Data Guard is not available in this edition of Oracle.
Tue Nov 22 15:14:53 CST 2016
ALTER DATABASE MOUNT
Tue Nov 22 15:14:59 CST 2016
Setting recovery target incarnation to 5
Tue Nov 22 15:14:59 CST 2016
Successful mount of redo thread 1, with mount id 4228071407
Tue Nov 22 15:14:59 CST 2016
Database mounted in Exclusive Mode
Completed: ALTER DATABASE MOUNT
Tue Nov 22 15:15:15 CST 2016
ALTER DATABASE RECOVER database
Tue Nov 22 15:15:15 CST 2016
Media Recovery Start
Tue Nov 22 15:15:15 CST 2016
Recovery of Online Redo Log: Thread 1 Group 6 Seq 64605 Reading mem 0
Mem# 0: /u01/oradata/SCM2/redo06.log
Mem# 1: /u03/oradata/SCM2/redo06.log
Tue Nov 22 15:15:21 CST 2016
Errors in file /u01/app/oracle/admin/SCM2/udump/scm2_ora_4570.trc:
ORA-00600: internal error code, arguments: [3020], [52], [280690], [218384498], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 52, block# 280690)
ORA-10564: tablespace ESCMOWNER_DATA
ORA-01110: data file 52: '/u02/oradata/SCM2/escmowner_d07.dbf'
ORA-10560: block type 'LOB BLOCK'
Tue Nov 22 15:15:22 CST 2016
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER database ...
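The arguments of the ORA-00600 [3020] error above line up with the ORA-10567 message: [52] and [280690] are the file and block numbers, and [218384498] looks like the two packed into a single data block address. A minimal Python sketch of that packing, assuming the usual smallfile layout of a 10-bit file number above a 22-bit block number (the helper names are made up for illustration, and the layout is an assumption that would not apply to bigfile tablespaces):

```python
from typing import Tuple

# Assumption: a data block address is a 32-bit value with the file number
# in the top 10 bits and the block number in the low 22 bits.
BLOCK_BITS = 22
BLOCK_MASK = (1 << BLOCK_BITS) - 1


def decode_dba(dba: int) -> Tuple[int, int]:
    """Split a data block address into (file number, block number)."""
    return dba >> BLOCK_BITS, dba & BLOCK_MASK


def encode_dba(file_no: int, block_no: int) -> int:
    """Pack a file number and block number back into one address."""
    return (file_no << BLOCK_BITS) | block_no


if __name__ == "__main__":
    # Last argument of the ORA-00600 [3020] error in the alert log.
    print(decode_dba(218384498))
```

For the failing block in this log, `decode_dba(218384498)` gives file 52, block 280690, which matches the ORA-10567 line.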
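After this error the recovery is interrupted and cannot continue: while applying the redo log, recovery found that the information recorded in the redo is inconsistent with the data block being recovered. The official documentation describes it as follows: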
CAUSE
Recovery stops because of failed consistency checks, a problem called stuck recovery. Stuck recovery can occur when an underlying operating system or storage system loses a write issued by the database during normal operation. There is an inconsistency between the information stored in the redo and the information stored in a database block being recovered.
The database signals an internal error when applying the redo. This problem can be caused by an Oracle Database bug or may be because of I/O problem ( hardware or O/S related issue )
There is a known EMC issue related to an RDBMS ORA-600 [3020] where the root-cause is on OS/Hardware level.
Details from EMC on the nature of the fix (problem with Symmetrix microcode)
ID: emc230687
Domain: EMC1
Solution Class: 3.X Compatibility
SOLUTION
When media recovery encounters a problem, the alert log may indicate that recovery can continue if it is allowed to corrupt the data block causing the problem. The alert log contains information about the block: its block type, block address, the tablespace it belongs to, and so forth. For blocks containing user data, the alert log may also report the data object number.
In this case, the database can proceed with recovery if it is allowed to mark the problem block as corrupt. Nevertheless, this response is not always advisable. For example, if the block is an important block in the SYSTEM tablespace, marking the block as corrupt can eventually prevent you from opening the recovered database. Another consideration is whether the recovery problem is isolated. If this problem is followed immediately by many other problems in the redo stream, then you may want to open the database with the RESETLOGS option.
For a block containing user data, you can usually query the database to find out which object or table owns this block. If the database is not open, then you should be able to open the database read-only, even if you are recovering a whole database backup. The following example cancels recovery and opens read-only:
…………………………………………………………………………………………………………………………………………………………………………………………………………………
I tried opening the database read only, but that failed with ORA-16005: database requires recovery. Checking the data file and the control file showed that their checkpoints were consistent.
SQL> alter database open read only;
alter database open read only
*
ERROR at line 1:
ORA-16005: database requires recovery
SQL> col checkpoint_change# for 9999999999999999
SQL> select checkpoint_change# from v$database;
CHECKPOINT_CHANGE#
------------------
23493915876
SQL> select file#, nvl(last_change#,0),checkpoint_change#
2 from v$datafile
3 where file#=52;
FILE# NVL(LAST_CHANGE#,0) CHECKPOINT_CHANGE#
---------- ------------------- ------------------
52 0 23493915876
SQL>
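Following the official documentation, I then ran recover database allow 1 corruption, which marks the inconsistent block as corrupt so that further recovery attempts can proceed. It turned out that more than one block was inconsistent, so I ran the command below repeatedly until it finally reported Media recovery complete.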
SQL> startup mount;
ORACLE instance started.
Total System Global Area 3.4360E+10 bytes
Fixed Size 2159376 bytes
Variable Size 2.4931E+10 bytes
Database Buffers 9378463744 bytes
Redo Buffers 48168960 bytes
Database mounted.
SQL> recover database allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [52], [280688], [218384496], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 52, block# 280688)
ORA-10564: tablespace ESCMOWNER_DATA
ORA-01110: data file 52: '/u02/oradata/SCM2/escmowner_d07.dbf'
ORA-10560: block type 'LOB BLOCK'
SQL> recover database allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [39], [95934], [163673790], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 39, block# 95934)
ORA-10564: tablespace ESCMOWNER_DATA
ORA-01110: data file 39: '/u02/oradata/SCM2/escmowner_d02.dbf'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 358585
SQL> recover database allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [39], [95936], [163673792], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 39, block# 95936)
ORA-10564: tablespace ESCMOWNER_DATA
ORA-01110: data file 39: '/u02/oradata/SCM2/escmowner_d02.dbf'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 358585
SQL> recover database allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [39], [95936], [163673792], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 39, block# 95936)
ORA-10564: tablespace ESCMOWNER_DATA
ORA-01110: data file 39: '/u02/oradata/SCM2/escmowner_d02.dbf'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 358585
SQL> recover database allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [52], [280689], [218384497], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 52, block# 280689)
ORA-10564: tablespace ESCMOWNER_DATA
ORA-01110: data file 52: '/u02/oradata/SCM2/escmowner_d07.dbf'
ORA-10560: block type 'LOB BLOCK'
SQL> recover database allow 10 corruption;
ORA-10588: Can only allow 1 corruption for normal media/standby recovery
SQL> recover database allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [73], [1639], [306185831], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 73, block# 1639)
ORA-10564: tablespace ESCMOWNER_IDX
ORA-01110: data file 73: '/u03/oradata/SCM2/escmowner_x10.dbf'
ORA-10560: block type '0'
SQL> recover database allow 1 corruption;
Media recovery complete.
SQL>
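Because every attempt sacrifices one block, it helps to keep a list of which blocks were reported along the way. A small Python sketch that pulls the (file#, block#) pairs out of a saved SQL*Plus transcript like the one above; it assumes the session output was captured as text, and the function name is hypothetical:

```python
import re
from typing import List, Tuple

# Matches the ORA-10567 line printed for each inconsistent block.
ORA_10567 = re.compile(
    r"ORA-10567: Redo is inconsistent with data block "
    r"\(file# (\d+), block# (\d+)\)"
)


def inconsistent_blocks(transcript: str) -> List[Tuple[int, int]]:
    """Return the distinct (file#, block#) pairs in first-seen order."""
    seen: List[Tuple[int, int]] = []
    for match in ORA_10567.finditer(transcript):
        key = (int(match.group(1)), int(match.group(2)))
        if key not in seen:
            seen.append(key)
    return seen


if __name__ == "__main__":
    sample = (
        "ORA-10567: Redo is inconsistent with data block (file# 39, block# 95936)\n"
        "ORA-10567: Redo is inconsistent with data block (file# 39, block# 95936)\n"
        "ORA-10567: Redo is inconsistent with data block (file# 73, block# 1639)\n"
    )
    for file_no, block_no in inconsistent_blocks(sample):
        print(file_no, block_no)
```

Repeated reports of the same block, such as file 39 block 95936 in the session above, are collapsed into one entry.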
Part of the alert log for this procedure is shown below:
ALTER DATABASE RECOVER database allow 1 corruption
Wed Nov 23 09:42:11 CST 2016
Media Recovery Start
ALLOW CORRUPTION option must use serial recovery
Wed Nov 23 09:42:11 CST 2016
Recovery of Online Redo Log: Thread 1 Group 6 Seq 64605 Reading mem 0
Mem# 0: /u01/oradata/SCM2/redo06.log
Mem# 1: /u03/oradata/SCM2/redo06.log
CORRUPTING BLOCK 280690 OF FILE 52 AND CONTINUING RECOVERY
Wed Nov 23 09:42:12 CST 2016
Errors in file /u01/app/oracle/admin/SCM2/udump/scm2_ora_20164.trc:
ORA-10567: Redo is inconsistent with data block (file# 52, block# 280690)
ORA-10564: tablespace ESCMOWNER_DATA
ORA-01110: data file 52: '/u02/oradata/SCM2/escmowner_d07.dbf'
ORA-10560: block type 'LOB BLOCK'
Hex dump of (file 64, block 290183) in trace file /u01/app/oracle/admin/SCM2/udump/scm2_ora_20164.trc
Corrupt block relative dba: 0x10046d87 (file 64, block 290183)
Fractured block found during media recovery
Data in bad block:
type: 6 format: 2 rdba: 0x10046d87
last change scn: 0x0005.785a119c seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x9d890601
check value in block header: 0x3110
computed block checksum: 0x8e14
Reread of rdba: 0x10046d87 (file 64, block 290183) found same corrupted data
Hex dump of (file 52, block 280687) in trace file /u01/app/oracle/admin/SCM2/udump/scm2_ora_20164.trc
Corrupt block relative dba: 0x0d04486f (file 52, block 280687)
Fractured block found during media recovery
Data in bad block:
type: 6 format: 2 rdba: 0x0d04486f
last change scn: 0x0005.785a119c seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0xac6e1b02
check value in block header: 0xf67b
computed block checksum: 0x17c1
Reread of rdba: 0x0d04486f (file 52, block 280687) found same corrupted data
Wed Nov 23 09:42:25 CST 2016
Errors in file /u01/app/oracle/admin/SCM2/udump/scm2_ora_20164.trc:
ORA-00600: internal error code, arguments: [3020], [52], [280688], [218384496], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 52, block# 280688)
ORA-10564: tablespace ESCMOWNER_DATA
ORA-01110: data file 52: '/u02/oradata/SCM2/escmowner_d07.dbf'
ORA-10560: block type 'LOB BLOCK'
Wed Nov 23 09:42:25 CST 2016
Hex dump of (file 52, block 280687) in trace file /u01/app/oracle/admin/SCM2/udump/scm2_ora_20164.trc
Corrupt block relative dba: 0x0d04486f (file 52, block 280687)
Fractured block found during in-flux buffer recovery
Data in bad block:
type: 6 format: 2 rdba: 0x0d04486f
last change scn: 0x0005.785a119c seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0xac6e1b02
check value in block header: 0xf67b
computed block checksum: 0x17c1
Reread of rdba: 0x0d04486f (file 52, block 280687) found same corrupted data
Hex dump of (file 64, block 290183) in trace file /u01/app/oracle/admin/SCM2/udump/scm2_ora_20164.trc
Corrupt block relative dba: 0x10046d87 (file 64, block 290183)
Fractured block found during in-flux buffer recovery
Data in bad block:
type: 6 format: 2 rdba: 0x10046d87
last change scn: 0x0005.785a119c seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x9d890601
check value in block header: 0x3110
computed block checksum: 0x8e14
Reread of rdba: 0x10046d87 (file 64, block 290183) found same corrupted data
Wed Nov 23 09:42:25 CST 2016
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER database allow 1 corruption ...
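The block dumps print the "last change scn" as hexadecimal wrap.base, while v$database earlier reported checkpoint_change# as a plain decimal number, so the two cannot be compared at a glance. A hedged Python sketch of the conversion, assuming the SCN is wrap * 2^32 + base (the layout I believe applies to a release of this age); the function name is illustrative only:

```python
def scn_from_hex(wrap_base: str) -> int:
    """Convert a 'wrap.base' SCN in hex, e.g. '0x0005.785a119c', to decimal."""
    wrap_hex, base_hex = wrap_base.split(".")
    # Assumption: the base occupies the low 32 bits, the wrap sits above it.
    return (int(wrap_hex, 16) << 32) | int(base_hex, 16)


if __name__ == "__main__":
    block_scn = scn_from_hex("0x0005.785a119c")  # from the block dump
    checkpoint_scn = 23493915876                 # from v$database earlier
    print(block_scn, block_scn > checkpoint_scn)
```

Under that assumption the fractured blocks carry SCN 23494005148, slightly later than the checkpoint SCN 23493915876 reported earlier.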
But opening the database then failed with ORA-01578 and ORA-01110, as shown below:
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 2
ORA-01578: ORACLE data block corrupted (file # 1, block # 98)
ORA-01110: data file 1: '/u01/oradata/SCM2/system01.dbf'
ORA-01578 means the database has corrupt blocks. Checking system01.dbf with the dbv command showed as many as 32 blocks marked corrupt.
dbv file=/u01/oradata/SCM2/system01.dbf blocksize=8192
.............................................................
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x0040f1ca
last change scn: 0x0005.781d41ff seq: 0x2 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x1daf0601
check value in block header: 0x9cf8
computed block checksum: 0x5e57
Page 63922 is influx - most likely media corrupt
Corrupt block relative dba: 0x0040f9b2 (file 1, block 63922)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x0040f9b2
last change scn: 0x0005.78210b86 seq: 0x2 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x4eff0601
check value in block header: 0x8642
computed block checksum: 0x47b4
Page 64336 is influx - most likely media corrupt
Corrupt block relative dba: 0x0040fb50 (file 1, block 64336)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x0040fb50
last change scn: 0x0005.780c413f seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x7a550601
check value in block header: 0x9e5f
computed block checksum: 0x46a
Page 73356 is influx - most likely media corrupt
Corrupt block relative dba: 0x00411e8c (file 1, block 73356)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00411e8c
last change scn: 0x0005.75c8101d seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x18b70601
check value in block header: 0x964a
computed block checksum: 0x8aa
Page 73902 is influx - most likely media corrupt
Corrupt block relative dba: 0x004120ae (file 1, block 73902)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x004120ae
last change scn: 0x0005.77a57f68 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0xfa4d0601
check value in block header: 0x729d
computed block checksum: 0xdf7f
Page 75400 is influx - most likely media corrupt
Corrupt block relative dba: 0x00412688 (file 1, block 75400)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00412688
last change scn: 0x0005.76ce99af seq: 0x1 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0xc0ad0601
check value in block header: 0x25b
computed block checksum: 0x5ea1
Page 75448 is influx - most likely media corrupt
Corrupt block relative dba: 0x004126b8 (file 1, block 75448)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x004126b8
last change scn: 0x0005.78211186 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x72500601
check value in block header: 0x5969
computed block checksum: 0x63d6
Page 75512 is influx - most likely media corrupt
Corrupt block relative dba: 0x004126f8 (file 1, block 75512)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x004126f8
last change scn: 0x0005.78579e34 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x52460601
check value in block header: 0x17b2
computed block checksum: 0xda45
Page 75830 is influx - most likely media corrupt
Corrupt block relative dba: 0x00412836 (file 1, block 75830)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00412836
last change scn: 0x0005.78211186 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x1cf10601
check value in block header: 0x5455
computed block checksum: 0x1f7a
Page 79066 is influx - most likely media corrupt
Corrupt block relative dba: 0x004134da (file 1, block 79066)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x004134da
last change scn: 0x0005.78548bf1 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0xf0150601
check value in block header: 0x29aa
computed block checksum: 0x7ada
Page 79416 is influx - most likely media corrupt
Corrupt block relative dba: 0x00413638 (file 1, block 79416)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00413638
last change scn: 0x0005.782111d1 seq: 0x2 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x1b610601
check value in block header: 0xd763
computed block checksum: 0xbf6e
Page 80102 is influx - most likely media corrupt
Corrupt block relative dba: 0x004138e6 (file 1, block 80102)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x004138e6
last change scn: 0x0005.78196cc8 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x8a210601
check value in block header: 0xe562
computed block checksum: 0xe6e9
Page 80420 is influx - most likely media corrupt
Corrupt block relative dba: 0x00413a24 (file 1, block 80420)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00413a24
last change scn: 0x0005.77bd6fba seq: 0x2 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x2bf80602
check value in block header: 0xadf
computed block checksum: 0x5e98
Page 80930 is influx - most likely media corrupt
Corrupt block relative dba: 0x00413c22 (file 1, block 80930)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00413c22
last change scn: 0x0005.77302b26 seq: 0x2 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x2ae30601
check value in block header: 0x21ad
computed block checksum: 0x4ae6
Page 81250 is influx - most likely media corrupt
Corrupt block relative dba: 0x00413d62 (file 1, block 81250)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00413d62
last change scn: 0x0005.7821123c seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x7fba0602
check value in block header: 0xf36
computed block checksum: 0x42a6
Page 81472 is influx - most likely media corrupt
Corrupt block relative dba: 0x00413e40 (file 1, block 81472)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00413e40
last change scn: 0x0005.7847cd31 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x8cdb0601
check value in block header: 0xeb69
computed block checksum: 0x41ea
Page 85757 is influx - most likely media corrupt
Corrupt block relative dba: 0x00414efd (file 1, block 85757)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00414efd
last change scn: 0x0005.73cab1bc seq: 0x2 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x4a100601
check value in block header: 0xf053
computed block checksum: 0x700
Page 85996 is influx - most likely media corrupt
Corrupt block relative dba: 0x00414fec (file 1, block 85996)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00414fec
last change scn: 0x0005.78210a33 seq: 0x2 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0xe96e0601
check value in block header: 0x1880
computed block checksum: 0xe35c
Page 87348 is influx - most likely media corrupt
Corrupt block relative dba: 0x00415534 (file 1, block 87348)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00415534
last change scn: 0x0005.781f1f02 seq: 0x2 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x2cfd0602
check value in block header: 0xb8ae
computed block checksum: 0x596
DBVERIFY - Verification complete
Total Pages Examined : 88320
Total Pages Processed (Data) : 53676
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 16588
Total Pages Failing (Index): 0
Total Pages Processed (Other): 3306
Total Pages Processed (Seg) : 1
Total Pages Failing (Seg) : 0
Total Pages Empty : 14718
Total Pages Marked Corrupt : 32
Total Pages Influx : 32
Highest block SCN : 2019287292 (5.2019287292)
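With this much dbv output it is easier to summarise it with a script than to read it by eye. A rough Python sketch, assuming the dbv output was saved as text; the regular expressions are written against the lines shown above and the function names are hypothetical:

```python
import re
from typing import Dict, List, Tuple

# "Corrupt block relative dba: 0x0040f9b2 (file 1, block 63922)"
CORRUPT_LINE = re.compile(
    r"Corrupt block relative dba: (0x[0-9a-fA-F]+) \(file (\d+), block (\d+)\)"
)
# "Total Pages Marked Corrupt : 32"
SUMMARY_LINE = re.compile(r"^(Total Pages [^:]*?)[ \t]*:[ \t]*(\d+)[ \t]*$", re.MULTILINE)


def corrupt_blocks(dbv_output: str) -> List[Tuple[int, int, int]]:
    """Return (rdba, file, block) for each corrupt block line."""
    return [
        (int(rdba, 16), int(file_no), int(block_no))
        for rdba, file_no, block_no in CORRUPT_LINE.findall(dbv_output)
    ]


def summary(dbv_output: str) -> Dict[str, int]:
    """Return the 'Total Pages ...' counters as a dictionary."""
    return {name: int(value) for name, value in SUMMARY_LINE.findall(dbv_output)}


if __name__ == "__main__":
    sample = (
        "Corrupt block relative dba: 0x0040f9b2 (file 1, block 63922)\n"
        "Total Pages Marked Corrupt : 32\n"
        "Total Pages Influx : 32\n"
    )
    print(corrupt_blocks(sample))
    print(summary(sample))
```

The file and block numbers parsed this way can be fed straight into a later restore or block-level check, and the summary counters make it easy to compare runs.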
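It looked like the only way to repair the corrupt blocks was to recover them from a backup (the backup on this machine), and only then did I remember that this instance is Standard Edition, which does not support block media recovery.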
RMAN> blockrecover datafile 1 block 63922;
Starting blockrecover at 23-NOV-16
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of blockrecover command at 11/23/2016 12:24:41
RMAN-05009: Block Media Recovery requires Enterprise Edition
Restoring the data file from the backup with RMAN then hit ORA-19870 and ORA-19599:
RMAN> restore datafile 1;
Starting restore at 23-NOV-16
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
restoring datafile 00001 to /u01/oradata/SCM2/system01.dbf
channel ORA_DISK_1: reading from backup piece /u05/backup/backupsets/ora_df928290157_s27337_s1
ORA-19870: error reading backup piece /u05/backup/backupsets/ora_df928290157_s27337_s1
ORA-19599: block number 39555 is corrupt in backup piece /u05/backup/backupsets/ora_df928290157_s27337_s1
failover to previous backup
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 11/23/2016 14:07:56
RMAN-06026: some targets not found - aborting restore
RMAN-06023: no backup or copy of datafile 1 found to restore
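ORA-19599 means the backup set itself also contains corrupt blocks. As the description below shows, the only option left is to copy a good backup set over from the production environment and restore from that. After all this trouble, I am rather disappointed with PlateSpin as a DR solution.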
Error: ORA-19599 block number %s is corrupt in %s %s
----------------------------------------------------------------------------
Cause: A corrupt block was found in a control file, archivelog, or backup
piece that is being read for a backup or copy. Corruption shall not be
tolerated in control files, archivelogs, or backup pieces.
Action: None. The copy or backup operation fails. Note that in the case of a
backup set, the conversation is still active and the piece may be
retried.
References:
Stuck recovery of database ORA-00600[3020] (Doc ID 283269.1)
https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=346216290638124&parent=DOCUMENT&sourceId=1088018.1&id=28814.1&_afrWindowMode=0&_adf.ctrl-state=15uednnal0_132