Wednesday, March 20, 2019

Informix crash due to UPDATE after VARCHAR column size reduction

Hi. This post describes a problem that affects systems running Informix 12.10.FC12 or later.

Once this problem occurs, the Informix process will not start again, which makes it a very serious defect.

The conditions under which the problem may occur are listed below.
1. Add a VARCHAR column to the table.
2. Reduce the size of the added VARCHAR column.

If you execute an UPDATE statement on the table after these steps, the Informix process crashes with an error.

The crash does not happen every time. Whether it occurs depends on the length of the data stored in the table.

The following online log messages were produced when I reproduced the crash on Informix 12.10 and 14.10.


## 12.10.FC12W1
11:20:01  Assert Failed: Buffer modified in inconsistent chunk.
11:20:01  IBM Informix Dynamic Server Version 12.10.FC12W1WE
11:20:01   Who: Session(87, informix@pilma01, 39452766, 700000020593858)
                Thread(222, sqlexec, 700000020553068, 10)
                File: rsdebug.c Line: 1047
11:20:01   Results: Chunk 13 is being taken OFFLINE.
11:20:01   Action: Restore space containing this chunk from the archive.
11:20:01  stack trace for pid 36175890 written to /work2/ifx1210fc12w1we/tmp/af.4c6a350
11:20:02   See Also: /work2/ifx1210fc12w1we/tmp/af.4c6a350, shmem.4c6a350.0
11:20:10  Buffer modified in inconsistent chunk.
11:20:11  Assert Failed: INFORMIX-OnLine Must ABORT
        Critical media failure.
11:20:11  IBM Informix Dynamic Server Version 12.10.FC12W1WE
11:20:11   Who: Session(87, informix@pilma01, 39452766, 700000020593858)
                Thread(222, sqlexec, 700000020553068, 10)
                File: rsmirror.c Line: 2080
11:20:11  stack trace for pid 36175890 written to /work2/ifx1210fc12w1we/tmp/af.4c6a350
11:20:12   See Also: /work2/ifx1210fc12w1we/tmp/af.4c6a350
11:20:19  Thread ID 222 will NOT be suspended because
          it is in a critical section.
11:20:19   See Also: /work2/ifx1210fc12w1we/tmp/af.4c6a350
11:20:19  rsmirror.c, line 2080, thread 222, proc id 36175890, INFORMIX-OnLine Must ABORT
        Critical media failure..
11:20:19  Fatal error in ADM VP at mt_fn.c:14593
11:20:19  Unexpected virtual processor termination: pid = 36175890, exit status = 0x1.
11:20:19  PANIC: Attempting to bring system down


## 14.10.FC1DE
12:46:54  Assert Failed: Buffer modified in inconsistent chunk.
12:46:54  IBM Informix Dynamic Server Version 14.10.FC1DE
12:46:54   Who: Session(46, informix@ifxdb1, 62804, 0x4526fbc8)
                Thread(55, sqlexec, 4522d8c8, 1)
                File: rsdebug.c Line: 908
12:46:54   Results: Chunk 1 is being taken OFFLINE.
12:46:54   Action: Restore space containing this chunk from the archive.
12:46:54  stack trace for pid 62571 written to /opt/IBM/Informix_Software_Bundle/tmp/af.41fb7ad
12:46:54   See Also: /opt/IBM/Informix_Software_Bundle/tmp/af.41fb7ad, shmem.41fb7ad.0
12:46:57  Buffer modified in inconsistent chunk.
12:46:58  Assert Failed: INFORMIX-OnLine Must ABORT
        Critical media failure.
12:46:58  IBM Informix Dynamic Server Version 14.10.FC1DE
12:46:58   Who: Session(46, informix@ifxdb1, 62804, 0x4526fbc8)
                Thread(55, sqlexec, 4522d8c8, 1)
                File: rsmirror.c Line: 2062
12:46:58  stack trace for pid 62571 written to /opt/IBM/Informix_Software_Bundle/tmp/af.41fb7ad
12:46:58   See Also: /opt/IBM/Informix_Software_Bundle/tmp/af.41fb7ad
12:47:02  Thread ID 55 will NOT be suspended because
          it is in a critical section.
12:47:02   See Also: /opt/IBM/Informix_Software_Bundle/tmp/af.41fb7ad
12:47:02  Starting crash time check of:
12:47:02  1. memory block headers
12:47:02  2. stacks
12:47:02  Crash time checking found no problems
12:47:02  rsmirror.c, line 2062, thread 55, proc id 62571, INFORMIX-OnLine Must ABORT
        Critical media failure..
12:47:02  The Master Daemon Died
12:47:02  PANIC: Attempting to bring system down
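
Both logs show the same pattern: an assert "Buffer modified in inconsistent chunk" raised from rsdebug.c, followed by "INFORMIX-OnLine Must ABORT" raised from rsmirror.c. If you want to check whether your own online log contains the same pair, a small script along the following lines may help. This is only a sketch: the function name find_asserts and the regular expressions are my own illustration, based solely on the log layout shown above, and they are not part of any Informix tool.

```python
import re

# Patterns are assumptions derived from the log excerpts in this post.
ASSERT_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+Assert Failed:\s*(.+)$")
SOURCE_RE = re.compile(r"File:\s*(\S+)\s+Line:\s*(\d+)")


def find_asserts(log_text):
    """Return one dict per 'Assert Failed' entry found in online-log text."""
    asserts = []
    for line in log_text.splitlines():
        m = ASSERT_RE.match(line.strip())
        if m:
            asserts.append({"time": m.group(1), "message": m.group(2).strip(),
                            "file": None, "line": None})
            continue
        # Attach the first following "File: ... Line: ..." to the latest assert.
        m = SOURCE_RE.search(line)
        if m and asserts and asserts[-1]["file"] is None:
            asserts[-1]["file"] = m.group(1)
            asserts[-1]["line"] = int(m.group(2))
    return asserts


# Shortened copy of the 14.10.FC1DE log above, used as sample input.
SAMPLE_LOG = """\
12:46:54  Assert Failed: Buffer modified in inconsistent chunk.
12:46:54  IBM Informix Dynamic Server Version 14.10.FC1DE
                Thread(55, sqlexec, 4522d8c8, 1)
                File: rsdebug.c Line: 908
12:46:57  Buffer modified in inconsistent chunk.
12:46:58  Assert Failed: INFORMIX-OnLine Must ABORT
        Critical media failure.
                File: rsmirror.c Line: 2062
12:47:02  PANIC: Attempting to bring system down
"""

if __name__ == "__main__":
    for entry in find_asserts(SAMPLE_LOG):
        print(entry)
```

To scan a real log, read the file named by the MSGPATH configuration parameter and pass its contents to find_asserts.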



I requested technical support from IBM for the above system failure.

It turned out that other customers had already reported the same problem, but there was no specific scenario for reproducing it.

Below is a link to the document that describes the problem.

When I looked at the assert failure file named in the online message log at the time of the failure, I could see the stack trace of the user thread involved.


0x00000001000af9cc (oninit)afstack
0x00000001000aeb5c (oninit)afhandler
0x00000001000af038 (oninit)affail_interface
0x00000001001b8844 (oninit)buffcheck
0x00000001002371a0 (oninit)buffput
0x0000000100b88640 (oninit)ckpgversion
0x0000000100b87af4 (oninit)rewrecord
0x0000000100b870ec (oninit)rsrewrec
0x000000010071ab00 (oninit)fmrewrec
0x00000001008382a0 (oninit)aud_sqisrewrec
0x0000000100d40a90 (oninit)doupdate
0x0000000100d3ff2c (oninit)chkrowcons
0x000000010114ea04 (oninit)dodmlrow
0x0000000101150eac (oninit)dodelupd
0x000000010083ee30 (oninit)aud_dodelupd
0x0000000100d1ec24 (oninit)excommand
0x00000001008c8590 (oninit)sq_execute
0x00000001008103ac (oninit)sqmain
0x00000001014d6898 (oninit)listen_verify
0x00000001014d530c (oninit)spawn_thread
0x0000000101482ae0 (oninit)th_init_initgls
0x00000001018f86e0 (oninit)startup


This stack trace is very similar to the one described for defect IT27997. The ckpgversion call in the stack trace appears to be where the problem occurs.

The in-place alter feature is known to have been enhanced in version 12.10.FC12, and I think that this enhancement introduced the defect.

In their IIUG 2018 presentation "What's New in Informix", Jeff McMahon and Nick Geib showed that changing a VARCHAR column to a smaller or larger VARCHAR size is now performed as an in-place alter.

Below is a script that can reproduce an instance crash based on the information described in IT27997 above.

The crash is easy to reproduce when the test data is a fixed-length field of 60 characters.

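-- Start from a clean table with a single VARCHAR(60) column.
drop table if exists test;
create table test (a varchar(60));

-- Load the test rows (60-character values in column a).
load from test.unl insert into test;

-- Add an INT column and three VARCHAR columns.
alter table test add b int;
alter table test add c varchar(5);
alter table test add d varchar(5);
alter table test add e varchar(5);
-- Reduce the size of the added VARCHAR columns.
alter table test modify c varchar(1);
alter table test modify d varchar(1);
alter table test modify e varchar(1);

-- This UPDATE is the statement that can bring the instance down.
update test set a=' qui officia deserunt mollit anim id est laborum.Lorem ip';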
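
The script loads test.unl, which is not included in this post. As a rough sketch, a file with 60-character values could be generated as shown below. Only the 60-character width comes from the description above. The row count of 1000, the alphanumeric filler text, and the helper names make_unl_rows and write_unl are assumptions made for illustration, so the actual data used in my test may differ. The sketch relies on the default Informix load-file layout, in which every field is terminated by the | delimiter.

```python
import string

# Filler characters: letters and digits only, so no blanks and no '|' appear.
ALPHABET = string.ascii_letters + string.digits


def make_unl_rows(row_count=1000, width=60, delimiter="|"):
    """Build load-file lines with one fixed-width text field per row."""
    repeated = ALPHABET * (width // len(ALPHABET) + 2)
    rows = []
    for i in range(row_count):
        # Shift the starting offset so that consecutive rows differ.
        start = i % len(ALPHABET)
        rows.append(repeated[start:start + width] + delimiter)
    return rows


def write_unl(path="test.unl", **kwargs):
    """Write the generated rows to a load file and return the row count."""
    rows = make_unl_rows(**kwargs)
    with open(path, "w", newline="\n") as f:
        f.write("\n".join(rows) + "\n")
    return len(rows)


if __name__ == "__main__":
    print(write_unl(), "rows written to test.unl")
```

Because the crash depends on the length of the stored data, you may need to vary the row count or the width before the failure appears. Run this only against a disposable test instance.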

This issue occurs with versions 12.10.FC12 and 12.10.FC12W1. If you are using one of those versions, you can avoid the problem by requesting a patched version from IBM or by downgrading to version 12.10.FC11 or earlier.

I hope this helps you run Informix reliably.