Actian Community Forum


2012-02-08   #1
alun
Junior Member
Join Date: May 2010
Posts: 6
Problems with ESQL after upgrade to 9.2 (E_OP0791, E_SC0206, and E_AD5505)

Hi,

I wonder if anyone out there can help me? I'm investigating an issue involving an
ESQLC program which has recently been rebuilt as part of an upgrade of our product
from Ingres 9.1 to version 9.2 (still quite an old version, I know). This is on Solaris.

The program passed testing here, but is hitting some strange errors on a customer's
site.

The error that comes up most of the time is:
E_OP0791 consistency check - Error generating an instruction
This is being reported on a fairly simple, non-dynamic query along the lines of:

Code:
EXEC SQL SELECT keysuffix,
                chgaltkey,
                posttypeind,
                rechrgstart,
                rechrgend,
                rateable,
                grossval,
                rentpoints,
                TRIM(refactcode),
                TRIM(vatind),
                TRIM(wrateind),
                TRIM(wauth1id),
                TRIM(wauth2id),
                TRIM(wauth3id),
                TRIM(CHAR(reviewdte)),
                rvw_status
INTO            :propfixed.keysuffix,
                :propfixed.chgaltkey,
                :propfixed.posttypeind,
                :propfixed.rechrgstart,
                :propfixed.rechrgend,
                :propfixed.rateable,
                :propfixed.grossval,
                :hv_points,
                :propfixed.refactcode,
                :propfixed.vatind,
                :propfixed.wrateind,
                :propfixed.wauth1id,
                :propfixed.wauth2id,
                :propfixed.wauth3id,
                :propfixed.reviewdte,
                :propfixed.rvw_status
FROM            rechgpd
WHERE           propref = :propfixed.propref
AND             rechrgstart <= :bparmrent.contdate1
AND            (rechrgend    > :bparmrent.contdate1
                OR rechrgend = '');
This doesn't always seem to occur on the same values of propfixed.propref. The query
seems to run fine if run in isql.

One of our team thought that we might be able to cure this by running optimizedb on the table in question.

We did this, and got a different error, from a different part of the code:
E_SC0206 An internal error prevents further processing of this query.
Associated error messages
Looking in the error log, it seems as if this is being prompted by an earlier error:
E_AD5505_UNALIGNED < Expected internal error code ... No message needed. >
ADE has detected an attempt to compile an unaligned piece of data.
We had thought that the optimizedb had either fixed the original problem (so that we
were now seeing a different one) or had "broken" something else. However, we have
since seen the original error ("consistency check") occurring again, so it may
just have been coincidence, and nothing to do with the optimizedb.

This second error is in the FETCH at the bottom of a loop that encloses the SELECT that
was producing the first error.

Does anyone know of anything I can do to investigate this further?

Thanks,
Alun.
2012-02-09   #2
kschendel
Ingres Community
Join Date: Mar 2007
Location: Pittsburgh, PA
Posts: 1,661

By strange coincidence, I saw that AD5505 / SC0206 combination yesterday for the first time. It was from the terminal monitor, though, not ESQL. Very mysterious, and it vanished when I exited the terminal monitor and went back in.

I suspect that the optimizedb had no effect and you're seeing something random. I don't know if there is any way you can pin it down other than running a debug dbms server and getting it to happen. If you have support, Tech Support might be able to offer some help (e.g. a diagnostic server).
2012-02-09   #3
alun
Junior Member
Join Date: May 2010
Posts: 6

Hi Kschendel,

It could well have some randomness to it - I've Googled the AD5505 error and turned up some Ingres
code that returns it in:

WebSVN - ingres - Rev 4643 - /main/src/common/adf/ade/adecompile.c

(This probably won't be exactly the same as the 9.2 code (given its age), but there's probably not too
much difference)

It looks as if it's complaining about certain pointers in a structure parameter not being aligned with
2 or 4 byte boundaries (depending on data type). I've not been able to trace this back any further yet,
but it sounds as if it's in the same sort of area as the "consistency check" error: compiling SQL to actual
instructions.
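
To make the alignment point concrete, here is a minimal C sketch of that kind of check. It is not taken from adecompile.c: the names is_aligned, check_operand and the sketch_type tags are hypothetical, and the only assumption carried over from the thread is that 2-byte and 4-byte types must sit on 2-byte and 4-byte boundaries respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not Ingres code): returns 1 if p lies on a multiple
 * of `boundary` bytes. `boundary` must be a non-zero power of two. */
static int is_aligned(const void *p, size_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return ((uintptr_t)p & (boundary - 1)) == 0;
}

/* Hypothetical type tags standing in for 2-byte and 4-byte data types;
 * each tag's value doubles as the boundary that type is assumed to need. */
enum sketch_type { SKETCH_INT2 = 2, SKETCH_INT4 = 4 };

/* Returns 0 when the operand's address suits its type, -1 otherwise.
 * The -1 case is the sort of condition the thread describes for AD5505. */
static int check_operand(const void *data, enum sketch_type type)
{
    return is_aligned(data, (size_t)type) ? 0 : -1;
}
```

With a 4-byte-aligned buffer buf, check_operand(buf + 2, SKETCH_INT2) passes while check_operand(buf + 2, SKETCH_INT4) fails, so the same address can be fine for one data type and rejected for another.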

As to what might cause this, I can only guess at data corruption - either in memory by the program or
Ingres libraries, or in the database (possibly during the move to 9.2).

The customer is going to drop the database and re-run the migration from 9.1 in the next few days, so
hopefully that will either fix it, or rule out DB corruption introduced by the migration.

Tech support want us to give them an SQL script to reproduce the problem, which is proving difficult.

It seems to be sticking to producing the OP0791 "consistency check" error at the moment.

By the way, which version of Ingres did you see it on?

Cheers,
Alun.
2012-02-09   #4
kschendel
Ingres Community
Join Date: Mar 2007
Location: Pittsburgh, PA
Posts: 1,661

I very much doubt that it has anything to do with database corruption, although I've been wrong once or twice before.

I got an AD5505 in a variant of the most up-to-date 10.1 code line. It happened once, and then vanished. I'm thinking that it more or less had to occur via adf_func(), but I can't pin down why, and it's not happening today. We'll get it, but it might take a little time.
2012-02-13   #5
alun
Junior Member
Join Date: May 2010
Posts: 6

My guess would be that it's something at the C level: either some pointer misuse somewhere corrupting a pointer that should be aligned, or some part of the system not using the same alignment rules.

However, it is just a guess; we might know more when the customer has done the migration again.
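
As a rough illustration of the "different alignment rules" possibility, here is a small C sketch. The struct and function names are hypothetical and have nothing to do with our actual host structures or with Ingres internals; it only shows how two components that disagree about padding would compute different offsets for the same field. It uses _Alignof, so it assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record (not from the product): a 1-byte flag, then an int. */
struct sketch_row {
    char flag;
    int  points;
};

/* Offset of `points` as this compiler lays the struct out, padding included. */
static size_t natural_offset(void)
{
    return offsetof(struct sketch_row, points);
}

/* Offset of `points` as seen by a component that assumes no padding. */
static size_t packed_offset(void)
{
    return sizeof(char);
}

/* Returns 1 if an int at `offset` inside an int-aligned buffer is aligned. */
static int offset_ok_for_int(size_t offset)
{
    return offset % _Alignof(int) == 0;
}

/* Aborts (unless NDEBUG is defined) if a layout assumption would
 * put the int field on a misaligned offset. */
static void require_aligned_int_offset(size_t offset)
{
    assert(offset_ok_for_int(offset));
}
```

Calling require_aligned_int_offset(natural_offset()) succeeds, whereas passing packed_offset() would trip the assertion on any platform where int needs more than 1-byte alignment. A mismatch like that between two parts of a system is the sort of thing I have in mind, though I have no evidence yet that it is what is happening here.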

Also, the AD5505 error may have been a coincidence, as it hasn't happened since as far as I know,
but the original OP0791 is still happening.


© 2011 Actian Corporation. All Rights Reserved