Login Register Actian.com  

Actian Community Wiki

Navigation
Learn About
Developing With
Ingres Talk
Information
Toolbox

Idiom

From Ingres Community Wiki

Jump to: navigation, search

Contents

Idiom

Introduction

Idiom is a tool designed to help migrate database procedures from non-Ingres databases systems into Ingres.

Idiom works by parsing the original procedure database code, and producing an internal model of the procedure definition represented as an abstract tree. This tree is then translated into a corresponding Ingres abstract tree which is finally translated into Ingres source code. This source code represents an intial conversion of the original procedure, with a few caveats:

  • The resultant source code may contain commented out sections of the original source code which could not be converted.
  • Any special formatting of code/comments may be lost.

Technical Overview

The database procedure parsers are written as ANTLR grammars. For grammar development, the ANTLRWorks development enviroment is a useful aid to debugging the parsers.

The database grammars produce an Abstract Syntax Tree (AST) which represents the original source code. For example, an assignment to a local variable "x := 1" may be represented as (TN_ASSIGN x 1). This tree of nodes is then passed into a tree parser named IngresAST which rewrites the non-Ingres nodes into Ingres nodes where possible, nodes which cannot be parsed are placed within a special TODO node.

This rewritten tree is then passed to a "pretty printer" which converts the Ingres tree back into an Ingres dialect.

An example flow is the conversion of the Oracle function SYSDATE into the Ingres equivalent:

In PLSQL.g, the keyword SYSDATE is mapped into a tree consisting of a single node named ORA_SYSDATE.

In IngresAST.g, a pattern match for this node appears:

   plsql_sysdate
       :    ORA_SYSDATE -> ^(TN_FUNC ID["DATE"] ^(TN_PARAM STRING_LIT["'now'"]))
       ;

In essence, the grammar states that in places where a function can be expected, the SYSDATE function should be rewritten as an Ingres function whose name is DATE, and whose parameters consist of a string whose value is 'now'.

Getting Started

  • Download the source code from the Subversion repository located at http://code.ingres.com/apps/idiom/trunk
  • Within Eclipse, create a new Java Project.
    • Name the project Idiom.
    • Specify that you want to create the project from existing source.
    • Browse to the directory where you downloaded the Idiom source
    • Click Finish to create the project.

Project Overview

The project contains a build.xml file which contains a set of tasks to build the project. The default task is to build all the grammars. Once built, you may need to refresh the Eclipse project tree. The default source tree is setup to convert from Oracle to Ingres. The supplied example can be executed using the run-oracle-tests task.

Adding New Rules

Most new rules will be added to the IngresAST.g file, the simples rule conversions surround functions:

An Example

The supplied grammar does not convert the Oracle NVL function.

In AntlrWorks open the PLSQL.g grammar, select Run/Debug. A dialog is presented:

  • Enter the Text: NVL(x,y) into the Text box, and choose Start Rule: function_call.
  • Click Ok
  • Click on the Goto End icon >|
  • Click on the AST view option, the function is represented as an ORA_FUNC node with 3 children, the first of which is the name of the function NVL. Clicking on the NVL highlights that it is of token type ID.

In IngresAST.g, we can specify a rule to convert this into the corresponding Ingres function IFNULL()

Locate the plsql_functions rule.

We need to create a rule which converts NVL to IFNULL. The IngresAST.g file parser works by rewriting streams of tokens, in our example, we have a stream of four tokens: ORA_FUNC ID ID ID.

A pattern which matches this stream of tokens, and checks for a NVL is as follows:

  • Add the following line to IngresAST.g (note that this rule should already have been added)
   |  ^(ORA_FUNC f=. x+=.+) {String name = $f.getText().toUpperCase();} -> {"NVL".equals(name)}?  ^(TN_FUNC ID["IFNULL"] $x+)
-> ^(TN_FUNC $f $x+)

The translation of the above rule, is: Denote the first ID token as f, and assign a local java variable named name with it's text value. If the name of the token is NVL then replace the tree with an Ingres function named IFNULL with the same parameters. Otherwise, replace the tree with a new tree whose only difference is that we have rewritten ORA_FUNC as the Ingres TN_FUNC.

Files

Databases such as Oracle allow the invocation of other procedures via positional parameters, however Ingres requires named parameters. To facilitate this, the IngresAST.g file stores procedure parameter names postions into a flat file 'ParamNames.txt', and reads this file whenever a call to an existing procedure is required such that parameter names can be used when converting the call into an Ingres call.

Additionally, Oracle allows the datatype of variables to be of column type:

e.g

   DECLARE
       first_name    employee.first_name%type;

In Ingres, the type needs to be replaced with the actual type. To automate this, the IngresAST.g file searches within a file named 'ColumnTypes.txt' to convert between types. This file is also needed by the SQL Server module to deduce the datatype of columns for result sets. This file consists of a set of lines in the format: TABLE.COLUMN:IngresType

This file can be created manually, however a useful start is to generated the file by saving the results of the following SQL to a file:

   SELECT
       CASE DATA_TYPE
         WHEN 'NUMBER'   THEN TABLE_NAME || '.' || COLUMN_NAME || ':INTEGER' 
         WHEN 'VARCHAR2' THEN TABLE_NAME || '.' || COLUMN_NAME || ':VARCHAR(' || DATA_LENGTH || ')' 
         ELSE                 TABLE_NAME || '.' || COLUMN_NAME || ':' || DATA_TYPE
       END
       || CASE NULLABLE WHEN 'N' THEN ' NOT NULL' END
   FROM
       SYS.all_tab_columns

If the decode can not be found within the file, the datatype is left as-is.

Addendum The following SQL is for SQL Server.

 SELECT
     CASE DATA_TYPE
     WHEN 'int'              THEN UPPER(table_name + '.' + column_name) + ':INTEGER'
     WHEN 'ntext'            THEN UPPER(table_name + '.' + column_name) + ':LONG NVARCHAR'
     WHEN 'text'             THEN UPPER(table_name + '.' + column_name) + ':LONG VARCHAR'
     WHEN 'nvarchar'	     THEN UPPER(table_name + '.' + column_name) + ':NVARCHAR(' + 
                                        CAST(CHARACTER_MAXIMUM_LENGTH AS VARCHAR) + ')'
     WHEN 'varchar'          THEN UPPER(table_name + '.' + column_name) + ':VARCHAR(' + 
                                        CAST(CHARACTER_MAXIMUM_LENGTH AS VARCHAR) + ')'
     WHEN 'uniqueidentifier' THEN UPPER(table_name + '.' + column_name) + ':BYTE(16)'
     WHEN 'bit'              THEN UPPER(table_name + '.' + column_name) + ':TINYINT'
     WHEN 'datetime'         THEN UPPER(table_name + '.' + column_name) + ':DATE'
     WHEN 'tinyint'          THEN UPPER(table_name + '.' + column_name) + ':SMALLINT'
     WHEN 'image'            THEN UPPER(table_name + '.' + column_name) + ':LONG BYTE'
     ELSE UPPER(table_name + '.' + column_name) + ':' + DATA_TYPE
     END
 FROM INFORMATION_SCHEMA.COLUMNS;

Known Issues

  • Currently the Sybase and SQL Server conversions are broken.
  • Oracle conversion doesn't declare local variables for loop counters.
Personal tools
© 2011 Actian Corporation. All Rights Reserved