Login Register Actian.com  

Actian Community Wiki

Navigation
Learn About
Developing With
Ingres Talk
Information
Toolbox

Scalar Subqueries

From Ingres Community Wiki

Jump to: navigation, search

Contents

References

ISO/IEC 9075-2:2003 Information technology — Database languages — SQL — Part 2: Foundation (SQL/Foundation)

Discussion Document is available here.

Detailed Design Specification for SQL Scalar Sub-Selects

Scalar Subqueries

When porting applications to Ingres, support for scalar subqueries within the select clause is required. While some of this functionality exists within Ingres, it is incomplete. This project solidifies and extends the existing code to enable easier porting of applications to Ingres.

Scalar subqueries can combine multiple queries into a single SQL unit, where they can be executed as a single unit. This greatly simplifies complex SQL computations. Scalar subqueries are especially useful for data warehouse applications and those types of databases requiring complex SQL queries. A subquery is a query that is nested inside a SELECT, INSERT, UPDATE, or DELETE statement or inside another subquery. A scalar subquery is a query that returns exactly one value: a single row, with a single column. Scalar subqueries can be used in most places in a SQL statement where you could use an expression or a literal value.

The places in a query where a subquery may be used are as follows:

   * In the SELECT list used for column projection
   * In the FROM clause
   * In the WHERE clause
   * In the HAVING clause 

A subquery is often referred to as an inner query, and the statement within which it occurs is then called the outer query. There is nothing wrong with this terminology, except that it may imply that you can only have two levels, inner and outer. In fact, the Oracle implementation of subqueries does not impose any practical limits on the level of nesting: the depth of nesting permitted in the FROM clause of a statement is unlimited, and that in the WHERE clause is up to 255.

A subquery can have any of the usual clauses for selection and projection. The following are required clauses:

   * A SELECT list
   * A FROM clause 

The following are optional clauses:

   * WHERE
   * GROUP BY
   * HAVING 

The subquery (or subqueries) within a statement must be executed before the parent query that calls it, in order that the results of the subquery can be passed to the parent.

Ingres must have the ability to support a scalar subqueries in:

   *  CASE expressions
   * SELECT statements
   * VALUES clauses of INSERT statements
   * WHERE clauses
   * ORDER BY clauses
   * Parameters of a function

Scalar subqueries need not be implemented for:

   * Default values for columns
   * RETURNING clauses
   * Hash expressions for clusters
   * Functional index expressions
   * CHECK constraints on columns
   * GROUP BY and HAVING clauses

Scalar subqueries can be used to compute several different types of aggregations (max and avg) all in the same SQL statement. Note that this query uses both scalar subqueries and in-line views.

SELECT
   (SELECT MAX(salary) FROM emp) AS highest_salary,
   emp_name AS employee_name,
   (SELECT AVG(bonus) FROM commission) AS avg_comission,
   dept_name
FROM emp, (SELECT dept_name FROM dept WHERE dept = ‘finance’);

Scalar subqueries can also be used for inserting into tables, based on values from other tables. The following examples uses scalar subquery to compute the maximum credit for BILL and insert this value into a max_credit table.

INSERT INTO max_credit (name,max_credit) VALUES (
   ‘Bill’,
   SELECT MAX(credit) FROM credit_table WHERE name = ‘Bill’
);
INSERT INTO emp_salary_summary
(sum_salaries, max_salary,min_salary, avg_salary)
VALUES (
   (SELECT SUM(salary) from emp),
   (SELECT MAX(salary) from emp),
   (SELECT MIN(salary) from emp),
   (SELECT AVG(salary) from emp));


Existing Implementation

Presently scalar sub-selects are implemented in the where clause with the proviso that they are not in a case, if, nullif or coalesce context. In addition, the enforcement of the cardinality checks are only performed if the user disables singleton flattening.

Particularly missing is the ability to use them in target lists, UPDATE set clauses, CASE expressions and in table join clauses.

Proposed Implementation

By far the bulk of these changes have been in the parser and relate to tracking table and column references. The Ingres SQL parser performs a single forward scan building its parse tree as it proceeds. This has always presented a problem for column references in any target list as the name scoping rules must be applied but when the reference is made, the FROM clause is yet to be processed. Thus given a name in the target list it is not apparent what is referred to until the FROM list has been parsed. This has always been so and is overcome by rescanning the resultant parse tree once the FROM list has been seen.

However, in the context of scalar sub-queries in the target list there is the added complexity that there can be more than one level of FROM lists to resolve a column reference. Identifying the scope levels and disambiguating the column names would in retrospect have been simpler if the scanner had been designed to be two-pass from the beginning, initially building a syntax tree and then applying the symbolic work afterwords when all the tables where known.

Having built the parse tree and correctly associated the column references we apply the simple folding of the target list sub-selects to make them appear as part of the main query.

Where the sub-query constitutes a portion of a target, it can be eliminated by moving the inner resdom value to be the sub-expression result and moving the implied inner join into an outer join in the containing query. This transformation is achievable with purely parser changes. Given a general target level sub-query such as:

SELECT vars, expr1+(
    SELECT expr2 FROM t2 WHERE t1.col OP t2.col AND iqual
)
FROM t1,tbls WHERE oqual

Can be transformed into:

SELECT vars, expr1+expr2
FROM t1 LEFT OUTER JOIN t2 ON(t1.col OP t2.col),tbls
WHERE oqual AND iqual
GROUP BY vars

Well almost. The transformation has a couple of caveats namely, the outer join MUST produce no more than one row or the cardinality of the main query will be violated and secondly it needs a correlated variable.

Without the correlated variables we ought to generate a cart-prod except a simple cart-prod cannot represent the cardinality either. Fortunately, this case gets handled in the optimiser by creating a QP to be executed up front, often yielding a query constant.

The cardinality issue we would generally solve using the SEjoin mechanism but knowing that this is not very efficient, it has been decided to extend the main join operators to be able to enforce the cardinality checks on request.

The problem of CASE expressions in conjunction with sub-queries can be broken in two depending on the position of the sub-query. If the sub-query is present as the selector expression in a simple CASE expression then the CASE can be migrated into the target list of the sub-query:

SELECT vars, CASE (SELECT expr FROM t2 WHERE t1.col OP t2.col)
   WHEN 1 THEN ‘A’ ELSE ‘B’ END
FROM t1,tbls

Can be transformed into:

SELECT vars, (SELECT CASE expr WHEN 1 THEN ‘A’ ELSE ‘B’ END
   FROM t2 WHERE t1.col OP t2.col)
FROM t1,tbls

It can be further transformed (see 2.1.1) into:

SELECT vars, CASE expr WHEN 1 THEN ‘A’ ELSE ‘B’ END
FROM t1 LEFT OUTER JOIN t2 ON(t1.col OP t2.col),tbls
GROUP BY vars

If, on the other hand, the sub-query is present in one of the CASE branches or is in a searched case selector expression context then this factoring will not work and the sub-query will need the more general approach of SEjoins. However, at present, there is a limitation that SEjoins have not been coded to interact with CASE expressions. This restriction will need to be lifted to support Scalar Sub-queries fully,

Documentation

Scalar Subqueries

A scalar subquery (or scalar subselect) is a subquery that selects only one column or expression and returns one row.

Ingres has supported scalar subqueries in "expression comparison_op scalar subquery" syntax in a WHERE, ON, or HAVING clause. This enhancement provides full scalar subquery support. It allows scalar subqueries to be used anywhere in an SQL query that a column or expression can be used. For example, it can appear in the select-list, in a WHERE or ON clause of a containing query, or as an operand in any expression.

A scalar subquery can be used in the following contexts:

  • The select list of a query (that is, the expressions between the SELECT and FROM keywords)
  • The JOIN clause of a query
  • WHERE clause that contains CASE, IF, COALESCE, and NULLIF expressions
  • Source to an UPDATE statement when the subquery refers to more than the modified table
  • Qualifier to a DELETE statement where the subquery identifies the rows to delete
  • The VALUES clause of an INSERT statement

Scalar subqueries can be used to compute several different types of aggregations (max and avg) all in the same SQL statement. The following query uses both scalar subqueries and in-line views:

SELECT
  (SELECT MAX(salary) FROM emp) AS highest_salary,
  emp_name AS employee_name,
  (SELECT AVG(bonus) FROM commission) AS avg_comission,
  dept_name
FROM emp, (SELECT dept_name FROM dept WHERE dept = ‘finance’) dept1;

Scalar subqueries can also be used for inserting into tables, based on values from other tables. The following examples uses scalar subquery to compute the maximum credit for Bill and insert this value into a max_credit table.

INSERT INTO max_credit (name,max_credit) VALUES (
   ‘Bill’,
   SELECT MAX(credit) FROM credit_table WHERE name = ‘Bill’
);

INSERT INTO emp_salary_summary
(sum_salaries, max_salary,min_salary, avg_salary)
VALUES (
   (SELECT SUM(salary) from emp),
   (SELECT MAX(salary) from emp),
   (SELECT MIN(salary) from emp),
   (SELECT AVG(salary) from emp));

To be a valid scalar subquery, the subquery must produce at most a single value. That is, the result should consist of zero or one row of one column. If more than one row results, a cardinality error occurs.

Legacy Note: In prior releases that supported WHERE clause subqueries, this cardinality error was not raised unless SET NOFLATTEN was enabled. This means that cardinality errors may be reported now where previously they were ignored. A new SET option, SET [NO]CARDINALITY_CHECK, allows for the WHERE clause cardinality exception to be controlled. If set, this option raises the cardinality violation exception and otherwise restores the legacy behavior. The session default setting is derived from a CBF parameter, ii.nodename.dbms.*.cardinality_check, which defaults to ON.

Subqueries in the Target List, SET, and VALUES Clauses

Scalar subqueries allow for the result of a query to be used in other expression contexts, such as in the target list, as shown in these examples:

Example Query Using Target List Correlated Scalar Subquery

SELECT s.i,(SELECT p.j FROM ptbl p  WHERE s.i=p.i) FROM stbl s

Example Query Using Target List Uncorrelated Scalar Subquery

SELECT s.i,(SELECT MAX(col1) FROM one) FROM s

Example Query Using Target List Scalar Subquery in a derived table

SELECT relid, x.attname,x.tbl
FROM (SELECT attrelid, attrelidx, attname,attfrml,
        (SELECT relid
         FROM relation
         WHERE reltid = attrelid AND reltidx = attrelidx) tbl
      FROM attribute) x, relation
WHERE reltid = attrelid AND reltidx = x.attrelidx
ORDER BY 1,2,3;

Example UPDATE Using a Scalar Subquery in the SET clause

UPDATE stbl s SET s.j=(SELECT p.i FROM ptbl p WHERE s.i=p.i)

Example INSERT Using a Scalar Subquery in the VALUES clause

INSERT INTO stbl s (i, j) VALUES ((SELECT MAX(p.i) FROM ptbl p), 1)

Example DELETE Using a Scalar Subquery in the WHERE clause

DELETE FROM s s WHERE s.i =(SELECT p.i FROM p p);

Example Query Using Derived Tables and Scalar Subquery

SELECT relid, x.attname
FROM (SELECT attrelid, attrelidx, attname,attfrml
      FROM attribute WHERE attrelid=(SELECT MAX(reltid) FROM relation)) x, relation
WHERE reltid = attrelid AND reltidx = x.attrelidx

Customer issues

Issue 108795 update from aggregate

UPDATE tablea
    SET columna = (SELECT SUM(columnb) FROM tableb);

Issue 112926 Need ability for subselects in case statements

SELECT DISTINCT table_name,
    CASE WHEN table_owner IN(SELECT DISTINCT table_owner FROM iitables)
        THEN 'a'
        ELSE 'b'
    END AS owner
FROM iitables;

Issue 121911 Enhancement request for subselects to be supported inside join syntax

SELECT ccs.key_code, ccs.name,IFNULL(cnuTel.contence, '')
FROM adc_cust_supp ccs LEFT JOIN contact_numbers cnuTel
    ON ccs.key_id = cnuTel.key_id AND cnuTel.contact_desc_id = 1  AND
        cnuTel.sequence = (SELECT MIN(sequence) FROM contact_numbers cnu
                   WHERE cnu.key_id = cnuTel.key_id AND cnu.contact_desc_id = cnuTel.contact_desc_id)
WHERE ccs.company_id=1 AND ccs.id_typ_desc_id=1 AND ccs.key_code between '001'  and '002';

Issue 123007 Unexpected exception occurred within query compilation

SELECT c1 FROM t2
WHERE c1 = (CASE (SELECT MAX(c3) FROM t2) WHEN'c1' THEN c1 ELSE c2 END);
SELECT c1 FROM t2 WHERE c1 = (CASE c3 WHEN 'c1' THEN (SELECT MAX(c1) FROM t2) ELSE c2 END);
SELECT c1 FROM t2 WHERE c1 = (CASE (SELECT '1') WHEN '1' THEN 1 END);
SELECT (CASE (SELECT '1') WHEN '1' THEN 1 END);
SELECT (SELECT '1');

Issue 132540 Scalar Subqueries (SQL-2003 F471 Compliance)

Issue 135668 internal error processing target list aggregate sub-select

SELECT DISTINCT product.product_name,
(
SELECT COUNT(*)
FROM sales_fact_1998 INNER JOIN customer
ON sales_fact_1998.customer_id = customer.customer_id
WHERE
  ( date_part ( 'year',date('now') )
   -
   date_part ( 'year', customer.birthdate) <= 20 AND product_id = 10
  )
)
FROM product
WHERE product_id = 10;

Notes

Extent of implementation

Scalar Subqueries support in Ingres 10 is complete except for the items below.

  • CASE related expressions pose a more awkward problem and so in 10.0 only in the simple case selector exepression may a Subquery be placed. Subqueries in the WHEN expressions or the THEN/ELSE clauses are still not supported. Issue 123007 summarised the CASE related forms and of these, the following three examples are now addressed.
SELECT (CASE (SELECT '1') WHEN '1' THEN 1 END);
SELECT c1 FROM t2 WHERE c1 = (CASE (SELECT '1') WHEN '1' THEN 1 END);
SELECT c1 FROM t2 WHERE c1 = (CASE (SELECT MAX(c3) FROM t2) WHEN'c1' THEN c1 ELSE c2 END);
The next two examples will generate errors reporting the unsupported Subquery in CASE exepression error.
SELECT c1 FROM t2 WHERE c1 = (CASE c3 WHEN 'c1' THEN (SELECT MAX(c1) FROM t2) ELSE c2 END);
SELECT c1 FROM t2 WHERE c1 = (CASE WHEN c1 < (SELECT MAX(c1) FROM t2) THEN c1 ELSE c2 END);
  • The 'FIRST 1' syntax cannot be used to eliminate cardinality errors. Not strictly an SSQ restriction but due to how FIRST 'n' is implemeted in query execution.
  • A SSQ cannot be the input parameter to an DB Procedure.

Test Considerations

As unit testing has proceded, each test has been categorised and captured. These use a small set of tables that have particular characteristics of interest to the subject. These tests get included into a simple pair of sep tests, one that pulls in and executes the covered SQL tests and a similar one for those marking the limit of implementation :-). The former list has been regularly run to identify when a change breaks developed functionality.

OS Dependencies

None

Personal tools
© 2011 Actian Corporation. All Rights Reserved