Login Register Actian.com  

Actian Community Wiki

Navigation
Learn About
Developing With
Ingres Talk
Information
Toolbox

Data at Rest Encryption

From Ingres Community Wiki

Jump to: navigation, search

Contents

Ingres 10.0 Data Encryption Feature

Ingres release 10.0 provides AES encryption [1] (128, 192, and 256 bit) for user data at the column level. Encrypted columns are declared when a table is created. Stored values for these columns in database table files, log files, journal files, and checkpoint files are encrypted. Access is enabled by a passphrase [2] that is declared at the time of table creation. (This passphrase can be modified by a table owner who knows the old passphrase.) Once encryption is enabled by providing the correct passphrase after server startup, encryption and decryption are handled by the Ingres server; the process is transparent to applications accessing the data. When the server is shut down, encryption is automatically disabled until it is re-activated by a holder of the correct passphrase.

Explicit user functions ("aes_encrypt" and "aes_decrypt") are also provided. These facilitate application-driven AES-128 encryption and decryption of data. Such user-encrypted data can be stored in tables as BYTE type data.

This feature is essentially code-complete, but there are known issues that are currently being addressed. If you want to participate in field testing of this feature, please read the issues list at the bottom of this wiki article and visit frequently for status updates and additional documentation. Should you want to make an observation about or post a question about the feature, feel free to use the "discussion" link on this wiki page. Of course, other Ingres forums are also at your disposal:

  • Ingres Community Forum [3]
  • comp.databases.ingres [4] (which cross-posts on the Community Forum)
  • Public bug reporting via Trac at bugs.ingres.com [5]

This wiki article was re-written in advance of the public release of a test build of Ingres 10.0 with this feature included. This article is a work in process. Visit from time to time during product testing for new documentation, product status updates, and discussion.
-- Miket 14:54, 3 May 2010 (CDT)

In a Nutshell: Synax

The type of encryption (AES128, AES192, or AES256) and the passphrase are declared at the table level. Individual columns are flagged as encrypted, and can be encrypted with salt (the default, adding extra bits to the encrypted data) or without it (NOSALT). All datatypes (char, varchar, integer, float, money, date, etc.) except for the long datatypes can be encrypted using the transparent column encryption feature.

 CREATE TABLE socsectab
 (
    fname CHAR(10),
    lname CHAR(20),
    socsec CHAR(11) ENCRYPT NOSALT
 )
    WITH ENCRYPTION=AES128,
         PASSPHRASE='this is a secret';

Data can only be read or updated in an encrypted table when access has been granted by someone who is a valid owner of the table and has knowledge of the secret passphrase. This is done using the MODIFY command:

 MODIFY socsectab ENCRYPT
    WITH PASSPHRASE='this is a secret';

Access is revoked by specifying an empty passphrase string:

 MODIFY socsectab ENCRYPT
    WITH PASSPHRASE='';

A passphrase is changed by specifying a NEW_PASSPHRASE:

 MODIFY socsectab ENCRYPT
    WITH PASSPHRASE='this is a secret',
         NEW_PASSPHRASE='we have a new secret';

After such a change is made, an additional enabling MODIFY command must be issued. This ensures that the new phrase can be reliably supplied (no typos).

In a Nutshell: Basic Demonstration

Transparent Column Encryption

In this example we use table socsec1 to demonstrate transparent column encryption at the DBMS system level and similar table socsec2 to demonstrate encryption handled at the application level. These tables contain columns for first name, last name, and social security number. Of course, some requirements (e.g. a recently-enacted law in the Commonwealth of Massachusetts) would classify all of this information as "identifying" and therefore a target for encryption. Anyone who works in application programming knows that real-world requirements go way beyond the examples that are provided with basic instructions!

First we create an encrypted table, enable access to it, insert a few rows, and then select them:

* create table socsec1
        (fname char(10),
         lname char(20),
         socsec char(11) encrypt nosalt)
    with encryption=aes256,
        passphrase='transparent encryption example'
Executing . . .

* modify socsec1 encrypt
    with passphrase='transparent encryption example'
Executing . . .

* set trace point dm805
Executing . . .

* insert into socsec1 values ('John', 'Smith', '012-33-4567')
Executing . . .
(1 row)

* insert into socsec1 values ('Lois', 'Lane', '010-40-1234')
Executing . . .
(1 row)

* insert into socsec1 values ('Charlie', 'Brown', '012-44-9876')
Executing . . .
(1 row)

* select * from socsec1
Executing . . .

+----------+--------------------+-----------+
|fname     |lname               |socsec     |
+----------+--------------------+-----------+
|John      |Smith               |012-33-4567|
|Lois      |Lane                |010-40-1234|
|Charlie   |Brown               |012-44-9876|
+----------+--------------------+-----------+
(3 rows)

There is nothing too surprising here. In fact, the "transparent encryption" lives up to its name by being transparent to the application (here, the Ingres SQL terminal monitor). Notice, though, that a TRACE POINT was enabled during the example. Trace point DM805 sends a dump of encrypted buffers to II_DBMS_LOG; DM806 does the same thing for decryption. So if we look in the log we can see the values that are being stored for the social security numbers:

!       AES 256-bit encrypt blocks:
!            1C90492C913D7D9195FED8507F0D1BFE >,I...}=.P.......<
!       AES 256-bit encrypt blocks:
!            FF24FB9037A156F6D4CE57921F0EFD07 >..$..V.7.W......<
!       AES 256-bit encrypt blocks:
!            94EE866C722BEA0AF096EF3D64347271 >l.....+r=...qr4d<

Further evidence that we're dealing with an encrypted table: if we disable the passphrase we can no longer access the table.

* modify socsec1 encrypt with passphrase=''
Executing . . .

* select * from socsec1
Executing . . .

E_US24BF A query has been issued against an encrypted table for which
    encryption is not enabled. Please contact your system administrator.
    (Mon May  3 13:00:39 2010)

Perhaps the circle of trust for the table has changed, or the time to change the passphrase (mandated by policy or law) has arrived. This change is made using the MODIFY command and the NEW_PASSPHRASE option. Note that it is the AES key in the Ingres catalog that is re-protected with a new phrase. The table data itself need not be modified to change the enabling secret.

* modify socsec1 encrypt
    with passphrase='transparent encryption example',
         new_passphrase='rotate your keys frequently!'
Executing . . .

* modify socsec1 encrypt with passphrase='rotate your keys frequently!'
Executing . . .

Note that after the MODIFY to change the passphrase a subsequent MODIFY must be issued to enable the table with the new phrase. This forces the user to prove that the change has been typed correctly. It is wise to do this immediately and before a COMMIT has been issued. If there is a problem, ROLLBACK to the old passphrase and try again.

Encryption Functions

In addition to transparent encryption, user initiated encryption via the functions aes_encrypt and aes_decrypt is also provided. These two aspects of the encryption feature address encryption in different ways and so very naturally address different needs. If you want the Ingres server to handle encryption for the application and be assured that data at rest is encryption, declare the columns as encrypted when you CREATE TABLE. If you want to control the process and provide the passphrase at the application level, use the functions. Of course the truly security minded could combine the two, declaring encrypted BYTE or VARBYTE columns and storing within them application-encrypted data.

First, let's create a table to store our function-encrypted data and insert the data from table socsec1.

* create table socsec2
        (fname char(10),
         lname char(20),
         socsec byte(16))
Executing . . .

* insert into socsec2 select
        fname, lname,
        aes_encrypt(socsec,'user function encryption')
        from socsec1

The encryption functions accept and receive VARBYTE strings (with data being automatically or explicitly converted to that data type) and a user-supplied passphrase that is used to construct the encryption key. Note that there is no need to MODIFY socsec2 to enable encryption. In fact, Ingres does not regard the table as encrypted. Only the application logic makes the socsec2 column socsec "special". The fact that the data is not plain-text is clear if we do a simple select on the table.

* select fname, lname, hex(socsec) as socsec from socsec2
Executing . . .

+----------+--------------------+--------------------------------+
|fname     |lname               |socsec                          |
+----------+--------------------+--------------------------------+
|John      |Smith               |5722A4EEDA081CB25955E826DDFA2A3F|
|Lois      |Lane                |814ACE20D419A8F36944625941155709|
|Charlie   |Brown               |40CD699016608827C7A9A4E7CDB161DF|
+----------+--------------------+--------------------------------+
(3 rows)

Only when we use the aes_decrypt function, supplying the secret passphrase, can we read the data:

* select fname, lname,
        aes_decrypt(socsec,'user function encryption') as socsec
        from socsec2
Executing . . .

+----------+--------------------+----------------+
|fname     |lname               |socsec          |
+----------+--------------------+----------------+
|John      |Smith               |012-33-4567     |
|Lois      |Lane                |010-40-1234     |
|Charlie   |Brown               |012-44-9876     |
+----------+--------------------+----------------+
(3 rows)

With transparent encryption the passphrase for a table applies to all rows and all columns in that table. With function-based encryption, however, it is up to the application to decide what byte values are saved in the socsec column of secsec2. Even within a single column data can be encrypted with different passphrases, or (for that matter) not encrypted at all, or doubly encrypted. In this example we change the passphrase for just one row in our table.

* update socsec2 set socsec = 
  aes_encrypt(aes_decrypt(socsec,'user function encryption'),'Smith socsec')
  where lname='Smith'
Executing . . .
(1 row)

* select aes_decrypt(socsec,'Smith socsec') from socsec2 where lname='Smith'
Executing . . .

+----------------+
|col1            |
+----------------+ 
|012-33-4567     |
+----------------+
(1 row)

Now, if we select all rows from the table supplying the original passphrase, decryption of the row for 'Smith' fails, and a blank string is returned for that row.

* select aes_decrypt(socsec,'user function encryption') from socsec2
Executing . . .

+----------------+
|col1            |
+----------------+
|                |
|010-40-1234     |
|012-44-9876     |
+----------------+
(3 rows)

HELP TABLE

To help us better understand the above socsec1 and socsec2 example, let's look at the HELP TABLE output for the two tables.

The socsec1 table is Ingres system-encrypted. The HELP TABLE command displays additional information about the table:

  • Encryption type
  • Physical width of encrypted columns
  • Physical width of the table

The logical width (width as seen by applications) is 44, which is the sum of the column widths including the NULL bytes. Since AES is a block encryption algorithm, encrypted column widths as stored on disk will always be a multiple of 16. The NULL byte is included in the encrypted data, as is a data verification hash value. In this case the socsec column with NULL (12 bytes) plus verifying hash (4 bytes) exactly fits in one AES block. Partial AES blocks are padded as needed in cases where the user data is not an exact fit.

* help table socsec1
Executing . . .

Name:                 socsec1
Owner:                ingres
Created:              03-may-2010 13:10:57
Location:             ii_database
Type:                 user table
Version:              II10.0
Page size:            2048
Cache priority:       0
Alter table version:  0
Alter table totwidth: 44
Row width:            44
Number of rows:       3
Storage structure:    heap
Compression:          none
Duplicate Rows:       allowed
Number of pages:      3
Overflow data pages:  0
Journaling:           enabled after the next checkpoint
Base table for view:  no
Permissions:          none
Integrities:          none
Optimizer statistics: none

Column Information:
                                                                          Key
Column Name                      Type               Length Nulls Defaults Seq
fname                            char                   10  yes    null
lname                            char                   20  yes    null
socsec                           char                   11  yes    null

Secondary indexes:    none

Column encryption:    AES256
Alter table totwidth: 48
Encrypted width:      48

Encrypted Column Name            Type        Width Salt
socsec                           char           16   no

Although socsec2 is used by application SQL code to store encrypted data, there is nothing special about the table. It is up to the application to ensure that the encrypted data fits in the BYTE column that is put to that purpose. In this case the sum of the VARBYTE length for the encrypted data (2) plus that data itself (11) fits within one AES block, which is stored in our BYTE(16) column.

* help table socsec2
Executing . . .

Name:                 socsec2
Owner:                ingres
Created:              03-may-2010 13:10:58
Location:             ii_database
Type:                 user table
Version:              II10.0
Page size:            2048
Cache priority:       0
Alter table version:  0
Alter table totwidth: 49
Row width:            49
Number of rows:       3
Storage structure:    heap
Compression:          none
Duplicate Rows:       allowed
Number of pages:      3
Overflow data pages:  0
Journaling:           enabled after the next checkpoint
Base table for view:  no
Permissions:          none
Integrities:          none
Optimizer statistics: none

Column Information:
                                                                          Key
Column Name                      Type               Length Nulls Defaults Seq
fname                            char                   10  yes    null
lname                            char                   20  yes    null
socsec                           byte                   16  yes    null

Secondary indexes:    none

Computing the Width of Encrypted Data

Encrypted data takes up more room than unencrypted data. There are three reasons for this:

  1. AES is a block cipher [6] that operates only on 16-byte chunks, so padding is often necessary
  2. at-rest encrypted data includes a 4-byte hash to validate decryption processing
  3. adding SALT to guarantee the unique encryption of each row of an encrypted column adds 16-bytes of overhead

The HELP TABLE command displays the physical width of encrypted columns in a special encryption section at the bottom of that report.

Algorithm:

  1. start with the natural width of the column
  2. remember to add 1 for nullable columns
  3. add 4 for the verification hash
  4. round up to the nearest multiple of 16

The goal of encryption is to make meaningful data appear to be a random series of bits until the encryption algorithm in combination with the encryption key is used to restore the data to its original state. One result is that encrypted data does not compress well, and compression of such columns is intentionally skipped by Ingres compression processing.

Thus, the net effect on disk storage needs of encryption is a combination of the expansion of the encrypted rows for necessary overhead, and the loss of compressibility of the encrypted columns.

The aes_encrypt function accepts as input a string of type VARBYTE and encrypts the whole thing, including the 2-byte prefix that holds the VARBYTE length. So to compute the length of the encrypted output of the function:

  1. start with the length of the input as VARBYTE (that is, after casting to that data type if the original input is of another type)
  2. add 2 for the length prefix
  3. round up to the nearest multiple of 16

If you store aes_encrypt encrypted data in a database table, be sure to allocate sufficient space for the full encrypted data length. Truncated encrypted data cannot be decrypted.

AES Key vs. Passphrase

Encryption works by obscuring information so that it appears random; returning the seeming randomness to the original pattern is prohibitively costly and time consuming without the cryptographic secret but is computationally easy and quick with the secret. For Ingres encryption of "data at rest" the AES encryption key is stored in the catalog at the table level; the data to be unlocked and the key that unlocks it are stored in the same place. This means that, from a DBMS operational point of view, data and key are stored, recovered, and backed up together. But how can it be safe to store the key in proximity to the lock that it opens?

This is where the passphrase comes in. The trick is to separate the runtime AES key for user data from the secret that makes the user data accessible. The passphrase is a lockbox that is used to secure the catalog-stored key, and this lockbox has the same cryptographic hardness as the protection for the user data itself. When a encrypted table is created an AES key is randomly generated (unless the AESKEY= option is specified, in which case the specified key bit pattern is used). In either case the key is protected by an AES key that is generated from the passphrase, which protective key is not stored. It is important that the passphrase be both protected from discovery and faithfully remembered, and that when the passphrase is changed this history of changes be tracked and that this history be correlated with the checkpoint backups of the database. If the passphrase is lost, the procedural solution is to retore the database from the last backup for which the passphrase is known. Loss of the passphrase can be equated, procedurally, to loss of the data that the passphrase protects and unlocks.

The fact that there is one AES key that encrypts the user data and another passphrase-derived key that protects this first key has a few implications for database design and operations:

  • If the passphrase needs to be changed (as, for example, when one security officer leaves a project and another assumes that role), the MODIFY command is used to make this change. The catalog-stored key is decrypted with the old passphrase, re-encrypted with the new passphrase, and then replaced in the catalog. Database backups before this point in time will still be accessed using the old passphrase. Subsequent backups will use the new passphrase. The user data is not re-encrypted, because it is the passphrase (the key's lockbox) that has changed, not the underlying user key itself. There is no need to suspend use of the table for this operation.
  • The encrypted binary representation of data is controlled by the user AES key, not the passphrase. Thus, if a backup or replication scheme works at the binary level by transferring and applying journal records, the original table and the copy table must use the same catalog-stored AES key. This will be the case if (1) the catalogs themselves were copied at the binary level or (2) the AESKEY= option was used when the table was created. Note that tables that share an underlying AES key for encrypting user data can be protected by different passphrases.
  • In general, the AES key for user data will be different for every table. Because of this there is no canonical binary representation of encrypted data. As we will see below, even within a column of a table the same value will have differing encypted representations. When "at rest" within the encryption-protected table, the underlying data values cannot be read. However, when being processed by a database application the plain text value is, of course, readable. Included in data that is outside of this protection is data that is copied out to a flat file. Therefore, such files should be appropriately protected (for example, written to an encrypted device or file system).

Understanding SALT

SALT / NOSALT is a column-level option that specifies whether or not 16 bytes of random bits are added to user data before encryption (and stripped out after decryption). What protection does SALT (which is the default) provide, and under what circumstances should NOSALT be used?

Consider, for example, a medical database that contains a column indicating the status for each patient: whether they have had a particular medical test and, if the test has been done, whether they tested postive or negative for the condition. This is certainly the sort of data that must be keep confidential, and encryption can be one tool to ensure this confidentiality. Assume that we use Y (for yes) to indicate postive, N (for no) to indicate negative, and a NULL data setting to indicate that the test has not been done. We encrypt this column and we have now protected our patients, right?

If SALT was used then indeed the data is protected. But if we decided that 16 extra bytes on disk was too high a cost to pay and specified NOSALT, then we have a serious problem. Given the same input and the same key, encryption is repeatable. If we encrypt Y again and again with the same AES key we will keep getting the same encrypted result. Ditto for the N. Someone who has stolen a hard drive and is working to "crack" our database will have little trouble correlating the repeated patterns with the easily learned frequency of positive and negative test results in the general population. Without needing to crack the actual AES encryption the secret is out with regard to the medical status for this condition for our patient database. Our error: that we hid the values Y and N but did not hide the information that Y and N were encoding.

Now consider the same scenario with salt added to the encryption. Every row will contain, in the indicator column, a different encrypted value. Because the null indicator is included in the encryption protection it will be impossible to know who has been tested or (for those tested) who has tested positive for the condition. In fact, not even statistical information can be gleaned. Our medical test result column will, absent knowledge of the passphrase protecting the AES key for the table, appear to contain nothing but meaningless binary noise. This result is the goal of encryption.

So SALT seems to be an unambiguously good thing. Why not use it all the time?

One obvious reason would seem to be the disk storage cost; salt adds one AES block (16 bytes) to the column. Still, that would seem to be a small price to pay for better protection. If the data is worth protecting with encryption (which already adds to storage and computational overhead) then the salt is just a small additional incremental cost, especially in an age when storage is relatively cheap. But consider another column in our hypothetical medical database that contains values that do not repeat: the US Social Security number of the patient. With SALT each encrypted column will have a distinct value, of course. But even without salt (specifying NOSALT at column definition) the encrypted values will be distinct. (Furthermore, they would be distinct from the the same Social Security number stored encrypted in another table, because that other table will have a different AES key, even if the tables are protected with the very same passphrase.) In this case we still have apparent randomness in our encrypted values.

Besides hiding column value repetitions (e.g. Y, N, null) the SALT option has a more subtle effect: it renders the encryption non-repeatable (since a different, random, salt prefix is generated for each encryption operation). In the case of our Social Security numbers NOSALT still yields distinct encrypted values (because of the distinct underlying data). Are we leaking any information? Only the fact that Social Security numbers are different for each patient, but this is common knowledge. And what do we gain with NOSALT? The fact that the encryption is repeatable means that we can index the table on the encrypted value of the column, without revealing any characteristics of the underlying plain text for the column.

Which leads us to our next topic ...

Indexing Encrypted Columns

The problem of indexing encrypted columns is clear when you read these propositions:

  • An index organizes values alphabetically for fast lookup.
  • Encryption randomizes data in a way that destroys alphabetical order.

Consider a column containing these values:

Alice
Bob
George
John
Paul
Ringo

Encrypting each with the same arbitrary key we get:

FFB288991173D98EF3DC324DB41D1018
144E08CD22321DD33485ED2DF1CAC073
E9FE977A04BA5F480C007D71059B027C
ED58D4408D0CBBDEF8737CF47C648107
751D1AB5A1B0106E15DC96F118D9203D
67E4F916301203448E82B0543ECB41A5

If we build an index with the plaintext names we've revealed our secret data. So we build an index containing the encrypted values and a tuple id pointer (tidp) that records the row in the base table containing the encrypted value.

144E08CD22321DD33485ED2DF1CAC073 2
67E4F916301203448E82B0543ECB41A5 6
751D1AB5A1B0106E15DC96F118D9203D 5
E9FE977A04BA5F480C007D71059B027C 3
ED58D4408D0CBBDEF8737CF47C648107 4
FFB288991173D98EF3DC324DB41D1018 1

What can we do and, equally important, what can we not do with this index?

Suppose a user asks for the record in the table for George. Indexed encrypted columns must not use SALT, so we can encrypt George and find the corresponding index entry and from that, using the tidp, directly retrieve the row.

On the other hand, what if the user asks for all names greater than George? The relationship of the encrypted values, alphabetically, does not match the relationship of the underlying plaintext values. Our index is not designed to maximally facilitate this request; if it were then we would have a breach in our data security! To satisfy this query we would need to do a full read of the index or base table, decrypting and testing each plaintext value against the string 'George'.

This is why encrypted indexes must be defined as type HASH; the hash function inherently provides for, and only for, direct lookup in this manner.

The cost of doing a range lookup or string pattern match on encrypted columns is much higher than a similar operation on an unencrypted column. In general, such queries should (for performance reasons) be avoided.

Understanding Cipher Modes

Consider this command, inserting into an encrypted column:

insert into cbc values ('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');

The stored value (for this particular table's AES key) is:

44FBD13C3627EBC5E455DDBC82F93183
A988A727DA2F2BF3E0B62F0360E93907
7415657A01FD33AD82A113F0CA4EF2D3

Do you see anything surprising about this? (By the way, the surprise is not that there are three AES encrypted blocks and not two; the "extra" bytes are needed to hold the verification hash.)

The plaintext string is two repetitions of the string 'aaaaaaaaaaaaaaaa'. How is it (given that AES encryption works on 16-byte blocks) that the resulting encrypted value does not show the same repetition?

The secret is that Ingres encryption implements AES encryption for each 16-byte block and the Cipher-block chaining (CBC) mode across all the blocks being encrypted (salt, where applicable, included).

See Wikipedia for an excellent explanation of the pseudo-code behind CBC. [7] The security ramifications of failing to use something like CBC and instead choosing the simplest mode, Electronic codebook (ECB), is beautifully described and illustrated in that article as well. [8]

CBC is used by the AES_DECRYPT and AES_ENCRYPT functions and by the code implementing encryption of data at rest in encrypted columns; it is not an option that you need to select or worry about. But it does allow you to rest soundly at night knowing that any patterns in your protected data are completely obliterated by the Ingres encryption feature. And you can no doubt impress your friends and confound your enemies by making reference to "AES 256-bit encryption implemented with Cipher-block chaining".

COPYDB Considerations

The encryption of table columns "at rest" and the AES_ENCRYPT / AES_DECRYPT functions can be thought of, respectively, as server-based encryption and application-based encryption. Each has its uses in helping to secure data, depending upon the business problem being solved. Neither is a panacea for data security. Full system security depends upon a comprehensive analysis of the data life cycle and a comprehensive analysis of physical and logical system operations, including hardware security, general system integrity (e.g. virus protection), and application design. The credit card industry's PCI DSS [9] is a useful comprehensive checklist of such issues.

Data "at rest" protects Ingres base table, log, journal, and checkpoint files. Thus it protects table information, transaction information, and full database backups. It does not protect data outside of the database, which includes data passing back and forth to applications, transactions that are implementing data replication at a logical (vs. binary, journal application) level, and, most importantly, files created using COPYDB.

The binary representation of data is AES-key specific. In effect, data is binary-incompatible across tables within a single database. This is a part of the strength of encryption: binary data representations are unpredictable and appear to be random and patternless. Thus, encrypted data has no canonical representation absent a known and table-shared AES key.

The COPYDB command creates a copy.out file that writes the decryption-enabled table to a flat file as plaintext. It is important that the resulting file be encrypted with file or disk encryption facilities. The corresponding copy.in file contains encryption syntax to recreate the exported table with the original column encryption. The standard passphrase 'TEMPORARY PASSPHRASE' is encoded in the copy.in file. It would be good practice to use this or some other temporary phrase for the import of the table, and then to MODIFY the passphrase immediately after import.

The Power of Encryption

As the above discussion of COPYDB makes clear, it is a mistake to think that one can specify ENCRYPT on sensitive columns and thereby secure data in all aspects of its life cycle. So don't overestimate the effect of using this one tool.

On the other hand, don't underestimate the power of AES encryption as implemented in the Ingres server. The question is sometimes heard: how do I get my data back if I lose the passphrase? The Ingres server cannot distinguish between a forgetful valid user and a data thief, and there are no back doors or tricks that allows decryption without the key. With the passphrase to unlock the internal AES key the data is transparently accessible. Without the key the logical data is unreadable and is only meaningless bit patterns. Numeric hashes in the catalog and within each stored column monitor the correctness of keys and the health of the overall encryption system.

Optimizedb Considerations

The optimizedb utility creates histograms for columns, which information is useful for the query optimizer when calculating the cost of alternative query plans. The histogram cells are just statistical extracts, and cannot be linked back to any particular row in the table (except of course in the edge case where there is only a single table row).

Even for encrypted columns these statistics are plain text. Is this a security problem? That depends entirely on the nature of the data and of the business situation. By default, the Ingres 10.0 optimizedb program will skip over encrypted columns, but they can be included by specifying the "-ze" flag. Alternatively, the -r (relation, i.e. table) and -a (attribute, i.e. column) flags can be used to specify exactly which tables and columns should be included.

Note: For Ingres Star databases optimizedb is not able to determine whether or not a column is encrypted. So for Ingres Star the -r and -a flags must be used to control the columns for which statistical data is gathered, should the user wish to exclude encrypted columns.

dmf_crypt_maxkeys

The cbf parameter dmf_crypt_maxkeys controls how many active encryption keys are available at one time for an Ingres installation. A key is "active" if, for an encrypted table, a MODIFY statement has been issued to unlock the table for encryption by supplying the passphrase. If maxkeys is reached, slots can be freed up by re-locking tables with MODIFY, supplying an empty string for the passphrase. Slots are also freed up, of course, when a table is dropped. For a new maxkeys setting to take effect the installation must be stopped and restarted. The default value for dmf_crypt_maxkeys is 180, a number that conveniently fits on a single memory page for many computer architectures.

Log and Journal Records and auditdb

The recording of UPDATE statement after images in the log file is handled a little bit differently for encrypted tables. This is because:

  • encrypted changes are physically wider than the values of the decrypted column values
  • when SALT is used to further obfuscate the plaintext value, each re-encryption of the data yields a different physically stored value

Because auditdb displays encrypted records in hex format and without decryption, these characteristics of encrypted log and journal files are evident in the output of that command.

Consider this simple script, in which a two-integer table (only the second of which is encrypted) is created, populated, and then updated:

create table auditdemo (c1 int, c2 int encrypt)
with encryption=aes256, passphrase='auditdb demo';
modify auditdemo encrypt with passphrase='auditdb demo';
insert into auditdemo values (1,2);
update auditdemo set c1=3 where c1=1;
update auditdemo set c1=4 where c1=3;

Even in the hex display we can see that c1 starts off as value 1 and then is updated to 3 and then to 4. But notice also that the remainder of these rows appears to change as well. Indeed, each re-write of the row causes a re-encrypted that produces a different value.

Audit for database aud                             				18-Aug-2010 12:50:03.59			Page    1

Begin   : Transaction Id 00004c584d495a7e 18-Aug-2010 12:49:56.89			Username ingres                          
  Create  : Transaction Id 00004c584d495a7e Id (268,0)		Table [auditdemo            ,ingres   ]
	Location $default                        
  Insert/Append  : Transaction Id 00004c584d495a7e Id (268,0)			Table [auditdemo,ingres]
    Record: 01000000006d8f35d1cff37f6e7eab7070403c13b785bc20e6e489f41731daa0230db78884
  Update/Replace : Transaction Id 00004c584d495a7e Id (268,0)				Table [auditdemo,ingres]
    Old:    01000000006d8f35d1cff37f6e7eab7070403c13b785bc20e6e489f41731daa0230db78884
    New:    03000000006e954a05f5c9f8a5b73234ec957386c23bf686d8141cb180d190a4b8c7bc17f6
  Update/Replace : Transaction Id 00004c584d495a7e Id (268,0)				Table [auditdemo,ingres]
    Old:    03000000006e954a05f5c9f8a5b73234ec957386c23bf686d8141cb180d190a4b8c7bc17f6
    New:    0400000000829e143ee15459d75aa0f8d3238a2b1ea1643a594b6dffb4792c5b59f8657bb5
End     : Transaction Id 00004c584d495a7e 18-Aug-2010 12:49:56.90

But we can see, in the script, that column c2 was not updated by the SQL statements. So it is no surprise that despite its various encrypted incarnations, the underlying value is the one that was specified when the row was first inserted:

select * from auditdemo
Executing . . .

+-------------+-------------+
|c1           |c2           |
+-------------+-------------+
|            4|            2|
+-------------+-------------+
(1 row)

The fact that encrypted columns are in their encrypted format in the journals and thus in the audit trails created with auditdb implies that the encrypted byte format must be specified for tables into which audit trails are copied. Take a look at the Command Reference manual's example of auditdb and the audit table created for the emp table. Now imagine that we decide to encrypt the salary column in the emp table:

create table emp
(eno I2,
ename char(10),
age I1,
job I2,
salary money encrypt,
dept I2)
with encryption=aes192, passphrase='encrypted audit example';

Examining the HELP TABLE output for emp shows us that the salary column will be 32 bytes long in the audit record:

Column encryption:    AES192
Alter table totwidth: 54
Encrypted width:      54

Encrypted Column Name            Type        Width Salt
salary                           money          32  yes

Therefore, when we create an empaudit table to hold the audit trail for emp, we need to define salary as a non-nullable byte(32):

create table empaudit
(date date not null with default,
usrname char(32) not null with default,
operation char(8) not null with default,
tranid1 integer not null with default,
tranid2 integer not null with default,
table_id1 integer not null with default,
table_id2 integer not null with default,
eno I2,
ename char(10),
age I1,
job I2,
salary byte(32) not null,
dept I2);

Note that although the terminal output format for auditdb for encrypted tables is all in hex and thus difficult for humans to read, the imported audit trail is quite readable. The encrypted column, of course, is in AES encrypted format and thus readable only at this level. The plaintext value is protected.

select date, operation, eno, hex(salary) as salary from empaudit
Executing . . .

+-------------------------+--------+------+----------------------------------------------------------------+
|date                     |operatio|eno   |salary                                                          |
+-------------------------+--------+------+----------------------------------------------------------------+
|01-oct-2010 18:46:05     |append  |     1|CC438DD9ECEE3E406D52C4F7FDCA85FBD5BD11374E77E6819E024818E0C646B0|
|01-oct-2010 18:46:05     |append  |     2|C3A5D3012B3EAAD877F2E2AFD02725225FE987370C7A12A97BAE32F6C7AF0B6B|
+-------------------------+--------+------+----------------------------------------------------------------+

Partitioning

Partitioning on an encrypted column should, in general, be avoided because of a security issue: the partition into which each row is assigned will reveal how the encrypted column's plaintext value tested against the partitioning rules. Note that this applies only to partitioning on the encrypted columns themselves. There is no problem with including encrypted columns in partitioned tables, nor with indexing on such a column.

Ingres Enhancement Number

  • 107791
  • 138554 PPG Industries

References

The Detail Design Specification for this feature is available as a PDF file: DDS-ColmnEncryption.pdf.

The DDS includes a list of external reference documents.

Personal tools
© 2011 Actian Corporation. All Rights Reserved