Shuffle Function

The Shuffle Function replaces a source value with another value from the column that is then inserted in a destination column. The source row and the row that contains the replacement value will never be the same, but depending on your data, source and replacement values can be identical.

You can indicate the number of times the function will refetch a replacement value until a value that does not match the source value is found (a “retry”), or you can allow a replacement value to match the source. Each Shuffle Function operates independently of other Shuffle Functions used in a Column Map.

Use the Actions tab in Personal Options to define a default that determines either the number of retries allowed per row or if the replacement value can match the source (no retry). You can use the retry parameter in the function to override this default. If the maximum number of retries is reached before a value that does not match the source is found, a conversion error will occur.

There are two forms of the Shuffle Function, single column and multiple column. The single column form inserts a replacement value into a single destination column. The multiple column form inserts replacement values from multiple columns in a row into corresponding destination columns. A column cannot be included in more than one Shuffle Function in a Column Map. If the retry feature is used with a multiple column shuffle, the function will refetch another replacement row if any value in the source row column matches the value in a corresponding replacement row column. (The multiple column form cannot be used in a Propagate Function.)

To create a multiple column Shuffle Function, enter the function for a source column that will be replaced with shuffled values, and edit the Column Map to remove the names of any other source columns with values that will also be replaced.

The ignore parameter prevents the function from replacing a source row or using a replacement row if either contains a specified value (NULL, SPACES (for CHAR columns), or zero-length VARCHAR). If no retries are allowed, the ignore parameter will not apply to the replacement row.

The syntax is:


SHUFFLE [ ( dest=(col1,coln) ) ] | 
[ ( dest=(col1,coln) , retry[=number] ) ] |
[ ( dest=(col1,coln) [ , retry[=number] ] , ignore=( col1 ( [spaces] | [spaces,null] 
| [spaces,null,zero_len] | [null] | [null,zero_len] | [zero_len] ) , coln (….) ) ) ] |
[ ( retry[=number] ) ] | [ ( retry[=number] , ignore=( col ( [spaces] | [spaces,null] | [spaces,null,zero_len]
| [null] | [null,zero_len] | [zero_len] ) ) ) ] |
[ ( ignore=( col ( [spaces] | [spaces,null] | [spaces,null,zero_len]
| [null] | [null,zero_len] | [zero_len] ) ) ) ]

where:

dest=
Names of the destination table columns in which replacement values are inserted. (Required for multiple column shuffle.)
col1, coln, ...
Destination table column names.
retry
Number of times to refetch a replacement value to find a value that does not match the source row. Enter zero to allow a replacement value to match the source. This parameter overrides the default set on the Actions tab in Personal Options.
Note: Using a high retry value with columns that contain many duplicate values will increase the processing time. For these columns, it may be best to use a retry value of zero.
=number
Enter a value in the range 0-1000. Enter 0 to allow a replacement value to match the source. (If you enter "retry" alone, Optim™ uses the default set in Personal Options.)
ignore=
List of columns for which the function will not replace a source value or not use a replacement value if either is a specified value (NULL, SPACES (for CHAR columns), or zero-length VARCHAR). If a replacement value is ignored, the function will refetch another replacement value. If no retries are allowed, the ignore parameter will not apply to replacement values.
col
The source column name.

For single column shuffle, enter one column name only.

For multiple column shuffle, the order of the column names must correspond to the destination table columns in the dest= parameter. The number of columns must equal the columns in the dest= parameter, and at least one column must include values. To not specify values for a column, do not enter a value. For example, coln().

null
Do not replace the source value or use a replacement value if either is a NULL value.
spaces
Do not replace the source value or use a replacement value if either is a SPACES value. For CHAR columns only.
zero_len
Do not replace the source value or use a replacement value if either is a zero-length VARCHAR value.

Single Column Default Example

The following example inserts shuffled values in a single column using the default retry setting in Personal Options.

SHUFFLE

Single Column Retry Example

The following example inserts shuffled values in a single column and refetches a replacement value that does not match the source up to 12 times.

SHUFFLE(RETRY=12)

Multiple Column Example

The following example inserts shuffled values in the STATE and ZIP columns and refetches a replacement value that does not match the source up to 12 times.

SHUFFLE(DEST=(STATE,ZIP),RETRY=12)

Ignore Example

The following example inserts shuffled values in the STATE and ZIP columns and refetches a replacement value that does not match the source up to 12 times. The example also does not replace a source value or use a replacement value for the STATE column if a source or replacement row contains a NULL or SPACES value, but does not ignore any source or replacement rows for the ZIP column.

SHUFFLE(DEST=(STATE,ZIP),RETRY=12,
IGNORE=(STATE(NULL,SPACES),ZIP()))