## HOW TO USE ROLLUP and CUBE

HOW TO USE CUBE:

The CUBE operator generates a result set that is a multidimensional cube. A multidimensional cube is an expansion of fact data, or data that records individual events. The expansion is based on columns that the user wants to analyze. These columns are called dimensions. The cube is a result set containing a cross tabulation of all the possible combinations of the dimensions.

The CUBE operator is specified in the GROUP BY clause of a SELECT statement. The select list contains the dimension columns and aggregate function expressions. The GROUP BY specifies the dimension columns and the keywords WITH CUBE. The result set contains all possible combinations of the values in the dimension columns, along with the aggregate values from the underlying rows that match that combination of dimension values.

For example, a simple table Inventory contains:

``````Item                 Color                Quantity
-------------------- -------------------- --------------------------
Table                Blue                 124
Table                Red                  223
Chair                Blue                 101
Chair                Red                  210``````

This query returns a result set that contains the Quantity subtotal for all possible combinations of Item and Color:

```SELECT Item, Color, SUM(Quantity) AS QtySum FROM Inventory GROUP BY Item, Color WITH CUBE```

Here is the result set:

``````Item                 Color                QtySum
-------------------- -------------------- --------------------------
Chair                Blue                 101.00
Chair                Red                  210.00
Chair                (null)               311.00
Table                Blue                 124.00
Table                Red                  223.00
Table                (null)               347.00
(null)               (null)               658.00
(null)               Blue                 225.00
(null)               Red                  433.00``````

The following rows from the result set are of special interest:

Chair (null) 311.00

This row reports a subtotal for all rows having the value Chair in the Item dimension. The value NULL is returned for the Color dimension to show that aggregate reported by the row includes rows with any value of the Color dimension.

`Table (null) 347.00`

This row is similar, but reports the subtotal for all rows having Table in the Item dimension.
``` (null) (null) 658.00```

This row reports the grand total for the cube. Both the Item and Color dimensions have the value NULL showing that all values of both dimensions are summarized in the row.

```(null) Blue 225.00 (null) Red 433.00```

These two rows report the subtotals for the Color dimension. Both have NULL in the Item dimension to show that the aggregate data came from rows having any value for the Item dimension.

Using GROUPING to Distinguish Null Values
The null values generated by the CUBE operation present a problem: How can a NULL generated by the CUBE operation be distinguished from a NULL returned in the actual data? This is achieved using the GROUPING function. The GROUPING function returns 0, if the column value came from the fact data, and 1 if the column value is a NULL generated by the CUBE operation. In a CUBE operation, a generated NULL represents all values. The SELECT statement can be written to use the GROUPING function to substitute the string ALL in place of any generated NULL. Because a NULL from the fact data indicates the data value is unknown, the SELECT can also be coded to return the string UNKNOWN in place of any NULL from the fact data. For example:

```SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL' ELSE ISNULL(Item, 'UNKNOWN') END AS Item, CASE WHEN (GROUPING(Color) = 1) THEN 'ALL' ELSE ISNULL(Color, 'UNKNOWN') END AS Color, SUM(Quantity) AS QtySum FROM Inventory GROUP BY Item, Color WITH CUBE```

Multidimensional Cubes
The CUBE operator can be used to generate n-dimensional cubes, or cubes with any number of dimensions. A single dimension cube can be used to generate a total, for example:

```SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL' ELSE ISNULL(Item, 'UNKNOWN') END AS Item, SUM(Quantity) AS QtySum FROM Inventory GROUP BY Item WITH CUBE GO```

This SELECT statement returns a result set showing both the subtotals for each value of Item and the grand total for all values of Item:

``````Item                 QtySum
-------------------- --------------------------
Chair                311.00
Table                347.00
ALL                  658.00``````

SELECT statements that contain a CUBE with many dimensions can generate large result sets, because these statements generate rows for all combinations of the values in all the dimensions. These large result sets may contain too much data to be easily read and understood. One solution to this problem is to put the SELECT statement into a view:
``` CREATE VIEW InvCube AS SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL' ELSE ISNULL(Item, 'UNKNOWN') END AS Item, CASE WHEN (GROUPING(Color) = 1) THEN 'ALL' ELSE ISNULL(Color, 'UNKNOWN') END AS Color, SUM(Quantity) AS QtySum FROM Inventory GROUP BY Item, Color WITH CUBE```

The view can then be used to query only the dimension values of interest:

```SELECT * FROM InvCube WHERE Item = 'Chair' AND Color = 'ALL'```

```Item                 Color                QtySum
-------------------- -------------------- --------------------------
Chair                ALL                  311.00```

HOW TO USE  ROLLUP:

The ROLLUP operator is useful in generating reports that contain subtotals and totals. The ROLLUP operator generates a result set that is similar to the result sets generated by the CUBE operator. For more information, see Summarizing Data Using CUBE.

The differences between CUBE and ROLLUP are:

CUBE generates a result set showing aggregates for all combinations of values in the selected columns.

ROLLUP generates a result set showing aggregates for a hierarchy of values in the selected columns.
For example, a simple table Inventory contains:

````Item                 Color                Quantity`
`-------------------- -------------------- --------------------------`
`Table                Blue                 124`
`Table                Red                  223`
`Chair                Blue                 101`
`Chair                Red                  210````

This query generates a subtotal report:

```SELECT CASE WHEN (GROUPING(Item) = 1) THEN 'ALL' ELSE ISNULL(Item, 'UNKNOWN') END AS Item, CASE WHEN (GROUPING(Color) = 1) THEN 'ALL' ELSE ISNULL(Color, 'UNKNOWN') END AS Color, SUM(Quantity) AS QtySum FROM Inventory GROUP BY Item, Color WITH ROLLUP```

``````Item                 Color                QtySum
-------------------- -------------------- --------------------------
Chair                Blue                 101.00
Chair                Red                  210.00
Chair                ALL                  311.00
Table                Blue                 124.00
Table                Red                  223.00
Table                ALL                  347.00
ALL                  ALL                  658.00``````
`(7 row(s) affected)`

If the ROLLUP keyword in the query is changed to CUBE, the CUBE result set is the same, except these two additional rows are returned at the end:

```ALL Blue 225.00 ALL Red 433.00```

The CUBE operation generated rows for possible combinations of values from both Item and Color. For example, not only does CUBE report all possible combinations of Color values combined with the Item value Chair (Red, Blue, and Red + Blue), it also reports all possible combinations of Item values combined with the Color value Red (Chair, Table, and Chair + Table).

For each value in the columns on the right in the GROUP BY clause, the ROLLUP operation does not report all possible combinations of values from the column (or columns) on the left. For example, ROLLUP does not report all the possible combinations of Item values for each Color value.

The result set of a ROLLUP operation has functionality similar to that returned by a COMPUTE BY; however, ROLLUP has these advantages:

ROLLUP returns a single result set; COMPUTE BY returns multiple result sets that increase the complexity of application code.

ROLLUP can be used in a server cursor; COMPUTE BY cannot.

The query optimizer can sometimes generate more efficient execution plans for ROLLUP than it can for COMPUTE BY.