Column Store Index in SQL Server 2016  

  • A columnstore is data that is logically organized as a table with rows and columns, and physically stored in a column-wise data format.A columnstore index physically stores most of the data in columnstore format. In columnstore format, the data is compressed and uncompressed as columns. There is no need to uncompressed other values in each row that are not requested by the query. This makes it fast to scan an entire column of a large table.
  • A rowstore is data that is logically organized as a table with rows and columns, and then physically stored in a row-wise data format. This has been the traditional way to store relational table data such as a heap or clustered "btree" index. A columnstore index also physically stores some rows in a rowstore format called a deltastore. The deltastore, also called delta rowgroups, is a holding place for rows that are too few in number to qualify for compression into the columnstore. Each delta rowgroup is implemented as a clustered btree index.
  • A deltastore is a holding place for rows that are too few in number to be compressed into the columnstore. The deltastore is a rowstore.

What is ColumnStore Segment Elimination?
The most important thing here is that SQL Server stores a Minimum and Maximum Value internally for each ColumnStore Segment. And based on these values SQL Server can perform so-called Segment Elimination. Segment Elimination means that SQL Server only reads those Segments (while accessing a ColumnStore Index) that contain requested data. You can think about it in the same way as Partition Elimination when you work with Partitioned Tables. But the elimination happens here at the ColumnStore Segment Level. As you have seen in the previous picture, SQL Server wasn’t able to eliminate any segments during the ColumnStore Index access, because by default you have no sorting order in the ColumnStore Index. The sorting order of your data depends on how SQL Server reads the data in the Execution Plan when you create the ColumnStore Index: As you can see the Clustered ColumnStore Index was created by reading from the Heap Table that initially contained the data. And therefore, you have no sorting order in the Clustered ColumnStore Index, and therefore the Segment Elimination can’t work perfectly for you.

How can you improve that situation? Enforce a sorting order in your data by first creating a traditional RowStore Clustered Index, and change it to a Clustered ColumnStore Index! Ouch that hurts…

-- Now we create a traditional RowStore Clustered Index to sort our table data by -- the column "DateKey".
CREATE CLUSTERED INDEX idx_ci ON FactFinance(DateKey)

-- "Swap" the Clustered Index through a Clustered ColumnStore Index
CREATE CLUSTERED COLUMNSTORE INDEX idx_ci ON FactFinance WITH (DROP_EXISTING = ON)

You should have now sorted data in the Clustered ColumnStore Index, and Segment Elimination should work quite well:
-- Segment Elimination works better than previously, but still not perfectly.
SELECT DateKey, SUM(Amount)
FROM FactFinance
WHERE DateKey >= '20101229' AND DateKey <= '20131228' GROUP BY
DateKey
But when you look again at the output of STATISTICS IO, SQL Server still has to read a lot of segments, and only skips a few of them:

1 thought on “Hello world!”

Leave a Reply

Your email address will not be published. Required fields are marked *