Intro

In the previous post I described the basic query capabilities of the Azure Table service. In this post I am going to cover more advanced topics.

Prerequisites

Refer to the previous posts to get started with the Azure Storage Table service.

Test Data

The following code generates 12,345 records in an Azure table that we will use in the next section:

var batch = new TableBatchOperation();
for (var i = 0; i < 12345; i++)
{
    batch.Insert(new Profile("AU", $"EMP{i:D8}", $"John Doe {i}", $"The {i} employee of the company"));

    // A batch (entity group transaction) is limited to 100 operations,
    // and all of its entities must share the same partition key.
    if (batch.Count == 100)
    {
        table.ExecuteBatch(batch);
        batch.Clear();
        Console.WriteLine($"{i} entities added!");
    }
}

// Flush the final, partially filled batch.
if (batch.Count > 0)
{
    table.ExecuteBatch(batch);
}
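An entity group transaction is capped at 100 operations, and every entity in the batch must share the same partition key, which is why the loop flushes periodically. The chunking logic itself is language-neutral; here is a minimal Python sketch of it (the function name is mine, not part of any SDK):

```python
def chunk_batches(operations, batch_size=100):
    """Split a sequence of operations into batches of at most batch_size,
    mirroring the flush-and-clear loop above."""
    batches = []
    batch = []
    for op in operations:
        batch.append(op)
        if len(batch) == batch_size:
            batches.append(batch)
            batch = []
    if batch:  # flush the non-empty remainder, like the final ExecuteBatch
        batches.append(batch)
    return batches
```

For 12,345 inserts this yields 123 full batches of 100 plus one final batch of 45.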

Code

In this post I am going to focus on more complex query scenarios. The first scenario is 'query with paging'. Consider the following example:

table.CreateQuery<Profile>().Execute().ToList();

This code loads all entities from the corresponding table. It may take seconds, minutes, or even hours: the result set can be huge, and you may not have enough memory to hold it. Imagine how long it would take to load a table with hundreds of millions of entities. Fortunately, we can load and process items in smaller chunks.

The following code loads entities from EMP00005000 to EMP00008000 in pages; each page contains up to 1,000 entities (the service maximum per request):

var query = (from p in table.CreateQuery<Profile>()
    where p.PartitionKey == "AU"
    where p.RowKey.CompareTo("EMP00005000") >= 0
    where p.RowKey.CompareTo("EMP00008000") < 0
    select p);

foreach (var profile in query)
{
    Console.WriteLine(profile);
}

I didn't find a way to specify the page size with this syntax, so I tried converting the query to a TableQuery and setting TakeCount:

var query = (from p in table.CreateQuery<Profile>()
    where p.PartitionKey == "AU"
    where p.RowKey.CompareTo("EMP00005000") >= 0
    where p.RowKey.CompareTo("EMP00009000") < 0
    select p).AsTableQuery();

query.TakeCount = 10;

TableContinuationToken token = null;
do
{
    var result = table.ExecuteQuerySegmented(query, token);
    Console.WriteLine(result.Results.Count);
    token = result.ContinuationToken;
} while (token != null);

But it still loads 1000 entities per page. So if you want to load data in smaller chunks, you need to use a different approach.

The following example loads data in segments, each segment contains 10 items:

var filter = TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "AU");
var query = new TableQuery<Profile>().Where(filter);
query.TakeCount = 10;

TableContinuationToken token = null;
do
{
    var result = table.ExecuteQuerySegmented(query, token);
    Console.WriteLine(result.Results.Count);
    token = result.ContinuationToken;
} while (token != null);

This code goes through all items in a partition page by page, which can be useful for bulk processing.
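This continuation-token loop is the general shape of server-driven paging: request a segment, keep the returned token, and stop when the token comes back null. A toy Python sketch of that contract, with an in-memory list standing in for the table and illustrative function names (this is not the storage SDK):

```python
def execute_query_segmented(rows, take_count, token):
    """Toy stand-in for ExecuteQuerySegmented: return one page of results
    plus a continuation token, or None when the data is exhausted."""
    start = token if token is not None else 0
    page = rows[start:start + take_count]
    next_token = start + take_count if start + take_count < len(rows) else None
    return page, next_token

def read_all_pages(rows, take_count=10):
    """Mirror the do/while loop over ContinuationToken."""
    pages = []
    token = None
    while True:
        page, token = execute_query_segmented(rows, take_count, token)
        pages.append(page)
        if token is None:
            break
    return pages
```

The real continuation token is opaque (it encodes the next partition and row key); the integer offset here is only a stand-in for the round-trip mechanics.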

Sometimes you need to process only a certain range of items from the middle of the data set. The first guess might be something like this:

table.CreateQuery<Profile>().Skip(5000).Take(3000).AsTableQuery().ToList();

Unfortunately this code will throw the following exception:

The method 'Skip' is not supported

That's it: you can't use the Skip and Take pattern here. The right way to page through millions of entities depends on your RowKey format. For our example, the following approach works fine:

var partitionKeyFilter = TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "AU");

var rowKeyFilter = TableQuery.CombineFilters(
    TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.GreaterThanOrEqual, "EMP00005000"), TableOperators.And,
    TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.LessThanOrEqual, "EMP00008000"));

var filter = TableQuery.CombineFilters(partitionKeyFilter, TableOperators.And, rowKeyFilter);

var query = new TableQuery<Profile>().Where(filter);
query.TakeCount = 10;

TableContinuationToken token = null;
do
{
    var result = table.ExecuteQuerySegmented(query, token);
    Console.WriteLine(result.Results.Count);
    token = result.ContinuationToken;
} while (token != null);

This code goes through entities EMP00005000 to EMP00008000 page by page, 10 items at a time.
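GenerateFilterCondition and CombineFilters simply assemble an OData $filter string, and the RowKey range trick works because zero-padded keys compare lexicographically in the same order as their numeric values. A rough Python sketch of how such filter strings are built (the exact formatting the real SDK emits may differ slightly):

```python
def generate_filter_condition(prop, op, value):
    """Build an OData-style comparison, e.g. "PartitionKey eq 'AU'"."""
    return f"{prop} {op} '{value}'"

def combine_filters(left, operator, right):
    """Parenthesize both sides and join them with a boolean operator."""
    return f"({left}) {operator} ({right})"

row_key_filter = combine_filters(
    generate_filter_condition("RowKey", "ge", "EMP00005000"), "and",
    generate_filter_condition("RowKey", "le", "EMP00008000"))

full_filter = combine_filters(
    generate_filter_condition("PartitionKey", "eq", "AU"), "and", row_key_filter)

# Zero-padded RowKeys sort lexicographically in numeric order, which is
# what makes the ge/le pair select a contiguous slice of the partition.
```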

Also, it is possible to use the DynamicTableEntity class and read a row as a dictionary of properties. It could be done like this:

var query = from e in table.CreateQuery<DynamicTableEntity>()
        where e.PartitionKey == "AU" && e.RowKey == "EMP00005000"
        select e;

var entity = query.FirstOrDefault();
Console.WriteLine(entity["Title"].StringValue);
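Conceptually, DynamicTableEntity exposes a row as a dictionary of named, typed property values. A loose Python analogue of that shape (all names here are illustrative, not part of the storage SDK):

```python
class EntityProperty:
    """Loose analogue of the SDK's typed property wrapper."""
    def __init__(self, string_value):
        self.string_value = string_value

# A row represented as a dictionary of named, typed properties,
# which is roughly how DynamicTableEntity's indexer behaves.
entity = {
    "Name": EntityProperty("John Doe 5000"),
    "Title": EntityProperty("The 5000 employee of the company"),
}

title = entity["Title"].string_value
```

This representation is handy when the schema varies from row to row and a fixed POCO such as Profile would be too rigid.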

Summary

In this post I showed how to query data in pages from an Azure table. I described some advanced usage scenarios and provided a few code examples. In the next post I am going to cover the TableRequestOptions class and its usage. For more information about different ways to query the Azure Storage Table service, see Announcing Storage Client Library 2.1 RTM & CTP for Windows Phone and How to get most out of Windows Azure Tables.

