Monday, December 23, 2013

Managing batch database updates using Global.asax

My web service product ShufflePoint has a metered billing characteristic. In order to do the metering, every time a customer runs a query some information is logged to a SQL database. In a recent performance audit, I observed that this logging was taking several hundred milliseconds to perform. That represented about 25% of the time to process a request.

The current logging

The SQL update was being done with pretty vanilla .NET logic. It is shown below. As you see, a stored procedure is being invoked to perform the update. I did not see any optimization opportunities here. In the database schema there are no relationships or indexes - it is just an isolated logging table. My best bet, I thought, would be to buffer these single row inserts and perform a period batch insert.

  using (SqlConnection conn = new SqlConnection(ms_aqlLoggerConnStr)) {

      conn.Open();
      using (SqlCommand cmd = new SqlCommand("AqlLogger_LogQuery", conn)) {

          cmd.CommandType = CommandType.StoredProcedure;

          SqlParameter input1 = cmd.Parameters.Add("@user", SqlDbType.UniqueIdentifier);
          input1.Direction = ParameterDirection.Input;
          input1.Value = new Guid(m_UserId);

          SqlParameter input2 = cmd.Parameters.Add("@datasource",SqlDbType.NVarChar, 10);
          input2.Direction = ParameterDirection.Input;
          input2.Value = dataSource;

          SqlParameter input3 = cmd.Parameters.Add("@profileid", SqlDbType.Int);
          input3.Direction = ParameterDirection.Input;
          input3.Value = profileId;

          SqlParameter input4 = cmd.Parameters.Add("@web", SqlDbType.NVarChar, 256);
          input4.Direction = ParameterDirection.Input;
          input4.Value = webId;

          SqlParameter input5 = cmd.Parameters.Add("@acct", SqlDbType.NVarChar, 256);
          input5.Direction = ParameterDirection.Input;
          input5.Value = accountId;

          SqlParameter input6 = cmd.Parameters.Add("@runtime", SqlDbType.Int);
          input6.Direction = ParameterDirection.Input;
          input6.Value = totaltime;

          SqlParameter input7 = cmd.Parameters.Add("@cellcount", SqlDbType.Int);
          input7.Direction = ParameterDirection.Input;
          input7.Value = cellcount;

          cmd.ExecuteNonQuery();

      }
  }

Researching an alternative

In Googling on this topic, I came upon this article by Ted Spence. It showed a nice pattern for batch inserts. The question I then had was where and when to perform the batch insert. I could have created a class with an Insert method and which internally would buffer the records and then do a batch insert when the record count exceeded a specified threshold. Instead, I thought I'd try putting this logic into global.asax.

Changes to the database

Following the Ted Spence article, I first created a type in SQL Server:

CREATE TYPE AqlLoggerType AS TABLE (
  [UserID] [uniqueidentifier] NOT NULL,
  [DataSource] [nvarchar](3) NOT NULL,
  [ProfileID] [int] NOT NULL,
  [WebProperty] [nvarchar](256) NOT NULL,
  [GaAccount] [nvarchar](256) NOT NULL,
  [Runtime] [int] NOT NULL,
  [CellCount] [int] NOT NULL,
  [QueryDate] [smalldatetime] NOT NULL DEFAULT (getdate())
)

The type reflects the record to be inserted. The other change was the addition of a stored procedure which performs the batch insert. It takes a single parameter of AqlLoggerType which contains the buffered records which are to be inserted.

CREATE PROCEDURE [dbo].[AqlLogger_LogQuery_Batch]
(
  @logtable AqlLoggerType READONLY
)
AS
BEGIN
  SET XACT_ABORT ON;
  SET NOCOUNT ON;
  INSERT INTO AqlLogger
    (UserID, Datasource, ProfileID, WebProperty, GaAccount, Runtime, CellCount, QueryDate)
  SELECT 
    UserID, Datasource, ProfileID, WebProperty, GaAccount, Runtime, CellCount, QueryDate
  FROM 
    @logtable

END;

Global.asax modifications

The changes to Global.asax are shown below. A static variable of type DataTable was added to the class. In Application_Start() a DataTable is instantiated and the columns are added. Then in Application_End(), the storted procedure AqlLogger_LogQuery_Batch is invoked, passing in the DataTable.

public class Global : System.Web.HttpApplication
{
    ...
    public static DataTable AqlBatchTable;
    ...

    protected void Application_Start(Object sender, EventArgs e)
    {
        ...
        // initialize the batch update table
        InitAqlBatchTable();
    }

    public static void InitAqlBatchTable()
    {
        // setup a datatable for batch AQL logging
        DataTable AqlBatchTable = new DataTable();
        AqlBatchTable.Columns.Add(new DataColumn("UserID", typeof(Guid)));
        AqlBatchTable.Columns.Add(new DataColumn("DataSource", typeof(string)));
        AqlBatchTable.Columns.Add(new DataColumn("ProfileID", typeof(int)));
        AqlBatchTable.Columns.Add(new DataColumn("WebProperty", typeof(string)));
        AqlBatchTable.Columns.Add(new DataColumn("GaAccount", typeof(string)));
        AqlBatchTable.Columns.Add(new DataColumn("Runtime", typeof(int)));
        AqlBatchTable.Columns.Add(new DataColumn("CellCount", typeof(int)));
        AqlBatchTable.Columns.Add(new DataColumn("QueryDate", typeof(DateTime)));
    }
    ...

    protected void Application_End(Object sender, EventArgs e)
    {
        try {
            Configuration rootWebConfig = WebConfigurationManager.OpenWebConfiguration("~");
            ConnectionStringSettings css = rootWebConfig.ConnectionStrings.ConnectionStrings["LocalSqlServer"];

            using (SqlConnection conn = new SqlConnection(css.ConnectionString)) {
                conn.Open();
                using (SqlCommand cmd = new SqlCommand("AqlLogger_LogQuery_Batch", conn)) {
                    cmd.CommandType = CommandType.StoredProcedure;
                    SqlParameter param = new SqlParameter("@logtable", SqlDbType.Structured);
                    param.Value = AqlBatchTable;
                    cmd.Parameters.Add(param);
                    cmd.ExecuteNonQuery();
                }
            }
        } catch ...
    }
}


Logging code modifications

A new logging method was added which inserts a single row of tracking information to reflect the metered tracking of the users action.

        private int mLogUserQueryBatch(string dataSource, int profileId, string webId, 
            string accountId, int totaltime, int cellcount, DateTime dt)
        {
            int retval = 0;

            if (!ms_bLogAql) {
                return retval;
            }

            log.Debug("in mLogUserQueryBatch");

            try {
                Global.AqlBatchTable.Rows.Add(new object[] {
                        new Guid(m_UserId), 
                        dataSource,
                        profileId,
                        webId,
                        accountId,
                        totaltime,
                        cellcount,
                        dt
                    });

            } catch (Exception ex) {
                log.ErrorFormat("LogUserQueryBatch failed: {0}",ex.ToString());
            }
            return retval;

        }

Results

The call to mLogUserQueryBatch() only takes a few tens of milliseconds compared to a few hundreds of milliseconds - about a ten-fold improvement. I think this was a performance improvement worth making. My only concern is that I am depending upon Application_End() being called, and on the batch update succeeding. If this process fails, then I'll have lost that batch of customer activity metering.

Tuesday, December 10, 2013

Programming The Internet of Things - Zapier and IFTTT

The sum is greater

I had previously discussed MakerSwarm as a cool new app platform for programming The Internet of Things (IOT). It just occurred to me that MakerSwarm really needs to be complemented by a product like Zapier or IFTTT. These two products allow you to schedule messages between different online web services. Typical examples include posting to Twitter when you author a blog article. Of course that example is likely already handled by your blogging service, but there are many dozens of online services. Rather than worry about finding and administering LOTS of point-to-point integration, Zapier and IFTTT act as an integration hub.

What makes a merger of Zapier or IFTTT with a product like MakerSwarm interesting is that it would further break down the distinction between services and devices. Some examples quickly come to mind. If my garage door opener (which runs Android) opens, then send me an email. Or in the reverse (service to device) direction, if someone comments on a Facebook post, make my smartwatch beep. I expect that in a few years we will all be using enough smart devices and web services that all kinds use useful (or otherwise) combinations will arise.

Next wave of tech consumers

Few tech analysts grasp the significance of this opportunity. Perhaps it is generally assumed that people only want to be passive consumers if information and entertainment. That may be true still of the majority of people today, but the next generation of tech consumers I expect will be more active users who will desire to themselves orchestrate the interactions of their various web-enabled gadgets, appliances, cars, and homes. In order to reach this large potential audience, the development will have to be visual/diagrammatic along the lines of Scratch or Snap.

Other players in this space

Have a look at things like DeviceJS, WigWag, and ThingSquare. I am sure that you could find many others on Google. When some larger consumer tech companies wake up to the huge potential here, we shall see much more activity. A couple of these startups will get acquired by Facebook, Apple, Google, etc.

Some questions I have

Will there be standards? Or will this only be available on proprietary clouds? Will there be a new programming language (backing the visual tools) which dominates the space? Or will JavaScript or Python be considered simple and expressive enough? Should we expect to soon see specialized IOT chips or cards soon? How cheap can they be made? How hackable do we want our physical environment to become? Are swarm apps robots? Are the bound by Asimov's laws?

Bring on the "Killer Swarm Apps"

Catchy juxtaposition of phrases don't you think? I expect that we will be hearing it more. While I just came up with that bit of cleverness myself, I see that Scott Snyder used the phrase in his book "The New World of Wireless: How to Compete in the 4g Revolution" (2009, Pearson Prentice Hall).

One of my earliest and most memorable nightmares happened when I was about six years old and I dreamt that all of the appliances in our house became alive and came after me. A premonition of a future evil semi-sentient appliances? Technology and I have a very happy and mutually respectful relationship now (an engineering degree was good therapy in that respect). I have every intention of participating in this next phase of robotics and communications technology. Stay tuned to this blog for ongoing research, reflections and findings.

Saturday, December 7, 2013

Bug in Power Query OData HTTP handshake

I just posted the below to the Power Query discussion group

Hi,

I'd like to report a bug in the OAuth HTTP handshake. Included below are the
request/response sequence as reported by Fiddler. Note that I am redirecting to
a URL which requires authentication.

Note that in the PowerQuery handshake that the third request is sent to the
original URL instead of to the correct redirect URL, and results in an endless
loop of asking for authentication. Compare that to the sequence done by a
browser. The third request which include the authentication header is sent to the
redirect URL and succeeds.

Demonstration A: Sequence when using OAuth in Power Query
=============================================================


1. GET /odata/Aql2OData HTTP/1.1

Authorization: Basic c3A6M0loT3RuQ3lkRzJuQVVWZWFhe...

HTTP/1.1 302 Found
Location: /odata/c4e6baed-efc1-457d-a3dc-32b47a5a5a94/Aql2OData

-------------

2. GET /odata/c4e6baed-efc1-457d-a3dc-32b47a5a5a94/Aql2OData HTTP/1.1

HTTP/1.1 401 Unauthorized

-------------

3. GET /odata/Aql2OData HTTP/1.1

Authorization: Basic c3A6M0loT3RuQ3lkRzJuQVVWZWFhe...
HTTP/1.1 302 Found
Location: /odata/86247d7c-7417-4b47-a17f-8ee1644cfc02/Aql2OData

--------------------------

4. GET /odata/86247d7c-7417-4b47-a17f-8ee1644cfc02/Aql2OData HTTP/1.1

HTTP/1.1 401 Unauthorized



Demonstration B: Sequence when using Internet Explorer
=============================================================

1. GET /odata/Aql2OData HTTP/1.1

Authorization: Basic c3A6M0loT3RuQ3lkRzJuQVVWZWFhe...

HTTP/1.1 302 Found
Location: /odata/c4e6baed-efc1-457d-a3dc-32b47a5a5a94/Aql2OData

-------------

2. GET /odata/c4e6baed-efc1-457d-a3dc-32b47a5a5a94/Aql2OData HTTP/1.1

HTTP/1.1 401 Unauthorized

-------------

3. GET /odata/c4e6baed-efc1-457d-a3dc-32b47a5a5a94/Aql2OData HTTP/1.1

Authorization: Basic c3A6M0loT3RuQ3lkRzJuQVVWZWFhe...

HTTP/1.1 200 OK

EDIT: The response from Microsoft is that this is a bug, but not the one I was thinking. I had expected a trace like that in Demonstration B when using Power Query. But in Power Query there exists state, in the form of the value pasted into the OData URL field, which doesn't exist in the browser handshake. What Microsoft pointed out is that calls #2 and #4 should also have included the authentication header - that is a bug.

For my work I have abandoned the redirect approach for now and will instead script the OAuth request and include a random GUID in the URL which will uniquely scope the metadata to that particular request. I have to script the call anyway in order to be able to pass in workbook-scoped parameters.