Monday, December 23, 2013

Managing batch database updates using Global.asax

My web service product ShufflePoint uses metered billing. In order to do the metering, every time a customer runs a query, some information is logged to a SQL database. In a recent performance audit, I observed that this logging was taking several hundred milliseconds, about 25% of the time to process a request.

The current logging

The SQL update was being done with pretty vanilla .NET logic, shown below. As you can see, a stored procedure is invoked to perform the update. I did not see any optimization opportunities here: in the database schema there are no relationships or indexes; it is just an isolated logging table. My best bet, I thought, would be to buffer these single-row inserts and perform a periodic batch insert.

  using (SqlConnection conn = new SqlConnection(ms_aqlLoggerConnStr)) {

      conn.Open();
      using (SqlCommand cmd = new SqlCommand("AqlLogger_LogQuery", conn)) {

          cmd.CommandType = CommandType.StoredProcedure;

          SqlParameter input1 = cmd.Parameters.Add("@user", SqlDbType.UniqueIdentifier);
          input1.Direction = ParameterDirection.Input;
          input1.Value = new Guid(m_UserId);

          SqlParameter input2 = cmd.Parameters.Add("@datasource",SqlDbType.NVarChar, 10);
          input2.Direction = ParameterDirection.Input;
          input2.Value = dataSource;

          SqlParameter input3 = cmd.Parameters.Add("@profileid", SqlDbType.Int);
          input3.Direction = ParameterDirection.Input;
          input3.Value = profileId;

          SqlParameter input4 = cmd.Parameters.Add("@web", SqlDbType.NVarChar, 256);
          input4.Direction = ParameterDirection.Input;
          input4.Value = webId;

          SqlParameter input5 = cmd.Parameters.Add("@acct", SqlDbType.NVarChar, 256);
          input5.Direction = ParameterDirection.Input;
          input5.Value = accountId;

          SqlParameter input6 = cmd.Parameters.Add("@runtime", SqlDbType.Int);
          input6.Direction = ParameterDirection.Input;
          input6.Value = totaltime;

          SqlParameter input7 = cmd.Parameters.Add("@cellcount", SqlDbType.Int);
          input7.Direction = ParameterDirection.Input;
          input7.Value = cellcount;

          cmd.ExecuteNonQuery();

      }
  }

Researching an alternative

In Googling on this topic, I came upon this article by Ted Spence. It showed a nice pattern for batch inserts. The question I then had was where and when to perform the batch insert. I could have created a class with an Insert method which internally buffers the records and performs a batch insert when the record count exceeds a specified threshold. Instead, I thought I'd try putting this logic into Global.asax.
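The buffered-insert alternative mentioned above - a class whose insert method accumulates records and flushes a batch once a threshold is reached - can be sketched generically. This is my own illustration (in Python for brevity, not the ShufflePoint code); the class and parameter names are made up, and the flush callback stands in for invoking the batch stored procedure:

```python
# Sketch of a threshold-flushing batch logger. The flush callback
# receives a list of buffered rows; in the real service it would
# execute the batch insert stored procedure.
import threading

class BatchLogger:
    def __init__(self, flush, flush_threshold=100):
        self._flush = flush                # callable taking a list of rows
        self._threshold = flush_threshold
        self._rows = []
        self._lock = threading.Lock()      # web requests arrive concurrently

    def insert(self, row):
        with self._lock:
            self._rows.append(row)
            if len(self._rows) < self._threshold:
                return
            pending, self._rows = self._rows, []
        self._flush(pending)               # batch insert outside the lock

    def close(self):
        # flush whatever remains, e.g. at application shutdown
        with self._lock:
            pending, self._rows = self._rows, []
        if pending:
            self._flush(pending)
```

One nice property of this shape is that at most `flush_threshold` rows are ever at risk if the process dies, rather than everything since application start.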

Changes to the database

Following the Ted Spence article, I first created a type in SQL Server:

CREATE TYPE AqlLoggerType AS TABLE (
  [UserID] [uniqueidentifier] NOT NULL,
  [DataSource] [nvarchar](3) NOT NULL,
  [ProfileID] [int] NOT NULL,
  [WebProperty] [nvarchar](256) NOT NULL,
  [GaAccount] [nvarchar](256) NOT NULL,
  [Runtime] [int] NOT NULL,
  [CellCount] [int] NOT NULL,
  [QueryDate] [smalldatetime] NOT NULL DEFAULT (getdate())
)

The type reflects the record to be inserted. The other change was the addition of a stored procedure which performs the batch insert. It takes a single parameter of type AqlLoggerType containing the buffered records.

CREATE PROCEDURE [dbo].[AqlLogger_LogQuery_Batch]
(
  @logtable AqlLoggerType READONLY
)
AS
BEGIN
  SET XACT_ABORT ON;
  SET NOCOUNT ON;
  INSERT INTO AqlLogger
    (UserID, Datasource, ProfileID, WebProperty, GaAccount, Runtime, CellCount, QueryDate)
  SELECT 
    UserID, Datasource, ProfileID, WebProperty, GaAccount, Runtime, CellCount, QueryDate
  FROM 
    @logtable

END;

Global.asax modifications

The changes to Global.asax are shown below. A static variable of type DataTable was added to the class. In Application_Start() the DataTable is instantiated and its columns are added. Then in Application_End(), the stored procedure AqlLogger_LogQuery_Batch is invoked, passing in the DataTable.

public class Global : System.Web.HttpApplication
{
    ...
    public static DataTable AqlBatchTable;
    ...

    protected void Application_Start(Object sender, EventArgs e)
    {
        ...
        // initialize the batch update table
        InitAqlBatchTable();
    }

    public static void InitAqlBatchTable()
    {
        // set up the DataTable for batch AQL logging
        // (assign the static field, not a local variable, or the field stays null)
        AqlBatchTable = new DataTable();
        AqlBatchTable.Columns.Add(new DataColumn("UserID", typeof(Guid)));
        AqlBatchTable.Columns.Add(new DataColumn("DataSource", typeof(string)));
        AqlBatchTable.Columns.Add(new DataColumn("ProfileID", typeof(int)));
        AqlBatchTable.Columns.Add(new DataColumn("WebProperty", typeof(string)));
        AqlBatchTable.Columns.Add(new DataColumn("GaAccount", typeof(string)));
        AqlBatchTable.Columns.Add(new DataColumn("Runtime", typeof(int)));
        AqlBatchTable.Columns.Add(new DataColumn("CellCount", typeof(int)));
        AqlBatchTable.Columns.Add(new DataColumn("QueryDate", typeof(DateTime)));
    }
    ...

    protected void Application_End(Object sender, EventArgs e)
    {
        try {
            Configuration rootWebConfig = WebConfigurationManager.OpenWebConfiguration("~");
            ConnectionStringSettings css = rootWebConfig.ConnectionStrings.ConnectionStrings["LocalSqlServer"];

            using (SqlConnection conn = new SqlConnection(css.ConnectionString)) {
                conn.Open();
                using (SqlCommand cmd = new SqlCommand("AqlLogger_LogQuery_Batch", conn)) {
                    cmd.CommandType = CommandType.StoredProcedure;
                    SqlParameter param = new SqlParameter("@logtable", SqlDbType.Structured);
                    param.Value = AqlBatchTable;
                    cmd.Parameters.Add(param);
                    cmd.ExecuteNonQuery();
                }
            }
        } catch (Exception) {
            // if the final batch insert fails at shutdown, that batch of metering data is lost
        }
    }
}


Logging code modifications

A new logging method was added which adds a single row of tracking information to the batch table, reflecting the metered tracking of the user's action.

        private int mLogUserQueryBatch(string dataSource, int profileId, string webId, 
            string accountId, int totaltime, int cellcount, DateTime dt)
        {
            int retval = 0;

            if (!ms_bLogAql) {
                return retval;
            }

            log.Debug("in mLogUserQueryBatch");

            try {
                Global.AqlBatchTable.Rows.Add(new object[] {
                        new Guid(m_UserId), 
                        dataSource,
                        profileId,
                        webId,
                        accountId,
                        totaltime,
                        cellcount,
                        dt
                    });

            } catch (Exception ex) {
                log.ErrorFormat("LogUserQueryBatch failed: {0}",ex.ToString());
            }
            return retval;

        }

Results

The call to mLogUserQueryBatch() takes only a few tens of milliseconds compared to a few hundred milliseconds - about a ten-fold improvement. I think this was a performance improvement worth making. My only concern is that I am depending upon Application_End() being called, and on the batch update succeeding. If this process fails, then I'll have lost that batch of customer activity metering.

Tuesday, December 10, 2013

Programming The Internet of Things - Zapier and IFTTT

The sum is greater

I had previously discussed MakerSwarm as a cool new app platform for programming The Internet of Things (IOT). It just occurred to me that MakerSwarm really needs to be complemented by a product like Zapier or IFTTT. These two products let you route messages between different online web services. A typical example is posting to Twitter when you author a blog article. Of course, that example is likely already handled by your blogging service, but there are many dozens of online services. Rather than worry about finding and administering LOTS of point-to-point integrations, Zapier and IFTTT act as an integration hub.

What makes a merger of Zapier or IFTTT with a product like MakerSwarm interesting is that it would further break down the distinction between services and devices. Some examples quickly come to mind. If my garage door opener (which runs Android) opens, then send me an email. Or in the reverse (service-to-device) direction: if someone comments on a Facebook post, make my smartwatch beep. I expect that in a few years we will all be using enough smart devices and web services that all kinds of useful (or otherwise) combinations will arise.
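To make the hub idea concrete, here is a toy trigger-to-action dispatcher. This is entirely my own illustration - neither Zapier nor IFTTT works this way internally, and all the names are made up - but the programming model of "if this event at this source, then that action" is the same:

```python
# A toy integration hub in the spirit of Zapier/IFTTT: rules map a
# (source, event) pair to an action. All names are illustrative.
class Hub:
    def __init__(self):
        self._rules = []

    def when(self, source, event, action):
        # register "if <event> happens at <source>, do <action>"
        self._rules.append((source, event, action))

    def publish(self, source, event, payload):
        # a device or service reports an event; fire matching actions
        for s, e, action in self._rules:
            if s == source and e == event:
                action(payload)

hub = Hub()
sent = []
# "If my garage door opener opens, then send me an email."
hub.when("garage_door", "opened",
         lambda p: sent.append("email: door opened at " + p))
hub.publish("garage_door", "opened", "18:05")
```

The value of the hub is exactly that devices and services only need to speak to the hub, not to each other, which is what eliminates the point-to-point integration problem described above.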

Next wave of tech consumers

Few tech analysts grasp the significance of this opportunity. Perhaps it is generally assumed that people only want to be passive consumers of information and entertainment. That may still be true of the majority of people today, but I expect the next generation of tech consumers will be more active users who will want to orchestrate the interactions of their various web-enabled gadgets, appliances, cars, and homes themselves. In order to reach this large potential audience, the development environment will have to be visual/diagrammatic along the lines of Scratch or Snap.

Other players in this space

Have a look at things like DeviceJS, WigWag, and ThingSquare. I am sure that you could find many others on Google. When some larger consumer tech companies wake up to the huge potential here, we shall see much more activity. A couple of these startups will get acquired by Facebook, Apple, Google, etc.

Some questions I have

Will there be standards? Or will this only be available on proprietary clouds? Will there be a new programming language (backing the visual tools) which dominates the space? Or will JavaScript or Python be considered simple and expressive enough? Should we expect to see specialized IOT chips or cards soon? How cheap can they be made? How hackable do we want our physical environment to become? Are swarm apps robots? Are they bound by Asimov's laws?

Bring on the "Killer Swarm Apps"

Catchy juxtaposition of phrases don't you think? I expect that we will be hearing it more. While I just came up with that bit of cleverness myself, I see that Scott Snyder used the phrase in his book "The New World of Wireless: How to Compete in the 4g Revolution" (2009, Pearson Prentice Hall).

One of my earliest and most memorable nightmares happened when I was about six years old: I dreamt that all of the appliances in our house became alive and came after me. A premonition of future evil semi-sentient appliances? Technology and I have a very happy and mutually respectful relationship now (an engineering degree was good therapy in that respect). I have every intention of participating in this next phase of robotics and communications technology. Stay tuned to this blog for ongoing research, reflections, and findings.

Saturday, December 7, 2013

Bug in Power Query OData HTTP handshake

I just posted the following to the Power Query discussion group:

Hi,

I'd like to report a bug in the OAuth HTTP handshake. Included below is the
request/response sequence as reported by Fiddler. Note that I am redirecting to
a URL which requires authentication.

Note that in the Power Query handshake the third request is sent to the
original URL instead of to the correct redirect URL, resulting in an endless
loop of asking for authentication. Compare that to the sequence produced by a
browser: the third request, which includes the authentication header, is sent
to the redirect URL and succeeds.

Demonstration A: Sequence when using OAuth in Power Query
=============================================================


1. GET /odata/Aql2OData HTTP/1.1

Authorization: Basic c3A6M0loT3RuQ3lkRzJuQVVWZWFhe...

HTTP/1.1 302 Found
Location: /odata/c4e6baed-efc1-457d-a3dc-32b47a5a5a94/Aql2OData

-------------

2. GET /odata/c4e6baed-efc1-457d-a3dc-32b47a5a5a94/Aql2OData HTTP/1.1

HTTP/1.1 401 Unauthorized

-------------

3. GET /odata/Aql2OData HTTP/1.1

Authorization: Basic c3A6M0loT3RuQ3lkRzJuQVVWZWFhe...
HTTP/1.1 302 Found
Location: /odata/86247d7c-7417-4b47-a17f-8ee1644cfc02/Aql2OData

--------------------------

4. GET /odata/86247d7c-7417-4b47-a17f-8ee1644cfc02/Aql2OData HTTP/1.1

HTTP/1.1 401 Unauthorized



Demonstration B: Sequence when using Internet Explorer
=============================================================

1. GET /odata/Aql2OData HTTP/1.1

Authorization: Basic c3A6M0loT3RuQ3lkRzJuQVVWZWFhe...

HTTP/1.1 302 Found
Location: /odata/c4e6baed-efc1-457d-a3dc-32b47a5a5a94/Aql2OData

-------------

2. GET /odata/c4e6baed-efc1-457d-a3dc-32b47a5a5a94/Aql2OData HTTP/1.1

HTTP/1.1 401 Unauthorized

-------------

3. GET /odata/c4e6baed-efc1-457d-a3dc-32b47a5a5a94/Aql2OData HTTP/1.1

Authorization: Basic c3A6M0loT3RuQ3lkRzJuQVVWZWFhe...

HTTP/1.1 200 OK

EDIT: The response from Microsoft is that this is a bug, but not the one I was thinking of. I had expected a trace like that in Demonstration B when using Power Query. But in Power Query there exists state - the value pasted into the OData URL field - which doesn't exist in the browser handshake. What Microsoft pointed out is that calls #2 and #4 should also have included the authentication header; that is the bug.

For my work I have abandoned the redirect approach for now and will instead script the OAuth request and include a random GUID in the URL which will uniquely scope the metadata to that particular request. I have to script the call anyway in order to be able to pass in workbook-scoped parameters.

Friday, November 29, 2013

Installing OpenCV on Windows 7

It was not very long ago (six weeks) that I went through the process of building OpenCV from source. I had used version 2.4.6, and recently 2.4.7 was released. The release notes show some useful improvements, and I want to take better notes for the purposes of this blog. So, as much for my own benefit as yours, I am going to work through this extensive build once again.

There is a binary download of OpenCV available at the opencv.org web site, but that version is not compiled to use the OpenCL libraries which I just installed, and using OpenCL is necessary to get the benefits of GPU acceleration.

Help resources

My first task with such a complex endeavor is to find help. I found exactly what I needed on the AMD site in the form of a PDF by Dr. Harris Gasparakis. Dr. Gasparakis also contributes to the AMD blog. Two recent entries are http://developer.amd.com/community/blog/2013/07/09/opencv-cl-computer-vision-with-opencl-acceleration/ and http://developer.amd.com/community/blog/2013/11/05/opencv-cl-computer-vision-with-even-more-opencl-acceleration/

Another source of installation instructions is this page at opencv.org ("DOC" from now on). As I have two complete sets of instructions which seemingly differ from each other, the first thing I am going to do is compare and combine them to create what I hope will be the best merged set of steps. Since the Dr. Gasparakis document ("PDF" from now on) is more recent, better organized, and targets the same computer configuration as my own, I will use it as my main guide.

Prerequisites

The last time I built OpenCV, I hadn't prepared my system to support the Python bindings. Python has become more interesting to me since I have been using it on a Raspberry Pi, so I wanted to be able to program OpenCV with Python. The instructions at opencv.org cover that topic. Since I am on 64-bit Windows and there are no 64-bit builds of NumPy at www.numpy.org, I chose to install Anaconda, which provides both 32- and 64-bit installations. Also, Anaconda looks like an interesting product in its own right; perhaps I will blog on that in the future.

Other prerequisites for the OpenCV build are Visual Studio 2012 and cmake, both of which I already have installed. The DOC lists (after "OpenCV may come in multiple flavors") many other tools and libraries. The only additional tool that I installed was Qt, because I suspect that some of the OpenCV samples use it for their user interface controls. It looks like everything else is optional - I guess I'll find out.

Following the PDF Step 1, I downloaded a zip of the latest release tag (2.4.7) from https://github.com/Itseez/opencv and unzipped it on my machine to E:\programs\opencv-2.4.7. Thus for my installation:

OPENCV_ROOT = E:\programs\opencv-2.4.7

The folder structure is shown below. Under "modules/ocl" is the OpenCL implementation, containing both the C++ wrappers of the OpenCL kernels and the OpenCL kernels themselves.

OpenCV root folders

Following the PDF, Step 2 is to download APPML, the AMD math libraries. As this was recommended, I went ahead and installed these libraries. After doing so, I see they are listed in Programs and Features. Sorted by "Publisher" I see these three AMD entries:

  • AMD APP SDK 2.9
  • clAmdFft
  • clAmdBlas

Step 3 in the PDF is to install cmake, which I had already done during my previous experiment with OpenCV.

The Steps

Now I am ready to apply the magic of CMake. CMake is a cross-platform, open-source build system: it generates make files or project files, which then build the libraries and executables. I have to admit that I find CMake intimidating, mainly due to a complete lack of understanding of how it works. Another learning project for another day.

The outline of the process of building OpenCV is as follows:

  1. run cmake-gui
  2. specify where is source and where to build, then click "configure"
  3. select VS2012 as generator, then click "finish"
  4. select/correct the available packages
  5. again click "configure"
  6. for packages that were added but caused errors, tell cmake their location
  7. click "configure" again to see if the values were correct and eliminated the errors
  8. select what parts of OpenCV are to be built
  9. click "configure" again
  10. if no errors, tell CMake to create the project files by pushing the Generate button.
  11. Go to the build directory and open the created OpenCV solution.
  12. build both the Release and the Debug binaries.
  13. collect the header and the binary files by building the Install project
  14. test your build by going into the Build/bin/Debug or Build/bin/Release directory and start a couple of applications like the contours.exe

  15. set the OPENCV_DIR env var for your system; it can be done by running, e.g., setx -m OPENCV_DIR D:\OpenCV\Build\x64\vc11

  16. if dynamic-link libraries (DLLs) were built, then add the path to these libraries to the system PATH env var
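For reference, the core of the outline above can be condensed into command-line equivalents. This is a sketch in Python (so the commands can be composed without running them): the paths come from my installation, while the generator name and the specific CMake options are assumptions you would adjust in cmake-gui; dry_run=True only composes the commands.

```python
# Command-line equivalent of the CMake/Visual Studio steps above.
# Paths are from my setup; options and generator name are illustrative.
import subprocess

OPENCV_ROOT = r"E:\programs\opencv-2.4.7"
OPENCV_BUILD = r"E:\programs\opencv-2.4.7-built"

def build_commands(src=OPENCV_ROOT, build=OPENCV_BUILD):
    sln = build + r"\OpenCV.sln"
    return [
        # steps 1-10: configure and generate, run from the build directory
        # ("Visual Studio 11" is CMake's name for the VS2012 generator)
        ["cmake", "-G", "Visual Studio 11",
         "-DWITH_OPENCL=ON", "-DWITH_CUDA=OFF", "-DWITH_CUFFT=OFF", src],
        # step 12: build both configurations
        ["msbuild", sln, "/p:Configuration=Debug"],
        ["msbuild", sln, "/p:Configuration=Release"],
    ]

def run_all(dry_run=True):
    # with dry_run=True this only returns the composed commands
    cmds = build_commands()
    if not dry_run:
        for cmd in cmds:
            subprocess.check_call(cmd, cwd=OPENCV_BUILD)
    return cmds
```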

Execution of steps

In my installation:

OPENCV_BUILT = E:\programs\opencv-2.4.7-built

Clicking "Configure" runs for about a minute and generates several pages of output. I reviewed the output to see what can be gleaned from it. I find this:

OpenCL:
    Version:                     dynamic
    Include path:                E:/programs/opencv-2.4.7/3rdparty/include/opencl/1.2 C:/Program Files (x86)/AMD/clAmdFft/include C:/Program Files (x86)/AMD/clAmdBlas/include
    Use AMD FFT:                 YES
    Use AMD BLAS:                YES

Python:
    Interpreter:                 e:/programs/Anaconda/python.exe (ver 2.7.5)
    Libraries:                   e:/programs/Anaconda/libs/python27.lib (ver 2.7.5)
    numpy:                       e:/programs/Anaconda/lib/site-packages/numpy/core/include (ver 1.7.1)
    packages path:               e:/programs/Anaconda/Lib/site-packages

Java:
    ant:                         NO
    JNI:                         C:/Program Files/Java/jdk1.7.0_40/include C:/Program Files/Java/jdk1.7.0_40/include/win32 C:/Program Files/Java/jdk1.7.0_40/include
    Java tests:                  NO

I believe that "dynamic" relates to this (from http://code.opencv.org/projects/opencv/wiki/ChangeLog):

Dynamic dependency on OpenCL runtime (allows run-time branching between OCL and non-OCL implementation)

Note that Blas and Fft were found, as was Python, Numpy, and Java. That all looks correct.

Next step is to review available packages and check the checkboxes of those available.

The PDF says this:

Next steps:

  • If appropriate, disable WITH_CUDA, WITH_CUFFT
  • Make sure that WITH_OPENCL is enabled

WITH_OPENCL was already checked, WITH_CUDA was not checked, but WITH_CUFFT was checked.

I don't know why WITH_CUFFT was checked by default. I unchecked it and then clicked "Configure" again.

Now I am wondering if I should install Ant. I plan on trying the Java OpenCV wrappers, and Ant may be necessary, so I am going ahead and installing it. I downloaded 1.9.2 from http://ant.apache.org

After installing, I set the environment variable ANT_HOME and added %ANT_HOME% to PATH. Running ant -version gave me the error

Unable to locate tools.jar. Expected to find it in C:\Program Files\Java\jre6\lib\tools.jar

Investigating, I found that JAVA_HOME wasn't set to my latest JDK. After fixing that, I was able to run "ant -version".

I clicked "Configure" in cmake-gui again.

It still reported "ANT: no", so I assume that it did not reload the environment variables. I quit cmake-gui and restarted it. Now, after running "Configure", it finds Ant.

I checked "INSTALL_PYTHON_EXAMPLES" because I am interested in trying Python.

Note that the PDF says to check "OPENCL_ROOT_DIR". This setting is no longer present; I am assuming that with dynamic OpenCL binding it is no longer relevant.

As per the PDF, I verified that WITH_OPENCLAMDBLAS and WITH_OPENCLAMDFFT were enabled.

I checked "BUILD_EXAMPLES" and "BUILD_PERF_TESTS" and clicked "Configure" again.

All looks good so I clicked "Generate".

After doing so, I find OpenCV.sln in opencv-2.4.7-built.

I used the VS command line to run msbuild OpenCV.sln /p:Configuration=Debug and msbuild OpenCV.sln /p:Configuration=Release

For the Debug build, I got an error saying:

LINK : fatal error LNK1104: cannot open file 'python27_d.lib'

This is caused by Python.h trying to be too clever about library references. See this discussion: http://stackoverflow.com/questions/11311877/creating-a-dll-from-a-wrapped-cpp-file-with-swig. Anyway, my research tells me that I can ignore this link error.

The Debug and Release builds took about ten minutes each on my computer. The PDF said to "look for opencv_ocl246d.lib in the Debug folder..." but I see no such files. Again, I am assuming that is due to OpenCV's change to dynamic linking of OpenCL.

The next step listed in the PDF is to try running some samples. I started with the face detection sample by running:

ocl-example-facedetect.exe t=OPENCV_ROOT\data\haarcascades\haarcascade_frontalface_alt.xml i=OPENCV_ROOT\samples\cpp\lena.jpg

where OPENCV_ROOT is replaced with the install path.

Success! It reported this information

average GPU time (noCamera) : 44.0714 ms
accuracy value: 0

Next I tried

cpp-tutorial-cannydetector_demo.exe OPENCV_ROOT\samples\cpp\lena.jpg

which also worked.

I went on to try all of the OpenCL samples (executables beginning with "ocl-"), both to check that they worked and to see what they do. There was at least one that did not work. In a subsequent blog post I will review the OpenCL samples.

Next steps

My next task on this project will be to test and review more of the samples. In addition to the OpenCL samples there are C and C++ samples, and also Java and Python samples. What I intend to learn from these samples is basically a) what OpenCV can do, b) what API calls are used to do it, and c) which language bindings provide a good balance of ease-of-use and runtime performance.

DIY Open Source Phone

Everyone has a phone. Lots of people are joining the "maker" movement. I guess I am surprised there haven't been more initiatives like David Mellis' DIY phone. I doubt that going this route can be justified from a pure cost perspective (candybar phones probably go for about $10 on eBay). But if you've got the knack and the itch, there is certainly ample opportunity now to pursue it. What I find most compelling is the creative range of case designs that has resulted.

From The Verge:

"Mellis describes his DIY phone as "a difficult but potentially do-able project" that should cost around $200 to complete."

Very tempting, but I've got far too many projects already in progress.

Monday, November 18, 2013

AMD OpenCL installation

This was a "retake" as I had gone through the process before. But since both OpenCL and OpenCV have updates, and I had not taken such good notes the first time around, I am re-installing everything. For this toolset, that is not a task to be taken lightly.

The official name is the AMD Accelerated Parallel Processing (APP) SDK

What is AMD APP Technology? From the above URL:

"AMD APP technology is a set of advanced hardware and software technologies that enable AMD graphics processing cores (GPU), working in concert with the system’s x86 cores (CPU), to execute heterogeneously to accelerate many applications beyond just graphics. This enables better balanced platforms capable of running demanding computing tasks faster than ever, and sets software developers on the path to optimize for AMD Accelerated Processing Units (APUs)."

Without OpenCL installed, I am not going to be taking advantage of the GPU, and the whole reason I'm down this rabbit hole is to have OpenCV work well (measured in high frame rates).

As I had mentioned, I had already previously installed:

  1. The AMD FirePro Unified Driver (V5900 graphics driver)
  2. AMD APP
  3. OpenCV

I was very hesitant to fiddle with the installed and working graphics driver, so I chose to uninstall AMD APP v2.8. I found no uninstaller in Windows Programs and Features, but in Googling I found that the uninstall is managed by the "AMD Catalyst Install Manager", which is listed in Programs and Features. While uninstalling AMD APP, I also uninstalled a couple of other AMD features I was not using.

The AMD APP V2.9 installer was downloaded and executed. I specified an alternate install path and no other options were presented. I installed to

e:\amd_app_2_9

The next step was to compile the OpenCL samples. Rather than rehash my frustrations with that here, I refer the audience to the thread that I started at the AMD Developer Central forum. A synopsis is as follows.

  1. The AMD documentation is very helpful but there are some inaccuracies.
  2. I did end up having to install the latest AMD Unified Driver. I just installed over top of the existing and had no issues.
  3. After the driver update and compilation of OpenCL, I was able to successfully run the OpenCL samples.

The coolest OpenCL demo that I ran was FluidSimulation2D. Mandelbrot was also rewarding. I can tell these demos are using the V5900 because I can hear the fan spin up. When time allows, I will experiment with more of the samples, peruse some source, and perhaps even learn a bit of OpenCL. But my main task is to learn OpenCV, so the next installment in this series will pertain to getting it installed and operating.

An important note from Dr. Harris Gasparakis' blog at http://developer.amd.com/community/blog/2013/11/05/opencv-cl-computer-vision-with-even-more-opencl-acceleration/

With the view towards the future, and OpenCL 2.0 in particular, there is a significant and ground breaking development! About a year ago, we asked the question: Is it possible run the same code on both CPUs and GPUs, and have the OpenCV runtime libraries determine if OpenCL capable devices are present, and use hardware acceleration if they are? By that I do not mean running OpenCL on CPU vs. running OpenCL on GPU – that already works. What I mean is run a native C++ implementation on CPU vs. an OpenCL implementation on the GPU, if, at runtime, it is determined that the system supports OpenCL.

It is not clear to me whether OpenCV 2.4.7 already does some of this. But I do know that the binding of OpenCV to OpenCL is different in 2.4.7. This is a question I will be investigating during my installation and testing of OpenCV.
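The run-time branching described in the quote can be sketched generically: probe once for an OpenCL-capable device, then dispatch to the accelerated or the native implementation. This is my own illustration, not OpenCV internals; the pyopencl probe is just one way such a test could be written, and the blur functions are stand-ins:

```python
# Generic run-time dispatch between an accelerated path and a native
# fallback, in the spirit of OpenCV's dynamic OpenCL dependency.
def make_dispatcher(have_opencl, gpu_impl, cpu_impl):
    # choose the implementation once, at startup
    return gpu_impl if have_opencl else cpu_impl

def blur_gpu(img):
    return ("gpu", img)   # stand-in for an OpenCL-accelerated kernel

def blur_cpu(img):
    return ("cpu", img)   # stand-in for the native C++ implementation

def detect_opencl():
    # illustrative probe; OpenCV's own detection is internal to the library
    try:
        import pyopencl
        return bool(pyopencl.get_platforms())
    except Exception:
        return False

blur = make_dispatcher(detect_opencl(), blur_gpu, blur_cpu)
```

The point is that callers always invoke `blur`; whether the work lands on the GPU or the CPU is decided at run time, which is exactly what lets one binary serve systems with and without OpenCL.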

Sunday, November 17, 2013

Hardware preparation for OpenCV development

Why I added a graphics card

My computer was working just fine, thank you. The Intel i5-2500K contains an Intel HD 3000 integrated graphics processor, which drives my dual HD monitors just fine. The reason that I added a graphics card was to get a GPGPU (general-purpose graphics processing unit). A GPU provides a great speed boost to any math-intensive processing, because all those graphics cores can be brought to bear on calculations which can be performed in parallel.

Some background

OpenCV supports two different GPU-based acceleration schemes. GPU math accelerators are usually proprietary and are developed and provided by the hardware manufacturer. I assume that the reason for their being closed is that the vendors have trade secrets related to the specialized GPU hardware and accompanying software. Even getting a vendor's GPU libraries often involves signing an NDA. The two schemes are OpenCL and CUDA.

OpenCL was initially developed by Apple, who still owns the trademark, but the technology was submitted to the Khronos Group, which is now responsible for publishing the technical standards. The current standard is OpenCL 2.0, but most implementations are still at OpenCL 1.x.

CUDA is a parallel computing platform and programming model invented by NVIDIA who both owns the trademark and develops and distributes the libraries. CUDA works with all Nvidia GPUs from the G8x series onwards, as well as the Tegra system on a chip (SoC) series.

OpenCV supports both OpenCL and CUDA for GPU-based acceleration. I've chosen to go the OpenCL route as there is a much wider choice of supported hardware. This is not in any way a disparagement of CUDA, which is a very strong technology backed by a very strong company. Anyone starting an OpenCV project should evaluate both OpenCL and CUDA at project initiation, and look at the available processors and boards. I had considered using an Nvidia Shield as an OpenCV development platform; it has a Tegra 4 chip and an available CUDA SDK. But I am trying to be risk-averse, and I could not find anyone else who has attempted to compile and run OpenCV on a Shield.

The FirePro V5900

After evaluating my options, I settled on an AMD FirePro V5900. AMD provides an OpenCL-compliant driver and SDK. While no stellar performer, the V5900 is reasonably priced: new they are less than $400, and I picked up a used one on eBay for less than $200. A good place to compare OpenCL performance is CompuBench; for example, see the results of their "Optical Flow" test. I had to "expand all" to get the V5900 to show up in the results. And while you are there, check out how well the Intel HD 4000 performs.

My desire was to run the V5900 "headless": to continue using the integrated graphics for my monitors and thus dedicate the GPU to OpenCL processing. But alas, when I plugged in the card, my motherboard's firmware automagically made the V5900 the graphics device. Perhaps with more research and experimentation I may have been able to achieve the desired configuration, but who knows how much time investment that would have been. A side benefit of having the V5900 drive the display is that Google Earth runs much better, and I can also run Outerra Anteworld (the V5900 just meets the minimum hardware requirements).

Installing a GPU is something that I hadn't done for many years. I was shocked by the size of the AMD Catalyst download. A great deal of bloatware gets installed. One can uninstall specific items, and I probably could have avoided installing them in the first place, but generally I am hesitant to uninstall things if my computer is in working order and I haven't run low on disk space.

In my next entry I will describe installing the AMD Accelerated Parallel Processing (APP) SDK. As an aside, this was my first posting created with Emacs, Markdown, and Pandoc.