Monday, June 29, 2015

SlickGrid Async Post Render Cleanup

This SlickGrid issue has been a bit of a hot potato.

Discussions at:
  github.com/mleibman/SlickGrid/issues/82
  github.com/mleibman/SlickGrid/issues/855
in particular.

So MLeibman has gone on record as saying that using jQuery data and event handlers in AsyncPostRender nodes is not a good idea and that allowing a cleanup might give people the idea that it's OK, which it really isn't.
The counter argument run along the lines that jQuery is used more and more for event and data binding, and that the amount of binding is a continuum - it may be used only lightly, all the way to fully fledged jQueryUI plugins, thus there may be a valid use case, and if so, having provided the means to implement it the grid should provide the means to clean up.

Having spent some time maintaining the SubSonic micro-ORM, one of the things I liked most about the project was the philosophy that the developer is in the end responsible for their decisions, and it is the place of the framework developer to simply offer as wide a range of options as possible - if necessary, also opening up the possibility of these options being abused.
So in this spirit, I decided to try to work out a flexible cleanup method for async post rendered nodes that would also work asynchronously, just like the post render method itself.

After examination of the code, and an initial attempt to code a solution, it became clear that there were three cleanup scenarios needing handling (remembering that one or more cells in any row might have been post rendered):
  1. deletion of an entire row from the DOM and grid cache
  2. deletion of a single cell from the DOM and grid cache
  3. the re-rendering of a previously rendered cell without cell deletion
In addition, the SlickGrid code has a workaround for a Mac scrolling issue that delays deletion of the grid node where a mouse scroll started (using var zombieRowNodeFromLastMouseWheelEvent). This is a row cleanup like the regular one, but it is handled in a different section of code and might easily escape notice. I'd just about bet that was what was causing Jonozzz's small remaining memory leak in Issue 82.

The first step was to add the example example10a-async-post-render-cleanup.html to the examples folder of my alternative master repository.

The modifications to SlickGrid added two new options to the Grid:
    enableAsyncPostRenderCleanup (bool)
    asyncPostRenderCleanupDelay (int)
and an asyncPostRenderCleanup (function) option to each column definition.
A cleanupBeforeRender parameter was added to the end of the AsyncPostRender function arguments.

Internally, where nodes were cleaned up using direct DOM manipulation previously, the code now checks to see if post render cleanup is enabled, and if the row or cell had been post rendered. If so, any post rendered cells are added to a cleanup queue, followed by the row node in the case of row deletion.
Any process that generates queue events afterwards kicks off the post render cleanup on timed delay.
A cleanup queue push example:
postProcessedCleanupQueue.push({
    actionType: 'C',
    groupId: postProcessgroupId,
    node: cacheEntry.cellNodesByColumnIdx[ columnIdx ],
    columnIdx: columnIdx | 0,
    rowIdx: rowIdx
  });
ActionType is C to clean up a post rendered cell, and R to delete a parent row node once the child cells have been cleaned up.GroupId groups cleanup records so we can clean up a row's worth of cells at a time (the same way they are post rendered). The cleanup function processes queue entries matching the GroupId of the first queue entry, then re-calls itself after a delay of asyncPostRenderCleanupDelay ms.
The node is the actual cell or row DOM node, and the column and row indexes may be used to reference the column or row data.

So case 1 (row deletion) is handled by adding any post-processed column nodes to the queue, then the row node.
Case 2 is handled by adding the cell to the queue if it has been post-processed.
Case 3 is handled by including a parameter cleanupBeforeRender (bool) in the PostRender call. If true, this indicates that the cell has been post-rendered already and is being re-rendered without deleting the cell, and that the render function should therefore clean up the previous actions prior to starting the new ones.

So as a summary, here are the key parts of the example page demonstration AsyncPostRender with cleanup:
var options = {
    ... ,
    enableAsyncPostRender: true,
    asyncPostRenderDelay: 50,
    enableAsyncPostRenderCleanup: true,
    asyncPostRenderCleanupDelay: 40
  };

  var columns = [
    ... ,
    {id: "chart", name: "Chart", formatter: waitingFormatter, rerenderOnResize: true,
        asyncPostRender: renderSparkline, asyncPostRenderCleanup: cleanupSparkline}
  ];

  function renderSparkline(cellNode, rowIdx, dataContext, colDef, cleanupBeforeRender) {
    ...
  }
  
  function cleanupSparkline(cellNode, rowIdx, colDef) {
    ...
  }

This provides a comprehensive API and async cleanup process, which should be able to handle cleanup gracefully, as long as the weight of the render and cleanup is not so great that it compromises the performance of the basic grid code.
I reiterate: it is up to the developer to plan, test and monitor the performance of the grid in conjunction with 'heavy' external controls under production conditions. It is easy to get into trouble. Use this tool wisely.

See the commit for details, and check out the sample page for usage notes.

Wednesday, June 3, 2015

SlickGrid with jQueryUI and Bootstrap

Recently I've taken up the mantle of maintaining an updated fork of MLeibman's fantastic SlickGrid trunk. I suppose it was inevitable that it would lead to a blog post.

There have been multiple issues posted about the jQuery Accordion and about Bootstrap 3 issues with sizing. After examination, the bootstrap issue is quite complex and I thought it was worth documenting the details of both.
The first step was to add stripped down, simple example pages for both cases to be used for testing. example-jquery-accordion.html and example-bootstrap-3-header.html are now present in the examples folder of my alternative master repository.

Accordion Formatting

The formatting of the SlickGrid header row was showing small inconsistencies in header size:








This was a minor CSS issue: it appears that in normal situations the header is formatted by the
.slick-header-column.ui-state-default class, which is being evaluated as more specific than (hence takes precedence over) the generic jQueryUI ui-state-default class.
When enclosed in the accordion div (and I'd assume tabs or any other similar container), ui-state-default gets precedence and adds extra border segments.
It is easily fixed by adding the !important tag to various SlickGrid header classes. This is exactly the kind of situation that !important is designed for.

.slick-header.ui-state-default, .slick-headerrow.ui-state-default {
  width: 100%;
  overflow: hidden;
  border-left: 0px !important;
}
.slick-header-column.ui-state-default {
  position: relative;
  display: inline-block;
  overflow: hidden;
  -o-text-overflow: ellipsis;
  text-overflow: ellipsis;
  height: 16px;
  line-height: 16px;
  margin: 0;
  padding: 4px;
  border-right: 1px solid silver;
  border-left: 0px !important;
  border-top: 0px !important;
  border-bottom: 0px !important;
  float: left;
}

Blank Grid Display in Accordion in IE8

More sinister was a problem with vanishing grid contents. I tested in IE8 but other IE versions may also be implicated.
Steps to reproduce using the accordion demo page:
1) open the page in IE 8 and scroll the first grid down a page or so.
2) switch to the second accordion, then back
The first grid should now be blank.

This is an IE rendering issue - IE resets the scrollbar and display when a div or its parent is hidden with display:none. Checking it out with the IE developer tools showed that the DOM elements still existed, but just weren't being shown. Probably because of this, all attempts to refresh the grid failed. Only destroying and recreating the grid was able to get past the blank display.

Because this workaround changes the screen UI, I have commented the code out in the demo page, but it is there if needed.
I was unable (and frankly unwilling) to find a solution for what appears to be an IE bug. If anyone finds a mechanism for refreshing the blank div, let me know and I'll bake it in to the grid.

Bootstrap 3 Column Width Issues

There have been many reports of column width issues under Bootstrap 3. It doesn't take long to find the culprit. Simply including the bootstrap.css file on your page includes this little gem:

* {
  -webkit-box-sizing: border-box;
     -moz-box-sizing: border-box;
          box-sizing: border-box;
}
*:before,
*:after {
  -webkit-box-sizing: border-box;
     -moz-box-sizing: border-box;
          box-sizing: border-box;
}

Admittedly, border-box is a much more sensible model, but this css forces border-box onto most elements on the page (some inputs are excluded).
The interesting thing is that the main (MLeibman) branch of SlickGrid, which is at jQuery 1.7, deals with this perfectly. The header height needs css tweaking, but the column widths resize and drag fine. It's only after we update jQuery that the trouble starts.

The problem is similar to the first image, but it only starts when resizing columns. The drag handle and column header size are offset by an amount equal to the padding and border widths of the column. Worse, the effect is cumulative each time the column is resized.

The reason is summarised here (thanks to JCReady for the heads up). The way jQuery handles box-sizing: border-box in relation to the .width and .outerWidth properties changed in jQuery 1.8. Before, .width essentially did a .css("width") which meant that it would return different numbers depending on the box-sizing setting. Afterwards, it returned the correct inner element size regardless of the box-sizing setting (note that it warns of the performance hit for this feature).
1.8 also allowed .outerWidth to be used as a setter.

Solution 1

A very easy (and tempting) solution is to follow in the footsteps of the css fixes above and add:

slick-header-column.ui-state-default {
   box-sizing: content-box !important;
}

This works just fine since the existing codebase was written to use the default box-sizing: content-box setting. However, it is conceivable that border-box could be needed on the header elements, particularly when using the menu and button features that are available. I resolved to rather solve the problem in the code.

Solution 2

Most of the forum fixes for the column sizing issue recommend replacing all occurences of .width with .outerWidth. This works for the  box-sizing: border-box case but manifests a mirror image problem with content-box (ie. the offset is negative instead of positive).
In order to preserve the correct operation under the old and new jQuery versions in both  box-sizing cases, it was necessary to sniff the jQuery version and provide an alternate code path.
In the end, it was only necessary to make a small adjustment to applyColumnHeaderWidths to solve the issue.
See the commit for code details.

Monday, November 3, 2014

IIS7 Mixed Windows and Forms Authentication

I recently undertook a project with the following requirements:
- an ASP .NET website hosted on IIS 7.5
- Windows authentication for users logged in to the organisation's internal domain (single sign-on)
- forms (user/password) authentication for anonymous public internet users

While simple enough in theory, it seems that this scenario is officially unsupported by Microsoft.

With Google searches yielding a mix of results, mostly for a mix of IIS versions and a variety of approaches, I was able to finally arrive at a solution. In the process, I learnt a lot about the ASP .NET/IIS authentication system.
Because the most relevant articles on StackOverflow come to the conclusion that this kind of Mixed Mode Authentication is not possible, I thought I'd write this post.

Firstly, I'd like to go over some of the principles that I found applied to this process.
Then I'll sketch the solution I used.
If you'd like to save time, I also offer a complete C# skeleton solution for download.

IIS Versions and Settings

Firstly, the authentication model fundamentally changed at IIS 7. This solution deals only with IIS 7+ 'Integrated Pipeline' mode authentication.
Earlier IIS versions or IIS 7+ running in 'classic' pipeline mode work in a different way and this post is not relevant to that situation.

In IIS 7 or 7.5, three authentication types need to be selected: Windows, Forms and Anonymous. You'll get a warning from IIS about mixing challenge and redirect authentication types, but you can ignore it.

Forms Authentication Overview

Let's cover a few characteristics of Forms Authentication:
- authentication is recorded and maintained by use of a cookie (except in very rare case not dealt with here)
- Forms Authentication safeguards access to the site: on initial request by an unauthenticated user, a redirection occurs to the 'login' page specified in the web.config, with the original request URL appended to the login page URL in the querystring (this is why it's a 'redirect' authentication type)
- if Forms Authentication is enabled, successful Windows Authentication does not prevent the forms-auth redirect from occurring: it is necessary for code to create a valid forms-authentication cookie before the site can be accessed
- specific folders or pages of the web site can be opened up to anonymous users, and can then be accessed without triggering the redirect mechanism

In the main <system .web=""> section of the web.config, the <authorization> section specifies the access for the whole site:
<authorization>
  <deny users="?">
    <allow users="*">
    </allow>
  </deny>
</authorization>
'?' represents anonymous users, and '*' all users (authenticated and non-authenticated/anonymous), so denying all anonymous users and then allowing all users is equivalent to allowing only authenticated users.

Then outside the main <system .web=""> section (ie. in the top level <configuration> section), we add authorization settings for specific pages or folders
<location path="css">
  <system .web="">
    <authorization>
      <allow users="?">
      </allow>
    </authorization>
  </system>
</location>
<location path="AccountRecover.aspx">
  <system .web="">
    <authorization>
      <allow users="?">
      </allow>
    </authorization>
  </system>
</location>
The first gives anonymous access to the 'css' subfolder of the root folder, and the second gives anonymous access to the 'AccountRecover.aspx' page.
It is common practice in ASP .NET to give anonymous access to 'content' folders such as image, css and js folders since a request to the Login page will often refer to images or css files necessary to deliver the look and feel of the site prior to a user logging in. There is rarely any security issue giving anonymous users access to these folders.
The AccountRecover.aspx page needs to be given anonymous access because, by definition, the user won't be logged in if they have forgotten their details and wish to recover their account. There are a few other pages that need this access also.

Windows Authentication Overview

- Windows authentication, if selected by itself, normally triggers IIS itself (not ASP .NET) to issue a HTTP 401 challenge on first page request for an unauthenticated user
- if Anonymous authentication is enabled, then authentication defaults to anonymous on first page request, and no Windows auth 401 challenge occurs. However, the web server may manually issue a 401 challenge at any point after this
- The 401 challenge is handled by the browser typically like this (different browsers can vary):
  (a) if the user is already logged in to a Windows domain, then the challenge is usually answered invisibly by the browser, providing the logged on credentials.
  (b) if the user is not logged on to a Windows domain, then a user name and password window will be displayed by the browser, inviting the user to log on to such a domain
  (c) the logon credentials are returned to the web server and if they pass authentication, the user is now logged in. If they fail then the browser displays the 401 Not Authorized error. The user can try again by navigating 'back'.

Authentication Settings in Asp .NET

A range of settings are mentioned by the various articles on authentication:

System.Security.Principal.WindowsIdentity.GetCurrent().Name

This is the identity of the IIS worker process.
It may be the default system/IIS process, or a service or user account if the application pool has been assigned to a particular account, or it may even be a dynamic user account if Identity Impersonation has been turned on.
However this is an IIS artifact, and nothing to do with us in the current scenario, so we will ignore it.

Request.ServerVariables["LOGON_USER"]

This is cited by various articles out there as the logged on Windows user. Perhaps this was relevant before IIS 7, but now it appears to echo Page.User.Identity.Name as below

Page.User.Identity.Name

If logged in via Forms Authentication, this contains the username used to log in.
If authenticated by Windows, it contains the name (including the domain) of the Windows identity used to answer the 401 challenge.


Issues Mixing Windows and Forms Authentication

There appears to be only a single successful strategy to mix the two types of authentication, and that is to use forms authentication as the basis for the login system, but also allow the user to trigger and respond to a 401 challenge, programmatically creating a forms authentication ticket if successful.

The issues with this are the following:

- Page.User.Identity.Name returns different values (as above) depending on what type of authentication is active. If Windows authentication is active, it can't be turned off except by closing and reopening the browser. Forms authentication is only active if Windows authentication is not.
For this reason every page in the application should be able to identify a user given either their Forms Authenticated username, or by their Windows credentials.
In particular, it should not be possible for one user to have a Forms username that is the same as another user's Windows credentials. Luckily, Windows credentials appear to always contain a domain and a backslash. The presence of a backslash in the username is a reasonable test for the type of authentication that is active.

- Windows authentication often seems to drop out mid-session, falling back to into forms mode using the generated forms auth cookie. I'm just warning about this; it's not a problem - unless relying on some aspect of one of the the types of authentication.

- the pre-built ASP .NET forms authentication and authorization management controls work fine for forms authenticated users, but choke for Windows authenticated users because they try to use the authenticated username (which is the Windows username) to look up the forms-based user records. It is possible to find the equivalent forms user name and programmatically feed it in to these controls to emulate forms auth when using Windows auth. For example, with the <asp:changepassword> control, the 'UserName' property allows us to do this

Putting It All Together

Given that most users requiring Windows authentication will already be logged in to their domain, the most seamless authentication result would be if we could silently test for successful Windows authentication immediately after the Forms Authentication redirection to the Login page, and if successful, create the Forms Authentication ticket and proceed to the originally requested URL.

We can trigger immediate Windows Authentication by sending the 401 challenge on first entry to the Login page, but the problem lies in what occurs if the Windows login attempt fails.
Remember that once the 401 challenge has been sent, things proceed smoothly if credentials are accepted, but if they fail the browser will bring up a 401 'Not Authenticated' error for the page where the attempt failed and block further access to the site. The user could no longer access the Login page to attempt forms-based user/password login.

Give this problem, the best compromise is to leave the Login page as a purely forms-based page, and offer a button to trigger windows-based authentication only for those who want it. If windows authentication fails, clicking 'back' will go back to the login page, or if totally blocked, the browser can be closed and re-opened.

Another logical approach would be to construct two separate entry pages, one for windows authentication users and one for forms authentication users. This was tested, but does not operate well within the redirect based authentication paradigm, and I could not reliably prevent the Windows Authentication URL from being redirected to the Forms redirect URL under some conditions, even if the page was added to the Forms Authentication anonymous access list as shown earlier.
This option was therefore discarded.

In summary, the key points are:
- the Windows login credentials for a user account should be stored as an additional column in the ASP .NET Membership provider user record (thus linking it to a Forms based user)
- the login screen should be forms based, and contain a button to trigger Windows login
- the Windows login button should respond with an HTTP 401 response challenge
- the login page and all other applications should be able to identify a Forms user given either their Forms Authenticated username, or by their Windows credentials. It must be ensured that Forms and Windows usernames are distinct and unique.
- where a user has successfully Windows Authenticated, but the credentials are not recognised, a separate linking page may be created to allow the user to log in using their Forms based username and password, and for the Windows credential to be saved against the Forms user database record.

final note!
I found this StackOverflow question.
Has anyone tried this approach? I note above that I couldn't get the two-pages approach to work because of the forms redirect, however I didn't know about the ability to specify authentication mode per-page. That could be an alternative.

Sample Application

At arrow-of-time.com, I am offering a complete Visual Studio C# skeleton solution for download. This solution contains the ASP .NET pages and classes to achieve all of the discussed behaviour, and detailed deployment instructions.


Varbinary(MAX), BLOBs and linked MSSQL tables in Microsoft Access

It is time to catalogue more of the crimes against humanity that is the evolution of Microsoft Access.
I've had document storage for OLE object fields in MS Access linked tables working fine for a while now, so swapping to SQL Server linked tables shouldn't be a problem, right ?
Wrong.
I started off trying to solve a seemingly innocent error message that occurred only when storing a document over a certain size, about half a megabyte. This snowballed into most of a weekend taking a tour through the entire history and inner workings of linked SQL tables.
I present the results here in the hope that it may save others.

(1) Overview of the MAX types


In MSSQL2005, Microsoft introduced the varchar(MAX), nvarchar(MAX) and varbinary(MAX) types. These replaced text, ntext and image. The advantages were that the new text types could be operated on with LIKE, LEFT and other text functions, and could be used in sorting, and that all the new types were stored in the database more efficiently.
The old text, ntext and image were listed as deprecated, and still have warnings that they will be removed completely in a future version of MS SQL Server, although they are still present in MSSQL2014.

MSSQL2000 did not have these MAX types, and was the last version in which text, ntext and image were not deprecated.

(2) Overview of MS Access Linked SQL Server Tables


Linked tables with an MSSQL back end in MS Access always use the ODBC driver, and always use DAO, which is the Access native data API. The DAO tabledef object stores the ODBC connection string in its .connect property.
There is the option to use an ODBC DSN to make the connection, in which case the connection string holds the name of the DSN, but not the server, database name or authentication details, since they are stored in the DSN. The table linker wizard always uses DSN based connections.
The connection string looks something like this:
ODBC;DSN=MyDSN;DATABASE=MyDB;Trusted_Connection=Yes

Another option is the DSNless connection, which specifies the server and database directly. The table's connection strings must be set using code, something like:
ODBC;DRIVER=SQL Server;SERVER=TESTSERVER\SQL2008EXPRESS;Trusted_Connection=Yes;DATABASE=MyDb;QuotedId=no;

(3) ODBC Drivers


There are essentially three ODBC drivers, which must be either chosen when creating a DSN, or specified as the DRIVER parameter in a DSNless connection:
a. SQL Server: the historical driver, released prior to 2005
b. SQL Native Client: the driver released with SQL2005
c. SQL Server Native Client 10.0: the version-specific driver for SQL Server. Other versions include 11.0, 12.0, etc.

You might expect because that the 'SQL Server' ODBC driver came out before MSSQL2005 existed, it might not handle the MAX datatypes gracefully, and you'd be correct.
What was surprising was that none of the other drivers handle it gracefully either.
Also, because MS Access 2003 came out before MSSQL2005 existed, we might also expect it not to work even with the later drivers (although Access 2003 SP3 added a lot of back-patched for things like this and DID come out after 2005).

However, I can confirm that MS Access 2007 and 2003, none of these drivers handle the MAX types correctly.

I suspect that this may be an Access version issue, or OS version issue, or driver version issue, but I don't have several days to go down that particular rabbit hole.

(4) Editing large data in MS Access


There are is a well documented way to write binary data to a back end table in MS Access. Essentially, a DAO or ADO recordset is opened and the .AppendChunk method is used to append byte arrays of data until the whole binary image is written.
Generally a 'chunk size' is specified and the image is broken up and written in chunks of that size, with a final leftover smaller chunk at the end or start.

However editing varchar(MAX) columns will almost always be through a table bound to a form.

As mentioned, writing to linked tables, whether an MSSQL back end or MS Access back end, and whether using DAO or ADO, ultimately uses DAO and the ODBC drivers because that's the base mechanism for linked tables.
This applies to all linked tables bound to forms.

The only way around the ODBC drivers is to open an ADO connection using OLEDB direct to the back end MSSQL database. This is by necessity a code-based activity. It cannot be used for bound data.

(5) Putting it all together


Now that we have had the overview discussion, we can proceed with the actual problem

Point 1: The Old DataTypes work Fine


Sticking with textntext and image appears to be the best option.
While this solution is flagged as NOT RECOMMENDED anywhere you google it, for reasons discussed earlier, it actually appears to be the only fully supported and tacitly recommended solution.
For example, the upsizing wizard has an option for this, and it's recommended by Microsoft to choose it.

So simply, if you want large data types in SQL Server to work across all scenarios for linked tables, use the old data types.

Point 2: The ODBC Drivers for the new MAX datatypes don't work at all


Using the 'SQL Server' ODBC driver, the varbinary(MAX) column linked as an OLE field in MS Access 2003 and 2007, but trying to insert a BLOB of more than about half a megabyte gave me the following:
Error: 7125 - ODBC Error:[Microsoft][ODBC SQL Server Driver][SQL Server]The text, ntext, or image pointer value conflicts with the column name specified. (in xf_Test_RWBlob:TestBlobRW)
Error: 0 - ODBC Error:[Microsoft][ODBC SQL Server Driver]Warning: Partial insert/update. The insert/update of a text or image column(s) did not succeed. (in xf_Test_RWBlob:TestBlobRW)

Using either of the 'Native' ODBC drivers, the varbinary(MAX) column linked as a fixed binary(255) field in MS Access 2003, and as an OLE object in MS Access 2007, but either Access version still gave me:
Error: 0 - ODBC Error:[Microsoft][SQL Native Client]String data, right truncation (in xf_Test_RWBlob:TestBlobRW_DAO)

As I said earlier, this may be an Access version issue, or OS version issue, or driver version issue. But long story short, I couldn't get it to work reliably, or, in fact, at all.

Point 3: The Workarounds are Partial


For just the case of the BLOB, it is possible to open an ADO connection in code using OLEDB direct to the back end MSSQL database, using a connection like:
Provider=SQLOLEDB;Data Source=TESTSERVER\SQL2008EXPRESS;Initial Catalog=MyDB;Integrated Security=SSPI;

This works. 
However this doesn't help us with bound linked tables containing nvarchar(MAX) and varchar(MAX) columns. They'll still throw errors if we add too much text to the control.
This would be workable if editing was done by unbound controls and data was saved with an ADO connection.

Summary


We are far better off sticking with the old, deprecated types. The alternatives are at best a messy, and at worst a massive, workaround.
Thanks, Microsoft, for once again breaking everything and leaving us to pick up the pieces.
I'd love it if someone proves me wrong here, and tells me how to do it properly. But I don't think that's going to happen.

Final Note


After writing this, I kept getting errors in MS Access 2003 on a table with an nText and an Image column using the 'SQL Server' driver. Eventually I discovered that if I moved the nText column to occur after the Image column in the table, the error disappeared. Then I tried using the later drivers. This also fixed the problem.
I have a motive for going with the old driver: many of my clients don't have the newer one installed, whereas the old one is always there. If I can make it work, it's less headaches for everyone.
Also, you'll probably laugh, but I still use Access 2003 for all my development. It's the last useable version, IMO. The newer version double the amount of clicks necessary to do anything. I can't stand them. I suppose one day I'll have to update. It's the rare issues like this that will probably force my hand.

Saturday, December 28, 2013

Positional MSAccess OleDb Parameters

A few years back, I wrote an MS Access data provider for SubSonic 2, the DAL generation tool.

It was a rather thankless task - the V2 project was pretty much obsolete by then - but it was extremely educational.
I'd like to share some tricks I was forced to come up with that I don't think I've ever seen published.
Surprisingly, JET SQL called from .NET can come up with the goods in many ways you might not expect.

If you want to check out the DataProvider code directly, it's 'AccessDataProvider.cs' on GitHub here.

Issue #1 - Positional Parameters

When connecting to an MS Access database in .NET, it's best to use an OleDb connection, since these are cited by Microsoft as being optimal for MS Access, and give us the widest range of functionality.
It is well known that parameters in OleDb queries are assigned by position only, and the names are disregarded entirely.
For example:
SELECT CategoryName, [comp 2] as x1, [cat 1] as x2, [comp 2] as x3 From Categories WHERE CategoryName=[cat 1];
would actually expect four individual parameters which would be assigned in order despite parameters 1 and 3, and 2 and 4 having the same names, effectively giving

SELECT CategoryName, ? as x1, ? as x2, ? as x3 From Categories WHERE CategoryName=?;

This makes it very difficult to construct the automated SQL statements required by a query tool such as SubSonic's.

After quite a bit of digging, what I found was that you actually CAN use named, positional parameters in access: the SQL just needs to be formatted in a precise way.
To return to the above example, the following ad-hoc SQL:

PARAMETERS [cat 1] Text(255), [comp 2] Text(255);
SELECT CategoryName, [comp 2] as x1, [cat 1] as x2, [comp 2] as x3 From Categories WHERE CategoryName=[cat 1];

will expect two parameters, and will re-use the values in the specified locations.
The types for the parameters (above: Text(255) ) are standard DDL types, as catalogued by Allen Browne here.

The parameter names must be standard JET SQL identifiers and as such may not contain the @ character, unlike in MS SQL Server's Transact SQL where the @ is mandatory.

When saved as a query in Access, SQL with parameter declarations are marked as a 'Stored Procedure' by OleDb, a different schema category from normal queries which are marked as a 'View' (see examples below on schema data retrieval).

Issue #2 - Fetching Database Metadata

In order to construct a DAL for an MS Access database, it is necessary to retrieve schema information about database objects.
The OleDbConnection offers a GetOleDbSchemaTable() method that returns most comprehensive schema data of any of the connection types.

DataTable dt = autoOleDbConn.GetOleDbSchemaTable(OleDbSchemaGuid.Columns, null);

However there are still two problems with the schema data retrieved.

(1) There is actually no way to determine whether a column is an Autonumber (the equivalent of a MSSQL identity).
An urban myth is circulating that COLUMN_FLAGS = 90 and DATA_TYPE = 3 (Int) means an Autonumber, but actually these conditions are met for any non-nullable Long field (of which an Autonumber is a subset).

(2) MS Access also has the 'Allow Zero Length String' property, which is unique among DBMS's as far as I know. This is not picked up by the schema data.

In the dataprovider, the way I solved this problem was to open the database object using DAO and use the DAO field attributes:
// Use DAO to extract parameters 
DAO.DBEngineClass dbc = new DAO.DBEngineClass();
DAO.Database db = dbc.OpenDatabase(
    GetAccessDBNameAndPathFromConnectionString(autoOleDbConn.ConnectionString), null, false, "");
...

    // if this is a table, extract addl DAO info
    DAO.Field dbField = db.TableDefs[tempTableName].Fields[tempColumnName];
    tempAllowEmptyString = dbField.AllowZeroLength;
    tempAutoInc = ((dbField.Attributes & (int)DAO.FieldAttributeEnum.dbAutoIncrField) != 0);
If any of this is not comprehensive enough, I'd recommend looking at the SubSonic provider code on GitHub as mentioned above.
There are a lot of useful bits and pieces there, including conversion between .NET DbType, .NET native type, DAO enumerated type and DAO DDL type.

Friday, December 27, 2013

VBA Collections: Using as a Hash Table for Fast String Lookup in Excel and Access

VBA collections are pretty simple beasts. They only have three methods: Add, Remove and Item, and a Count property. When adding values to the collection, we can include a string key for lookup.
Some sample code:
Dim TestColl As New Collection
TestColl.Add 153, "Greece"
TestColl.Add 23, "Japan"
now we can use an integer index to retrieve the n-th 1-based item (returns 23)
TestColl.Item(2)
or the string to retrieve the item by key (returns 153)
TestColl.Item("Greece")
We can also iterate the values:
Dim v As Variant

For Each v In TestColl 
  Debug.Print v
Next
There is some debate as to whether the collection uses a proper hashed lookup to retrieve the value when given a string key. Many sites/blogs recommend using the Dictionary object in the System.Scripting namespace.
While this may mostly work, over the years I have learned to avoid external dependencies in MS Access wherever possible. Hence I determined to investigate the lookup performance of the native VBA collection for both integer and string based retreival.

The following code builds a collection of 100,000 integer values and then looks up a particular value at the start, middle or end of the collection 10,000 times, so as to return a measurable time.
Public Sub TestCollectionItemLookup()
Dim TestColl As New Collection
Dim i As Long
Dim k As Long
Dim StartTime As Single
Dim Iterations As Long
    Iterations = 10000
    Debug.Print "Iterations=" & Iterations

    For i = 0 To 100000
        TestColl.Add i, CStr("k" & i)
    Next
    
    ' By Int Index, i=2
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(2)
    Next
    
    Debug.Print "By Int Index, i=2: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By Int Index, i=500
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(500)
    Next
    
    Debug.Print "By Int Index, i=500: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By Int Index, i=50000
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(50000)
    Next
    
    Debug.Print "By Int Index, i=50000: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By Int Index, i=99999
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(99999)
    Next
    
    Debug.Print "By Int Index, i=99999: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By String Index, i=500
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(CStr("k" & 500))
    Next
    
    Debug.Print "By String Index, i=500: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By String Index, i=99999
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(CStr("k" & 99999))
    Next
    
    Debug.Print "By String Index, i=99999: t=" & Format(Timer - StartTime, "0.000000")
    
    '*****************************************************************
    
    ' By Int Index, random lookup
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(CLng(Rnd * 100000))
    Next
    
    Debug.Print "By Int Index, random lookup: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By String Index, random lookup
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(CStr("k" & CLng(Rnd * 100000)))
    Next
    
    Debug.Print "By String Index, random lookup: t=" & Format(Timer - StartTime, "0.000000")
End Sub
Executing this sub printed the following results for me:
   
Iterations = 10000
By Int Index, i=2: t=0.007813
By Int Index, i=500: t=0.031250
By Int Index, i=50000: t=5.406250
By Int Index, i=99999: t=16.195310
By String Index, i=500: t=0.031250
By String Index, i=99999: t=0.027344
By Int Index, random lookup: t=6.164063
By String Index, random lookup: t=0.027344
We can draw the following conclusions:
- lookup in a collection by integer index just uses a linear for/next loop with a value comparison. Values at the start of the collection will be retrieved far more quickly than ones at the end
- lookup by string key IS hashed, and it makes no difference where in the array the values lies, although key lookup is at least an order of magnitude slower than direct array lookup would be (we would expect integer lookup for i=2 to be 2-3 times slower than an array lookup)
- the code also checks 10,000 tries of random retrieval, in case of some kind of caching was speeding up the key lookup after the first retrieval, but the results indicate this not to be the case, with the result very similar to the repeated lookup of a single value

Thus, VBA collections can be recommended for hashed string lookup for large collections, however should never be used for integer indexed lookup for large collections.
If integer lookup was required, it would be the best strategy to create an array and copy the values into it, then use the array for positional retrieval.

Wednesday, July 3, 2013

Random Number Generation in ASP .NET

I wrote a previous article about Generating a Random Password using the ASP .NET provider.

Thanks to Joakim Uddholm for the comment that the Random() function was not really secure.
In this case, I think it was secure enough, but that doesn't change the fact that it is certainly a weak point in the system, which could be exploited under certain conditions.

In the previous article I tackled the main problem with Random(), which is that it returns random numbers in a fixed repeatable sequence for a given seed value, and that when created with the default (empty) constructor it is seeded with the current time. This means that if multiple instances of Random() are instantiated in a short period of time (before the time changes), they will return precisely the same sequence of 'random' numbers.

Googling around, I see comments on StackOverflow that agree with Joakim's position that System.Random() simply isn't meant for any security related purpose. This is probably true, however I'm equally certain that people for a variety of reasons will inevitably attempt to use it for security functions.

For this reason I'm posting first an improved version of the static Random() wrapper using a seed that isn't able to be associated with the current time. I'm still using a periodic re-seeding behaviour because I'm concerned that someone may be able to recognise a section of  the fixed 'random' sequence and use that to predict new values. This just emphasises that we shouldn't really be using System.Random() at all.
Anyway, I think this one is probably as good as we'll get to secure random values using Random().

Secondly, I'm posting a method to return a random int up to a given value using System.Security.Cryptography.RNGCryptoServiceProvider
I'm definitely recommending using the second method for any security related purposes.

The improved System.Random wrapper:
private static Random randomNumGenerator = new Random();
private static DateTime lastRandomNumGeneratorSeedTime = DateTime.Now;

public static Random RandomNumGenerator {
  get {
    lock (typeof(Random)) {
      if (randomNumGenerator == null) {
        randomNumGenerator = new Random();
      } else {
        if (DateTime.Now > lastRandomNumGeneratorSeedTime.AddSeconds(1)) {

          randomNumGenerator = new Random(randomNumGenerator.Next(Int16.MaxValue) * DateTime.Now.Millisecond);
          lastRandomNumGeneratorSeedTime = DateTime.Now;
      }
    }
    return randomNumGenerator;
   }
  }
}
And the better, crypto derived method. I've read comments about the performance when retrieving multiple bytes, so I've split it to only retrieve as many random bytes as it needs. The basic code is mainly from this MSDN page.
  private static System.Security.Cryptography.RNGCryptoServiceProvider rngCsp
    = new System.Security.Cryptography.RNGCryptoServiceProvider();

  public static int CryptoRandomNumber(int maxRndValue) {
    // deal with byte and UInt16 values separately for performance reasons
    if (maxRndValue <= Byte.MaxValue) {
      byte[] randomNumber = new byte[1];
      do {
        rngCsp.GetBytes(randomNumber);
      }
      while (!IsFairRoll(randomNumber[0], maxRndValue, Byte.MaxValue));

      return (int)(randomNumber[0] % maxRndValue);
    }

    if (maxRndValue <= UInt16.MaxValue) {
      byte[] randomNumber = new byte[2];
      int rnd = 0;
      do {
        rngCsp.GetBytes(randomNumber);
        rnd = (int)(randomNumber[0] + randomNumber[1] * 256);
      }
      while (!IsFairRoll(rnd, maxRndValue, UInt16.MaxValue));

      return (rnd % maxRndValue);
    }

    int rnd1 = 0;
    byte[] randomNumber1 = new byte[4];
    do {
      rngCsp.GetBytes(randomNumber1);
      rnd1 = (int)(randomNumber1[0] + randomNumber1[1] * 256 
        + randomNumber1[2] * 256 * 256 + randomNumber1[3] * 256 * 256 * 256);
      if (rnd1 < 0) { rnd1 = (rnd1 + 1) * -1; } 
    }
    while (!IsFairRoll(rnd1, maxRndValue, int.MaxValue));

    return (rnd1 % maxRndValue);
  }

  private static bool IsFairRoll(int result, int maxRndValue, int arrayMaxValue) {
    // There are MaxValue / numSides full sets of numbers that can come up 
    // in a single byte.  For instance, if we have a 6 sided die, there are 
    // 42 full sets of 1-6 that come up.  The 43rd set is incomplete. 
    int fullSetsOfValues = arrayMaxValue / maxRndValue;

    // If the roll is within this range of fair values, then we let it continue. 
    // In the 6 sided die case, a roll between 0 and 251 is allowed.  (We use 
    // < rather than <= since the = portion allows through an extra 0 value). 
    // 252 through 255 would provide an extra 0, 1, 2, 3 so they are not fair 
    // to use. 
    return result < maxRndValue * fullSetsOfValues;
  }
The new password generation function from the previous post now looks like this:
public static string GenerateFriendlyPassword(int length) {
    string chars = "abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ0123456789";
    var password = new StringBuilder(length);

    for (int i = 0; i < length; i++) {
        password.Append(chars[Gbl.CryptoRandomNumber(chars.Length)]);
    }
    return password.ToString();
}
Thanks!