COMPSCI - Summer 2009: 2009

Saturday, August 29, 2009

ASP.Net, IIS and SQL Server – Integrated Security, Authentication and Impersonation

In one of my recent projects, I was responsible for designing an enterprise-wide ASP.Net application that would run on IIS, use Integrated Windows authentication, and connect to a backend SQL Server database using integrated security, i.e. SSPI. In other words, the requirement was that all network resource access had to use the integrated Windows security mechanism, i.e. a “trusted connection”.

This is a very common development scenario when working with ASP.Net applications. To use integrated security, we simply check the ‘Integrated Windows NT Authentication’ (challenge/response) option in IIS, set <identity impersonate="true" /> in our web.config file, and we are ready to go.

Or are we?

A very common stumbling block with the above setup appears when trying to connect to SQL Server: you end up getting one of the following two error messages:

Login failed for user '(null)'. Reason: Not associated with a trusted SQL Server connection.
Microsoft OLE DB Provider for ODBC Drivers (0x80040E4D)
[Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '\'.

What’s going on? The error occurs because the ‘anonymous access’ option is also turned on (checked) in IIS. When this is the case, anonymous access takes precedence over Integrated Windows authentication, so the user’s credentials are never passed. OK, so what can we do? Let’s turn off the ‘anonymous access’ option in IIS and try again to connect to SQL Server. Now you will likely end up getting the following error message:

Microsoft OLE DB Provider for ODBC Drivers error '80040e4d'
[Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user 'NT AUTHORITY\ANONYMOUS LOGON'.

Huh?? The reason is the 'double hop' that the authentication has to make. When the client authenticates with IIS, it proves the identity of the logged-in domain user through the NTLM challenge/response exchange. This is the ‘first hop’. Once IIS has authenticated the user, it is time to access SQL Server (or any other secured network resource). But NTLM never hands the user’s password to IIS, so IIS has nothing it can forward to the SQL Server machine; forwarding the identity would constitute a "second hop", which is not allowed for security reasons. Therefore, trying to access a SQL Server instance running on a machine other than the web server results in a logon failure.

So what is the solution to the problem? The two most commonly used solutions are as follows:

SQL Server authentication

SQL Server authentication relies on the internal user list maintained by the SQL Server computer. This list does not include Windows NT users and is specific to that SQL Server computer. You need to provide the “username” and “password” of a specific SQL Server user account in the connection string when connecting to SQL Server.

Windows NT authentication

This is what Microsoft’s TechNet article has to say: “To configure IIS for Windows NT authentication, you cannot use Windows NT Challenge\Response (NTLM) authentication”. This essentially means: do not connect to SQL Server using the caller’s trusted connection from an ASP.Net application when IIS uses Windows NT Challenge/Response. Instead, do one of the following:

Use ‘Basic Authentication’
Use anonymous access and follow the steps to set up proper authentication (refer to the article http://support.microsoft.com/kb/247931/)
Use Windows NT Authentication together with a specific “generic domain account” (ensure it has been given appropriate access and permissions on the SQL database). Then use either of the following two ways to provide the generic user credentials:

Specify a username and a password in the connection string (this is similar to SQL Server authentication)
Specify the username and password in the impersonation settings and still use a trusted connection. You can also encrypt the web.config section to protect the username/password information
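To make the difference between the two kinds of connection string concrete, here is a minimal sketch in JavaScript. The helper name buildConnectionString and the server, database and login values are made up for illustration; only the connection string keywords (Data Source, Initial Catalog, User ID, Password, Integrated Security) are the standard SQL Server ones.

```javascript
// Hypothetical helper (not part of any library): assembles a SQL Server
// connection string for either SQL Server authentication or a trusted connection.
function buildConnectionString(server, database, sqlLogin) {
    var parts = ['Data Source=' + server, 'Initial Catalog=' + database];
    if (sqlLogin) {
        // SQL Server authentication: an explicit SQL login travels in the string
        parts.push('User ID=' + sqlLogin.user, 'Password=' + sqlLogin.password);
    } else {
        // Trusted connection: the identity the code runs under is used instead
        parts.push('Integrated Security=SSPI');
    }
    return parts.join(';') + ';';
}

// Example values are placeholders
var trusted = buildConnectionString('MyDataSource', 'MyCatalog');
var sqlAuth = buildConnectionString('MyDataSource', 'MyCatalog',
    { user: 'appuser', password: 'secret' });
```

With the trusted form, the identity that reaches SQL Server is whatever the impersonation settings resolve to, which is why the second option above keeps credentials out of the connection string.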

Now that we have seen the approaches to solving the “double hop” problem, the next question is: “Can we impersonate an account at runtime, programmatically?” Say I have the following two accounts: (i) DOMAIN\User1 and (ii) DOMAIN\User2, where User1 has ‘write access’ while User2 has ‘full access’ to SQL Server (or any other resource). I want to execute certain code under a specific impersonated user's credentials and, once I am done with the network resource, continue with the original logged-in user’s credentials. Is this possible?

Yes, absolutely. The MSDN article “How To: Use Impersonation and Delegation in ASP.NET 2.0” explains it all. It covers the various impersonation scenarios, namely:

Impersonating the original caller. You want to access Windows resources that are protected with ACLs configured for your application's domain user accounts.
Impersonating the original caller programmatically. You want to access resources predominantly by using the application's process identity, but specific methods need to use the original caller's identity.
Impersonating a specific Windows identity. You need to use a specific identity or several Windows identities to access particular resources.
Using delegation to access network resources by using an impersonated identity. You need to use an impersonated identity to access remote resources.

That's all folks!

Wednesday, August 5, 2009

Modifying <appSettings> in web.config programmatically using AJAX, JQuery, JSON

Recently I was working on an ASP.Net project where there was a need to update the values stored in the <appSettings> section of the web.config file programmatically. Specifically, there is a maintenance screen with a tabbed interface, and one of the tabs, called 'Configuration', provides the user with an interface to 'Add' and 'Update' <appSettings> values. One of the main requirements was that it had to be completely AJAX enabled, and the use of 'UpdatePanels' was not allowed.

You can find a lot of resources on the web explaining how to modify web.config programmatically, and it is an easy thing to do. The main challenge lies in making it completely AJAX enabled (and there is not much information out there on this topic). The steps to achieve that are as follows:

1. Create a server method (e.g. GetConfigSettings()) to enumerate all the <appSettings> values
2. Create a custom serializable type called 'ApplicationSetting' containing public members 'Key' and 'Value'; for each appSettings entry, create a corresponding 'ApplicationSetting' object and put it in a generic list (named 'AppSettingsList') of type 'ApplicationSetting'
3. Serialize the list data (from Step 2) in JSON format using JavaScriptSerializer
4. Invoke GetConfigSettings() from client side JavaScript on page load and retrieve the data (use PageMethods)
5. Process the data (from Step 3) and create a user interface (as shown in Fig 1)
6. Provide the ability to add new values
7. Process all the newly entered and updated values on the client side, ensuring duplicates are handled properly, and create a JSON object
8. On clicking the 'Save' button, invoke a server method (e.g. SaveConfigSettings()), passing the configuration settings (in the form of the JSON object)
9. On the server side, deserialize the data back to a list of objects of type 'ApplicationSetting'
10. Loop through the list (from Step 9) and add or update the corresponding key/value appSettings entry
11. Save the web.config file
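Steps 7 and 8 are where most of the client-side care is needed, so here is a minimal sketch of that part on its own. It is an assumption-laden alternative to building the JSON by hand: the helper name buildSettingsPayload is hypothetical, it relies on the browser providing JSON.stringify, and it leaves out any end-of-list marker entry the server might expect.

```javascript
// Hypothetical helper for steps 7 and 8: pairs keys with values, drops blank
// entries and repeated keys (the first occurrence wins), and lets
// JSON.stringify take care of quoting and backslash escaping.
function buildSettingsPayload(keys, vals) {
    var seen = Object.create(null); // keys already emitted
    var settings = [];
    for (var i = 0; i < keys.length; i++) {
        var key = keys[i];
        if (key === '' || vals[i] === '' || seen[key]) {
            continue; // skip blanks and duplicates
        }
        seen[key] = true;
        settings.push({ Key: key, Value: vals[i] });
    }
    return JSON.stringify(settings);
}

var payload = buildSettingsPayload(['a', 'b', 'a', ''], ['1', 'C:\\x', '3', '4']);
```

The resulting string has the same Key/Value shape that the deserialization in step 9 works with, so it could be handed to the PageMethods call unchanged.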

Now let's get cracking at the code (the fun stuff!!!). Let's start by looking at the UI screen.

(Fig 1)

The UI screen is self-explanatory. It consists of two textboxes, 'key' and 'value', for each of the existing appSettings entries. Clicking on the 'Add' button adds a new pair of textboxes to enter a new key/value pair.

Clicking on the 'Cancel' reloads the form with the original (existing) values.

Define the public method GetConfigSettings(), mark it as a 'WebMethod()' and make it 'Shared' so that it can be invoked from client side JavaScript using PageMethods. SaveConfigSettings() is defined the same way.

 <WebMethod()> _
 Public Shared Function GetConfigSettings() As String
   'Shared methods cannot touch instance members, so delegate to a service layer object
   Dim bsrtent As ServiceLayer = New ServiceLayer
   Return bsrtent.LoadConfigSettings()
 End Function

 <WebMethod()> _
 Public Shared Function SaveConfigSettings(ByVal newconfigsettings As String) As String
   Dim bsrtent As ServiceLayer = New ServiceLayer
   Return bsrtent.SaveConfigSettings(newconfigsettings)
 End Function

Once we have marked a server method as a 'WebMethod', we can no longer work with non-static members because the method is 'Shared'. So we need to define a class (essentially a service layer), instantiate an object of it and invoke its public instance members to do the necessary work.

The LoadConfigSettings function of the ServiceLayer is as follows:

The SaveConfigSettings() function of ServiceLayer is as follows:

Now to the client side JavaScript code (where all the action is!!!). We assume you have properly referenced the JQuery library in your project, use a ScriptManager on your page and have a form with a div whose id is 'tab-Config'. The JavaScript code in itself provides lots of insight into JQuery and its usage. (I'll blog about some of these in greater detail in the future.)

 $(function() {   
    PageMethods.GetConfigSettings(LoadConfigSettingsSuccess);
});

function LoadConfigSettings() {
   
    if (CONFIGSETTINGS) {
        var count = CONFIGSETTINGS.length;
        var configrow;
        if (count > 0) {
            $('#tab-Config').empty();
            var msg = '<b class="MSGRED"><u>Note</u>: Changing the web.config file will cause the application to re-start.</b>';

            var configtable = $('<table><tr><td class="funcheader" align="center">Key</td><td class="funcheader" align="center">Value</td></tr></table>');
            for (var i = 0; i < count; i++) {
                var configrow = $('<tr></tr>');
                var key = CONFIGSETTINGS[i].Key;
                var val = CONFIGSETTINGS[i].Value;
                var keytxt = $('<td><input type="text" name="configkey" id="configkey' + key + '" class="namemaptextbox" value="' + key + '"/></td>');
                var valtxt = $('<td><input type="text" name="configval" id="configval' + key + '" class="namemaptextbox" value="' + val + '"/></td>');
                configrow.append(keytxt);
                configrow.append(valtxt);
                configtable.append(configrow);
            }
            var buttontable = $('<table></table>');
            var addkeybutton = $('<td><input type="button" class="clickbutton" name="addconfigbutton" id="addconfigbutton" value="Add"/></td>');
            var saveconfigbutton = $('<td><input type="button" class="clickbutton" name="saveconfigbutton" id="saveconfigbutton" value="Save" onclick="SaveConfigSettings();"/></td>');
            var cancelconfigbutton = $('<td><input type="button" class="clickbutton" name="cancelconfigbutton" id="cancelconfigbutton" value="Cancel" onclick="LoadConfigSettings();"/></td>');
            buttontable.append($('<tr></tr>').append(addkeybutton).append(saveconfigbutton).append(cancelconfigbutton));

            $('#tab-Config').append(msg).append(configtable).append(buttontable);

            //Reset the width of the value textboxes
            $('div#tab-Config input[id^=configval]').css('width', '200px');

            addkeybutton.click(function() {
                var newkeytxt = $('<td><input type="text" name="configkey" class="namemaptextbox" value=""/></td>');
                var newvaltxt = $('<td><input type="text" name="configval" class="namemaptextbox" value="" style="width:200px"/></td>');
                configtable.append($('<tr></tr>').append(newkeytxt).append(newvaltxt));
            });
        }

    }
}

function SaveConfigSettings() {
    var keys = new Array();
    var vals = new Array();
    var newconfigsettings = "[";
    $('#tab-Config input[name=configkey]').each(function() {
        keys.push(this.value);
    });
    $('#tab-Config input[name=configval]').each(function() {
        vals.push(this.value);
    });

    //Generate the JSON objects, keeping only the first occurrence of each key
    for (var i = 0; i < keys.length; i++) {
        //Check whether this key already appeared in an earlier row
        var dupkey = false;
        for (var j = 0; j < i; j++) {
            if (keys[j] === keys[i]) {
                dupkey = true;
                break;
            }
        }
        //keys[i] and vals[i] belong to the same row, so a duplicate row is simply skipped
        if (!dupkey && (keys[i] !== '') && (vals[i] !== '')) {
            newconfigsettings += '{"Key":"' + keys[i] + '","Value":"' + vals[i].replace(/\\/g, '\\\\') + '"},';
        }
    }
    newconfigsettings += '{"Key":"EOF","Value":""}]';

    PageMethods.SaveConfigSettings(newconfigsettings, SaveConfigSettingsSuccess, SaveConfigSettingsFailure);
}

function SaveConfigSettingsSuccess(result) {
    var success = eval(result);
    if ((success) && (success.Message)) {
        alert(success.Message);
        PageMethods.GetConfigSettings(LoadConfigSettingsSuccess);
    }
}

function SaveConfigSettingsFailure(result) {
    var err = eval(result);
    alert(err.get_message());
}

function LoadConfigSettingsSuccess(result) {
    eval(result);
    LoadConfigSettings();
}
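One caveat about LoadConfigSettings() above: keys and values are concatenated straight into the HTML for the textboxes, so a value containing a double quote or an angle bracket would break the markup. A minimal sketch of an escaping helper is shown below; the name escapeHtml is hypothetical and the function is not part of JQuery.

```javascript
// Hypothetical helper: escapes the characters that are significant in HTML
// so a setting can be placed safely inside value="..." of a textbox.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')   // must run first so later entities are not re-escaped
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

var escaped = escapeHtml('a"b<c>&\'');
```

In LoadConfigSettings() it would wrap the interpolated values, for example value="' + escapeHtml(val) + '".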

And that is all (really!!!) it takes to create a pure AJAX enabled UI to programmatically modify web.config appSettings entries. The same concept can be extended to update/modify other sections of the web.config (or any data) in an ASP.Net application.

Wednesday, July 8, 2009

Different ways to populate a dropdown list in ADO.NET Entity Framework

One of the most common tasks in web-application development is to fill a dropdown list with values from a database table or from a collection. With ASP.Net the various approaches are well documented. What I want to show here is the various ways to populate a dropdown list when working with the ADO.Net Entity Framework (using EntitySQL). So let’s get right to the code!

First let us assume that you have created an Entity Data Model (.edmx file), named it as “MyEntities”, the model name being “MyModel” and one of the entities added is the “Owners” (mapped to the “OwnersTable” table in the database) as shown below:

There will be the following entry in the web.config:

<add name="MyEntities" connectionString="metadata=res://;provider=System.Data.SqlClient;provider connection string=&quot;Data Source=MyDataSource;Initial Catalog=MyCatalog;Integrated Security=True;MultipleActiveResultSets=True&quot;" providerName="System.Data.EntityClient" />

Now let us assume that the page MyTestPage.aspx has the following dropdown list control, which we want to populate with the records of the Owners entity on page load:

<asp:DropDownList runat="server" ID="ddlOwners" AutoPostBack="false"></asp:DropDownList>

Let us create a private procedure named PopulateOwners which will be invoked from the Page_Load event of MyTestPage.aspx page.

Method 1
This is the most straightforward way of populating a dropdown: using the DataSource property and the DataBind method of the dropdown list control. _ent.Owners() returns an “ObjectQuery” which we assign to the ‘DataSource’ property of the dropdown and then invoke ‘DataBind’.

Public Sub PopulateOwners()
  Dim _ent As MyEntities = New MyEntities
  ddlOwners.DataSource = _ent.Owners()
  ddlOwners.DataTextField = "owner_name"
  ddlOwners.DataValueField = "owner_id"
  ddlOwners.DataBind()
End Sub

Method 2
Another way to populate the dropdown list is to loop through the Owners object ‘collection’. _ent.Owners() returns an “ObjectQuery” on which we use the extension method AsEnumerable, which returns the input typed as System.Collections.Generic.IEnumerable(Of T). We can then use an enumerator to loop through the collection and populate the dropdown list.

Public Sub PopulateOwners()
  Dim _ent As MyEntities = New MyEntities
  Dim en As IEnumerator = _ent.Owners().AsEnumerable.GetEnumerator()
  While en.MoveNext
    Dim o As Owners = DirectCast(en.Current, Owners)
    'ListItem expects strings, so convert the id explicitly
    ddlOwners.Items.Add(New ListItem(o.owner_name, o.owner_id.ToString()))
  End While
End Sub

Method 3
Another approach is to create a LINQ query against the “Owners” entity, which returns an IQueryable object that we can loop through to populate the dropdown list.

In this example we also see how to add a ‘blank item’ or default entry to the list, e.g. the very first value in the list should display “Select an Owner”. The same can be applied to Method 2. With Method 1 it does not work as is, because data binding replaces any items added by hand unless the dropdown's AppendDataBoundItems property is set to true.

Public Sub PopulateOwners()
  Dim _ent As MyEntities = New MyEntities
  'Default entry shown at the top of the list
  ddlOwners.Items.Add(New ListItem("Select an Owner", ""))
  Dim qry = From a As Owners In _ent.Owners Order By a.owner_name
  'An empty result simply adds nothing, so no separate Count query is needed
  For Each o As Owners In qry
    ddlOwners.Items.Add(New ListItem(o.owner_name, o.owner_id.ToString()))
  Next
End Sub

Method 4
In this method I’ll demonstrate how to use ‘ObjectDataSource’ and ‘Entity Framework’ together to populate a dropdown list. Populating a list is just one of the many things you can do with this approach.

First I’ll need a Data Access Layer (DAL) class which will interact with my Entity Model. Let’s name it MyEntityDAL. It contains a constructor which instantiates the MyEntities object and a public Owners() function that queries the Owners entity in the Entity Model, retrieves all the records and returns them in the form of a DataTable. In this example I am returning a DataTable, but you can return any type of ‘collection’, e.g. IList, DataSet, IQueryable etc.

Public Class MyEntityDAL
    Private _ent As MyEntities

    Public Sub New()
      _ent = New MyEntities
    End Sub

    Public Function Owners() As DataTable
        Dim results As New DataTable
        results.Columns.Add(New DataColumn("OwnerID", GetType(String)))
        results.Columns.Add(New DataColumn("OwnerName", GetType(String)))
        'Add an empty row
        Dim _emptyrow As DataRow = results.NewRow()
        _emptyrow.Item("OwnerID") = ""
        _emptyrow.Item("OwnerName") = "Select an Owner"
        results.Rows.Add(_emptyrow)
        Dim qry = From a As Owners In _ent.Owners Select a.owner_id, a.owner_name
        If qry.Count > 0 Then
            For Each rec In qry
                Dim dr As DataRow = results.NewRow()
                dr.Item("OwnerId") = rec.owner_id.ToString
                dr.Item("OwnerName") = rec.owner_name
                results.Rows.Add(dr)
            Next
            results.AcceptChanges()
        End If
        Return results
    End Function
End Class

Now that we have our DAL ready, let's add an ObjectDataSource to our ASPX page.

<asp:ObjectDataSource ID="odsOwners" runat="server"  SelectMethod="Owners" TypeName="MyEntityDAL">
</asp:ObjectDataSource>

Next let’s re-write our owner dropdown list control as follows:

<asp:DropDownList runat="server" ID="ddlOwners" AutoPostBack="true" DataSourceID="odsOwners" DataTextField="OwnerName" DataValueField="OwnerID">
</asp:DropDownList>

And that’s it! Now we are able to populate the dropdown list declaratively. If you want, you can even modify Method 1 to make use of the MyEntityDAL object and set the DataSource property to the result of the Owners() function.

Wednesday, June 24, 2009

Using JQuery UI Tab with ASP.Net UpdatePanels

With the advent of AJAX, web development has never been the same. Developers (and businesses) are demanding more and more WinForms-like UI interaction from web applications (think Web 2.0). To support that need, ASP.Net has come up with the "AJAX Control Toolkit", which provides a rich set of UI controls and widgets like Tabs, Calendar, Datepicker etc. These are all fantastic AJAX controls, but they are all server based. It would be great if they were completely client side controls. As if on cue, enter jQuery UI, which provides advanced effects and high-level, themeable widgets, built on top of the jQuery JavaScript Library, that you can use to build highly interactive web applications.

One of the most widely used JQuery UI widgets is the "Tab" control, which allows you to easily separate content. It is also a great way to make effective use of the "real estate" on the web page. Another advantage of using a client side "Tab" control is that it allows you to have more than one form on the page if you need to. With the ASP.Net AJAX 'Tab' you cannot have "multiple web-forms".

So when do we need "multiple forms"? Think about a "Maintenance" option in a web-application, e.g. a maintenance page which allows you to add users and roles, maintain reference table data etc. Each function needs its data to be posted back to the server, which immediately brings us to the point where we want that to happen asynchronously or "AJAX style". There are two ways to achieve this goal:

Use JQuery's AJAX capabilities
Use ASP.Net UpdatePanels

With ASP.Net UpdatePanels we have to write much less code (ASP.Net does all the work for us) than if we used JQuery AJAX, though UpdatePanels perform worse than pure JavaScript AJAX. I also want to point out that it is fine and feasible to mix client and server side AJAX controls to get the best of both worlds.

Let's look at an example of a simple JQuery UI Tab markup using ASP.Net UpdatePanels:

  <div id="tabsMaintainTables">
          <ul>
              <li><a href="#tab-Owners">Owners</a></li>
              <li><a href="#tab-Status">Status</a></li>
              <li><a href="#tab-Roles">Roles</a></li>
          </ul>
          <div id="tab-Owners">
            <asp:UpdatePanel runat="server" ID="TableMaintOwnersUpdatePanel" UpdateMode="Conditional">
            <ContentTemplate>
            ...
            </ContentTemplate>
            </asp:UpdatePanel>
          </div>
          <div id="tab-Status">
            <asp:UpdatePanel runat="server" ID="TableMaintStatusUpdatePanel" UpdateMode="Conditional">
            <ContentTemplate>
            ...
            </ContentTemplate>
            </asp:UpdatePanel>
          </div>
          <div id="tab-Roles">
            <asp:UpdatePanel runat="server" ID="TableMaintRolesUpdatePanel" UpdateMode="Conditional">
            <ContentTemplate>
            ...
            </ContentTemplate>
            </asp:UpdatePanel>
          </div>
  </div>

The JQuery code to display the 'Tabs' is as follows:

 $(function() {
     $("#tabsMaintainTables").tabs();    
 });

So far so good. Now when we submit the web-form, the UpdatePanel packages the full postback as an AJAX call for us. Once the AJAX call completes, the page is effectively set up again (as it would be after a normal load). This causes the 'tab-index' to be reset, i.e. if the form was submitted while the 3rd tab (in our case the Roles tab) was selected, after the post-back the selected tab is the 1st tab.

So the obvious question is how can we maintain the selected tab-index of jQuery UI tabs after the ASP.Net UpdatePanel post back completes?

One advantage of using the ASP.Net AJAX Control Toolkit's 'Tab' control over the jQuery UI 'Tab' is that if you use an UpdatePanel with the former, support for maintaining the 'tab-index' after the post-back is provided out of the box. No extra code is needed. Unfortunately, when using the jQuery UI Tab we need to perform the following extra steps to achieve the same functionality:

Trap the client side events when the ASP.Net AJAX request begins and ends
In the begin request event handler save the selected tab-index in a global variable
In the end request event handler, initialize the jQuery UI tab control and use the "saved index" to set the focus to proper tab

Trapping the client side events:

 Sys.WebForms.PageRequestManager.getInstance().add_endRequest(EndRequestHandler);
 Sys.WebForms.PageRequestManager.getInstance().add_beginRequest(BeginRequestHandler);

Saving the currently selected tab-index at the beginning of the AJAX request:

 //Global variable
 var selectedMaintenanceTabIndex;
 
 function BeginRequestHandler(sender, args) {
 ...
 var maintenancetabs = $("#tabsMaintainTables").tabs();
 selectedMaintenanceTabIndex = maintenancetabs.tabs('option', 'selected');
 ...
 }

Retrieving the saved tab-index at the end of the AJAX request and selecting the proper tab again:

 function EndRequestHandler(sender, args) {
 ...
 var maintenancetabs = $("#tabsMaintainTables").tabs();
 maintenancetabs.tabs('select', selectedMaintenanceTabIndex);
 ...
 }
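The save-and-restore logic in the two handlers can also be isolated from the tab widget itself, which makes it easier to reuse for more than one tab control. The sketch below is only an illustration: createTabIndexKeeper is a hypothetical helper, and the two callbacks are assumed to wrap whatever calls the tab widget offers for reading and setting the selected index.

```javascript
// Hypothetical helper: remembers the selected index when a request begins
// and restores it when the request ends. getIndex and setIndex are supplied
// by the caller and wrap the tab widget's own API.
function createTabIndexKeeper(getIndex, setIndex) {
    var savedIndex = 0;
    return {
        beginRequest: function () { savedIndex = getIndex(); },
        endRequest: function () { setIndex(savedIndex); }
    };
}

// Wiring sketch (browser only; commented out so the helper stays runnable on its own):
// var keeper = createTabIndexKeeper(
//     function () { return $("#tabsMaintainTables").tabs('option', 'selected'); },
//     function (i) { $("#tabsMaintainTables").tabs().tabs('select', i); });
// var prm = Sys.WebForms.PageRequestManager.getInstance();
// prm.add_beginRequest(keeper.beginRequest);
// prm.add_endRequest(keeper.endRequest);
```

Keeping the widget calls in the two callbacks means they are the only part that needs to change if the tab API differs, for example in newer jQuery UI releases that expose the index through the 'active' option.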

And that's it! This is all there is to maintain tab-index with jQuery UI tabs when using it with ASP.Net UpdatePanels.

Thursday, June 18, 2009

Understanding the concept of ‘unlimited’ or ‘infinite’ email storage space

In 2005, Google made the astonishing announcement that it would keep increasing Gmail’s email storage by the second for as long as it had enough space on its servers. Currently Gmail provides more than 7338 MB of free storage. It is indeed intriguing how it can provide this ‘infinite’ storage space for its entire subscriber base. Is it possible at all? After all, there is only so much ‘finite’ storage space available. So how can we get to ‘infinite’ storage?

If you search, you will see discussion and speculation galore on the nature of its physical servers, the amount of storage (in petabytes) it owns and the type of storage it might be using, such as holographic, network or distributed storage. Discussion also revolves around the idea that most email users will use less than 25%-30% of the available storage, so we will technically never run out of physical space, or that Google will keep buying servers to keep pace with demand.

But for some reason there is not much discussion or information on the very interesting math and science behind this concept. So I have decided to talk about a simple mathematical model that can be used (based on my understanding, and I am no math whiz kid!) to describe this ‘unlimited’ growth of the storage space. If you have observed carefully, the storage space on Gmail kept increasing at a faster pace in the beginning and then started to slow down considerably.

“On October 12, 2007 the rate of increase was 5.37 MB per hour.

Approximately a week later, the rate decreased to 1.12 MB per hour, on January 4, 2008 further down to about 3.35 MB per day, or 0.14 MB per hour, and in October 2008 further down to about 353.9 KB per day.” – Wikipedia

How can we achieve this kind of behavior, i.e. write a software program (aka an algorithm) that starts with an initial value, increments it at a fast rate for some period of time and then starts to slow down (increments at a slower rate)?

To answer that question we need to understand the concept of ‘Function Growth’, which in simple terms can be described as the rate at which the value of a given function grows in relation to the function’s current input value. Different families of functions grow at different rates, e.g. you can have constant growth O(1), linear growth O(n), exponential growth O(2^n), logarithmic growth O(log n) etc. Of these, logarithmic growth is the one we are most interested in for our case. Why? Because the growth rate of a log function is very similar to the growth rate observed in the ‘unlimited’ growth of the email storage.

What I am going to do next is create a program that simulates the ‘unlimited’ growth of email storage. Let us make some basic assumptions first. We will assume our initial storage starts at 5000 MB (5 GB). We will increase the storage every second by some ‘factor’. The simulation will run until the storage reaches 10000 MB (10 GB). We will then observe how long it takes to get from 5 GB to 10 GB.

Since we are simulating the growth “every second”, we consider the total number of seconds in a day (which is 1 * 60 * 60 * 24 = 86400). So we start from 1, and once we have reached the 86400th second we consider a day to have gone by and start again from second 1. We will use the following function: fn = c * [log(s) / (s * d)], where c is some constant, s is the current second of the day and d is the current day.

The simple code is as follows:
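A minimal sketch of such a simulation in JavaScript follows. Two things are assumed because they are not stated above: log is taken to be the natural logarithm, and c is set to 10. The function names are hypothetical. Since the sum of log(s)/s over the 86400 seconds is the same every day, the sketch computes it once and divides by the day number instead of looping over every second of every day.

```javascript
// Sketch of the growth model fn = c * log(s) / (s * d).
// Assumptions: natural logarithm, c = 10, storage recorded at the end of each day.
var SECONDS_PER_DAY = 60 * 60 * 24; // 86400

// Sum of log(s)/s over one day; identical for every day, so it is computed once.
function dailyLogSum() {
    var sum = 0;
    for (var s = 1; s <= SECONDS_PER_DAY; s++) {
        sum += Math.log(s) / s;
    }
    return sum;
}

// Returns one row per day: day number, size at end of day, daily and overall growth (MB).
function simulateStorage(startMb, targetMb, c) {
    var perDayBase = c * dailyLogSum();
    var size = startMb;
    var history = [];
    var day = 0;
    while (size < targetMb) {
        day++;
        var growth = perDayBase / day; // growth shrinks as the day number increases
        size += growth;
        history.push({ day: day, size: size, dailyGrowth: growth, totalGrowth: size - startMb });
    }
    return history;
}

var rows = simulateStorage(5000, 10000, 10);
```

Each row carries the same four columns as the table described below, so printing rows gives a day-by-day view that can be plotted directly.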

And if you run the program you will see the following output:

The first column is the "day", the second column shows the "storage size" at the end of the day, the third column displays the daily growth and the last one shows the overall growth. If you now plot a graph of Day vs Size you will get something like this:

Can you see now what's going on? Starting from day 1, it takes about 1300 days, i.e. roughly 3.5 years, to reach 10 GB. By changing the value of the constant 'c', you can control the overall rate. We also notice that the storage grew by almost 60% (up to 8000 MB) in the first 60 days. Then it slowed down considerably and grew at a much slower pace.

So effectively what we are seeing is that, although the growth happens every second and gives the illusion that we are marching towards infinity, in practical terms it could take years before we run out of physical storage space. And who knows, by then we might have found a way to really have infinite storage.

Friday, June 12, 2009

ASP.NET 2.0 Master Pages - accessing server side control from JavaScript

Master pages are the latest and greatest addition to ASP.NET 2.0. They help us build consistent and maintainable user interfaces. But as with every new thing, they are not without their gotchas. One of the most commonly faced problems is accessing server side controls from client side JavaScript. This is because both the MasterPage and Content controls are naming containers. A naming container is any control that implements the INamingContainer interface, and one thing a naming container does is mangle its children’s ClientID property. Mangling ensures that all ClientID properties are unique on a page.

Let us consider the following simple Master-Content page:

So for instance, the ID of our Label control is “MyLabel”, but the ClientID of the Label is "ctl00_BodyContent_MyLabel". Each level of naming container prepends its ID to the control (the MasterPage control ID in this form is ctl00). So if we try to access this control from JavaScript with client side functions like document.getElementById(), it will fail with a JavaScript error: " 'MyLabel' is undefined ".

In the content page, I have two server side controls. One is a Label and the other is a Button. On every click of the button I want to display the current time in the label, and I want to do it without a post-back (obviously!!!), i.e. from client side JavaScript. So I write a JavaScript function called SetTime() and attach it to the 'onclick' event of the button. In the SetTime() function we calculate the current time and then set the value on the label control.

As I mentioned earlier if we do MyLabel.innerHTML = _curtime or document.getElementById(MyLabel).innerHTML = _curtime, in both cases we would get the 'MyLabel' is undefined error.

So how to fix the problem?

Solution 1: One possible solution is to directly use the generated client side ID of the Label control.

ctl00_BodyContent_MyLabel.innerHTML =_curtime ;

But as you can see, this is really bad practice, as we’d never want to hardcode the client ID into a script. Typically we would build the SetTime() JavaScript function dynamically using StringBuilder or String.Format and emit the complete client script with the ClientScript.RegisterStartupScript() function.
This approach works, but only if your functions are small and simple. As your JavaScript functions become more complex, this approach falls apart.

Solution 2: Another alternative is to extend the first approach with use of markers in the script and use a call to String.Replace. Essentially we'll create a client side variable containing the control's ClientID value as follows:

Protected Sub Page_Load(ByVal sender As Object, ByVal e As EventArgs)
    Dim MyLabelID As String = "var MyLabelID = ""{}"";"
    MyLabelID = MyLabelID.Replace("{}", MyLabel.ClientID)
    ClientScript.RegisterStartupScript(Me.GetType(), "ClientID", MyLabelID, True)
End Sub

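Now we can use document.getElementById(MyLabelID).innerHTML = _curtime without any problem. Note that MyLabelID is a JavaScript variable holding the generated ID string, not the element itself, so it is passed to document.getElementById() without quotes.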

So for every server-side control that you want to access from the client side, you can use approach 2: create a corresponding JavaScript variable and use it in your client script.
But what if you have 50 server-side controls in your page? Or you add 50 new server-side controls to your page and want to access either all or some of them from the client side? Maintenance becomes increasingly difficult, as you have to remember to repeat the steps in approach 2 for every server-side control added.

Can we do better? Sure we can.

Solution 3: Let us write a function that will find all the controls in the page and automatically create the corresponding client side javascript variable with the same name.

Call the above function in the Page Load event. Now if you add another server-side Label control and name it "MyLabel2", you can directly access it from JavaScript (without the quotes).
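
As a rough illustration of what such a function needs to emit, the following JavaScript sketch builds the script text from a list of controls. The names buildClientIdScript, id and clientId are hypothetical; in the page itself this logic would run on the server, walking the control tree and handing the result to ClientScript.RegisterStartupScript().

```javascript
// Builds one "var <ServerID> = "<ClientID>";" line per control.
function buildClientIdScript(controls) {
  return controls
    .map(function (c) {
      // JSON.stringify quotes and escapes the ID so the output is a valid string literal.
      return "var " + c.id + " = " + JSON.stringify(c.clientId) + ";";
    })
    .join("\n");
}

const emittedScript = buildClientIdScript([
  { id: "MyLabel", clientId: "ctl00_BodyContent_MyLabel" },
  { id: "MyLabel2", clientId: "ctl00_BodyContent_MyLabel2" }
]);
console.log(emittedScript);
```

With these variables on the page, document.getElementById(MyLabel2) resolves the element without hardcoding the generated ID.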

Wednesday, June 10, 2009

Service based, on-demand web forms with AJAX, JSON and JavaScript

If you are wondering what's with the weird title, you are right to ask. I have yet to come up with a better name for the design approach that I am going to talk about. But before that, let me start with a real-world example of a web application.

Imagine you are part of a distributed project team responsible for building a production-ready, web-based service desk application in ASP.NET. One of the many features of the application is to allow users to create new tickets. In the create new ticket page, users need to select the type of service they want from a drop-down, and based on their selection a different set of forms with validation will be displayed on the page for the user to fill in and submit. Finally, after the form is submitted, the application gathers all the information, creates an HTML-formatted email and sends it to the service owner.

How would you go about implementing this functionality?

One simple way to do it is to create a separate web form for each type of service and redirect to the appropriate page depending on the user's selection. Since the requirement is to have the forms displayed within the same page, we need to ensure that the look and feel of all the web forms is consistent. The easiest and recommended way to achieve that is to make use of Master pages and style sheets.
Another option is to create a separate DB table in line with each form's structure and store the information there, then build the form on the fly based on that structure and show it to the client.
We could also create an associative table (key, val) where the values of the rows are column names depicting the fields in the form.

Simple enough, but do you see the problem with any of the above approaches?

1. What happens when a form changes from one release to another?
2. What if we need to change any of the forms mid-release?
3. What if we want to add a new form to a type of service once the application is in production?
4. What if the team designing the forms is different from the application development team and only knows HTML, JavaScript and CSS?
5. What if we want to change the client-side validation logic?
6. What if the HTML display of the email needs to be changed?

For each of the above scenarios, every change to a form requires code changes, database table structure changes, testing cycles, a build, a deployment and a separate release just to put through a UI change. This could eventually lead to a slower time to market and be detrimental to the business. We need a way to handle all the scenarios (1-6) more effectively. Ideally we would want the team responsible for making changes to the forms to be able to log in to the application, go to a maintenance section, make the necessary changes and publish the form without having to make any "code change". So how do we achieve this goal?

The Design:

The basic design is very simple. Instead of physically creating the forms and storing them on the web server, we store the forms in the database. Then, when the client requests the form for a selected type of service, we pull the necessary information from the database (through an AJAX call) and show it to the user.

Now you may ask, ok, displaying the form to the client is one thing (simple enough) but how about (i) handling form elements actions like click of a button, selecting a drop down, initializing some part on load on the client (ii) overall form validation before being submitted, (iii) reading the values of the form elements and (iv) creating the HTML display for the email?

First, let us examine the table structure where we store the forms. Let the table name be 'FormTable' which has the following columns (left out some of the columns for sake of brevity):

[frm_svc_id] [int] NOT NULL,

[frm_input_elements] [varchar](max) NOT NULL,

[frm_display] [varchar](max) NULL,

[frm_name_map] [varchar](max) NULL,

[frm_actions] [varchar](max) NULL,

[frm_validation] [varchar](max) NULL

frm_svc_id = The id of the service for which we need the form

frm_input_elements = The input form's HTML structure and elements

frm_display = The output HTML form that needs to be displayed after the form is submitted

frm_name_map = The display names of the input form fields

frm_actions = Event handling functions

frm_validation = The form validation function before submission

Let's look at an example of what goes in [frm_input_elements] (the id attributes are what the scripts look up with document.getElementById()):

<table>
  <tr>
    <td>Date of Service</td>
    <td><input type="text" name="frmdos"/></td>
  </tr>
  <tr>
    <td>Environment</td>
    <td>
      <select id="frmselect" name="frmselect">
        <option value="DEV">DEV</option>
        <option value="PROD">PROD</option>
      </select>
    </td>
    <td><input type="text" id="frmselectedenv" name="frmselectedenv" value=""/></td>
  </tr>
  <tr>
    <td><input type="button" name="frmsubmitbutton" value="Submit"/></td>
  </tr>
</table>

Now we want to add the following functionality to the form. On selecting the value from the 'frmselect' drop-down, display the selected value in the 'frmselectedenv' textbox. To do this all we need to do is to attach a function to the "onchange" event of the 'frmselect' dropdown. The function code will be something like this:

 function dowork()
{
var selectobj = document.getElementById("frmselect");
var txtobj = document.getElementById("frmselectedenv");
txtobj.value = selectobj.options[selectobj.selectedIndex].value;
}

So how can we store this information? Enter JSON. The [frm_actions] column holds data in the following JSON structure:

[{id:'',eventtype:'',action:''},{id:'',eventtype:'',action:''},...,{EOF:true}]

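So in our example, frm_actions will be as follows (the action string is wrapped here for readability; in the stored value it must sit on a single line, since a JavaScript string literal cannot contain a raw line break):

[
  {id:'frmselect',
   eventtype:'onchange',
   action:'var selectobj = document.getElementById("frmselect");
           var txtobj = document.getElementById("frmselectedenv");
           txtobj.value = selectobj.options[selectobj.selectedIndex].value;'
  },
  {EOF:true}
]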
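
To see how a stored action becomes a live handler, the sketch below parses a strict-JSON version of the structure above and builds a function from the action text with new Function. The stub document object, and the choice to pass it in as a parameter, are assumptions made only so the example runs outside a browser; in the page the action body would use the real document.

```javascript
// Strict JSON (double-quoted keys and strings) so JSON.parse can be used instead of eval.
const storedActions = JSON.stringify([
  {
    id: "frmselect",
    eventtype: "onchange",
    action:
      'var selectobj = document.getElementById("frmselect");' +
      'var txtobj = document.getElementById("frmselectedenv");' +
      "txtobj.value = selectobj.options[selectobj.selectedIndex].value;"
  },
  { EOF: true }
]);

// Stand-ins for the two form elements (hypothetical, not a real DOM).
const stubElements = {
  frmselect: { selectedIndex: 1, options: [{ value: "DEV" }, { value: "PROD" }] },
  frmselectedenv: { value: "" }
};
const stubDocument = {
  getElementById: function (id) { return stubElements[id] || null; }
};

// Turn every real entry (skipping the EOF marker) into { id, eventtype, handler }.
function buildHandlers(json) {
  return JSON.parse(json)
    .filter(function (entry) { return !entry.EOF; })
    .map(function (entry) {
      return {
        id: entry.id,
        eventtype: entry.eventtype,
        handler: new Function("document", entry.action)
      };
    });
}

const handlers = buildHandlers(storedActions);
handlers[0].handler(stubDocument); // simulate the onchange event firing
console.log(stubElements.frmselectedenv.value); // PROD
```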

Similarly, the [frm_name_map] column, which is used to display the data after the form is submitted, holds data in the following JSON structure:

[{SystemFieldName:'',DisplayFieldName:''},...,{EOF:true}]

In our example it will be as follows:

[
  { SystemFieldName:'frmselectedenv',
    DisplayFieldName:'Selected Environment'
  },
  { SystemFieldName:'frmdos',
    DisplayFieldName:'Service Date'
  },
  {EOF:true}
]
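
A minimal sketch of how the name map might be applied when building the display, assuming the submitted values are available as an object keyed by system field name. The function name toDisplayRows and the sample values are hypothetical.

```javascript
const nameMap = [
  { SystemFieldName: "frmselectedenv", DisplayFieldName: "Selected Environment" },
  { SystemFieldName: "frmdos", DisplayFieldName: "Service Date" },
  { EOF: true }
];

// Produces one "Display Name: value" line per mapped field, in map order.
function toDisplayRows(map, values) {
  const rows = [];
  for (const entry of map) {
    if (entry.EOF) break; // end-of-list marker
    const hasValue = Object.prototype.hasOwnProperty.call(values, entry.SystemFieldName);
    rows.push(entry.DisplayFieldName + ": " + (hasValue ? values[entry.SystemFieldName] : ""));
  }
  return rows;
}

const displayRows = toDisplayRows(nameMap, { frmdos: "06/10/2009", frmselectedenv: "PROD" });
console.log(displayRows.join("\n"));
```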

We are almost done. Now all that is needed is to retrieve the JSON data for the form, display it, attach the proper event handlers, set up the validation (if available) and read the form values before submission. Enter JavaScript and jQuery.

The Implementation

The AJAX call to retrieve the form's information will return the following JSON object from the server side:

({frmelements:'',frmvalidation:'',frmdisplay:'',frmnamemap:'',frmactions:''})

where:

"frmelements" <= [frm_input_elements]
"frmvalidation" <= [frm_validation]
"frmdisplay" <= [frm_display]
"frmnamemap" <= [frm_name_map]
"frmactions" <= [frm_actions]

On the client side the JavaScript function "GetForm()" is the main engine of this design approach. It makes use of JavaScript's ability to create "anonymous functions" and attach them to events. We need to be careful when working with anonymous functions in JS, as handlers that are never detached can lead to memory leaks.

//Global definitions
var objArr = new Array();   //flat list of [element id, event type, handler] triples
var frmformdatadisplay = "";
var frmnamemap = "";
var frmFormValidation;

function GetForm(svcid)
{
    var selectedid = document.getElementById(svcid).value;
    //Detach the handlers of the previous form to prevent memory leaks
    for (var i = 0; i < objArr.length; i = i + 3) {
        if ((objArr[i] != null) && ($get(objArr[i]) != null)) {
            $get(objArr[i]).detachEvent(objArr[i + 1], objArr[i + 2]);
        }
    }
    objArr.length = 0;
    PageMethods.GetForm(selectedid, GetFormSuccess, GetFormFailed);
}

function GetFormSuccess(result)
{
    var frmobj = eval(result);
    if (typeof (frmobj) !== 'undefined')
    {
        $get("divfrm").innerHTML = frmobj.frmelements;
        var j = 0;
        //frmactions is assumed to arrive as an array in the structure shown earlier
        for (var i = 0; i < frmobj.frmactions.length; i++) {
            var actionobj = frmobj.frmactions[i];
            if (actionobj.EOF) {
                break; //end-of-list marker, nothing to attach
            }
            if (actionobj.id != "")
            {
                objArr[j] = actionobj.id;
                objArr[j + 1] = actionobj.eventtype;
                objArr[j + 2] = new Function(actionobj.action);
                //attachEvent/detachEvent are the Internet Explorer event APIs
                $get(actionobj.id).attachEvent(actionobj.eventtype, objArr[j + 2]);
                j += 3;
            }
            else if (actionobj.action != "") {
                //id = "" ==> action is not specific to any element, so execute it right away
                new Function(actionobj.action)();
            }
        }
        frmformdatadisplay = frmobj.frmdisplay;
        frmnamemap = frmobj.frmnamemap;
        frmFormValidation = new Function(frmobj.frmvalidation);
    }
}

And that's it! The only thing I left out is the parser that walks the form, extracts the form fields and their values, and stores them. It's easy enough to iterate through the form element collection and, for each type of input element, get the value, store it in a JSON object and save it in the datastore.
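
A minimal sketch of such a parser, assuming the form's elements are exposed as an array-like collection of objects with name, type, value and checked properties (as form.elements does in a browser). The function name collectFormValues and the rule of skipping buttons and unnamed elements are assumptions of this sketch.

```javascript
// Walks the element collection and returns { fieldName: value } for input fields.
function collectFormValues(elements) {
  const values = {};
  for (let i = 0; i < elements.length; i++) {
    const el = elements[i];
    if (!el.name) continue;                                     // nothing to key the value by
    if (el.type === "button" || el.type === "submit") continue; // buttons carry no data
    if ((el.type === "checkbox" || el.type === "radio") && !el.checked) continue;
    values[el.name] = el.value;
  }
  return values;
}

// Plain objects standing in for the elements of the sample form.
const sampleElements = [
  { name: "frmdos", type: "text", value: "06/10/2009" },
  { name: "frmselect", type: "select-one", value: "PROD" },
  { name: "frmselectedenv", type: "text", value: "PROD" },
  { name: "frmsubmitbutton", type: "button", value: "Submit" }
];
const collectedJson = JSON.stringify(collectFormValues(sampleElements));
console.log(collectedJson);
```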

As with everything in life, this design is not a one-solution-fits-all. It has its own drawbacks. With this approach you lose the ability to store the data in regular columns on the server and use SQL queries to generate reports and perform searches. Also, we are putting a lot of functionality and processing on the client, which could prove fatal. Moreover, you need to write more code in contrast to other standard solutions, and there will be a learning curve involved in maintaining the forms.

But if flexibility is your goal, you do not want to go through a release cycle to push in a change, the form structures will be modified often by the business and you can live without SQL queries, then this design could prove beneficial.

In the next part, I'll explain how to create a Web 2.0 rich interface using jQuery to edit and manage these forms.

Tuesday, June 9, 2009

ASP.NET MVC 1.0 - Are you ready for it?

ASP.NET MVC v1.0 is finally here! Today it is available as an add-on to ASP.NET 3.5 SP1. One thing that has been missing from the ASP.NET suite is out-of-the-box support for the Model-View-Controller (MVC) design/architecture pattern. MVC is widely regarded as a very good fit for web development. With classic ASP it was (almost) impossible to achieve the MVC pattern, while with ASP.NET Web Forms it was somewhat better (separation of View and Controller/Model) but still too tightly coupled. Now with ASP.NET MVC one can truly achieve the benefits of this design.

But the question is, are you ready for it? ASP.NET MVC introduces a new paradigm in web development. It is a disruptive technology that can radically change the way folks develop web-based applications. Right now ASP.NET MVC is not a replacement for Web Forms but rather another option. But what can you expect when you start developing with ASP.NET MVC? Are there any surprises waiting for you? Maybe ...

First, the "ViewState" and "PostBack" concepts of Web Forms no longer hold. Everything is either a GET or a POST. So "State Management" no longer comes out of the box but needs to be handled by the developer. This also means "Server Controls" are of little use, and there is no "GridView"!!! You need to implement your own 'gridview' (though not a difficult task).
What about standard functionality like "Paging", "Sorting", "In-Place Edit", "Styles"? None of these features comes out of the box; they need to be implemented too. This means knowledge of "Extension Methods", "Lambda Expressions" and "Query Syntax" is a must, which steepens the learning curve and slows the time to market.
What about AJAX? There is no UpdatePanel-style AJAX support out of the box as in Web Forms. You need to make use of JavaScript/jQuery to implement AJAX functionality. What about security? Early adopters of ASP.NET MVC have raised concerns over inherent security flaws in the MVC framework, e.g. "Delete Link through GET" and "POST Data tampering", to name a few. With regard to deployment, it supports IIS 7.0 Integrated mode by default. For older setups like IIS 7.0 Classic mode and IIS 6.0, it needs to be configured.

WOW! That's a lot of "surprises"! You might be tempted to think ASP.NET MVC is not something you want to try. But keep in mind ASP.NET MVC 1.0 is just coming out of infancy. By the next release, thanks to the hard work put in by early adopters, it should have reached an acceptable level of maturity with a lot of the standard functionality in place. In the long run, ASP.NET MVC has the potential to shorten development schedules, gear everyone towards Test-Driven Development (TDD) and ease maintenance of large and complex projects.

On the flip side, there are a few things I don't like about the ASP.NET MVC implementation because it reminds me of classic ASP!!! I'll elaborate more in my next post.

Before signing off, I must say, ASP.NET MVC provides you with all the control that you asked for. It is here to stay. But remember what Uncle Ben said to Spidey: "with great power comes great responsibility".

Monday, June 8, 2009

Batch Updates in ADO.Net 2.0 - How to find the optimum batch size?

Before I explain how to go about estimating the optimum batch size to be used with ADO.NET 2.0 batch operations, I want to briefly touch upon the basics.

One of the new features in ADO.NET 2.0 is the "Batch Updates" operation. It promises to improve the performance of the application by reducing the number of round trips to the database. Prior to ADO.NET 2.0, if we made any changes to the DataSet and then saved it using the Update method of the SqlDataAdapter class, it made a round trip to the database for each modified row in the DataSet. This was a major performance hindrance. So how does reducing database round trips improve performance?

Let's look at the following example:

Assume we are building a 3-tier application (client-webserver-database). The web servers are located in Boston while the database servers are in California (an extreme example, but a practical one nevertheless). Also, in the application, we have a datagrid (associated with an underlying datatable) that displays records, and through some specific operations (and user interaction) the records in the datatable are updated and we need to persist these changes back to the database. We are using ADO.NET, SqlDataAdapter and its Update method. Let us say 50 records were modified. Each individual Update operation takes 1 second, and each round trip to SQL Server takes 1 second. So with ADO.NET 1.1, where we need to make 50 round trips to the database server, the overall operation will take 50 * (1+1) = 100 seconds.

So how does the above scenario change with ADO.NET 2.0 ?

In ADO.NET 2.0, we now have a new "UpdateBatchSize" property which indicates the number of rows that are processed in each round-trip to the server. It can take the following values:

n=0 - There is no limit on the batch size
n=1 - Disables batch updating.
n>1 - Changes are sent using batches of 'n' operations at a time

So in our above example, let us set batch size (n) = 5. That means in each round trip to the server 5 records will be processed. So now the overall operation will take (50/5 * 1) + (50 * 1) = 60 seconds.
Let n=10 ==> (50/10 * 1) + (50 * 1) = 55 seconds
Similarly if n=50 ==> (50/50 * 1) + (50 * 1) = 51 seconds
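
The arithmetic above follows one simple cost model: every row costs one update, and every batch costs one round trip. A small sketch of that model follows; the function name and parameters are hypothetical, and n = 0 is treated as a single batch containing all rows, which is a simplification of "no limit".

```javascript
// Estimated total time = (number of round trips * round-trip time) + (rows * update time).
function estimateSeconds(rows, batchSize, updateSeconds, roundTripSeconds) {
  if (rows === 0) return 0;
  const effectiveBatch = batchSize === 0 ? rows : batchSize;
  const roundTrips = Math.ceil(rows / effectiveBatch);
  return roundTrips * roundTripSeconds + rows * updateSeconds;
}

[1, 5, 10, 50].forEach(function (n) {
  console.log("n=" + n + " -> " + estimateSeconds(50, n, 1, 1) + " seconds");
});
```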

So we see that in this simple contrived example, we easily get a performance boost of roughly 40% - 50% with Batch Update mode. So the key question is what the "optimum batch size" should be to give us the maximum performance gain. Surprisingly, this is not a simple question to answer. There is no automated way to figure this out. Moreover, there is not much information out there to help us make an informed decision. This is what MSDN says: "Executing an extremely large batch could decrease performance. Therefore, you should test for the optimum batch size setting before implementing your application."

So how to test and 'estimate' the optimum batch size for your own application? Note the word 'estimate'. That is exactly what we are going to compute. In order to do that we need to start with some raw data.

Let's say an update operation (take the example I described above) takes t' seconds to complete (including the round-trip time to the database server and the web service call along with the actual update operation). Execute the operation multiple times, each time with a different batch size 'S', and note down the time taken t' in each case. Here's the pseudo code -

maxbatchsize = M;
batchincrement = 1;
for(S=0; S <= M; S=S+batchincrement)
{
    if (S!=1) {
        t1 = starttimer();
        ExecuteBatchUpdate(S, dt); //where S=batch size, dt=datatable with 'R' rows updated
        t2 = endtimer();
        t' = t2-t1;
        LogInTextFile(S, t'); //log the batch size and the corresponding time taken t'
    }
}
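
The same sweep can be sketched in JavaScript. Here executeBatchUpdate is a hypothetical callback standing in for the real data-access call, and the timings come from Date.now(), so they are only as precise as the system clock.

```javascript
// Batch sizes 0, increment, 2*increment, ... up to maxBatchSize, without 1.
function batchSizesToTest(maxBatchSize, batchIncrement) {
  const sizes = [];
  for (let s = 0; s <= maxBatchSize; s += batchIncrement) {
    if (s !== 1) sizes.push(s);
  }
  return sizes;
}

// Runs the update once per batch size and records the elapsed milliseconds.
function runSweep(maxBatchSize, batchIncrement, executeBatchUpdate) {
  return batchSizesToTest(maxBatchSize, batchIncrement).map(function (s) {
    const start = Date.now();
    executeBatchUpdate(s);
    return { batch: s, ms: Date.now() - start };
  });
}

// No-op stand-in for the real update, just to exercise the loop.
const sweepLog = runSweep(500, 25, function (batchSize) {});
console.log(sweepLog.length + " measurements, last batch size " + sweepLog[sweepLog.length - 1].batch);
```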

Q: Why do we skip S=1?

Let's say for the first run, R=500, M=500 and batchincrement = 1. That means in our simulation we will have batch sizes from 0 to 500 (except 1), and the log file would look something like this:

Batch Time
----- ----
0 t1
2 t2
3 t3
.. ..
500 t500

For the next run R=500, M=500 and batchincrement = 10
Batch Time
----- ----
0 t1
10 t2
20 t3
.. ..
500 t51

For the next run R=500, M=500 and batchincrement = 25
Batch Time
----- ----
0 t1
25 t2
50 t3
.. ..
500 t21

The 'batchincrement' value decides the batch size for the update operation and the total number of round trips to the database server. Ideally you would want to change the value of 'batchincrement' (from low to high, i.e. from 1 to 500) to control the batch size and execute as many runs as you can. Once you have collected the raw data, the exciting part begins.

Now for each run data do the following:

a. Do a scatter plot for Batch Vs Time (x vs y)
b. Then plot the line-of-best fit i.e. do curve fitting and add a trend line.
e.g. you can use a quadratic function (a+bx+cx^2) or a linear function
c. Find the minimum value from the line-of-best fit.

So how to perform the above steps? You can use Excel to do it, or you can use the free open-source statistical analysis software R. With R, it's as easy as writing about 20 lines of R code. Here's the program (it uses a quadratic function for curve fitting for my sample data):

setwd("C:/")
Data<-read.table(file="BatchResult.txt",sep="",header=TRUE)
#par(mfrow=c(2,1))
Batch<-Data[,1]
Time<-Data[,2]
plot(Batch,Time,type="p",col="red",cex=0.5)
## Fit Quadratic Time=a+b*Batch+c*Batch^2
Batch.2<-Batch^2
fit.quadratic<-lm(Time~Batch+Batch.2)
print(summary(fit.quadratic))
coef<-fit.quadratic$coefficients
a<-coef[1]
b<-coef[2]
c<-coef[3]
print(coef)
Time.est<-a+b*Batch+c*Batch.2
lines(Batch,Time.est)
cat("B.hat at min time = ", -(b)/(2*c),"\n")
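
For readers without R at hand, the same fit can be sketched in plain JavaScript: an ordinary least-squares quadratic solved through the normal equations with Cramer's rule, followed by the same -b/(2c) vertex calculation. The sample data below is made up for illustration and is not from any real run; the vertex is a minimum only when c is positive.

```javascript
// Determinant of a 3x3 matrix by cofactor expansion along the first row.
function det3(m) {
  return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
       - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
       + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Least-squares fit of y = a + b*x + c*x^2.
function fitQuadratic(xs, ys) {
  let s1 = 0, s2 = 0, s3 = 0, s4 = 0, t0 = 0, t1 = 0, t2 = 0;
  for (let i = 0; i < xs.length; i++) {
    const x = xs[i], y = ys[i], x2 = x * x;
    s1 += x; s2 += x2; s3 += x2 * x; s4 += x2 * x2;
    t0 += y; t1 += x * y; t2 += x2 * y;
  }
  const m = [[xs.length, s1, s2], [s1, s2, s3], [s2, s3, s4]];
  const rhs = [t0, t1, t2];
  const d = det3(m);
  // Cramer's rule: replace one column at a time with the right-hand side.
  const solve = function (col) {
    const replaced = m.map(function (row, r) {
      return row.map(function (v, c) { return c === col ? rhs[r] : v; });
    });
    return det3(replaced) / d;
  };
  return { a: solve(0), b: solve(1), c: solve(2) };
}

const batch = [0, 50, 100, 150, 200];
const time = [120, 90, 80, 90, 120]; // made-up timings
const fit = fitQuadratic(batch, time);
const bestBatch = -fit.b / (2 * fit.c);
console.log("estimated batch size at minimum time:", bestBatch.toFixed(1));
```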

The output of the above program is shown below:

So say for Run1 we get the optimum batch size value B1. Repeat the steps for all the other runs and we have something like this (Table A):

Run1 B1 (B1/R)*100
Run2 B2 (B2/R)*100
Run3 B3 (B3/R)*100
.......
RunN BN (BN/R)*100

where BN/R*100 provides the estimated batch size as a percentage of the total records being modified. Sort Table A by the value BN/R*100 and you will get an estimate of the "optimum range of the batch size" to use in your application.

So now for any new functionality you introduce in your application that needs to use the ADO.NET 2.0 batch operations, you can find the estimated range of the optimum batch size which will provide the maximum performance boost (no more blind trial and error!)
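
The bookkeeping for Table A is simple enough to sketch. The function name and the sample estimates are hypothetical; the only formula used is the (B/R)*100 percentage from the table.

```javascript
// Converts per-run batch size estimates into sorted percentages of the rows modified.
function batchSizeRange(estimates, rowsModified) {
  const percents = estimates
    .map(function (b) { return (b * 100) / rowsModified; })
    .sort(function (x, y) { return x - y; });
  return { percents: percents, low: percents[0], high: percents[percents.length - 1] };
}

const estimatedRange = batchSizeRange([120, 95, 110], 500); // made-up B1..B3 with R=500
console.log(estimatedRange.low + "% to " + estimatedRange.high + "% of the modified rows");
```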

Thanks to Sourish for helping me with the above program

Select-Case vs If-Then-Else

What is the difference between the ‘Select-Case’ and ‘If-Then-Else’ constructs? Ask anyone and most likely you will get the answer that they are the same as far as functionality goes. So why use one over the other? Readability of the code is usually given as the primary reason for choosing Select-Case over If-Else. So is there any other reason besides ‘readability’? What about performance? Let us consider the following:

Both code snippets generate the same output. So to see the difference we need to dive a little deeper and look at the IL code that gets generated:

The left hand side is the IL code from If-Then-Else while on the right hand side is the IL code from Select-Case. Can you see the difference? Let me explain.
After loading the value of ‘i=15’ on to the evaluation stack (both cases), the If-Then-Else construct executes the instruction bne.un.s (i.e. branch if not equal to OR CMP and JMP in assembly language) 3 times. On the other hand the ‘Select-Case’ creates a ‘jump table’ i.e. a ‘lookup table’ using the switch instruction. The lookup table is 0-indexed. So instead of examining every single case statement separately, we can jump to the right case by simply calculating the offset into the address table.

Here’s how it works:
After the value i=15 is loaded on to the evaluation stack, the integer value 1 is pushed on the stack. Then the instruction sub is executed and the result is pushed on the stack. So in this case the value 14 is pushed to the top of the stack. Now the offset = 14, which gives the address IL_0055, and the execution pointer jumps directly to the specified location. From there it sets ‘j=6’ and prints the value of ‘j’. Similarly, if we set i=2, then the offset = 1, which gives the address IL_0050, and the execution pointer jumps directly to the specified location. From there it sets ‘j=6’ and then unconditionally jumps (br.s) to IL_0058. It then prints the value of ‘j’ and exits.

Now imagine you have 50 comparisons to make. If you use the If-Then-Else construct, you would use 50 bne.un.s (i.e. CMP and JMP) instructions to get to the last match (worst-case scenario), while with Select-Case you can jump to the last matching case with a single table lookup. So in terms of worst-case time complexity, we have replaced an O(n) algorithm with an O(1), i.e. constant-time, algorithm when using Select-Case.
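
The difference can be mimicked in JavaScript, purely as an analogy: the chain below counts its comparisons, while the table version computes an offset and indexes an array directly. The case values and results are made up for illustration, and JavaScript engines do not necessarily compile either form the way the VB compiler emits IL.

```javascript
// Sequential comparison: the worst case touches every branch.
function viaIfChain(i) {
  let comparisons = 0;
  let j = -1;
  for (let candidate = 1; candidate <= 50; candidate++) {
    comparisons++;
    if (i === candidate) { j = candidate * 10; break; }
  }
  return { j: j, comparisons: comparisons };
}

// Jump-table style: subtract the lowest case value to get a 0-based offset.
const jumpTable = Array.from({ length: 50 }, function (_, offset) {
  return (offset + 1) * 10;
});
function viaJumpTable(i) {
  const offset = i - 1;
  const inRange = Number.isInteger(offset) && offset >= 0 && offset < jumpTable.length;
  return { j: inRange ? jumpTable[offset] : -1, lookups: 1 };
}

console.log(viaIfChain(50), viaJumpTable(50));
```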

So does this mean the Select-Case construct always performs better than If-Then-Else? There’s your ‘gotcha’! It turns out that the ‘address lookup table’ solution implemented by ‘Select-Case’ can be applied to almost all real-world scenarios with only the offset calculation becoming more complex. There’s only one exception to this rule:

If the cases are completely unrelated to each other i.e. the compiler fails to find a pattern in order to create the lookup table e.g.

As we see the resulting code does the comparison by examining every case separately i.e. executing the instruction bne.un.s 3 times essentially making it same as the If-Then-Else construct and we get an O(n) algorithm. So in this situation it’s no better than If-Then-Else.

What if we are comparing non-Integers e.g. double? Let’s look at the source code and the underlying IL code:

Again we see that with the same set of cases, when we did Integer comparison, Select-Case generated the lookup table. But with Double it does it the long way, i.e. just like the If-Else construct, thereby giving us an O(n) algorithm. Now what if we are comparing strings? How does Select-Case match up to If-Then-Else? Let’s take a look at the following code:

Again both produce the same output; but what about their IL code?

Surprise! In both cases the compiler uses the instruction bne.un.s to do the individual comparison for each case. Select-Case does not generate an ‘address lookup table’ in this scenario. So we get an O(n) algorithm when doing string comparison with both the Select-Case and If-Then-Else constructs, i.e. both are the same performance-wise.
Can we do any better when doing string comparison with the Select-Case construct? As it turns out, we can do better by using Enumerations when doing string comparison with Select-Case. Let’s re-write the code as follows:

Here's the generated IL code:

As you can see, when used with an Enumeration, the compiler goes back to generating the ‘address lookup table’ for the Select-Case, thereby making it an efficient O(1) algorithm. So it raises the question: if we use an Enumeration to do string comparison, will the If-Else construct be as efficient as Select-Case? The answer is no (obviously !!!)
So next time you are about to decide which construct to use, Select-Case or If-Else, here's the comparison chart to help you with that decision:
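
The same idea, sketched in JavaScript as an analogy only: the string is mapped once to a small integer (standing in for the Enumeration value), and that integer is used as the offset into a table. The Color names and results are hypothetical.

```javascript
// Stand-in for an Enumeration: each name maps to a consecutive integer.
const Color = Object.freeze({ Red: 0, Green: 1, Blue: 2 });
const resultByColor = ["stop", "go", "wait"]; // index = enumeration value

function describe(name) {
  const known = Object.prototype.hasOwnProperty.call(Color, name);
  return known ? resultByColor[Color[name]] : "unknown";
}

console.log(describe("Green"));
```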
