Maarten Balliauw {blog}

Web development, NuGet, Microsoft Azure, PHP, ...


PHPLinq version 0.2.0 released!

Last Friday, I released PHPLinq version 0.2.0. LINQ, or Language Integrated Query, is a component inside the .NET framework which enables you to perform queries on a variety of data sources like arrays, XML, SQL Server, ... These queries are defined using a syntax which is very similar to SQL.

This latest PHP version of LINQ provides almost all language constructs the "real" LINQ provides. Since regular LINQ works on enumerators, SQL, datasets, XML, ..., I decided PHPLinq should provide the same infrastructure. Each PHPLinq query is therefore initiated by the PHPLinq_Initiator class. Each PHPLinq_ILinqProvider implementation registers itself with this initiator class, which then determines the correct provider to use. This effectively means that you can write an unlimited number of providers, each for a different data type! Currently, an implementation for PHP arrays is included.

PHPLinq class diagram 

Being able to query PHP arrays is actually very handy! Let's say you have a mixed array containing Employee objects and some other data types. Want only Employee objects? Try this one!

[code:c#]

$result = from('$employee')->in($employees)
            ->ofType('Employee')
            ->select();

[/code]

Want to know if there's any employee aged 12? Easy! The following query returns true/false:

[code:c#]

$result = from('$employee')->in($employees)->any('$employee => $employee->Age == 12');

[/code]

Let's do something a little more advanced... Let's fetch the posts from my blog's RSS feed, order them by publication date (descending), take the two most recent ones, and select an anonymous type containing title and author. Here's how:

[code:c#]

$rssFeed = simplexml_load_string(file_get_contents('http://blog.maartenballiauw.be/syndication.axd'));
$result = from('$item')->in($rssFeed->xpath('//channel/item'))
            ->orderByDescending('$item => strtotime((string)$item->pubDate)')
            ->take(2)
            ->select('new {
                            "Title" => (string)$item->title,
                            "Author" => (string)$item->author
                      }');

[/code]

Download LINQ for PHP (PHPLinq) now and get familiar with it. Since I started working on PHPLinq, I have also noticed LINQ implementations appearing for other languages.

Data Driven Testing in Visual Studio 2008 - Part 2

This is the second post in my series on Data Driven Testing in Visual Studio 2008. The first post focuses on Data Driven Testing in regular unit tests. This part will focus on the same in web testing.

Web Testing

I assume you have read my previous post and saw the cool user interface I created. Let's first add some code to that, focusing on the TextBox_TextChanged event handler that is linked to TextBox1 and TextBox2.

[code:c#]

public partial class _Default : System.Web.UI.Page
{
    // ... other code ...

    protected void TextBox_TextChanged(object sender, EventArgs e)
    {
        if (!string.IsNullOrEmpty(TextBox1.Text.Trim()) && !string.IsNullOrEmpty(TextBox2.Text.Trim()))
        {
            int a;
            int b;
            int.TryParse(TextBox1.Text.Trim(), out a);
            int.TryParse(TextBox2.Text.Trim(), out b);

            Calculator calc = new Calculator();
            TextBox3.Text = calc.Add(a, b).ToString();
        }
        else
        {
            TextBox3.Text = "";
        }
    }
}

[/code]

It is now easy to run this in a browser and play with it. You'll notice 1 + 1 equals 2; otherwise you copy-pasted the wrong code. You can now create a web test for this. Right-click the test project, "Add", "Web Test...". If everything works well, your browser is now started with a giant toolbar named "Web Test Recorder" on the left. This toolbar will record a macro of what you are doing, so let's simply navigate to the web application we created, enter some numbers and watch the calculation engine do the rest:

Web Test Recorder

You'll notice an entry on the left for each request that is being fired. When the result is shown, click "Stop" and let Visual Studio determine what happened behind the curtains of your browser. An overview of this test recording session should now be available in Visual Studio.

Data Driven Web testing

There's our web test! But it's not data driven yet... The first thing to do is linking the database we created in part 1 by clicking the "Add Data Source" button. Finish the wizard by selecting the database and the correct table. Afterwards, you can pick one of the Form Post Parameters and assign the value from our newly added data source. Do this for each step in our test: the first step should fill TextBox1, the second should fill TextBox1 and TextBox2.

Bind Form Post Parameters

In the last recorded step of our web test, add a validation rule. We want to check whether our sum is calculated correctly and shown in TextBox3. In the "Add Validation Rule" screen, pick the appropriate options; for the "Expected Value" property, enter the variable name which comes from our data source: {{DataSource1.CalculatorTestAdd.expected}}


If you now run the test, you should see success all over the place! But there's one last step to do... Visual Studio 2008 will only run this test for the first data row, not for all the other rows! To overcome this problem, select "Run Test (Pause Before Starting)" instead of just "Run Test". You'll notice the following hyperlink in the IDE interface:

Edit Run Settings

Click "Edit run Settings" and pick "One run per data source row". There you go! Multiple test runs are now validated ans should result in an almost green-bulleted screen:



Data Driven Testing in Visual Studio 2008 - Part 1

Last week, I blogged about code performance analysis in Visual Studio 2008. Since that topic provoked lots of comments (thank you Bart for associating "hotpaths" with "hotpants"), I thought about doing another post on code quality in .NET.

This post will be the first of two on Data Driven Testing. This part will focus on Data Driven Testing in regular Unit Tests. The second part will focus on the same in web testing.

Data Driven Testing?

We all know unit testing. These small tests are always based on some values, which are passed through a routine you want to test and then validated against a known result. But what if you want to run that same test a couple of times, with different data and different expected values each time?

That's where Data Driven Testing comes in handy. Visual Studio 2008 offers the possibility to use a database with parameter values and expected values as the data source for a unit test. That way, you can run a unit test, for example, for all customers in a database and make sure each customer passes the unit test.

Sounds nice! Show me how!

You are here for the magic, I know. That's why I invented this nifty web application which looks like this:

Example application

This is a simple "Calculator" which provides a user interface that accepts 2 values, then passes these to a Calculator business object that calculates the sum of these two values. Here's the Calculator object:

[code:c#] 

public class Calculator
{
    public int Add(int a, int b)
    {
        return a + b;
    }
}

[/code]

Now right-click the Add method and select "Create Unit Tests...". Visual Studio will pop up a wizard. You can simply click "OK" and have your unit test code generated:

[code:c#]

/// <summary>
///A test for Add
///</summary>
[TestMethod()]
public void AddTest()
{
    Calculator target = new Calculator(); // TODO: Initialize to an appropriate value
    int a = 0; // TODO: Initialize to an appropriate value
    int b = 0; // TODO: Initialize to an appropriate value
    int expected = 0; // TODO: Initialize to an appropriate value
    int actual;
    actual = target.Add(a, b);
    Assert.AreEqual(expected, actual);
    Assert.Inconclusive("Verify the correctness of this test method.");
}

[/code]

As you see, in a normal situation we would now fix these TODO items and have a unit test ready in no time. For this data driven test, however, let's first add a database to our project. Create columns a, b and expected. The column names do not have to match the variable names in the unit test, but it makes things clearer. Also, add some data.

Data to test

Great, but how will our unit test use these values while running? In the Test View window, simply select the test that should be bound to data and add the data source and table name properties. Next, read your data from the TestContext.DataRow property. The unit test will now look like this:

[code:c#]

/// <summary>
///A test for Add
///</summary>
[DataSource("System.Data.SqlServerCe.3.5", "data source=|DataDirectory|\\Database1.sdf", "CalculatorTestAdd", DataAccessMethod.Sequential), DeploymentItem("TestProject1\\Database1.sdf"), TestMethod()]
public void AddTest()
{
    Calculator target = new Calculator();
    int a = (int)TestContext.DataRow["a"];
    int b = (int)TestContext.DataRow["b"];
    int expected = (int)TestContext.DataRow["expected"];
    int actual;
    actual = target.Add(a, b);
    Assert.AreEqual(expected, actual);
}

[/code]

Now run this newly created test. After the test run, you will see that the test was run a couple of times, once for each data row in the database. You can also drill down further and check which values failed and which were successful. If you do not want Visual Studio to walk through the data rows sequentially, you can also use the random access method and create a truly random data driven test.
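For illustration, the only change needed is the DataAccessMethod value in the attribute shown earlier; a sketch (the rest of the test stays identical):

[code:c#]

/// <summary>
///A test for Add, now reading the data rows in random order
///</summary>
[DataSource("System.Data.SqlServerCe.3.5", "data source=|DataDirectory|\\Database1.sdf", "CalculatorTestAdd", DataAccessMethod.Random), DeploymentItem("TestProject1\\Database1.sdf"), TestMethod()]
public void AddTest()
{
    // ... same test body as above ...
}

[/code]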

Test results

Tomorrow, I'll try to do this with a web test and test our web interface. Stay tuned!


Code performance analysis in Visual Studio 2008

Visual Studio developer, did you know you have a great performance analysis (profiling) tool at your fingertips? In Visual Studio 2008 this profiling tool has been given a separate menu item to increase visibility and usage. Allow me to show what this tool can do for you in this walkthrough.

An application with a smell…

Before we can get started, we need a (simple) application with a “smell”. Create a new Windows application, drag a TextBox on the surface, and add the following code:

[code:c#]

private void Form1_Load(object sender, EventArgs e)
{
    string s = "";
    for (int i = 0; i < 1500; i++)
    {
        s = s + " test";
    }
    textBox1.Text = s;
}

[/code]

You should immediately see the smell in the above code. If you don't: we are using string.Concat() 1,500 times! This means a new string is created 1,500 times, and the old, intermediate strings have to be garbage collected again. Smells like a nice memory issue to investigate!
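If you wonder where string.Concat() enters the picture even though we never call it ourselves: the + operator on strings compiles down to exactly that call. Roughly, each loop iteration becomes the following (a sketch of what the compiler emits, not code you need to write):

[code:c#]

// Each iteration allocates a brand new string holding the old value plus " test"
s = string.Concat(s, " test");

[/code]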

Profiling

The profiling tool is hidden under the Analyze menu in Visual Studio. After launching the Performance Wizard, you will see that two options are available: sampling and instrumentation. In a real-life situation, you'll first want to sample the entire application searching for performance spikes. Afterwards, you can investigate these spikes using instrumentation. Since we only have one simple application, let's use instrumentation immediately.

Upon completing the wizard, the first thing we'll do is change some settings. Right-click the root node and select Properties. Check "Collect .NET object allocation information" and "Also collect .NET object lifetime information" to make our profiling session as complete as possible:

Profiling property pages

You can now start the performance session from the tool pane. Note that you have two options to start: Launch with profiling and Launch with profiling paused. The first will immediately start profiling; the latter will first start your application and wait for your sign to start profiling. This can be useful if you do not want to profile your application's startup but only a certain event that is triggered afterwards.

After the application has run, simply close it and wait for the summary report to appear:

Performance Report Summary 1

WOW! It seems like string.Concat() is responsible for 97% of the application's memory allocations! That's a smell indeed... But where is it coming from? In a larger application, it might not be clear which method is calling string.Concat() this many times. To discover where the problem is situated, there are two options...

Discovering the smell – option 1

Option 1 in discovering the smell is quite straightforward. Right-click the item in the summary and pick Show functions calling Concat:

Functions allocating most memory

You are now transferred to the “Caller / Callee” view, where all methods doing a string.Concat() call are shown including memory usage and allocations. In this particular case, it’s easy to see where the issue might be situated. You can now right-click the entry and pick View source to be transferred to this possible performance killer.

Possible performance killer

Discovering the smell – option 2

Visual Studio 2008 introduced a cool new way of discovering smells: hotpath tracking. When you move to the Call Tree view, you’ll notice a small flame icon in the toolbar. After clicking it, Visual Studio moves down the call tree following the high inclusive numbers. Each click takes you further down the tree and should uncover more details. Again, string.Concat() seems to be the problem!

Hotpath tracking

Fixing the smell

We are about to fix the smell. Let’s rewrite our application code using StringBuilder:

[code:c#]

private void Form1_Load(object sender, EventArgs e)
{
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 1500; i++)
    {
        sb.Append(" test");
    }
    textBox1.Text = sb.ToString();
}

[/code]

In theory, this should perform better. Let’s run our performance session again and have a look at the results:

Performance Report Summary 2

Seems like we fixed the glitch! You can now investigate further to see if there are other problems, but for this walkthrough, the application is healthy now. One extra feature though: performance report comparison ("diff"). Simply pick two performance reports, right-click, and pick Compare performance reports. This tool will show all delta values (= differences) between the two sessions we ran earlier:

Comparison report 

Update 2008-02-14: Some people commented on not finding the Analyze menu. This is only available in the Developer or Team Edition of Visual Studio. Click here for a full comparison of all versions.

Update 2008-05-29: Make sure to check my post on NDepend as well, as it offers even more insight into your code!


Indexing Word 2007 (docx) files with Zend_Search_Lucene

You may have noticed Microsoft released their Search Server 2008 a few weeks ago. Search Server delivers search capabilities to your organization quickly and easily. The PHP world currently does not have a full-featured solution like Search Server, but there's a building block which could be used to create something similar: Zend Framework's PHP port of Lucene. There is also a .NET port of Lucene available.

Lucene basically is an indexing and search technology, providing an easy-to-use API to create any type of application that has to do with indexing and searching. If you provide the right methods to extract data from any type of document, Lucene can index it. There are various indexer examples available for different file formats (PDF, HTML, RSS, ...), but none for Word 2007 (docx) files. Sounds like a challenge!

Source code

Want the full code? Download it here.

Prerequisites

Make sure you use PHP version 5.2, have php_zip and php_xml enabled, and have a working Zend Framework installation on your computer. It's also useful to keep the Zend_Search_Lucene manual pages at hand along the way.

1. Creating an index

Let's start with creating a Zend_Search_Lucene index. We will be needing the Zend Framework classes, so let's start by including them:

[code:c#] 

/** Zend_Search_Lucene */ 
require_once 'Zend/Search/Lucene.php';

[/code]

We will also be needing an index database. The following code snippet first checks for an existing database (in ./lucene_index/). If it exists, the snippet loads the index database; otherwise, a new index database is created.

[code:c#]

// Index
$index = null;

// Verify if the index exists. If not, create it.
if (is_dir('./lucene_index/') == 1) {
    $index = Zend_Search_Lucene::open('./lucene_index/'); 
} else {
    $index = Zend_Search_Lucene::create('./lucene_index/');
}

[/code]

Now, since the document root we are indexing might have different contents on every indexer run, let's first remove all documents from the existing index. Here's how:

[code:c#]

// Remove old indexed files
for ($i = 0; $i < $index->maxDoc(); $i++) {
    $index->delete($i);
}

[/code]

We'll create an index entry for the file test.docx. We will be adding some fields to the index, like the URL where the original document can be found and the text of the document (which will be tokenized and indexed, but not stored in full, as the index might otherwise grow too big too fast!).

[code:c#]

// File to index
$fileToIndex = './test.docx';

// Index file
echo 'Indexing ' . $fileToIndex . '...' . "\r\n";

// Create new indexed document
$luceneDocument = new Zend_Search_Lucene_Document();

// Store filename in index
$luceneDocument->addField(Zend_Search_Lucene_Field::Text('url', $fileToIndex));

// Store contents in index
$luceneDocument->addField(Zend_Search_Lucene_Field::UnStored('contents', DocXIndexer::readDocXContents($fileToIndex)));

// Store document properties
$documentProperties = DocXIndexer::readCoreProperties($fileToIndex);
foreach ($documentProperties as $key => $value) {
    $luceneDocument->addField(Zend_Search_Lucene_Field::UnIndexed($key, $value));
}

// Store document in index
$index->addDocument($luceneDocument);

[/code]

After creating the index, there's one thing left: optimizing it. Zend_Search_Lucene offers a nice method for doing that in one line of code: $index->optimize(); Since shutdown of the index instance is done automatically, the $index->commit(); call is not strictly necessary, but it's good to have it present so you know what happens at the end of the indexing process.

There, that's it! Our index (of one file...) is now ready! I must admit I did not explain all the magic... One piece of the magic is the DocXIndexer class whose method readDocXContents() is used to retrieve all text from a Word 2007 file. Here's how this method is built.

2. Retrieving the full text from a Word 2007 file

The readDocXContents() method mentioned earlier is the actual "magic" in this whole process. It basically reads in a Word 2007 (docx) file, loops through all paragraphs and extracts all text runs from these paragraphs into one string.

A Word 2007 (docx) file is a ZIP file (container) which contains a lot of XML files. Some of these files describe a document and some of them describe the relationships between these files. Every XML file is validated against an XSD schema. Let's define the schema names we'll need first:

[code:c#]

// Schemas
$relationshipSchema    = 'http://schemas.openxmlformats.org/package/2006/relationships';
$officeDocumentSchema     = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument';
$wordprocessingMLSchema = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

[/code]

The $relationshipSchema is the schema name that describes a relationship between the OpenXML package (the ZIP file) and the XML files ("parts") it contains. The $officeDocumentSchema is the schema name identifying the main document part of a Microsoft Office document. The $wordprocessingMLSchema is the schema containing all Word-specific elements, such as paragraphs, runs, printer settings, ... But let's continue coding. I'll put the entire code snippet here and explain every part afterwards:

[code:c#]

// Returnvalue
$returnValue = array();

// Documentholders
$relations         = null;

// Open file
$package = new ZipArchive(); // Make sure php_zip is enabled!
$package->open($fileName);

// Read relations and search for officeDocument
$relations = simplexml_load_string($package->getFromName("_rels/.rels"));
foreach ($relations->Relationship as $rel) {
    if ($rel["Type"] == $officeDocumentSchema) {
        // Found office document! Now read in contents...
        $contents = simplexml_load_string(
            $package->getFromName(dirname($rel["Target"]) . "/" . basename($rel["Target"]))
        );

        $contents->registerXPathNamespace("w", $wordprocessingMLSchema);
        $paragraphs = $contents->xpath('//w:body/w:p');

        foreach($paragraphs as $paragraph) {
            $runs = $paragraph->xpath('//w:r/w:t');
            foreach ($runs as $run) {
                $returnValue[] = (string)$run;
            }
        }

        break;
    }
}

// Close file
$package->close();

// Return
return implode(' ', $returnValue);

[/code]

The first thing that is loaded is the main ".rels" document, which contains a reference to all parts in the root of this OpenXML package. This file is parsed using SimpleXML into a local variable, $relations. Each relationship has a type ($rel["Type"]), which we compare against the $officeDocumentSchema schema name. When that schema name is found, we dig deeper into the document, parsing it into $contents. Next on our to-do list: register the $wordprocessingMLSchema for running an XPath query on the document.

[code:c#]

$contents->registerXPathNamespace("w", $wordprocessingMLSchema);

[/code]

We can now easily run the XPath query "//w:body/w:p", which retrieves all w:p children (paragraphs) of the document's body:

[code:c#]

$paragraphs = $contents->xpath('//w:body/w:p');

[/code]

The rest is quite easy. For each paragraph, we run a new XPath query "//w:r/w:t", which delivers all text nodes within the paragraph. Each of these text nodes is then added to $returnValue, which will contain all text content of the main document part upon completion.

[code:c#]

foreach($paragraphs as $paragraph) {
    $runs = $paragraph->xpath('//w:r/w:t');
    foreach ($runs as $run) {
        $returnValue[] = (string)$run;
    }
}

[/code]

3. Searching the index

Searching the index starts the same way as creating it: you first have to load the database. After loading the index database, you can easily run a query on it. Let's search for the keywords "Code Access Security":

[code:c#]

// Search query
$searchFor = 'Code Access Security';

// Search in index
echo sprintf('Searching for: %s', $searchFor) . "\r\n";
$hits = $index->find( $searchFor );

echo sprintf('Found %s result(s).', count($hits)) . "\r\n";
echo '--------------------------------------------------------' . "\r\n";

foreach ($hits as $hit) {
    echo sprintf('Score: %s', $hit->score) . "\r\n";
    echo sprintf('Title: %s', $hit->title) . "\r\n";
    echo sprintf('Creator: %s', $hit->creator) . "\r\n";
    echo sprintf('File: %s', $hit->url) . "\r\n";
    echo '--------------------------------------------------------' . "\r\n";
}

[/code]

There you go! That's all there is to it. Want the full code? Download it here: LuceneIndexingDOCX.zip (96.03 kb)

Books I recently read...

A while ago, I was contacted by the people of Packt Publishing asking me to review two of their latest books, ASP.NET Data Presentation Controls Essentials (by Joydip Kanjilal) and LINQ Quickly (by N. Satheesh Kumar). Since both books mentioned something about LINQ on the back cover, and I wanted to read more on that matter, I agreed to review them.

ASP.NET Data Presentation Controls Essentials

Being an ASP.NET developer, I'm not new to ASP.NET's data bound controls. Upon receiving the book, I immediately knew this was not going to be new stuff to me, a thought which proved right. Nevertheless, the book has value!

The author starts with a bird's-eye overview of all data bound controls in ASP.NET, allowing the reader to immediately know what's possible with these controls. In the next chapters, each control is covered with some examples. Luckily, the book does not contain "full" examples but rather short, to-the-point code snippets, delivering a practical approach to solving a specific data binding scenario.

This leads me to agree with the author that "this book is not for beginners". Some snippets contain small typos, and because a full frame around these snippets is missing, someone new to C# and ASP.NET will have some problems creating their own examples based on the book. Professionals won't learn many new things from it either, but I think anyone with a C# background wanting to start data bound ASP.NET development can benefit from this no-nonsense quickstart book.

For me, this book will find a place on my book shelf along with other reference books, as it covers all data binding scenarios from ASP.NET 1.1 over ASP.NET 2.0 up to some snippets of data binding to LINQ data sources.

LINQ Quickly

Since the release of the Microsoft .NET framework 3.5 a few months ago, I've been playing around with LINQ (Language Integrated Query) a couple of times. It is a uniform method of accessing any type of data (collections, XML, databases, ...) in a "natural" way.

The book covers all concepts of LINQ. It starts with an overview of all language elements that have been added to the .NET framework in order to enable the use of LINQ, such as anonymous types, object/collection initializers, expressions, partial methods, ... The next chapters all cover a LINQ implementation such as LINQ to Objects, LINQ to SQL, LINQ to Entities, LINQ to XML, LINQ to DataSets, ... The last chapter gives an overview of all LINQ methods and their use with a short code example that immediately clarifies the purpose of each method.

I found the LINQ to Entities chapter quite interesting, as it clearly explains how to create your class diagram using the right decorators for generating a database schema. Also useful are the appendices near the end of the book, covering LINQ to "anything" (there's an example of LINQ to Outlook Contacts).

To be honest, I'm quite enthusiastic about this book. If you are a .NET developer looking for a to-the-point "one-night read" about LINQ and all related matter, this book is your friend!

ASP.NET Session State Partitioning using State Server Load Balancing

It seems like the number of posts on ASP.NET's Session State keeps growing.

Yesterday's post on Session State Partitioning used a round-robin method for partitioning session state over different state server machines. The solution I presented actually works, but can still lead to performance bottlenecks.

Let's say you have a web farm running multiple applications, all using the same pool of state server machines. With multiple sessions in each application, one state server could end up handling many more sessions than another. For that reason, ASP.NET supports real load balancing of all session state servers.

Download example

Want an instant example? Download it here: SessionPartitioning2.zip (4.16 kb)
Want to know what's behind all this? Please, continue reading.

What we want to achieve...

Here's a scenario: We have different applications running on a web farm. These applications all share the same pool of session state servers. Whenever a session is started, we want to store it on the least-busy state server.

1. Performance counters

To fetch information on the current number of sessions a state server is storing, we'll use the performance counters the ASP.NET state server provides. Here's a code snippet:

[code:c#]

if (PerformanceCounterCategory.CounterExists("State Server Sessions Active", "ASP.NET", "STATESERVER1")) {
    PerformanceCounter pc = new PerformanceCounter("ASP.NET", "State Server Sessions Active", "", "STATESERVER1");
    float currentLoad = pc.NextValue();
}

[/code]

2. Creating a custom session id

Somehow, ASP.NET will have to know on which server a specific session is stored. To do this, let's say we make the first character of the session id the state server's index in the following IList:

[code:c#]

IList<StateServer> stateServers = new List<StateServer>();

// Id 0, example session id would be 0ywbtzez3eqxut45ukyzq3qp
stateServers.Add(new StateServer("tcpip=10.0.0.1:42424", "sessionserver1"));

// Id 1, example session id would be 1ywbtzez3eqxut45ukyzq3qp
stateServers.Add(new StateServer("tcpip=10.0.0.2:42424", "sessionserver2"));

[/code]

The next thing we'll have to do is store these list ids in the session id. For that, we will implement a custom System.Web.SessionState.SessionIDManager class. This class simply creates a regular session id, locates the least busy state server instance and assigns the session to that machine:

[code:c#]

using System;
using System.Diagnostics;


public class SessionIdManager : System.Web.SessionState.SessionIDManager
{
    public override string CreateSessionID(System.Web.HttpContext context)
    {
        // Generate a "regular" session id
        string sessionId = base.CreateSessionID(context);

        // Find the least busy state server
        StateServer leastBusyServer = null;
        float leastBusyValue = 0;
        foreach (StateServer stateServer in StateServers.List)
        {
            // Fetch first state server
            if (leastBusyServer == null) leastBusyServer = stateServer;

            // Fetch server's performance counter
            if (PerformanceCounterCategory.CounterExists("State Server Sessions Active", "ASP.NET", stateServer.ServerName))
            {
                PerformanceCounter pc = new PerformanceCounter("ASP.NET", "State Server Sessions Active", "", stateServer.ServerName);
                if (pc.NextValue() < leastBusyValue || leastBusyValue == 0)
                {
                    leastBusyServer = stateServer;
                    leastBusyValue = pc.NextValue();
                }
            }
        }

        // Modify session id to contain the server's id
        // We will change the first character in the string to be the server's id in the
        // state server list. Notice that this is only for demonstration purposes! (not secure!)
        sessionId = StateServers.List.IndexOf(leastBusyServer).ToString() + sessionId.Substring(1);

        // Return
        return sessionId;
    }
}

[/code]

The class we created will have to be registered in web.config. Here's how:

[code:xml]

<configuration>
  <system.web>
    <!-- ... -->
    <sessionState mode="StateServer"
                partitionResolverType="PartitionResolver"
                sessionIDManagerType="SessionIdManager" />
    <!-- ... -->
  </system.web>
</configuration>

[/code]

You'll notice our custom SessionIdManager class is now registered as the sessionIDManager. The PartitionResolver I blogged about earlier is also present, in a modified version.

3. Using the correct state server for a specific session id

In the previous code listing, we assigned a session to a specific server. Now for ASP.NET to read session state from the correct server, we still have to use the PartitionResolver class:

[code:c#]

using System;


public class PartitionResolver : System.Web.IPartitionResolver
{

    #region IPartitionResolver Members

    public void Initialize()
    {
        // No need for this!
    }

    public string ResolvePartition(object key)
    {
        // Accept incoming session identifier
        // which looks similar like "2ywbtzez3eqxut45ukyzq3qp"
        string sessionId = key as string;

        // Since we defined the first character in sessionId to contain the
        // state server's list id, strip it off!
        int stateServerId = int.Parse(sessionId.Substring(0, 1));

        // Return the server's connection string
        return StateServers.List[stateServerId].ConnectionString;
    }

    #endregion

}

[/code]


ASP.NET Session State Partitioning

After my previous blog post on ASP.NET Session State, someone asked me if I knew anything about ASP.NET Session State Partitioning. Since this is a little-known feature of ASP.NET, here's a little background and a short how-to.

When scaling out an ASP.NET application's session state to a dedicated session server (SQL Server or the ASP.NET state server), you might encounter a new problem: what if this dedicated session server can't cope with a large number of sessions? One option might be to create a SQL Server cluster for storing session state. A cheaper way is to implement a custom partitioning algorithm which redirects session X to state server A and session Y to state server B. In short, partitioning provides a means to divide session information across multiple session state servers, each handling its part of the total number of sessions.

Download example 

Want an instant example? Download it here: SessionPartitioning.zip (2.70 kb)
 Want to know what's behind all this? Please, continue reading.

1. Set up ASP.NET session mode

Follow all steps in my previous blog post to set up the ASP.NET state service / SQL state server database and the necessary web.config setup. We'll customise this afterwards.

2. Create your own session state partitioning class

The "magic" of this el-cheapo solution to multiple session servers will be your own session state partitioning class. Here's an example:

[code:c#]

using System;

public class PartitionResolver : System.Web.IPartitionResolver
{

    #region Private members

    private String[] partitions;

    #endregion

    #region IPartitionResolver Members

    public void Initialize()
    {
        // Create an array containing
        // all partition connection strings
        //
        // Note that this could also be an array
        // of SQL server connection strings!
        partitions = new String[] {      
            "tcpip=10.0.0.1:42424",   
            "tcpip=10.0.0.2:42424",       
            "tcpip=10.0.0.3:42424"
        };
    }

    public string ResolvePartition(object key)
    {
        // Accept incoming session identifier
        // which looks similar like "2ywbtzez3eqxut45ukyzq3qp"
        string sessionId = key as string;

        // Create your own manner to divide session id's
        // across available partitions or simply use this one!
        int partitionID = Math.Abs(sessionId.GetHashCode()) % partitions.Length;
        return partitions[partitionID];
    }

    #endregion
}

[/code]

Basically, you just have to implement the System.Web.IPartitionResolver interface, which is the contract ASP.NET uses to determine the session state server's connection string. The ResolvePartition method is called with the current session id and allows you to return the connection string that should be used for that specific session id.

3. Update your web.config

Most probably, you'll have a web.config which looks like this:

[code:xml]

<configuration>
  <system.web>
    <!-- ... -->
    <sessionState
        mode="StateServer"
        stateConnectionString="tcpip=your_server_ip:42424" />
    <!-- ... -->
  </system.web>
</configuration>

[/code]

In order for ASP.NET to use our custom class, modify web.config into:

[code:xml]

<configuration>
  <system.web>
    <!-- ... -->
    <sessionState
        mode="StateServer"
        partitionResolverType="PartitionResolver" />
    <!-- ... -->
  </system.web>
</configuration>

[/code]

You may have noticed that the stateConnectionString attribute was replaced by a partitionResolverType attribute. From now on, ASP.NET will use the class specified in the partitionResolverType attribute for distributing sessions across state servers.

UPDATE 2008-01-24: Also check out my blog post on Session State Partitioning using load balancing!


LINQ for PHP (Language Integrated Query for PHP)

Perhaps you have already heard of C# 3.5's "LINQ" component. LINQ, or Language Integrated Query, is a component inside the .NET framework which enables you to perform queries on a variety of data sources like arrays, XML, SQL server, ... These queries are defined using a syntax which is very similar to SQL.

There is a problem with LINQ though... Once you start using it, you don't want to access data sources any other way anymore. Since I'm also a PHP developer, I thought of creating a similar concept for PHP. So here's the result of a few days of coding:

PHPLinq - LINQ for PHP - Language Integrated Query

A basic example

Let's say we have an array of strings and want to select only the strings whose length is < 5. The PHPLinq way of achieving this would be the following:

[code:c#]

// Create data source
$names = array("John", "Peter", "Joe", "Patrick", "Donald", "Eric");

$result = from('$name')->in($names)
            ->where('$name => strlen($name) < 5')
            ->select('$name');

[/code]

Feels familiar to SQL? Yes indeed! No more writing a loop over this array, checking the string's length, and adding it to a temporary variable.

You may have noticed something strange... What's that $name => strlen($name) < 5 doing? This piece of code is compiled into an anonymous function, or lambda expression, under the covers. This function accepts a parameter $name and returns a boolean value based on the expression strlen($name) < 5.
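Purely as an illustration (the variable names below are made up), this is roughly what the same predicate looks like as a native lambda expression in C#, the language PHPLinq borrows its syntax from:

[code:c#]

// The C# counterpart of '$name => strlen($name) < 5':
// a function taking a string and returning a boolean
Func<string, bool> isShort = name => name.Length < 5;

bool check = isShort("Joe"); // true, "Joe" is shorter than 5 characters

[/code]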

An advanced example

There are lots of other examples available in the PHPLinq download, but here's an advanced one... Let's say we have an array of Employee objects. This array should be sorted by Employee name, then by Employee age. We want only Employees whose name is 4 characters long. Next: we do not want an Employee instance in our result. Instead, the returned array should contain objects containing an e-mail address and a domain name.

First of all, let's define our data source:

[code:c#]

class Employee {
    public $Name;
    public $Email;
    public $Age;

    public function __construct($name, $email, $age) {
        $this->Name     = $name;
        $this->Email     = $email;
        $this->Age        = $age;
    }
}

$employees = array(
    new Employee('Maarten', 'maarten@example.com', 24),
    new Employee('Paul', 'paul@example.com', 30),
    new Employee('Bill', 'bill.a@example.com', 29),
    new Employee('Bill', 'bill.g@example.com', 28),
    new Employee('Xavier', 'xavier@example.com', 40)
);

[/code]

Now for the PHPLinq query:

[code:c#]

$result = from('$employee')->in($employees)
            ->where('$employee => strlen($employee->Name) == 4')
            ->orderBy('$employee => $employee->Name')
            ->thenByDescending('$employee => $employee->Age')
            ->select('new {
                    "EmailAddress" => $employee->Email,
                    "Domain" => substr($employee->Email, strpos($employee->Email, "@") + 1)
                  }');

[/code]

Again, you may have noticed something strange... What's this new { } thing doing? Actually, this is converted to an anonymous type under the covers. new { "name" => "test" } evaluates to an object containing the property "name" with a value of "test".
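Again purely for comparison (names made up), the C# construct PHPLinq mimics here looks like this:

[code:c#]

// A C# anonymous type: the compiler generates a type with read-only
// properties Name and Age behind the scenes
var person = new { Name = "test", Age = 24 };
Console.WriteLine(person.Name); // prints "test"

[/code]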

This all sounds intuitive, interesting and very handy? Indeed! Now make sure you download a copy of PHPLinq today, try it, and provide the necessary feedback / feature requests on the CodePlex site.

Preview Word files (docx) in HTML using ASP.NET, OpenXML and LINQ to XML

Since an image (or even an example) tells more than any text will ever do, here's what I've created in the past few evening hours.


Live examples:

Want the source code? Download it here: WordVisualizer.zip (357.01 kb)

Want to know how?

If you want to know how I did this, let me first tell you why I created it. After searching Google for something similar, I found a SharePoint blogger who did the same using a SharePoint XSL transformation document called DocX2Html.xsl. Great, but that document cannot be distributed without a SharePoint license. The only option for me was to do something similar myself.

ASP.NET handlers

The main idea of this project was to be able to type in a URL ending in ".docx", which would then render a preview of the underlying Word document. Luckily, ASP.NET provides a system for creating HttpHandlers. An HttpHandler is the class which is called by the .NET runtime to process an incoming request for a specific extension. So let's trick ASP.NET into believing ".docx" is an extension which should be handled by a custom class...

Creating a custom handler

A custom handler can be created quite easily. Just create a new class, and make it implement the IHttpHandler interface:

[code:c#]

/// <summary>
/// Word document HTTP handler
/// </summary>

public class WordDocumentHandler : IHttpHandler
{
    #region IHttpHandler Members

    /// <summary>
    /// Is the handler reusable?
    /// </summary>

    public bool IsReusable
    {
        get { return true; }
    }

    /// <summary>
    /// Process request
    /// </summary>
    /// <param name="context">Current http context</param>

    public void ProcessRequest(HttpContext context)
    {
        // Todo...

        context.Response.Write("Hello world!");
    }

    #endregion
}

[/code]

Registering a custom handler

For ASP.NET to recognise our newly created handler, we must register it in Web.config.

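The original screenshot of this registration is not preserved here, so treat the following as a minimal sketch for the classic ASP.NET pipeline, assuming the handler class is named WordDocumentHandler as above (adapt the type attribute to your own namespace and assembly):

[code:xml]

<configuration>
  <system.web>
    <!-- Route every request ending in .docx to our custom handler -->
    <httpHandlers>
      <add verb="*" path="*.docx" type="WordDocumentHandler" />
    </httpHandlers>
  </system.web>
</configuration>

[/code]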

Now if you are using IIS 6, you should also register this extension to be handled by the .NET runtime.


In the application configuration, add the extension ".docx" and make it point to the following executable: C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll

This should be it. Fire up your browser, browse to your web site and type anything.docx. You should see "Hello world!" appearing in a nice, white page.

OpenXML

As you may already know, Word 2007 files are OpenXML packages containing WordprocessingML markup. A .docx file can be opened using the System.IO.Packaging.Package class (which is available after adding a project reference to WindowsBase.dll).

The Package class is created for accessing any OpenXML package. This includes all Office 2007 file formats, but also custom OpenXML formats which you can implement yourself. Unfortunately, if you want to use Package to access an Office 2007 file, you'll have to implement a lot of utility functions to get the right parts from the OpenXML container.
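To give an idea of the plumbing involved, here is a minimal sketch (not the handler's actual code, and the file name test.docx is just an example) that uses nothing but the Package class to locate the main document part, reusing the officeDocument relationship type mentioned earlier:

[code:c#]

using System;
using System.IO;
using System.IO.Packaging; // requires a project reference to WindowsBase.dll

// ...

// The relationship type that points to the main document part (word/document.xml)
string officeDocumentRelType =
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

using (Package package = Package.Open("test.docx", FileMode.Open, FileAccess.Read))
{
    foreach (PackageRelationship relationship in package.GetRelationshipsByType(officeDocumentRelType))
    {
        // Resolve the relationship target to a part URI and fetch the part
        Uri partUri = PackUriHelper.ResolvePartUri(new Uri("/", UriKind.Relative), relationship.TargetUri);
        PackagePart mainDocumentPart = package.GetPart(partUri);

        // The part's stream contains the raw WordprocessingML markup
        using (StreamReader reader = new StreamReader(mainDocumentPart.GetStream()))
        {
            Console.WriteLine(reader.ReadToEnd());
        }

        break;
    }
}

[/code]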

Luckily, Microsoft released an OpenXML SDK (CTP), which I also used in order to create this Word preview handler.

LINQ to XML

As you know, the latest .NET 3.5 release brought us something new & extremely handy: LINQ (Language Integrated Query). On Doug's blog, I read about Eric White's attempts to use LINQ to XML on OpenXML.

LINQ to OpenXML

For implementing my handler, I basically used code similar to Eric's to run queries on a Word document's contents. Here's an example which fetches all paragraphs in a Word document:

[code:c#]

using (WordprocessingDocument document = WordprocessingDocument.Open("test.docx", false))
{
    // Register namespace

    XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Element shortcuts

    XName w_r = w + "r";
    XName w_ins = w + "ins";
    XName w_hyperlink = w + "hyperlink";

    // Load document's MainDocumentPart (document.xml) in XDocument

    XDocument xDoc = XDocument.Load(
        XmlReader.Create(
            new StreamReader(document.MainDocumentPart.GetStream())
        )
    );

    // Fetch paragraphs

    var paragraphs = from l_paragraph in xDoc
                    .Root
                    .Element(w + "body")
                    .Descendants(w + "p")
         select new
         {
             TextRuns = l_paragraph.Elements().Where(z => z.Name == w_r || z.Name == w_ins || z.Name == w_hyperlink)
         };

    // Write paragraphs

    foreach (var paragraph in paragraphs)
    {
        // Fetch runs

        var runs = from l_run in paragraph.TextRuns
                   select new
                   {
                       Text = l_run.Descendants(w + "t").StringConcatenate(element => (string)element)
                   };

        // Write runs

        foreach (var run in runs)
        {
            // Use run.Text to fetch a text string

            Console.Write(run.Text);
        }
    }
}

[/code]

Now if you run this code, you will notice a compilation error... This is due to the fact that I used an extension method StringConcatenate.

Extension methods

In the above example, I used an extension method named StringConcatenate. An extension method is, as the name implies, an "extension" to a known class. The following example shows the extension for all IEnumerable<T> instances:

[code:c#]

public static class IEnumerableExtensions
{
    /// <summary>
    /// Concatenate strings
    /// </summary>
    /// <typeparam name="T">Type</typeparam>
    /// <param name="source">Source</param>
    /// <param name="func">Function delegate</param>
    /// <returns>Concatenated string</returns>

    public static string StringConcatenate<T>(this IEnumerable<T> source, Func<T, string> func)
    {
        StringBuilder sb = new StringBuilder();
        foreach (T item in source)
            sb.Append(func(item));
        return sb.ToString();
    }
}

[/code]
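As a quick illustration of how this extension method is invoked (the element names below are made up), any IEnumerable<T> now exposes StringConcatenate():

[code:c#]

// Glue the text of two XElement nodes together, just like the handler does
var elements = new[] { new XElement("t", "Hello "), new XElement("t", "world") };
string text = elements.StringConcatenate(element => (string)element);
// text now contains "Hello world"

[/code]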

Lambda expressions

Another thing you may have noticed in my example code, is a lambda expression:

[code:c#]

z => z.Name == w_r || z.Name == w_ins || z.Name == w_hyperlink

[/code]

A lambda expression is actually an anonymous method. The one shown above is passed to the Where() extension method: it accepts a single parameter z (an XElement in this case) and returns true or false depending on its Name property.
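To make that concrete, here is (conceptually, as a sketch) the anonymous method the compiler generates for the lambda above, reusing the XName shortcuts from the earlier snippet:

[code:c#]

// Equivalent anonymous method: takes an XElement and returns true
// when the element is a run, an insertion or a hyperlink
Func<XElement, bool> isTextElement = delegate(XElement z)
{
    return z.Name == w_r || z.Name == w_ins || z.Name == w_hyperlink;
};

[/code]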

Wrapping things up...

If you read this whole blog post, you may have noticed that I extensively used C# 3.5's new language features. I combined these with OpenXML and ASP.NET to create a useful Word document preview handler. If you want the full source code, download it here: WordVisualizer.zip (357.01 kb).
