Maarten Balliauw {blog}

ASP.NET MVC, Microsoft Azure, PHP, web development ...

NAVIGATION - SEARCH

OpenXML in Healthcare in PHP

Here's a cool present just before the weekend... 2 days ago, Wouter posted on his blog about an article he co-operated on for MSDN: OpenXML in Healthcare.

Being both a Microsoft and PHP fan (yes, you can curse me, I don't care), I thought of porting (part of) the sample code from his article into PHP. Except for the document signing, as I did not have many time to write this sample code...

The scenario for the article is quite simple: Contoso provides a central medical records database. Whenever a physician has to register a new patient, he downloads a Word 2007 document from the Contoso server, fills it out, and uploads it back. Contoso then strips out the necessary data and saves it back in their systems.

 

This Word 2007 document is crafted around embedded custom XML data, which is displayed and edited using Word 2007. In short: to do the above exercise, you just need to strip out the custom XML and you're done.

Stripping out the custom XML is also quite easy. First, locate the main relationships part in the OpenXML package. Then, search it for the main document part. From there, loop over the relationships to this document part and look for any relationship of the type "http://schemas.openxmlformats.org/officeDocument/2006/relationships/customXml". When that one's found, you just need to parse the referenced document and you're done!

Want to see a demo? Check this out.
Want the sample code? 2007HealthCareSamplePHP.zip (49.76 kb)
 Want the OpenXML background? Read the original article.

Generic arrays in PHP

Assuming everyone knows what generics are, let's get down to business right away. PHP does not support generics or something similar, though it could be very useful in PHP development.  Luckily, using some standard OO-practises, a semi-generic array can easily be created, even in multiple ways! Here's the road to PHP generics. 

The hard way...

One of the roads to PHP generics is some simple inheritance and type hinting. Let's have a look at PHP's ArrayObject. This class has 2 interesting methods, namely offsetSet() and append(). This would mean I can simply create a new class which inherits from ArrayObject, and uses type hinting to restrict some additions:

[code:c#]

// Example class
class Example {
  public $SomeProperty;
}

// Example class generic ArrayObject
class ExampleArrayObject extends ArrayObject {
  public function append(Example $value) {
    parent::append($value);
  }

  public function offsetSet($index, Example $value) {
    parent::offsetSet($index, $value);
  }
}


// Example additions
$myArray = new ExampleArrayObject();
$myArray->append( new Example() ); // Works fine
$myArray->append( "Some data..." ); // Will throw an Exception!

[/code]

The flexible way

There are some disadvantages to the above solution. For a start, you can't create a generic "string" array unless you encapsulate strings in a specific object type. Same goes for other primitive types. Let's counter this problem! Here's the same code as above using a "GenericArrayObject":

[code:c#]

// Example class
class Example {
  public $SomeProperty;
}

// Validation function
function is_class_example($value) {
  return $value instanceof Example;
}

/**
 * Class GenericArrayObject
 *
 * Contains overrides for ArrayObject methods providing generics-like functionality.
 *
 * @author    Maarten Balliauw
 */
class GenericArrayObject extends ArrayObject {
    /**
     * Validation function
     *
     * @var     string
     * @access    private
     */
    private $_validationFunction = '';
       
    /**
     * Set validation function
     *
     * @param     string    $functionName    Validation function
     * @throws     Exception
     * @access    public
     */
    public function setValidationFunction($functionName = 'is_string') {
        if ($this->_validationFunction == '') {
            $this->_validationFunction = $functionName;
            return;
        }
       
        $iterator = $this->getIterator();
        while ($iterator->valid()) {
            if (!call_user_func_array($functionName, array($iterator->current()))) {
                throw new Exception("Switching from " . $this->_validationFunction . " to " . $functionName . " is not possible for all elements.");
            }
           
            $iterator->next();
        }
       
        $this->_validationFunction = $functionName;
    }
   
    /**
     * Append
     *
     * @param     mixed    $value
     * @throws     Exception
     * @access    public
     */
    public function append($value) {
        if ($this->_validationFunction == '') {
            throw new Exception("No validation function has been set.");
        }
       
        if (call_user_func_array($this->_validationFunction, array($value))) {
            parent::append($value);
        } else {
            throw new Exception("Appended type does not meet constraint " . $this->_validationFunction);
        }
    }
   
    /**
     * offsetSet
     *
     * @param     mixed    $index
     * @param     string    $newval
     * @throws     Exception
     * @access    public
     */
    public function offsetSet($index, $newval) {
        if ($this->_validationFunction == '') {
            throw new Exception("No validation function has been set.");
        }
       
        if (call_user_func_array($this->_validationFunction, array($newval))) {
            parent::offsetSet($index, $newval);
        } else {
            throw new Exception("Appended type does not meet constraint " . $this->_validationFunction);
        }
    }
}

// Example additions
$myArray = new GenericArrayObject();
$myArray->setValidationFunction('is_class_example');
$myArray->append( new Example() ); // Works fine
$myArray->append( "Some data..." ); // Will throw an Exception!

[/code]

Using this flexible class, you can simply set a validation function on the GenericArrayObject, which enabels you to use PHP's built-in functions like is_string (string-only ArrayObject), is_int, ... You can even write a small validation function which matches a string against a regular expression and for example create an e-mail address ArrayObject rejecting any string that does not match this regular expression.

Creating Office2007 documents in C#

I've been too busy implementing SpreadsheetML in PHP, that I completely forgot to write something on OpenXML and C# on my blog. Luckily, Joris did now: Generating Office 2007 documents in C#. Now learn some PHP too, Joris, I can use a hand on PHPExcel [:P]

Here are some additional links to get you started:

PHP zip:// and parse_url...

After having a few months of problems using PHP and fopen('zip://something.xlsx#xl/worksheets/sheet1.xml', 'r'), I finally found the reason of that exact line of code giving errors on some PC's... Assuming this uses a call to parse_url under the hood, I tried parsing this, resulting in the following URL parts:

Array
(
   [scheme] => zip
   [host] => something.xlsx#xl
   [path] => /worksheets/sheet1.xml
)

That's just not correct... Parsing should return the following:

Conclusion: beware when using parse_url and the zip file wrapper!

PHPExcel 1.3.5 released

Just a quick note on the new PHPExcel 1.3.5 release. There are some cool new features included!

One of the new features is rich text: one can now write coloured and styled text in a cell. Here's an example of how the feature demo result file looks:

This is of course not all. Jakub had a couple of sleepless nights, but managed to port in the PEAR Spreadsheet classes. Meaningless? No! PHPExcel now supports Excel2007 and older versions, too. Want to write an Excel document for Excel200? No problem:

[code:c#]

$objWriter = new PHPExcel_Writer_Excel5($objPHPExcel);
$objWriter->save('excel2000file.xls');

[/code]

There's even a cooler part related to this, and that is .xlsx to .xls conversion! Here's how:

[code:c#]

$objReader = new PHPExcel_Reader_Excel2007;
$objPHPExcel = $objReader->load('excel2007file.xlsx');

$objWriter = new PHPExcel_Writer_Excel5($objPHPExcel);
$objWriter->save('excel2000file.xls');

[/code]

As always, you can get the new release on www.phpexcel.net!

OpenXML news overview

A lot of news around OpenXML these days, so I decided to bundle some things into one big blog post.

1. Microsoft released a Microsoft SDK for Open XML Formats

In .NET 3.0, there's the System.IO.Packaging API, which allows programmatic access to OpenXML packages (amongst them Office2007 files).
Since this API is quite low-level, the Microsoft people introduced a new SDK built on top of System.IO.Packaging, which allows you to use strongly typed classes for document parts. Checkout a code sample on Wouter's blog and see for yourself: this SDK provides access to an OpenXML package in a much easier way than System.IO.Packaging. Download the SDK here.

2. PackageExplorer 3.0 beta

Wouter released a new (beta) version of his PackageExplorer, and I assume he uses the new SDK mentioned above. Main new feature seems to be adding document parts using a template system, allowing you and I to create an OpenXML package using a set pre-defined templates. You can download PackageExplorer on CodePlex.

3. Altova XML Spy supports OpenXML

I saw this on Altova's website: "XMLSpy provides powerful support for accessing, editing, transforming, and querying XML data saved in Microsoft® Office 2007 documents and other zipped files."
Says enough, I think. You can download a free trial to check if this all is true.

4. Trying to compile PHPExcel using Phalanger

The last few days, I've been trying to compile PHPExcel to a .NET class library using Phalanger. Phalanger is a PHP compiler for .NET.
Compiling works quite well, but not all class definitions are compiled into a usable .NET alternative... Creating and saving a Spreadshete currently works, but adding data into cells doesn't. I guess thats a feature that can not be missing :-)
I'll keep you informed on the progress of this. If anyone feels interested in porting this PHP library to C#, please contact me!

5. OpenXML for JAVA

JAVA people now also have an OpenXML library: OpenXML4J. Not production-stable yet, but alpha versions are available.

New PHPExcel release: 1.3.0

The new version of PHPExcel has just been released, bringing 1.3.0 to the public. New features include formula calculation, inserting and removing columns/rows, auto-sizing columns, freezing panes, ...

One of the new features in PHPExcel is formula calculation. Just like Excel or any other spreadsheet application, PHPExcel now provides support for calculating certain cell values, using a formula. For example, the formula "=SUM(A1:A10)" evaluates to the sum of values in A1, A2, ..., A10.

Have a look at this: if you write the following line of code in the invoice demo included with PHPExcel, it evaluates to the value "64":

$objPHPExcel->getActiveSheet()->getCell('E11')->getCalculatedValue();
 

Another nice feature of PHPExcel's formula parser, is that it can automatically adjust a formula when inserting/removing rows/columns. Here's an example:

You see that the formula contained in cell E11 is "SUM(E4:E9)". Now, when I write the following line of code, two new product lines are added:

$objPHPExcel->getActiveSheet()->insertNewRowBefore(7, 2);
 

Did you notice? The formula in the former cell E11 (now E13, as I inserted 2 new rows), changed to "SUM(E4:E11)". Also, the inserted cells duplicate style information of the previous cell, just like Excel's behaviour.

Curious about all this? Want to play with it? Find the source code download and demo code on www.phpexcel.net!

Excel Formula Parsing using PHP?

One of the new (planned) features of PHPExcel is to implement parsing and calculating Excel formulas. One thing every developer should do is not to try to reinvent the wheel. Therefore, a Google search learned me someone wrote a Excel expression parser in JavaScript, which parses an expression into a tree.

Parsing Excel formulas (expressions) in JavaScript is done here. Someone ported this to C# too, and as of today, it is ported to PHP5 too 8-).

The only thing left to do is building this into PHPExcel, and performing calculations using the parsed tree...

Excel, OpenXML and PHP

Yay! My new article on Excel, OpenXML and PHP has just been released in php|architect! A copy-paste action from http://www.phparch.com/issue.php?mid=102:

"A few months ago, Microsoft released Office 2007, a version of their office suite that generates open source documents. Here, Maarten Balliauw gives an overview of the history and the politics surrounding the release before moving on to introduce his PHPExcel project, an early adoption of Microsoft's OpenXML API that enables Excel 2007 spreadsheets to be generated with PHP."

Workaround for PHP file_exists on ZIP file contents

Recently, I was writing some PHP code, to check if a specific file existed in a ZIP file. PHP has this special feature called "stream wrappers", which basically is a system which enables PHP to do I/O operations on streams.

A stream can be a file, a socket, a SSH connection, ... Each of these streams has its own wrapper, which serves as an adapter between PHP and the underlying resource. This enables PHP to do, for example, a file_get_contents() on all sorts of streams.

Assuming regular PHP file functions would be sufficient, I coded the following:

[code:c#]

if (file_exists('zip://some_file.zip#readme.txt')) { ... }

[/code]

Knowing that the file readme.txt existed in the ZIP file, I was surprised that this function always returned false... What was even more surprising, is that a file_get_contents('zip://some_file.zip#readme.txt') returned readme.txt's data!

The reason for this is unknown to me, but I've written a (dirty) workaround that you can use:

[code:c#]

function zipped_file_exists($pFileName = '') {
    if ($pFileName != '') {
        $fh = fopen($pFileName, 'r');
        $exists = ($fh !== false);
        fclose($fh);

        return $exists;
    }

    return false;
}

[/code]