Monday, August 4, 2008

Office 2007 Format


Microsoft has brought a revolutionary change in office Application through Office 2007 Launch. The biggest change, for me is its XML Format. Now, Office document is not a mere document anymore. The new office documents behave like a Container too, that means, you can store any data in an office document. For example, you can store any bussiness information in an excel file which will not be visible to users at all.

I will take an example of a simple excel file, say myExcel.xlsx. If you add a zip extension(convert the name to and open it with compressed zip folder, you can see that the excel file is a container(a folder) actually. Because, the data in office 2007 is stored in a container in the form of different files. You will see a file named "[Content_Types].xml" and along with few folders(normally three folders). These three folders are the containers where in all the data of the file store in the form of different XML Files. Whatever you will store in the excel file, it will appear in the XML file, one way or another.

One very good example of how the document is structured or used, is here:

If I include a chart in a sheet, it will create few more xml files. For ex, \xl\charts\chart1.xml. This XML will contain all the data for chart and the formatting of the chart too. It will contain, that how many series does the chart have, what is the range of the chart, what type of chart is it etc.

Apart of the above features, you can also store your own XML Data in office document. The Office 2007 XML file format accommodates custom business data, and allows for programmatic access to it when files are loaded. So the data in Custom XML will not be directly visible to the user, but you can load it on the runtime and use it programmatically. A very good link from Brian Jones is

You can add the CustomXML while using the CustomXMLParts Object (given in the Office Object Model). Here is the link of MSDN

Dim cParts As CustomXMLParts
Set cParts = ThisWorkbook.CustomXMLParts

Create an xmlFile and save it on desktop. It can contain any information related to your bussiness requirements:

< ?xml version="1.0" ? >
< BhandariInfotech >
< ChartSheet Id="fChart" >Chart Collection< /ChartSheet >
< DataSheet Id="fChart" >Data Collection< /DataSheet >
< ChartSheet Id="sChart" >Charts< /ChartSheet >
< DataSheet Id="sChart" >Datas< /DataSheet >
< /BhandariInfotech >

cParts.Add "C:\A.xml"


The above data will include the above XML Tags in an XML File and store it in the excel file itself. This data will not be visible to user, but you can use it programmatically to access the data. You can also add your critical Business information, which you have to use in the document, but you don't want the users to see that information. These information can be your revenue figures, your Balance Sheet info...etc.

The XML Format is very modular and clean too. Every single thing in the Office Documents, is stored in different XML Files. Be it your Chart Data, be it your sheet data, be it your Name ranges...everything is stored in different XML files. These all files can be accessed programmatically to extend the functionalities of office applications. The idea of keeping all the files separate is to make sure that a user should be able to access a specific type of data, without touching the other files in the container. Earlier, suppose if my one sheet is corrupted, then the other sheets in the workbook will not be effected because the data is kept in separate XML files for different sheets. If I have 200+ sheets in my workbook, and I want to access only one sheet frequently, then opening a workbook with 200+ sheets can be a cumbersum task which will involve long time in opening and closing the workbook again and again. The other option is, access the Office Document Programmatically and copy only that sheet which you require in a new workbook and open that new workbook.