Two good ways to work with Excel files in C#

I have long been a fan of LinqToExcel, available on Google Code here. It makes reading an Excel file a breeze. There are lots of different ways to write the code, but it can be as simple as this:

var excel = new ExcelQueryFactory("excelFileName"); 
var indianaCompanies = 
  from c in excel.Worksheet<Company>()                        
 where c.State == "IN" 
  select c;

 

Note that Worksheet<T> is called with the type Company, so the code actually matches the column names in the worksheet with the properties of the type Company!

Now, reading Excel files is all well and good, but what about writing them? Well, LinqToExcel wont do that right now, but there is another library called ExcelPackage on CodePlex here. ExcelPackage can both read and write Excel, but has a much more primitive interface than LinqToExcel, basically you have to resort to reading and writing Cell(x,y) in String format:

using (ExcelPackage xlPackage = new ExcelPackage(existingFile))
{
 // get the first worksheet in the workbook
  ExcelWorksheet worksheet = xlPackage.Workbook.Worksheets[1];
 int iCol = 2;  // the column to read
 
 // output the data in column 2
 for (int iRow = 1; iRow < 6; iRow++)
    Console.WriteLine("Cell({0},{1}).Value={2}", iRow, iCol, 
      worksheet.Cell(iRow, iCol).Value);
 
 // output the formula in row 6
  Console.WriteLine("Cell({0},{1}).Formula={2}", 6, iCol, 
    worksheet.Cell(6, iCol).Formula);
 
} // the using statement calls Dispose() which closes the package.

 

ExcelPackage doesn’t understand Column Headers or know how far the data goes down the page, nor automatically map to properties of your own classes. But, it does work on 64 bits!

Come again? It does work on 64 bits? Doesn’t any .NET code recompile to 64 bits? Hmmm…kind of. It turns out, however that LinqToExcel actually uses the good old JET driver to do the Excel file reading, which only exists in 32 bits. In order for this to work, you have to compile your project as x86. If you can live with that, fine. But if you are using this in a piece of 64 bit server code, you are out of luck. Actually, this is not limited to LinqToExcel, you will find that all kinds of handy Excel reading frameworks use this 32 bit driver!

ExcelPackage, on the other hand, is a wrapper around the .NET 3.0 System.IO.Packaging API and the new Office Open XML file format. It can only read XLSX files, not the old XLS format, so if you need to read both you are out of luck. You could use any manner of XML savvy frameworks to read these XML files, but there is a lot of work in understanding the structure of these massive files, so having a framework certainly makes it easier. BTW, ExcelPackage also has one other problem, it chokes on characters that are illegal in XPath expressions, for instance the single quote, so a string like “John’s blog” in an Excel file cannot be written! There are some tentative bug fixes on CodePlex, and you can also work around it by replacing the forbidden expressions, for instance I do this:

 private string SafeString(string p)
        {
 return p.Replace("&", "&amp;")
                .Replace("<", "&lt;")
                .Replace(">", "&gt;")
                .Replace("\"", "&quot;")
                .Replace("'", "&apos;");
        }

Another option is to use the COM automation API to work with Excel, which opens up a host of powerful features, and with the dynamic COM Interop support in C# 4 it can be pretty compelling, see this snippet from DevX:

using Excel = Microsoft.Office.Interop.Excel;
namespace COMInterop
{
 class Program
   {
 static void Main(string[] args)
      {
         var excelApplication = new Excel.Application();
         excelApplication.Visible = true;
         dynamic excelWorkBook = excelApplication.Workbooks.Add();
         Excel.Worksheet wkSheetData = excelWorkBook.ActiveSheet;
          excelWorkBook.SaveAs(
 "Testfile.xls",
            AccessMode: Excel.XlSaveAsAccessMode.xlShared);
      }
   }
}

 

The COM Interop approach on a server is however fraught with problems, including technical problems and licensing issues, and not recommended by Microsoft.

So, now you have a few options for your server side Excel needs! Wouldn’t it be awesome if someone could reengineer LinqToExcel to work with the XLSX format without the JET driver?!

comments powered by Disqus