Archive for December, 2010


Excel Operations – Open Xml Unleashed

My today’s post is about working with the Excel documents where I mention about how you can import data from an Excel file into a DataTable and Export a DataTable to an Excel File. I’ll be using Office Open Xml to accomplish the requirement which has its own pros and cons.

Pros:

  1. When reading/generating an Excel file through other methods, you need to have installed on the server machine the office runtime dlls and is some cases the office should be installed too. And while generating the excel file, always an instance of the Excel.exe is created on the server machine, meaning that if 100 files are getting generated 100 Excel.exe instances will be opened on the server machine, which could be stressful for the server performance and also handling the closing of the instances after the generation is finished is another problem. But in case of Office Open Xml, all you need to have is to install the Open Xml SDK (currently it’s in v2.0) and nothing else. You don’t need have office on the server and no instances or what so ever are created, so the performance of the server is always in control.
  2. The Open Xml format is completely interoperable, meaning the excel document created in this format can be worked on across any platform.

Cons:

There’s only one problem with Open Xml, it major supports only the documents created in Office 2007 or later (i.e. xlsx, docx), which means that if you wish to work on documents created in the older version of the office will not work as desired. But, the documents that are created in Office 2007 or later and saved in the Office 97-2003 format will still work.

Below I am going to post my code that contains the methods for working with the Excel Documents using the Open Xml SDK v2.0 which you can use directly for generating/reading excel documents.

Generating an Excel File

	#region Generating an Excel File

	/// <summary>
	/// Converts the DataTable into the excel format and stores at the target location (also takes care of the date values)
	/// </summary>
	/// <param name="sourceTable">Source DataTable</param>
	/// <param name="targetFilePath">Target File Path</param>
	/// <param name="dateColumnsInDataTable">List of Columns with Date datatype in the DataTable</param>
	public static void ConvertDataTableToExcel(DataTable sourceTable, string targetFilePath, List<string> dateColumnsInDataTable)
	{
		if (File.Exists(targetFilePath))
			File.Delete(targetFilePath);

		SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Create(targetFilePath, SpreadsheetDocumentType.Workbook);
		
		spreadsheetDocument.AddWorkbookPart();
		spreadsheetDocument.Close();

		InsertWorksheet(targetFilePath);

		using (SpreadsheetDocument myWorkbook = SpreadsheetDocument.Open(targetFilePath, true))
		{
			var wrkSheet = myWorkbook.WorkbookPart.WorksheetParts.First().Worksheet;
			var sheetData = wrkSheet.GetFirstChild<SheetData>();

			var colValueList = new List<KeyValuePair<string, string>>();

			//Add Rows to the Sheet                                
			for (int j = 0; j < sourceTable.Columns.Count; j++)
			{
				colValueList.Add(new KeyValuePair<string, string>(FindDestinationColLetter("A", j), sourceTable.Columns[j].ColumnName));
			}

			var columnRow = CreateContentRow(1, colValueList);
			sheetData.AppendChild(columnRow);

			for (int i = 0; i < sourceTable.Rows.Count; i++)
			{
				var rowValList = new List<KeyValuePair<string, string>>();

				for (int j = 0; j < sourceTable.Columns.Count; j++)
				{
					string rowVal;
					if (dateColumnsInDataTable.Contains(sourceTable.Columns[j].ColumnName))
					{
						//Checks if the date is in date format or Julian format (that excel returns) and stores it accordingly
						var dateVal = sourceTable.Rows[i][j].ToString();
						if (IsDateFormat(dateVal))
						{
							rowVal = dateVal;
						}
						else
						{
							int no;
							rowVal = int.TryParse(dateVal, out no)
											  ? ExcelSerialDateToDMY(no)
											  : dateVal;
						}
					}
					else
						rowVal = sourceTable.Rows[i][j].ToString();

					rowValList.Add(new KeyValuePair<string, string>(FindDestinationColLetter("A", j), rowVal));
				}

				var row = CreateContentRow(i + 2, rowValList);
				sheetData.AppendChild(row);
			}
			wrkSheet.Save();
		}
	}

	/// <summary>
	/// Checks if the give string is a date
	/// </summary>
	/// <param name="dateStr">String</param>
	/// <returns></returns>
	private static bool IsDateFormat(string dateStr)
	{
		//Remove the trailing timestamp
		dateStr = dateStr.Split(' ')[0];

		DateTime date;
		if (DateTime.TryParse(dateStr, out date))
			return true;

		if (DateTime.TryParse(dateStr, DateTimeFormatInfo.InvariantInfo, DateTimeStyles.None, out date))
		{
			return true;
		}

		if (DateTime.TryParse(dateStr, DateTimeFormatInfo.InvariantInfo, DateTimeStyles.AllowWhiteSpaces, out date))
		{
			return true;
		}

		var dateArr = dateStr.Split('-');
		if (dateArr.Length == 3)
		{
			int no;
			return int.TryParse(dateArr[0], out no) && int.TryParse(dateArr[1], out no) && int.TryParse(dateArr[2], out no);
		}

		return false;
	}

	/// <summary>
	/// Converts the given Excel Julian date to the Date-Month-Year format
	/// </summary>
	/// <param name="nSerialDate">Date in the Excel Julian Date Format</param>
	/// <returns></returns>
	public static string ExcelSerialDateToDMY(int nSerialDate)
	{
		// Excel/Lotus 123 have a problem with 29-02-1900. 1900 is not a leap year, but Excel/Lotus 123 think it is...

		int nDay;
		int nMonth;
		int nYear;

		if (nSerialDate == 60)
		{
			nDay = 29;
			nMonth = 2;
			nYear = 1900;

			return nDay + "-" + nMonth + "-" + nYear;
		}

		if (nSerialDate < 60)
		{
			// Because of the 29-02-1900 problem, any serial date under 60 is one off... Compensate.
			nSerialDate++;
		}

		// Modified Julian to DMY calculation with an addition of 2415019
		var l = nSerialDate + 68569 + 2415019;
		var n = (4 * l) / 146097;
		l = l - (146097 * n + 3) / 4;
		var i = (4000 * (l + 1)) / 1461001;
		l = l - (1461 * i) / 4 + 31;
		var j = (80 * l) / 2447;
		nDay = l - (2447 * j) / 80;
		l = j / 11;
		nMonth = j + 2 - (12 * l);
		nYear = 100 * (n - 49) + i + l;

		return nDay + "-" + nMonth + "-" + nYear;
	}

	/// <summary>
	/// Inserts a worksheet inside the given excel path
	/// </summary>
	/// <param name="excelFilePath"></param>
	private static void InsertWorksheet(string excelFilePath)
	{
		// Open the document for editing.
		using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(excelFilePath, true))
		{
			// Add a blank WorksheetPart.
			spreadSheet.WorkbookPart.Workbook = new Workbook();
			var newWorksheetPart = spreadSheet.WorkbookPart.AddNewPart<WorksheetPart>();
			newWorksheetPart.Worksheet = new Worksheet(new SheetData());
			newWorksheetPart.Worksheet.Save();

			spreadSheet.WorkbookPart.Workbook.AppendChild(new Sheets());
			var sheets = spreadSheet.WorkbookPart.Workbook.GetFirstChild<Sheets>();
			string relationshipId = spreadSheet.WorkbookPart.GetIdOfPart(newWorksheetPart);

			// Get a unique ID for the new worksheet.
			uint sheetId = 1;
			if (sheets.Elements<Sheet>().Count() > 0)
			{
				sheetId = sheets.Elements<Sheet>().Select(s => s.SheetId.Value).Max() + 1;
			}

			// Give the new worksheet a name.
			string sheetName = "Sheet" + sheetId;

			// Append the new worksheet and associate it with the workbook.
			var sheet = new Sheet { Id = relationshipId, SheetId = sheetId, Name = sheetName };
			sheets.Append(sheet);
			spreadSheet.WorkbookPart.Workbook.Save();
		}
	}

	/// <summary>
	/// Inserts the give row data in the excel file
	/// </summary>
	/// <param name="rowIndex">The index at which the row has to be inserted</param>
	/// <param name="valueList">The Row content</param>
	/// <returns></returns>
	private static Row CreateContentRow(int rowIndex, IEnumerable<KeyValuePair<string, string>> valueList)
	{
		//Create new row
		var row = new Row { RowIndex = (UInt32)rowIndex };

		foreach (var keyValuePair in valueList)
		{
			row.AppendChild(CreateTextCell(keyValuePair.Key, keyValuePair.Value, rowIndex));
		}

		return row;

	}

	/// <summary>
	/// Create a text cell that can be inserted in the excel row
	/// </summary>
	/// <param name="headerCellName">Header cell name</param>
	/// <param name="cellText">Cell Text</param>
	/// <param name="cellRowIndex">Row Index</param>
	/// <returns></returns>
	private static Cell CreateTextCell(string headerCellName, string cellText, int cellRowIndex)
	{
		int intText;
		var cell = new Cell { CellReference = headerCellName + cellRowIndex };
		if (int.TryParse(cellText, out intText))
		{
			var cellValue = new CellValue { Text = cellText };
			cell.AppendChild(cellValue);
			return cell;
		}

		//Create new inline string cell
		cell.DataType = CellValues.InlineString;

		//Add text to text cell
		var inlineString = new InlineString();
		if (string.IsNullOrWhiteSpace(cellText))
		{
			cellText = string.Empty;
		}
		
		var t = new Text { Text = SecurityElement.Escape(cellText) };
		inlineString.AppendChild(t);

		cell.AppendChild(inlineString);

		return cell;
	}

	/// <summary>
	/// Finds the Column to be worked on, based on the first column letter and target Column Index
	/// </summary>
	/// <param name="startingColumnName">Header name</param>
	/// <param name="targetColumnIndex">Index of the target column starting from the header</param>
	/// <returns></returns>
	private static string FindDestinationColLetter(string startingColumnName, int targetColumnIndex)
	{
		var columnLetters = startingColumnName.ToCharArray();

		for (var i = 1; i <= targetColumnIndex; i++)
		{
			if (columnLetters[columnLetters.Length - 1] == 'Z')
				columnLetters = ChangeColumnLetter(columnLetters, columnLetters.Length - 1);
			else
				columnLetters[columnLetters.Length - 1]++;
		}
		return new string(columnLetters);
	}

	private static char[] ChangeColumnLetter(char[] columnLetters, int columnIndex)
	{
		if (columnIndex == -1)
		{
			var newCharArray = new List<char> { 'A' };
			newCharArray.AddRange(columnLetters);
			columnLetters = newCharArray.ToArray();
		}
		else if (columnLetters[columnIndex] == 'Z')
		{
			columnLetters[columnIndex] = 'A';
			columnLetters = ChangeColumnLetter(columnLetters, --columnIndex);
		}
		else
		{
			columnLetters[columnIndex]++;
		}

		return columnIndex != -2 ? columnLetters : null;
	}

	#endregion
	

Reading an Excel File


	#region Reading an Excel File

	/// <summary>
	/// Get the DataTable from the Excel at the given file path
	/// </summary>
	/// <param name="filePath">Path of the excel file</param>
	/// <returns></returns>
	public static DataTable GetDataTableFromExcel(string filePath)
	{
		FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read);

		//1. Reading from a binary Excel file ('97-2003 format; *.xls)
		//IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);            

		//2. Reading from a OpenXml Excel file (2007 format; *.xlsx)
		IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);

		//3. DataSet - The result of each spreadsheet will be created in the result.Tables
		//DataSet result = excelReader.AsDataSet();

		//4. DataSet - Create column names from first row
		excelReader.IsFirstRowAsColumnNames = true;
		DataSet result = excelReader.AsDataSet();

		//6. Free resources (IExcelDataReader is IDisposable)
		excelReader.Close();
		return result.Tables[0];
	}                

	#endregion

Using the above methods, you can easily start playing with the Excel documents using Open Xml. Go ahead and start playing! J

?
Advertisements

WCF Maximum Array Length Quota

Recently, while working with the WCF services I faced a problem which seemed to be very common but the solution to it is not clearly available. Let me explain what the problem is all about, when you try to consume your WCF service in another project like Silverlight, WPF, ASP.Net, etc; sometimes when you request the service for some data you may see the following error message:

“Error in deserializing body of reply message for operation ‘Your Method Name’. The maximum array length quota (16384) has been exceeded while reading XML data. This quota may be increased by changing the MaxArrayLength property on the XmlDictionaryReaderQuotas object used when creating the XML reader.”

This happens because of the size of response message i.e. when your service prepares the result into a byte[] as response to your request, it finds out that the “Maximum Array Length” is set to 16384 and since your response byte[] would be greater than this limit it throws an exception. In order to cure this problem, open the config file of the project where this WCF service is consumed – ServiceReferences.ClientConfig (Silverlight), app.config (WPF) or web.config(ASP.Net). Here you’ll see the binding settings of the service (please note: you may see multiple bindings, one for each service you have consumed. In this case, identify the binding of the service which is giving this exception). In the setting you’ll see something similar to below:


<binding name="CustomBinding_Testervice">
	<binaryMessageEncoding maxReadPoolSize="64" maxWritePoolSize="16" maxSessionSize="2048">
		<readerQuotas maxDepth="32" maxStringContentLength="8192" maxArrayLength="16384" 
			maxBytesPerRead="4096" maxNameTableCharCount="16384" />
	</binaryMessageEncoding>
	<httpTransport manualAddressing="false" maxBufferPoolSize="524288" 
		maxReceivedMessageSize="65536" allowCookies="false" authenticationScheme="Anonymous" 
		bypassProxyOnLocal="false" decompressionEnabled="true" 
		hostNameComparisonMode="StrongWildcard" keepAliveEnabled="true" maxBufferSize="65536"
		proxyAuthenticationScheme="Anonymous" realm="" transferMode="Buffered"
		unsafeConnectionNtlmAuthentication="false" useDefaultWebProxy="true" />
</binding>

Over here you’ll see that the value of “maxArrayLength” is set to “16384”. Just increase this length (you can change other limits too) and your problem will be cured. After correction, the binding may look like this:


<binding name="CustomBinding_TestService">
	<binaryMessageEncoding maxReadPoolSize="2147483647" maxWritePoolSize="2147483647" 
		maxSessionSize="2147483647">
		<readerQuotas maxDepth="2147483647" maxStringContentLength="2147483647" 
		maxArrayLength="2147483647" maxBytesPerRead="2147483647" maxNameTableCharCount="2147483647" />
	</binaryMessageEncoding>
	<httpTransport manualAddressing="false" maxBufferPoolSize="2147483647" 
		maxReceivedMessageSize="2147483647" allowCookies="false" authenticationScheme="Anonymous"
		bypassProxyOnLocal="false" decompressionEnabled="true" 
		hostNameComparisonMode="StrongWildcard" keepAliveEnabled="true" maxBufferSize="2147483647" 
		proxyAuthenticationScheme="Anonymous" realm="" transferMode="Buffered" 
		unsafeConnectionNtlmAuthentication="false" useDefaultWebProxy="true" />
</binding>

Please see that the max length cannot go over “2147483647” as its max integer value WCF supports. Also, it’s not required to increase the limits of other properties but it’s a good practice and also helps in preventing any future unanticipated errors.