Tag Archive: Office Open XML


Open Xml on Azure

I was once deploying a project (that involves Open Xml operations) on Azure. The project would run fine locally on my machine and but the code just wouldn’t run. After lots of scratching, I discovered the 2 problematic situations faced most widely that leads to this problems. The root cause here is that Azure doesn’t support the OpenXml operation directly since the required dlls are not available on Azure. The solution to the problem is changing the configuration of your project references, so whenever you get stuck in this problem, just check out which of the following situations apply to you and make the necessary changes to make your code work.

Scenario 1: The excel operations are performed in your main web project (i.e. the project being deployed on the cloud):

  1. Expand the project’s “References” folder
  2. Select the “DocumentFormat.OpenXml” and “WindowsBase” dlls
  3. Right click on them and then select “Properties”
  4. Set “Copy Local = True
Scenario 2: The Open Xml operations are being done inside a dll and then this custom dll is being referenced in your web project. If this is the case, do the following:
  1. Add References to “DocumentFormat.OpenXml” and “WindowsBase” dll in your main web project
  2. In your web project, expand the “References” folder
  3. Select your custom dll and the “DocumentFormat.OpenXml” and “WindowsBase” dlls
  4. Right click on them and then select “Properties”
  5. Set “Copy Local = True
Rebuild your project and deploy again, and this time it’ll work!
Advertisements

Excel Operations – Open Xml Unleashed

My today’s post is about working with the Excel documents where I mention about how you can import data from an Excel file into a DataTable and Export a DataTable to an Excel File. I’ll be using Office Open Xml to accomplish the requirement which has its own pros and cons.

Pros:

  1. When reading/generating an Excel file through other methods, you need to have installed on the server machine the office runtime dlls and is some cases the office should be installed too. And while generating the excel file, always an instance of the Excel.exe is created on the server machine, meaning that if 100 files are getting generated 100 Excel.exe instances will be opened on the server machine, which could be stressful for the server performance and also handling the closing of the instances after the generation is finished is another problem. But in case of Office Open Xml, all you need to have is to install the Open Xml SDK (currently it’s in v2.0) and nothing else. You don’t need have office on the server and no instances or what so ever are created, so the performance of the server is always in control.
  2. The Open Xml format is completely interoperable, meaning the excel document created in this format can be worked on across any platform.

Cons:

There’s only one problem with Open Xml, it major supports only the documents created in Office 2007 or later (i.e. xlsx, docx), which means that if you wish to work on documents created in the older version of the office will not work as desired. But, the documents that are created in Office 2007 or later and saved in the Office 97-2003 format will still work.

Below I am going to post my code that contains the methods for working with the Excel Documents using the Open Xml SDK v2.0 which you can use directly for generating/reading excel documents.

Generating an Excel File

	#region Generating an Excel File

	/// <summary>
	/// Converts the DataTable into the excel format and stores at the target location (also takes care of the date values)
	/// </summary>
	/// <param name="sourceTable">Source DataTable</param>
	/// <param name="targetFilePath">Target File Path</param>
	/// <param name="dateColumnsInDataTable">List of Columns with Date datatype in the DataTable</param>
	public static void ConvertDataTableToExcel(DataTable sourceTable, string targetFilePath, List<string> dateColumnsInDataTable)
	{
		if (File.Exists(targetFilePath))
			File.Delete(targetFilePath);

		SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Create(targetFilePath, SpreadsheetDocumentType.Workbook);
		
		spreadsheetDocument.AddWorkbookPart();
		spreadsheetDocument.Close();

		InsertWorksheet(targetFilePath);

		using (SpreadsheetDocument myWorkbook = SpreadsheetDocument.Open(targetFilePath, true))
		{
			var wrkSheet = myWorkbook.WorkbookPart.WorksheetParts.First().Worksheet;
			var sheetData = wrkSheet.GetFirstChild<SheetData>();

			var colValueList = new List<KeyValuePair<string, string>>();

			//Add Rows to the Sheet                                
			for (int j = 0; j < sourceTable.Columns.Count; j++)
			{
				colValueList.Add(new KeyValuePair<string, string>(FindDestinationColLetter("A", j), sourceTable.Columns[j].ColumnName));
			}

			var columnRow = CreateContentRow(1, colValueList);
			sheetData.AppendChild(columnRow);

			for (int i = 0; i < sourceTable.Rows.Count; i++)
			{
				var rowValList = new List<KeyValuePair<string, string>>();

				for (int j = 0; j < sourceTable.Columns.Count; j++)
				{
					string rowVal;
					if (dateColumnsInDataTable.Contains(sourceTable.Columns[j].ColumnName))
					{
						//Checks if the date is in date format or Julian format (that excel returns) and stores it accordingly
						var dateVal = sourceTable.Rows[i][j].ToString();
						if (IsDateFormat(dateVal))
						{
							rowVal = dateVal;
						}
						else
						{
							int no;
							rowVal = int.TryParse(dateVal, out no)
											  ? ExcelSerialDateToDMY(no)
											  : dateVal;
						}
					}
					else
						rowVal = sourceTable.Rows[i][j].ToString();

					rowValList.Add(new KeyValuePair<string, string>(FindDestinationColLetter("A", j), rowVal));
				}

				var row = CreateContentRow(i + 2, rowValList);
				sheetData.AppendChild(row);
			}
			wrkSheet.Save();
		}
	}

	/// <summary>
	/// Checks if the give string is a date
	/// </summary>
	/// <param name="dateStr">String</param>
	/// <returns></returns>
	private static bool IsDateFormat(string dateStr)
	{
		//Remove the trailing timestamp
		dateStr = dateStr.Split(' ')[0];

		DateTime date;
		if (DateTime.TryParse(dateStr, out date))
			return true;

		if (DateTime.TryParse(dateStr, DateTimeFormatInfo.InvariantInfo, DateTimeStyles.None, out date))
		{
			return true;
		}

		if (DateTime.TryParse(dateStr, DateTimeFormatInfo.InvariantInfo, DateTimeStyles.AllowWhiteSpaces, out date))
		{
			return true;
		}

		var dateArr = dateStr.Split('-');
		if (dateArr.Length == 3)
		{
			int no;
			return int.TryParse(dateArr[0], out no) && int.TryParse(dateArr[1], out no) && int.TryParse(dateArr[2], out no);
		}

		return false;
	}

	/// <summary>
	/// Converts the given Excel Julian date to the Date-Month-Year format
	/// </summary>
	/// <param name="nSerialDate">Date in the Excel Julian Date Format</param>
	/// <returns></returns>
	public static string ExcelSerialDateToDMY(int nSerialDate)
	{
		// Excel/Lotus 123 have a problem with 29-02-1900. 1900 is not a leap year, but Excel/Lotus 123 think it is...

		int nDay;
		int nMonth;
		int nYear;

		if (nSerialDate == 60)
		{
			nDay = 29;
			nMonth = 2;
			nYear = 1900;

			return nDay + "-" + nMonth + "-" + nYear;
		}

		if (nSerialDate < 60)
		{
			// Because of the 29-02-1900 problem, any serial date under 60 is one off... Compensate.
			nSerialDate++;
		}

		// Modified Julian to DMY calculation with an addition of 2415019
		var l = nSerialDate + 68569 + 2415019;
		var n = (4 * l) / 146097;
		l = l - (146097 * n + 3) / 4;
		var i = (4000 * (l + 1)) / 1461001;
		l = l - (1461 * i) / 4 + 31;
		var j = (80 * l) / 2447;
		nDay = l - (2447 * j) / 80;
		l = j / 11;
		nMonth = j + 2 - (12 * l);
		nYear = 100 * (n - 49) + i + l;

		return nDay + "-" + nMonth + "-" + nYear;
	}

	/// <summary>
	/// Inserts a worksheet inside the given excel path
	/// </summary>
	/// <param name="excelFilePath"></param>
	private static void InsertWorksheet(string excelFilePath)
	{
		// Open the document for editing.
		using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(excelFilePath, true))
		{
			// Add a blank WorksheetPart.
			spreadSheet.WorkbookPart.Workbook = new Workbook();
			var newWorksheetPart = spreadSheet.WorkbookPart.AddNewPart<WorksheetPart>();
			newWorksheetPart.Worksheet = new Worksheet(new SheetData());
			newWorksheetPart.Worksheet.Save();

			spreadSheet.WorkbookPart.Workbook.AppendChild(new Sheets());
			var sheets = spreadSheet.WorkbookPart.Workbook.GetFirstChild<Sheets>();
			string relationshipId = spreadSheet.WorkbookPart.GetIdOfPart(newWorksheetPart);

			// Get a unique ID for the new worksheet.
			uint sheetId = 1;
			if (sheets.Elements<Sheet>().Count() > 0)
			{
				sheetId = sheets.Elements<Sheet>().Select(s => s.SheetId.Value).Max() + 1;
			}

			// Give the new worksheet a name.
			string sheetName = "Sheet" + sheetId;

			// Append the new worksheet and associate it with the workbook.
			var sheet = new Sheet { Id = relationshipId, SheetId = sheetId, Name = sheetName };
			sheets.Append(sheet);
			spreadSheet.WorkbookPart.Workbook.Save();
		}
	}

	/// <summary>
	/// Inserts the give row data in the excel file
	/// </summary>
	/// <param name="rowIndex">The index at which the row has to be inserted</param>
	/// <param name="valueList">The Row content</param>
	/// <returns></returns>
	private static Row CreateContentRow(int rowIndex, IEnumerable<KeyValuePair<string, string>> valueList)
	{
		//Create new row
		var row = new Row { RowIndex = (UInt32)rowIndex };

		foreach (var keyValuePair in valueList)
		{
			row.AppendChild(CreateTextCell(keyValuePair.Key, keyValuePair.Value, rowIndex));
		}

		return row;

	}

	/// <summary>
	/// Create a text cell that can be inserted in the excel row
	/// </summary>
	/// <param name="headerCellName">Header cell name</param>
	/// <param name="cellText">Cell Text</param>
	/// <param name="cellRowIndex">Row Index</param>
	/// <returns></returns>
	private static Cell CreateTextCell(string headerCellName, string cellText, int cellRowIndex)
	{
		int intText;
		var cell = new Cell { CellReference = headerCellName + cellRowIndex };
		if (int.TryParse(cellText, out intText))
		{
			var cellValue = new CellValue { Text = cellText };
			cell.AppendChild(cellValue);
			return cell;
		}

		//Create new inline string cell
		cell.DataType = CellValues.InlineString;

		//Add text to text cell
		var inlineString = new InlineString();
		if (string.IsNullOrWhiteSpace(cellText))
		{
			cellText = string.Empty;
		}
		
		var t = new Text { Text = SecurityElement.Escape(cellText) };
		inlineString.AppendChild(t);

		cell.AppendChild(inlineString);

		return cell;
	}

	/// <summary>
	/// Finds the Column to be worked on, based on the first column letter and target Column Index
	/// </summary>
	/// <param name="startingColumnName">Header name</param>
	/// <param name="targetColumnIndex">Index of the target column starting from the header</param>
	/// <returns></returns>
	private static string FindDestinationColLetter(string startingColumnName, int targetColumnIndex)
	{
		var columnLetters = startingColumnName.ToCharArray();

		for (var i = 1; i <= targetColumnIndex; i++)
		{
			if (columnLetters[columnLetters.Length - 1] == 'Z')
				columnLetters = ChangeColumnLetter(columnLetters, columnLetters.Length - 1);
			else
				columnLetters[columnLetters.Length - 1]++;
		}
		return new string(columnLetters);
	}

	private static char[] ChangeColumnLetter(char[] columnLetters, int columnIndex)
	{
		if (columnIndex == -1)
		{
			var newCharArray = new List<char> { 'A' };
			newCharArray.AddRange(columnLetters);
			columnLetters = newCharArray.ToArray();
		}
		else if (columnLetters[columnIndex] == 'Z')
		{
			columnLetters[columnIndex] = 'A';
			columnLetters = ChangeColumnLetter(columnLetters, --columnIndex);
		}
		else
		{
			columnLetters[columnIndex]++;
		}

		return columnIndex != -2 ? columnLetters : null;
	}

	#endregion
	

Reading an Excel File


	#region Reading an Excel File

	/// <summary>
	/// Get the DataTable from the Excel at the given file path
	/// </summary>
	/// <param name="filePath">Path of the excel file</param>
	/// <returns></returns>
	public static DataTable GetDataTableFromExcel(string filePath)
	{
		FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read);

		//1. Reading from a binary Excel file ('97-2003 format; *.xls)
		//IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);            

		//2. Reading from a OpenXml Excel file (2007 format; *.xlsx)
		IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);

		//3. DataSet - The result of each spreadsheet will be created in the result.Tables
		//DataSet result = excelReader.AsDataSet();

		//4. DataSet - Create column names from first row
		excelReader.IsFirstRowAsColumnNames = true;
		DataSet result = excelReader.AsDataSet();

		//6. Free resources (IExcelDataReader is IDisposable)
		excelReader.Close();
		return result.Tables[0];
	}                

	#endregion

Using the above methods, you can easily start playing with the Excel documents using Open Xml. Go ahead and start playing! J

?
Image representing Microsoft as depicted in Cr...

Image via CrunchBase

FINALLY MY WORK GETS RECOGNIZED FOR SOMETHING IT VERY MUCH DESERVED!

April 03, 2008 10:33 IST

After nearly 14 months of discussion, debate and controversies surrounding them, the International Organisation for Standardisation (ISO) has finally declared that Microsoft’s Office Open XML (OOXML) file format has received the necessary number of votes for approval as an ISO/IEC international standard. Approval required at least two-thirds (i.e. 66.66 per cent) of the votes cast by national bodies participating in the joint technical committee (ISO/IEC JTC 1), to be positive; and no more than one-fourth (i.e. 25 per cent) of the total number of (ISO/IEC) national body votes cast to be negative. ISO/IEC DIS 29500 (the technical name for the draft proposal) was originally disapproved in the “fast-track vote” which ended in September 2007, when 3,500 comments were received. India too had said ‘no’ with comments.

However, the comments were addressed at a ballot resolution meeting (BRM) from 25-29 February, after which the national bodies had 30 days to modify their votes if they wished. India maintained its ‘No’. By eliminating redundancies, the comments had been reduced to 1,000 individual issues to be considered. However, there will reportedly be a two-month period to allow national bodies to lodge any formal appeals before the standard proceeds to official publication. There are already protests over votes cast by Norway and Germany.

There has, perhaps, never been a more intense global industry debate over standards since OOXML is backed by Apple, Novell, and closer home by Wipro, Infosys, TCS, and Nasscom. The rival Open
Document Format or ODF is supported by IBM, Sun Microsystems, Red Hat, Google, and in India, by the Department of Information Technology (DIT), National Informatics Centre (NIC), CDAC, IIT-Bombay and IIM-Ahmedabad.

ISO approval means government business for Microsoft since governments worldwide, including India, prefer standards that are ratified from bodies such as the ISO. Governments are wary of holding digital data in proprietary formats, which could make them hostage (vendor lock-ins) to a software vendor. States such as Delhi, Kerala and others from the North-East are heavy adopters of ODF file formats which are open and free (excluding maintenance and support).

“With 86 per cent of voting national bodies supporting ratification, there is overwhelming support for Open XML. This outcome is a clear win for the customers, technology providers and governments that want to choose the format that best meets their needs and have a voice in the evolution of this widely adopted standard,” said Tom Robertson, general manager of Interoperability and Standards at Microsoft, in a press statement.

Venkatesh Hariharan, co-founder, Open Source Foundation of India, said: “Standards in a crucial area like document formats impact the lives of all computer users daily. Therefore, the manner in which OOXML has been pushed through ISO to support the monopolistic aims of a single corporation is a matter of serious concern. The European Union is investigating the numerous irregularities on the
voting around OOXML and we will support their investigations.”

Meanwhile, the debate around OOXML and ODF appears to be a proxy for product competition in the marketplace, note observers. It is significant in part because it will have a major influence on the future success of Microsoft Office – one of Microsoft’s largest and most profitable product families. Non-governmental and legacy Microsoft Office users, on the other hand, are most likely not to bother
about which file formats their office applications use, given that Microsoft Office still has an over 90 per cent market share in many countries. In developing nations also the software major aggressively competes by subsidising its offerings for governments and students.

 

Source: http://www.rediff.com/money/2008/apr/03ms.htm

Closeup of a stone sign bearing the Microsoft ...

Image via Wikipedia

Hi All

A bit of good news for Microsoft and its fan and the developers in favor of Office Open XML to become an ISO Standard. Let me brief the history.

Microsoft gave out the Open XML standard as its weapon in the war of Open Source Vs Open Standards. More information about Open XML is available here. The overview of the Office Open XML describes the complete specification. And the developers can consult the Microsoft sponsored OpenXMLdeveloper.org!!

Open XML is already a standard under the  European Computer Manufacturers Association (ECMA), article no. 376. Here’s more information about it!!

And here’s the current news about it:

 

Amid lobbying, Microsoft is seen winning an international open standard vote
By Kevin J. O’Brien

Monday, September 3, 2007

 

BERLIN:
Amid intense lobbying, Microsoft is expected to squeak out a victory this week to have its open document format, Office Open XML, recognized as an international standard, people tracking the vote said Monday.

Supporters and opponents of the move, which would help Microsoft maintain its competitive advantage in the burgeoning field of open-format documents, said they believed that the company, the world’s largest software maker, would receive a global standard for Office Open XML.

“After what basically has amounted to unprecedented lobbying, I think that Microsoft’s standard is going to get the necessary amount of support,” said Pieter Hintjens, president of Foundation for a Free Information Infrastructure, a Brussels group that led the opposition. The underlying code of an “open” document is public, allowing
developers to improve and derive new products without having to pay royalties. The first open format to become an international standard, in May 2006, was OpenDocument Format, developed by a group led by International Business Machines.

Microsoft sought a similar status for Office Open XML so it could also sell software with “open” characteristics increasingly demanded by national and local governments in Belgium, France, Germany, the Netherlands and Brazil, as well as Massachusetts in the United States. Up to 104 countries of two global standards bodies, the International Organization for Standardization, known as ISO, and the International Electrotechnical Commission, or IEC, both based in Geneva, have since April been casting national votes on whether to
designate Office Open XML as a global standard.

The issue has split the groups, with some members asserting that the ISO and IEC should not be endorsing the commercial product of a single company. Others say a standards designation would reflect reality, since more than 90 percent of electronic documents are on Microsoft format. Electronic voting closed Sunday. Roger Frost, an ISO spokesman in
Geneva, said his organization was tallying the votes and expected to announce the results on Tuesday or Wednesday.

The European Computer Manufacturers Association, known as ECMA, a standards group based in Geneva, endorsed Office Open XML as a European standard last December and nominated it for fast-track consideration by the ISO. “It is really impossible to say what the outcome is going to be,” said Jan van den Beld, a former secretary general of the manufacturers’ group. “Basically, Office Open XML was a form of catch-up standardization,” said Van den Beld, who was ECMA’s secretary general from 1992 until last April. “Politics aside, there are 400 million users of the Office Open format, and we basically just recognized reality.”

Erich Andersen, Microsoft’s associate general counsel in Europe, said, “We are encouraged by the positive support the Open XML standard has received, and over the next phase of the process we will work closely with ECMA and the committee to address the many constructive comments that have been submitted.” Hintjens, a Brussels software developer whose group ran the opposition Web site called noooxml.org, said Monday that he believed Microsoft had managed to curry just enough support to win passage for its standard. According to Hintjens, whose group has been tallying the votes of participants, countries including Japan, Canada, India, China, Brazil, France and Britain voted against Microsoft’s proposal. France and Britain made their “no” votes conditional, meaning they could later change them to “yes” should Microsoft alter its 6,500-page standard to allay technical and liability concerns.

Microsoft’s bid was supported, Hintjens said, by countries including Switzerland, the United States, Portugal and Germany, as well as smaller ones like Trinidad and Tobago, Kenya and Ivory Coast, some of whom became active late in the voting at Microsoft’s urging. To win passage, Microsoft’s standard must gain support from at least two-thirds of 37 countries on an information technology panel of the ISO and IEC called the Joint Technical Committee 1. Also, Microsoft’s standard cannot be opposed by more than 25 percent of all countries
casting ballots. Van den Beld said 80 of 104 eligible countries have cast ballots, more than any previous standards vote.

Under ISO and IEC rules, a so-called ballot resolution meeting will be held Feb. 25 to 29 in Geneva, where Microsoft and national voting groups will attempt to iron out differences, leading to a broader consensus behind the standard. If Microsoft prevails in voting this week, it is under less pressure to make changes it opposes. Should it lose, however, it would be under more pressure to make changes, van den Beld said.


Copyright © 2007 The International Herald Tribune | www.iht.com

After all this war, I think soon Office Open XML will be an ISO Standard.

What’s your comment on it??