Friday, May 25, 2007

Validating an XML File using XSD in .NET 2.0

Given below is the .NET 1.1 code for validating an XML file using an XSD file.

.NET 1.1 Code

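//.NET 1.1 code
/// <summary>
/// Method to validate an XML file
/// </summary>
/// <param name="XmlData">This method expects the input XML as a string</param>
/// <param name="SchemaPath">Path to the schema file</param>
/// <returns>true if the XML is valid, else false</returns>
private bool ValidateXmlUsingXsd(string XmlData, string SchemaPath)
{
// Reset the class-level flag before each run
isValid = true;
XmlValidatingReader v = new XmlValidatingReader(XmlData, XmlNodeType.Document, null);
v.ValidationType = ValidationType.Schema;
// Load the schema; without this line SchemaPath is never used
v.Schemas.Add(null, SchemaPath);
v.ValidationEventHandler +=
new ValidationEventHandler(MyValidationEventHandler);

while (v.Read())
{
// Can add code here to process the content.
}
v.Close();

return isValid;
}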
/// <summary>
/// This event handler is called only when a validation error occurs
/// </summary>
/// <param name="sender">Object that raised the event</param>
/// <param name="args">Details of the validation event</param>
public static void MyValidationEventHandler(object sender,
ValidationEventArgs args)
{
// isValid and errorMessage must be declared as static class-level fields
isValid = false;
errorMessage = "Validation event\n" + args.Message;
}
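/// <summary>
/// Method to get XML in a string from an XML file
/// </summary>
/// <param name="fileName">Path of the XML file</param>
/// <returns>Content of the file as a string</returns>
private string GetStringFromXML(string fileName)
{
// using closes the reader even if ReadToEnd throws
using (StreamReader rd = new StreamReader(fileName))
{
return rd.ReadToEnd();
}
}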




Calling the method :
bool valid = ValidateXmlUsingXsd(str, txtXSD.Text);


There are some changes in the .NET 2.0 code for XML validation.
XmlValidatingReader is marked as obsolete. Use XmlReader.Create() with an XmlReaderSettings object instead.

There are some behavioral changes between validation using the XmlReaderSettings and XmlSchemaSet classes and validation using the XmlValidatingReader class.

The XmlReaderSettings and XmlSchemaSet classes do not support XML-Data Reduced (XDR) schema validation.

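The most important difference I found is that, to validate XML data against a schema, the XmlSchemaValidationFlags.ReportValidationWarnings flag must be enabled in settings.ValidationFlags. Otherwise, schema problems that the reader raises as warnings rather than errors (for example, an element for which no schema information is found) are never reported to the handler.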

//.NET 2.0 code
private bool ValidateXmlUsingXsd2(string XmlData, string SchemaPath)
{
// Reset the class-level flag before each run
isValid = true;
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add(null, SchemaPath);
// OR the flag in so that the default flags stay enabled
settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
settings.ValidationEventHandler += new ValidationEventHandler(ValidationCallBack);

StringReader xmlStream = new StringReader(XmlData);
using (XmlReader reader = XmlReader.Create(xmlStream, settings))
{
while (reader.Read()) { }
}

return isValid;
}

private static void ValidationCallBack(object sender, ValidationEventArgs e)
{
// Called for both errors and warnings; e.Severity tells them apart
isValid = false;
errorMessage = "Validation Error: " + e.Message;
}


More about the XmlSchemaValidationFlags enumeration

Member name: Description
AllowXmlAttributes: Allow xml:* attributes even if they are not defined in the schema. The attributes will be validated based on their data type.
None: Do not process identity constraints, inline schemas, schema location hints, or report schema validation warnings.
ProcessIdentityConstraints: Process identity constraints (xs:ID, xs:IDREF, xs:key, xs:keyref, xs:unique) encountered during validation.
ProcessInlineSchema: Process inline schemas encountered during validation.
ProcessSchemaLocation: Process schema location hints (xsi:schemaLocation, xsi:noNamespaceSchemaLocation) encountered during validation.
ReportValidationWarnings: Report schema validation warnings encountered during validation.


In a nutshell: always enable XmlSchemaValidationFlags.ReportValidationWarnings in settings.ValidationFlags when schema validation is required for XML in .NET 2.0.
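To try the .NET 2.0 approach without any files on disk, here is a minimal sketch that validates an XML string against an XSD string and collects every validation event with its severity. The class name XsdValidationSketch, the Validate helper and the tiny "year" schema are made up for illustration; only the XmlReaderSettings calls come from the post.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class XsdValidationSketch
{
    // Hypothetical helper: validates xml against xsd (both passed as strings)
    // and returns every validation event as "Severity: message".
    public static List<string> Validate(string xml, string xsd)
    {
        List<string> events = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        // null target namespace: use the targetNamespace declared in the schema itself
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        // OR the flag in so that the default flags stay enabled
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            events.Add(e.Severity + ": " + e.Message);
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return events;
    }

    public static void Main()
    {
        // Made-up schema: a single <year> element holding an integer
        string xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
            "<xs:element name='year' type='xs:int'/>" +
            "</xs:schema>";

        // A valid document raises no events
        Console.WriteLine(Validate("<year>2007</year>", xsd).Count);

        // A non-integer value raises at least one Error event
        foreach (string message in Validate("<year>two thousand</year>", xsd))
            Console.WriteLine(message);
    }
}
```

Returning the events instead of setting static fields keeps the helper safe to call from more than one place, and the severity prefix makes it easy to decide whether warnings alone should count as a failure.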

Saturday, May 19, 2007

Truncation of fields when CSV file is read using ADO.NET

I encountered a major problem with the ADO.NET CSV reader in one of my projects, where CSV file import played a major role.
When a field containing a hyphen ("-") is read from the CSV file, the characters before the hyphen are discarded. For example, the model F-150 is read as -150, and T-Bird in the model field is not read at all. Some values were also missing from other fields.
Given below is the code I used:

string strConnString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + folderName + ";Extended Properties=\"text;HDR=Yes; FMT=Delimited\"";
string sqlSelect = "select * from [" + fileName + "]";
System.Data.OleDb.OleDbConnection conn = new System.Data.OleDb.OleDbConnection(
strConnString.Trim());
conn.Open();
System.Data.OleDb.OleDbDataAdapter adapter = new System.Data.OleDb.OleDbDataAdapter(sqlSelect, conn);
DataSet ds = new DataSet();
adapter.Fill(ds, "Inventory");
return ds.Tables[0];


Samir found a solution: add a schema file that identifies all fields in the CSV file as string values, and call the DataSet.ReadXmlSchema() method to attach the schema to the DataSet. The schema constraints are not enforced (EnforceConstraints = false), and IMEX=1 is added to the connection string.

The modified code looks like this:


string strConnString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + folderName + ";Extended Properties=\"text;HDR=Yes; FMT=Delimited; IMEX=1\"";
string sqlSelect = "select * from [" + fileName + "]";
System.Data.OleDb.OleDbConnection conn = new System.Data.OleDb.OleDbConnection(strConnString.Trim());
conn.Open();
System.Data.OleDb.OleDbDataAdapter adapter = new System.Data.OleDb.OleDbDataAdapter(sqlSelect, conn);
DataSet ds = new DataSet();

ds.ReadXmlSchema(Server.MapPath("xmlschema.xsd"));
ds.EnforceConstraints = false;


adapter.Fill(ds, "Inventory");
return ds.Tables[0];


Even after using this code, values like "F-150" were read as -150 if there was only one row in the CSV file.

F-100 reads as -100
K-100 reads as -100
S-100 reads as -100

100-F reads as -100
100-K reads as -100
100-S reads as -100
100$F reads as 100
100$K reads as 100
100$S reads as 100
F100 reads as 100
K100 reads as 100
S100 reads as 100
F\100 reads as 100
K\100 reads as 100
S\100 reads as 100
F.100 reads as .1
K.100 reads as .1
S.100 reads as .1
K-.\\$$$.\\1 reads as -0.1
K\\-22..$$\\21 reads as -22.221
-$\.FSK1 reads as -0.1

This happens only when there is just one row in the CSV file, or when more than half the values in a column have the patterns listed above.

Maybe this is because ADO.NET does some internal calculation to treat F as floating point or something. The fun part is that f-150 is read correctly; the problem is with capital letters :)
So we ended up using a custom third-party CSV reader.
Moral of the story: never use the ADO.NET CSV reader. Always go for a custom CSV parser or a third-party library.
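For anyone wondering what "a custom CSV parser" might look like, here is a minimal sketch that splits one line into fields and keeps every value as a string, so nothing gets coerced into a number. This is not the third-party reader we used; the class name CsvSketch and the sample rows are made up, and the sketch does not handle quoted fields that span several lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvSketch
{
    // Splits one CSV line into fields. Handles double-quoted fields and ""
    // escapes, but not quoted fields that span several lines.
    public static List<string> ParseLine(string line)
    {
        List<string> fields = new List<string>();
        StringBuilder current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // "" inside quotes is an escaped quote
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString());
                current.Length = 0;      // start the next field
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString());  // last field on the line
        return fields;
    }

    public static void Main()
    {
        // Made-up sample rows
        string[] lines = { "Model,Year", "F-150,2007", "\"T-Bird, classic\",1957" };
        foreach (string line in lines)
            Console.WriteLine(string.Join(" | ", ParseLine(line).ToArray()));
    }
}
```

Because every field stays a string, F-150 comes back as F-150; any conversion to numbers or dates is left to the calling code, where it can be done per column and on purpose.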

Monday, May 14, 2007

Method to sort an array of strings in descending order of number of words in each array element

Just adding a method which I wrote for an application whose requirement was later scrapped.
Hope someone can make use of this silly method. ;)

/// <summary>
/// Method to sort an array of strings in descending order of number of words in each array element
/// </summary>
/// <param name="strArray">Array to be sorted (it is sorted in place)</param>
/// <returns>Array sorted in descending order of number of words in each array element</returns>
private static string[] SortArrayWithDescendingWordCount(string[] strArray)
{
//Array to store the number of words in each string of the array to be sorted
int[] wordLengths = new int[strArray.Length];
//Variable to keep track of the array index of the wordLengths array.
int arrayIndex = 0;
foreach (string str in strArray)
{
//Split the string into an array of words and store the word count in the wordLengths array.
wordLengths[arrayIndex] = str.Split(' ').Length;
arrayIndex++;
}
//Sort both arrays (ascending order), taking the wordLengths array as keys and strArray as values
Array.Sort(wordLengths, strArray);
//Now reverse strArray to get descending order of number of words in each array element
Array.Reverse(strArray);
return strArray;
}
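A variation on the same idea, as a rough sketch: sort a copy with a Comparison delegate so the caller's array is left untouched, and count words with StringSplitOptions.RemoveEmptyEntries so that repeated spaces do not inflate the count. The class and method names here are made up for the example.

```csharp
using System;

public static class WordCountSortSketch
{
    private static readonly char[] Separators = new char[] { ' ' };

    // Counts words, ignoring the empty entries produced by repeated spaces
    public static int CountWords(string s)
    {
        return s.Split(Separators, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    // Returns a sorted copy; the caller's array is left untouched
    public static string[] SortByWordCountDescending(string[] source)
    {
        string[] result = (string[])source.Clone();
        Array.Sort<string>(result, delegate(string a, string b)
        {
            // Comparing b to a (instead of a to b) gives descending order
            return CountWords(b).CompareTo(CountWords(a));
        });
        return result;
    }

    public static void Main()
    {
        string[] input = { "one", "one two three", "one  two" };
        foreach (string s in SortByWordCountDescending(input))
            Console.WriteLine(s);
    }
}
```

Array.Sort is not a stable sort, so strings with the same word count may come out in any order relative to each other, in this version as well as in the original.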

Monday, March 05, 2007

ViewState issue in creating a Word Doc from Rendered ASPX

Early last week, I was involved in a pretty interesting assignment: creating a Word document with some dynamically created data on the click of a button from an ASPX page. I created a template ASPX page for the Word document. In the page's Load event, I took all the values from the session and loaded them into labels. From the button click event I called Server.Execute("Page.aspx", StringWriter). The StringWriter content was converted to a byte array and written to the HTTP stream using Response.BinaryWrite(byte[]).

Everything went well and the Word document opened. But there was an uninvited guest in it: a text box at the top of the document with some junk values in it. After some investigation, I understood that it was the ViewState variable, which is created as a hidden input field inside the rendered HTML. I turned off ViewState with Page.EnableViewState = false and tested again, but the text box still appeared in the Word document, this time with no values in it. Even if you turn off ViewState, the ViewState field is still created in the rendered HTML, just with no value in it.

Workarounds: there are 2 workarounds for this issue (as suggested by 2 Avanade community members).
1. Use a user control with the doc template created in it. Load the values in the Load event of the control. In the page's Load event, render the user control using UserControl.RenderControl(). This worked out perfectly.
2. Either use Server.Execute("Page.aspx", StringWriter) from another page or this.Render(HtmlTextWriter) from the current page to get the rendered HTML of the Word doc page in a StringWriter. Convert it to a string, find the ViewState declaration inside the string and remove it. We chose this method as we had some problems creating a user control in the complex architecture.

I wrote the following method to remove ViewState from the rendered HTML:

//Method to remove the __VIEWSTATE hidden field from the rendered HTML
private string RemoveInputFieldsFromRenderedHTML(string pageData)
{
//pageData contains the rendered HTML from the StringWriter.
//Opening of the hidden __VIEWSTATE field as ASP.NET renders it
string viewStateStartTag = "<input type=\"hidden\" name=\"__VIEWSTATE\"";
string viewStateEndTag = "\" />";

int startIndex = pageData.IndexOf(viewStateStartTag);
//No ViewState field found: return the HTML unchanged
if (startIndex < 0)
return pageData;

int endIndex = pageData.IndexOf(viewStateEndTag, startIndex);
if (endIndex < 0)
return pageData;

//Remove everything from the start tag up to and including the closing tag
return pageData.Remove(startIndex, (endIndex + viewStateEndTag.Length) - startIndex);
}
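As an alternative to the index arithmetic, a regular expression can do the same job and does not depend on the exact attribute order or spacing. This is only a sketch: the class name ViewStateStripSketch and the sample HTML are made up, and it assumes the field value is Base64 (which never contains a '>' character).

```csharp
using System;
using System.Text.RegularExpressions;

public static class ViewStateStripSketch
{
    // Matches a whole <input ... name="__VIEWSTATE" ... > element.
    // [^>]* is safe here on the assumption that the value is Base64,
    // which never contains '>'.
    private static readonly Regex ViewStateField = new Regex(
        "<input[^>]*name=\"__VIEWSTATE\"[^>]*>",
        RegexOptions.IgnoreCase);

    public static string StripViewState(string html)
    {
        return ViewStateField.Replace(html, string.Empty);
    }

    public static void Main()
    {
        // Made-up rendered fragment
        string html = "<form><input type=\"hidden\" name=\"__VIEWSTATE\" " +
                      "value=\"dDwtMTIzNDU2Nzg5Ozs+\" /><p>Report</p></form>";
        Console.WriteLine(StripViewState(html)); // prints <form><p>Report</p></form>
    }
}
```

If the page contains no ViewState field, Regex.Replace simply returns the input unchanged, so no separate "not found" check is needed.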


Hope this piece of code will be useful to somebody someday....

Saturday, February 24, 2007

I was hosting an intranet website on Windows Server 2003 with IIS 6. I noticed that the server was taking too long to respond when the site was accessed. Then, in Task Manager, I saw that w3wp.exe (the worker process for ASP.NET) was using more than 500 MB of memory. As a result, the OS was using too much virtual memory (the hard disk busy light was always on). I went to the Microsoft website and saw that there is a hotfix for this problem, and that a registry tweak must be made after applying it. But I couldn't find the hotfix patch to download.

I restarted the machine and everything worked fine. Then I wrote a script to restart the server every Sunday night. It was the only feasible solution at that point.

I wrote a post about this issue on www.dotnetjunkies.com and today I got a good reply to it. The reply is as follows.


Re: w3wp.exe taking too much memory
Posted: 02-23-2007 11:46 PM
That w3wp.exe also has been creating havoc on our servers us so we wrote a simple application to control it. Details and download here:
http://www.linkhelpers.net/w3wp_fix.asp Works for us.

The details from the link are given below:

w3wp.exe Fix
Somehow, somewhere, someone released some untested code and hundreds of servers are crippled. I scanned through the forums an all the fixes were red herrings. So I decided to look into it myself. Looping like a loon it just eats up CPU resources like a voracious pig. The solution I came up with was to set w3wp.exe's priority to Idle. This seemed to help but the application would eventually crash and restart with the priority defaulting to Normal. Well baby sitting an executable is not realistic.

HeartBeat
HeartBeat.exe is a stand alone application that will pulse every couple of seconds and reset w3wp.exe's priority to Idle. This appears to lessen the impact but is not a permanent fix. This fix has not been tested with servers running ISAPI filters which w3wp.exe is supposed to address. There are no guarantees or do we accept any liability for the use of this fix.

Access Databases
For some reason I still have not figured out, an *.ldb file causes IIS worker error dialogs. Once this file is deleted, the error dialogs stop. Randy Delos Reyes was instrumental in figuring most of this out.

Download HeartBeat
http://www.linkhelpers.net/public/HeartBeat.zip
Remember, test this fix first before using it. We are not liable for any unforseen problems you may have. I do know that it works great for our servers.

--------------------------------------------

Try your luck . :) .