58

I am writing an XML validator with XSD.

Below is what I have so far, but when the validator reaches the line while (list.Read()), it throws this error:

There is no Unicode byte order mark. Cannot switch to Unicode.

Can anybody help me fix it?

public class Validator
{
    public void Validate(string xmlString)
    {
        Boolean bRet = true;
        string xmlPath = @"C:\x.xml";
        string xsdPath = @"C:\general.xsd";

        XmlReaderSettings Settings = new XmlReaderSettings();
        Settings.Schemas.Add("", xsdPath);
        Settings.ValidationType = ValidationType.Schema;
        Settings.ValidationEventHandler += new ValidationEventHandler(SettingsValidationEventHandler);

        XmlReader list = XmlReader.Create(xmlPath, Settings);
        //StringBuilder output = new StringBuilder();
        while (list.Read()) { }
        //File.WriteAllText(@"D:\Output.xml", output.ToString());
    }

    static void SettingsValidationEventHandler(object sender, ValidationEventArgs e)
    {
        if (e.Severity == XmlSeverityType.Warning)
        {
            MessageBox.Show("WARNING: ");
            MessageBox.Show(e.Message);
        }
        else if (e.Severity == XmlSeverityType.Error)
        {
            MessageBox.Show("ERROR: ");
            MessageBox.Show(e.Message);
        }
    }
}

XML

<?xml version="1.0" encoding="utf-16"?>
<FlashList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:xsd="http://www.w3.org/2001/XMLSchema"
           vin="xxxxxxxxxxxxx">
  <flash ECUtype="xxx" />
</FlashList>

XSD

<?xml version="1.0" encoding="utf-16"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="FlashList">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="flash" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute type="xs:string" name="ECUtype" use="optional"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <xs:element name="Error" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute type="xs:byte" name="code" use="optional" />
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute type="xs:string" name="vin"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
2
  • 3
    Are you sure the "physical" file x.xml is properly encoded? Open it with a text editor such as Sublime or jEdit, to check the actual encoding. Commented Apr 28, 2015 at 10:23
  • Yes, I generated this XML file on the server side using the C# class generated from the same XSD file, and it is well formed. This code runs on the client side, and I just want to validate the received XML file against the same XSD there as well. Commented Apr 28, 2015 at 10:38

4 Answers

100

The reality of your file's encoding appears to conflict with that specified by your XML declaration. If your file actually uses one-byte characters, declaring encoding="utf-16" won't change it to use two-byte characters, for example.

Try removing the conflicting encoding from the XML declaration. Replace

<?xml version="1.0" encoding="utf-16"?> 

with

<?xml version="1.0"?> 

You may also be able to load the file into a string as a work-around using XmlDocument.LoadXml().
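To see the conflict described above in isolation, here is a minimal sketch that writes a small UTF-8 file (no byte order mark) with a given XML declaration and then tries to read it. The class name EncodingMismatchDemo, the method TryParse, and the sample XML are made up for illustration and are not from the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class EncodingMismatchDemo
{
    // Writes a UTF-8 file without a BOM that starts with the given
    // declaration, then reads it through XmlReader.
    // Returns the XmlException message, or null if reading succeeded.
    public static string TryParse(string declaration)
    {
        string path = Path.GetTempFileName();
        try
        {
            string xml = declaration +
                "<FlashList vin=\"x\"><flash ECUtype=\"y\" /></FlashList>";
            // The bytes on disk are UTF-8 regardless of what the declaration says.
            File.WriteAllText(path, xml, new UTF8Encoding(false));

            using (XmlReader reader = XmlReader.Create(path))
            {
                while (reader.Read()) { }
            }
            return null;
        }
        catch (XmlException ex)
        {
            return ex.Message;
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Calling TryParse with the utf-16 declaration should return an error message, while calling it with a declaration that omits the encoding should return null, because the reader then falls back to detecting UTF-8.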


5 Comments

FWIW: <?xml version="1.0" encoding="utf-8"?> might do the trick too.
Yes, because utf-8 is the default encoding.
After encountering a similar error, this answer helped me solve my own problem. In my case, I was first creating the XML programmatically, then reading and writing it at a later point. If you want to remove or change the encoding in the XML declaration using XmlWriter, use writer.WriteProcessingInstruction("xml", "version='1.0'"); (with writer being an instance of XmlWriter). See the MSDN doc.
The workaround "You may also be able to load the file into a string as a work-around using LoadXML()." worked for me.
But the question is: is the workaround safe to use?
4

This error is thrown when the XML declaration specifies UTF-16 but the file is not actually saved in that encoding.

You can check this with plain Windows Notepad: click Save As and look at the encoding shown at the bottom of the dialog (it is probably still UTF-8 rather than UTF-16).

Screenshot of notepad encoding setting
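If you would rather check from code than from Notepad, a rough sketch is to look at the first bytes of the file for a byte order mark. BomInspector and DescribeBom are hypothetical names, and the sketch only distinguishes the UTF-8 and UTF-16 marks (a UTF-32 LE mark would be reported as UTF-16 LE).

```csharp
using System;
using System.IO;

public static class BomInspector
{
    // Classifies the byte order mark at the start of a byte sequence.
    public static string DescribeBom(byte[] head)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return "UTF-8 BOM";
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return "UTF-16 LE BOM";
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return "UTF-16 BE BOM";
        return "no BOM";
    }

    // Reads up to three bytes from the file and classifies them.
    public static string DescribeBom(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            byte[] buffer = new byte[3];
            int read = stream.Read(buffer, 0, buffer.Length);
            byte[] head = new byte[read];
            Array.Copy(buffer, head, read);
            return DescribeBom(head);
        }
    }
}
```

A file that declares encoding="utf-16" but reports "no BOM" or "UTF-8 BOM" here is a likely candidate for the error in the question.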


3

If you are not able to change the XML declaration to

<?xml version="1.0"?> 

you can instead read the XML content as a string rather than loading it from the file path.

XmlReader.Create(new StringReader(File.ReadAllText(fileName))); 

If you use XmlDocument:

var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(File.ReadAllText(filePath));
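Applied to the schema validation from the question, the string-based approach might look like the sketch below. StringValidator and its Validate method are hypothetical names, and the sketch collects messages in a list instead of showing message boxes. Both the XML and the XSD are passed as text, so the encoding named in either declaration is not used for decoding.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class StringValidator
{
    // Validates xmlText against the no-namespace schema in xsdText and
    // returns one message per validation warning or error.
    public static List<string> Validate(string xmlText, string xsdText)
    {
        var messages = new List<string>();
        var settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;

        // Load the schema from text as well.
        using (XmlReader schemaReader = XmlReader.Create(new StringReader(xsdText)))
        {
            settings.Schemas.Add("", schemaReader);
        }

        settings.ValidationEventHandler +=
            (sender, e) => messages.Add(e.Severity + ": " + e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xmlText), settings))
        {
            while (reader.Read()) { }
        }
        return messages;
    }
}
```

With the files from the question, a call could be StringValidator.Validate(File.ReadAllText(xmlPath), File.ReadAllText(xsdPath)); an empty list would mean no validation events were raised.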

2 Comments

Do not use File.ReadAllText. Always create a StreamReader and FileStream. Never allocate file-sized chunks in memory.
@Mr.TA If it is a known, small file, like settings or whatever, File.ReadAllText is perfectly OK.
0

You can use a StreamReader to set the encoding:

return (TReport)xmlSerializer.Deserialize(
    new StreamReader(
        new FileStream(filename, FileMode.Open, FileAccess.Read),
        Encoding.UTF8));

Depending on your application, passing the XML around as a string might not be optimal; consider using a stream instead.
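The same StreamReader idea also works with XmlReader, without reading the whole file into a string first. This is only a sketch: StreamedXml and CountElements are hypothetical names, and UTF-8 is an assumption about how the file was actually saved (the third StreamReader argument lets a byte order mark override it).

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class StreamedXml
{
    // Streams through the file and counts element nodes. The StreamReader
    // decodes the bytes, so XmlReader receives characters rather than bytes.
    public static int CountElements(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var text = new StreamReader(stream, Encoding.UTF8, true))
        using (XmlReader reader = XmlReader.Create(text))
        {
            int count = 0;
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    count++;
                }
            }
            return count;
        }
    }
}
```

To validate while streaming, XmlReader.Create(text, settings) accepts the same XmlReaderSettings as in the question.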
