I have XML in a string that I need to transform to HTML using an XSL stylesheet.

I do the transform with XslCompiledTransform. For this to work, I parse the string that contains the XML into an XPathDocument.

However, if I pass the string straight to the XPathDocument constructor, I get the error:

Illegal Characters in path.

So I had to add a StringReader in order to parse the string into the XPathDocument. (Using the solutions in the posts I linked below.)

Here is my step by step procedure:

  1. The string is retrieved from SDL Trados Studio. Depending on the XML being worked on (how it was originally created and loaded for translation), the string sometimes has a BOM and sometimes does not. The 'xml' is actually assembled from the segments of the source and target text and the structure elements: the textual elements are escaped for XML, and the markup and text are joined into one string. (My separate post on the removal of the BOM is "C# XPathDocument parsing string to XML with BOM".)

  2. The string is then parsed into an XPathDocument using a StringReader.

  3. The transform is done with the XslCompiledTransform, using a StringBuilder and a StringWriter.

  4. The transformed XML (now HTML) is saved to a file.

Here is my code:

    // Recreate the XML using an extractor that returns a string array
    string strSourceXML = String.Join("", extractor.TextSrc);

    // Strip the BOM
    strSourceXML = strSourceXML.Substring(strSourceXML.IndexOf("<?"));

    // First attempt: throws "Illegal characters in path", because this
    // constructor overload treats the string as a file path/URI, not as XML
    // var xSourceDoc = new XPathDocument(strSourceXML);

    // Load the XSL
    var xTr = new XslCompiledTransform();
    var xslt = Settings.GetValue("WordPreview", "XSLTpath", "");
    xTr.Load(xslt);

    // Parse the XML string
    XPathDocument xSourceDoc;
    using (StringReader s = new StringReader(strSourceXML))
    {
        xSourceDoc = new XPathDocument(s);
    }

    // Transform the XML
    StringBuilder sb1 = new StringBuilder();
    StringWriter swSource = new StringWriter(sb1);
    xTr.Transform(xSourceDoc, null, swSource);

    // Save the transformed file to disk
    string tmpSourceDoc = Path.GetTempFileName();
    System.IO.StreamWriter writer1 = new System.IO.StreamWriter(tmpSourceDoc, false, Encoding.Unicode);
    writer1.Write(sb1.ToString());
    writer1.Close();

My question is: Is there a simpler way to do this? Is there a way to transform the string directly with the XSLT? Or, if not, is there a direct way to parse a string into the XPathDocument?

I have searched over many posts on Stack Overflow such as these:

But none of them gives me a simpler way to do this. Any suggestion is welcome. Thanks.

  • Why don't you just use something like this: XmlDocumentInstance.LoadXml(yourString); Read more here: msdn.microsoft.com/en-us/library/… Commented May 15, 2016 at 1:34
  • @DimitreNovatchev What code lines would I need to replace then? Can you make this comment into an answer with some code lines? Commented May 15, 2016 at 4:47
  • Just a look at the provided msdn link will help you -- there is a code example. LoadXml() produces the wanted XmlDocument instance from your string. Then you can transform that XmlDocument in the usual way. Commented May 15, 2016 at 5:09
  • I did, of course! But it did not tell me about the transform, which is why I asked. I used the XPathDocument because that's the type that worked with the transform. But I may have made some other mistake. So you say that the XslCompiledTransform takes the XmlDocument as well? Commented May 15, 2016 at 5:13
  • Yes. Read this. XmlDocument implements IXPathNavigable: msdn.microsoft.com/en-us/library/ms163435(v=vs.110).aspx Commented May 15, 2016 at 5:37
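
The LoadXml() route suggested in the comments above could look roughly like the following. This is only a sketch: the inline stylesheet, the `<doc>`/`<seg>` element names, and the class name are made-up stand-ins for the real preview XSL and input, and the BOM is trimmed with TrimStart rather than with the Substring call from the question.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class LoadXmlSketch
{
    // Hypothetical stand-in for the real preview stylesheet.
    private const string Xsl = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='html' indent='no'/>
  <xsl:template match='/doc'><p><xsl:value-of select='seg'/></p></xsl:template>
</xsl:stylesheet>";

    public static string Transform(string xml)
    {
        // LoadXml parses XML markup held in a string (it does not treat it as a path).
        // A leading BOM character would make it throw, so trim it first.
        var doc = new XmlDocument();
        doc.LoadXml(xml.TrimStart('\uFEFF'));

        var xslt = new XslCompiledTransform();
        using (var xslReader = XmlReader.Create(new StringReader(Xsl)))
        {
            xslt.Load(xslReader);
        }

        // XmlDocument implements IXPathNavigable, so it can be passed to Transform directly.
        using (var output = new StringWriter())
        {
            xslt.Transform(doc, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Transform("\uFEFF<doc><seg>Hello</seg></doc>"));
    }
}
```

XPathDocument is read-only and generally the lighter choice for XSLT input, so this is an alternative rather than an improvement; XmlDocument is mainly useful if the XML also needs to be edited before the transform.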

1 Answer

You don't need the intermediate StringBuilder and StringWriter.
An XslCompiledTransform instance can write directly to the stream on disk.

    string strSourceXML = string.Concat(extractor.TextSrc);
    strSourceXML = strSourceXML.Substring(strSourceXML.IndexOf("<?"));

    var xTr = new XslCompiledTransform();
    var xslt = Settings.GetValue("WordPreview", "XSLTpath", "");
    xTr.Load(xslt);

    string tmpSourceDoc = Path.GetTempFileName();

    using (var reader = new StringReader(strSourceXML))
    using (var writer = new StreamWriter(tmpSourceDoc, false, Encoding.Unicode))
    {
        var xSourceDoc = new XPathDocument(reader);
        xTr.Transform(xSourceDoc, null, writer);
    }
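
As a possible further shortening of the code above, Transform also has an overload that takes an XmlReader as input, so the XPathDocument can be skipped altogether. The sketch below assumes that is acceptable for the use case; the class name, the method name `TransformToFile`, and the tiny stylesheet written to a temp file in `Main` are all hypothetical and only there to make the example runnable.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

public static class DirectTransformSketch
{
    // Transforms XML held in a string and writes the result straight to a file.
    public static void TransformToFile(string xml, string xslPath, string outputPath)
    {
        var xslt = new XslCompiledTransform();
        xslt.Load(xslPath);

        // TrimStart removes a leading BOM character if one is present.
        using (var input = XmlReader.Create(new StringReader(xml.TrimStart('\uFEFF'))))
        using (var output = new StreamWriter(outputPath, false, Encoding.Unicode))
        {
            xslt.Transform(input, null, output);
        }
    }

    public static void Main()
    {
        // Hypothetical stylesheet, standing in for the one configured under "XSLTpath".
        string xslPath = Path.GetTempFileName();
        File.WriteAllText(xslPath,
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
            "<xsl:output method='html' indent='no'/>" +
            "<xsl:template match='/doc'><p><xsl:value-of select='seg'/></p></xsl:template>" +
            "</xsl:stylesheet>");

        string outputPath = Path.GetTempFileName();
        TransformToFile("\uFEFF<doc><seg>Hello</seg></doc>", xslPath, outputPath);
        Console.WriteLine(File.ReadAllText(outputPath));
    }
}
```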

4 Comments

With the two usings, would the second one need to go inside the block of the first one, like using () { using () { /* code */ } }?
One thing though: I need the StringWriter because I need to replace the header of the HTML to make it a Word doc. My earlier post on this: stackoverflow.com/a/37036506/6201755
@ib11 - Several using statements in a row are a short form of nesting them. I would generate the necessary HTML headers in the XSL transformation.
OK on using, thanks. Hmm, html headers in the transform... Dah... why did I not think of this? :S Well let's see...
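
To illustrate the suggestion above of producing the HTML headers inside the XSL, here is a rough sketch. The `ProgId` meta element, the title, and the `<doc>`/`<seg>` input structure are placeholders chosen for the example; the actual header needed for the Word preview is whatever the linked earlier post uses.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class HeaderInXslSketch
{
    // Hypothetical stylesheet: the <head> content is written as literal result
    // elements, so no string replacement is needed after the transform.
    private const string Xsl = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='html' indent='no'/>
  <xsl:template match='/'>
    <html>
      <head>
        <meta name='ProgId' content='Word.Document'/>
        <title>Preview</title>
      </head>
      <body><xsl:apply-templates select='doc/seg'/></body>
    </html>
  </xsl:template>
  <xsl:template match='seg'><p><xsl:value-of select='.'/></p></xsl:template>
</xsl:stylesheet>";

    public static string Transform(string xml)
    {
        var xslt = new XslCompiledTransform();
        using (var xslReader = XmlReader.Create(new StringReader(Xsl)))
        {
            xslt.Load(xslReader);
        }

        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Transform("<doc><seg>Hello</seg></doc>"));
    }
}
```

With the header generated by the stylesheet, the output can go directly to the StreamWriter as in the answer, and the intermediate StringWriter is no longer needed for post-processing.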
