Java : Convert formatted xml file to one line string

Question

I have a formatted XML file, and I want to convert it to a one-line string. How can I do that?

Sample xml:

<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book>
        <title>Basic XML</title>
        <price>100</price>
        <qty>5</qty>
    </book>
    <book>
        <title>Basic Java</title>
        <price>200</price>
        <qty>15</qty>
    </book>
</books>

Expected output

<?xml version="1.0" encoding="UTF-8"?><books><book> <title>Basic XML</title><price>100</price><qty>5</qty></book><book><title>Basic Java</title><price>200</price><qty>15</qty></book></books>

@Tomalak I need to pass it to a CGI as input, and that CGI only accepts XML in one-line form.
– Ianthe, Commented Apr 4, 2011 at 14:32

ant · Accepted Answer · 2011-04-01 09:14:35Z

48

// filename is the file path string
BufferedReader br = new BufferedReader(new FileReader(new File(filename)));
String line;
StringBuilder sb = new StringBuilder();

while ((line = br.readLine()) != null) {
    sb.append(line.trim());
}

Using StringBuilder is more efficient than string concatenation: http://kaioa.com/node/59
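For reference, a more complete version of the loop above might look like the following sketch. The class and method names are made up for illustration, and the caller is assumed to supply a Reader that already uses the right character encoding (the encoding named in the XML declaration is not inspected). try-with-resources closes the reader, and each line is trimmed before it is appended.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of the answer: trims every line and joins them.
class TrimJoinExample {

    static String toOneLine(Reader in) {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line.trim()); // drops indentation and trailing spaces
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String xml = "<books>\n    <book>\n        <qty>5</qty>\n    </book>\n</books>\n";
        System.out.println(toOneLine(new StringReader(xml)));
        // prints <books><book><qty>5</qty></book></books>
    }
}
```

To read a file, one option is to pass Files.newBufferedReader(Paths.get(filename), StandardCharsets.UTF_8) as the reader, assuming the file really is UTF-8. Text content that spans several lines is joined without any separator, so this only suits documents where line breaks are pure formatting.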

edited Apr 1, 2011 at 9:14

answered Apr 1, 2011 at 8:56

ant



3 Comments

Ocaso Protal Over a year ago

This will not remove leading/trailing spaces, no?

Puce Over a year ago

This doesn't respect the encoding mentioned in the XML document, does it?

Fırat Çağlar Akbulut Over a year ago

Sorry for the off-topic comment, but that link has expired and redirects users to irrelevant domains.

Mohammad Faisal · Accepted Answer · 2018-02-19 06:58:11Z

Run it through an XSLT identity transform with <xsl:output indent="no"> and <xsl:strip-space elements="*"/>

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="no" />
    <xsl:strip-space elements="*"/>
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

It will remove any of the non-significant whitespace and produce the expected output that you posted.

This seems to be a nice way, but you did not mention how to run this XSLT in Java.

Illidanek · Accepted Answer · 2014-05-23 14:58:50Z

6

// 1. Read xml from file to StringBuilder (StringBuffer)
// 2. call s = stringBuffer.toString()
// 3. remove all "\n" and "\t" (strings are immutable, so keep the result):
s = s.replaceAll("\n", "");
s = s.replaceAll("\t", "");

edited:

I made a small mistake, it is better to use StringBuilder in your case (I suppose you don't need thread-safe StringBuffer)

edited May 23, 2014 at 14:58

Illidanek


answered Apr 1, 2011 at 8:54

lukastymo


4 Comments

Jeff Foster Over a year ago

What if there was whitespace between a content element e.g. <text>foo (newline) bar</text>?

lukastymo Over a year ago

Double spaces: look at the expected result, where we have e.g. <book> <title>, with a space after <book>. I don't think @sprenna wants to do anything with spaces.

Ocaso Protal Over a year ago

It looks like an error in the example, b/c the other <book><title> combinations have no space in between

Ianthe Over a year ago

that is a typo, there shouldn't be any space in between. sorry for that.

Al Foиce ѫ · Accepted Answer · 2017-07-21 10:43:06Z

5

In Java 1.8 and above:

BufferedReader br = new BufferedReader(new FileReader(filePath));
String content = br.lines().collect(Collectors.joining("\n"));

edited Jul 21, 2017 at 10:43

Al Foиce ѫ


answered Jul 21, 2017 at 10:14

vijay yadav


1 Comment

Gediminas Rimsa Over a year ago

If the OP wants to minify the XML, something like this might work for most documents: reader.lines().map(String::trim).collect(Collectors.joining());. Note: it would likely fail in cases where element attributes are split over multiple lines.
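As a rough illustration of the comment above, here is a small sketch of the trim-and-join stream together with the failure case it mentions. The class name, method name, and sample strings are made up for the example.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.stream.Collectors;

// Hypothetical demo class for the stream-based suggestion in the comment.
class StreamTrimExample {

    static String minify(String xml) {
        return new BufferedReader(new StringReader(xml))
                .lines()                        // one element per input line
                .map(String::trim)              // strip indentation and trailing spaces
                .collect(Collectors.joining()); // join with no separator
    }

    public static void main(String[] args) {
        // Ordinary pretty-printed input collapses as intended.
        System.out.println(minify("<a>\n  <b>1</b>\n</a>"));
        // prints <a><b>1</b></a>

        // An attribute on its own line loses the space that separated it
        // from the element name, so the result is no longer well-formed.
        System.out.println(minify("<a\n    id=\"1\"/>"));
        // prints <aid="1"/>
    }
}
```

The second call shows why a line-based approach is only safe when every line break falls between tags.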

Community · Accepted Answer · 2017-05-23 11:46:34Z

Using this answer which provides the code to use Dom4j to do pretty-printing, change the line that sets the output format from: createPrettyPrint() to: createCompactFormat()

public String unPrettyPrint(final String xml){
    if (StringUtils.isBlank(xml)) {
        throw new RuntimeException("xml was null or blank in unPrettyPrint()");
    }

    final StringWriter sw;

    try {
        final OutputFormat format = OutputFormat.createCompactFormat();
        final org.dom4j.Document document = DocumentHelper.parseText(xml);
        sw = new StringWriter();
        final XMLWriter writer = new XMLWriter(sw, format);
        writer.write(document);
    }
    catch (Exception e) {
        throw new RuntimeException("Error un-pretty printing xml:\n" + xml, e);
    }
    return sw.toString();
}

james.garriss · Accepted Answer · 2013-08-01 17:39:17Z

4

Open and read the file.

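// Sketch only: imports and exception handling are omitted
BufferedReader r = new BufferedReader(new FileReader(filename));
String ret = "";
String s;
while ((s = r.readLine()) != null) {
    ret += s;
}
r.close();
return ret;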

edited Aug 1, 2013 at 17:39

james.garriss


answered Apr 1, 2011 at 8:53

user684934

3 Comments

lukastymo Over a year ago

ret +=s :(( don't do that, better use StringBuffer

user684934 Over a year ago

@smas :P it's not real code. I still haven't figured out how to properly format on this site, so I went for the most concise way. The idea still holds (if you import the relevant libraries, set up variables like filename, and add try{} catch{} blocks).

ant Over a year ago

don't use string concat or stringbuffer as smas suggests, use StringBuilder kaioa.com/node/59

Valentyn Kolesnikov · Accepted Answer · 2022-01-24 16:58:51Z

The underscore-java library has a static method U.formatXml(xmlstring):

import com.github.underscore.U;
import com.github.underscore.Xml;

public class MyClass {
    public static void main(String[] args) {
        System.out.println(U.formatXml("<a>\n <b></b>\n <b></b>\n</a>", Xml.XmlStringBuilder.Step.COMPACT));
    }
}
// output: <a><b></b><b></b></a>

Jeff Foster · Accepted Answer · 2011-04-01 08:55:48Z

I guess you want to read in, ignore the white space, and write it out again. Most XML packages have an option to ignore white space. For example, the DocumentBuilderFactory has setIgnoringElementContentWhitespace for this purpose.

Similarly if you are generating the XML by marshaling an object then JAXB has JAXB_FORMATTED_OUTPUT
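One caveat: according to its Javadoc, setIgnoringElementContentWhitespace relies on the content model and requires the parser to be in validating mode, so on its own it does nothing for a document without a DTD. The sketch below therefore swaps in a different technique: it parses with the JDK's DOM parser, removes whitespace-only text nodes by hand, and serializes without indentation. The class and method names are hypothetical, and the XML declaration is omitted from the output for simplicity.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: DOM parse, drop blank text nodes, write back compactly.
class DomCompactExample {

    // Recursively removes text nodes that contain only whitespace.
    static void stripBlankText(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                stripBlankText(child);
            }
            child = next;
        }
    }

    static String compact(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            stripBlankText(doc.getDocumentElement());

            // Identity transformer used purely as a serializer.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.INDENT, "no");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not compact XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(compact(
                "<books>\n  <book>\n    <title>Basic XML</title>\n  </book>\n</books>"));
    }
}
```

Text nodes that contain anything other than whitespace are left untouched, but whitespace-only text inside mixed content is removed as well, which may or may not be what a given document needs.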

user1113792 · Accepted Answer · 2016-08-02 17:39:46Z

The above solutions work if you are compressing all white space in the XML document. Other quick options are JDOM (using Format.getCompactFormat()) and dom4j (using OutputFormat.createCompactFormat()) when outputting the XML document.

However, I had a unique requirement to preserve the white space contained within the element's text value and these solutions did not work as I needed. All I needed was to remove the 'pretty-print' formatting added to the XML document.

The solution that I came up with can be explained in the following 3-step/regex process ... for the sake of understanding the algorithm for the solution.

String regex, updatedXml;

// 1. remove all white space preceding a begin element tag:
regex = "[\\n\\s]+(\\<[^/])";
updatedXml = originalXmlStr.replaceAll( regex, "$1" );

// 2. remove all white space following an end element tag:
regex = "(\\</[a-zA-Z0-9-_\\.:]+\\>)[\\s]+";
updatedXml = updatedXml.replaceAll( regex, "$1" );

// 3. remove all white space following an empty element tag
// (<some-element xmlns:attr1="some-value".... />):
regex = "(/\\>)[\\s]+";
updatedXml = updatedXml.replaceAll( regex, "$1" );

NOTE: The pseudo-code is in Java ... the '$1' is the replacement string which is the 1st capture group.

This will simply remove the white space used when adding the 'pretty-print' format to an XML document, yet preserve all other white space when it is part of the element text value.
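The three steps can be wrapped in a single method, as in the sketch below. The class and method names are invented for the example, and the patterns are lightly simplified forms of the ones above (redundant escapes dropped, the hyphen moved to the end of the character class), so treat them as an assumption and not as the answer's exact code.

```java
// Hypothetical wrapper around the three regex steps described above.
class RegexUnprettyExample {

    static String removePrettyPrint(String originalXmlStr) {
        // 1. remove white space preceding a begin element tag
        String updatedXml = originalXmlStr.replaceAll("\\s+(<[^/])", "$1");
        // 2. remove white space following an end element tag
        updatedXml = updatedXml.replaceAll("(</[a-zA-Z0-9_.:-]+>)\\s+", "$1");
        // 3. remove white space following an empty element tag
        updatedXml = updatedXml.replaceAll("(/>)\\s+", "$1");
        return updatedXml;
    }

    public static void main(String[] args) {
        String pretty = "<a>\n  <b> keep  this </b>\n  <c/>\n</a>";
        System.out.println(removePrettyPrint(pretty));
        // prints <a><b> keep  this </b><c/></a>
    }
}
```

The spaces inside the b element survive because they are neither directly before a begin tag nor directly after an end tag. Step 1 does remove white space that precedes a begin tag inside mixed content (for example, the space in "foo <i>bar</i>"), so the approach fits data-style documents better than prose-style ones.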

notebook · Accepted Answer · 2022-06-30 03:16:31Z

Below I present the prepared solution. Only the standard library of Java 1.8 was used.

XSLT:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="no"/>
    <xsl:strip-space elements="*"/>
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Java:

public static String convertXmlToOneLine(String xml) throws TransformerException {
    final String xslt =
            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n" +
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">\n" +
            " <xsl:output indent=\"no\"/>\n" +
            " <xsl:strip-space elements=\"*\"/>\n" +
            " <xsl:template match=\"@*|node()\">\n" +
            " <xsl:copy>\n" +
            " <xsl:apply-templates select=\"@*|node()\"/>\n" +
            " </xsl:copy>\n" +
            " </xsl:template>\n" +
            "</xsl:stylesheet>";

    /* prepare XSLT transformer from String */
    Source xsltSource = new StreamSource(new StringReader(xslt));
    TransformerFactory factory = TransformerFactory.newInstance();
    Transformer transformer = factory.newTransformer(xsltSource);

    /* where to read the XML? */
    Source source = new StreamSource(new StringReader(xml));

    /* where to write the XML? */
    StringWriter stringWriter = new StringWriter();
    Result result = new StreamResult(stringWriter);

    /* transform XML to one line */
    transformer.transform(source, result);

    return stringWriter.toString();
}

Sample output:

<?xml version="1.0" encoding="UTF-8"?><xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"><xsl:output indent="no"/><xsl:strip-space elements="*"/><xsl:template match="@*|node()"><xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy></xsl:template></xsl:stylesheet>

License: The MIT License

Charu Khurana · Accepted Answer · 2013-08-27 19:17:20Z

-2

FileUtils.readFileToString(fileName);

link

answered Aug 27, 2013 at 19:17

Charu Khurana


1 Comment

Grambot Over a year ago

The link even states that the method is deprecated. I wouldn't recommend using this method when a simple buffered read with trim would suffice.
