
I have a function called fetchXML that is supposed to write an XML file called feed.xml to my root directory, and then I want to console.log the data inside feed.xml. I use fs.readFile AND I specify the encoding with 'utf-8' as shown in this question: Why does Node.js' fs.readFile() return a buffer instead of string?

But the result of my console.log is still a buffer. I checked inside feed.xml and it does indeed contain XML.

```javascript
var fs = require('fs');
var zlib = require('zlib');
var request = require('request');

var out = fs.createWriteStream('./feed.xml');

var fetchXML = function() {
  var feedURL = 'http://www2.jobs2careers.com/feed.php?id=1237-2595&c=1&pass=HeahE0W1ecAkkF0l';
  var stream = request(feedURL).pipe(zlib.createGunzip()).pipe(out);
  stream.on('finish', function() {
    fs.readFile('./feed.xml', 'utf-8', function(err, data) {
      console.log(data);
    });
  });
};

fetchXML();
```
  • what node version? Commented Nov 18, 2016 at 1:48
  • The latest for experimental features. 7.1 Commented Nov 18, 2016 at 1:53
  • I assume the err returned from readFile is null? Also, is the output correct (other than being a buffer instead of a string)? Commented Nov 18, 2016 at 2:13
  • There isn't an error, but I'm not sure how to answer your question about whether the buffer is correct or not, I just know it's a buffer instead of a string containing the xml. Commented Nov 18, 2016 at 2:15
  • I mean, are the bytes correct ascii for the xml you are expecting? Also, what zlib library are you using? Commented Nov 18, 2016 at 2:19

1 Answer


The main issue here is that err is set in this case, and it will tell you that toString() failed (due to the size of the file). readFile then leaves the data it read as a Buffer and passes that as the second argument to the callback.

This could be perceived as a partial bug since most people probably would not expect to see a second argument passed in, but at the same time err is set (and you should always handle errors) and it does give you an opportunity to do something else with the (raw binary) data that was already read into memory.

As far as solutions go, you will probably want a streaming parser for large amounts of data like this (hundreds of megabytes). For XML, one such module that provides a streaming interface is node-expat.


5 Comments

I would like to take the XML and convert it to JSON with xml2js, but when I try that with the "data" it returns undefined instead of JSON. I thought that might have something to do with it returning a buffer.
You will want to use a streaming parser for very large documents, as I have now noted in my answer.
What exactly am I trying to accomplish with the streaming parser? I'm a bit confused as to why my current solution isn't working to be honest. I've never done anything like this before, so you'll have to revert to a basic explanation.
V8 has a max string size and when you ask node to convert the raw binary data to a JavaScript string, it fails because the data is too large. In this particular case the XML data is/was over 300 MB and V8's current max string size is ~268 MB. If you use a streaming parser, you can parse the XML chunk by chunk instead of trying to load the entire XML file as one giant string first and then parsing it.
Ok, that makes sense. For node-expat, which function do I need to use? I see a variety of them (startElement, endElement, etc.), but it's not exactly self-explanatory how to use them...

