Decoding GZIP encoded Body, BodyBytes (ByteArray) and BodyBytesArray from URLRead

Question

When calling the Stack Exchange API v2.2 using the new (Mma11) function URLRead we get a GZIP encoded Body in alternative formats: "Body", "BodyByteArray" and "BodyBytes" depending on the options.

How can we deal with each of these formats in order to decode the body in Mathematica?

I am particularly interested in operating on ByteArray.

I know I could get the content using

Import[URLBuild[{"https://api.stackexchange.com", "2.2", "info"}, {"site" -> "mathematica"}], "RawJSON"]

My question is about dealing explicitly with the encoded list of byte values as well as the encoded ByteArray, hopefully without creating a temporary file and then reading it back.

The code I'm using:

reply=URLRead[ URLBuild[{"https://api.stackexchange.com", "2.2", "info"}, {"site" -> "mathematica"}] , {"Headers", "StatusCode", "StatusCodeDescription", "ContentType", "BodyByteArray"}]

<|"Headers" -> { "cache-control" -> "private" , "content-type" -> "application/json; charset=utf-8" , "content-encoding" -> "gzip" , "access-control-allow-origin" -> "*" , "access-control-allow-methods" -> "GET, POST" , "access-control-allow-credentials" -> "false" , "x-content-type-options" -> "nosniff" , "date" -> "Thu, 11 May 2017 10:55:51 GMT" , "content-length" -> "234" } , "StatusCode" -> 200 , "StatusCodeDescription" -> "OK" , "ContentType" -> "application/json; charset=utf-8" , "BodyByteArray" -> ByteArray[< 234 >] |>

By the way, I find it strange that Import[URLRead[url], "RawJSON"] doesn't work.

related: 45282

– Kuba, Commented May 11, 2017 at 12:53

Kuba · Accepted Answer · 2017-08-29 12:30:09Z

"BodyBytes"

A list of bytes from the HTTP response. It does not mean much without encoding information.

"BodyByteArray"

afaict "BodyBytes" wrapped with ByteArray. (1.)

"Body"

afaict, a String: FromCharacterCode[#BodyBytes, encoding], where encoding is read from the charset parameter of the content-type header.

That is a problem for us. First of all, it ignores content-encoding, so a gzipped body is decoded as text while it is still compressed. Additionally, JSON does not need a charset parameter, but without one the body won't be recognized as UTF-8 (in the case without gzip). I don't know if that is expected; it probably deserves a separate question.
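A quick way to see that the raw bytes are still compressed, and how the byte list relates to the ByteArray, is sketched below. This is only an illustration: the variable name bytes is mine, and the check relies on the fact that a gzip stream starts with the two magic bytes 0x1F 0x8B.

```mathematica
(* fetch the body as a plain list of integers in the range 0..255 *)
bytes = URLRead["https://api.stackexchange.com/2.2/info?site=mathematica", "BodyBytes"];

(* a gzip stream begins with the magic bytes 31 and 139 (0x1F, 0x8B) *)
Take[bytes, 2] === {31, 139}

(* ByteArray wraps the same values; Normal unwraps them back into a list *)
Normal[ByteArray[bytes]] === bytes
```

If the first test gives True, the body has not been decompressed yet, which is why decoding it as text by charset alone cannot yield readable JSON.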

So, the safe way (4.) is through bytes, e.g.:

URLRead[ "https://api.stackexchange.com/2.2/info?site=mathematica" , "BodyBytes" ] // FromCharacterCode // ImportString[#, {"gzip", "RawJSON"}] & (*3.*)

<|"items" -> {<|"new_active_users" -> 0, "total_users" -> 31928, "badges_per_minute" -> 0.03, "total_badges" -> 92485, "total_votes" -> 567164, "total_comments" -> 325306, "answers_per_minute" -> 0.02, "questions_per_minute" -> 0.01, "total_answers" -> 63637, "total_accepted" -> 23610, "total_unanswered" -> 4679, "total_questions" -> 43005, "api_revision" -> "2017.5.3.25597"|>}, "has_more" -> False, "quota_max" -> ..., "quota_remaining" -> ...|>
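For the ByteArray variant asked about in the question, a minimal sketch along the same lines follows. It assumes the reply is requested with the "Headers" and "BodyByteArray" elements and that header names arrive lower-cased, as in the output shown in the question; the names url, reply, raw and encoding are illustrative only. No temporary file is involved.

```mathematica
url = "https://api.stackexchange.com/2.2/info?site=mathematica";
reply = URLRead[url, {"Headers", "BodyByteArray"}];

(* Normal turns the ByteArray into a list of integers in 0..255;
   FromCharacterCode then builds a string with one character per byte *)
raw = FromCharacterCode[Normal[reply["BodyByteArray"]]];

(* "Headers" is a list of rules; treat a missing header as "identity" (assumption) *)
encoding = ToLowerCase @ Lookup[reply["Headers"], "content-encoding", "identity"];

(* decompress only when the server reports a gzipped body *)
If[encoding === "gzip",
  ImportString[raw, {"gzip", "RawJSON"}],
  ImportString[raw, "RawJSON"]
]
```

Checking the header first keeps the same code usable for endpoints that reply without compression, where passing "gzip" to ImportString would presumably fail.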
