Equivalent JavaScript functions for Python's urllib.parse.quote() and urllib.parse.unquote()

Question

Are there any equivalent JavaScript functions for Python's urllib.parse.quote() and urllib.parse.unquote()?

The closest I've come across are encodeURI()/encodeURIComponent() and escape() (and their corresponding un-encoding functions), but they don't encode/decode the same set of special characters as far as I can tell.

(un)escape did the work for me

– alex, Oct 31, 2021 at 16:41

mjhm · Accepted Answer · 2022-03-25 03:25:22Z

    JavaScript              | Python
    ------------------------+------------------------------------------------------
    encodeURI(str)          | urllib.parse.quote(str, safe='~@#$&()*!+=:;,?/\'')
    ------------------------+------------------------------------------------------
    encodeURIComponent(str) | urllib.parse.quote(str, safe='~()*!\'')

On Python 3.7+ you can remove ~ from safe=.
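As a quick check of the table, here is a minimal Python 3 sketch that applies both safe sets to a sample string. The constant names and the sample string are made up for illustration; the values in the comments are what encodeURI() and encodeURIComponent() return in JavaScript for the same input.

```python
from urllib.parse import quote

# Safe sets copied from the table above (hypothetical constant names).
ENCODE_URI_SAFE = "~@#$&()*!+=:;,?/'"
ENCODE_URI_COMPONENT_SAFE = "~()*!'"

# Sample input: space, reserved characters, and a non-ASCII letter (é).
sample = "a b&c/d?e=\u00e9#f"

like_encode_uri = quote(sample, safe=ENCODE_URI_SAFE)
like_encode_uri_component = quote(sample, safe=ENCODE_URI_COMPONENT_SAFE)

print(like_encode_uri)            # a%20b&c/d?e=%C3%A9#f
print(like_encode_uri_component)  # a%20b%26c%2Fd%3Fe%3D%C3%A9%23f
```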

Cameron · Accepted Answer · 2011-04-05 16:37:14Z

OK, I think I'm going to go with a hybrid custom set of functions:

Encode: Use encodeURIComponent(), then put slashes back in.
Decode: Decode any %hex values found.

Here's a more complete variant of what I ended up using (it handles Unicode properly, too):

    function quoteUrl(url, safe) {
        if (typeof safe !== 'string') {
            safe = '/';    // Don't escape slashes by default
        }

        url = encodeURIComponent(url);

        // Unescape characters that were in the safe list
        var toUnencode = [];
        for (var i = safe.length - 1; i >= 0; --i) {
            var encoded = encodeURIComponent(safe.charAt(i));
            if (encoded !== safe.charAt(i)) {    // Ignore safe char if it wasn't escaped
                toUnencode.push(encoded);
            }
        }

        url = url.replace(new RegExp(toUnencode.join('|'), 'ig'), decodeURIComponent);

        return url;
    }

    var unquoteUrl = decodeURIComponent;    // Make alias to have symmetric function names

Note that if you don't need "safe" characters when encoding ('/' by default in Python), then you can just use the built-in encodeURIComponent() and decodeURIComponent() functions directly.

Also, if there are Unicode characters (i.e. characters with codepoint >= 128) in the string, then to maintain compatibility with JavaScript's encodeURIComponent(), the Python quote_url() would have to be:

    # Python 2
    import urllib

    def quote_url(url, safe):
        """URL-encodes a string (either str (i.e. ASCII) or unicode);
        uses de-facto UTF-8 encoding to handle Unicode codepoints in given string.
        """
        return urllib.quote(unicode(url).encode('utf-8'), safe)

And unquote_url() would be:

    # Python 2; relies on `import urllib`
    def unquote_url(url):
        """Decodes a URL that was encoded using quote_url.
        Returns a unicode instance.
        """
        return urllib.unquote(url).decode('utf-8')

I used the unquote_url function but ran into issues when moving to Python 3: decoding is automatic in Python 3, whereas in Python 2 it is still required. I couldn't figure out a clean way to do it that worked in both versions. My Python 3 code is urllib.parse.unquote(six.text_type(a))
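For Python 3 only, the two helpers could be sketched as below. This is not the answer author's code: it assumes urllib.parse (Python 3), makes no attempt at Python 2 compatibility, and borrows the default safe='/' from quote() itself.

```python
from urllib.parse import quote, unquote


def quote_url(url, safe="/"):
    """Percent-encode a str; non-ASCII characters are encoded as UTF-8."""
    return quote(url, safe=safe, encoding="utf-8")


def unquote_url(url):
    """Decode percent-escapes and return a str (UTF-8 is assumed)."""
    return unquote(url, encoding="utf-8")


encoded = quote_url("/caf\u00e9 au lait")
print(encoded)               # /caf%C3%A9%20au%20lait
print(unquote_url(encoded))  # /café au lait
```

In Python 3, quote() and unquote() already default to UTF-8, so the explicit encoding argument only documents the intent.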

Milimetric · Accepted Answer · 2017-07-27 15:23:56Z

The requests library is a bit more popular, if you don't mind the extra dependency:

    from requests.utils import quote
    quote(str)

edited Jul 27, 2017 at 15:23

answered Feb 11, 2016 at 15:44

1 Comment

user3064538 Over a year ago

requests.utils.quote is just urllib.parse.quote. Don't install requests just for that function.

Timothy C. Quinn · Accepted Answer · 2022-05-31 14:29:17Z

Here are implementations based on an implementation in the GitHub repo purescript-python:

    import urllib.parse as urllp

    def encodeURI(s):
        return urllp.quote(s, safe="~@#$&()*!+=:;,.?/'")

    def decodeURI(s):
        return urllp.unquote(s, errors="strict")

    def encodeURIComponent(s):
        return urllp.quote(s, safe="~()*!.'")

    def decodeURIComponent(s):
        return urllp.unquote(s, errors="strict")

Luke Stanley · Accepted Answer · 2010-06-01 03:47:21Z

Python: urllib.quote

JavaScript: unescape

I haven't done extensive testing but for my purposes it works most of the time. I guess you have some specific characters that don't work. Maybe if I use some Asian text or something it will break :)

This came up when I googled so I put this in for all the others, if not specifically for the original question.

jiggy · Accepted Answer · 2009-06-03 19:26:24Z

Try a regex. Something like this:

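    // Use a callback so the hex code is computed per matched character
    mystring.replace(/[\u0100-\uFFFF]/g, (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase());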

That will replace any character above ordinal 255 with its corresponding %HEX representation.

answered Jun 3, 2009 at 19:26

2 Comments

Cameron Over a year ago

That's great for characters above 255, but there are some other funny ones that quote() catches that are below 255 (like '?', '&', '@', and others I don't know about)

jiggy Over a year ago

The brackets denote a character set, which can include individual characters as well as ranges. You can just as easily write it as /[\?&@\u0100-\uFFFF]/g to achieve that result. You just need to escape any characters that are also regex special characters (like ? or /).

score 0 · Accepted Answer · 2022-03-25 03:54:06Z

decodeURIComponent() is similar to unquote:

    const unquote = decodeURIComponent
    const unquote_plus = (s) => decodeURIComponent(s.replace(/\+/g, ' '))

except that Python is much more forgiving. If one of the two characters after a % is not a hex digit (or there aren't two characters after a %), JavaScript will throw a "URIError: URI malformed" error, whereas Python will just leave the % as is.
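The forgiving behaviour on the Python side can be seen with a few malformed inputs. This is only an illustrative sketch; the sample strings are made up, and each of them would raise a URIError if passed to decodeURIComponent() in JavaScript.

```python
from urllib.parse import unquote, unquote_plus

lone_percent = unquote("100%")      # nothing after the %
one_char = unquote("%4")            # only one character after the %
not_hex = unquote("%zz")            # two characters, but not hex digits
mixed = unquote("%41%zz")           # valid escapes are still decoded
plus_form = unquote_plus("a+b%")    # + becomes a space, the stray % is kept

print(lone_percent)  # 100%
print(one_char)      # %4
print(not_hex)       # %zz
print(mixed)         # A%zz
print(plus_form)     # a b%
```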

encodeURIComponent() is not quite the same as quote: you need to percent-encode a few more characters and un-escape /:

    const quoteChar = (c) => '%' + c.charCodeAt(0).toString(16).padStart(2, '0').toUpperCase()
    const quote = (s) => encodeURIComponent(s).replace(/[()*!']/g, quoteChar).replace(/%2F/g, '/')
    // Python's quote_plus defaults to safe='', so / stays encoded as %2F here
    const quote_plus = (s) => encodeURIComponent(s).replace(/[()*!']/g, quoteChar).replace(/%20/g, '+')

The characters that Python's quote doesn't escape are listed in the Python documentation (for Python 3.7+) as: "Letters, digits, and the characters '_.-~' are never quoted. By default, this function is intended for quoting the path section of a URL. The optional safe parameter specifies additional ASCII characters that should not be quoted — its default value is '/'"

The characters that JavaScript's encodeURIComponent doesn't encode are documented as uriAlpha (uppercase and lowercase ASCII letters), DecimalDigit, and uriMark, which is - _ . ! ~ * ' ( ).
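To see where the two character sets differ, a small Python sketch can test each character that encodeURIComponent leaves alone against quote(). It assumes Python 3.7+ (where ~ is never quoted); the variable names are made up for illustration.

```python
import string
from urllib.parse import quote

# Characters encodeURIComponent() leaves alone: uriAlpha, DecimalDigit, uriMark.
URI_MARK = "-_.!~*'()"
js_unescaped = set(string.ascii_letters + string.digits + URI_MARK)

# Of those, which does quote() escape when no extra safe characters are given?
escaped_by_quote = "".join(
    sorted(ch for ch in js_unescaped if quote(ch, safe="") != ch)
)
print(escaped_by_quote)  # !'()*

# In the other direction, quote() keeps / by default.
slash_kept = quote("/") == "/"
print(slash_kept)  # True
```

These are the same five characters that the quoteChar replacement above targets, plus the %2F un-escaping for /.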

Ken H · Accepted Answer · 2023-03-21 16:57:22Z

I am passing text files back and forth between Python and JavaScript.

Although urllib.parse.quote (Python side) and decodeURIComponent (JavaScript side) seem to work OK, they may not handle every character correctly.

So I wrote my own function that should be 100% reliable, regardless of the characters in the text file.

On the Python side I use xxd to encode the file. xxd is a Linux utility that converts a binary file to a string of two hex digits per byte. The Python code to encode the file to a string of hex codes is:

    import os

    mystring = os.popen("xxd -p " + your_file_name_here).read().replace('\n', '')

If you want to do the xxd conversion in Python instead of using the external program, you can use these functions. They only work with text files, though. If you need to work with binary, stick with the external xxd program.

    def doxxd(s):
        # Two lowercase hex digits per character (assumes code points below 256)
        return "".join(format(ord(ch), "02x") for ch in s)

    def unxxd(x):
        # Read two hex digits at a time and convert each pair back to a character
        return "".join(chr(int(x[i:i+2], 16)) for i in range(0, len(x), 2))

On the JavaScript side this function restores the hex code file back to the original text string:

    function unxxd(str) {
        var s = "";
        // Get two chars at a time
        for (var i = 0; i < str.length; i = i + 2) {
            s += String.fromCharCode(parseInt(str.substr(i, 2), 16));
        }
        return s;
    }
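If calling the external xxd program is inconvenient, the same two-hex-digits-per-byte encoding can be sketched with Python 3's built-in bytes methods, which also work on binary data. This is an alternative under that assumption, not the answer author's method; the function names are hypothetical.

```python
def hex_encode(data: bytes) -> str:
    """Return two lowercase hex digits per byte, like `xxd -p` with newlines removed."""
    return data.hex()


def hex_decode(text: str) -> bytes:
    """Inverse of hex_encode: turn pairs of hex digits back into bytes."""
    return bytes.fromhex(text)


payload = "caf\u00e9\n".encode("utf-8")
encoded = hex_encode(payload)
print(encoded)  # 636166c3a90a

round_trip = hex_decode(encoded)
print(round_trip == payload)  # True
```

Note that the JavaScript unxxd above turns each byte into one character, so text containing multi-byte UTF-8 sequences would need an extra decoding step on the JavaScript side.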
