How to determine the encoding of text

Question

I received some text that is encoded, but I don't know what charset was used. Is there a way to determine the encoding of a text file using Python? The related question "How can I detect the encoding/codepage of a text file" deals with C#.

nosklo · Accepted Answer · 2020-11-29 22:11:18Z

EDIT: chardet seems to be unmaintained, but most of the answer still applies. Check https://pypi.org/project/charset-normalizer/ for an alternative.

Correctly detecting the encoding every time is impossible.

(From chardet FAQ:)

However, some encodings are optimized for specific languages, and languages are not random. Some character sequences pop up all the time, while other sequences make no sense. A person fluent in English who opens a newspaper and finds “txzqJv 2!dasd0a QqdKjvz” will instantly recognize that that isn't English (even though it is composed entirely of English letters). By studying lots of “typical” text, a computer algorithm can simulate this kind of fluency and make an educated guess about a text's language.
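To see why detection can only ever be an educated guess, here is a minimal sketch using only the standard library. The sample bytes and the list of candidate codecs are chosen for illustration; they are not from the chardet FAQ.

```python
# One byte sequence, several equally "valid" readings.
data = b"\xe9t\xe9"  # the bytes of "été" when encoded as Latin-1

# Illustrative selection of single-byte codecs that all accept these bytes.
candidates = ["latin_1", "cp1252", "cp1251", "cp437", "koi8_r"]
readings = {name: data.decode(name) for name in candidates}

for name, text in readings.items():
    print(f"{name:8} -> {text!r}")
```

None of these decodes raises an error, so nothing in the bytes themselves says which reading was intended. Only knowledge about the expected language can rank them.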

There is the chardet library that uses that study to try to detect encoding. chardet is a port of the auto-detection code in Mozilla.

You can also use UnicodeDammit. It will try the following methods:

An encoding discovered in the document itself: for instance, in an XML declaration or (for HTML documents) an http-equiv META tag. If Beautiful Soup finds this kind of encoding within the document, it parses the document again from the beginning and gives the new encoding a try. The only exception is if you explicitly specified an encoding, and that encoding actually worked: then it will ignore any encoding it finds in the document.
An encoding sniffed by looking at the first few bytes of the file. If an encoding is detected at this stage, it will be one of the UTF-* encodings, EBCDIC, or ASCII.
An encoding sniffed by the chardet library, if you have it installed.
UTF-8
Windows-1252
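The first step in that list, looking for an encoding declared inside the document, can be sketched with the standard library alone. This is not how Beautiful Soup implements it; `declared_encoding`, its `search_limit` parameter and the two regular expressions are hypothetical names for illustration, and the sketch only works for ASCII-compatible encodings.

```python
import re
from typing import Optional

# Patterns for encodings declared inside the document itself (illustrative, not exhaustive).
_XML_DECL = re.compile(rb"""^<\?xml[^>]*\bencoding\s*=\s*["']([A-Za-z0-9._-]+)["']""")
_HTML_META = re.compile(rb"""<meta[^>]*\bcharset\s*=\s*["']?([A-Za-z0-9._-]+)""", re.IGNORECASE)

def declared_encoding(data: bytes, search_limit: int = 1024) -> Optional[str]:
    """Return the encoding named in an XML declaration or HTML meta tag, if any."""
    head = data[:search_limit]  # declarations live near the start of the document
    match = _XML_DECL.search(head) or _HTML_META.search(head)
    return match.group(1).decode("ascii").lower() if match else None

print(declared_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
```

A declared encoding is still only a claim made by the document, so it is worth verifying that the bytes actually decode with it before trusting it.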

Thanks for the chardet reference. Seems good, although a bit slow.
@Geomorillo: There's no such thing as "the encoding standard". Text encoding is something as old as computing, it grew organically with time and needs, it wasn't planned. "Unicode" is an attempt to fix this.
And not a bad one, all things considered. What I would like to know is, how do I find out what encoding an open text file was opened with?
@dumbledad what I said is that correctly detecting it all times is impossible. All you can do is a guess, but it can fail sometimes, it won't work every time, due to encodings not being really detectable. To do the guess, you can use one of the tools I suggested in the answer
@LasseKärkkäinen the point of that answer is to show that correctly detecting encoding is impossible; the function you provide can guess right for your case, but is wrong for many cases.

Dominik Kaszewski · Accepted Answer · 2025-11-02 16:17:26Z

Another option for working out the encoding is to use libmagic (which is the code behind the file command). There are a profusion of Python bindings available.

The Python bindings that live in the file source tree are available as the python-magic (or python3-magic) Debian package. They can determine the encoding of a file like this:

    import magic

    blob = open('unknown-file', 'rb').read()
    m = magic.open(magic.MAGIC_MIME_ENCODING)
    m.load()
    encoding = m.buffer(blob)  # "utf-8", "us-ascii", etc.

There is an identically named, but incompatible, python-magic pip package on PyPI that also uses libmagic. It can also get the encoding, by doing:

    import magic

    m = magic.Magic(mime_encoding=True)
    encoding = m.from_file('unknown-file')

edited Nov 2 at 16:17

Dominik Kaszewski

answered Apr 24, 2013 at 23:10

Hamish Downer

9 Comments

MestreLion Over a year ago

libmagic is indeed a viable alternative to chardet. And great info on the distinct packages named python-magic! I'm sure this ambiguity bites many people

tripleee Over a year ago

file isn't particularly good at identifying human language in text files. It is excellent for identifying various container formats, though you sometimes have to know what it means ("Microsoft Office document" could mean an Outlook message, etc).

xtian Over a year ago

Looking for a way to manage file encoding mystery I found this post. Unfortunately, using the example code, I can't get past open(): UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 169799: invalid start byte. The file encoding according to vim's :set fileencoding is latin1.

xtian Over a year ago

If I use the optional argument errors='ignore', the output of the example code is the less helpful binary.

Tronic Over a year ago

@xtian You need to open in binary mode, i.e. open("filename.txt", "rb").

Peter Mortensen · Accepted Answer · 2025-03-12 00:26:29Z

Some encoding strategies (please uncomment to taste):

    #!/bin/bash
    #
    tmpfile="$1"
    echo '-- info about file file ........'
    file -i "$tmpfile"
    enca -g "$tmpfile"
    echo 'recoding ........'
    #iconv -f iso-8859-2 -t utf-8 back_test.xml > "$tmpfile"
    #enca -x utf-8 "$tmpfile"
    #enca -g "$tmpfile"
    recode CP1250..UTF-8 "$tmpfile"

You might like to check the encoding by opening and reading the file in a loop, but you might need to check the file size first:

    # PYTHON
    import codecs

    encodings = ['utf-8', 'windows-1250', 'windows-1252']  # Add more
    for e in encodings:
        try:
            fh = codecs.open('file.txt', 'r', encoding=e)
            fh.readlines()
            fh.seek(0)
        except UnicodeDecodeError:
            print('got Unicode error with %s, trying different encoding' % e)
        else:
            print('opening the file with encoding: %s ' % e)
            break

You can also use io, like io.open(filepath, 'r', encoding='utf-8'), which is more convenient, because codecs doesn't convert \n automatically on reading and writing. More on HERE
This might not do what you would expect it to do: You only get a UnicodeDecodeError if the decoding fails by not representing a valid character in the current encoding. But you might still get garbled output if characters from the original encoding are simply being misinterpreted as different (valid) characters, without raising a UnicodeDecodeError.
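The caveat in the last comment can be reproduced in a few lines. This is a minimal sketch with an illustrative sample word; it shows that the absence of a UnicodeDecodeError says nothing about whether the chosen encoding was the right one.

```python
original = "Noël"
utf8_bytes = original.encode("utf-8")

# Decoding with the wrong single-byte codec succeeds silently and yields mojibake.
misread = utf8_bytes.decode("cp1252")
print(misread)  # NoÃ«l

# A strict multi-byte codec, on the other hand, rejects bytes it cannot interpret.
latin1_bytes = original.encode("latin-1")
try:
    latin1_bytes.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True
print(rejected)  # True
```

For a trial loop like the one in the answer, this suggests putting strict encodings such as UTF-8 first and permissive single-byte encodings last, since the first permissive codec in the list will accept almost anything.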

Peter Mortensen · Accepted Answer · 2025-03-12 00:35:30Z

Here is an example of reading and taking at face value a chardet encoding prediction, reading n_lines from the file in the event it is large.

chardet also gives you a probability (i.e., confidence) of its encoding prediction (I haven't looked at how they come up with that), which is returned with its prediction from chardet.detect(), so you could work that in somehow if you like.

    import chardet
    from pathlib import Path

    def predict_encoding(file_path: Path, n_lines: int = 20) -> str:
        '''Predict a file's encoding using chardet'''

        # Open the file as binary data
        with Path(file_path).open('rb') as f:
            # Join binary lines for specified number of lines
            rawdata = b''.join([f.readline() for _ in range(n_lines)])

        return chardet.detect(rawdata)['encoding']

edited Mar 12 at 0:35

Peter Mortensen

answered Jul 18, 2017 at 13:01

ryanjdillon

5 Comments

ryanjdillon Over a year ago

Looking at this after getting an up-vote and now see that this solution could slow down if there were a lot of data on the first line. In some cases it would be better to read the data in differently.

n158 Over a year ago

I have modified this function this way: def predict_encoding(file_path, n=20): ... skip ... and then rawdata = b''.join([f.read() for _ in range(n)]). I tried this function on Python 3.6 and it worked perfectly with "ascii", "cp1252", "utf-8", "unicode" encodings. So this definitely gets an upvote.

Umar.H Over a year ago

this is very good for handling small datasets with a variety of formats. Tested this recursively on my root dir and it worked like a treat. Thanks buddy.

kuzzooroo Over a year ago

I'm not very familiar with reading data at the byte level. @n158, is there a chance one might stop reading bytes in the middle of a character and confuse chardet?

panofish Over a year ago

Of all the great suggestions, I found this the most useful. I also include @n158 suggestion to only read 20 bytes instead of 20 lines.
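On the question of stopping in the middle of a character: how a given detector copes with that depends on the detector, and nothing here describes chardet's internals. For a plain UTF-8 validity check on a byte-limited sample, though, the standard library's incremental decoder tolerates an incomplete trailing character. The helper names `read_sample` and `sample_is_valid_utf8` and the 4096-byte default are assumptions for this sketch.

```python
import codecs

def read_sample(path: str, max_bytes: int = 4096) -> bytes:
    """Read at most max_bytes from the start of the file, in binary mode."""
    with open(path, "rb") as f:
        return f.read(max_bytes)

def sample_is_valid_utf8(sample: bytes) -> bool:
    """Check a possibly truncated sample; an incomplete trailing character is tolerated."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tells the decoder that more bytes may follow the sample.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True

truncated = "café".encode("utf-8")[:-1]  # cuts the two-byte "é" in half
print(sample_is_valid_utf8(truncated))     # True
print(sample_is_valid_utf8(b"caf\xe9 ok"))  # False
```

Reading a fixed number of bytes also bounds the cost when the first line of the file is very long, which reading a fixed number of lines does not.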

kgf3JfUtW · Accepted Answer · 2020-03-25 23:54:13Z

This might be helpful

    from bs4 import UnicodeDammit

    with open('automate_data/billboard.csv', 'rb') as file:
        content = file.read()

    suggestion = UnicodeDammit(content)
    suggestion.original_encoding
    # 'iso-8859-1'

edited Mar 25, 2020 at 23:54

kgf3JfUtW

answered Mar 25, 2020 at 23:40

richinex

3 Comments

sakell Over a year ago

Thank you, I could determine the correct encoding now!

skinnedpanda Over a year ago

UnicodeDammit ought to be added to the standard library imo.

milahu Over a year ago

bs4.UnicodeDammit is based on chardet

rebahozkoc · Accepted Answer · 2021-08-08 13:36:36Z

If you are not satisfied with the automatic tools you can try all codecs and see which codec is right manually.

    all_codecs = [
        'ascii', 'big5', 'big5hkscs', 'cp037', 'cp273', 'cp424', 'cp437',
        'cp500', 'cp720', 'cp737', 'cp775', 'cp850', 'cp852', 'cp855', 'cp856',
        'cp857', 'cp858', 'cp860', 'cp861', 'cp862', 'cp863', 'cp864', 'cp865',
        'cp866', 'cp869', 'cp874', 'cp875', 'cp932', 'cp949', 'cp950', 'cp1006',
        'cp1026', 'cp1125', 'cp1140', 'cp1250', 'cp1251', 'cp1252', 'cp1253',
        'cp1254', 'cp1255', 'cp1256', 'cp1257', 'cp1258', 'euc_jp',
        'euc_jis_2004', 'euc_jisx0213', 'euc_kr', 'gb2312', 'gbk', 'gb18030',
        'hz', 'iso2022_jp', 'iso2022_jp_1', 'iso2022_jp_2', 'iso2022_jp_2004',
        'iso2022_jp_3', 'iso2022_jp_ext', 'iso2022_kr', 'latin_1', 'iso8859_2',
        'iso8859_3', 'iso8859_4', 'iso8859_5', 'iso8859_6', 'iso8859_7',
        'iso8859_8', 'iso8859_9', 'iso8859_10', 'iso8859_11', 'iso8859_13',
        'iso8859_14', 'iso8859_15', 'iso8859_16', 'johab', 'koi8_r', 'koi8_t',
        'koi8_u', 'kz1048', 'mac_cyrillic', 'mac_greek', 'mac_iceland',
        'mac_latin2', 'mac_roman', 'mac_turkish', 'ptcp154', 'shift_jis',
        'shift_jis_2004', 'shift_jisx0213', 'utf_32', 'utf_32_be', 'utf_32_le',
        'utf_16', 'utf_16_be', 'utf_16_le', 'utf_7', 'utf_8', 'utf_8_sig',
    ]

    def find_codec(text):
        for i in all_codecs:
            for j in all_codecs:
                try:
                    print(i, "to", j, text.encode(i).decode(j))
                except:
                    pass

    find_codec("The example string which includes ö, ü, or ÄŸ, Ã¶")

This script creates up to 9409 lines of output (97 × 97 codec pairs, minus the pairs that raise an error). So, if the output does not fit on the terminal screen, try writing it to a text file.

As a complement to this good answer, I wrote a Python script that prints out a nice table of possible encoding mismatches, provided a small sample of the original correct string is guessed by users. Then, it suggests which encode/decode pair undoes the mismatch.
Thanks. Thru trial and error, I was finally able to get a SQL Server table with collation SQL_Latin1_General_CP1_CI_AS to emit data into a CSV using utf_8_sig.. It's a useful list of encodings (via Python/SQLAlchemy/PYODBC) ... So now Noël for example displays.
this is wasteful when the input starts with many ascii chars. each iteration will restart decoding from the first byte
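The idea from the first comment, finding the encode/decode pair that undoes a mismatch when a small sample of the correct string is known, can be sketched as follows. This is not that commenter's script; `find_repair_pairs` and its parameters are hypothetical names, and the short codec list is only for illustration.

```python
from typing import Iterable, List, Tuple

def find_repair_pairs(garbled: str, expected: str,
                      codecs_to_try: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (encode_with, decode_with) pairs that turn `garbled` back into `expected`."""
    names = list(codecs_to_try)
    pairs = []
    for enc in names:
        try:
            raw = garbled.encode(enc)  # encode once per outer codec
        except (UnicodeError, LookupError):
            continue  # this codec cannot represent the garbled text
        for dec in names:
            try:
                if raw.decode(dec) == expected:
                    pairs.append((enc, dec))
            except (UnicodeError, LookupError):
                pass  # this codec cannot decode the bytes
    return pairs

pairs = find_repair_pairs("NoÃ«l", "Noël", ["ascii", "latin_1", "cp1252", "utf_8"])
print(pairs)  # [('latin_1', 'utf_8'), ('cp1252', 'utf_8')]
```

Filtering on a known expected sample reduces the output from thousands of lines to the few pairs that actually repair the text.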

Bimo · Accepted Answer · 2017-06-16 14:06:49Z

    # Function: OpenRead(file)

    # A text file can be encoded using:
    #   (1) The default operating system code page, Or
    #   (2) utf8 with a BOM header
    #
    #  If a text file is encoded with utf8, and does not have a BOM header,
    #  the user can manually add a BOM header to the text file
    #  using a text editor such as notepad++, and rerun the python script,
    #  otherwise the file is read as a codepage file with the
    #  invalid codepage characters removed

    import sys
    if int(sys.version[0]) != 3:
        print('Aborted: Python 3.x required')
        sys.exit(1)

    def bomType(file):
        """
        returns file encoding string for open() function

        EXAMPLE:
            bom = bomType(file)
            open(file, encoding=bom, errors='ignore')
        """

        with open(file, 'rb') as f:
            b = f.read(4)

        if b[0:3] == b'\xef\xbb\xbf':
            return "utf-8-sig"  # utf-8-sig strips the BOM when reading

        # UTF-32 must be tested before UTF-16, because the UTF-32 LE BOM
        # (ff fe 00 00) starts with the UTF-16 LE BOM (ff fe)
        if b[0:4] in (b'\x00\x00\xfe\xff', b'\xff\xfe\x00\x00'):
            return "utf32"

        # Python automatically detects endianess if utf-16 bom is present
        # write endianess generally determined by endianess of CPU
        if b[0:2] in (b'\xfe\xff', b'\xff\xfe'):
            return "utf16"

        # If BOM is not provided, then assume its the codepage
        #     used by your operating system
        return "cp1252"
        # For the United States its: cp1252

    def OpenRead(file):
        bom = bomType(file)
        return open(file, 'r', encoding=bom, errors='ignore')

    #######################
    # Testing it
    #######################
    fout = open("myfile1.txt", "w", encoding="cp1252")
    fout.write("* hi there (cp1252)")
    fout.close()

    fout = open("myfile2.txt", "w", encoding="utf8")
    fout.write("\u2022 hi there (utf8)")
    fout.close()

    # this case is still treated like codepage cp1252
    #   (User responsible for making sure that all utf8 files
    #   have a BOM header)
    fout = open("badboy.txt", "wb")
    fout.write(b"hi there.  barf(\x81\x8D\x90\x9D)")
    fout.close()

    # Read Example file with Bom Detection
    fin = OpenRead("myfile1.txt")
    L = fin.readline()
    print(L)
    fin.close()

    # Read Example file with Bom Detection
    fin = OpenRead("myfile2.txt")
    L = fin.readline()
    print(L)  # requires QtConsole to view, Cmd.exe is cp1252
    fin.close()

    # Read CP1252 with a few undefined chars without barfing
    fin = OpenRead("badboy.txt")
    L = fin.readline()
    print(L)
    fin.close()

    # Check that bad characters are still in badboy codepage file
    fin = open("badboy.txt", "rb")
    fin.read(20)
    fin.close()

Martin v. Löwis · Accepted Answer · 2009-01-12 17:32:27Z

It is, in principle, impossible to determine the encoding of a text file, in the general case. So no, there is no standard Python library to do that for you.

If you have more specific knowledge about the text file (e.g. that it is XML), there might be library functions.
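As an illustration of such library functions, the standard library's XML and JSON parsers both accept raw bytes and work out the encoding from the format's own rules. The sample documents below are made up for this sketch.

```python
import json
import xml.etree.ElementTree as ET

# The XML parser reads the encoding from the declaration and decodes accordingly.
xml_bytes = b'<?xml version="1.0" encoding="iso-8859-1"?><word>No\xebl</word>'
root = ET.fromstring(xml_bytes)
print(root.text)  # Noël

# json.loads accepts bytes and distinguishes UTF-8, UTF-16 and UTF-32.
json_bytes = '{"word": "Noël"}'.encode("utf-16")
word = json.loads(json_bytes)["word"]
print(word)  # Noël
```

In both cases the bytes are handed to the parser unopened, so no encoding has to be guessed in advance.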

Peter Mortensen · Accepted Answer · 2025-03-12 00:33:39Z

Depending on your platform, I just opt to use the Linux shell file command. This works for me since I am using it in a script that exclusively runs on one of our Linux machines.

Obviously, this isn't an ideal solution or answer, but it could be modified to fit your needs. In my case I just need to determine whether a file is UTF-8 or not.

    import subprocess

    def is_utf8(path='test.txt'):
        file_cmd = ['file', path]
        # text=True makes the pipe return str instead of bytes
        p = subprocess.Popen(file_cmd, stdout=subprocess.PIPE, text=True)
        cmd_output = p.stdout.readlines()
        # x will begin with the file type output as is observed using 'file' command
        x = cmd_output[0].split(": ")[1]
        return x.startswith('UTF-8')

Forking a new process is not needed. Python code already runs inside a process, and can call the proper system functions itself without the overhead of loading a new process.

cheersmate · Accepted Answer · 2024-10-16 11:43:16Z

`cchardet` faster alternative to `chardet`

Install: pip install cchardet

Use:

    from pathlib import Path

    import cchardet as chardet

    filepath = Path(filename)  # filename: path of the file to inspect
    blob = filepath.read_bytes()
    detection = chardet.detect(blob)
    encoding = detection["encoding"]
    confidence = detection["confidence"]

Martin Thurau · Accepted Answer · 2009-01-12 17:36:42Z

If you know some of the file's content, you can try decoding it with several encodings and see which ones fail to reproduce it. In general there is no way, since a text file is a text file and those are stupid ;)

js2010 · Accepted Answer · 2025-03-12 01:11:39Z

This site has Python code for recognizing ASCII, encoding with BOMs, and UTF-8 without a BOM: 8. How to guess the encoding of a document.

Here's an example. I'm on OS X.

    #!/usr/bin/python

    import sys

    def isUTF8(data):
        try:
            decoded = data.decode('UTF-8')
        except UnicodeDecodeError:
            return False
        else:
            for ch in decoded:
                if 0xD800 <= ord(ch) <= 0xDFFF:
                    return False
            return True

    def get_bytes_from_file(filename):
        return open(filename, "rb").read()

    filename = sys.argv[1]
    data = get_bytes_from_file(filename)
    result = isUTF8(data)
    print(result)

    PS /Users/js> ./isutf8.py hi.txt
    True

A link to a solution is welcome, but please ensure your answer is useful without it: add context around the link so your fellow users will have some idea what it is and why it’s there, then quote the most relevant part of the page you're linking to in case the target page is unavailable. Answers that are little more than a link may be deleted.
The www.codecodex.com link is broken (probably domain expiry): "There's nothing here, yet."

Abpostman1 · Accepted Answer · 2022-10-12 12:47:28Z

A long time ago, I had this need.

Reading old code of mine, I found this:

    import urllib.request
    import chardet
    import os
    import settings

    [...]
    file = 'sources/dl/file.csv'
    media_folder = settings.MEDIA_ROOT
    file = os.path.join(media_folder, str(file))
    if os.path.isfile(file):
        file_2_test = urllib.request.urlopen('file://' + file).read()
        encoding = (chardet.detect(file_2_test))['encoding']
        return encoding

This worked for me and returned ascii

Mahmud Arfan · Accepted Answer · 2024-01-13 18:01:31Z

With this answer

I just want to add, for everyone's information, to install magic from the Python 3 pip:

pip install python-magic

Peter Mortensen · Accepted Answer · 2025-03-12 00:51:36Z

Using the Linux file -i command:

    import subprocess

    file = "path/to/file/file.txt"
    # -b: omit the file name from the output, -i: print MIME type and charset
    output = subprocess.run(["file", "-bi", file], capture_output=True, text=True).stdout
    # Output looks like "text/plain; charset=utf-8"
    encoding = output.split("charset=")[1].strip()
    print(encoding)

Peter Mortensen · Accepted Answer · 2025-03-12 00:52:34Z

You can use the python-magic package which does not load the whole file into memory:

    import magic

    def detect(file_path):
        return magic.Magic(mime_encoding=True).from_file(file_path)

The output is the encoding name, for example:

iso-8859-1
us-ascii
utf-8

Peter Mortensen · Accepted Answer · 2025-03-12 03:13:22Z

Some text files are aware of their encoding, but most are not. Aware:

a text file having a BOM
an XML file is encoded in UTF-8 or its encoding is given in the preamble
a JSON file is always encoded in UTF-8

Not aware:

a CSV file
any random text file

Some encodings are versatile, i.e., they can decode any sequence of bytes, and some are not. US-ASCII is not versatile, since any byte greater than 127 is not mapped to any character. UTF-8 is not versatile, since not every sequence of bytes is valid.

By contrast, Latin-1, Windows-1252, etc. are versatile (even if some bytes are not officially mapped to a character):

    >>> [b.to_bytes(1, 'big').decode("latin-1") for b in range(256)]
    ['\x00', ..., 'ÿ']

Given a random text file encoded in a sequence of bytes, you can't determine the encoding unless the file is aware of its encoding, because some encodings are versatile. But you can sometimes exclude non-versatile encodings. All versatile encodings are still possible. The chardet module uses the frequency of bytes to guess which encoding best fits the encoded text.
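How versatile a codec is can be measured directly, at least for single bytes. The sketch below counts how many of the 256 one-byte inputs each codec accepts; `accepted_single_bytes` is a hypothetical helper name, and the codec selection is illustrative.

```python
def accepted_single_bytes(codec: str) -> int:
    """Count how many of the 256 one-byte inputs the codec decodes without error."""
    count = 0
    for value in range(256):
        try:
            bytes([value]).decode(codec)
            count += 1
        except UnicodeDecodeError:
            pass  # this byte on its own is not valid in the codec
    return count

for codec in ("latin_1", "cp1252", "ascii", "utf_8"):
    print(codec, accepted_single_bytes(codec))
```

On CPython this reports 256 for latin_1 and 128 for both ascii and utf_8. cp1252 comes out slightly below 256, because Python's codec raises an error for the few bytes that Windows-1252 leaves unassigned, so in Python it is only versatile when combined with an errors setting such as 'replace'.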

If you don't want to use this module or a similar one, here's a simple method:

check if the file is aware of its encoding (BOM)
check non-versatile encodings and accept the first that can decode the bytes (ASCII before UTF-8, because it is stricter)
choose a fallback encoding.

The second step is a bit risky if you check only a sample, because some bytes in the rest of the file may be invalid.

The code:

    import codecs

    def guess_encoding(data: bytes, fallback: str = "iso8859_15") -> str:
        """
        A basic encoding detector.
        """
        for bom, encoding in [
            (codecs.BOM_UTF32_BE, "utf_32_be"),
            (codecs.BOM_UTF32_LE, "utf_32_le"),
            (codecs.BOM_UTF16_BE, "utf_16_be"),
            (codecs.BOM_UTF16_LE, "utf_16_le"),
            (codecs.BOM_UTF8, "utf_8_sig"),
        ]:
            if data.startswith(bom):
                return encoding

        if all(b < 128 for b in data):
            return "ascii"  # You may want to use the fallback here if data is only a sample.

        decoder = codecs.getincrementaldecoder("utf_8")()
        try:
            decoder.decode(data, final=False)
        except UnicodeDecodeError:
            return fallback
        else:
            return "utf_8"  # Not certain if data is only a sample

Remember that non-versatile encodings may fail. The errors parameter of the decode method can be set to 'ignore', 'replace' or 'backslashreplace' to avoid exceptions.
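The three error handlers behave differently on the same invalid input. A minimal sketch, with sample bytes chosen for illustration:

```python
data = b"caf\xe9"  # "café" encoded as Latin-1, which is not valid UTF-8

ignored = data.decode("utf-8", errors="ignore")
replaced = data.decode("utf-8", errors="replace")
escaped = data.decode("utf-8", errors="backslashreplace")

print(repr(ignored))   # 'caf'
print(repr(replaced))  # 'caf\ufffd'
print(repr(escaped))   # 'caf\\xe9'
```

'ignore' silently drops the offending byte, 'replace' substitutes the U+FFFD replacement character, and 'backslashreplace' keeps the byte value visible, which makes it the only one of the three that loses no information.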

Peter Mortensen · Accepted Answer · 2025-03-12 03:15:13Z

Summary:

cchardet for small inputs
charset_normalizer for large inputs

When the input is random bytes (no text encoding is detected) then you probably want hex_escape
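The answer does not define hex_escape, so the following is only one possible reading of it: a hypothetical helper that keeps printable ASCII and renders every other byte as a \xNN escape.

```python
def hex_escape(data: bytes) -> str:
    """Render bytes as text: printable ASCII is kept, everything else becomes \\xNN."""
    return "".join(
        # The backslash itself (0x5c) is escaped too, so the output stays unambiguous.
        chr(b) if 32 <= b < 127 and b != 0x5C else f"\\x{b:02x}"
        for b in data
    )

print(hex_escape(b"hi\xff\n"))  # hi\xff\x0a
```

A fallback like this never fails and never pretends that undecodable bytes are text in some guessed encoding.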

Compare some algorithms: guess-encoding-of-bytestring.py

    #!/usr/bin/env python3

    # guess text encoding of bytestring

    # [cchardet]: https://github.com/PyYoshi/cChardet
    # [faust-cchardet]: https://github.com/faust-streaming/cChardet
    # [uchardet]: https://gitlab.freedesktop.org/uchardet/uchardet
    # good for short strings
    # fails on long strings
    def guess_encoding_cchardet(bs: bytes):
        return cchardet.detect(bs).get("encoding")

    # [charset_normalizer]: https://github.com/jawah/charset_normalizer
    # [charset_normalizer#566]: https://github.com/jawah/charset_normalizer/issues/566
    # good for long strings
    # fails on short strings
    # https://github.com/jawah/charset_normalizer/issues/486
    # 20x faster than chardet [charset_normalizer]
    # -> 200x slower than cchardet
    # 5x slower than cchardet [charset_normalizer#566]
    # benchmark versus chardet
    # https://github.com/jawah/charset_normalizer/raw/master/bin/performance.py
    def guess_encoding_charset_normalizer(bs: bytes):
        match = charset_normalizer.from_bytes(bs).best()
        if match:
            return match.encoding
        return None

    # [rs_chardet]: https://github.com/emattiza/rs_chardet
    # 40x slower than cchardet [rs_chardet]
    def guess_encoding_rs_chardet(bs: bytes):
        return rs_chardet.detect_rs_enc_name(bs)
        # return rs_chardet.detect_codec(bs).name

    # [chardet]: https://github.com/chardet/chardet
    # 4000x slower than cchardet [rs_chardet]
    # 2000x slower than cchardet [cchardet]
    def guess_encoding_chardet(bs: bytes):
        return chardet.detect(bs).get("encoding")

    # [magic]: https://github.com/ahupp/python-magic
    # fails on short strings
    def guess_encoding_magic(bs: bytes):
        e = magic.detect_from_content(bs).encoding
        if e in ("binary", "unknown-8bit"):
            return None
        return e

    # [icu]: https://github.com/unicode-org/icu
    # fails on short strings
    def guess_encoding_icu(bs: bytes):
        try:
            return icu.CharsetDetector(bs).detect().getName()
        except icu.ICUError:
            return None

    if __name__ == "__main__":

        # test

        import random

        bytes_encoding_list = [
            ("ü".encode("latin1"), "latin1"),
            ("üü".encode("latin1"), "latin1"),
            ("üüü".encode("latin1"), "latin1"),
        ]

        for _ in range(10):
            bytes_encoding_list += [
                (random.randbytes(20), None),
            ]

        def test(guess_encoding):
            global bytes_encoding_list
            module_name = guess_encoding._name
            for input_bytes, expected_encoding in bytes_encoding_list:
                assert isinstance(input_bytes, bytes)
                # TODO better...
                guessed_encoding = guess_encoding(input_bytes)
                actual_string = None
                if guessed_encoding:
                    try:
                        actual_string = input_bytes.decode(guessed_encoding)
                    except Exception as exc:
                        if expected_encoding is None:
                            print(f"{module_name}: fail. found wrong encoding {guessed_encoding} in random bytes {input_bytes}")
                            continue
                        else:
                            print(f"{module_name}: FIXME failed to decode bytes: {exc}")
                if expected_encoding is None:
                    # the guessed encoding can be anything -> dont compare encoding
                    if guessed_encoding is None:
                        print(f"{module_name}: ok. found no encoding in random bytes {input_bytes}")
                    else:
                        print(f"{module_name}: ok. found encoding {guessed_encoding} in random bytes {input_bytes} -> string {actual_string!r}")
                else:
                    expected_string = input_bytes.decode(expected_encoding)
                    if actual_string == expected_string:
                        print(f"{module_name}: ok. decoded {actual_string} from {guessed_encoding} bytes {input_bytes}")
                    else:
                        #print(f"{module_name}: fail. actual {actual_string!r} from {guessed_encoding}. expected {expected_string!r} from {expected_encoding} bytes {input_bytes}")
                        print(f"{module_name}: fail. string: {actual_string!r} != {expected_string!r}. encoding: {guessed_encoding} != {expected_encoding}. bytes: {input_bytes}")

        for k in list(globals().keys()):
            if not k.startswith("guess_encoding_"):
                continue
            module_name = k[15:]
            module_found = False
            try:
                module = __import__(module_name)
                globals()[module_name] = module
                module_found = True
            except ModuleNotFoundError as exc:
                print(f"{module_name}: module not found. hint: pip install {module_name}")
                pass
            if module_found:
                guess_encoding = locals()[k]
                guess_encoding._name = module_name
                test(guess_encoding)

Peter Mortensen · Accepted Answer · 2025-03-12 01:22:37Z

You can use the chardet module:

    import chardet

    with open(filepath, "rb") as f:
        data = f.read()

    detector = chardet.UniversalDetector()
    detector.feed(data)  # the detector only has a result once it has been fed data
    detector.close()
    print(detector.result)

Or you can use the chardet3 command in Linux, but it takes some time:

chardet3 fileName

Example:

    chardet3 donnee/dir/donnee.csv
    donnee/dir/donnee.csv: ISO-8859-1 with confidence 0.73

Without a call that passes data to the detector, the Python code does nothing, because the data variable is never used. Something like return chardet.detect(data) gives a result, but it is slow as hell.
