Timeline for How to count all characters including spaces?
Current License: CC BY-SA 3.0
14 events
| when | what | by | license | comment | |
|---|---|---|---|---|---|
| Apr 21, 2015 at 22:03 | comment | added | alexis | Well, it's kind of important! I looked more carefully, and I get the impression that it depends on the Unix "locale" of the particular computer: if it's not set to UTF-8, you'll get completely wrong character counts when there are many non-ASCII characters. Latin-1 and od -c would be a safer catch-all (since characters are one byte each in Latin-1). | |
| Apr 21, 2015 at 14:35 | comment | added | egreg | @alexis I guess it depends on your OS. | |
| Apr 21, 2015 at 14:22 | comment | added | alexis | UTF-8 seems like a strange option for character counting, since it expands characters into multiple bytes. Does wc -m understand UTF-8? (I suspect it might, but it's not explained in its documentation.) | |
| Jul 2, 2012 at 8:37 | comment | added | egreg | @AbhimanyuArora I can only point to this link | |
| Jul 2, 2012 at 8:31 | comment | added | Abhimanyu Arora | @egreg: Thanks. I have Windows XP; can you please tell me whether it applies in this case as well? And is xpdf to be installed via \usepackage? | |
| Jul 2, 2012 at 8:29 | comment | added | egreg | pdftotext is a program that comes with xpdf; how to invoke it depends on the operating system: on Unix systems it's called from the command line. | |
| Jul 2, 2012 at 8:23 | comment | added | Abhimanyu Arora | Hi @egreg: where exactly should this pdftotext... command be typed? | |
| Mar 22, 2012 at 21:56 | history | edited | egreg | CC BY-SA 3.0 | Alternative using catdvi. Set encodings for output file |
| S Mar 22, 2012 at 21:56 | history | suggested | Bob | CC BY-SA 3.0 | Alternative using catdvi. Set encodings for output file |
| Mar 22, 2012 at 21:50 | review | Suggested edits | |||
| S Mar 22, 2012 at 21:56 | |||||
| Mar 22, 2012 at 21:48 | history | bounty awarded | Bob | ||
| Mar 22, 2012 at 21:48 | vote | accept | Bob | ||
| Mar 22, 2012 at 21:48 | comment | added | Bob | Thanks for your answer. This led me to the idea of using catdvi. Using `catdvi -s document.dvi \| wc -m` gives me some good results. pdftotext has some problems reproducing special chars. | |
| Mar 21, 2012 at 17:09 | history | answered | egreg | CC BY-SA 3.0 |
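
The encoding concern raised in the 2015 comments can be illustrated with a minimal Python sketch. It does not call wc; it only models the assumption that a tool working in a single-byte locale such as Latin-1 treats every byte of UTF-8 text as one character. The sample string and the helper name `count_chars_and_bytes` are hypothetical, chosen for illustration.

```python
# Hypothetical sample text: "naïve café", written with escapes so that the
# two accented letters are unambiguously single code points.
text = "na\u00efve caf\u00e9"


def count_chars_and_bytes(s, encoding):
    """Return (number of characters, number of bytes in the given encoding)."""
    return len(s), len(s.encode(encoding))


chars, utf8_bytes = count_chars_and_bytes(text, "utf-8")
_, latin1_bytes = count_chars_and_bytes(text, "latin-1")

# Model of a single-byte locale reading UTF-8 data: every byte becomes one
# "character", so the count is inflated by the multi-byte sequences.
misread_chars = len(text.encode("utf-8").decode("latin-1"))

print(chars)          # 10 characters
print(utf8_bytes)     # 12 bytes: the two accented letters take 2 bytes each
print(latin1_bytes)   # 10 bytes: one byte per character in Latin-1
print(misread_chars)  # 12: UTF-8 bytes counted as if they were characters
```

The character count matches the byte count only when the encoding assumed by the counting tool is the one the file was actually written in, which is why setting the output encoding explicitly matters before counting.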