43
\$\begingroup\$

Every Unicode character has a name, like "LATIN CAPITAL LETTER A". A Unicode character name may contain only uppercase letters, digits, spaces, and the hyphen-minus; the names needed here use no digits.

Write a program that reads a text and outputs the name of each character on its own line. For example, if the input were "Hello, World!", the output would be

LATIN CAPITAL LETTER H
LATIN SMALL LETTER E
LATIN SMALL LETTER L
LATIN SMALL LETTER L
LATIN SMALL LETTER O
COMMA
SPACE
LATIN CAPITAL LETTER W
LATIN SMALL LETTER O
LATIN SMALL LETTER R
LATIN SMALL LETTER L
LATIN SMALL LETTER D
EXCLAMATION MARK

  • Input should come from a file or from user input, not just a string in code.
  • Output should be written to a file or stdout or printed to the screen.
  • Internet access and external libraries are not allowed; all necessary data should be in the code.
  • Assume that the input only contains printable ASCII characters in the Basic Latin code range 32-126. You can ignore a trailing newline.
  • All programming languages allowed. Shortest code in bytes wins.

The official Unicode character names can be found here. Other sources:

This is my first question so I'd appreciate any suggestion if this can be improved.

For the purpose of this challenge the list below shall be normative.

  32 0020 SPACE
! 33 0021 EXCLAMATION MARK
" 34 0022 QUOTATION MARK
# 35 0023 NUMBER SIGN
$ 36 0024 DOLLAR SIGN
% 37 0025 PERCENT SIGN
& 38 0026 AMPERSAND
' 39 0027 APOSTROPHE
( 40 0028 LEFT PARENTHESIS
) 41 0029 RIGHT PARENTHESIS
* 42 002A ASTERISK
+ 43 002B PLUS SIGN
, 44 002C COMMA
- 45 002D HYPHEN-MINUS
. 46 002E FULL STOP
/ 47 002F SOLIDUS
0 48 0030 DIGIT ZERO
1 49 0031 DIGIT ONE
2 50 0032 DIGIT TWO
3 51 0033 DIGIT THREE
4 52 0034 DIGIT FOUR
5 53 0035 DIGIT FIVE
6 54 0036 DIGIT SIX
7 55 0037 DIGIT SEVEN
8 56 0038 DIGIT EIGHT
9 57 0039 DIGIT NINE
: 58 003A COLON
; 59 003B SEMICOLON
< 60 003C LESS-THAN SIGN
= 61 003D EQUALS SIGN
> 62 003E GREATER-THAN SIGN
? 63 003F QUESTION MARK
@ 64 0040 COMMERCIAL AT
A 65 0041 LATIN CAPITAL LETTER A
B 66 0042 LATIN CAPITAL LETTER B
C 67 0043 LATIN CAPITAL LETTER C
D 68 0044 LATIN CAPITAL LETTER D
E 69 0045 LATIN CAPITAL LETTER E
F 70 0046 LATIN CAPITAL LETTER F
G 71 0047 LATIN CAPITAL LETTER G
H 72 0048 LATIN CAPITAL LETTER H
I 73 0049 LATIN CAPITAL LETTER I
J 74 004A LATIN CAPITAL LETTER J
K 75 004B LATIN CAPITAL LETTER K
L 76 004C LATIN CAPITAL LETTER L
M 77 004D LATIN CAPITAL LETTER M
N 78 004E LATIN CAPITAL LETTER N
O 79 004F LATIN CAPITAL LETTER O
P 80 0050 LATIN CAPITAL LETTER P
Q 81 0051 LATIN CAPITAL LETTER Q
R 82 0052 LATIN CAPITAL LETTER R
S 83 0053 LATIN CAPITAL LETTER S
T 84 0054 LATIN CAPITAL LETTER T
U 85 0055 LATIN CAPITAL LETTER U
V 86 0056 LATIN CAPITAL LETTER V
W 87 0057 LATIN CAPITAL LETTER W
X 88 0058 LATIN CAPITAL LETTER X
Y 89 0059 LATIN CAPITAL LETTER Y
Z 90 005A LATIN CAPITAL LETTER Z
[ 91 005B LEFT SQUARE BRACKET
\ 92 005C REVERSE SOLIDUS
] 93 005D RIGHT SQUARE BRACKET
^ 94 005E CIRCUMFLEX ACCENT
_ 95 005F LOW LINE
` 96 0060 GRAVE ACCENT
a 97 0061 LATIN SMALL LETTER A
b 98 0062 LATIN SMALL LETTER B
c 99 0063 LATIN SMALL LETTER C
d 100 0064 LATIN SMALL LETTER D
e 101 0065 LATIN SMALL LETTER E
f 102 0066 LATIN SMALL LETTER F
g 103 0067 LATIN SMALL LETTER G
h 104 0068 LATIN SMALL LETTER H
i 105 0069 LATIN SMALL LETTER I
j 106 006A LATIN SMALL LETTER J
k 107 006B LATIN SMALL LETTER K
l 108 006C LATIN SMALL LETTER L
m 109 006D LATIN SMALL LETTER M
n 110 006E LATIN SMALL LETTER N
o 111 006F LATIN SMALL LETTER O
p 112 0070 LATIN SMALL LETTER P
q 113 0071 LATIN SMALL LETTER Q
r 114 0072 LATIN SMALL LETTER R
s 115 0073 LATIN SMALL LETTER S
t 116 0074 LATIN SMALL LETTER T
u 117 0075 LATIN SMALL LETTER U
v 118 0076 LATIN SMALL LETTER V
w 119 0077 LATIN SMALL LETTER W
x 120 0078 LATIN SMALL LETTER X
y 121 0079 LATIN SMALL LETTER Y
z 122 007A LATIN SMALL LETTER Z
{ 123 007B LEFT CURLY BRACKET
| 124 007C VERTICAL LINE
} 125 007D RIGHT CURLY BRACKET
~ 126 007E TILDE
\$\endgroup\$
5
  • 5
    \$\begingroup\$ Hi, I've gone ahead and edited your question, roll back if you disagree. You don't need more sources of the information, you need one, normative version in the question, to avoid issues with discrepancies. I picked ssec.wisc.edu/~tomw/java/unicode.html#x0000 as it was the most concise. Other than that, +1 \$\endgroup\$ Commented Sep 6, 2015 at 15:21
  • \$\begingroup\$ Thanks for the edit @steveverrill, I was too lazy to do that myself. \$\endgroup\$ Commented Sep 6, 2015 at 15:26
  • \$\begingroup\$ Apparently the values are available as part of Windows, in C:\Windows\System32\getuname.dll. Does this also count as an "external library", even if it's built in to Windows? \$\endgroup\$ Commented Sep 7, 2015 at 12:18
  • 7
    \$\begingroup\$ I just learned the word solidus. \$\endgroup\$ Commented Sep 23, 2015 at 18:20
  • 4
    \$\begingroup\$ I’m voting to close this question because, as discussed in chat here, it's unclear what exactly constitutes an "external library". \$\endgroup\$ Commented Mar 15, 2022 at 15:39

19 Answers

31
\$\begingroup\$

Java - 113 bytes (152 if read from command line)

Edit: removed useless curly brackets.

Edit2: removed unnecessary variable.

Edit3: Instead of Character.getName() I use c.getName().

Edit4: Passing string as command line argument.

With command line argument (113 bytes):

class Z{public static void main(String[]x){for(Character c:x[0].toCharArray())System.out.println(c.getName(c));}} 

With read line (152 bytes):

class Z{public static void main(String[]x){for(Character c:new java.util.Scanner(System.in).nextLine().toCharArray())System.out.println(c.getName(c));}} 

Java has everything needed. I'm sure this could be golfed down.

\$\endgroup\$
5
  • 7
    \$\begingroup\$ Damn! A builtin! In order to make this an interesting challenge, I would consider this to be in non-compliance with "all necessary data should be in the code." Very clever, though. \$\endgroup\$ Commented Sep 6, 2015 at 15:35
  • 1
    \$\begingroup\$ @steveverrill Oh well :) . I've seen another challenge where common lisp did something similar (counting from one to 100 if I remember right). \$\endgroup\$ Commented Sep 6, 2015 at 15:41
  • 6
    \$\begingroup\$ Wow, this time Java has the chance to beat a lot of golfing languages. \$\endgroup\$ Commented Sep 6, 2015 at 19:47
  • 4
    \$\begingroup\$ Alternative Java 8 solution: x[0].chars().forEach(i->System.out.println(Character.getName(i))); This saves 2 chars compared to the command-line argument solution (by replacing the for-loop). \$\endgroup\$ Commented Sep 7, 2015 at 7:05
  • 3
    \$\begingroup\$ Or maybe x[0].chars().map(' '::getName).forEach(System.out::println); \$\endgroup\$ Commented Sep 8, 2015 at 18:26
18
\$\begingroup\$

Python 3, 56 bytes

Uses the built-in function unicodedata.name(), so this may be non-competing. The Java answer does the same, so I thought it was at least worth posting.

from unicodedata import*
for i in input():print(name(i))
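
For readers who want to see what the golfed lines do, here is an ungolfed sketch of the same lookup. The function name `char_names` is made up for illustration and is not part of the answer; like the answer itself, it relies on the standard `unicodedata` module, so it would not satisfy a strict reading of "all necessary data should be in the code".

```python
import unicodedata


def char_names(text):
    """Return the Unicode name of every character in text, in order."""
    # unicodedata.name() looks up the official name of a single character.
    return [unicodedata.name(ch) for ch in text]


# Print one name per line, as the challenge asks.
for name in char_names("Hello, World!"):
    print(name)
```

Note that `unicodedata.name()` raises `ValueError` for characters without a name, such as a newline, which is one reason the golfed version reads the text with `input()`: it drops the trailing newline.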
\$\endgroup\$
4
  • \$\begingroup\$ I've also wanted to post one in python but my java answer was cheaty enough :) . \$\endgroup\$ Commented Sep 6, 2015 at 18:56
  • 1
    \$\begingroup\$ Surely for i in input():print(unicodedata.name(i)) is shorter? \$\endgroup\$ Commented Sep 8, 2015 at 4:01
  • 1
    \$\begingroup\$ @Eric No. You have to import unicodedata, so that's longer. \$\endgroup\$ Commented Sep 8, 2015 at 13:24
  • \$\begingroup\$ Same length using lambda+map \$\endgroup\$ Commented Jul 6, 2021 at 3:04
18
\$\begingroup\$

JavaScript (ES6) 594 618 626

Note: I could save ~30 bytes by compressing the long string with atob/btoa, but UTF-8 characters above '~' are not well accepted by the Stack Exchange post editor. I prefer to keep a running snippet instead.

Edit 8 chars saved thx @Ypnypn

Obvious compression of repeated words. The newline inside backticks is significant and counted.

Test running the snippet in Firefox.

// TEST SUITE
// for testing purpose, redefine alert() to write inside the snippet body
alert=x=>O.innerHTML=x
// for testing purpose, redefine prompt() to have a default text containing all characters
_prompt=prompt
prompt=(i,s)=>{
  for(s='',i=32;i<127;i++)s+=String.fromCharCode(i);
  return _prompt("Insert your message or keep the default",s);
}

// That's the answer code:
z='SPACE/EXCLAMA0QUOTA0NUMBER1DOLLAR1PERCENT1AMPERSAND/APOSTROPHE3242ASTERISK/PLUS1COMMA/HYPHEN-MINUS/FULL STOP/78ZERO8ONE8TWO8THREE8FOUR8FIVE8SIX8SEVEN8EIGHT8NINE/6SEMI6LESS-THAN1EQUALS1GREATER-THAN1QUES0COMMERCIAL AT3SQUARE5REVERSE 7/4SQUARE5CIRCUMFLEX9/LOW LINE/GRAVE93CURLY5VERTICAL LINE/4CURLY5TILDE'.replace(/\d/g,c=>'TION MARK/, SIGN/,PARENTHESIS/,/LEFT ,RIGHT , BRACKET/,COLON/,SOLIDUS,/DIGIT , ACCENT'.split`,`[c]).split`/`,alert([...prompt()].map(c=>(q=c.charCodeAt()-32)<33?z[q]:q<59?'LATIN CAPITAL LETTER '+c:q<65?z[q-26]:q<91?'LATIN SMALL LETTER '+c.toUpperCase():z[q-52]).join`
`)
<pre id=O></pre>

\$\endgroup\$
0
10
\$\begingroup\$

R, 54 bytes 62

library(Unicode)
cat(u_char_name(utf8ToInt(scan(,""))),sep="\n")

Edit: per @flodel's comment, I need to read the input from a connection first, so I had to add scan. This is also probably a non-competing solution under the rules.

Usage

> cat(u_char_name(utf8ToInt(scan(,""))),sep="\n")
1: 'Hello, World!'
2: 
Read 1 item
LATIN CAPITAL LETTER H
LATIN SMALL LETTER E
LATIN SMALL LETTER L
LATIN SMALL LETTER L
LATIN SMALL LETTER O
COMMA
SPACE
LATIN CAPITAL LETTER W
LATIN SMALL LETTER O
LATIN SMALL LETTER R
LATIN SMALL LETTER L
LATIN SMALL LETTER D
EXCLAMATION MARK

You can also wrap it up into a function for more convenient usage

UNI <- function(x)cat(paste0(u_char_name(utf8ToInt(x)),"\n")) 

Then, the usage is just

UNI("Hello, World!") 
\$\endgroup\$
11
  • 1
    \$\begingroup\$ Your byte count is correct :) \$\endgroup\$ Commented Sep 7, 2015 at 9:03
  • 1
    \$\begingroup\$ And welcome to PPCG! :D \$\endgroup\$ Commented Sep 7, 2015 at 9:03
  • \$\begingroup\$ Good for you having a built-in for the task, but the output is not what is requested: a 4-column table instead of a 1-column table. I think you should add some code to obtain the correct output \$\endgroup\$ Commented Sep 7, 2015 at 9:25
  • \$\begingroup\$ @edc65 that's easy to fix, I just thought of it as a bonus. \$\endgroup\$ Commented Sep 7, 2015 at 9:30
  • \$\begingroup\$ @edc65 fixed it. \$\endgroup\$ Commented Sep 7, 2015 at 9:37
7
\$\begingroup\$

C, 644 656

Full program, reading from standard input

Test on Ideone

This is a port of my JavaScript answer to C. The C language is good at manipulating single characters as numbers (no need for .toUpperCase and the like), but it's weaker in string manipulation.

char*s,*p,*q,b[999],*d=b+99,c,*l[129];
main(k){for(k=32,p="/SPACE/EXCLAMAaQUOTAaNUMBERbDOLLARbPERCENTbAMPERSAND/APOSTROPHEdcecASTERISK/PLUSbCOMMA/HYPHEN-MINUS/FULL STOP/hiZEROiONEiTWOiTHREEiFOURiFIVEiSIXiSEVENiEIGHTiNINE/gSEMIgLESSnbEQUALSbGREATERnbQUESaCOMMERCIAL ATdkfREVERSE h/ekfCIRCUMFLEXj/LOWmGRAVEjdlfVERTICALmelfTILDE/";
c=*p;p++)c>96?q?(p=q,q=0):(q=p,p=strchr("aTION MARK/b SIGN/cPARENTHESIS/d/LEFT eRIGHT f BRACKET/gCOLON/hSOLIDUSi/DIGIT j ACCENTkSQUARElCURLYm LINE/n-THANz",c)):c-47?*d++=c:(*d++=0,l[k++]=d);
for(;~(k=getchar());puts(k<65?l[k]:~-k%32<26?b:l[k<97?k-26:k-52]))sprintf(b,"LATIN %s LETTER %c",k<91?"CAPITAL":"SMALL",k&95);}

Less golfed

#include <stdio.h>
#include <string.h>

char *all = "/SPACE/EXCLAMAaQUOTAaNUMBERbDOLLARbPERCENTbAMPERSAND/APOSTROPHEdcecASTERISK/PLUSbCOMMA/HYPHEN-MINUS/FULL STOP/hiZEROiONEiTWOiTHREEiFOURiFIVEiSIXiSEVENiEIGHTiNINE/gSEMIgLESSnbEQUALSbGREATERnbQUESaCOMMERCIAL ATdkfREVERSE h/ekfCIRCUMFLEXj/LOWmGRAVEjdlfVERTICALmelfTILDE/";
char *subs = "aTION MARK/b SIGN/cPARENTHESIS/d/LEFT eRIGHT f BRACKET/gCOLON/hSOLIDUSi/DIGIT j ACCENTkSQUARElCURLYm LINE/n-THANz";

main(int k)
{
    char c, *s, *p, *q=0,
        b[999],    // work buffer
        *d = b+99, // first part of buffer is used later
        *l[129];   // character descriptions (used 32 to 126)

    // Uncompress the descriptions of all chars except letters
    for(k = 32, p = all; c = *p; ++p)
    {
        c >= 'a' // substitution words are marked as lowercase letters
            ? q
                ? (p = q, q = 0)                 // end of substitution: return to the main string
                : (q = p, p = strchr(subs, c))   // start of substitution: jump into subs
            : c != '/'
                ? *d++ = c
                : (*d++ = 0, l[k++] = d); // end char description
    }

    // Scan the input string and print each char description
    for(; (k=getchar()) != -1; )
    {
        sprintf(b,"LATIN %s LETTER %c", k<91 ? "CAPITAL":"SMALL", k & 95);
        puts( k<65 ? l[k] : k<91 ? b : k<97 ? l[k-26] : k<123 ? b : l[k-52]);
    }
}
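
The `k & 95` in the `sprintf` call is the "characters as numbers" trick: in ASCII, a small letter differs from its capital only in bit 5 (value 32), and masking with 95 clears that bit. A small Python sketch of just that step, assuming `k` is the code point of an ASCII letter; `letter_name` is a made-up helper name, not something from the answer:

```python
def letter_name(k):
    """Mirror the sprintf call for one ASCII letter given as a code point."""
    case = "CAPITAL" if k < 91 else "SMALL"   # 'A'..'Z' are 65..90
    # 95 is 0b1011111: clearing bit 5 maps 'a'..'z' onto 'A'..'Z'
    # and leaves capital letters unchanged.
    return "LATIN %s LETTER %c" % (case, k & 95)


print(letter_name(ord("h")))
print(letter_name(ord("W")))
```

The mask only gives a sensible result for letters, which is why the C code builds the string for every input character but prints it only when the character is in one of the two letter ranges.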
\$\endgroup\$
7
\$\begingroup\$

Perl 6, 21 bytes

I did not see a rule specifically against using a built-in method for getting the Unicode names.
(Also, the Java answer, which is currently the highest voted one, does the same.)

.say for get.uninames 
\$\endgroup\$
1
  • 1
    \$\begingroup\$ Perl 6 is weird. I love it, though. \$\endgroup\$ Commented Apr 30, 2016 at 15:30
7
\$\begingroup\$

Perl 5 + -M5.10.0 -MCompress::Zlib -00F, 459 bytes

@d=split/,/,inflateInit->inflate(<DATA>);say$d[-32+ord]for@F
__END__
x.m.or.0.......!.#....I&I.......v......0O.)q.g.*...PH.q....u..k...85.:sgM\.8P..Fu......r..=tR...e......9k.{.2.xUj.P\_}..Qr...!..[..$O.2.%{.bmgl.....|U.tnC...K-?..;.3..1.\s......K>.....T..y.$v.E.....6.....JaM.$.G!.5$...A..5.....y...g....s.....Y0...s..1o...av.............;..)..R..G...8..t...K)k.e.~.J..Gi. .r\.v>.........!L.'..pF. \.f......aAX.6....P..BG8......._.. ....W...s"9.t.... .2...... 

Try it online!

Explanation

Pretty straightforward: @d is built from zlib-compressed data stored after __END__, which is the comma-separated list of raw character names. It is inflated, split on "," and indexed via ord - 32. -00 enables paragraph mode (records end at a blank line), so a single <DATA> read returns the compressed data in one piece instead of stopping at the first newline, provided the data contains no blank line. @F is initialised with the input characters as an array (via -F); the script iterates over it and prints the corresponding entry of @d.
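
The same pack-then-index idea can be sketched in Python with the standard `zlib` module. This is an illustration under assumptions, not a port of the Perl script: the names are taken from `unicodedata` only to have something to compress, and `NAMES`, `PACKED` and `names_for` are made-up identifiers. A real entry would embed the compressed bytes as a literal.

```python
import unicodedata
import zlib

# The 95 names for code points 32..126, joined with commas
# (no name contains a comma, so the separator is unambiguous).
NAMES = [unicodedata.name(chr(cp)) for cp in range(32, 127)]
PACKED = zlib.compress(",".join(NAMES).encode("ascii"), 9)


def names_for(text, blob=PACKED):
    """Inflate the blob, split it on commas and index it by ord(ch) - 32."""
    table = zlib.decompress(blob).decode("ascii").split(",")
    return [table[ord(ch) - 32] for ch in text]


print(len(",".join(NAMES)), "bytes raw,", len(PACKED), "bytes compressed")
print("\n".join(names_for("Hi!")))
```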


Perl 5 + -M5.10.0 -F, 672 bytes

$b=BRACKET;say+(SPACE,<"{EXCLAMATION,QUOTATION} MARK">,<"{NUMBER,DOLLAR,PERCENT} SIGN">,AMPERSAND,APOSTROPHE,<"{LEFT,RIGHT} PARENTHESIS">,ASTERISK,'PLUS SIGN',COMMA,'HYPHEN-MINUS','FULL STOP',$s=SOLIDUS,<"DIGIT {ZERO,ONE,TWO,THREE,FOUR,FIVE,SIX,SEVEN,EIGHT,NINE}">,<{,SEMI}COLON>,<"{LESS-THAN,EQUALS,GREATER-THAN} SIGN">,'QUESTION MARK','COMMERCIAL AT',<"LATIN CAPITAL LETTER {A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z}">,"LEFT SQUARE $b","REVERSE $s","RIGHT SQUARE $b",'CIRCUMFLEX ACCENT','LOW LINE','GRAVE ACCENT',<"LATIN SMALL LETTER {A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z}">,"LEFT CURLY $b",'VERTICAL LINE',"RIGHT CURLY $b",TILDE)[-32+ord]for@F 

Try it online!

Explanation

A (marginally) less boring version of the script without tool-based compression, mostly taking advantage of globs.
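
Perl's glob expands patterns such as `{EXCLAMATION,QUOTATION} MARK` into one string per alternative. Python has no built-in equivalent, so the sketch below uses a small hand-written helper, `expand`, which is hypothetical and only handles simple, non-nested brace groups of the kind used in the script.

```python
import re


def expand(pattern):
    """Expand the first {a,b,...} group, then recurse on the remainder."""
    m = re.search(r"\{([^{}]*)\}", pattern)
    if m is None:
        return [pattern]              # nothing left to expand
    head, tail = pattern[:m.start()], pattern[m.end():]
    return [s for alt in m.group(1).split(",") for s in expand(head + alt + tail)]


print(expand("{EXCLAMATION,QUOTATION} MARK"))
print(expand("{,SEMI}COLON"))
print(expand("DIGIT {ZERO,ONE,TWO}"))
```

An empty alternative, as in `{,SEMI}COLON`, yields the bare suffix, which is how a single pattern can cover both COLON and SEMICOLON.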

\$\endgroup\$
6
\$\begingroup\$

awk - 794 739

 1 LATIN CAPITAL LETTER B 2 LATIN CAPITAL LETTER E 3 LATIN CAPITAL LETTER G 4 LATIN CAPITAL LETTER I 5 LATIN CAPITAL LETTER N 6 LEFT CURLY BRACKET 7 LATIN SMALL LETTER S 8 LATIN SMALL LETTER P 9 LATIN SMALL LETTER L 10 LATIN SMALL LETTER I 11 LATIN SMALL LETTER T 12 LEFT PARENTHESIS 13 QUOTATION MARK 14 LATIN SMALL LETTER I 15 COMMA 16 LATIN CAPITAL LETTER L 17 LATIN SMALL LETTER V 18 COMMA 19 LATIN CAPITAL LETTER S 20 LATIN SMALL LETTER V 21 COMMA 22 LATIN SMALL LETTER A 23 LATIN SMALL LETTER X 24 COMMA 25 CIRCUMFLEX ACCENT 26 LATIN SMALL LETTER X 27 COMMA 28 LEFT SQUARE BRACKET 29 LATIN SMALL LETTER X 30 COMMA 31 LATIN CAPITAL LETTER Q 32 COMMA 33 LATIN CAPITAL LETTER O 34 COMMA 35 LATIN SMALL LETTER T 36 LATIN CAPITAL LETTER K 37 COMMA 38 LATIN SMALL LETTER C 39 LATIN CAPITAL LETTER K 40 COMMA 41 LATIN CAPITAL LETTER V 42 COMMA 43 LATIN SMALL LETTER Q 44 LATIN SMALL LETTER X 45 COMMA 46 LATIN SMALL LETTER G 47 COMMA 48 LATIN CAPITAL LETTER I 49 COMMA 50 LATIN SMALL LETTER W 51 LATIN SMALL LETTER U 52 COMMA 53 LATIN CAPITAL LETTER X 54 COMMA 55 LATIN SMALL LETTER B 56 LATIN SMALL LETTER Y 57 COMMA 58 LATIN SMALL LETTER B 59 LEFT CURLY BRACKET 60 COMMA 61 LATIN SMALL LETTER B 62 LATIN SMALL LETTER Z 63 COMMA 64 LATIN SMALL LETTER B 65 LATIN SMALL LETTER D 66 COMMA 67 LATIN SMALL LETTER B 68 LATIN SMALL LETTER P 69 COMMA 70 LATIN SMALL LETTER B 71 LATIN SMALL LETTER R 72 COMMA 73 LATIN SMALL LETTER B 74 RIGHT CURLY BRACKET 75 COMMA 76 LATIN SMALL LETTER B 77 LATIN SMALL LETTER K 78 COMMA 79 LATIN SMALL LETTER B 80 LATIN SMALL LETTER L 81 COMMA 82 LATIN SMALL LETTER B 83 LATIN SMALL LETTER O 84 COMMA 85 LATIN SMALL LETTER E 86 COMMA 87 LATIN CAPITAL LETTER P 88 COMMA 89 LATIN CAPITAL LETTER R 90 LATIN SMALL LETTER X 91 COMMA 92 LOW LINE 93 LATIN SMALL LETTER X 94 COMMA 95 LATIN CAPITAL LETTER J 96 LATIN SMALL LETTER X 97 COMMA 98 LATIN CAPITAL LETTER U 99 LATIN SMALL LETTER V 100 COMMA 101 LATIN CAPITAL LETTER M 102 TILDE 103 COMMA 104 SPACE 105 LATIN SMALL LETTER T 
106 GRAVE ACCENT 107 LATIN CAPITAL LETTER Y 108 COMMA 109 LATIN CAPITAL LETTER Z 110 LATIN CAPITAL LETTER X 111 COMMA 112 LATIN SMALL LETTER C 113 GRAVE ACCENT 114 LATIN CAPITAL LETTER Y 115 COMMA 116 LATIN CAPITAL LETTER N 117 REVERSE SOLIDUS 118 REVERSE SOLIDUS 119 COMMA 120 VERTICAL LINE 121 LATIN SMALL LETTER S 122 COMMA 123 LATIN SMALL LETTER M 124 REVERSE SOLIDUS 125 REVERSE SOLIDUS 126 COMMA 127 SPACE 128 LATIN SMALL LETTER T 129 LATIN SMALL LETTER H 130 LATIN CAPITAL LETTER Y 131 COMMA 132 LATIN CAPITAL LETTER T 133 LATIN SMALL LETTER S 134 COMMA 135 LATIN SMALL LETTER C 136 LATIN SMALL LETTER H 137 LATIN CAPITAL LETTER Y 138 COMMA 139 LATIN SMALL LETTER F 140 SPACE 141 LATIN CAPITAL LETTER H 142 LATIN CAPITAL LETTER Y 143 LATIN CAPITAL LETTER P 144 LATIN CAPITAL LETTER H 145 LATIN CAPITAL LETTER E 146 LATIN CAPITAL LETTER N 147 HYPHEN-MINUS 148 LATIN CAPITAL LETTER M 149 LATIN CAPITAL LETTER I 150 LATIN CAPITAL LETTER N 151 LATIN CAPITAL LETTER U 152 LATIN CAPITAL LETTER S 153 SPACE 154 LATIN CAPITAL LETTER G 155 LATIN CAPITAL LETTER R 156 LATIN CAPITAL LETTER E 157 LATIN CAPITAL LETTER A 158 LATIN CAPITAL LETTER T 159 LATIN CAPITAL LETTER E 160 LATIN CAPITAL LETTER R 161 HYPHEN-MINUS 162 LATIN CAPITAL LETTER T 163 LATIN CAPITAL LETTER H 164 LATIN CAPITAL LETTER A 165 LATIN CAPITAL LETTER N 166 SPACE 167 LATIN CAPITAL LETTER P 168 LATIN CAPITAL LETTER A 169 LATIN CAPITAL LETTER R 170 LATIN CAPITAL LETTER E 171 LATIN CAPITAL LETTER N 172 LATIN CAPITAL LETTER T 173 LATIN CAPITAL LETTER H 174 LATIN CAPITAL LETTER E 175 LATIN CAPITAL LETTER S 176 LATIN CAPITAL LETTER I 177 LATIN CAPITAL LETTER S 178 SPACE 179 LATIN CAPITAL LETTER E 180 LATIN CAPITAL LETTER X 181 LATIN CAPITAL LETTER C 182 LATIN CAPITAL LETTER L 183 LATIN CAPITAL LETTER A 184 LATIN CAPITAL LETTER M 185 LATIN CAPITAL LETTER A 186 LATIN CAPITAL LETTER T 187 LATIN CAPITAL LETTER I 188 LATIN CAPITAL LETTER O 189 LATIN CAPITAL LETTER N 190 SPACE 191 LATIN CAPITAL LETTER C 192 LATIN CAPITAL LETTER O 
193 LATIN CAPITAL LETTER M 194 LATIN CAPITAL LETTER M 195 LATIN CAPITAL LETTER E 196 LATIN CAPITAL LETTER R 197 LATIN CAPITAL LETTER C 198 LATIN CAPITAL LETTER I 199 LATIN CAPITAL LETTER A 200 LATIN CAPITAL LETTER L 201 SPACE 202 LATIN CAPITAL LETTER C 203 LATIN CAPITAL LETTER I 204 LATIN CAPITAL LETTER R 205 LATIN CAPITAL LETTER C 206 LATIN CAPITAL LETTER U 207 LATIN CAPITAL LETTER M 208 LATIN CAPITAL LETTER F 209 LATIN CAPITAL LETTER L 210 LATIN CAPITAL LETTER E 211 LATIN CAPITAL LETTER X 212 SPACE 213 LATIN CAPITAL LETTER A 214 LATIN CAPITAL LETTER P 215 LATIN CAPITAL LETTER O 216 LATIN CAPITAL LETTER S 217 LATIN CAPITAL LETTER T 218 LATIN CAPITAL LETTER R 219 LATIN CAPITAL LETTER O 220 LATIN CAPITAL LETTER P 221 LATIN CAPITAL LETTER H 222 LATIN CAPITAL LETTER E 223 SPACE 224 LATIN CAPITAL LETTER S 225 LATIN CAPITAL LETTER E 226 LATIN CAPITAL LETTER M 227 LATIN CAPITAL LETTER I 228 LATIN CAPITAL LETTER C 229 LATIN CAPITAL LETTER O 230 LATIN CAPITAL LETTER L 231 LATIN CAPITAL LETTER O 232 LATIN CAPITAL LETTER N 233 SPACE 234 LATIN CAPITAL LETTER A 235 LATIN CAPITAL LETTER M 236 LATIN CAPITAL LETTER P 237 LATIN CAPITAL LETTER E 238 LATIN CAPITAL LETTER R 239 LATIN CAPITAL LETTER S 240 LATIN CAPITAL LETTER A 241 LATIN CAPITAL LETTER N 242 LATIN CAPITAL LETTER D 243 SPACE 244 LATIN CAPITAL LETTER L 245 LATIN CAPITAL LETTER E 246 LATIN CAPITAL LETTER S 247 LATIN CAPITAL LETTER S 248 HYPHEN-MINUS 249 LATIN CAPITAL LETTER T 250 LATIN CAPITAL LETTER H 251 LATIN CAPITAL LETTER A 252 LATIN CAPITAL LETTER N 253 SPACE 254 LATIN CAPITAL LETTER Q 255 LATIN CAPITAL LETTER U 256 LATIN CAPITAL LETTER O 257 LATIN CAPITAL LETTER T 258 LATIN CAPITAL LETTER A 259 LATIN CAPITAL LETTER T 260 LATIN CAPITAL LETTER I 261 LATIN CAPITAL LETTER O 262 LATIN CAPITAL LETTER N 263 SPACE 264 LATIN CAPITAL LETTER V 265 LATIN CAPITAL LETTER E 266 LATIN CAPITAL LETTER R 267 LATIN CAPITAL LETTER T 268 LATIN CAPITAL LETTER I 269 LATIN CAPITAL LETTER C 270 LATIN CAPITAL LETTER A 271 LATIN CAPITAL 
LETTER L 272 SPACE 273 LATIN CAPITAL LETTER Q 274 LATIN CAPITAL LETTER U 275 LATIN CAPITAL LETTER E 276 LATIN CAPITAL LETTER S 277 LATIN CAPITAL LETTER T 278 LATIN CAPITAL LETTER I 279 LATIN CAPITAL LETTER O 280 LATIN CAPITAL LETTER N 281 SPACE 282 LATIN CAPITAL LETTER A 283 LATIN CAPITAL LETTER S 284 LATIN CAPITAL LETTER T 285 LATIN CAPITAL LETTER E 286 LATIN CAPITAL LETTER R 287 LATIN CAPITAL LETTER I 288 LATIN CAPITAL LETTER S 289 LATIN CAPITAL LETTER K 290 SPACE 291 LATIN CAPITAL LETTER C 292 LATIN CAPITAL LETTER A 293 LATIN CAPITAL LETTER P 294 LATIN CAPITAL LETTER I 295 LATIN CAPITAL LETTER T 296 LATIN CAPITAL LETTER A 297 LATIN CAPITAL LETTER L 298 SPACE 299 LATIN CAPITAL LETTER S 300 LATIN CAPITAL LETTER O 301 LATIN CAPITAL LETTER L 302 LATIN CAPITAL LETTER I 303 LATIN CAPITAL LETTER D 304 LATIN CAPITAL LETTER U 305 LATIN CAPITAL LETTER S 306 SPACE 307 LATIN CAPITAL LETTER B 308 LATIN CAPITAL LETTER R 309 LATIN CAPITAL LETTER A 310 LATIN CAPITAL LETTER C 311 LATIN CAPITAL LETTER K 312 LATIN CAPITAL LETTER E 313 LATIN CAPITAL LETTER T 314 SPACE 315 LATIN CAPITAL LETTER R 316 LATIN CAPITAL LETTER E 317 LATIN CAPITAL LETTER V 318 LATIN CAPITAL LETTER E 319 LATIN CAPITAL LETTER R 320 LATIN CAPITAL LETTER S 321 LATIN CAPITAL LETTER E 322 SPACE 323 LATIN CAPITAL LETTER P 324 LATIN CAPITAL LETTER E 325 LATIN CAPITAL LETTER R 326 LATIN CAPITAL LETTER C 327 LATIN CAPITAL LETTER E 328 LATIN CAPITAL LETTER N 329 LATIN CAPITAL LETTER T 330 SPACE 331 LATIN CAPITAL LETTER A 332 LATIN CAPITAL LETTER C 333 LATIN CAPITAL LETTER C 334 LATIN CAPITAL LETTER E 335 LATIN CAPITAL LETTER N 336 LATIN CAPITAL LETTER T 337 SPACE 338 LATIN CAPITAL LETTER L 339 LATIN CAPITAL LETTER E 340 LATIN CAPITAL LETTER T 341 LATIN CAPITAL LETTER T 342 LATIN CAPITAL LETTER E 343 LATIN CAPITAL LETTER R 344 SPACE 345 LATIN CAPITAL LETTER D 346 LATIN CAPITAL LETTER O 347 LATIN CAPITAL LETTER L 348 LATIN CAPITAL LETTER L 349 LATIN CAPITAL LETTER A 350 LATIN CAPITAL LETTER R 351 SPACE 352 LATIN CAPITAL 
LETTER E 353 LATIN CAPITAL LETTER Q 354 LATIN CAPITAL LETTER U 355 LATIN CAPITAL LETTER A 356 LATIN CAPITAL LETTER L 357 LATIN CAPITAL LETTER S 358 SPACE 359 LATIN CAPITAL LETTER S 360 LATIN CAPITAL LETTER Q 361 LATIN CAPITAL LETTER U 362 LATIN CAPITAL LETTER A 363 LATIN CAPITAL LETTER R 364 LATIN CAPITAL LETTER E 365 SPACE 366 LATIN CAPITAL LETTER N 367 LATIN CAPITAL LETTER U 368 LATIN CAPITAL LETTER M 369 LATIN CAPITAL LETTER B 370 LATIN CAPITAL LETTER E 371 LATIN CAPITAL LETTER R 372 SPACE 373 LATIN CAPITAL LETTER D 374 LATIN CAPITAL LETTER I 375 LATIN CAPITAL LETTER G 376 LATIN CAPITAL LETTER I 377 LATIN CAPITAL LETTER T 378 SPACE 379 LATIN CAPITAL LETTER R 380 LATIN CAPITAL LETTER I 381 LATIN CAPITAL LETTER G 382 LATIN CAPITAL LETTER H 383 LATIN CAPITAL LETTER T 384 SPACE 385 LATIN CAPITAL LETTER T 386 LATIN CAPITAL LETTER H 387 LATIN CAPITAL LETTER R 388 LATIN CAPITAL LETTER E 389 LATIN CAPITAL LETTER E 390 SPACE 391 LATIN CAPITAL LETTER C 392 LATIN CAPITAL LETTER O 393 LATIN CAPITAL LETTER L 394 LATIN CAPITAL LETTER O 395 LATIN CAPITAL LETTER N 396 SPACE 397 LATIN CAPITAL LETTER T 398 LATIN CAPITAL LETTER I 399 LATIN CAPITAL LETTER L 400 LATIN CAPITAL LETTER D 401 LATIN CAPITAL LETTER E 402 SPACE 403 LATIN CAPITAL LETTER C 404 LATIN CAPITAL LETTER O 405 LATIN CAPITAL LETTER M 406 LATIN CAPITAL LETTER M 407 LATIN CAPITAL LETTER A 408 SPACE 409 LATIN CAPITAL LETTER C 410 LATIN CAPITAL LETTER U 411 LATIN CAPITAL LETTER R 412 LATIN CAPITAL LETTER L 413 LATIN CAPITAL LETTER Y 414 SPACE 415 LATIN CAPITAL LETTER S 416 LATIN CAPITAL LETTER P 417 LATIN CAPITAL LETTER A 418 LATIN CAPITAL LETTER C 419 LATIN CAPITAL LETTER E 420 SPACE 421 LATIN CAPITAL LETTER S 422 LATIN CAPITAL LETTER M 423 LATIN CAPITAL LETTER A 424 LATIN CAPITAL LETTER L 425 LATIN CAPITAL LETTER L 426 SPACE 427 LATIN CAPITAL LETTER S 428 LATIN CAPITAL LETTER E 429 LATIN CAPITAL LETTER V 430 LATIN CAPITAL LETTER E 431 LATIN CAPITAL LETTER N 432 SPACE 433 LATIN CAPITAL LETTER E 434 LATIN CAPITAL LETTER 
I 435 LATIN CAPITAL LETTER G 436 LATIN CAPITAL LETTER H 437 LATIN CAPITAL LETTER T 438 SPACE 439 LATIN CAPITAL LETTER G 440 LATIN CAPITAL LETTER R 441 LATIN CAPITAL LETTER A 442 LATIN CAPITAL LETTER V 443 LATIN CAPITAL LETTER E 444 SPACE 445 LATIN CAPITAL LETTER L 446 LATIN CAPITAL LETTER A 447 LATIN CAPITAL LETTER T 448 LATIN CAPITAL LETTER I 449 LATIN CAPITAL LETTER N 450 SPACE 451 LATIN CAPITAL LETTER N 452 LATIN CAPITAL LETTER I 453 LATIN CAPITAL LETTER N 454 LATIN CAPITAL LETTER E 455 SPACE 456 LATIN CAPITAL LETTER F 457 LATIN CAPITAL LETTER O 458 LATIN CAPITAL LETTER U 459 LATIN CAPITAL LETTER R 460 SPACE 461 LATIN CAPITAL LETTER P 462 LATIN CAPITAL LETTER L 463 LATIN CAPITAL LETTER U 464 LATIN CAPITAL LETTER S 465 SPACE 466 LATIN CAPITAL LETTER F 467 LATIN CAPITAL LETTER I 468 LATIN CAPITAL LETTER V 469 LATIN CAPITAL LETTER E 470 SPACE 471 LATIN CAPITAL LETTER L 472 LATIN CAPITAL LETTER I 473 LATIN CAPITAL LETTER N 474 LATIN CAPITAL LETTER E 475 SPACE 476 LATIN CAPITAL LETTER L 477 LATIN CAPITAL LETTER E 478 LATIN CAPITAL LETTER F 479 LATIN CAPITAL LETTER T 480 SPACE 481 LATIN CAPITAL LETTER S 482 LATIN CAPITAL LETTER T 483 LATIN CAPITAL LETTER O 484 LATIN CAPITAL LETTER P 485 SPACE 486 LATIN CAPITAL LETTER M 487 LATIN CAPITAL LETTER A 488 LATIN CAPITAL LETTER R 489 LATIN CAPITAL LETTER K 490 SPACE 491 LATIN CAPITAL LETTER F 492 LATIN CAPITAL LETTER U 493 LATIN CAPITAL LETTER L 494 LATIN CAPITAL LETTER L 495 SPACE 496 LATIN CAPITAL LETTER S 497 LATIN CAPITAL LETTER I 498 LATIN CAPITAL LETTER G 499 LATIN CAPITAL LETTER N 500 SPACE 501 LATIN CAPITAL LETTER Z 502 LATIN CAPITAL LETTER E 503 LATIN CAPITAL LETTER R 504 LATIN CAPITAL LETTER O 505 SPACE 506 LATIN CAPITAL LETTER T 507 LATIN CAPITAL LETTER W 508 LATIN CAPITAL LETTER O 509 SPACE 510 LATIN CAPITAL LETTER O 511 LATIN CAPITAL LETTER N 512 LATIN CAPITAL LETTER E 513 SPACE 514 LATIN CAPITAL LETTER L 515 LATIN CAPITAL LETTER O 516 LATIN CAPITAL LETTER W 517 SPACE 518 LATIN CAPITAL LETTER S 519 LATIN CAPITAL 
LETTER I 520 LATIN CAPITAL LETTER X 521 SPACE 522 LATIN CAPITAL LETTER A 523 LATIN CAPITAL LETTER T 524 QUOTATION MARK 525 COMMA 526 LATIN SMALL LETTER W 527 RIGHT PARENTHESIS 528 SEMICOLON 529 LATIN SMALL LETTER Y 530 EQUALS SIGN 531 LATIN SMALL LETTER W 532 LEFT SQUARE BRACKET 533 DIGIT TWO 534 RIGHT SQUARE BRACKET 535 SEMICOLON 536 LATIN SMALL LETTER F 537 LATIN SMALL LETTER O 538 LATIN SMALL LETTER R 539 LEFT PARENTHESIS 540 LATIN SMALL LETTER X 541 EQUALS SIGN 542 LATIN SMALL LETTER W 543 LEFT SQUARE BRACKET 544 DIGIT ONE 545 RIGHT SQUARE BRACKET 546 SEMICOLON 547 LATIN SMALL LETTER I 548 PLUS SIGN 549 PLUS SIGN 550 LESS-THAN SIGN 551 DIGIT TWO 552 DIGIT SIX 553 SEMICOLON 554 LATIN SMALL LETTER X 555 EQUALS SIGN 556 LATIN SMALL LETTER X 557 QUOTATION MARK 558 LATIN SMALL LETTER N 559 LATIN CAPITAL LETTER W 560 RIGHT SQUARE BRACKET 561 COMMA 562 QUOTATION MARK 563 RIGHT PARENTHESIS 564 LATIN SMALL LETTER Y 565 EQUALS SIGN 566 LATIN SMALL LETTER Y 567 QUOTATION MARK 568 LATIN SMALL LETTER N 569 LATIN SMALL LETTER J 570 RIGHT SQUARE BRACKET 571 COMMA 572 QUOTATION MARK 573 SEMICOLON 574 LATIN SMALL LETTER F 575 LATIN SMALL LETTER O 576 LATIN SMALL LETTER R 577 LEFT PARENTHESIS 578 LATIN SMALL LETTER S 579 LATIN SMALL LETTER P 580 LATIN SMALL LETTER L 581 LATIN SMALL LETTER I 582 LATIN SMALL LETTER T 583 LEFT PARENTHESIS 584 LATIN SMALL LETTER X 585 SPACE 586 LATIN SMALL LETTER Y 587 SPACE 588 LATIN SMALL LETTER W 589 LEFT SQUARE BRACKET 590 DIGIT THREE 591 RIGHT SQUARE BRACKET 592 COMMA 593 LATIN SMALL LETTER B 594 COMMA 595 QUOTATION MARK 596 COMMA 597 QUOTATION MARK 598 RIGHT PARENTHESIS 599 SEMICOLON 600 LATIN SMALL LETTER J 601 PLUS SIGN 602 PLUS SIGN 603 LESS-THAN SIGN 604 DIGIT ONE 605 DIGIT TWO 606 DIGIT SIX 607 SEMICOLON 608 LATIN CAPITAL LETTER F 609 LATIN CAPITAL LETTER S 610 EQUALS SIGN 611 LOW LINE 612 RIGHT PARENTHESIS 613 LATIN SMALL LETTER D 614 LEFT SQUARE BRACKET 615 LATIN SMALL LETTER S 616 LATIN SMALL LETTER P 617 LATIN SMALL LETTER R 618 LATIN 
SMALL LETTER I 619 LATIN SMALL LETTER N 620 LATIN SMALL LETTER T 621 LATIN SMALL LETTER F 622 LEFT PARENTHESIS 623 QUOTATION MARK 624 PERCENT SIGN 625 LATIN SMALL LETTER C 626 QUOTATION MARK 627 COMMA 628 LATIN SMALL LETTER J 629 RIGHT PARENTHESIS 630 RIGHT SQUARE BRACKET 631 EQUALS SIGN 632 LATIN SMALL LETTER J 633 RIGHT CURLY BRACKET 634 LEFT CURLY BRACKET 635 LATIN SMALL LETTER F 636 LATIN SMALL LETTER O 637 LATIN SMALL LETTER R 638 LEFT PARENTHESIS 639 LATIN SMALL LETTER K 640 EQUALS SIGN 641 DIGIT ZERO 642 SEMICOLON 643 LATIN SMALL LETTER K 644 PLUS SIGN 645 PLUS SIGN 646 LESS-THAN SIGN 647 LATIN CAPITAL LETTER N 648 LATIN CAPITAL LETTER F 649 SEMICOLON 650 LATIN SMALL LETTER P 651 LATIN SMALL LETTER R 652 LATIN SMALL LETTER I 653 LATIN SMALL LETTER N 654 LATIN SMALL LETTER T 655 SPACE 656 LATIN SMALL LETTER I 657 EQUALS SIGN 658 LOW LINE 659 RIGHT PARENTHESIS 660 LATIN SMALL LETTER W 661 LATIN SMALL LETTER H 662 LATIN SMALL LETTER I 663 LATIN SMALL LETTER L 664 LATIN SMALL LETTER E 665 LEFT PARENTHESIS 666 LATIN SMALL LETTER I 667 PLUS SIGN 668 PLUS SIGN 669 LESS-THAN SIGN 670 LATIN SMALL LETTER S 671 LATIN SMALL LETTER P 672 LATIN SMALL LETTER L 673 LATIN SMALL LETTER I 674 LATIN SMALL LETTER T 675 LEFT PARENTHESIS 676 LATIN SMALL LETTER B 677 LEFT SQUARE BRACKET 678 LATIN SMALL LETTER D 679 LEFT SQUARE BRACKET 680 DOLLAR SIGN 681 LATIN SMALL LETTER K 682 RIGHT SQUARE BRACKET 683 HYPHEN-MINUS 684 DIGIT THREE 685 DIGIT ONE 686 RIGHT SQUARE BRACKET 687 COMMA 688 LATIN SMALL LETTER Q 689 RIGHT PARENTHESIS 690 RIGHT PARENTHESIS 691 LATIN SMALL LETTER P 692 LATIN SMALL LETTER R 693 LATIN SMALL LETTER I 694 LATIN SMALL LETTER N 695 LATIN SMALL LETTER T 696 LATIN SMALL LETTER F 697 LEFT PARENTHESIS 698 LATIN SMALL LETTER Z 699 EQUALS SIGN 700 LATIN SMALL LETTER W 701 LEFT SQUARE BRACKET 702 LATIN SMALL LETTER D 703 LEFT SQUARE BRACKET 704 LATIN SMALL LETTER Q 705 LEFT SQUARE BRACKET 706 LATIN SMALL LETTER I 707 RIGHT SQUARE BRACKET 708 RIGHT SQUARE BRACKET 709 
HYPHEN-MINUS 710 DIGIT SIX 711 DIGIT NINE 712 RIGHT SQUARE BRACKET 713 RIGHT PARENTHESIS 714 QUOTATION MARK 715 SPACE 716 QUOTATION MARK 717 LEFT PARENTHESIS 718 LATIN SMALL LETTER Z 719 TILDE 720 SOLIDUS 721 LATIN CAPITAL LETTER T 722 LATIN CAPITAL LETTER T 723 SOLIDUS 724 QUESTION MARK 725 LATIN SMALL LETTER T 726 LATIN SMALL LETTER O 727 LATIN SMALL LETTER U 728 LATIN SMALL LETTER P 729 LATIN SMALL LETTER P 730 LATIN SMALL LETTER E 731 LATIN SMALL LETTER R 732 LEFT PARENTHESIS 733 DOLLAR SIGN 734 LATIN SMALL LETTER K 735 RIGHT PARENTHESIS 736 COLON 737 LOW LINE 738 RIGHT PARENTHESIS 739 RIGHT CURLY BRACKET 

Just kidding ;D

BEGIN{split("i,Lv,Sv,ax,^x,[x,Q,O,tK,cK,V,qx,g,I,wu,X,by,b{,bz,bd,bp,br,b},bk,bl,bo,e,P,Rx,_x,Jx,Uv,M~, t`Y,ZX,c`Y,N\\,|s,m\\, thY,Ts,chY,f HYPHEN-MINUS GREATER-THAN PARENTHESIS EXCLAMATION COMMERCIAL CIRCUMFLEX APOSTROPHE SEMICOLON AMPERSAND LESS-THAN QUOTATION VERTICAL QUESTION ASTERISK CAPITAL SOLIDUS BRACKET REVERSE PERCENT ACCENT LETTER DOLLAR EQUALS SQUARE NUMBER DIGIT RIGHT THREE COLON TILDE COMMA CURLY SPACE SMALL SEVEN EIGHT GRAVE LATIN NINE FOUR PLUS FIVE LINE LEFT STOP MARK FULL SIGN ZERO TWO ONE LOW SIX AT",w);x=w[1];for(y=w[2];C++<26;x=x"nW],")y=y"nj],";for(split(x y w[3],b,",");j++<126;FS=_)d[sprintf("%c",j)]=j}{for(k=0;k++<NF;print i=_)while(i++<split(b[d[$k]-31],q))printf(z=w[d[q[i]]-69])" "(z~/TT/?toupper($k):_)} 

Works with stdin/stdout.

More "readable":

    BEGIN{
    # This string (508 bytes) holds a representation of the character names in
    # the right order, plus a list of the used words.
    split("i,Lv,Sv,ax,^x,[x,Q,O,tK,cK,V,qx,g,I,wu,X,by,b{,bz,bd,bp,br,b},bk,bl,bo,e,P,Rx,_x,Jx,Uv,M~, t`Y,ZX,c`Y,N\\,|s,m\\, thY,Ts,chY,f HYPHEN-MINUS GREATER-THAN PARENTHESIS EXCLAMATION COMMERCIAL CIRCUMFLEX APOSTROPHE SEMICOLON AMPERSAND LESS-THAN QUOTATION VERTICAL QUESTION ASTERISK CAPITAL SOLIDUS BRACKET REVERSE PERCENT ACCENT LETTER DOLLAR EQUALS SQUARE NUMBER DIGIT RIGHT THREE COLON TILDE COMMA CURLY SPACE SMALL SEVEN EIGHT GRAVE LATIN NINE FOUR PLUS FIVE LINE LEFT STOP MARK FULL SIGN ZERO TWO ONE LOW SIX AT",w);

    # Since the letters each appear 26 times I construct that part at runtime.
    # The array b will hold the coded combinations of which words need to
    # be printed for each input character.
    x=w[1];
    for(y=w[2];C++<26;x=x"nW],")
        y=y"nj],";

    # The array d is an ASCIICodeFromChar function replacement.
    # I set the field separator to empty, so each character of the input is
    # an input field. That's why using a BEGIN part was mandatory.
    for(split(x y w[3],b,",");j++<126;FS=_)
        d[sprintf("%c",j)]=j
    }

    # Here I go through the element of b that matches the input and print
    # the requested words, using the input to produce a capital letter if
    # needed. I excluded these from the word list to save another 26 bytes
    {
    for(k=0;k++<NF;print i=_)
        while(i++<split(b[d[$k]-31],q))
            printf(z=w[d[q[i]]-69])" "(z~/TT/?toupper($k):_)
    }
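
The idea described in the comments above (a shared word list, a short code per character that says which words to print, and letters handled at runtime by appending the uppercased input) can be sketched in Python. This is only an illustration of the principle, not the encoding the awk program actually uses: the names `WORDS`, `CODES` and `name_of` are made up for this sketch, and the table covers just a handful of characters.

```python
# Shared vocabulary: every character name is a sequence of these words.
WORDS = ["LATIN", "CAPITAL", "SMALL", "LETTER", "DIGIT", "SIGN",
         "PLUS", "EQUALS", "COMMA", "SPACE", "ZERO", "ONE"]
W = {word: index for index, word in enumerate(WORDS)}

# Per-character word codes for a few non-letters (illustrative subset only).
CODES = {
    " ": [W["SPACE"]],
    ",": [W["COMMA"]],
    "+": [W["PLUS"], W["SIGN"]],
    "=": [W["EQUALS"], W["SIGN"]],
    "0": [W["DIGIT"], W["ZERO"]],
    "1": [W["DIGIT"], W["ONE"]],
}

def name_of(ch):
    """Build the name of ch from word indices (hypothetical helper)."""
    if ch.isascii() and ch.isalpha():
        # Letters are not stored: the code is generated at runtime and the
        # final word is the input character itself, uppercased.
        case = "CAPITAL" if ch.isupper() else "SMALL"
        codes = [W["LATIN"], W[case], W["LETTER"]]
        return " ".join(WORDS[i] for i in codes) + " " + ch.upper()
    return " ".join(WORDS[i] for i in CODES[ch])
```

Storing each word once and referring to it by index is what lets the golfed version avoid repeating "SIGN", "BRACKET" and similar words in full.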
\$\endgroup\$
6
\$\begingroup\$

C++11, 739 bytes

    #include<iostream>
    #define D,"DIGIT "
    #define G" SIGN",
    int main(){std::string a=" BRACKET",s="SQUARE"+a,c="CURLY"+a,t="TION MARK",p="PARENTHESIS",l="LEFT ",r="RIGHT ",x="LATIN ",y="L LETTER ",z[]{"SPACE","EXCLAMA"+t,"QUOTA"+t,"NUMBER"G"DOLLAR"G"PERCENT"G"AMPERSAND","APOSTROPHE",l+p,r+p,"ASTERISK","PLUS"G"COMMA","HYPHEN-MINUS","FULL STOP","SOLIDUS"D"ZERO"D"ONE"D"TWO"D"THREE"D"FOUR"D"FIVE"D"SIX"D"SEVEN"D"EIGHT"D"NINE","COLON","SEMICOLON","LESS-THAN"G"EQUALS"G"GREATER-THAN"G"QUES"+t,"COMMERCIAL AT",l+s,"REVERSE SOLIDUS",r+s,"CIRCUMFLEX ACCENT","LOW LINE","GRAVE ACCENT",l+c,"VERTICAL LINE",r+c,"TILDE"};getline(std::cin,s);for(char c:s)std::cout<<(c<65?z[c-32]:c<91?x+"CAPITA"+y+c:(c-=32,c<65?z[c-26]:c<91?x+"SMAL"+y+c:z[c-52]))+"\n";}

Based on sweerpotato's solution, but modified heavily.

\$\endgroup\$
1
  • \$\begingroup\$ Nicely done :~)! \$\endgroup\$ Commented Sep 8, 2015 at 19:36
5
\$\begingroup\$

Common Lisp (SBCL), 52 79

    (map()(lambda(y)(format t"~:@(~A~)~%"(substitute #\  #\_(char-name y))))(read))

This is built-in and implementation-dependent, so you may want to ignore it when choosing the accepted answer. This is not enough to beat Python, unfortunately. The updated version conforms to the expected output (I have to replace underscores by spaces).

Example

    CL-USER> (map()(lambda(y)(format t"~:@(~A~)~%"(substitute #\  #\_(char-name y))))(read))
    "(λ(r)(* 2 ᴨ r))"
    LEFT PARENTHESIS
    GREEK SMALL LETTER LAMDA
    LEFT PARENTHESIS
    LATIN SMALL LETTER R
    RIGHT PARENTHESIS
    LEFT PARENTHESIS
    ASTERISK
    SPACE
    DIGIT TWO
    SPACE
    GREEK LETTER SMALL CAPITAL PI
    SPACE
    LATIN SMALL LETTER R
    RIGHT PARENTHESIS
    RIGHT PARENTHESIS
\$\endgroup\$
5
\$\begingroup\$

C++14, 1043 1000 998 996 972 bytes

Grotesque solution in C++14:

    #include<iostream>
    #include<map>
    #define b cout
    #define d string
    #define e },{
    using namespace std;char l='\n';d s[]{"DIGIT ","LATIN CAPITAL LETTER ","LATIN SMALL LETTER "};map<char, d> m{{' ',"SPACE"e'!',"EXCLAMATION MARK"e'\"',"QUOTATION MARK"e'#',"NUMBER SIGN"e'$',"DOLLAR SIGN"e'%',"PERCENT SIGN"e'&',"AMPERSAND"e'\'',"APOSTROPHE"e'(',"LEFT PARENTHESIS"e')',"RIGHT PARENTHESIS"e'*',"ASTERISK"e'+',"PLUS SIGN"e',',"COMMA"e'-',"HYPHEN-MINUS"e'.',"FULL STOP"e'/',"SOLIDUS"e':',"COLON"e';',"SEMICOLON"e'<',"LESS-THAN SIGN"e'=',"EQUALS SIGN"e'>',"GREATER-THAN SIGN"e'?',"QUESTION MARK"e'@',"COMMERCIAL AT"e'[',"LEFT SQUARE BRACKET"e'\\',"REVERSE SOLIDUS"e']',"RIGHT SQUARE BRACKET"e'^',"CIRCUMFLEX ACCENT"e'_',"LOW LINE"e'`',"GRAVE ACCENT"e'{',"LEFT CURLY BRACKET"e'|',"VERTICAL LINE"e'}',"RIGHT CURLY BRACKET"e'~',"TILDE"}};int main(){d str;getline(cin,str);for(char c:str){islower(c)?b<<s[2]<<(char)(c-32):isupper(c)?b<<s[1]<<c:isdigit(c)?b<<*s<<c:b<<m.at(c);b<<l;}}

Thanks to kirbyfan64sos for golfing off two bytes

\$\endgroup\$
2
  • \$\begingroup\$ Can you do *s instead of s[0]? \$\endgroup\$ Commented Sep 7, 2015 at 23:41
  • \$\begingroup\$ Sure can! Totally missed that \$\endgroup\$ Commented Sep 8, 2015 at 19:04
4
\$\begingroup\$

Pyth, 41

$from unicodedata import name as neg$Vz_N 

Uses the same builtin as mbomb007's Python answer. Note that this cannot be executed online because the $ operator is unsafe.
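
For reference, the Python builtin in question is `unicodedata.name` from the standard library. A minimal ungolfed sketch of what the embedded Python does (the helper name `names` is made up for illustration):

```python
import unicodedata

def names(text):
    """Return the Unicode name of every character in text."""
    # unicodedata.name raises ValueError for characters without a name
    # (for example control characters), so this assumes printable input.
    return [unicodedata.name(ch) for ch in text]

print("\n".join(names("Hi!")))
```

Since the challenge restricts input to printable ASCII 32-126, every input character has a name and the `ValueError` case does not come up.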

\$\endgroup\$
4
\$\begingroup\$

CJam, 517

l{i32-["SPACE""EXCLAMA""TION MARK":T+"QUOTA"T+"NUMBER DOLLAR PERCENT"{S/" SIGN"am*~}:H~"AMPERSAND""APOSTROPHE""LEFT PARENTHESIS":L"RIGHT ":R1$5>+"ASTERISK""PLUS"H"COMMA""HYPHEN-MINUS""FULL STOP""SOLIDUS":D"DIGIT "a"ZERO ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE"S/m*~"COLON""SEMI"1$+"LESS-THAN EQUALS GREATER-THAN"H"QUES"T+"COMMERCIAL AT""CAPITA"{["LATIN "\"L LETTER "]a'[,65>m*~L5<}:Z~"SQUARE BRACKET":Q+"REVERSE "D+RQ+"CIRCUMFLEX ACCENT""LOW LINE""GRAVE"2$A>+"SMAL"Z"CURLY"33$B>+:C+"VERTICAL LINE"RC+"TILDE"]=N}/ 

Online version

I have tried different solutions but simply storing all the names in a huge array seems most efficient.
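
For comparison, here is an ungolfed Python sketch of the same "store the names in a table" approach. It is not a translation of the CJam program: the names `PUNCT`, `DIGITS` and `char_name` are chosen for this example, and the only compression it applies is generating the digit and letter names by rule.

```python
# Names of the 33 printable ASCII characters that are neither digits nor letters,
# taken from the normative list in the question.
PUNCT = {
    " ": "SPACE", "!": "EXCLAMATION MARK", '"': "QUOTATION MARK",
    "#": "NUMBER SIGN", "$": "DOLLAR SIGN", "%": "PERCENT SIGN",
    "&": "AMPERSAND", "'": "APOSTROPHE", "(": "LEFT PARENTHESIS",
    ")": "RIGHT PARENTHESIS", "*": "ASTERISK", "+": "PLUS SIGN",
    ",": "COMMA", "-": "HYPHEN-MINUS", ".": "FULL STOP", "/": "SOLIDUS",
    ":": "COLON", ";": "SEMICOLON", "<": "LESS-THAN SIGN",
    "=": "EQUALS SIGN", ">": "GREATER-THAN SIGN", "?": "QUESTION MARK",
    "@": "COMMERCIAL AT", "[": "LEFT SQUARE BRACKET",
    "\\": "REVERSE SOLIDUS", "]": "RIGHT SQUARE BRACKET",
    "^": "CIRCUMFLEX ACCENT", "_": "LOW LINE", "`": "GRAVE ACCENT",
    "{": "LEFT CURLY BRACKET", "|": "VERTICAL LINE",
    "}": "RIGHT CURLY BRACKET", "~": "TILDE",
}
DIGITS = "ZERO ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE".split()

def char_name(ch):
    """Name of one printable ASCII character (code points 32-126)."""
    if "0" <= ch <= "9":
        # Digits use the spelled-out word, e.g. DIGIT ZERO.
        return "DIGIT " + DIGITS[int(ch)]
    if "A" <= ch <= "Z":
        return "LATIN CAPITAL LETTER " + ch
    if "a" <= ch <= "z":
        return "LATIN SMALL LETTER " + ch.upper()
    return PUNCT[ch]
```

Calling `char_name` on each character of a line read from stdin and printing the results one per line would satisfy the input and output rules of the challenge.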

This is my first real CJam program by the way.

\$\endgroup\$
4
\$\begingroup\$

Clojure, 56 bytes

(doseq[c(read-line)](println(Character/getName(int c)))) 

Inspired by @peter's answer. Uses Clojure for the Java interop.

\$\endgroup\$
4
\$\begingroup\$

Perl - 894 bytes

Lovingly crafted by hand. First time golfing in Perl so any tips are appreciated.

$_=$ARGV[0];s/(.)/$1\n/g;s/([A-Z])/& CAPITAL' $1/g;s/([a-z])/& SMALL' \U$1/g;s/,/COMMA/g;s/& /LATIN /g;s/' / LETT, /g;s/&/AMP,SAND/g;s/'/APOSTROPHE/g;s/ \n/SPACE\n/g;s/\*/AST,ISK/g;s/-/HYPHEN-MINUS/g;s/\./FULL STOP/g;s/@/COMM,CIAL AT/g;s/~/TILDE/g;s/:/&/g;s/;/SEMI&/g;s/&/COLON/g;s/\|/V,TICAL&/g;s/_/LOW&/g;s/&/ LINE/g;s/\^/CIRCUMFLEX&/g;s/`/GRAVE&/g;s/&/ ACCENT/g;s/\//&/g;s/\\/REV,SE &/g;s/&/SOLIDUS/g;s/!/!&/g;s/"/"&/g;s/\?/?&/g;s/!/EXCLAMA/g;s/"/QUOTA/g;s/\?/QUES/g;s/&/TION MARK/g;s/#/NUMB,&/g;s/\$/DOLLAR&/g;s/%/P,CENT&/g;s/\+/PLUS&/g;s/</LESS-THAN&/g;s/=/EQUALS&/g;s/>/GREAT,-THAN&/g;s/&/ SIGN/g;s/\(/<&/g;s/\)/>&/g;s/&/ PARENTHESIS/g;s/\[/<&/g;s/\]/>&/g;s/&/ SQUARE'/g;s/{/<&/g;s/}/>&/g;s/&/ CURLY'/g;s/'/ BRACKET/g;s/</LEFT/g;s/>/RIGHT/g;s/0/&Z,O/g;s/1/&ONE/g;s/2/&TWO/g;s/3/&THREE/g;s/4/&FOUR/g;s/5/&FIVE/g;s/6/&SIX/g;s/7/&SEVEN/g;s/8/&EIGHT/g;s/9/&NINE/g;s/&/DIGIT /g;s/,/ER/g;print; 
\$\endgroup\$
3
\$\begingroup\$

C++14 716 706 704

    #include<iostream>
    char*q,x,b[584],*t=b,a[]=R"(space}exclamation|mark}quot"-number|sign}dolla!apercent!mam"%sand}apostrophe}left|par%3hesis}righ"Wasterisk}plus*<comma}hy)#n{minus}full|stop}solid"Ldigit|zero!Tone!Gtw"kthre#&four!Uiv#&six!Heve>^!_e6r!ani,1colon}semi!Fless{than8Eequal:$grea<s$2quesMj>EJoial|at}lQ9n|capit"?let('|Jes+\re|bracket}r5urse|C5M?%2circumflex|acXR}low|l:bgrave#'0=smaNy0+curly*s/Ytic4z)$/\$itilde)",*s=a;int c,z,l='{';int main(){for(;x=*s++;)if(z=x-32,x>96)*t++=x<l?z:"- "[x%l];else for(c=z*95+*s++-32,q=t-c/13,x=3+c%13;x--;)*t++=*q++;while(std::cin.get(x)){for(s=b,z=0,c=x<65?x-32:x<91?z=33:x<97?x-57:x<l?z=40:x-82;c--;)while(*s++);auto&o=std::cout<<s;(z?o.put(x&~32):o)<<"\n";}}

Live version.

With some whitespace:

    #include <iostream>

    // a is compressed using an LZ like compression scheme
    char *q, x, b[584], *t = b, a[] = R"(space}exclamation|mark}quot"-number|sign}dolla!apercent!mam"%sand}apostrophe}left|par%3hesis}righ"Wasterisk}plus*<comma}hy)#n{minus}full|stop}solid"Ldigit|zero!Tone!Gtw"kthre#&four!Uiv#&six!Heve>^!_e6r!ani,1colon}semi!Fless{than8Eequal:$grea<s$2quesMj>EJoial|at}lQ9n|capit"?let('|Jes+\re|bracket}r5urse|C5M?%2circumflex|acXR}low|l:bgrave#'0=smaNy0+curly*s/Ytic4z)$/\$itilde)", *s = a;
    int c, z, l = '{';

    int main() {
        // Decompress from a into b
        for (; x = *s++;)
            if (z = x - 32, x > 96)
                *t++ = x < l ? z : "- "[x % l];
            else
                for (c = z * 95 + *s++ - 32, q = t - c / 13, x = 3 + c % 13; x--;)
                    *t++ = *q++;

        // Process input a char at a time, performing a lookup into b for the c'th null separated string
        while (std::cin.get(x)) {
            for (s = b, z = 0, c = x < 65 ? x - 32 : x < 91 ? z = 33 : x < 97 ? x - 57 : x < l ? z = 40 : x - 82; c--;)
                while (*s++)
                    ;
            auto& o = std::cout << s;
            (z ? o.put(x & ~32) : o) << "\n";
        }
    }

The compressed string a decompresses to:

space}exclamation|mark}quotation|mark}number|sign}dollar|sign}percent|sign}ampersand}apostrophe}left|parenthesis}right|parenthesis}asterisk}plus|sign}comma}hyphen{minus}full|stop}solidus}digit|zero}digit|one}digit|two}digit|three}digit|four}digit|five}digit|six}digit|seven}digit|eight}digit|nine}colon}semicolon}less{than|sign}equals|sign}greater{than|sign}question|mark}commercial|at}latin|capital|letter|}left|square|bracket}reverse|solidus}right|square|bracket}circumflex|accent}low|line}grave|accent}latin|small|letter|}left|curly|bracket}vertical|line}right|curly|bracket}tilde

During decompression, } is replaced with \0, | with a space, and { with -, and lowercase letters are converted to uppercase.

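The string is compressed LZ style: each token is either a literal in the range [a-~] or a two-character pair that encodes the offset and length of a match earlier in the output. The pair is read as a base-95 number c (each character minus 32); the offset back into the output is c/13 and the match length is 3+c%13.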
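
The decompression loop can be restated in Python to make the scheme easier to follow. This is a sketch that mirrors the C++ loop above as I read it; the function name `decompress` is made up, and the literal handling only covers `a` to `}`.

```python
def decompress(packed):
    """Expand the LZ-style string into NUL-separated uppercase names."""
    out = []
    i = 0
    while i < len(packed):
        x = ord(packed[i])
        i += 1
        if x > 96:
            if x < ord("{"):
                out.append(chr(x - 32))            # literal a-z becomes A-Z
            else:
                out.append("- \0"[x - ord("{")])   # { is '-', | is ' ', } is NUL
        else:
            # Two characters form a base-95 number holding offset and length.
            c = (x - 32) * 95 + ord(packed[i]) - 32
            i += 1
            offset, length = divmod(c, 13)
            length += 3
            start = len(out) - offset
            # Copy one character at a time so a match may overlap its own output.
            for k in range(length):
                out.append(out[start + k])
    return "".join(out)

# The first tokens of the compressed string from the answer: the pair "- after
# "quot" copies "ATION MARK" plus the terminator from the previous name.
prefix = 'space}exclamation|mark}quot"-'
print(decompress(prefix).split("\0"))
```

For the prefix shown, the pair `"-` decodes to c = 2*95 + 13 = 203, which gives an offset of 15 and a length of 11, so "QUOT" is completed to "QUOTATION MARK" by reusing the tail of "EXCLAMATION MARK".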

\$\endgroup\$
1
\$\begingroup\$

Factor, 58 bytes

[ readln [ char>name "-"" " replace >upper print ] each ] 

Pretty simple; does the exact same thing as the Java and Perl 6 answers.

\$\endgroup\$
1
\$\begingroup\$

Go, 145 bytes

    package main
    import(."golang.org/x/text/unicode/runenames"
    ."fmt")
    func main(){r:='!'
    for{_,e:=Scanf("%c",&r)
    if e!=nil{break}
    println(Name(r))}}

Some have raised a concern about the above answer. I tried vendoring the code in question, but that pushed my answer past the 65,536-byte limit [1], so I think the above answer is fine as is.

  1. https://github.com/golang/text/blob/v0.3.6/unicode/runenames/tables13.0.0.go
\$\endgroup\$
6
  • 1
    \$\begingroup\$ From the question, Internet and external libraries are not allowed, all necessary data should be in the code. \$\endgroup\$ Commented Jul 6, 2021 at 3:33
  • 1
    \$\begingroup\$ When I run the code myself, it says cannot find package "golang.org/x/text/unicode/runenames" in any of: /usr/lib/golang/src/golang.org/x/text/unicode/runenames (from $GOROOT) /home/runner/go/src/golang.org/x/text/unicode/runenames (from $GOPATH) \$\endgroup\$ Commented Jul 6, 2021 at 4:18
  • 1
    \$\begingroup\$ But it is my job to make sure that answers comply with the spirit and letter of the challenge. This runenames import appears to be an external library, which is disallowed. \$\endgroup\$ Commented Jul 6, 2021 at 13:07
  • 2
    \$\begingroup\$ Whether JoKing is a moderator or not is irrelevant here - any user should be able to check if an answer is valid by the rules. Obviously, not everyone is familiar with Go, so it is up to you to demonstrate that your answer follows all rules, including "Internet and external libraries are not allowed". If this answer requires the internet to work, then it isn't valid per the question. It'd be helpful if you could verify this or not, as you clearly know more about Go than either me or JoKing \$\endgroup\$ Commented Jul 6, 2021 at 14:12
  • 1
    \$\begingroup\$ discussion in chat starting here \$\endgroup\$ Commented Jul 6, 2021 at 14:21
-1
\$\begingroup\$

PHP>=7, 54 Bytes

for(;a&$c=$argn[$i++];)echo" ".IntlChar::charName($c); 

IntlChar::charName

\$\endgroup\$
