26
\$\begingroup\$

Short and sweet description of the challenge:

Based on the ideas of several other questions on this site, your challenge is to write the most creative program, in any language, that takes as input a number written out in English and converts it to integer form.

Really dry, long and thorough specifications:

  • Your program will receive as input an integer in lowercase English between zero and nine hundred ninety-nine thousand nine hundred ninety-nine inclusive.
  • It must output only the integer form of the number between 0 and 999999 and nothing else (no whitespace).
  • The input will NOT contain "," or "and", as in "one thousand, two hundred" or "five hundred and thirty-two".
  • When the tens and ones places are both nonzero and the tens place is greater than 1, they will be separated by a HYPHEN-MINUS character ("-") instead of a space. The same applies to the ten-thousands and thousands places. For example: "six hundred fifty-four thousand three hundred twenty-one".
  • The program may have undefined behavior for any other input.

Some examples of a well-behaved program:

zero -> 0
fifteen -> 15
ninety -> 90
seven hundred four -> 704
sixty-nine thousand four hundred eleven -> 69411
five hundred twenty thousand two -> 520002

\$\endgroup\$
7
  • \$\begingroup\$ This isn't especially creative, nor does it precisely match the specification here, but it might be useful as a starting point: github.com/ghewgill/text2num/blob/master/text2num.py \$\endgroup\$ Commented Jun 24, 2014 at 2:16
  • \$\begingroup\$ I could almost post my answer to this question. \$\endgroup\$ Commented Jun 24, 2014 at 3:21
  • \$\begingroup\$ Why do complicated string parsing? pastebin.com/WyXevnxb \$\endgroup\$ Commented Jun 24, 2014 at 8:47
  • 1
    \$\begingroup\$ By the way, I saw an IOCCC entry which is the answer of this question. \$\endgroup\$ Commented Jun 25, 2014 at 2:22
  • 2
    \$\begingroup\$ What about things like "four and twenty?" \$\endgroup\$ Commented Jun 25, 2014 at 3:13

9 Answers

93
\$\begingroup\$

Applescript

A silly, hacky mash-up that might upset some Cupertino/Mountain View folks, but I think it's a creative silly, hacky mash-up.

    set myNumber to text returned of (display dialog ¬
        "Enter number as text:" buttons {"Continue…"} ¬
        default answer "" default button 1)
    tell application "Google Chrome"
        activate
        open location "https://www.google.com"
    end tell
    delay 5
    say "ok google. " & myNumber
    delay 2
    tell application "System Events"
        tell application process "Google Chrome"
            set fullURL to value of text field 1 of toolbar 1 of window 1
        end tell
    end tell
    set AppleScript's text item delimiters to "="
    display alert item 2 of text items of fullURL

Uses OS X text-to-speech to speak the number aloud, and Google's audio search to listen for it and convert it to an integer.

Requirements

  • OS X
  • Google Chrome
  • speech recognition enabled in your Google account
  • volume turned up to a reasonable level

The delay timings may need to be adjusted depending on your Chrome load time and Google lookup time.


\$\endgroup\$
5
  • 13
    \$\begingroup\$ I think that might be just a little creative... ;) \$\endgroup\$ Commented Jun 24, 2014 at 2:57
  • 5
    \$\begingroup\$ Lol, this is cool \$\endgroup\$ Commented Jun 24, 2014 at 4:58
  • 2
    \$\begingroup\$ Maybe too creative. \$\endgroup\$ Commented Jun 25, 2014 at 0:37
  • \$\begingroup\$ After one week, your answer is clearly in the lead at 74 votes, so I think that means that.. you win! By the way, mind if I use this code? It would be really useful for a lot of real-world projects I'm working on right now! ;) \$\endgroup\$ Commented Jul 1, 2014 at 19:33
  • 3
    \$\begingroup\$ @Abraham Thanks! You're joking about using this in production code, right? \$\endgroup\$ Commented Jul 1, 2014 at 22:15
35
\$\begingroup\$

Bash, 93 64 55 characters*

In the fantastic bsd-games package that's available on most Linux operating systems, there's a small command-line toy called number. It turns numbers into English text, that is, it does the exact opposite of this question. It really is the exact opposite: all the rules in the question are followed by number. It's almost too good to be a coincidence.

    $ number 42
    forty-two.

Of course, number doesn't answer the question. We want it the other way around. I thought about this for a while, tried string parsing and all that, then realised that I can just call number on all 1,000,000 numbers from 0 to 999999 and see if something matches the input. If so, the first matching line has line number 2n+1, where n is the number I'm looking for (number prints a line of dots after every number). Simple as that. So, without further ado, here's the complete code for my entry:

    seq 0 999999|number -l|awk "/$1/{print (NR-1)/2;exit}"

It even short-circuits, so converting "two" is quite fast, and even higher numbers are usually decoded in under a second on my box. Here's an example run:

    wn@box /tmp> bash unnumber.sh "zero"
    0
    wn@box /tmp> bash unnumber.sh "fifteen"
    15
    wn@box /tmp> bash unnumber.sh "ninety"
    90
    wn@box /tmp> bash unnumber.sh "seven hundred four"
    704
    wn@box /tmp> bash unnumber.sh "sixty-nine thousand four hundred eleven"
    69411
    wn@box /tmp> bash unnumber.sh "five hundred twenty thousand two"
    520002

Of course, you'll need to have number installed for this to work.
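
To see why the awk expression subtracts one and divides by two, here is a small Python sketch of the line arithmetic. It assumes, as described above, that every number name is followed by exactly one separator line. The sample lines are hand-written stand-ins rather than real output of number, and `first_match` is a hypothetical helper that uses a plain substring test where awk uses a regular expression.

```python
# Hand-written stand-in for the stream fed to awk: one line per number name,
# each followed by one separator line (the separator's exact text is assumed).
names = ["zero", "one", "two", "three", "four", "five"]
lines = []
for name in names:
    lines.append(name + ".")
    lines.append("...")


def first_match(pattern, lines):
    """Return (NR - 1) // 2 for the first line containing pattern, else None.

    NR is the 1-based line number, as in awk. The name of n sits on line
    2n + 1, so (NR - 1) // 2 recovers n. Returning on the first hit mirrors
    the `exit` in the awk program.
    """
    for nr, line in enumerate(lines, start=1):
        if pattern in line:
            return (nr - 1) // 2
    return None


print(first_match("two", lines))
```

Stopping at the first hit is what makes small inputs fast: the scan never reads past the matching line.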


*: Yes, I know, this is not a code-golf challenge, but shortness is pretty much the only distinguishing quality of my entry, so... :)

\$\endgroup\$
2
  • 8
    \$\begingroup\$ +1. For me, using number in reverse is the most creative thing about this answer. The golfiness is good too though :) \$\endgroup\$ Commented Jun 24, 2014 at 16:34
  • 1
    \$\begingroup\$ This is actually quite creative! I like it! \$\endgroup\$ Commented Jun 27, 2014 at 14:54
13
\$\begingroup\$

Javascript

    (function parse(input) {
        var pat = "ze/on/tw/th.?r/fo/fi/ix/se/ei/ni/ten/ele".split("/");
        var num = "", last = 0, token = input.replace(/-/g, " ").split(" ");
        for(var i in token) {
            var t = token[i];
            for(var p in pat)
                if(t.match(RegExp(pat[p])) !== null) num += "+" + p;
            if(t.indexOf("een") >= 0) num += "+10";
            if(t.indexOf("lve") >= 0) num += "+10";
            if(t.indexOf("ty") >= 0) num += "*10";
            if(t.indexOf("dr") >= 0) { last = 100; num += "*100"; }
            if(t.indexOf("us") >= 0) {
                if(last < 1000) num = "(" + num + ")";
                last = 0;
                num += "*1000";
            }
        }
        alert(eval(num));
    })(prompt());

Do you like some eval()?

Run this script on your browser's console.

Edit: Thanks for the feedback. Bugs fixed (again).

\$\endgroup\$
6
  • \$\begingroup\$ really nice code ^^ \$\endgroup\$ Commented Jun 24, 2014 at 8:50
  • 2
    \$\begingroup\$ When you type something like "one hundred sixteen", it'll give 126. \$\endgroup\$ Commented Jun 24, 2014 at 12:32
  • \$\begingroup\$ This program fails for some numbers starting at twelve when it returns 23. \$\endgroup\$ Commented Jun 24, 2014 at 13:13
  • \$\begingroup\$ Fails on "twenty". \$\endgroup\$ Commented Jun 25, 2014 at 4:06
  • \$\begingroup\$ seven thousand three hundred thirty five give me 10335 \$\endgroup\$ Commented Jun 25, 2014 at 5:36
7
\$\begingroup\$

Python

Just to get the ball rolling.

    # Python 2 (uses raw_input and the print statement)
    import re

    table = {'zero':0,'one':1,'two':2,'three':3,'four':4,'five':5,'six':6,'seven':7,'eight':8,'nine':9,
             'ten':10,'eleven':11,'twelve':12,'thirteen':13,'fourteen':14,'fifteen':15,'sixteen':16,'seventeen':17,'eighteen':18,'nineteen':19,
             'twenty':20,'thirty':30,'forty':40,'fifty':50,'sixty':60,'seventy':70,'eighty':80,'ninety':90}
    modifier = {'hundred':100,'thousand':1000}

    while True:
        text = raw_input()
        result = 0
        tmp = 0
        last_multiplier = 1
        for word in re.split('[- ]', text):
            multiplier = modifier.get(word, 1)
            if multiplier > last_multiplier:
                # A larger scale word applies to everything read so far.
                result = (result+tmp)*multiplier
                tmp = 0
            else:
                tmp *= multiplier
            if multiplier != 1:
                last_multiplier = multiplier
            tmp += table.get(word,0)
        print result+tmp
\$\endgroup\$
5
\$\begingroup\$

Perl + CPAN

Why reinvent the wheel, when it has been done already?

    use feature 'say';
    use Lingua::EN::Words2Nums;
    say words2nums $_ while <>;

This program reads English strings from standard input (or from one or more files specified as command line arguments), one per line, and prints out the corresponding numbers to standard output.

I have tested this code using both the sample inputs from the challenge and an exhaustive test suite consisting of the numbers from 0 to 999999 converted to text using the bsd-games number utility (thanks, Wander Nauta!), and it correctly parses all of them. As a bonus, it also understands inputs such as minus seven (−7), four and twenty (24), four score and seven (87), one gross (144), a baker's dozen (13), eleventy-one (111) and googol (10^100).

(Note: In addition to the Perl interpreter itself, this program also requires the CPAN module Lingua::EN::Words2Nums. Here are some instructions for installing CPAN modules. Debian / Ubuntu Linux users may also install this module via the APT package manager as liblingua-en-words2nums-perl.)

\$\endgroup\$
4
\$\begingroup\$

Python

A general recursive solution, with validity checking. Could be simplified for the range of numbers required, but here's to showing off I guess:

    terms = 'zero one two three four five six seven eight nine ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen'.split()
    tee = 'twenty thirty forty fifty sixty seventy eighty ninety'.split()
    for t in tee:
        terms.append(t)
        for s in terms[1:10]:
            terms.append(t+'-'+s)
    terms = dict(zip(terms, range(100)))

    modifiers = [('hundred', 100), ('thousand', 1000), ('million', 10**6), ('billion', 10**9)]

    def read_num(words):
        if len(words) == 0:
            return 0
        elif len(words) == 1:
            if words[0] in terms:
                return terms[words[0]]
            else:
                raise ValueError(words[0]+' is not a valid english number.')
        else:
            for word, value in reversed(modifiers):
                if word in words:
                    i = words.index(word)
                    return read_num(words[:i])*value+read_num(words[i+1:])
            raise ValueError(' '.join(words)+' is not a valid english number.')

    while True:
        try:
            print(read_num(input().split()))
        except ValueError as e:
            print(e)
\$\endgroup\$
2
\$\begingroup\$

VBScript 474

This is a fairly routine answer... unfortunately, so routine that @Snack pretty much posted the same process but before me.

    i=split(REPLACE(REPLACE(inputbox(""),"lve","een"),"tho","k"))
    o=split("z on tw th fo fi si se ei ni ten ele")
    y=split("red *100) k )*1000 ty *10) een +10)")
    z=""
    p=0
    for t=0 to UBOUND(i)
        s=split(i(t),"-")
        u=ubound(s)
        r=s(0)
        for x=0 to UBOUND(o)
            IF INSTR(r,o(x)) THEN
                z=z+"+"+CSTR(x)
            END IF
            IF u Then
                IF INSTR(s(1),o(x)) THEN
                    z=z+CSTR(x)
                END IF
            END IF
        next
        for m=0 to UBOUND(y)
            IF INSTR(r,y(m))AND u=0 THEN
                z=z+y(m+1)
                p=p+1
            END IF
        next
    next
    Execute("MSGBOX "+String(p,"(")+z)
\$\endgroup\$
1
\$\begingroup\$

Haskell

Similar to other recursive solutions I guess, but I took the time to make it clean.

Here is the complete source with all explanations: http://ideone.com/fc8zcB

    -- Define a type for a parser from a list of tokens to the value they represent.
    type NParse = [Token] -> Int

    -- Map of literal tokens (0-9, 11-19 and tens) to their names.
    literals = [
        ("zero", 0), ("one", 1), ("two", 2), ("three", 3), ("four", 4),
        ("five", 5), ("six", 6), ("seven", 7), ("eight", 8), ("nine", 9),
        ("eleven", 11), ("twelve", 12), ("thirteen", 13), ("fourteen", 14), ("fifteen", 15),
        ("sixteen", 16), ("seventeen", 17), ("eighteen", 18), ("nineteen", 19),
        ("ten", 10), ("twenty", 20), ("thirty", 30), ("forty", 40), ("fifty", 50),
        ("sixty", 60), ("seventy", 70), ("eighty", 80), ("ninety", 90)
        ]

    -- Splits the input string into tokens.
    -- We do one special transformation: replace dashes by a new token. Such that "fifty-three" becomes "fifty tens three".
    prepare :: String -> [Token]

    -- Let's do the easy stuff and just parse literals first. We just have to look them up in the literals map.
    -- This is our base parser.
    parseL :: NParse
    parseL [tok] = case lookup tok literals of
        Just x -> x

    -- We're going to exploit the fact that the input strings have a tree-like structure like so
    --                 thousand
    --         hundred         hundred
    --       ten     ten     ten     ten
    --     lit lit lit lit lit lit lit lit
    -- And recursively parse that tree until we only have literal values.
    --
    -- When parsing the tree
    --       thousand
    --     h1        h2
    -- The resulting value is 1000 * h1 + h2.
    -- And this works similarly for all levels of the tree.
    -- So instead of writing specific parsers for all levels, let's just write a generic one :
    {-
    genParse ::
        NParse      : the sub parser
        -> Int      : the left part multiplier
        -> Token    : the boundary token
        -> NParse   : returns a new parser
    -}
    genParse :: NParse -> Int -> Token -> NParse
    genParse delegate mul tok = newParser where
        newParser [] = 0
        newParser str = case splitAround str tok of
            -- Split around the boundary token, sub-parse the left and right parts, and combine them
            (l,r) -> (delegate l) * mul + (delegate r)

    -- And so here's the result:
    parseNumber :: String -> Int
    parseNumber = parseM . prepare
        where
            -- Here are all intermediary parsers for each level
            parseT = genParse parseL 1 "tens" -- multiplier is irregular, because the fifty in fifty-three is already multiplied by 10
            parseH = genParse parseT 100 "hundred"
            parseK = genParse parseH 1000 "thousand"
            parseM = genParse parseK 1000000 "million" -- For fun :D

    test = (parseNumber "five hundred twenty-three thousand six hundred twelve million two thousand one") == 523612002001
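
The excerpt above leaves out prepare and splitAround (they are in the linked full source), so the layered idea is easier to try out in a self-contained form. The following Python sketch mirrors genParse under stated assumptions: `split_around`, `prepare` and `parse_literal` are illustrative reconstructions, and the choice to send all tokens to the right-hand side when the boundary token is absent is an assumption, not something taken from the original source.

```python
LITERALS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
    "ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18, "nineteen": 19,
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def prepare(text):
    # Replace each hyphen with a boundary token: "fifty-three" -> fifty tens three.
    return text.replace("-", " tens ").split()


def split_around(tokens, boundary):
    # Assumed behaviour: without the boundary, the left part is empty.
    if boundary in tokens:
        i = tokens.index(boundary)
        return tokens[:i], tokens[i + 1:]
    return [], tokens


def parse_literal(tokens):
    # Base parser: a single literal word, or nothing at all (worth 0).
    return LITERALS[tokens[0]] if tokens else 0


def gen_parse(delegate, mul, boundary):
    # Build a parser for one level of the tree from the parser one level down.
    def parser(tokens):
        if not tokens:
            return 0
        left, right = split_around(tokens, boundary)
        return delegate(left) * mul + delegate(right)
    return parser


parse_t = gen_parse(parse_literal, 1, "tens")
parse_h = gen_parse(parse_t, 100, "hundred")
parse_k = gen_parse(parse_h, 1000, "thousand")
parse_m = gen_parse(parse_k, 1000000, "million")


def parse_number(text):
    return parse_m(prepare(text))


print(parse_number("six hundred fifty-four thousand three hundred twenty-one"))
```

Each level only knows its own boundary word and multiplier, which is why adding "million" takes a single extra line.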
\$\endgroup\$
0
\$\begingroup\$

Common Lisp, 94

(write(cdr(assoc(read-line)(loop for i to 999999 collect(cons(format()"~r"i)i)):test #'equalp))) 

Number-to-text conversion is built into CL, but not the other way around. This builds a reverse mapping for the numbers and looks the input up in it.
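
The same reverse-mapping idea, sketched in Python for comparison. Python has no built-in counterpart to the ~r directive, so `to_words` below is a hypothetical helper written to follow the challenge's spelling rules (no "and", no commas, a hyphen between tens and ones); it is not claimed to match CL's output for every number.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
# Index 0 and 1 are placeholders so that TENS[2] is "twenty".
TENS = "_ _ twenty thirty forty fifty sixty seventy eighty ninety".split()


def to_words(n):
    """Spell 0 <= n <= 999999 in the challenge's format (illustrative helper)."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return to_words(thousands) + " thousand" + (" " + to_words(rest) if rest else "")


def build_reverse_map(limit=999999):
    # Map every spelled-out number in range back to its integer value.
    return {to_words(n): n for n in range(limit + 1)}


REVERSE = build_reverse_map()
print(REVERSE["sixty-nine thousand four hundred eleven"])
```

Building the full table costs a million string constructions up front, after which every lookup is a single dictionary access.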

\$\endgroup\$
