Convert English to a number [closed]

Question

Short and sweet description of the challenge:

Based off the ideas of several other questions on this site, your challenge is to write the most creative program, in any language, that takes as input a number written out in English and converts it to integer form.

Really dry, long and thorough specifications:

Your program will receive as input an integer in lowercase English between zero and nine hundred ninety-nine thousand nine hundred ninety-nine inclusive.
It must output only the integer form of the number between 0 and 999999 and nothing else (no whitespace).
The input will NOT contain "," or "and", as in "one thousand, two hundred" or "five hundred and thirty-two".
When the tens and ones places are both nonzero and the tens place is greater than 1, they will be separated by a HYPHEN-MINUS character ("-") instead of a space. Ditto for the ten thousands and thousands places. For example, "six hundred fifty-four thousand three hundred twenty-one".
The program may have undefined behavior for any other input.

Some examples of a well-behaved program:

zero -> 0
fifteen -> 15
ninety -> 90
seven hundred four -> 704
sixty-nine thousand four hundred eleven -> 69411
five hundred twenty thousand two -> 520002
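
The examples above can double as a tiny regression table for any entry. A minimal sketch in Python, assuming the entry is wrapped as a function that takes the input string and returns an int (the names EXAMPLES and failures are made up for illustration):

```python
# The six sample conversions from the question, as input -> expected value.
EXAMPLES = {
    "zero": 0,
    "fifteen": 15,
    "ninety": 90,
    "seven hundred four": 704,
    "sixty-nine thousand four hundred eleven": 69411,
    "five hundred twenty thousand two": 520002,
}


def failures(convert):
    """Return (input, expected, actual) for every example convert gets wrong."""
    results = ((text, want, convert(text)) for text, want in EXAMPLES.items())
    return [r for r in results if r[1] != r[2]]
```

An empty list from failures(convert) means convert handles all six examples; it says nothing about the rest of the range.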

This isn't especially creative, nor does it precisely match the specification here, but it might be useful as a starting point: github.com/ghewgill/text2num/blob/master/text2num.py – Greg Hewgill, Commented Jun 24, 2014 at 2:16
By the way, I saw an IOCCC entry that answers this question. – Snack, Commented Jun 25, 2014 at 2:22

Community · Accepted Answer · 2020-06-17 09:04:33Z

Applescript

A silly, hacky mash-up that might upset some Cupertino/Mountain View folks, but I think it's a creative silly, hacky mash-up.

    set myNumber to text returned of (display dialog ¬
        "Enter number as text:" buttons {"Continue…"} ¬
        default answer "" default button 1)
    tell application "Google Chrome"
        activate
        open location "https://www.google.com"
    end tell
    delay 5
    say "ok google. " & myNumber
    delay 2
    tell application "System Events"
        tell application process "Google Chrome"
            set fullURL to value of text field 1 of toolbar 1 of window 1
        end tell
    end tell
    set AppleScript's text item delimiters to "="
    display alert item 2 of text items of fullURL

Uses OS X text-to-speech to speak the number text, and Google audio search to listen for it and convert it to an integer.

Requirements

OS X
Google Chrome
speech recognition enabled in your Google account
volume turned up to a reasonable level

The delay timings may need to be adjusted depending on your Chrome load time and Google lookup time.

Example input: [image]

Example output: [image]

After one week, your answer is clearly in the lead at 74 votes, so I think that means that.. you win! By the way, mind if I use this code? It would be really useful for a lot of real-world projects I'm working on right now! ;) – Abraham, Commented Jul 1, 2014 at 19:33
@Abraham Thanks! You're joking about using this in production code, right? – Digital Trauma, Commented Jul 1, 2014 at 22:15

Wander Nauta · Accepted Answer · 2015-10-27 13:24:57Z

Bash, 93 64 55 characters*

In the fantastic bsd-games package that's available on most Linux operating systems, there's a small command-line toy called number. It turns numbers into English text, that is, it does the exact opposite of this question. It really is the exact opposite: all the rules in the question are followed by number. It's almost too good to be a coincidence.

    $ number 42
    forty-two.

Of course, number doesn't answer the question. We want it the other way around. I thought about this for a while, tried string parsing and all that, then realised that I can just call number on every number from 0 to 999999 and see if something matches the input. If so, the line number of the first match is one more than twice the number I'm looking for (number prints a line of dots after every number). Simple as that. So, without further ado, here's the complete code for my entry:

    seq 0 999999|number -l|awk "/$1/{print (NR-1)/2;exit}"

It even short-circuits, so converting "two" is quite fast, and even higher numbers are usually decoded in under a second on my box. Here's an example run:

    wn@box /tmp> bash unnumber.sh "zero"
    0
    wn@box /tmp> bash unnumber.sh "fifteen"
    15
    wn@box /tmp> bash unnumber.sh "ninety"
    90
    wn@box /tmp> bash unnumber.sh "seven hundred four"
    704
    wn@box /tmp> bash unnumber.sh "sixty-nine thousand four hundred eleven"
    69411
    wn@box /tmp> bash unnumber.sh "five hundred twenty thousand two"
    520002

Of course, you'll need to have number installed for this to work.
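
For readers without bsd-games, the same brute-force idea can be sketched in Python. This is not the entry itself: spell is a hypothetical stand-in for number that only follows the formatting rules from the question, and unnumber is an illustrative name. Like the awk pattern, the search uses a substring match and stops at the first hit.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()


def spell(n):
    """Spell out 0..999999 the way the question formats its input (assumed stand-in for number)."""
    if n >= 1000:
        head, rest = divmod(n, 1000)
        return spell(head) + " thousand" + (" " + spell(rest) if rest else "")
    if n >= 100:
        head, rest = divmod(n, 100)
        return ONES[head] + " hundred" + (" " + spell(rest) if rest else "")
    if n >= 20:
        head, rest = divmod(n, 10)
        return TENS[head - 2] + ("-" + ONES[rest] if rest else "")
    return ONES[n]


def unnumber(text, limit=1_000_000):
    """Return the first n below limit whose spelling contains text, or None."""
    # The generator is lazy, so the scan stops at the first match.
    return next((n for n in range(limit) if text in spell(n)), None)


print(unnumber("seven hundred four"))  # 704
```

The substring match works for valid input because the spelling of a number only ever reappears inside the spellings of larger numbers, so the first hit is the number itself.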

*: Yes, I know, this is not a code-golf challenge, but shortness is pretty much the only discerning quality of my entry, so... :)

+1. For me, using number in reverse is the most creative thing about this answer. The golfiness is good too though :) – Digital Trauma, Commented Jun 24, 2014 at 16:34

Snack · Accepted Answer · 2014-06-25 05:35:47Z

Javascript

    (function parse(input) {
      var pat = "ze/on/tw/th.?r/fo/fi/ix/se/ei/ni/ten/ele".split("/");
      var num = "", last = 0, token = input.replace(/-/g, " ").split(" ");
      for(var i in token) {
        var t = token[i];
        for(var p in pat) if(t.match(RegExp(pat[p])) !== null) num += "+" + p;
        if(t.indexOf("een") >= 0) num += "+10";
        if(t.indexOf("lve") >= 0) num += "+10";
        if(t.indexOf("ty") >= 0) num += "*10";
        if(t.indexOf("dr") >= 0) { last = 100; num += "*100"; }
        if(t.indexOf("us") >= 0) {
          if(last < 1000) num = "(" + num + ")";
          last = 0;
          num += "*1000";
        }
      }
      alert(eval(num));
    })(prompt());

Do you like some eval()?

Run this script on your browser's console.

Edit: Thanks for the feedback. Bugs fixed (again).

When you type something like "one hundred sixteen", it'll give 126. – scrblnrd3, Commented Jun 24, 2014 at 12:32
This program fails for some numbers starting at twelve when it returns 23. – Abraham, Commented Jun 24, 2014 at 13:13

justhalf · Accepted Answer · 2014-06-24 02:39:03Z

Python

Just to get the ball rolling.

    # Python 2
    import re

    table = {'zero':0,'one':1,'two':2,'three':3,'four':4,'five':5,'six':6,'seven':7,'eight':8,'nine':9,
             'ten':10,'eleven':11,'twelve':12,'thirteen':13,'fourteen':14,'fifteen':15,'sixteen':16,'seventeen':17,'eighteen':18,'nineteen':19,
             'twenty':20,'thirty':30,'forty':40,'fifty':50,'sixty':60,'seventy':70,'eighty':80,'ninety':90}
    modifier = {'hundred':100,'thousand':1000}

    while True:
        text = raw_input()
        result = 0
        tmp = 0
        last_multiplier = 1
        for word in re.split('[- ]', text):
            multiplier = modifier.get(word, 1)
            if multiplier > last_multiplier:
                # A bigger unit than the last one scales everything read so far.
                result = (result+tmp)*multiplier
                tmp = 0
            else:
                tmp *= multiplier
            if multiplier != 1:
                last_multiplier = multiplier
            tmp += table.get(word,0)
        print result+tmp

Ilmari Karonen · Accepted Answer · 2014-06-25 11:52:18Z

Perl + CPAN

Why reinvent the wheel, when it has been done already?

    use feature 'say';
    use Lingua::EN::Words2Nums;
    say words2nums $_ while <>;

This program reads English strings from standard input (or from one or more files specified as command line arguments), one per line, and prints out the corresponding numbers to standard output.

I have tested this code using both the sample inputs from the challenge and an exhaustive test suite consisting of the numbers from 0 to 999999 converted to text using the bsd-games number utility (thanks, Wander Nauta!), and it correctly parses all of them. As a bonus, it also understands such inputs as minus seven (−7), four and twenty (24), four score and seven (87), one gross (144), a baker's dozen (13), eleventy-one (111) and googol (10¹⁰⁰).

(Note: In addition to the Perl interpreter itself, this program also requires the CPAN module Lingua::EN::Words2Nums. Here are some instructions for installing CPAN modules. Debian / Ubuntu Linux users may also install this module via the APT package manager as liblingua-en-words2nums-perl.)

user3094403 · Accepted Answer · 2014-06-24 07:43:43Z

Python

A general recursive solution, with validity checking. Could be simplified for the range of numbers required, but here's to showing off I guess:

    terms = 'zero one two three four five six seven eight nine ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen'.split()
    tee = 'twenty thirty forty fifty sixty seventy eighty ninety'.split()
    for t in tee:
        terms.append(t)
        for s in terms[1:10]:
            terms.append(t+'-'+s)

    terms = dict(zip(terms, range(100)))

    modifiers = [('hundred', 100), ('thousand', 1000), ('million', 10**6), ('billion', 10**9)]

    def read_num(words):
        if len(words) == 0: return 0
        elif len(words) == 1:
            if words[0] in terms:
                return terms[words[0]]
            else:
                raise ValueError(words[0]+' is not a valid english number.')
        else:
            for word, value in reversed(modifiers):
                if word in words:
                    i = words.index(word)
                    return read_num(words[:i])*value+read_num(words[i+1:])

        raise ValueError(' '.join(words)+' is not a valid english number.')

    while True:
        try:
            print(read_num(input().split()))
        except ValueError as e:
            print(e)

comfortablydrei · Accepted Answer · 2014-06-24 21:54:02Z

VBScript 474

This is a fairly routine answer... unfortunately, so routine that @Snack pretty much posted the same process but before me.

    i=split(REPLACE(REPLACE(inputbox(""),"lve","een"),"tho","k"))
    o=split("z on tw th fo fi si se ei ni ten ele")
    y=split("red *100) k )*1000 ty *10) een +10)")
    z=""
    p=0
    for t=0 to UBOUND(i)
        s=split(i(t),"-")
        u=ubound(s)
        r=s(0)
        for x=0 to UBOUND(o)
            IF INSTR(r,o(x)) THEN
                z=z+"+"+CSTR(x)
            END IF
            IF u Then
                IF INSTR(s(1),o(x)) THEN
                    z=z+CSTR(x)
                END IF
            END IF
        next
        for m=0 to UBOUND(y)
            IF INSTR(r,y(m))AND u=0 THEN
                z=z+y(m+1)
                p=p+1
            END IF
        next
    next
    Execute("MSGBOX "+String(p,"(")+z)

ARRG · Accepted Answer · 2014-06-25 16:41:16Z

Haskell

Similar to other recursive solutions I guess, but I took the time to make it clean.

Here is the complete source with all explanations: http://ideone.com/fc8zcB

    -- Define a type for a parser from a list of tokens to the value they represent.
    type NParse = [Token] -> Int

    -- Map of literal tokens (0-9, 11-19 and tens) to their names.
    literals = [
        ("zero", 0), ("one", 1), ("two", 2), ("three", 3), ("four", 4), ("five", 5), ("six", 6), ("seven", 7), ("eight", 8), ("nine", 9),
        ("eleven", 11), ("twelve", 12), ("thirteen", 13), ("fourteen", 14), ("fifteen", 15), ("sixteen", 16), ("seventeen", 17), ("eighteen", 18), ("nineteen", 19),
        ("ten", 10), ("twenty", 20), ("thirty", 30), ("forty", 40), ("fifty", 50), ("sixty", 60), ("seventy", 70), ("eighty", 80), ("ninety", 90)
        ]

    -- Splits the input string into tokens.
    -- We do one special transformation: replace dashes by a new token. Such that "fifty-three" becomes "fifty tens three".
    -- (Body omitted in this excerpt; see the complete source linked above.)
    prepare :: String -> [Token]

    -- Let's do the easy stuff and just parse literals first. We just have to look them up in the literals map.
    -- This is our base parser.
    parseL :: NParse
    parseL [tok] = case lookup tok literals of
        Just x -> x

    -- We're going to exploit the fact that the input strings have a tree-like structure like so
    --                    thousand
    --          hundred             hundred
    --      ten       ten       ten       ten
    --    lit lit   lit lit   lit lit   lit lit
    -- And recursively parse that tree until we only have literal values.
    --
    -- When parsing the tree
    --       thousand
    --     h1        h2
    -- The resulting value is 1000 * h1 + h2.
    -- And this works similarly for all levels of the tree.
    -- So instead of writing specific parsers for all levels, let's just write a generic one:

    {- genParse ::
         NParse    : the sub parser
         -> Int    : the left part multiplier
         -> Token  : the boundary token
         -> NParse : returns a new parser -}
    genParse :: NParse -> Int -> Token -> NParse
    genParse delegate mul tok = newParser where
        newParser [] = 0
        newParser str = case splitAround str tok of
            -- Split around the boundary token, sub-parse the left and right parts, and combine them
            (l,r) -> (delegate l) * mul + (delegate r)

    -- And so here's the result:
    parseNumber :: String -> Int
    parseNumber = parseM . prepare
        where
            -- Here are all intermediary parsers for each level
            parseT = genParse parseL 1 "tens"          -- multiplier is irregular, because the fifty in fifty-three is already multiplied by 10
            parseH = genParse parseT 100 "hundred"
            parseK = genParse parseH 1000 "thousand"
            parseM = genParse parseK 1000000 "million" -- For fun :D

    test = (parseNumber "five hundred twenty-three thousand six hundred twelve million two thousand one") == 523612002001
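
Since the excerpt above leaves out prepare and splitAround, here is a self-contained Python sketch of the same layered idea. It is a loose port under stated assumptions, not the author's code: the names prepare, parse_literal and gen_parse mirror the Haskell ones, and the behaviour when the boundary token is absent (hand the whole token list to the sub-parser) is a guess at what splitAround does in the full source.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

# Literal tokens and their values: 0-19 plus the multiples of ten.
LITERALS = {word: n for n, word in enumerate(ONES)}
LITERALS.update({word: 10 * n for n, word in enumerate(TENS, start=2)})


def prepare(text):
    """Tokenize, turning each hyphen into a separate "tens" boundary token."""
    return text.replace("-", " tens ").split()


def parse_literal(tokens):
    """Base parser: an empty list is 0, otherwise look up the first token."""
    return LITERALS[tokens[0]] if tokens else 0


def gen_parse(delegate, mul, boundary):
    """Build a parser that splits around boundary and combines both halves."""
    def parser(tokens):
        if boundary not in tokens:
            # Assumption: no boundary means the sub-parser sees everything.
            return delegate(tokens)
        i = tokens.index(boundary)
        return delegate(tokens[:i]) * mul + delegate(tokens[i + 1:])
    return parser


# One parser per level of the tree, each delegating to the level below.
parse_t = gen_parse(parse_literal, 1, "tens")
parse_h = gen_parse(parse_t, 100, "hundred")
parse_k = gen_parse(parse_h, 1000, "thousand")
parse_m = gen_parse(parse_k, 1000000, "million")


def parse_number(text):
    return parse_m(prepare(text))


print(parse_number("sixty-nine thousand four hundred eleven"))  # 69411
```

Each level only knows its own boundary word and multiplier, which is why adding "million" costs a single extra line.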

habs · Accepted Answer · 2016-12-26 06:14:37Z

Common Lisp, 94

    (write(cdr(assoc(read-line)(loop for i to 999999 collect(cons(format()"~r"i)i)):test #'equalp)))

Number-to-text conversion is built into CL, but not the other way around. This builds a reverse mapping for the numbers and looks the input up in it.
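
The reverse-mapping idea carries over to languages without a built-in number speller. A rough Python sketch, assuming the spellings follow the question's format; build_table and its max_thousands parameter are illustrative names, and the table is assembled from smaller tables rather than from a formatter like ~r:

```python
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()


def build_table(max_thousands=999):
    """Map every spelling from zero up to max_thousands * 1000 + 999 to its value."""
    # 0..99: the irregular words, the tens, and the hyphenated tens-ones pairs.
    below_100 = {word: n for n, word in enumerate(ONES)}
    for t, tens in enumerate(TENS, start=2):
        below_100[tens] = 10 * t
        for o in range(1, 10):
            below_100[tens + "-" + ONES[o]] = 10 * t + o

    # 0..999: prefix "<digit> hundred" onto every nonzero value below 100.
    below_1000 = dict(below_100)
    for h in range(1, 10):
        prefix = ONES[h] + " hundred"
        below_1000[prefix] = 100 * h
        for word, n in below_100.items():
            if n:
                below_1000[prefix + " " + word] = 100 * h + n

    # Thousands: prefix "<1..max_thousands> thousand" onto every nonzero value below 1000.
    table = dict(below_1000)
    for head, k in below_1000.items():
        if 0 < k <= max_thousands:
            prefix = head + " thousand"
            table[prefix] = 1000 * k
            for word, n in below_1000.items():
                if n:
                    table[prefix + " " + word] = 1000 * k + n
    return table


table = build_table(max_thousands=69)
print(table["sixty-nine thousand four hundred eleven"])  # 69411
```

With the default max_thousands=999 the table covers the whole range of the challenge, at the cost of holding a million strings in memory.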
