39
\$\begingroup\$

Introduction

How much of the English alphabet does a given string use? The previous sentence uses 77%. It has 20 unique letters (howmucftenglisapbdvr), and 20/26 ≃ 0.77.

Challenge

For an input string, return the percentage of letters of the English alphabet present in the string.

  • The answer can be in percentage or in decimal form.

  • The input string can have upper and lower case, as well as punctuation. However, you can assume it has no diacritics or accented characters.

Test cases

Input

"Did you put your name in the Goblet of Fire, Harry?" he asked calmly. 

Some valid outputs

77%, 76.9, 0.7692 

Input:

The quick brown fox jumps over the lazy dog 

All valid outputs:

100%, 100, 1 

The expected output for "@#$%^&*?!" and "" is 0.
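
For readers who want an ungolfed reference, here is a minimal Python sketch of the task. The function name `alphabet_coverage` is made up for illustration, and the sketch assumes ASCII input as the challenge allows.

```python
import string

def alphabet_coverage(text: str) -> float:
    """Return the fraction (0.0 to 1.0) of English letters present in text."""
    # Lowercase first so 'A' and 'a' count as the same letter, then keep
    # only the distinct characters that are ASCII letters.
    seen = {ch for ch in text.lower() if ch in string.ascii_lowercase}
    return len(seen) / 26

quote = '"Did you put your name in the Goblet of Fire, Harry?" he asked calmly.'
print(round(alphabet_coverage(quote), 4))   # 0.7692
print(alphabet_coverage("The quick brown fox jumps over the lazy dog"))  # 1.0
print(alphabet_coverage("@#$%^&*?!"), alphabet_coverage(""))  # 0.0 0.0
```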

\$\endgroup\$
13
  • 3
    \$\begingroup\$ Suggested test cases: "@#$%^&*?!", "" \$\endgroup\$ Commented Jun 21, 2019 at 10:35
  • 5
    \$\begingroup\$ If 77% and 76.9 is accepted, is 77 accepted too? \$\endgroup\$ Commented Jun 21, 2019 at 10:43
  • 1
    \$\begingroup\$ Percentages can have decimal parts too... \$\endgroup\$ Commented Jun 21, 2019 at 11:54
  • 2
    \$\begingroup\$ @Shaggy Last edit for OP was 16 hours ago, your answer was at 15 and your comment at 14. I mean, you're right but ??? \$\endgroup\$ Commented Jun 22, 2019 at 4:16
  • 6
    \$\begingroup\$ If 20/26 may be rounded to 0.7692, 0.769 or 0.77, can I also round it to 0.8, 1 or 0? ;-) \$\endgroup\$ Commented Jun 22, 2019 at 22:42

60 Answers

1
\$\begingroup\$

C, 95 bytes

f(char*s){int a[256]={},z;while(*s)a[*s++|32]=1;for(z=97;z<123;*a+=a[z++]);return(*a*100)/26;} 

(note: rounds down)

Alternate decimal-returning version (95 bytes):

float f(char*s){int a[256]={},z;while(*s&&a[*s++|32]=1);for(z=97;z<123;*a+=a[z++]);return*a/26.;} 

This borrows some from @Steadybox' answer.

\$\endgroup\$
1
  • 1
    \$\begingroup\$ Welcome! Good first answer. It might be helpful for people reading your answer if you provide a short explanation of your code or an ungolfed version. It may also be helpful to provide a link to an online interpreter with your runnable code (see some other answers for examples). Many use TIO, and here's the gcc interpreter \$\endgroup\$ Commented Jun 23, 2019 at 0:13
1
\$\begingroup\$

K4, 14 13 bytes

Solution:

avg .Q.a in _ 

Explanation:

Stolen from, or rather inspired by, Luis Mendo's Octave solution...

avg .Q.a in _ / the solution
            _ / lowercase the input
         in   / 'in' function
    .Q.a      / "abcdefghijklmnopqrstuvwxyz"
avg           / average (mean)
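
The same "mean of a membership mask" idea can be sketched in Python. This is only an illustration of the technique, not a translation of the K4 internals, and `coverage_by_mean` is a made-up name.

```python
from string import ascii_lowercase

def coverage_by_mean(text: str) -> float:
    lowered = text.lower()
    # One boolean per alphabet letter: does that letter occur in the input?
    mask = [letter in lowered for letter in ascii_lowercase]
    # True counts as 1 and False as 0, so the mean is (letters present) / 26.
    return sum(mask) / len(mask)

print(coverage_by_mean("The quick brown fox jumps over the lazy dog"))  # 1.0
```

Because the mask always has 26 entries, averaging it avoids an explicit division by 26.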
\$\endgroup\$
1
\$\begingroup\$

Stax 1.1.4 online interpreter, 8 bytes, noncompeting

äQæ╟r◘Oñ 

Run and debug it at staxlang.xyz!

Unpacked (9 bytes) and explanation:

Va%26!/vN
Va           Push the lowercase alphabet
  %          Length...?! Shouldn't this always be 26?
   26!       Push 26.0
      /      Divide
       vN    Subtract from one (decrement and negate)

That shouldn't work. Looking at it, you would expect an output of 0 always. Heck, it doesn't even take input! There's a bug in the online interpreter, however, which I have exploited for this answer.

Now, I've marked this answer noncompeting for a reason. As far as I can tell, this exploit requires some human interaction to set up. Here's what you gotta do:

  • Put your input in the input field
  • Unpack
  • Insert v at the start of the code and |b immediately after Va
  • Run
  • Remove the characters you added to the code
  • Repack

Now you have an 8-byte program that will give the correct output each time you run it! At least until you change the input field or reload the page.

What in the seven hells is going on:

Va%26!/vN
Va           Push the lowercase alphabet, EXCLUDING characters that existed in the
             input field in either case any time the vVa|b version was run
  %26!/vN    Everything else works as expected

That little bug handles case checking and filtering for free, at the expense of leaving me with the wrong set of letters (wasting two bytes on the vN). I think this can be improved rather easily, but I'm at work right now.

\$\endgroup\$
1
\$\begingroup\$

Python 3, 51 49 bytes

51 -> 49 bytes, thanks to alexz02

lambda s:len({*filter(str.isalpha,s.lower())})/26 

Try it online!

\$\endgroup\$
2
  • \$\begingroup\$ 49 bytes lambda s:len({*filter(str.isalpha,s.lower())})/26 \$\endgroup\$ Commented Jun 24, 2019 at 8:34
  • \$\begingroup\$ @alexz02 Thank you! :) \$\endgroup\$ Commented Jun 24, 2019 at 8:42
1
\$\begingroup\$

Haskell, 52 bytes

f s=sum[1|n<-[0..25],or[elem([c..]!!n)s|c<-"aA"]]/26 

Try it online!

Avoids case conversion (which base Haskell lacks) and ASCII-code conversions (which are lengthy) in favor of writing [c..] to enumerate characters. For example, ['A'..] is a very long list that starts with ABCDEFGHI....
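
For comparison, the same strategy can be sketched in Python: for each offset 0..25, test whether the letter at that offset from 'a' or from 'A' occurs in the input. The function name is hypothetical and the sketch is only meant to mirror the structure of the list comprehension above.

```python
def coverage_without_case_conversion(text: str) -> float:
    count = 0
    for n in range(26):
        # The n-th letter in each case, found by offsetting from 'a' and 'A'.
        candidates = (chr(ord(base) + n) for base in "aA")
        # Count this letter once if either case appears in the input.
        if any(c in text for c in candidates):
            count += 1
    return count / 26

print(coverage_without_case_conversion("Zz!"))  # 0.038461538461538464
```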

\$\endgroup\$
1
\$\begingroup\$

Ohm v2, 7 bytes

ÁαA}εÆm 

Try it online!

\$\endgroup\$
1
\$\begingroup\$

jq -R, 36 + 3 = 39 bytes

1/length*([scan("[a-zA-Z]")]|length) 

The -R flag is required, otherwise stdin needs to be a quoted string.
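
As far as I can tell, the expression divides the number of letter matches by the length of the input, which is not the same as counting distinct letters. A small Python sketch of the difference (my reading of the jq expression, not verified against jq itself):

```python
import re

pangram = "The quick brown fox jumps over the lazy dog"

# Every letter occurrence, duplicates included (what scan("[a-zA-Z]") collects).
matches = re.findall(r"[a-zA-Z]", pangram)

# Matches divided by input length: the share of characters that are letters.
share_of_chars = len(matches) / len(pangram)

# Distinct letters, case-insensitive, divided by 26: what the challenge asks for.
distinct_share = len({m.lower() for m in matches}) / 26

print(len(matches), len(pangram))  # 35 43
print(distinct_share)              # 1.0
```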

\$\endgroup\$
1
  • \$\begingroup\$ Does this work? For instance, it counts all occurrences, including duplicates, so the quick brown fox ends up having less than 100%. See the tests on TIO. \$\endgroup\$ Commented Jul 11, 2023 at 23:29
1
\$\begingroup\$

Emacs Lisp, 91 bytes

(lambda(a)(load"cl")(/(count-if(lambda(x)(< ?` x ?{))(remove-duplicates(downcase a)))26.0)) 

Try it online!

\$\endgroup\$
1
\$\begingroup\$

C (gcc), 72 bytes

i;j;f(char*s){for(i=j=0;i++<26;)j+=index(s,i+64)!=index(s,i+96);j/=.26;} 

Try it online!

-9B from c--

x86 opcode, 35 bytes

00000180: 575e 31ff ac2c 4172 030f abc7 0441 75f4  W^1..,Ar.....Au.
00000190: c1e7 06f3 0fb8 c750 df04 2458 6a1a de34  .......P..$Xj..4
000001a0: 2458 c3

Try it online!

\$\endgroup\$
0
1
\$\begingroup\$

Go, 210 196 bytes

import(."strings";U"unicode")
func f(s string)float64{s=Map(func(r rune)rune{if!U.IsLetter(r){r=-1};return r},ToLower(s))
m:=make(map[rune]int)
for _,r:=range s{m[r]++}
return float64(len(m))/26.}

Attempt This Online!

Returns a float in [0,1].

\$\endgroup\$
1
\$\begingroup\$

Vyxal 3, 6 bytes

ʀuȦL₂÷ 

Try it Online!

length of unique lowercase letters divided by 26

\$\endgroup\$
1
\$\begingroup\$

Uiua SBCS, 9 bytes

÷26⧻◴▽±.⌵ 

Try it!

÷26⧻◴▽±.⌵
        ⌵  # uppercase
       .   # duplicate
      ±    # mask of cases (-1 for lower, 1 for upper, 0 for neither)
     ▽     # keep (only the letters)
    ◴      # remove duplicates
   ⧻       # length
÷26        # divided by 26
\$\endgroup\$
1
\$\begingroup\$

Scala 3, 50 bytes

_.filter(_.isLetter).toLowerCase.distinct.size/26f 

Attempt This Online!

\$\endgroup\$
0
\$\begingroup\$

V, 30, 29 bytes

ÓÁ òó㈁“±òAÝ/26.0|Óá C=" 

Try it online!

\$\endgroup\$
0
\$\begingroup\$

EXCEL, 55 bytes

Cell A1 as input. Enter it in any cell as an array formula with Ctrl+Shift+Enter. Add 2 bytes if the surrounding {} is included in the count.

=SUM(--ISNUMBER(FIND(CHAR(ROW(A65:A90)),UPPER(A1))))/26 
\$\endgroup\$
0
\$\begingroup\$

JavaScript (ES6), 61 54 51 bytes

s=>new Set(s.toLowerCase().match(/[a-z]/g)).size/26 
\$\endgroup\$
3
  • 1
    \$\begingroup\$ 57 bytes \$\endgroup\$ Commented Jun 21, 2019 at 11:36
  • 2
    \$\begingroup\$ You can't just do replace(/[^a-z]/g) because matched characters will be replaced with 'undefined' (as a string). For instance, "*abc" will create the set Set { 'u', 'n', 'd', 'e', 'f', 'i', 'a', 'b', 'c' }. \$\endgroup\$ Commented Jun 21, 2019 at 12:34
  • 2
    \$\begingroup\$ Was just about to post this myself: 51 bytes \$\endgroup\$ Commented Jun 21, 2019 at 14:04
0
\$\begingroup\$

MATLAB, 53 bytes

Anonymous function taking a string:

@(a)length(unique(upper(a(isstrprop(a,'alpha')))))/26 
\$\endgroup\$
0
0
\$\begingroup\$

Oracle, 199 bytes

CREATE FUNCTION f(s LONG)RETURN FLOAT IS r FLOAT;BEGIN SELECT COUNT(DISTINCT c)/26 INTO r FROM(SELECT LOWER(SUBSTR(s,LEVEL,1))c FROM dual CONNECT BY LEVEL<=LENGTH(s))WHERE c>'`'AND'{'>c;RETURN r;END; 

More readable version:

CREATE FUNCTION f(s LONG) RETURN FLOAT IS
  r FLOAT;
BEGIN
  SELECT COUNT(DISTINCT c) / 26
  INTO r
  FROM (
    SELECT LOWER(SUBSTR(s, LEVEL, 1)) AS c
    FROM dual
    CONNECT BY LEVEL <= LENGTH(s)
  )
  WHERE c > '`' AND '{' > c;
  RETURN r;
END;

Try it on SQL Fiddle!

\$\endgroup\$
0
\$\begingroup\$

Haskell, 86 Bytes

Still golfable, probably.

import Data.Char
import Data.List
((/26).toEnum.length.nub.filter isLower.(toLower<$>))

Is there any better way to convert an Int to a Float?

\$\endgroup\$
1
  • \$\begingroup\$ map toLower is shorter than (toLower<$>) \$\endgroup\$ Commented Aug 10, 2019 at 8:23
0
\$\begingroup\$

Wolfram Language (Mathematica), 39 bytes

Length[Alphabet[]⋂ToLowerCase@#]/26.& 

Try it online!

\$\endgroup\$
0
\$\begingroup\$

Wolfram Language (Mathematica), 71 70 59 bytes

Count[Union@ToCharacterCode@ToUpperCase@#,x_/;64<x<91]/26.& 

Try it online!

Thanks to attinat for suggesting Union to replace DeleteDuplicates.

\$\endgroup\$
1
  • \$\begingroup\$ Union is 11 bytes shorter than DeleteDuplicates. \$\endgroup\$ Commented Aug 5, 2019 at 5:02
0
\$\begingroup\$

C# .NET, 158 bytes

public class P{public static void Main(string[]z){var i=0;for(int q=65;q<91;q++)if(z[0].ToUpper().IndexOf((char)(q))>-1)i++;System.Console.Write(100D/26*i);}} 

Try Online

\$\endgroup\$
0
\$\begingroup\$

MathGolf, 8 bytes

▄æl!\╧]▓ 

Try it online!

Based on the 05AB1E solution, so be sure to upvote that. I noticed a bug in the "contains" operator which, if resolved, would remove the need for the swap.

Explanation

▄          lowercase alphabet as string
 æ         start block of length 4
  l!       push input lowercased
    \      swap top elements
     ╧     pop a, b, a.contains(b)
           loop ends here
      ]    end array / wrap stack in array
       ▓   get average of list
\$\endgroup\$
0
\$\begingroup\$

Factor + spelling, 44 bytes

[ >lower ALPHABET within cardinality 26 /f ] 

Try it online!

  • >lower Convert the input to lowercase.
  • ALPHABET Place the lowercase alphabet on the data stack.
  • within Take from the input only those elements that are in the alphabet.
  • cardinality Count how many elements are in the resulting set.
  • 26 /f Divide by 26 and force the result to be a float.
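
The steps in the list above map fairly directly onto Python. This is a loose sketch of the same pipeline, not a statement about how the Factor words are implemented; `coverage_pipeline` is a made-up name.

```python
from string import ascii_lowercase

def coverage_pipeline(text: str) -> float:
    lowered = text.lower()                                   # >lower
    kept = [ch for ch in lowered if ch in ascii_lowercase]   # ALPHABET within
    distinct = len(set(kept))                                # cardinality
    return distinct / 26                                     # 26 /f

print(coverage_pipeline("Hello, World!"))  # 0.2692307692307692
```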
\$\endgroup\$
0
\$\begingroup\$

AWK, 72 bytes

gsub("[^a-zA-Z]",z)+gsub(z," "){for(;$++a;)b+=!c[tolower($a)]++}$0=b/.26 

Try it online!

The first bit in this code is a test that's always truthy, so the associated code block will always run no matter what the input string contains. The purpose of the gsub calls is to remove all the non-alphabetic characters and then to split the input string into space-separated characters. That re-evaluates the positional parameters, making the input a list of individual a-zA-Z characters.

gsub("[^a-zA-Z]",z)+gsub(z," "){ } 

The code block can then iterate over each positional parameter to process all the alphabetic characters in the input string.

 for(;$++a;) 

The body of the loop is one AWK statement that does all the real work. It counts the number of times each unique (lowercase) character is seen. And anytime the current counter c[...char...] is 0 it increments the tally of unique characters.

 b+=!c[tolower($a)]++ 

Once all the characters have been scanned, the final answer is the tally divided by 0.26, that is, by the number of letters in the alphabet divided by 100, which yields a percentage. Assigning that value to $0 wipes out all the other positional parameters and prints the percentage by default.

 $0=b/.26 
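
The counting logic described above can be sketched in Python as follows. The names `seen_count` and `unique` stand in for the AWK variables c and b; everything else is illustrative.

```python
from collections import defaultdict

def coverage_percent(text: str) -> float:
    seen_count = defaultdict(int)   # plays the role of the AWK array c
    unique = 0                      # plays the role of the tally b
    for ch in text:
        # Skip anything outside a-zA-Z, as the first gsub does.
        if not ("a" <= ch <= "z" or "A" <= ch <= "Z"):
            continue
        key = ch.lower()
        # Add 1 only when this letter has not been seen before: b += !c[key]
        unique += seen_count[key] == 0
        # Post-increment the per-letter counter: c[key]++
        seen_count[key] += 1
    # Dividing by 0.26 is the same as multiplying by 100/26.
    return unique / 0.26

print(round(coverage_percent("Hello, World!"), 2))  # 26.92
```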
\$\endgroup\$
0
\$\begingroup\$

Thunno 2, 5 bytes

LẠ€Ƈm 

Try it online!

Thunno 2, 7 bytes

LỊUl26/ 

Try it online!

Explanations

LẠ€Ƈm  # Implicit input
L      # Lowercase
 Ạ     # Lowercase alphabet
  €    # Single function map:
   Ƈ   #  Contains
    m  # Mean
       # Implicit output

LỊUl26/  # Implicit input
L        # Lowercase
 Ị       # Only alphabetic
  U      # Uniquify
   l     # Length
    26/  # Divide by 26
         # Implicit output
\$\endgroup\$
0
\$\begingroup\$

x86-64 machine code, 47 bytes

Standard x64 System V ABI, signature long double get_acr(char* str). The string is modified by the function, so it cannot be read-only.

57 5e 31 c9 ac ff c1 80 66 ff df 84 c0 75 f5 6a 1a df 04 24 5a d9 ee 6a 41 58 57 51 f2 ae 75 04 d9 e8 de c1 59 5f ff c0 ff ca 75 ee de f1 c3 

Try it online!

Explanation

; long double get_acr(char* str)
; follows x64 SysV ABI
; input RDI: null-delimited writable string (gets modified)
; output ST0: percentage in range [0, 1]
get_acr:
    ; Get string length in rcx and convert characters to uppercase
    push rdi
    pop rsi                 ; get str reference in rsi
    xor ecx, ecx            ; len = 0
__get_acr_convlp:
    lodsb                   ; load byte at [rsi] into al and advance rsi
    ; non-zero char: increment length and handle conversion
    ; if it is zero, we increment, do (effectively) no conversions, and store
    ; back. this +1 in ecx doesn't matter though, since it's okay if we search
    ; for letters in the zero byte later on.
    inc ecx
    ; convert by removing bit for 32 to force lowercase letters uppercase
    ; in other words, and the byte with ~32 (0b1101 1111, 0xdf).
    and byte [rsi-1], 0xdf
    ; loop back up if al != 0
    test al, al
    jnz __get_acr_convlp
    ; init counter rdx: do 26 times
    ; also get 26 into st1 for division later
    push 26
    fild word [rsp]
    pop rdx
    fldz                    ; sum starts at 0.0
    ; We only have uppercase letters now to look for
    ; Initial letter: 'A'
    push 65
    pop rax
__get_acr_loop:
    push rdi                ; save rdi
    push rcx                ; save rcx
    ; rcx times: search for al
    repne scasb             ; go until rcx = 0 or found al (zf = 1)
    ; add 1 if zf, do nothing otherwise
    jnz __get_acr_dno
    fld1
    faddp
__get_acr_dno:
    pop rcx                 ; restore rcx
    pop rdi                 ; restore rdi
    inc eax                 ; next letter
    dec edx
    jnz __get_acr_loop      ; loop back to top edx times
    ; Divide st0 by st1 and pop to get the fraction to return
    fdivrp st1, st0
    ret
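
For readers who don't read assembly, here is a rough Python sketch of the same two-pass approach: mask every byte with 0xDF in place, then scan the buffer once per letter. It mirrors the structure only; the x87 arithmetic and register handling are not modelled, and ASCII input is assumed.

```python
def get_acr(data: bytearray) -> float:
    # Pass 1: clear bit 5 of every byte in place (AND with 0xDF), which turns
    # 'a'..'z' into 'A'..'Z'. Other ASCII bytes may change too, but they never
    # land in the 'A'..'Z' range unless they were letters to begin with.
    for i in range(len(data)):
        data[i] &= 0xDF
    # Pass 2: for each of the 26 uppercase letters, search the whole buffer.
    total = 0
    for letter in range(ord("A"), ord("Z") + 1):
        if letter in data:      # membership test on the raw byte values
            total += 1
    return total / 26

buf = bytearray(b"ab{")
print(get_acr(buf))  # 0.07692307692307693
print(buf)           # bytearray(b'AB[')
```

As in the machine code, the input buffer is modified, which is why a mutable `bytearray` is used here.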
\$\endgroup\$
0
\$\begingroup\$

Racket – 115 bytes

(~r(*(/(set-count(apply set(filter char-alphabetic?(string->list(string-upcase(read-line))))))26)100)#:precision 2) 

Try it online!


Explanation

First, we read user input and transform it into upper case. We then turn the string into a list of characters and remove any characters that aren't alphabetic. Once done, we transform the list into a set. A Set is a data structure that can only contain unique values. That's cool because it allows us to simply count the number of elements in the set and calculate the percentage of unique letters that are used. Once all is done, we return the result as a decimal string with the precision of 2.

(~r (* (/ (set-count (apply set (filter char-alphabetic? (string->list (string-upcase (read-line)))))) 26) 100) #:precision 2) 

One cool thing I just found out is that Racket uses exact rational arithmetic by default, so the division yields an exact fraction instead of a rounded float. If you were to remove ~r you'd see a fraction that looks like x/y.
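
The exact-fraction behaviour can be imitated in Python with the standard `fractions` module. This is a sketch of the same pipeline, not of Racket's numeric tower; it restricts itself to A-Z, whereas char-alphabetic? is Unicode-aware.

```python
from fractions import Fraction

def coverage_exact(text: str) -> Fraction:
    # Upper-case the input, keep only A-Z, and let a set drop the duplicates.
    letters = {c for c in text.upper() if "A" <= c <= "Z"}
    # Fraction keeps the ratio exact and reduces it to lowest terms.
    return Fraction(len(letters), 26)

ratio = coverage_exact('"Did you put your name in the Goblet of Fire, Harry?" he asked calmly.')
print(ratio)                        # 10/13
print(f"{float(ratio * 100):.2f}")  # 76.92
```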

\$\endgroup\$
0
\$\begingroup\$

C (gcc), 72 bytes

Outputs via exit code

r;a[];main(c){return~c?main(getchar(r+=a[c&=95]++<c/65-c/91)):r/.26+.5;} 

Try it online!

Commented

r;                          // int r is zero-initialized
a[];                        // same for int array a[]; it has size 1 (we'll pretend it's 96)
main(c){                    // function main returns int and takes an int `c`
  return ~c?                // if c != -1:
    main(                   // call main() recursively with
      getchar(              // next character from stdin or EOF(-1) (ignores argument)
        r+=                 // add the following to r:
          a[c&=95]++        // uppercase `c`, increment the value at its index in a[]
          <c/65-c/91        // if (old value of a[c]) < (c/'A' - c/'[') add 1, else 0
      )                     // end call to getchar()
    )                       // end call to main()
  :                         // else (we're at the end):
    r/.26+.5;               // return the accumulated result as a % of 26, rounded
}                           // end main() function

Explanation

Alphabetic characters are in the ranges \$1000001_2..1011010_2\$ (uppercase) and \$1100001_2..1111010_2\$ (lowercase), and \$95_{10} = 1011111_2\$, so c &= 95 makes characters uppercase and maps other symbols to values we don't care about, since they never land in that range.

For \$0 \le c \le 95\$ the expression \$\frac{c}{65} - \frac{c}{91}\$, evaluated with integer division, returns 1 if \$65 \le c \le 90\$ (c is uppercase) and 0 otherwise.

So a[c &= 95]++ < c/65 - c/91 checks that isalpha(c) holds and that this is the first time we see c in uppercase or lowercase (by incrementing the value we make sure subsequent occurrences of c will not match the expression).
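
The masking and integer-division trick is easy to check exhaustively for 7-bit ASCII. The Python sketch below is an illustration with made-up function names, and it assumes ASCII input since bytes above 127 would also be folded into the 0..95 range.

```python
def is_ascii_letter(c: int) -> bool:
    u = c & 95                       # clear bit 5 (and bit 7): 'a'..'z' -> 'A'..'Z'
    return u // 65 - u // 91 == 1    # 1 only when 65 <= u <= 90

# The trick agrees with str.isalpha() for every 7-bit ASCII code.
print(all(is_ascii_letter(c) == chr(c).isalpha() for c in range(128)))  # True

def coverage_percent_rounded(data: bytes) -> int:
    seen = [0] * 96                  # counters indexed by the masked character
    total = 0
    for c in data:
        u = c & 95
        # First sighting of a letter: old counter (0) < 1. Non-letters compare
        # against 0 and never count.
        if seen[u] < u // 65 - u // 91:
            total += 1
        seen[u] += 1
    return int(total / 0.26 + 0.5)   # percentage of 26, rounded to nearest

print(coverage_percent_rounded(b"The quick brown fox jumps over the lazy dog"))  # 100
```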

\$\endgroup\$
0
\$\begingroup\$

Arturo, 40 bytes

$=>[//size unique match lower&{[a-z]}26] 

Try it!

\$\endgroup\$