How to count the number of words using only String methods and a more efficient regex

 
Ranch Hand
Posts: 77
Input data
You will read the text of the email from the keyboard. It can span multiple lines and contains only lowercase letters of the English alphabet and spaces.

Output data
A single integer will be displayed, representing the number of distinct words in the email.

Limitation
The text does not contain more than 100,000 words, and each word can contain up to 25 characters.

Input data:
thanks for the list of shopping
Is helpful
thanks

Output data:
7

What I have tried:

 
Marshal
Posts: 6209
501
IntelliJ IDE Python TypeScript Java Linux
Hi, welcome to the Ranch!

What does your program output? How does that differ from what you expect?

P.S. The post editor has a "Code" button that formats code so it is easier to read. I've applied it to your post this time, as you can see.
 
Bartender
Posts: 15743
368
Welcome to CodeRanch!

Why do you need to replace the call to split()? Is it a requirement that you use regular expressions?

Why are you printing the number of words minus 1? Presumably because otherwise your output doesn't match the expected output for the given input data. However, you should solve it in a different way. The assignment says that you should print the number of distinct words in the email, meaning you must ignore words that you've already seen.
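To make "ignore words that you've already seen" concrete, here is a minimal sketch, assuming the whole email is read from standard input and words are separated by runs of whitespace. The names `DistinctWords` and `countDistinct` are made up for illustration and are not from the original post.

```java
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

// Hypothetical helper class, for illustration only.
class DistinctWords {

    // Returns how many different words the text contains.
    // A Set silently ignores a word that was added before.
    static int countDistinct(String text) {
        Set<String> seen = new HashSet<>();
        // "\\s+" matches one or more whitespace characters (spaces, tabs, line ends).
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) { // splitting an empty string yields one empty token
                seen.add(word);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder();
        try (Scanner in = new Scanner(System.in)) {
            while (in.hasNextLine()) {
                // Join the lines with a space so words on different lines stay separate.
                text.append(in.nextLine()).append(' ');
            }
        }
        System.out.println(countDistinct(text.toString()));
    }
}
```

Because the set does the de-duplication, there is no need to adjust the result by 1 afterwards.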
 
Cris Marinescu
Ranch Hand
Posts: 77
"Is it a requirement that you use regular expressions?" - Yes
"Why are you printing the number of words minus 1?" - My mistake, I'll fix it.
"Presumably because otherwise your output doesn't match the expected output for the given input data." - Exactly. I get a wrong answer on the last test when I submit my solution on the platform where I'm learning.
 
Tim Cooke
Marshal
Posts: 6209
501
IntelliJ IDE Python TypeScript Java Linux
String.split() takes a regular expression, and " " is technically a valid regular expression.
 
lowercase baba
Posts: 13091
67
Chrome Java Linux
I'm confused. Granted it's early...but...

Input data:
thanks(1) for(2) the(3) list(4) of(5) shopping(6)
Is(7) helpful(8)
thanks(duplicate of 1)

Output data:
7


Shouldn't the output be 8?
 
Tim Cooke
Marshal
Posts: 6209
501
IntelliJ IDE Python TypeScript Java Linux
Good spot, Fred. Funny how the rest of us never even thought to question the question.
 
Bartender
Posts: 5752
217
That was my first thought as well, but then I realized that 'distinct' was meant in the sense of the unique words.
 
fred rosenberger
lowercase baba
Posts: 13091
67
Chrome Java Linux
So since "thanks" is repeated, it doesn't count at all?

Also, how should we count "Is"? The specs say we will only get lowercase letters, but that has a capital "I".
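If the expected output of 7 is taken at face value, one possible reading is that a repeated word is dropped entirely and only words that occur exactly once are counted. That is an assumption about what the platform wants, not something the problem statement confirms. A rough sketch of that reading (the names `WordsSeenOnce` and `countSeenOnce` are hypothetical, and "Is" is compared case-sensitively as written):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class illustrating the "occurs exactly once" reading.
class WordsSeenOnce {

    // Counts the words that appear exactly one time in the text.
    static int countSeenOnce(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                // Start at 1 for a new word, otherwise add 1 to the stored count.
                counts.merge(word, 1, Integer::sum);
            }
        }
        int once = 0;
        for (int n : counts.values()) {
            if (n == 1) {
                once++;
            }
        }
        return once;
    }

    public static void main(String[] args) {
        String sample = "thanks for the list of shopping\nIs helpful\nthanks";
        System.out.println(countSeenOnce(sample)); // prints 7
    }
}
```

Under this reading "thanks" contributes nothing because it occurs twice, which leaves the seven remaining words.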
 
Marshal
Posts: 81615
593
Welcome to the Ranch (again)

Tim Cooke wrote:. . . " " is technically a valid regular expression

Yes, it is. But that only works if the words are separated by single spaces; with multiple spaces or line ends, it will give you the wrong answer. I suggest you try a hash set instead of a tree set unless you need the words sorted. Why did you use − 1 in line 23?
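A small sketch of the difference, using a made-up input with a double space and a line end. The class and method names are hypothetical; only `String.split`, `String.trim` and `Arrays.toString` are standard library calls.

```java
import java.util.Arrays;

// Hypothetical demo class comparing two split patterns.
class SplitDemo {

    // Splits on each single space character.
    static String[] splitOnSingleSpace(String text) {
        return text.split(" ");
    }

    // Splits on runs of any whitespace, after trimming the ends.
    static String[] splitOnWhitespaceRuns(String text) {
        return text.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String text = "thanks  for the\nlist";
        // The double space produces an empty token, and "the\nlist" stays glued together.
        System.out.println(Arrays.toString(splitOnSingleSpace(text)));
        // Runs of spaces and the line end are all treated as one separator.
        System.out.println(Arrays.toString(splitOnWhitespaceRuns(text)));
    }
}
```

With `" "` the empty token and the glued `"the\nlist"` would both end up in the set as if they were words, which skews the count.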
 