0

The task is to split a string and report how many words it contains.

Scanner in = new Scanner(System.in);
String st = in.nextLine();
String[] tokens = st.split("[\\W]+");

When I enter an empty line as input and print the number of tokens, I get one, but I want zero. What should I do? Here the delimiters are all the symbols.

3
  • Why use \W+? Use \S+ and match, don't split. Commented Oct 12, 2015 at 9:53
  • alternatively you could use \b+ instead Commented Oct 12, 2015 at 10:47
  • Here the delimiters are all the symbols. Commented Oct 13, 2015 at 18:21
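The first comment's suggestion to match words rather than split on delimiters can be sketched roughly as follows. Because the delimiters in the question are all the symbols, this sketch matches \w+ instead of \S+; the class and method names are hypothetical and only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchCounter {
    // One or more word characters ([a-zA-Z_0-9] by default in java.util.regex).
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Counts words by finding matches, so no empty tokens can appear.
    static int countWords(String line) {
        Matcher m = WORD.matcher(line);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords(""));              // 0
        System.out.println(countWords("hello, world!")); // 2
    }
}
```

With this approach an empty line simply yields no matches, so no special handling of leading or trailing delimiters is needed.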

2 Answers

1

Short answer: To get the tokens in str (determined by whitespace separators), you can do the following:

String str = ... // some string
str = str.trim() + " "; // modify the string for the reasons described below
String[] tokens = str.split("\\s+");

Longer answer:

First of all, the argument to split() is a regular expression describing the delimiter, in this case "\\s+", which matches one or more whitespace characters.

If you look carefully at the Javadoc of String#split(String, int) (which is what String#split(String) calls), you will see why it behaves like this.


If the expression does not match any part of the input then the resulting array has just one element, namely this string.

This is why "".split("\\s+") returns an array containing one empty string, [""], so you need to append a space to avoid this. " ".split("\\s+") returns an empty array with 0 elements, as you want.
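A minimal sketch of that difference, using a hypothetical helper tokenCount that is not part of the original answer:

```java
class EmptyInputSplit {
    // Number of elements that split("\\s+") produces for the given input.
    static int tokenCount(String s) {
        return s.split("\\s+").length;
    }

    public static void main(String[] args) {
        // No whitespace to match, so the result is the whole (empty) string: [""]
        System.out.println(tokenCount(""));  // 1
        // The single space matches; only empty strings remain and they are discarded
        System.out.println(tokenCount(" ")); // 0
    }
}
```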


When there is a positive-width match at the beginning of this string then an empty leading substring is included at the beginning of the resulting array.

This is why " a".split("\\s+") would return ["", "a"], so you need to trim() the string first to remove whitespace from the beginning.
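The effect of a leading delimiter can be checked with a small sketch like the one below; the class and helper names are made up for illustration.

```java
import java.util.Arrays;

class LeadingDelimiterDemo {
    // Splits on whitespace without any preprocessing.
    static String[] splitRaw(String s) {
        return s.split("\\s+");
    }

    // Removes leading and trailing whitespace before splitting.
    static String[] splitTrimmed(String s) {
        return s.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // The leading space produces an empty first element.
        System.out.println(Arrays.toString(splitRaw(" a")));     // [, a]
        // After trim() there is no leading match, so only the word remains.
        System.out.println(Arrays.toString(splitTrimmed(" a"))); // [a]
    }
}
```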


If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

Since String#split(String) calls String#split(String, int) with the limit argument of zero, you can add whitespace to the end of the string without changing the number of words (because trailing empty strings will be discarded).
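To see the role of the limit argument, the following sketch compares a limit of zero with a negative limit, which keeps trailing empty strings. The helper countWithLimit is a hypothetical name used only here.

```java
class TrailingEmptyDemo {
    // Number of elements produced by split("\\s+", limit).
    static int countWithLimit(String s, int limit) {
        return s.split("\\s+", limit).length;
    }

    public static void main(String[] args) {
        // limit 0 (what split(String) uses): the trailing empty string is discarded
        System.out.println(countWithLimit("a b ", 0));  // 2
        // negative limit: the trailing empty string is kept
        System.out.println(countWithLimit("a b ", -1)); // 3
    }
}
```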

UPDATE:

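If the delimiter is "\\W+", it's slightly different because you can't use trim() for that: trim() strips only leading and trailing whitespace, not other non-word characters such as punctuation, so replaceAll() removes the leading delimiters instead: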

String str = ...
str = str.replaceAll("^\\W+", "") + " ";
String[] tokens = str.split("\\W+");
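Wrapped into a method, the same idea might look like the sketch below. The class WordCounter and the method countWords are hypothetical names; note that with the default regex flags \W treats only [a-zA-Z_0-9] as word characters, so this assumes ASCII input.

```java
class WordCounter {
    // Counts words separated by runs of non-word characters.
    static int countWords(String line) {
        // Drop leading delimiters so no empty first token appears, then append
        // a delimiter so an empty input still contains a match for "\\W+".
        String prepared = line.replaceAll("^\\W+", "") + " ";
        return prepared.split("\\W+").length;
    }

    public static void main(String[] args) {
        System.out.println(countWords(""));              // 0
        System.out.println(countWords("hello, world!")); // 2
        System.out.println(countWords("  ...a"));        // 1
    }
}
```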

3 Comments

@LoganMzz I have tried it and "\na".trim().split("\\s+") just returns ["a"]. trim() compares characters to \u0020 and treats every character with a code less than or equal to that as whitespace, so the newline is removed as well.
Here the delimiters are all the symbols.
@harry That will be similar, except replaceAll() has to be used instead of trim() - please see the update.
0
public static void main(String[] args) {
    Scanner in = new Scanner(System.in);
    String line = null;
    while (!(line = in.nextLine()).isEmpty()) {
        // logic
    }
    System.out.print("Empty Line");
}

output

Empty Line 
