53

I would like to know how to split up a large string into a series of smaller strings or words. For example:

I want to walk my dog.

I want to end up with one string "I", another string "want", and so on.

How would I do this?

2
  • 4
    Please show what you've tried (did you look for the word "split" in the docs for String, for example?) Commented Jul 30, 2012 at 16:53
  • 11
    Yes, String#split() is named very ambiguously :-P Commented Jul 30, 2012 at 16:53

15 Answers

90

Use the split() method.

For example:

    String s = "I want to walk my dog";
    String[] arr = s.split(" ");

    for (String ss : arr) {
        System.out.println(ss);
    }

2 Comments

This method will not remove commas, dots, and so on from the words.
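To illustrate the point in the comment above, here is a minimal sketch that splits on single spaces and then trims leading and trailing punctuation from each token. The class name PunctuationDemo and the helper splitAndStrip are made up for this example, and the \p{Punct} class only covers ASCII punctuation.

```java
import java.util.Arrays;

class PunctuationDemo {
    // Hypothetical helper: split on single spaces, then strip ASCII punctuation
    // from the start and end of every token.
    static String[] splitAndStrip(String text) {
        String[] tokens = text.split(" ");
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = tokens[i].replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
        }
        return tokens;
    }

    public static void main(String[] args) {
        String s = "I want to walk my dog, now.";
        // Plain split(" ") keeps the comma and the full stop attached to the words.
        System.out.println(Arrays.toString(s.split(" ")));
        // After stripping, the tokens are bare words.
        System.out.println(Arrays.toString(splitAndStrip(s)));
    }
}
```

Punctuation inside a token (for example the apostrophe in "don't") is left alone, because only the edges of each token are trimmed.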
71

As a more general solution (but ASCII only!), to include any other separators between words (like commas and semicolons), I suggest:

    String s = "I want to walk my dog, cat, and tarantula; maybe even my tortoise.";
    String[] words = s.split("\\W+");

The regex means that the delimiters are runs of one or more [+] non-word characters [\W], that is, anything other than [A-Za-z0-9_]. Because + is greedy, consecutive delimiters such as ';' followed by ' ' are consumed together as a single delimiter.
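One edge case worth knowing: when the text starts with a delimiter, String.split returns an empty first element, while trailing empty strings are dropped. A small sketch (the class and method names are only illustrative):

```java
import java.util.Arrays;

class NonWordSplitDemo {
    // Splits on runs of non-word characters, as in the answer above.
    static String[] words(String text) {
        return text.split("\\W+");
    }

    public static void main(String[] args) {
        // The trailing '.' does not produce a trailing empty element.
        System.out.println(Arrays.toString(
                words("I want to walk my dog, cat, and tarantula; maybe even my tortoise.")));
        // The leading "..." is a delimiter at index 0, so the first element is "".
        System.out.println(Arrays.toString(words("...wait, what?")));
    }
}
```

If the leading empty element is unwanted, it can be filtered out afterwards or avoided by matching words instead of splitting on delimiters.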

1 Comment

\\W only seems to consider ASCII alphabetic characters. It isn't suitable for languages with accents.
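One possible way around the ASCII limitation, sketched here as an assumption rather than something the answer proposes, is to compile the pattern with Pattern.UNICODE_CHARACTER_CLASS so that \w and \W follow Unicode definitions. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class UnicodeWordSplitDemo {
    // Default mode: \W is "anything outside [a-zA-Z_0-9]".
    private static final Pattern ASCII_NON_WORD = Pattern.compile("\\W+");
    // Unicode mode: accented letters count as word characters.
    private static final Pattern UNICODE_NON_WORD =
            Pattern.compile("\\W+", Pattern.UNICODE_CHARACTER_CLASS);

    static String[] asciiSplit(String text) {
        return ASCII_NON_WORD.split(text);
    }

    static String[] unicodeSplit(String text) {
        return UNICODE_NON_WORD.split(text);
    }

    public static void main(String[] args) {
        // \u00e0 is 'a' with grave accent, \u00fc is 'u' with diaeresis.
        String s = "Je veux promener mon chien \u00e0 Z\u00fcrich";
        System.out.println(Arrays.toString(asciiSplit(s)));   // accented letters act as delimiters
        System.out.println(Arrays.toString(unicodeSplit(s))); // accented words stay intact
    }
}
```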
33

A regex can also be used to split words.

\w can be used to match word characters ([A-Za-z0-9_]), so that punctuation is removed from the results:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String s = "I want to walk my dog, and why not?";
    Pattern pattern = Pattern.compile("\\w+");
    Matcher matcher = pattern.matcher(s);

    while (matcher.find()) {
        System.out.println(matcher.group());
    }

Outputs:

    I
    want
    to
    walk
    my
    dog
    and
    why
    not

See the Java API documentation for Pattern.

2 Comments

Does this produce empty words?
@JoshuaOliphant Not if you use trim() before.
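On the question of empty words: since \w+ requires at least one character, every match returned by find() should be non-empty, with or without a prior trim(). A sketch that collects the matches into a list to check this (the class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordMatcherDemo {
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Collects every run of word characters; delimiters are simply skipped.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Leading and trailing spaces or punctuation do not create empty entries.
        System.out.println(words("  ...I want, to walk!  "));
    }
}
```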
15

See my other answer if your phrase contains accented characters:

String[] listeMots = phrase.split("\\P{L}+"); 

2 Comments

This is the best answer.
However, some diacritics are not handled, e.g. فنّى will be split into two words.
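The diacritics problem mentioned in the last comment comes from combining marks, which belong to the Unicode category M rather than L. One possible variation, offered as an assumption and not as part of the answer, is to treat marks as part of a word as well. The names below are hypothetical.

```java
import java.util.Arrays;

class LetterMarkSplitDemo {
    // Delimiter: any run of non-letters (the pattern from the answer).
    static String[] lettersOnly(String text) {
        return text.split("\\P{L}+");
    }

    // Delimiter: any run of characters that are neither letters nor combining marks.
    static String[] lettersAndMarks(String text) {
        return text.split("[^\\p{L}\\p{M}]+");
    }

    public static void main(String[] args) {
        // "resume" written with U+0301 (combining acute accent) after each 'e'.
        String s = "re\u0301sume\u0301 ready";
        System.out.println(Arrays.toString(lettersOnly(s)));     // the word is cut at each mark
        System.out.println(Arrays.toString(lettersAndMarks(s))); // the word stays whole
    }
}
```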
6

Yet another method, using StringTokenizer:

    import java.util.StringTokenizer;

    String s = "I want to walk my dog";
    StringTokenizer tokenizer = new StringTokenizer(s);

    while (tokenizer.hasMoreTokens()) {
        System.out.println(tokenizer.nextToken());
    }

3 Comments

Ah, this is good for cases where I don't need an array. But isn't the tokenizer returning an array of tokens? Nice idea, though.
No, no array is produced. StringTokenizer looks for consecutive tokens in the string and returns them one by one.
Nice solution. Unfortunately, StringTokenizer should not be used anymore. From the docs: "StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead."
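For readers who want token-by-token processing without StringTokenizer and without building an array, one option in the java.util.regex package (Java 8 or later) is Pattern.splitAsStream. This is a sketch under that assumption; the class and method names are made up.

```java
import java.util.regex.Pattern;
import java.util.stream.Stream;

class LazyWordsDemo {
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Returns the words as a stream; trim() avoids an empty first element
    // when the text starts with whitespace.
    static Stream<String> words(String text) {
        return WHITESPACE.splitAsStream(text.trim());
    }

    public static void main(String[] args) {
        // Prints one word per line, like the StringTokenizer loop.
        words("I want to walk my dog").forEach(System.out::println);
    }
}
```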
4

To treat everything except lower-case and upper-case ASCII letters as a separator between words, we can do:

    String mystring = "hi, there,hi Leo";
    String[] arr = mystring.split("[^a-zA-Z]+");

    for (int i = 0; i < arr.length; i += 1) {
        System.out.println(arr[i]);
    }

Here the regex means that the separators are anything that is not an upper- or lower-case letter [^a-zA-Z], in groups of at least one [+].
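A side effect of this pattern is that digits and apostrophes also count as separators. The sketch below shows the effect and one possible adjustment; the wider character class is an assumption about what a caller might want, and the names are illustrative.

```java
import java.util.Arrays;

class LettersOnlySplitDemo {
    // Pattern from the answer: everything except ASCII letters separates words.
    static String[] lettersOnly(String text) {
        return text.split("[^a-zA-Z]+");
    }

    // Variation: digits and apostrophes are kept as part of a word.
    static String[] lettersDigitsApostrophes(String text) {
        return text.split("[^a-zA-Z0-9']+");
    }

    public static void main(String[] args) {
        String s = "It's 9 o'clock, Leo";
        System.out.println(Arrays.toString(lettersOnly(s)));              // [It, s, o, clock, Leo]
        System.out.println(Arrays.toString(lettersDigitsApostrophes(s))); // [It's, 9, o'clock, Leo]
    }
}
```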


2

You can use the split(" ") method of the String class to get each word, as in the code below:

    String s = "I want to walk my dog";
    String[] strArray = s.split(" ");

    for (int i = 0; i < strArray.length; i++) {
        System.out.println(strArray[i]);
    }


2

This regex splits words on any run of whitespace, such as spaces, tabs, and line breaks:

String[] str = s.split("\\s+"); 
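Leading whitespace still produces an empty first element with this pattern, so it is common to trim first. A minimal sketch, with a hypothetical helper that also returns an empty array for blank input:

```java
import java.util.Arrays;

class WhitespaceSplitDemo {
    // Trims, then splits on runs of whitespace. Blank input gives an empty array
    // (a plain "".split(...) would give one empty string instead).
    static String[] words(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String s = "  I want\tto\nwalk  ";
        System.out.println(Arrays.toString(s.split("\\s+"))); // [, I, want, to, walk]
        System.out.println(Arrays.toString(words(s)));        // [I, want, to, walk]
    }
}
```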


1

Use split():

    String[] words = stringInstance.split(" ");

1 Comment

Please go through the link from answer
1
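    StringTokenizer separate = new StringTokenizer(s, " ");
    // Loop so that every word is printed, not only the first one.
    while (separate.hasMoreTokens()) {
        String word = separate.nextToken();
        System.out.println(word);
    }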


1

Java String split() method example:

    public class SplitExample {
        public static void main(String[] args) {
            String str = "java string split method";
            String[] words = str.split("\\s"); // splits the string at each whitespace character
            for (String word : words) {
                System.out.println(word);
            }
        }
    }
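Note that "\\s" matches exactly one whitespace character, so two spaces in a row yield an empty string between them, whereas "\\s+" treats the whole run as one delimiter. A short comparison (the class and method names are only for illustration):

```java
import java.util.Arrays;

class SingleVsRunDemo {
    // One whitespace character per delimiter.
    static String[] splitSingle(String text) {
        return text.split("\\s");
    }

    // One or more whitespace characters per delimiter.
    static String[] splitRun(String text) {
        return text.split("\\s+");
    }

    public static void main(String[] args) {
        String s = "java  string\tsplit"; // two spaces, then a tab
        System.out.println(Arrays.toString(splitSingle(s))); // [java, , string, split]
        System.out.println(Arrays.toString(splitRun(s)));    // [java, string, split]
    }
}
```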


1

You can use the StringUtils class from Apache Commons Lang:

    String[] partsOfString = StringUtils.split("I want to walk my dog", StringUtils.SPACE);


0
    import java.util.StringTokenizer;

    class test {
        public static void main(String[] args) {
            StringTokenizer st = new StringTokenizer("I want to walk my dog.");
            while (st.hasMoreTokens()) {
                System.out.println(st.nextToken());
            }
        }
    }


0

Using Java Stream API:

    String sentence = "I want to walk my dog.";
    Arrays.stream(sentence.split(" "))
          .forEach(System.out::println);

Output:

    I
    want
    to
    walk
    my
    dog.

Or

    String sentence2 = "I want to walk my dog.";
    Arrays.stream(sentence2.split(" "))
          .map(str -> str.replace(".", ""))
          .forEach(System.out::println);

Output:

    I
    want
    to
    walk
    my
    dog
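If the words are needed afterwards rather than just printed, the same stream can be collected into a list. The sketch below swaps in a "\\W+" delimiter so punctuation is dropped without a separate replace step; that choice and the names used are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamWordsDemo {
    // Splits on runs of non-word characters and drops any empty leading element.
    static List<String> words(String sentence) {
        return Arrays.stream(sentence.split("\\W+"))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(words("I want to walk my dog.")); // [I, want, to, walk, my, dog]
    }
}
```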


-1
String[] str = s.split("[^a-zA-Z]+"); 

1 Comment

Pattern matching of your own is usually not the best way to go; use solutions of people who have done that already and thought of all the weird corner cases that you don't think of at the moment of writing. Also, as a rule of thumb, I would rather go with a whitelist of whitespace characters here instead of trying to match the words as you miss out on umlauts etc.
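In the spirit of the comment above, one ready-made option in the JDK is java.text.BreakIterator, which finds locale-aware word boundaries. This is a sketch, not something proposed in the thread; the class name, the method name, and the letter-or-digit filter are assumptions made for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class BreakIteratorWordsDemo {
    // Walks the word boundaries and keeps the segments that start with a
    // letter or digit; whitespace and punctuation segments are skipped.
    static List<String> words(String text, Locale locale) {
        List<String> result = new ArrayList<>();
        BreakIterator boundaries = BreakIterator.getWordInstance(locale);
        boundaries.setText(text);
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            String segment = text.substring(start, end);
            if (Character.isLetterOrDigit(segment.codePointAt(0))) {
                result.add(segment);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("I want to walk my dog, and why not?", Locale.ENGLISH));
    }
}
```

Because the boundary rules come from the JDK rather than a hand-written regex, accented letters such as umlauts are treated as part of a word.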
