Are you completely sure you need to treat "that's" as two words (viz. "that is")?
Ordinarily, I believe "that's" is counted as one word in English.
But if your reading of the requirements is correct, you have a (moderately) difficult problem: I don't think there is any (reasonable) regex that can distinguish between something like "that's" (contraction of "that" and "is") and something like "steve's" (possessive), because the two have exactly the same shape.
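To illustrate the point, here is a minimal sketch (the class name, method name, and pattern are my own, purely for demonstration). A pattern that finds words ending in 's picks up the contraction and the possessive alike, and nothing in the match tells you which one you got:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ApostropheDemo {
    // Matches any word ending in 's. The pattern only sees the shape of the
    // word, so it cannot tell a contraction from a possessive.
    private static final Pattern APOSTROPHE_S = Pattern.compile("\\b\\w+'s\\b");

    // Collects every match found in the given text, in order of appearance.
    static List<String> findApostropheS(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = APOSTROPHE_S.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Both "that's" and "steve's" are reported.
        System.out.println(findApostropheS("that's steve's car"));
    }
}
```

Telling the two apart needs knowledge about the words themselves, which is why a lookup list is probably the more practical route.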
AFAIK you will have to write something yourself.
Suggestion: take a look at this list of English language contractions. You could use it to make an enumeration of the things you need to handle in a special way.
Basic Example
    enum Contraction {
        AINT("ain't", "is not"),
        ARENT("aren't", "are not"),
        // Many, many in between...
        YOUVE("you've", "you have");

        private final String oneWord;
        private final String twoWords;

        private Contraction(String oneWord, String twoWords) {
            this.oneWord = oneWord;
            this.twoWords = twoWords;
        }

        public String getOneWord() {
            return oneWord;
        }

        public String getTwoWords() {
            return twoWords;
        }
    }

    String s = "That's a good question".toLowerCase();
    for (Contraction c : Contraction.values()) {
        // replace() treats the contraction as a literal, not as a regex
        s = s.replace(c.getOneWord(), c.getTwoWords());
    }
    String[] words = s.split("\\s+");
    // And so forth...
NOTE: This example sidesteps case sensitivity by converting the entire input to lower case, so that it matches the lower-case entries in the enum. If you need to preserve the original casing, you will have to handle it in another way.
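One possible "other way", as a rough sketch rather than a definitive solution: match each contraction case-insensitively so the rest of the input keeps its casing. The class name, the map, and its four entries are my own placeholders (in particular, reading "that's" as "that is" rather than "that has" is an assumption), and the expansion itself is always inserted in lower case.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContractionExpander {
    // Hypothetical, heavily abbreviated lookup table: contraction -> expansion.
    private static final Map<String, String> CONTRACTIONS = new LinkedHashMap<>();
    static {
        CONTRACTIONS.put("ain't", "is not");
        CONTRACTIONS.put("aren't", "are not");
        CONTRACTIONS.put("that's", "that is"); // assumed reading; could also be "that has"
        CONTRACTIONS.put("you've", "you have");
    }

    // Replaces every known contraction, ignoring case, and leaves all other
    // text (including possessives such as "Steve's") untouched.
    static String expand(String input) {
        String result = input;
        for (Map.Entry<String, String> e : CONTRACTIONS.entrySet()) {
            // Pattern.quote keeps the contraction literal; \b restricts the
            // match to whole words.
            Pattern p = Pattern.compile(
                    "\\b" + Pattern.quote(e.getKey()) + "\\b",
                    Pattern.CASE_INSENSITIVE);
            result = p.matcher(result)
                      .replaceAll(Matcher.quoteReplacement(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] words = expand("That's a Good question").split("\\s+");
        System.out.println(String.join("|", words));
    }
}
```

Compiling a pattern per entry on every call is wasteful for a long list; if that matters, the patterns could be compiled once and cached.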
I'm not clear on what you need to do with the words once you have them, so I left that part out.