Your task description does not say how the input arrives (a Scanner or command-line arguments), so I cannot tell which one you should use. Either way, your code can be simplified dramatically.
Firstly, a HashSet is a collection that holds only unique elements, so no two equal strings can be stored in it. Therefore you should use:

    Set<String> uniquewords = new HashSet<String>();

and add each word to it.
Secondly, you have to remove everything from the input that isn't an alphabetic character, a space, a newline, or an apostrophe. The apostrophe has to stay because otherwise, in "When the farmer's wife called Bob back in, the other farmers went back to their work as well.", both "farmer's" and "farmers" would become the same string. This removal of characters can be achieved with a regex.
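    [^A-Za-z\s\n']

^ means NOT, so we match anything that is not one of the following:
A-Z and a-z are the uppercase and lowercase letters
\s is a whitespace character
\n is a newline character (already covered by \s, so listing it is redundant but harmless)
' is a literal apostrophe, kept for the reason given above
This can be used in the replaceAll() function on a string via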
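    // the backslashes in \n and \s are doubled inside a Java string literal
    replaceAll("[^A-Za-z\\n\\s']", "")

This removes everything else from the string. You can then call toLowerCase() so that "ThIs" becomes the same as "this", and split on whitespace with split("\\s+") to get each word. Prefer that over split(" "), which returns empty strings when two spaces are adjacent and does not split at newlines.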
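As a rough sketch of the clean, lowercase and split steps applied to the example sentence: the class name CleanAndSplitSketch and the helper cleanAndSplit are hypothetical names chosen for illustration, and the trim() call plus splitting on "\\s+" (runs of whitespace) are my own choices to avoid empty tokens.

```java
import java.util.Arrays;

class CleanAndSplitSketch {
    // Hypothetical helper: strips everything except letters, whitespace and
    // apostrophes, lowercases the result, and splits on runs of whitespace.
    static String[] cleanAndSplit(String line) {
        return line.replaceAll("[^A-Za-z\\s']", "")
                   .toLowerCase()
                   .trim()
                   .split("\\s+");
    }

    public static void main(String[] args) {
        String line = "When the farmer's wife called Bob back in, "
                + "the other farmers went back to their work as well.";
        // "farmer's" and "farmers" stay distinct because the apostrophe is kept
        System.out.println(Arrays.toString(cleanAndSplit(line)));
    }
}
```

Note that a completely blank line would still come back as a single empty string, so the caller may want to skip or remove empty tokens.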
A TreeSet can be used instead of a HashSet to store the remaining words while also keeping them in their "natural" ordering, which for lowercase strings is alphabetical.
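To illustrate the point about TreeSet, here is a minimal sketch. The class name TreeSetOrderSketch and the helper sortedUnique are made up for this example; only TreeSet itself comes from the answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TreeSetOrderSketch {
    // Hypothetical helper: copying into a TreeSet drops duplicates and
    // keeps the elements in their natural (here alphabetical) order.
    static Set<String> sortedUnique(List<String> words) {
        return new TreeSet<String>(words);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("wife", "bob", "the", "farmers", "the", "bob");
        System.out.println(sortedUnique(words)); // prints [bob, farmers, the, wife]
    }
}
```

A HashSet would give the same four words but with no guaranteed order, which is why the TreeSet saves a separate sorting step.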
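My function to handle both the uniqueness and the sorting of the words would be:

    // requires java.util.Arrays, List, Set and TreeSet
    public static Set<String> getUniqueWords(List<String> input) {
        Set<String> uniquewords = new TreeSet<String>(); // unique and sorted
        for (String s : input) {
            // keep letters, whitespace and apostrophes; lowercase; split on whitespace runs
            uniquewords.addAll(Arrays.asList(
                    s.replaceAll("[^A-Za-z\\s']", "").toLowerCase().split("\\s+")));
        }
        uniquewords.remove(""); // blank or indented lines produce an empty token
        return uniquewords;
    }

If you wish to use only simpler data types (primitives and ArrayList), then the following applies.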
For each word you will have to check whether the ArrayList already contains it, since an ArrayList allows duplicate values.
    for (String word : words) {
        if (!uniquewords.contains(word)) {
            uniquewords.add(word);
        }
    }

When you insert into an ArrayList, the "natural" ordering of the elements is not taken into account, so the list will not be in alphabetical order but in the order in which you added the data. Before you return, you will have to call Collections.sort(), which sorts the strings into alphabetical order.
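The difference between insertion order and sorted order can be seen in a small sketch like the one below. ListSortSketch and uniqueInInsertionOrder are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ListSortSketch {
    // Hypothetical helper: keeps the first occurrence of each word,
    // in the order the words were encountered.
    static List<String> uniqueInInsertionOrder(String[] words) {
        List<String> unique = new ArrayList<String>();
        for (String word : words) {
            if (!unique.contains(word)) {
                unique.add(word);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        String[] words = {"wife", "bob", "the", "wife", "farmers"};
        List<String> unique = uniqueInInsertionOrder(words);
        System.out.println(unique); // prints [wife, bob, the, farmers]
        Collections.sort(unique);   // sorts the list in place
        System.out.println(unique); // prints [bob, farmers, the, wife]
    }
}
```

Keep in mind that contains() scans the list from the start each time, so this approach is fine for small inputs but slower than a set for large ones.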
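My code using only primitive and ArrayList types is as follows (it is a little more spread out):

    // requires java.util.ArrayList, Collections and List
    public static List<String> getUniqueWords(List<String> input) {
        List<String> uniquewords = new ArrayList<String>();
        for (String s : input) {
            // keep letters, whitespace and apostrophes; lowercase; split on whitespace runs
            String[] words = s.replaceAll("[^A-Za-z\\s']", "").toLowerCase().split("\\s+");
            for (String word : words) {
                // skip empty tokens and words that were already added
                if (!word.isEmpty() && !uniquewords.contains(word)) {
                    uniquewords.add(word);
                }
            }
        }
        Collections.sort(uniquewords); // alphabetical order
        return uniquewords;
    }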