Suppose I have the following list of strings myStrList:

    myStrList = {"This is A 123a Test", "This is A 123ab Test", "This is A 123-a 456-B 7c-89 Test"};

I wish to create a function that takes a single string str and returns the string "words" that contain exactly one letter character.
I will define a "word" as a substring delimited by whitespace but containing no whitespace of its own. I can generate a list of string "words" by simply passing the string to the function StringSplit. For example:

    StringSplit["This is A 123a Test"]
    (* {"This", "is", "A", "123a", "Test"} *)

So the words comprising the string "This is A 123a Test" are "This", "is", "A", "123a", and "Test".
Now I wish to write a function myWordFunction that returns the words containing exactly one letter character. I can do this by using StringSplit to generate the list of words, and then selecting the words with exactly one letter character:

    myStrList = {"This is A 123a Test", "This is A 123ab Test", "This is A 123-a 456-B 7c-89 Test"};

    myWordFunction[str_String] := Module[{wordsList, substringsList},
      wordsList = StringSplit[str];
      substringsList = Select[wordsList, StringCount[#, _?LetterQ] == 1 &];
      Return[substringsList];
    ];

    myWordFunction[#] & /@ myStrList

This works, but it seems rather inelegant to first split the string into words and then do the (substring) analysis and selection. Is there a way to specify a "word" (whitespace-delimited substring) directly in a (string) pattern?
`StringCases["This is A 123-a 456-B 7c-89 Test", " " ~~ Shortest[a__] ~~ " " /; (StringCount[a, LetterCharacter] == 1) :> a] &` should work, but I'm still experimenting, because for some reason this is only finding "A" and "456-B" and not "123-a" and "7c-89" even though, for instance, `StringCount["123-a", LetterCharacter] == 1` yields `True`. For some strange reason, it seems to be missing the lower case letters when put inside the `StringCases` call. Stay tuned.

`StringCases[#, " " ~~ (a__ /; StringCount[a, LetterCharacter] == 1) ~~ " " :> a] &` doesn't work. The reason is that `StringCases` doesn't scan overlapping subsequences of characters in the string. Thus, once it has found `" A "`, it can't find `" 123-a "` because those two words share the whitespace character between them. I'm not sure there's a way around this using this code, and this makes me think it can't be done without doing something analogous to splitting the string at the whitespace.

`myWordFunction[str_String] := Select[StringSplit[str], StringCount[#, LetterCharacter] == 1 &]`. One avoids the unnecessary use of `Module`. (Also, by the way, `Return` is not used in Mathematica in the same way as in other languages. You don't need to `Return` `substringsList`. You could have just done `Module[ ... substringsList = Select[wordsList, StringCount[#, _? LetterQ] == 1 &]]`.)

`StringCases[#, " " ~~ (a__ /; StringCount[a, LetterCharacter] == 1) ~~ " " :> a, Overlaps -> All] &`
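The shared-whitespace problem described in the comments above might be sidestepped by testing the word boundaries with zero-width lookarounds, so that the pattern never consumes the whitespace between two words. The following is only a minimal sketch under stated assumptions: the name `oneLetterWords` is hypothetical, and the character class `[A-Za-z]` treats only ASCII letters as letters, which is narrower than `LetterCharacter`.

```mathematica
(* Hypothetical helper; assumes "letter" means an ASCII letter A-Z or a-z. *)
(* (?<!\S) : the match must start at the string start or right after whitespace *)
(* [^\sA-Za-z]* [A-Za-z] [^\sA-Za-z]* : exactly one letter among non-whitespace characters *)
(* (?!\S) : the match must end at the string end or right before whitespace *)
oneLetterWords[str_String] := StringCases[
  str,
  RegularExpression["(?<!\\S)[^\\sA-Za-z]*[A-Za-z][^\\sA-Za-z]*(?!\\S)"]
]

oneLetterWords["This is A 123-a 456-B 7c-89 Test"]
(* {"A", "123-a", "456-B", "7c-89"} *)
```

Because the lookbehind and lookahead match positions rather than characters, adjacent words no longer compete for the same space, and words at the very start or end of the string are also eligible. Whether this is more elegant than `Select[StringSplit[str], ...]` is a matter of taste.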