Splitting over a delimiter
posted 7 years ago
Hi,
I'm having a problem splitting the entries of a string array on a delimiter.
So my code is this:
is there just so I only get one sample from moby for testing purposes.
This is my output:
I'm not sure if the delimiter is the problem, but I think it is. I tried using \\ and \\\\, but neither worked. PartOfSpeechArray is what I'm trying to split.
It's dying at the line: String [] mobyLine = pos.split("\\");
Thanks all.
posted 7 years ago
That delimiter would suggest you are splitting on backslashes. Are there any backslashes in your input?
posted 7 years ago
Well, it would help if you told us what delimiter you are trying to split on. The backslash is the escape character both in Java string literals and in regex patterns, so two backslashes in a string literal reach the regex engine as a single backslash, which is not a valid regex.
You say four backslashes don't work, but that is a valid regex, one that represents a single backslash as the delimiter. So, back to the original question: what is the delimiter that you are trying to split on?
Henry
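To make the two levels of escaping concrete, here is a minimal sketch (not the original poster's code; the class and method names are made up for illustration). It splits an entry such as 3-D\AN on a single backslash, once with the four-backslash literal and once with Pattern.quote, which does the regex escaping for you.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class BackslashSplit {

    // The Java literal "\\\\" is the two-character string \\ ,
    // which the regex engine reads as one literal backslash.
    static String[] splitEscaped(String entry) {
        return entry.split("\\\\");
    }

    // Pattern.quote turns the one-character string \ into a regex
    // that matches it literally, so only the Java-level escape is needed.
    static String[] splitQuoted(String entry) {
        return entry.split(Pattern.quote("\\"));
    }

    public static void main(String[] args) {
        String entry = "3-D\\AN"; // the six characters 3-D\AN
        System.out.println(Arrays.toString(splitEscaped(entry))); // [3-D, AN]
        System.out.println(Arrays.toString(splitQuoted(entry)));  // [3-D, AN]
    }
}
```

By contrast, split("\\") hands the regex engine a lone backslash, which it rejects with a PatternSyntaxException.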
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
Right there in the output. I'm trying to split 3-D\AN
posted 7 years ago
Use four backslashes "\\\\".
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
Tried using \\\\
Output then is:
The code I'm referencing for that output is:
As you can see, partOfSpech and word both come back blank from the splitting operation
posted 7 years ago
Both of those variables hold zero-length strings (i.e., blank, as set at line 6 and line 7) because you set them to zero-length strings. The splitting operation has nothing to do with it.
Henry
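The listing being discussed is not reproduced in this thread, so the following is only a sketch of the point being made. It assumes two variables, here called word and partOfSpeechSymbol (names borrowed from the log output), that start out as empty strings and only get real values if they are assigned from the array that split returns.

```java
// Hypothetical sketch; the method and variable names are assumptions.
class AssignFromSplit {

    // Returns {word, partOfSpeechSymbol} for an entry such as Abyla\N.
    static String[] parse(String unit) {
        String word = "";               // placeholder only
        String partOfSpeechSymbol = ""; // placeholder only

        String[] mobyLine = unit.split("\\\\");
        if (mobyLine.length == 2) {
            // Without these two assignments both variables stay blank,
            // no matter how well the split itself worked.
            word = mobyLine[0];
            partOfSpeechSymbol = mobyLine[1];
        }
        return new String[] { word, partOfSpeechSymbol };
    }

    public static void main(String[] args) {
        String[] parsed = parse("Abyla\\N");
        System.out.println("new word: " + parsed[0]);           // new word: Abyla
        System.out.println("partOfSpeechSymbol: " + parsed[1]); // partOfSpeechSymbol: N
    }
}
```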
posted 7 years ago
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
Your "symbol" will consist of one or more characters denoting various parts of speech in probability order. Your word "3-D" is most likely to be an adjective ('A') but may possibly be a noun ('N').
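As a rough illustration of reading the symbol in probability order, here is a small sketch. The class and method names are hypothetical, and only the three codes mentioned in this thread (N, A, V) are mapped; the full table lives in the Moby README and is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; only the codes named in this thread are mapped.
class PosSymbol {

    static String describe(char code) {
        switch (code) {
            case 'N': return "noun";
            case 'A': return "adjective";
            case 'V': return "verb";
            default:  return "unknown(" + code + ")";
        }
    }

    // Keeps the order of the symbol, so index 0 is the most likely part of speech.
    static List<String> ranked(String symbol) {
        List<String> result = new ArrayList<>();
        for (char code : symbol.toCharArray()) {
            result.add(describe(code));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ranked("AN")); // [adjective, noun]
    }
}
```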
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
Ok. Got another one.
Code:
Output:
mobyLine is only showing up as a single element, "a", despite the fact that the next entry after "Abyla" in the moby dictionary is abysmally\v
It should be:
[SensoryCore] pos: abysmally\v
[SensoryCore] mobyLine: [abysmally, v]
[SesnroyCore] partOfSpeechSymbol: v
[SensoryCore] new word: abysmally
[SensoryCore] unit : abysmally
[SensoryCore] encoding size : 0
posted 7 years ago
How are you populating "unit" when you call the method?
Here's a code posting tip: You can change what line number your code listing starts with. The default is '1'.
vs
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
I'm populating unit using this function that reads from a text file and gets a word-part of speech pair.
So, for example, unit could be:
Abyla\N
This is the code that reads from the text file:
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
There is a single "unit" for each line. One and only one unit.
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
For example, this is a sample from the Moby text file I'm reading:
posted 7 years ago
Congratulations! You've discovered a Scanner bug!
I wrote a minimalist program to read mobypos.txt using Scanner, and it stopped at the same character for me as it did for you. I thought a non-printable character might have snuck into the txt file, so I retyped a few lines at the point of failure, but that didn't fix it. I changed the code to use a BufferedReader and that worked without an issue.
EDIT: Even Files.readAllLines() failed.
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
Ok. New problem. It isn't recognizing the nouns and verbs. In fact, it's only recognizing "Alice" and "Blue". (Nice catch on the Scanner bug, btw.)
So what it should do is, if it finds the tkn variable (the token word whose part of speech is being checked), output, for nouns anyway,
Or in layman's terms, if the noun is 'Car': [Sensory Core] Encoding noun: Car noun
What it is doing is outputting two nouns and then quitting.
posted 7 years ago
Well, let's call it a "feature". The file does have some characters outside of the default character set. If you set the character set as noted in my comments below all three approaches work.
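The listing with those comments is not reproduced in this thread, so here is a hedged sketch of what passing an explicit character set to all three approaches could look like. The class name is made up, and ISO-8859-1 is purely an assumption: the thread never says which encoding the file uses, and it is chosen here only because it can decode any byte without error.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; the charset below is an assumption, not a known fact.
class ReadWithCharset {

    static final Charset CHARSET = StandardCharsets.ISO_8859_1;

    static List<String> withScanner(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (Scanner in = new Scanner(file, CHARSET.name())) {
            while (in.hasNextLine()) {
                lines.add(in.nextLine());
            }
        }
        return lines;
    }

    static List<String> withBufferedReader(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(file, CHARSET)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    static List<String> withReadAllLines(Path file) throws IOException {
        return Files.readAllLines(file, CHARSET);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]); // e.g. the path to mobypos.txt
        System.out.println("Scanner:        " + withScanner(file).size() + " lines");
        System.out.println("BufferedReader: " + withBufferedReader(file).size() + " lines");
        System.out.println("readAllLines:   " + withReadAllLines(file).size() + " lines");
    }
}
```

If the three counts agree, the early stop was a decoding problem rather than a problem with the data. Note that Scanner does not throw when the underlying read fails; it just reports no more input, and the swallowed exception is available from its ioException() method.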
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
Yes?>[SensoryCore] Reading Line: Alice followed the white rabbit. She followed the white rabbit down the tunnel. Blue green purple red.
[SensoryCore] Sentence: Alice followed the white rabbit
[SensoryCore] Sentence: She followed the white rabbit down the tunnel
[SensoryCore] Sentence: Blue green purple red
[SensoryCore] Append Sentence: Alice followed the white rabbit.
[SensoryCore] Append Sentence: She followed the white rabbit down the tunnel.
[SensoryCore] Append Sentence: Blue green purple red.
[SensoryCore] Processing tokens (sentences): Alice followed the white rabbit
[SensoryCore] Array Contents[Alice, followed, the, white, rabbit]
[Sensory Core] Encoding noun: Alice noun
[SensoryCore] Processing tokens (sentences): She followed the white rabbit down the tunnel
[SensoryCore] Array Contents[, She, followed, the, white, rabbit, down, the, tunnel]
[SensoryCore] Processing tokens (sentences): Blue green purple red
[SensoryCore] Array Contents[, Blue, green, purple, red]
[Sensory Core] Encoding noun: Blue noun
[SensoryCore] Encoding Complete
[SesoryCore] Encoding Size: 2
[SensoryCore] Encoded String: Alice noun
[SensoryCore] Encoded String: Blue noun
[Sensory Core] Outputting following message: Alice noun Blue noun
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
Should pick up on all the nouns, right?
posted 7 years ago
I think you want this instead. The symbol may contain multiple letters, but the first letter is the most likely POS.
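The suggested snippet itself is missing from this thread. A minimal sketch of a "first letter is the most likely POS" check, with made-up names and assuming 'N' marks a noun as stated earlier, might look like this:

```java
// Hypothetical helper; not the code that was originally posted.
class PrimaryPos {

    // True when the most likely part of speech (the first code) is a noun.
    static boolean isPrimarilyNoun(String symbol) {
        return !symbol.isEmpty() && symbol.charAt(0) == 'N';
    }

    public static void main(String[] args) {
        System.out.println(isPrimarilyNoun("NAV")); // true
        System.out.println(isPrimarilyNoun("AN"));  // false
    }
}
```

Comparing only the first character means a multi-letter symbol such as NAV still counts as a noun, which an exact comparison against "N" would miss.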
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
I changed
(uncommented new token)
and now I'm getting long, long lists of items from the reading list.
[SensoryCore] new token: Alice
[SensoryCore] new token: Alice
[SensoryCore] new token: Alice
[SensoryCore] new token: Alice
[SensoryCore] new token: Alice
[SensoryCore] new token: Alice
[SensoryCore] new token: Alice
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
Carey, thanks for the help so far. You are going in the credits of the paper for this. :-)
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
So quick note. If I take out that comment I get the following output:
Yes?>[SensoryCore] Reading Line: Alice followed the white rabbit. She followed the white rabbit down the tunnel. Blue green purple red.
[SensoryCore] Sentence: Alice followed the white rabbit
[SensoryCore] Sentence: She followed the white rabbit down the tunnel
[SensoryCore] Sentence: Blue green purple red
[SensoryCore] Append Sentence: Alice followed the white rabbit.
[SensoryCore] Append Sentence: She followed the white rabbit down the tunnel.
[SensoryCore] Append Sentence: Blue green purple red.
[SensoryCore] Processing tokens (sentences): Alice followed the white rabbit
[SensoryCore] Array Contents[Alice, followed, the, white, rabbit]
[Sensory Core] Encoding noun: Alice noun
[Sensory Core] Encoding noun: rabbit noun
[SensoryCore] Processing tokens (sentences): She followed the white rabbit down the tunnel
[SensoryCore] Array Contents[, She, followed, the, white, rabbit, down, the, tunnel]
[Sensory Core] Encoding noun: rabbit noun
[Sensory Core] Encoding noun: tunnel noun
[SensoryCore] Processing tokens (sentences): Blue green purple red
[SensoryCore] Array Contents[, Blue, green, purple, red]
[Sensory Core] Encoding noun: Blue noun
[Sensory Core] Encoding noun: green noun
[Sensory Core] Encoding noun: purple noun
[Sensory Core] Encoding noun: red noun
[SensoryCore] Encoding Complete
[SesoryCore] Encoding Size: 8
[SensoryCore] Encoded String: Alice noun
[SensoryCore] Encoded String: rabbit noun
[SensoryCore] Encoded String: rabbit noun
[SensoryCore] Encoded String: tunnel noun
[SensoryCore] Encoded String: Blue noun
[SensoryCore] Encoded String: green noun
[SensoryCore] Encoded String: purple noun
[SensoryCore] Encoded String: red noun
[Sensory Core] Outputting following message: Alice noun rabbit noun rabbit noun tunnel noun Blue noun green noun purple noun red noun
As you can see, it's duplicating.
posted 7 years ago
The colors, for example, have POS "NAV" (noun, adjective, verb). It could be any one of those. If you want to see if there's any possibility that a word is a noun you could use
Ditto for any other POS characters. This is where the parsing gets ugly because if you have three possible POS, then which one is it?
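The snippet for the "any possibility" check did not survive in this thread. One way to sketch it, using a hypothetical helper name, is to test whether the code appears anywhere in the symbol rather than only in first position:

```java
// Hypothetical helper; the name mayBe is made up for illustration.
class PossiblePos {

    // True if the given code appears anywhere in the symbol,
    // i.e. the word can be that part of speech at all.
    static boolean mayBe(String symbol, char code) {
        return symbol.indexOf(code) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(mayBe("NAV", 'N')); // true
        System.out.println(mayBe("AN", 'N'));  // true, though adjective is more likely
        System.out.println(mayBe("A", 'N'));   // false
    }
}
```

This answers "could it be a noun?" but not "is it a noun here?", which is the ambiguity described above.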
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
Problem solved. Do you know where I can find a list of what the part-of-speech abbreviations in Moby mean?
posted 7 years ago
Carey Brown wrote:
Your "symbol" will consist of one or more characters denoting various parts of speech in probability order. Your word "3-D" is most likely to be an adjective ('A') but may possibly be a noun ('N').
posted 7 years ago
"He went golfing on the green"
"your bank balance is in the red"
Ted Gress
Ranch Hand
Posts: 229
posted 7 years ago
Where did you get that list?
posted 7 years ago
Ted Gress wrote:
Where did you get that list?
A README.txt file that came along in my download.