8

I want to read a local txt file and get the text in it. After that, I want to split the whole text into Strings, as in the example below.

Example: let's say the file contains:

    abcdef
    ghijkl

    aededd
    ededed

    ededfe
    efefeef
    efefeff
    ......
    ......

I want to split this text into Strings:

    s1 = abcdef+"\n"+ghijkl;
    s2 = aededd+"\n"+ededed;
    s3 = ededfe+"\n"+efefeef+"\n"+efefeff;
    ........................

I mean I want to split the text on empty lines.

I do know how to read a file. I want help splitting the text into strings.

2
  • Are you sure that reading the whole file and then splitting such a long string is a good idea? I'd suggest reading line by line instead, appending each line to a StringBuilder (joined with \n) until you read a blank line. Commented Apr 8, 2012 at 20:01
  • string.split("\n\r") Find this answer here Commented Jan 20, 2017 at 11:42

6 Answers 6

10

You can split a string into an array with

String.split(); 

If you want to split on blank lines (two consecutive newlines), it will be

String.split("\\n\\n"); 
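As a minimal sketch of the call above, assuming the file content is already in a String and uses plain "\n" line endings (the class and method names here are made up for illustration):

```java
import java.util.Arrays;
import java.util.List;

public class BlankLineSplitDemo {
    // Splits on exactly one blank line, i.e. two consecutive "\n" characters.
    // Assumption: Unix line endings; "\r\n" files would need a different pattern.
    static List<String> splitOnBlankLines(String text) {
        return Arrays.asList(text.split("\\n\\n"));
    }

    public static void main(String[] args) {
        String text = "abcdef\nghijkl\n\naededd\nededed\n\nededfe\nefefeef\nefefeff";
        for (String block : splitOnBlankLines(text)) {
            System.out.println("[" + block + "]");
        }
    }
}
```

Note that with this pattern, three or more newlines in a row leave stray newline characters or empty entries in the result, so it fits best when blocks are separated by exactly one blank line.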

UPDATE*

If I understand what you are saying, john, then your code will essentially be

    BufferedReader in = new BufferedReader(new FileReader("foo.txt"));
    List<String> allStrings = new ArrayList<String>();
    String str = "";
    while (true) {
        String tmp = in.readLine();
        if (tmp.isEmpty()) {
            if (!str.isEmpty()) {
                allStrings.add(str);
            }
            str = "";
        } else if (tmp == null) {
            break;
        } else {
            if (str.isEmpty()) {
                str = tmp;
            } else {
                str += "\\n" + tmp;
            }
        }
    }

Might be what you are trying to parse.

Where allStrings is a list of all of your strings.


7 Comments

I think the pattern should be \\n\\n for double lines.
Not on every newline; I want to split on empty lines, because each String can span any number of lines.
Please see the updated question; a String can be any number of lines.
This answer will work, since it only splits when it finds two newlines (e.g. a blank line).
@johnsmith I have updated my answer, but please be aware this is not tested
7

I would suggest a more general regexp:

text.split("(?m)^\\s*$"); 

This works with any end-of-line convention, and it treats empty lines and whitespace-only lines the same way.
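A small sketch of how this pattern might be used. The pieces returned by split can still carry the line terminators around the blank line, and runs of several blank lines can yield empty pieces, so this sketch assumes it is acceptable to trim each piece and drop empty ones. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

public class BlankLineRegexDemo {
    // Splits on empty or whitespace-only lines using the multiline pattern,
    // then trims each piece and skips pieces that end up empty.
    // Caveat: trim() also removes leading indentation of a block's first line.
    static List<String> splitOnBlankLines(String text) {
        List<String> blocks = new ArrayList<>();
        for (String piece : text.split("(?m)^\\s*$")) {
            String block = piece.trim();
            if (!block.isEmpty()) {
                blocks.add(block);
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Mixed line endings, a whitespace-only line, and two blank lines in a row.
        String text = "abcdef\r\nghijkl\r\n \t\r\naededd\nededed\n\n\nededfe";
        for (String block : splitOnBlankLines(text)) {
            System.out.println("[" + block + "]");
        }
    }
}
```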

Comments

6

The code below works even if there are more than two empty lines between useful data.

    import java.util.regex.*;

    // read your file and store it in a string named str_file_data
    Pattern p = Pattern.compile("\\n[\\n]+");
    /* if your text file has \r\n as the newline character then use
       Pattern p = Pattern.compile("\\r\\n[\\r\\n]+"); */
    String[] result = p.split(str_file_data);

(I did not test the code so there could be typos.)

Comments

4

It may depend on how the file is encoded, so I would likely do the following:

String.split("(\\n\\r|\\n|\\r){2}"); 

Some text files encode newlines as "\n\r" while others may be simply "\n". Two new lines in a row means you have an empty line.

1 Comment

This one falsely splits on \r\n and therefore doesn't work.
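The false split mentioned in the comment above can be checked with a short sketch. The second pattern is not from this thread; it is an assumed alternative that uses an atomic group `(?>...)` so that "\r\n" is consumed as a single newline and cannot be re-read as "\r" followed by "\n". The class and constant names are illustrative.

```java
public class NewlinePairDemo {
    // Pattern from the answer: backtracking lets a single "\r\n" count as two newlines.
    static final String NAIVE = "(\\n\\r|\\n|\\r){2}";
    // Assumed alternative: the atomic group keeps "\r\n" together as one unit,
    // and {2,} also absorbs runs of several blank lines.
    static final String ATOMIC = "(?>\\r\\n|\\r|\\n){2,}";

    static String[] split(String text, String regex) {
        return text.split(regex);
    }

    public static void main(String[] args) {
        String oneWindowsNewline = "a\r\nb";      // no blank line here
        String windowsBlankLine = "a\r\n\r\nb";   // one blank line
        System.out.println(split(oneWindowsNewline, NAIVE).length);
        System.out.println(split(oneWindowsNewline, ATOMIC).length);
        System.out.println(split(windowsBlankLine, ATOMIC).length);
    }
}
```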
3

Godwin was on the right track, but I think we can make this work a bit better. Using '[ ]' in a regex is an "or", so in his example, if you had a \r\n, that would just be a new line, not an empty line. The regular expression would split it on both the \r and the \n, and I believe in the example we were looking for an empty line, which would require either a \n\r\n\r, a \r\n\r\n, a \n\r\r\n, a \r\n\n\r, a \n\n, or a \r\r.

So first we want to look for either \n\r or \r\n twice, with any combination of the two being possible.

String.split("((\\n\\r)|(\\r\\n)){2}"); 

Next we need to look for \r without a \n after it

String.split("\\r{2}"); 

Lastly, let's do the same for \n

String.split("\\n{2}"); 

And all together that should be

String.split("((\\n\\r)|(\\r\\n)){2}|(\\r){2}|(\\n){2}");

Note, this works only for the very specific case of newlines and carriage returns. In Ruby you can do the following, which would cover more cases. I don't know if there is an equivalent in Java.

.match($^$) 

Comments

0

@Kevin's code works fine, and since he mentioned that the code was not tested, here are the 3 changes required:

1. The check for (tmp==null) should come first; otherwise there will be a NullPointerException.

2. This code leaves out the last set of lines when adding to the ArrayList. To make sure the last one gets added, we have to include this code after the while loop: if(!str.isEmpty()) { allStrings.add(str); }

3. The line str += "\\n" + tmp; should be changed to use \n instead of \\n. The entire corrected code is below, in case it helps:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.List;

    BufferedReader in = new BufferedReader(new FileReader("foo.txt"));
    List<String> allStrings = new ArrayList<String>();
    String str = "";
    while (true) {
        String tmp = in.readLine();
        if (tmp == null) {
            break; // end of file
        } else if (tmp.isEmpty()) {
            // empty line: close the current group, if any
            if (!str.isEmpty()) {
                allStrings.add(str);
            }
            str = "";
        } else {
            if (str.isEmpty()) {
                str = tmp;
            } else {
                str += "\n" + tmp;
            }
        }
    }
    // add the last group, which is not followed by an empty line
    if (!str.isEmpty()) {
        allStrings.add(str);
    }
    in.close();
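For reference, the same loop could be wrapped in a reusable method along these lines. This is only a sketch: it uses a StringBuilder, as suggested in the comments on the question, and takes any Reader so it can be tried without a file. The names `ParagraphReader`, `readBlocks`, and `splitBlocks` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ParagraphReader {
    // Reads lines and groups them into blocks separated by one or more empty lines.
    // Lines inside a block are joined with "\n", whatever the file's line endings were.
    static List<String> readBlocks(Reader source) throws IOException {
        List<String> blocks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) {
                    // Empty line: finish the current block, if there is one.
                    if (current.length() > 0) {
                        blocks.add(current.toString());
                        current.setLength(0);
                    }
                } else {
                    if (current.length() > 0) {
                        current.append('\n');
                    }
                    current.append(line);
                }
            }
        }
        // The last block is usually not followed by an empty line.
        if (current.length() > 0) {
            blocks.add(current.toString());
        }
        return blocks;
    }

    // Convenience wrapper for text that is already in memory.
    static List<String> splitBlocks(String text) {
        try {
            return readBlocks(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String sample = "abcdef\nghijkl\n\naededd\nededed\n\n\nededfe\r\nefefeef\n";
        System.out.println(splitBlocks(sample));
    }
}
```

For a real file, `readBlocks(new FileReader("foo.txt"))` would be the equivalent call. Like the code above, this treats only truly empty lines as separators; whitespace-only lines would need `line.trim().isEmpty()` instead.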
