I am trying to read a text file with C# that is formatted like this:

this is a line\r\n this is a line\r \r\n this is a line\r \r\n this is a line\r \r\n this is a line\r\n this is a line\r \r\n etc... 

I am reading each line from the file with

StreamReader.ReadLine() 

but that does not preserve the newline characters. I need to detect which newline characters each line ends with, because I am counting the number of bytes on each line. For example:

If the line ends with \r, it consists of (nr-of-bytes-in-line + 1) bytes (depending on the encoding, of course); if it ends with \r\n, it consists of (nr-of-bytes-in-line + 2) bytes.
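To make the arithmetic concrete, here is a minimal sketch of the byte count for one line once its terminator is known. The class and method names (`LineBytes`, `CountWithTerminator`) are hypothetical and only serve as an illustration; `Encoding.GetByteCount` is the real .NET call doing the work.

```csharp
using System;
using System.Text;

static class LineBytes
{
    // Hypothetical helper: bytes of the line content plus bytes of its terminator,
    // both measured in the given encoding.
    public static int CountWithTerminator(string content, string terminator, Encoding encoding)
    {
        return encoding.GetByteCount(content) + encoding.GetByteCount(terminator);
    }
}
```

With `Encoding.UTF8`, the 14-character line "this is a line" followed by "\r\n" gives 14 + 2 = 16 bytes; with `Encoding.Unicode` (UTF-16) every one of these characters takes two bytes, so the same line gives 28 + 4 = 32 bytes. This is why the "+ 1" and "+ 2" only hold for single-byte encodings of \r and \n.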

EDIT:

I have a solution, based on israel altar's answer (Jon Skeet suggested the same thing). I have implemented an overridden version of ReadLine that includes the newline characters in the returned string. This is the code of the overridden method:

    public override String ReadLine()
    {
        StringBuilder sb = new StringBuilder();
        while (true)
        {
            int ch = Read();
            if (ch == -1)
            {
                break;
            }
            if (ch == '\r' || ch == '\n')
            {
                // Keep the terminator in the returned string.
                sb.Append((char)ch);
                if (ch == '\r' && Peek() == '\n')
                {
                    // "\r\n" counts as a single terminator: consume the '\n' too.
                    sb.Append((char)Read());
                }
                // Any terminator ends the line, including a lone '\r' or '\n'.
                break;
            }
            sb.Append((char)ch);
        }
        if (sb.Length > 0)
        {
            return sb.ToString();
        }
        return null;
    }
  • I believe you'll basically have to reimplement ReadLine() yourself in that case then. Commented Apr 12, 2016 at 11:54
  • Do use ReadLine. Read one character at a time if you need to byte count. Commented Apr 12, 2016 at 11:54
  • No, I do it like this: string line = sr.ReadLine(); int nrOfBytes = Encoding.GetByteCount(line); But I need to detect which newline characters there are, either \r or \r\n, so that I could do: nrOfBytes += Encoding.GetByteCount(UNKNOWN-NEW-LINE-CHAR); Commented Apr 12, 2016 at 11:56
  • This will help to implement it: referencesource.microsoft.com/#mscorlib/system/io/… Commented Apr 12, 2016 at 11:57
  • Use Stream, not StreamReader, because you need to deal with bytes. All TextReaders, including StreamReader, help you process lines at the expense of making it impossible for you to access the raw bytes separating them. Commented Apr 12, 2016 at 11:57
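As a rough illustration of the Stream-based suggestion in the last comment, the sketch below counts the bytes of each line, terminator included, directly from the raw bytes. The names `RawLineCounter` and `CountLineBytes` are hypothetical, and the code assumes an encoding in which \r and \n are the single bytes 0x0D and 0x0A (ASCII or UTF-8, for example); it would not be correct for UTF-16.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class RawLineCounter
{
    // Hypothetical helper: byte length of every line, terminator included.
    // Assumes CR and LF are single bytes in the file's encoding (e.g. ASCII, UTF-8).
    public static List<int> CountLineBytes(Stream stream)
    {
        var counts = new List<int>();
        int current = 0;
        int b = stream.ReadByte();
        while (b != -1)
        {
            current++;
            int next = stream.ReadByte();
            if (b == '\r' && next == '\n')
            {
                // "\r\n" is one terminator: count the LF and move past it.
                current++;
                next = stream.ReadByte();
                counts.Add(current);
                current = 0;
            }
            else if (b == '\r' || b == '\n')
            {
                // A lone CR or LF also ends the line.
                counts.Add(current);
                current = 0;
            }
            b = next;
        }
        if (current > 0)
        {
            counts.Add(current); // last line without a terminator
        }
        return counts;
    }
}
```

Reading one byte at a time from a `FileStream` is slow; wrapping it in a `BufferedStream` is one way to keep this approach practical for large files.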

1 Answer

This is how ReadLine is implemented, according to the .NET reference source:

    // Reads a line. A line is defined as a sequence of characters followed by
    // a carriage return ('\r'), a line feed ('\n'), or a carriage return
    // immediately followed by a line feed. The resulting string does not
    // contain the terminating carriage return and/or line feed. The returned
    // value is null if the end of the input stream has been reached.
    //
    public virtual String ReadLine()
    {
        StringBuilder sb = new StringBuilder();
        while (true)
        {
            int ch = Read();
            if (ch == -1) break;
            if (ch == '\r' || ch == '\n')
            {
                if (ch == '\r' && Peek() == '\n') Read();
                return sb.ToString();
            }
            sb.Append((char)ch);
        }
        if (sb.Length > 0) return sb.ToString();
        return null;
    }

As you can see, you can add an if statement like this:

    if (ch == '\r')
    {
        // add the amount of bytes wanted
    }
    if (ch == '\n')
    {
        // add the amount of bytes wanted
    }

Or do whatever manipulation you want.
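One possible way to apply this idea, sketched under the assumption that you want the terminator reported separately instead of counted inline: a standalone variant of the loop above that returns the line and hands back the terminator through an out parameter. `LineReader` and `ReadLineWithTerminator` are hypothetical names, not part of .NET.

```csharp
using System;
using System.IO;
using System.Text;

static class LineReader
{
    // Hypothetical helper, not a .NET API: reads one line and reports which
    // terminator ended it ("\r", "\n", "\r\n", or "" at end of input).
    public static string ReadLineWithTerminator(TextReader reader, out string terminator)
    {
        terminator = "";
        var sb = new StringBuilder();
        while (true)
        {
            int ch = reader.Read();
            if (ch == -1) break;
            if (ch == '\r' || ch == '\n')
            {
                if (ch == '\r' && reader.Peek() == '\n')
                {
                    reader.Read(); // consume the '\n' of "\r\n"
                    terminator = "\r\n";
                }
                else
                {
                    terminator = ch == '\r' ? "\r" : "\n";
                }
                return sb.ToString();
            }
            sb.Append((char)ch);
        }
        // End of input: return the trailing text if any, otherwise null.
        return sb.Length > 0 ? sb.ToString() : null;
    }
}
```

The byte count of a line is then `encoding.GetByteCount(line) + encoding.GetByteCount(terminator)`, which stays correct for encodings where \r and \n take more than one byte.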

3 Comments

Or just change it to append the \r and \n to the StringBuilder.
I am going to try this. I will have to implement my own (overridden) version of ReadLine.
I have implemented an overridden version of ReadLine in a custom class, and I think it works. I am still testing the best way to get the byte count of a line, but this is the solution I was looking for.
