Basheer Algohi

How to efficiently read data from any part of a huge file

I have a data file (over 2.5 GB) and I want to do calculations on the data.

At first I used:

data = ReadList["file", Record, n]

I still have three problems:

1- According to the documentation, ReadList["file", types, n] reads only the first n objects of the specified types, so reading always starts from the beginning of the file. How can I start reading from an arbitrary position inside the file? (If I set n to the index of the last record, I end up reading the whole file, which I cannot do.)

2- If there is a way to read from any part of the file and, say, I want to read the last 1000 objects, how can I find n (the index of the last object in the file)?

3- The data (after reading) is in this form:

 {" 0.00000E+000\t-9.15527E-004\t", " 2.50000E-006\t-0.0015258789\t", " 5.00000E-006\t-0.0018310547\t", " 7.50000E-006\t-0.0015258789\t"}

I want to create two columns of the data. I used this way:

Interpreter["Number"][ReadList[StringToStream[#], Word] & /@ data] 

As can be seen, I used ReadList twice (once for the initial read and once to convert the data). Is there a more efficient way to do this?
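To make the three points concrete, here is a rough, untested sketch of the kind of thing I am after, not a known solution. It assumes every line holds exactly two tab-separated numbers, and that no line is longer than 64 bytes (my guess from the sample above). readFrom, bytesPerRow and tailBytes are made-up names, not built-ins; "file" is the same placeholder as before. The idea is to open a stream, jump to a byte offset with SetStreamPosition, drop the partial line found there, and let ReadList parse {Number, Number} pairs directly so that no second conversion pass is needed.

```mathematica
(* Hypothetical helper: read up to n two-column rows starting near byte offset pos. *)
readFrom[file_String, pos_Integer, n_Integer] :=
 Module[{stream = OpenRead[file], rows},
  SetStreamPosition[stream, pos];
  (* pos usually lands in the middle of a line, so skip the rest of that line *)
  If[pos > 0, Skip[stream, Record]];
  (* {Number, Number} turns each line into a numeric pair while reading *)
  rows = ReadList[stream, {Number, Number}, n];
  Close[stream];
  rows]

(* Last 1000 rows: seek relative to the file size instead of counting records. *)
bytesPerRow = 64;                      (* assumed upper bound on line length *)
tailBytes = 1000*bytesPerRow;
start = Max[0, FileByteCount["file"] - tailBytes];
rows = readFrom["file", start, tailBytes];   (* ReadList stops at end of file *)
last1000 = Take[rows, -Min[1000, Length[rows]]];
```

With this approach the record count n of the whole file is never needed, only the file size in bytes. Note that if pos happens to fall exactly at the start of a line, that whole line is skipped.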

Thank you.