Linked Questions
35 questions linked to/from How to efficiently parse fixed width files?
1 vote
3 answers
3k views
Python library for parsing space delimited files [duplicate]
Possible Duplicate: Efficient way of parsing fixed width files in Python Not even sure if "space delimited" is really the right term here (which is probably what is hindering my search efforts). ...
1 vote
1 answer
1k views
how to convert fix length file to csv file in python? [duplicate]
In the file, each column has a fixed size. Is there any library or function we can use to convert it easily to a CSV file? There are four columns in the file. 1st - 1233212Q1AQYHDVCS1221 2nd ...
1 vote
1 answer
61 views
Python regex to find connected digits [duplicate]
I have raw txt files and need to use a regex to find each digit separated by spaces. The data format is like: 6 3 1 0 7 3 1 0 8 35002 0 9 34104 0 My regex is: (?P<...
15 votes
8 answers
49k views
Convert a space delimited file to comma separated values file in python
I am very new to Python. I know that this has already been asked, and I apologise, but the difference in this new situation is that spaces between strings are not equal. I have a file, named coord, ...
15 votes
7 answers
5k views
Scraping large pdf tables which span across multiple pages
I am trying to scrape PDF tables which span across multiple pages. I tried many things but the best seems to be pdftotext -layout as advised here. The problem is that the resultant text file is not ...
5 votes
6 answers
33k views
Python convert table to dictionary
Problem: A command generates a table that makes it very hard to script around. Solution: Convert table into a Python dictionary for much more efficient use. These tables can have 1 - 20 differing ...
2 votes
5 answers
5k views
data read from a text file into two lists in python
My text file format is: apple very healthy orange tangy and juicy banana yellow in color and yummy I need to create either two lists: l1 = ['apple','orange','banana'] l2=['very healthy',...
6 votes
1 answer
3k views
Excel-like text import in Python: automatically parsing fixed width columns
In Excel, if you import whitespace delineated text in which the columns do not line up perfectly and data may be missing, like pH pKa/Em n(slope) 1000*chi2 vdw0 CYS-I0014_ ...
1 vote
4 answers
3k views
How do I transpose/pivot a csv file with python *without* loading the whole file into memory?
For one of my data analysis pipelines, I end up generating a lot of individual CSV files. I would like to transpose them, concatenate them, and transpose them again. However, the amount of data is ...
2 votes
3 answers
2k views
Parsing a command output - Python
I'm running a utility that parses the output of the df command. I capture the output and send it to my parser. Here's a sample: Filesystem 512-blocks Used Available Capacity ...
-1 votes
2 answers
729 views
Convert list of strings to Pandas Dataframe
I have a list that looks like this: Sum = ['* Report_type Leach\n', '* Result_text Concentration \n', '* Run_Id 179\n', '* Location ...
1 vote
1 answer
1k views
Difficulty utilizing Python struct.Struct.unpack_from with different format strings
First time poster, long-time lurker. Have searched high and low for an answer to this but it's got to that stage...! I am having some trouble implementing the answer given by John Machin to this past ...
1 vote
3 answers
556 views
Parse flat-file (positional text-file) to read the wavelength
I have the following txt file with data: FI R 83.0000m 34.960 1.1262 Fe 2 1.32055m 33.626 0.0522 N 2 5754.61A 33.290 0.0241 TI R 1800.00m 33.092 0.0153 ...
1 vote
1 answer
378 views
Splitlines in Python a table with empty spaces
Through a Linux command (lsof) I get a series of data in a table: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME init 1 root cwd unknown ...
1 vote
1 answer
1k views
Dealing with non-ASCII characters when parsing fixed-width txt file
So I have a series of huge files (several GB) of tabular data. They are txt and each column is defined by a fixed width. This width is indicated by a number of dashes right below the headers. So far ...