
I have a text file that has numerous lines. I want to extract certain lines and write them to a CSV file. However, I want to write particular lines to the same row in the CSV file. For example, my text file is like this:

    Name= Sarah F
    Location= Baltimore MD
    Name= Bob M
    Location= Sacramento CA
    Name= Tom M NY
    Location= Brooklyn NY
    Name= Anne F
    Location= Morristown NJ

The CSV file I want to generate should contain each person's name, their sex, and the city and state they reside in, all on one row:

    Sarah,F,Baltimore,MD
    Bob,M,Sacramento,CA
    Tom,M,Brooklyn,NY
    Anne,F,Morristown,NJ

When I use csv.writerows([list]), the name and sex are written to one row and the city and state to the next:

    Sarah,F
    Baltimore,MD
    Bob,M
    Sacramento,CA
    Tom,M
    Brooklyn,NY
    Anne,F
    Morristown,NJ

When I try to append the city and state to the [name, sex] list, they overwrite the original list instead of being appended to it.

Here is my code to do this:

    import csv

    file = open("file_to_use.txt", 'r')
    csv_file = open("file_to_write.csv", 'wb')
    writer = csv.writer(csv_file)
    Row_lines =[]

    for line in file:
        if line.startswith("Name="):
            name_line = line.replace(" ", ",")
            name_line = name_line.strip("\n")
            Row_lines.append(name_line)
        if line.startswith("Location="):
            loc_line = line.replace(" ", ",")
            loc_line = loc_line.strip("\n")
            Row_lines.append(loc_line)
        writer.writerows(Row_lines)

    csv_file.close()

I know some of my logic is in the wrong place, but I can't figure out where.

1
  • Are Name= and Location= lines always alternating in the input file? Are there any other lines in the input file or only those two types? Commented Feb 28, 2016 at 16:15

4 Answers

2

There are two parts to your task. The first is joining the rows; you can use zip for that:

    with open(inputfile) as propsfile:
        data = [row.split("=")[1].split() for row in propsfile]

    # join two at a time
    data_tuples = zip(data[::2], data[1::2])

The second is writing the rows; you can use the csv module for that:

    import csv

    with open(outputfile, 'w') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerows([name+location for name, location in data_tuples])

Now we have the data in outputfile:

    Sarah,F,Baltimore,MD
    Bob,M,Sacramento,CA
    ...
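The two steps above can be combined into one small, self-contained sketch. The names `pair_rows` and `sample` are hypothetical, and two details are assumptions that go beyond this answer: lines without an `=` are skipped, and only the first two tokens after the `=` are kept, so that the stray `NY` in `Name= Tom M NY` does not add a fifth column. An in-memory buffer stands in for the output file.

```python
import csv
import io


def pair_rows(lines):
    """Pair each Name= line with the Location= line that follows it."""
    # Assumption: skip lines without "=", keep only the first two tokens.
    data = [line.split("=", 1)[1].split()[:2] for line in lines if "=" in line]
    # zip the even-indexed entries (names) with the odd-indexed ones (locations).
    return [name + location for name, location in zip(data[::2], data[1::2])]


# Hypothetical sample input, taken from the question's example lines.
sample = [
    "Name= Sarah F\n",
    "Location= Baltimore MD\n",
    "Name= Tom M NY\n",
    "Location= Brooklyn NY\n",
]
rows = pair_rows(sample)

# Write to an in-memory buffer instead of a real file for illustration.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
print(buffer.getvalue())
```

This approach relies on the Name= and Location= lines strictly alternating; if a line is missing, every later pair is shifted.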

1

You are adding two separate entries to Row_lines that together represent a single CSV row. You should add only one entry to Row_lines for each CSV row.
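A minimal illustration of that point, assuming each entry is a list of fields: `writerows` emits one CSV row per entry, so two entries per person produce two rows, while one merged entry produces one. The helper `to_csv_text` and both variable names are hypothetical and exist only for this sketch.

```python
import csv
import io


def to_csv_text(rows):
    """Render rows as CSV text using an in-memory buffer."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Two entries for one person: each entry becomes its own CSV row.
split_entries = [["Sarah", "F"], ["Baltimore", "MD"]]
# One entry for one person: name and location fields share a row.
merged_entries = [["Sarah", "F"] + ["Baltimore", "MD"]]

split_text = to_csv_text(split_entries)
merged_text = to_csv_text(merged_entries)
print(split_text)
print(merged_text)
```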


1

Each time you call Row_lines.append(), you are adding a new item to the list. Each item in the list is written as a separate line when you call writer.writerows(Row_lines).

Each time you encounter a name line, you should create a new string from that line, but don't add it to the Row_lines list yet. Each time you encounter a location line, you should append it to the name line string, creating a complete row which you can now add to the Row_lines list.

And instead of calling writerows() on each iteration of the loop, you should call it once after you have compiled the full list of rows.

    import csv

    file = open("file_to_use.txt", 'r')
    csv_file = open("file_to_write.csv", 'wb')
    writer = csv.writer(csv_file)
    Row_lines =[]

    for line in file:
        if line.startswith("Name="):
            name_line = line.replace(" ", ",")
            name_line = name_line.strip("\n")
            # start building the new line
            current_line = name_line
        if line.startswith("Location="):
            loc_line = line.replace(" ", ",")
            loc_line = loc_line.strip("\n")
            # append the extra fields to the current line
            current_line = current_line + ',' + loc_line
            # add the current line to the output list
            Row_lines.append(current_line)

    # call this after you have added
    # all lines, not after each one
    writer.writerows(Row_lines)
    csv_file.close()
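One caveat about the code above: `csv.writer.writerows` expects each row to be a sequence of fields, and a plain string is treated as a sequence of single characters. A variation on the same idea, sketched here under the assumption that each row should be a list of fields with the `Name=` and `Location=` prefixes dropped, might look like this. The function name `build_rows` and the two-token limit are illustrative choices, not part of the original answer.

```python
import csv
import io


def build_rows(lines):
    """Collect one list of fields per Name=/Location= pair."""
    rows = []
    current = None  # fields from the most recent Name= line, if any
    for line in lines:
        if line.startswith("Name="):
            # Start a new row; keep only the first two tokens (assumption).
            current = line[len("Name="):].split()[:2]
        elif line.startswith("Location=") and current is not None:
            # Complete the row and add it to the output list.
            rows.append(current + line[len("Location="):].split()[:2])
            current = None
    return rows


# Hypothetical sample input, taken from the question's example lines.
sample = [
    "Name= Bob M\n",
    "Location= Sacramento CA\n",
    "Name= Tom M NY\n",
    "Location= Brooklyn NY\n",
]
rows = build_rows(sample)

# Call writerows once, after all rows have been collected.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
print(buffer.getvalue())
```

On Python 3, a real output file would normally be opened with `open(path, "w", newline="")` rather than `'wb'`, since the csv module writes text there.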


0

Here is some code that does not use any external libraries.

Since your lines are not all consistent (e.g. "Name= Tom M NY", where NY should probably not be there), this code looks at the first two data entries following "Name=" or "Location=" and ignores any subsequent entries (like "NY" in the example above).

    # Opening the file to use
    input_file = open(r"C:\Temp\file_to_use.txt", 'r')

    # Creating an empty CSV file
    output_file = open(r"C:\Temp\output.csv", 'w')

    # Going through the text file, it is checking whether the line holds name or location information
    # If it holds location information, all saved information so far is saved to the CSV file
    for line in input_file:
        line = line.split("=")
        if line[0] == "Name":
            first_name, last_name = line[1].strip().split()[:2]
        elif line[0] == "Location":
            city, state = line[1].strip().split()[:2]
            output_file.write('%s,%s,%s,%s\n' % (first_name, last_name, city, state))

    # Closes the opened files
    input_file.close()
    output_file.close()
