Infile is a genealogy:

    holla 1755
    ronaj 1781
    asdflæj 1803
    axle 1823
    einar 1855
    baelj 1881
    æljlas 1903
    jobbi 1923
    gurri 1955
    kolli 1981
    Rounaj 2004

I want to print every generation time (the difference between consecutive years) from the infile and, at the end, the average. I think my issue here is that line2 goes out of range when the infile ends:

    def main():
        infile = open('infile.txt', 'r')
        line = infile.readline()
        tmpstr = line.split('\t')
        age=[]
        while line !='':
            line2 = infile.readline()
            tmpstr2 = line2.split('\t')
            age.append(int(tmpstr2[1]) - int(tmpstr[1]))
            print age
            tmpstr = tmpstr2
        infile.close()
        print sum(age)*1./len(age)
    main()

So I decided to read all the information into a list, but tmpstr doesn't change value here:

    def main():
        infile = open('infile.txt', 'r')
        line = infile.readline()
        age=[]
        while line !='':
            tmpstr = line.split('\t')
            age.append(tmpstr[1])
            print age
        infile.close()
        print sum(age)*1./len(age)
    main()

How come? What's wrong with these two scripts? Why am I writing main() two times? Any ideas how these two can be solved?

Thanks all, this is how it ended up:

    def main():
        with open('infile.txt', 'r') as input:
            ages = []
            for line in input:
                data = line.split()
                age = int(data[1])
                ages.append(age)
        gentime = []
        for i in xrange(len(ages)-1):
            print ages[i+1] - ages[i]
            gentime.append(ages[i+1] - ages[i])
        print 'average gentime is', sum(gentime)*1./len(gentime)
    main()
  • I suggest this go on Code Review instead. Commented Jan 14, 2012 at 23:20
  • @Martin: codereview.SE is not for broken code. Commented Jan 14, 2012 at 23:23
  • @NiklasBaumstark: okay, good point. I did not think of it like that before. Commented Jan 14, 2012 at 23:26

5 Answers

1

Try this:

    def main():
        with open('infile.txt', 'r') as input:
            ages, n = 0, 0
            for line in input:
                age = int(line.split()[1])
                ages += age
                n += 1
                print age
        print 'average:', float(ages) / n

Some comments:

  • You don't need a list to accumulate the numbers; a couple of local variables are enough
  • In this case it's a good idea to use split() without arguments, so that the input is processed correctly whether the name and the number that follows it are separated by spaces or tabs
  • It's also a good idea to open the file with the with statement, which makes sure that it gets closed afterwards
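
The snippet above averages the years themselves. If the goal is the average generation time, the same no-list idea could be sketched by remembering the previous year. This is only a sketch under assumptions: the helper name average_generation_time is hypothetical, print is written with parentheses so it also runs on Python 3, and the sample lines are the last three entries of the question's data.

```python
def average_generation_time(lines):
    """Print each gap between consecutive years and return the average gap."""
    previous = None       # year from the previous line (None before the first)
    total, count = 0, 0   # running sum and number of gaps
    for line in lines:
        fields = line.split()          # splits on spaces or tabs
        if len(fields) < 2:
            continue                   # skip blank or malformed lines
        year = int(fields[1])
        if previous is not None:
            gap = year - previous
            print(gap)
            total += gap
            count += 1
        previous = year
    if count == 0:
        raise ValueError("need at least two years to compute a gap")
    return float(total) / count


# Hypothetical in-memory stand-in for the lines of infile.txt
sample = ["gurri\t1955\n", "kolli\t1981\n", "Rounaj\t2004\n"]
average = average_generation_time(sample)
print(average)
```

An open file object can be passed in place of sample, since iterating over a file also yields one line at a time.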

With respect to the final part of your question, "Why am I writing main() two times?" that's because the first time you're defining the main function and the second time you're calling it.
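
The difference between defining and calling can be seen in a tiny sketch. The return value "main ran" is purely illustrative and not part of the original scripts.

```python
def main():
    # The def statement only creates the function and binds it to the
    # name `main`; the body does not run at this point.
    return "main ran"


# The call is what actually executes the body.
result = main()
print(result)
```

Leaving out the final call would produce no output at all, because the function would be defined but never run.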

1 Comment

Maybe you rather want float(ages) / n
1

You can iterate over the entire contents of the file using this statement:

    for line in infile:
        # Perform the rest of your steps here

You wouldn't want to use a while loop unless you kept some sort of counter to step through index positions (e.g. you read everything with infile.readlines() and wanted to walk that list with a while loop).
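
The two styles can be compared side by side in a short sketch. It assumes an in-memory io.StringIO object as a stand-in for the real file, and the variable names years_for and years_while are made up for the illustration.

```python
import io

# Hypothetical stand-in for open('infile.txt'), using the question's last entries
infile = io.StringIO(u"gurri\t1955\nkolli\t1981\nRounaj\t2004\n")

# Style 1: iterate over the file object directly
years_for = []
for line in infile:
    years_for.append(int(line.split()[1]))

# Style 2: read every line into a list, then step through it with a counter
infile.seek(0)                 # rewind so the lines can be read again
lines = infile.readlines()
years_while = []
i = 0
while i < len(lines):
    years_while.append(int(lines[i].split()[1]))
    i += 1                     # without this increment the loop never ends

print(years_for)
print(years_while)
```

Both loops collect the same years; the for loop simply has no counter to forget.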

1

In the second instance, your code only reads a single line from the file.

Something simpler, like:

    age = []
    with open('data.txt', 'rt') as f:
        for line in f:
            vals = line.split('\t')
            age.append(int(vals[1]))
    print sum(age) / float(len(age))

generates

1878.54545455 

1

You can try something like this:

    if __name__ == "__main__":
        file = open("infile.txt", "r")
        lines = file.readlines()
        gens = [int(line.split('\t')[1]) for line in lines]
        avg = sum(gens) / float(len(gens))

The first line is the usual entry-point guard for a Python script: the block below it runs only when the file is executed directly, not when it is imported. It plays a role similar to C's "int main()".

Next, it's probably easiest to set up for a list comprehension if you read all lines from the file into a list.

The 4th line iterates through the file lines splitting them at the tab and only retrieving the 2nd item (at index 1) from the newly split list.
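
The comprehension yields the years. The generation times the question asks for could then be derived by pairing each year with its successor, for example with zip. This is a sketch only: the in-memory lines list stands in for file.readlines(), and gaps and avg_gap are hypothetical names.

```python
# Hypothetical stand-in for file.readlines(), using the question's last entries
lines = ["gurri\t1955\n", "kolli\t1981\n", "Rounaj\t2004\n"]

# Second tab-separated field of every line, converted to an integer year
gens = [int(line.split('\t')[1]) for line in lines]

# Pair each year with the next one and take the difference
gaps = [later - earlier for earlier, later in zip(gens, gens[1:])]

# float() keeps the division exact on Python 2 as well
avg_gap = sum(gaps) / float(len(gaps))

print(gaps)
print(avg_gap)
```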

1

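The problem with both of these scripts is that your while loop is infinite: line is never reassigned inside the loop, so the condition line != '' will never become false unless the file itself is empty. (In the first script the loop stops anyway with an IndexError once readline() returns '' at the end of the file, because ''.split('\t') has no element at index 1.)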

You could fix this, but it's better to use the Python idiom:

    lastyear = None
    ages = []
    for line in infile:
        _name, year = line.split('\t')
        year = int(year)
        if lastyear:
            ages.append(year - lastyear)
        lastyear = year
    print float(sum(ages))/len(ages)
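
For comparison, the original while loop could be repaired by calling readline() again at the end of every iteration, so that the condition eventually sees the empty string. The sketch below assumes an in-memory io.StringIO object in place of the real file and uses Python 3 compatible print calls.

```python
import io

# Hypothetical stand-in for open('infile.txt', 'r')
infile = io.StringIO(u"gurri\t1955\nkolli\t1981\nRounaj\t2004\n")

years = []
line = infile.readline()
while line != '':
    years.append(int(line.split('\t')[1]))
    line = infile.readline()   # advance; readline() returns '' at end of file
infile.close()

# Generation times are the differences between neighbouring years
gaps = [years[i + 1] - years[i] for i in range(len(years) - 1)]
print(gaps)
print(sum(gaps) * 1. / len(gaps))
```

The only structural change from the question's second script is the readline() call inside the loop body.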

2 Comments

Why is underscore the first character in _name?
That's just a hint to the reader that _name is needed for tuple unpacking but will not be used. See this answer.
