
I am a newbie and would like to extract dates from a text file and write them to another file, one date per line, but I can't figure out how. I tried appending, but it didn't work, and this way only the last date gets written:

f = open("Krupp.txt", "r")
contents = f.read()
f.close()  # close the file
# finditer
# finds all Dates and shows them in a List (Montag, 15. März 2013)
for m in re.finditer("(Montag|Dienstag|Mittwoch|Donnerstag|Freitag|Samstag|Sonnabend|Sonntag)(, )([123][0-9]|[1-9])(. )(Januar|Februar|März|April|Mai|Juni|Juli|August|September|Oktober|November|Dezember)( )([0-2][0-9][0-9][0-9])", contents):
    print m.group(0)
    # changed
    with open("testoutput.txt", "a") as myfile:
        myfile.write(m.group(0))

---EDIT--- I changed

f.write(contents)  # writes contents correctly to file with Umlauts
f.write(m.group(0))

to

with open("testoutput.txt", "a") as myfile:
    myfile.write(m.group(0))

Now it writes all dates to the file, but it writes them directly one after another on the same line. What do I have to add if I want each date on its own line?

Can anybody help?

best regards

  • Please provide more details. Show a sample of what the file looks like. Are you getting anything from your regex match? Furthermore, you overwrite the file on every iteration of the loop, because you open it in 'write' mode each time. Open the file once, outside the loop, and then write. Commented Mar 19, 2017 at 15:59
  • You are simply overwriting your file in every iteration; maybe open("testoutput.txt", "a") is what you're looking for. Furthermore, opening and writing to the file on each iteration is very slow. Collect the matches in a string and write it once afterwards. Commented Mar 19, 2017 at 15:59

2 Answers


What do I have to add if I want each date on its own line?

I guess you mean a line feed:

myfile.write("\n")
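A minimal sketch of the effect, written for Python 3 (the temporary output path and the two sample dates are placeholders, not taken from the question). `write` does not add a line terminator on its own, so it has to be written explicitly after each match:

```python
import os
import tempfile

# Hypothetical sample matches standing in for the m.group(0) values.
dates = ["Montag, 10. März 2013", "Freitag, 15. März 2013"]

# Placeholder output path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "testoutput.txt")

with open(path, "a", encoding="utf-8") as myfile:
    for date in dates:
        myfile.write(date)
        myfile.write("\n")  # without this line the dates run together

# Read the file back to check that each date landed on its own line.
with open(path, encoding="utf-8") as check:
    lines = check.read().splitlines()
```

Writing `date + "\n"` in a single call should behave the same way; two calls are used here only to mirror the snippet above.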


1 Comment

Thank you. I often struggle to find answers because I lack the vocabulary.

The following works for me on Python 2.7.6:

#!/bin/python
# -*- coding: utf-8 -*-
import re
f = open("Krupp.txt", "r")
contents = f.read()
f.close()  # close the file
# finditer
# finds all Dates and shows them in a List (Montag, 15. März 2013)
with open("testoutput.txt", "a+") as f:
    for m in re.finditer("(Montag|Dienstag|Mittwoch|Donnerstag|Freitag|Samstag|Sonnabend|Sonntag)(, )([123][0-9]|[1-9])(. )(Januar|Februar|März|April|Mai|Juni|Juli|August|September|Oktober|November|Dezember)( )([0-2][0-9][0-9][0-9])", contents):
        print m.group(0)
        f.write(m.group(0))
        f.write("\n")

The data file I used to test is:

Montag, 10. März 2013
Montag, 15. Juni 2013
Freitag, 15. März 2013
Montag, 15. Januar 2013
Dienstag, 15. März 2013
Montag, 15. März 2013
Juli, 15. Februar - incomplete
Juli, 15. Februar 2013
asdasdasdasdasd;lasdjkfas;dlfjk;a fjasl;dfj ;akdfj;askjdfa
Mittwoch, 15. März 2013
test Mittwoch, 15. Januar 2013 blah
Montag, 15. März 2013

Code explanation/changes:

  1. I had to add # -*- coding: utf-8 -*- so that Python 2 accepts the non-ASCII (UTF-8) characters in the source file.
  2. open("testoutput.txt", "a+") opens the file for appending and reading. Plain "a" is enough if you only write to it.
  3. You were re-opening the file on every loop iteration, which is not recommended. I moved the open before the loop.
  4. The with open(...) statement automatically closes the file when the with block finishes. It is generally safer, since it also closes the file when an exception or error occurs.
  5. f.write("\n") answers your edit: it adds a newline after each entry.
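The points above can be sketched for Python 3 as follows. This is only an illustration under a few assumptions: the files are taken to be UTF-8 encoded, and the names `DATE_PATTERN`, `extract_dates`, and `write_dates` are made up for this example. The pattern is the one from the question, except that the dot after the day is escaped (an unescaped `.` matches any character) and the groups are non-capturing.

```python
import re

# Pattern from the question, with the dot escaped so that it only
# matches a literal "." after the day number.
DATE_PATTERN = re.compile(
    r"(?:Montag|Dienstag|Mittwoch|Donnerstag|Freitag|Samstag|Sonnabend|Sonntag)"
    r", (?:[123][0-9]|[1-9])\. "
    r"(?:Januar|Februar|März|April|Mai|Juni|Juli|August|September"
    r"|Oktober|November|Dezember)"
    r" [0-2][0-9]{3}"
)


def extract_dates(text):
    """Return every complete date found in text, in order of appearance."""
    return [m.group(0) for m in DATE_PATTERN.finditer(text)]


def write_dates(text, out_path):
    """Append each date found in text to out_path, one date per line."""
    dates = extract_dates(text)
    # The file is opened once, before the loop, and closed by the with block.
    with open(out_path, "a", encoding="utf-8") as out:
        for date in dates:
            out.write(date + "\n")
    return dates


# Small sample in the spirit of the test data above.
sample = "Montag, 10. März 2013 Juli, 15. Februar 2013 blah Mittwoch, 15. Januar 2013"
found = extract_dates(sample)
```

For the files in the question, one would read Krupp.txt with `open("Krupp.txt", encoding="utf-8")`, pass its contents to `write_dates` together with "testoutput.txt", and adjust the encoding if the input is not UTF-8.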

Let me know if you have more questions or need more explanation.

1 Comment

Thank you very much :) I will probably need a lot of help for further steps
