I'm scraping data from a website and I would like to know when the data was last updated. The site does not show an absolute date, only a relative reference such as "Updated on Monday 21:00" or "Update 1 day ago". Can anyone help me get a timestamp from these strings? Thanks
- What input do you have, and what output do you expect? – NPatel, Dec 1, 2015 at 10:46
- @Lafada My input is strings like "Updated on Monday 21:00" or "Updated 1 day ago", and I would like to transform them into something like 23/11/2015 21:00. – Lupanoide, Dec 1, 2015 at 10:51
- Consider checking github.com/bear/parsedatetime – Tom Ron, Dec 1, 2015 at 10:56
- @TomRon This library is perfect for my job. Thanks a lot, TomRon! – Lupanoide, Dec 1, 2015 at 11:20
1 Answer
You could use the parsedatetime module, as @Tom Ron suggested:
    #!/usr/bin/env python
    import parsedatetime as ptd

    ptc = ptd.Constants()
    ptc.YearParseStyle = 0  # avoid future year
    ptc.DOWParseStyle = 0   # how weekday is parsed
    cal = ptd.Calendar(ptc)
    for human_time in ["Updated on Monday 21:00", "1 day ago"]:
        print(cal.parseDT(human_time)[0])

Output

    2015-11-30 21:00:00
    2015-11-30 20:49:03
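If adding a dependency is not an option, the two patterns from the question can also be handled with the standard library alone. The following is only a minimal sketch, not a replacement for parsedatetime: the names `parse_relative`, `AGO_RE`, `DOW_RE` and `WEEKDAYS` are hypothetical, it assumes English weekday names, and it assumes a bare weekday always refers to the most recent past occurrence. It also formats the result in the day/month/year style asked for in the comments.

```python
import re
from datetime import datetime, timedelta

# Hypothetical helper names; only the two patterns from the question are covered.
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

# Matches e.g. "1 day ago", "3 hours ago"
AGO_RE = re.compile(r"(\d+)\s+(minute|hour|day|week)s?\s+ago", re.IGNORECASE)
# Matches e.g. "Monday 21:00"
DOW_RE = re.compile(r"\b(" + "|".join(WEEKDAYS) + r")\s+(\d{1,2}):(\d{2})",
                    re.IGNORECASE)


def parse_relative(text, now):
    """Return a datetime for a relative string, or None if nothing matches."""
    m = AGO_RE.search(text)
    if m:
        amount, unit = int(m.group(1)), m.group(2).lower()
        # timedelta accepts minutes=, hours=, days= and weeks=
        return now - timedelta(**{unit + "s": amount})

    m = DOW_RE.search(text)
    if m:
        target = WEEKDAYS.index(m.group(1).lower())
        days_back = (now.weekday() - target) % 7
        candidate = (now - timedelta(days=days_back)).replace(
            hour=int(m.group(2)), minute=int(m.group(3)),
            second=0, microsecond=0)
        # Assumption: an update time can never lie in the future.
        if candidate > now:
            candidate -= timedelta(days=7)
        return candidate

    return None


# Fixed reference time (a Tuesday) so the result is reproducible.
now = datetime(2015, 12, 1, 10, 51)
for s in ["Updated on Monday 21:00", "Updated 1 day ago"]:
    print(parse_relative(s, now).strftime("%d/%m/%Y %H:%M"))
# 30/11/2015 21:00
# 30/11/2015 10:51
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, lets you use the time the page was scraped as the reference point, which matters if the strings are parsed later than they were collected.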