
I am trying to learn python-watchdog, but I am confused about why the job I set up runs more than once. Here is my setup:

    #handler.py
    import os
    from watchdog.events import FileSystemEventHandler
    from actions import run_something

    def getext(filename):
        return os.path.splitext(filename)[-1].lower()

    class ChangeHandler(FileSystemEventHandler):

        def on_any_event(self, event):
            if event.is_directory:
                return
            if getext(event.src_path) == '.done':
                run_something()
            else:
                print "event not directory.. exiting..."
                pass

the observer is set up like so:

    #observer.py
    import os
    import time
    from watchdog.observers import Observer
    from handler import ChangeHandler

    BASEDIR = "/path/to/some/directory/bin"

    def main():
        while 1:
            event_handler = ChangeHandler()
            observer = Observer()
            observer.schedule(event_handler, BASEDIR, recursive=True)
            observer.start()
            try:
                while True:
                    time.sleep(1)
            except KeyboardInterrupt:
                observer.stop()
            observer.join()

    if __name__ == '__main__':
        main()

and finally, the actions like so:

    #actions.py
    import os
    import subprocess

    def run_something():
        output = subprocess.check_output(['./run.sh'])
        print output
        return None

where ./run.sh is just a shell script I would like to run when a file with the extension .done is found in /path/to/some/directory/bin:

    #run.sh
    #!/bin/bash
    echo "Job Start: $(date)"
    rm -rf /path/to/some/directory/bin/job.done # remove the .done file
    echo "Job Done: $(date)"

However, when I run python observer.py and then do a touch job.done in /path/to/some/directory/bin, I see that my shell script ./run.sh runs three times instead of once.

I am confused about why this runs three times and not just once (I do delete the job.done file in my bash script).

  • `while 1` and `while True` are bad code, and I'm guessing that in your code they aren't even necessary, since `join` calls usually block until a condition is met. Your code will block indefinitely at some point, creating a zombie process which wastes system resources. Commented Dec 24, 2014 at 18:12
  • Actually, I took the code straight out of a tutorial: ginstrom.com/scribbles/2012/05/10/… I have now deleted both `while 1` and `while True` from the code. Thanks again for the tip. Commented Dec 24, 2014 at 19:31

2 Answers


To debug watchdog scripts, it is useful to print what watchdog is seeing as events. One file edit or CLI command, such as touch, can result in multiple watchdog events. For example, if you insert a print statement:

    class ChangeHandler(FileSystemEventHandler):

        def on_any_event(self, event):
            print(event)

to log every event, running

% touch job.done 

generates

    2014-12-24 13:11:02 - <FileCreatedEvent: src_path='/home/unutbu/tmp/job.done'>
    2014-12-24 13:11:02 - <DirModifiedEvent: src_path='/home/unutbu/tmp'>
    2014-12-24 13:11:02 - <FileModifiedEvent: src_path='/home/unutbu/tmp/job.done'>

Above there were two events with src_path ending in job.done. Thus,

    if getext(event.src_path) == '.done':
        run_something()

runs twice because there is a FileCreatedEvent and a FileModifiedEvent. You might be better off only monitoring FileModifiedEvents.
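As a minimal sketch of reacting to only one event type: the function below accepts an event only if it is a file event of the wanted type with a .done extension. It assumes watchdog events expose `event_type`, `src_path`, and `is_directory`. `FakeEvent` and `is_done_trigger` are hypothetical names used only for illustration, so the sketch runs without watchdog installed. The fourth event is an assumption: the `rm` in run.sh would presumably produce a deletion event for job.done, which `on_any_event` would also see, and that may account for the third run reported in the question.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for a watchdog event (illustration only).
# Real watchdog events are assumed to expose the same three attributes.
FakeEvent = namedtuple("FakeEvent", "event_type src_path is_directory")


def is_done_trigger(event, wanted_type="created"):
    """Return True only for file events of `wanted_type` whose path ends in .done."""
    if event.is_directory:
        return False
    if event.event_type != wanted_type:
        return False
    return os.path.splitext(event.src_path)[-1].lower() == ".done"


# The three events from the log above, plus an assumed deletion event
# caused by the `rm` in run.sh.
events = [
    FakeEvent("created", "/home/unutbu/tmp/job.done", False),
    FakeEvent("modified", "/home/unutbu/tmp", True),
    FakeEvent("modified", "/home/unutbu/tmp/job.done", False),
    FakeEvent("deleted", "/home/unutbu/tmp/job.done", False),
]

# Only the creation event passes the filter.
triggers = [e.event_type for e in events if is_done_trigger(e)]
print(triggers)
```

In a real handler, the same check could sit at the top of `on_any_event`, or the work could be moved into `on_created` or `on_modified` so that watchdog does the type filtering itself.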


6 Comments

That's simply awesome, thanks a lot for this. I now use on_created to monitor only newly created files (since I delete the job.done file in my bash script anyway). Works as expected! Accepted your answer :)
Do you think I am better off looking for FileCreatedEvent rather than FileModifiedEvent?
Well, I don't think it matters in this case. I suggested FileModifiedEvent only because it happens last, so you know whatever has been written to the file has been written. In this case, it sounds like you are not reading the file, so it does not matter.
Guys, I'm trying to create a simple sync tool, which will send these CRUD events to a sync handler. But apparently, on any CRUD operation except update, multiple events are fired, and I want to take only the first event. E.g., in the case of a creation, there is a created event, then a modified event for the directory in which the file is created, then a modified event for the newly created file. How do I get just the created event and work with that?
@bad_keypoints: I think you would need to keep track of the timestamp of the last event on a per-file basis and write custom logic (based on comparison of timestamps) to decide if an event needs to be handled. I don't think there is a canned solution for this.
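The per-file timestamp idea from the last comment could be sketched roughly as follows. This is one possible interpretation, not a watchdog feature: `PathDebouncer`, `should_handle`, and the `window` parameter are hypothetical names, and the one-second window is an arbitrary assumption. An injectable clock is used so the behaviour can be shown without sleeping.

```python
import time


class PathDebouncer:
    """Accept an event for a path only if at least `window` seconds have
    passed since the previous event seen for that same path."""

    def __init__(self, window=1.0, clock=time.monotonic):
        self.window = window
        self.clock = clock
        self._last_seen = {}  # path -> timestamp of the last event seen

    def should_handle(self, path):
        now = self.clock()
        last = self._last_seen.get(path)
        # Record every event, so a burst keeps pushing the window forward.
        self._last_seen[path] = now
        return last is None or (now - last) >= self.window


# Demonstration with a fake clock instead of real time.
fake_times = iter([0.0, 0.01, 0.02, 5.0])
debouncer = PathDebouncer(window=1.0, clock=lambda: next(fake_times))

results = [
    debouncer.should_handle("/tmp/a.txt"),  # first event for a.txt
    debouncer.should_handle("/tmp/a.txt"),  # follow-up 10 ms later
    debouncer.should_handle("/tmp/b.txt"),  # first event for another file
    debouncer.should_handle("/tmp/a.txt"),  # a.txt again, seconds later
]
print(results)
```

A handler would call `should_handle(event.src_path)` before doing any work. Directory events arrive with a different path, so they would still need the separate `is_directory` check.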

I made a fix for watchdog:

    import time

    import watchdog.events
    import watchdog.observers

    osb = None  # path of the last modified file that was handled


    class Handler(watchdog.events.PatternMatchingEventHandler):

        def on_any_event(self, event):
            # Reset the last-seen path on every event. watchdog calls
            # on_any_event before the type-specific handler (e.g. on_modified).
            global osb
            osb = None
            print(f"Watchdog received {event} event - {event.src_path}.")

        def on_modified(self, event):
            global osb
            if osb != event.src_path:
                # Code goes here
                osb = event.src_path


    if __name__ == "__main__":
        src_path = "C:\\Users\\Administrator\\Desktop\\"
        event_handler = Handler()
        observer = watchdog.observers.Observer()
        observer.schedule(event_handler, path=src_path, recursive=True)
        observer.start()
        try:
            while True:
                time.sleep(1)  # sleep(0) would spin the CPU
        except KeyboardInterrupt:
            observer.stop()
        observer.join()

No, I will not explain, sorry (I forgot how it works and don't have time to find out).
