
I am using Python Watchdog to monitor a folder on Ubuntu. It works fine with 1 or 2 files, but when I moved 50 files with the command mv *.xml dest_folder, it received only 2 events and processed only 2 files. Below is the code.

    def on_moved(self, event):
        try:
            logger.debug("on_moved event :" + str(event))
            self._validate_xml(event.dest_path)
        except Exception as ex:
            logger.exception(ex)

If I comment out the _validate_xml call, then I receive all 45 events.

Can anyone tell me what exactly is happening in Watchdog, and what the best solution for this is?

  • Sorry, even with the _validate_xml call commented out, it still receives only 2 events. Commented Oct 18, 2012 at 7:45

1 Answer


I haven't used Python Watchdog, but from a generic real-time systems perspective:

  • Processing XML with _validate_xml can be slow and can make you miss events.
  • An event is similar to an interrupt, so handling it should be as fast as possible.

The more you do while handling an event, the less "real-time" your system becomes. What you can do is offload the XML validity check to another process and pass it the paths you have seen moving through a Queue (each message would be event.dest_path). Your event handling then becomes as simple as putting messages on a queue, and the files can be processed in batches by the consumer of the queue.

In short:

  • Instantiate a Queue.
  • fork() a process.
  • In the on_moved handler, put messages on the queue.
  • In the forked process, pop messages from the queue and call _validate_xml.
  • You may optionally leverage multiprocessing.Pool to validate XML files in parallel.
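
The steps above can be sketched roughly as follows, using only the standard library. This is an illustration under assumptions, not something tested against watchdog itself: is_well_formed is a hypothetical stand-in for the asker's _validate_xml, QueueingHandler and start_consumer are made-up names, and in real use the handler would subclass watchdog's FileSystemEventHandler.

```python
import multiprocessing
import xml.etree.ElementTree as ET


def is_well_formed(path):
    """Hypothetical stand-in for _validate_xml: only checks that the file parses."""
    try:
        ET.parse(path)
        return True
    except (ET.ParseError, OSError):
        return False


def worker(queue, results, validate=is_well_formed):
    """Pop paths until a None sentinel arrives and record each validation result."""
    while True:
        path = queue.get()
        if path is None:
            break
        results.put((path, validate(path)))


class QueueingHandler:
    """Assumption: in real use this would subclass FileSystemEventHandler."""

    def __init__(self, queue):
        self.queue = queue

    def on_moved(self, event):
        # Keep the handler cheap: hand the path over and return immediately.
        self.queue.put(event.dest_path)


def start_consumer(queue, results):
    """Run worker() in a separate process so slow validation cannot stall events."""
    process = multiprocessing.Process(target=worker, args=(queue, results))
    process.start()
    return process
```

One way to wire this up would be to create two multiprocessing.Queue objects, call start_consumer(queue, results) before starting the observer, schedule QueueingHandler(queue) on the observer, and put None on the queue at shutdown so the worker exits.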

Good luck.

EDIT: I tested this out on my system; most of the comments above seem not to apply, because watchdog's code seems to handle threading just fine.

    #!/usr/bin/env python
    import time
    from watchdog.observers import Observer, api
    from watchdog.events import LoggingEventHandler, FileSystemEventHandler, FileMovedEvent
    import logging


    def counter_gen():
        # Endless 1, 2, 3, ... counter used to number the events each handler sees.
        count = 0
        while True:
            count += 1
            yield count


    class XmlValidatorHandler(FileSystemEventHandler):
        sleep_time = 0.1          # simulated validation time, in seconds
        COUNTER = counter_gen()

        def on_moved(self, event):
            if isinstance(event, FileMovedEvent):
                # print() and next() in this form work on both Python 2 and 3.
                print('%s - event %d; validate: %s' % (
                    type(self).__name__, next(self.COUNTER), event.dest_path))
                time.sleep(self.sleep_time)


    class SlowXmlValidatorHandler(XmlValidatorHandler):
        sleep_time = 2            # deliberately slow handler
        COUNTER = counter_gen()


    def get_observer(handler):
        observer = Observer(timeout=0.5)
        observer.event_queue.maxsize = 10
        observer.schedule(handler, path='.', recursive=True)
        return observer


    if __name__ == "__main__":
        logging.basicConfig(level=logging.INFO)
        event_handler = LoggingEventHandler()
        observer1 = get_observer(XmlValidatorHandler())
        observer2 = get_observer(SlowXmlValidatorHandler())
        observer1.start()
        observer2.start()
        try:
            while True:
                time.sleep(1)
        except KeyboardInterrupt:
            observer1.stop()
            observer2.stop()
        observer1.join()
        observer2.join()

I wasn't able to reproduce your issue. Some pointers:

  • Check the queue's maxsize: if you already have items in there and they don't get handled in a timely fashion, my guess is that the timeout kicks in and the event is lost. You may want to increase the size in that case.
  • Check the timeout: if it is configured, you may want to tune that parameter.
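
To make the guess above concrete, here is a small standard-library sketch of how a bounded queue combined with a put timeout can drop items when nothing drains it. It uses queue.Queue only and does not show watchdog's internals; whether watchdog's event_queue behaves this way is an assumption, and offer_all is a hypothetical helper.

```python
import queue


def offer_all(items, maxsize, timeout):
    """Try to enqueue every item into a bounded queue that nobody is consuming.

    Returns (number_accepted, dropped_items). Items are dropped when put()
    gives up after `timeout` seconds because the queue is still full.
    """
    bounded = queue.Queue(maxsize=maxsize)
    dropped = []
    for item in items:
        try:
            bounded.put(item, timeout=timeout)
        except queue.Full:
            dropped.append(item)
    return bounded.qsize(), dropped
```

With a larger maxsize, or with a consumer that drains the queue faster than the timeout, the dropped list would stay empty, which is the tuning the two pointers suggest.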

Maybe a more complete snippet would help us help you.


3 Comments

Hi, thanks for your quick response. In the code you provided I changed just one line: observer.schedule(handler, path='./mont', recursive=True). I then moved 45 files into the mont folder with the mv command, and I get only 2 events for each handler. Below is the printed output:

    XmlValidatorHandler - event 1; validate: /home/devtool/test/mont/invalid_test_invalid_data_type.xlsx
    SlowXmlValidatorHandler - event 1; validate: /home/devtool/test/mont/invalid_test_invalid_data_type.xlsx
    XmlValidatorHandler - event 2; validate: /home/devtool/test/mont/valid_test_offers_terms_HDVOD_SDEST.xlsx
    SlowXmlValidatorHandler - event 2; validate: /home/devtool/test/mont/valid_test_preview_missing.xlsx

Even after changing the queue size to 100, the result is the same.
