I'm running into some odd behaviour: a module-level variable is not visible to other functions after one function sets it. This is a Celery task file named html.py:

    base_path = ''

    @app.task(bind=True)
    def status(self):
        """
        returns the count of files downloaded and the timestamp of the most recently downloaded file
        """
        num_count = 0
        latest_timestamp = ''
        for root, _, filenames in os.walk(base_path):
            for filename in filenames:
                file_path = root + '/' + filename
                file_timestamp = datetime.fromtimestamp(os.path.getctime(file_path))
                if latest_timestamp == '' or file_timestamp > latest_timestamp:
                    latest_timestamp = file_timestamp
                num_count += 1

    @app.task(bind = True)
    def download(self, url='', cl_id=-1):
        if len(url) == 0 or cl_id < 0:
            return None
        base_path = settings.WGET_PATH + str(cl_id)
        log_paths = {
            'output' : wget_base_path + '/out.log',
            'rejected' : wget_base_path + '/rejected.log'
        }
        create_files(log_paths)
        wget_cmd = 'wget -prc --convert-links --html-extension --wait=3 --random-wait --no-parent ' \
                   '--directory-prefix={0} -o {1} --rejected-log={2} {3}'.\
            format(wget_base_path, log_paths['output'], log_paths['rejected'], url)
        subprocess.Popen(wget_cmd, shell = True)

When I call this via

    from ingest.task import html
    web_url = 'https://www.gnu.org/software/wget/manual/html_node/index.html'
    ingest = html.download.delay(web_url, 54321)

the wget process kicks off as expected. However, the base_path variable at the top of the file never gets set, so when I call status via

    status = html.status.delay()

the base_path variable is an empty string, despite status being called after download. Is this because these tasks are in a script vs a class?
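
For reference, here is a stripped-down sketch without Celery that seems to show the same behaviour, so it may be plain Python scoping rather than anything about scripts vs classes. The function names (`set_path_local`, `set_path_global`, `get_path`) and the `/tmp/wget/` prefix are made up for illustration and are not from my task file.

```python
base_path = ''  # module-level variable, as at the top of html.py


def set_path_local(cl_id):
    # Assigning to base_path inside a function creates a new local name;
    # the module-level base_path is left untouched.
    base_path = '/tmp/wget/' + str(cl_id)
    return base_path


def set_path_global(cl_id):
    # Declaring the name global makes the assignment rebind
    # the module-level variable instead.
    global base_path
    base_path = '/tmp/wget/' + str(cl_id)
    return base_path


def get_path():
    # Only reads base_path, so it sees the module-level value.
    return base_path


set_path_local(54321)
after_local = get_path()    # still the empty string

set_path_global(54321)
after_global = get_path()   # now the value set by set_path_global

print(repr(after_local))
print(repr(after_global))
```

Even with `global`, this probably would not carry over to the real tasks: as far as I understand, Celery workers typically run tasks in separate processes, so a module variable set by `download` in one worker process is not guaranteed to be visible to `status` running in another.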