I'm writing a program that has to download a number of files from the web before it can run, so I created a function called init_program that downloads all the files and "initializes" the program. It runs through a couple of dicts that hold URLs to gist files on GitHub, pulls the URLs, and uses urllib2 to download them. I won't be able to add all the files, but you can try it out by cloning the repo here. Here's the function that creates the files from the gists:

    import sys
    import urllib2

    # banner(), INIT_FILE, GIST_DICT_LINKS and GIST_RAINBOW_LINKS
    # are defined elsewhere in the program.


    def init_program():
        """
        Initialize the program and allow all the files to be downloaded
        This will take awhile to process, but I'm working on the processing speed
        """
        downloaded_wordlists = []  # Used to count the amount of items downloaded
        downloaded_rainbow_tables = []

        print("\n")
        banner("Initializing program and downloading files, this may take awhile..")
        print("\n")

        # INIT_FILE is a file that will contain "false" if the program is not initialized
        # And "true" if the program is initialized
        with open(INIT_FILE) as data:
            needs_init = data.read() == "false"

        if needs_init:
            for item in GIST_DICT_LINKS.keys():
                sys.stdout.write("\rDownloading {} out of {} wordlists.. ".format(
                    len(downloaded_wordlists) + 1, len(GIST_DICT_LINKS.keys())))
                sys.stdout.flush()
                # Download the wordlists and save them into a file
                with open("dicts/included_dicts/wordlists/{}.txt".format(item), "a+") as new_wordlist:
                    wordlist_data = urllib2.urlopen(GIST_DICT_LINKS[item])
                    new_wordlist.write(wordlist_data.read())
                downloaded_wordlists.append(item + ".txt")

            print("\n")
            banner("Done with wordlists, moving to rainbow tables..")
            print("\n")

            for table in GIST_RAINBOW_LINKS.keys():
                sys.stdout.write("\rDownloading {} out of {} rainbow tables".format(
                    len(downloaded_rainbow_tables) + 1, len(GIST_RAINBOW_LINKS.keys())))
                sys.stdout.flush()
                # Download the rainbow tables and save them into a file
                # (opened in "a+" mode so the file can actually be written to)
                with open("dicts/included_dicts/rainbow_tables/{}.rtc".format(table), "a+") as new_rainbowtable:
                    rainbow_data = urllib2.urlopen(GIST_RAINBOW_LINKS[table])
                    new_rainbowtable.write(rainbow_data.read())
                downloaded_rainbow_tables.append(table + ".rtc")

            # Will never be initialized again
            with open(INIT_FILE, "w") as flag:
                flag.write("true")

        return downloaded_wordlists, downloaded_rainbow_tables

This works, but it's extremely slow because of the size of the files: each file has at least 100,000 lines. How can I speed up this function and make it more user friendly?
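
One possible direction, as a minimal sketch and not a measured fix: stream each response to disk in chunks instead of reading the whole body into memory, and run several downloads at once in a thread pool. The sketch assumes Python 3 (so urllib.request stands in for urllib2), and the names download_one, download_all, CHUNK_SIZE and max_workers are hypothetical helpers, not part of the program. Whether it helps depends on whether the time goes to network latency or to raw bandwidth.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen  # assumption: Python 3 replacement for urllib2

CHUNK_SIZE = 64 * 1024  # hypothetical buffer size for streaming to disk


def download_one(name, url, target_dir, extension):
    """Stream one URL into target_dir/<name><extension>; return the file name."""
    file_name = name + extension
    path = os.path.join(target_dir, file_name)
    # copyfileobj copies in CHUNK_SIZE pieces, so the whole file is never
    # held in memory at once. "wb" overwrites instead of appending.
    with urlopen(url) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out, CHUNK_SIZE)
    return file_name


def download_all(links, target_dir, extension, max_workers=4):
    """Download every entry of a {name: url} dict using a small thread pool."""
    os.makedirs(target_dir, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(download_one, name, url, target_dir, extension)
            for name, url in links.items()
        ]
        # result() re-raises any download error; names come back in dict order.
        return [future.result() for future in futures]
```

With helpers like these, the two loops in init_program could become calls such as download_all(GIST_DICT_LINKS, "dicts/included_dicts/wordlists", ".txt") and download_all(GIST_RAINBOW_LINKS, "dicts/included_dicts/rainbow_tables", ".rtc"). The progress counter would then need to be updated as each future completes, which this sketch leaves out.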