1

I have a scraper that starts a "requests" session and fetches some data over IPv6. I now have a list of 10,000 IPs. I wrote the script using threading, but it gives an error. I need help finding the issue.

import requests, queue, threading, urllib3, json, pandas as pd, os, time, datetime, inspect

num_threads = 2
root = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
with open(root + "/ip_list.txt") as ips:
    device_ip = list(ips)


class Writer_Worker(threading.Thread):
    def __init__(self, queue, df, *args, **kwargs):
        if not queue:
            print("Device Queue not specified")
            exit(1)
        self.out_q = queue
        self.df = df
        super().__init__(*args, **kwargs)

    def run(self):
        while True:
            try:
                device_details = self.out_q.get(timeout=3)
            except queue.Empty:
                return
            self.df[device_details[0]] = device_details
            self.out_q.task_done()


class Worker(threading.Thread):
    def __init__(self, queue, out_queue, device_password, *args, **kwargs):
        if not queue:
            print("Device Queue not specified")
            exit(1)
        self.queue = queue
        self.pas = device_password
        self.out_q = out_queue
        super().__init__(*args, **kwargs)

    def run(self):
        while True:
            try:
                device_ip = self.queue.get(timeout=3)
            except queue.Empty:
                return
            self.connect_to_device_and_process(device_ip)
            self.queue.task_done()

    def connect_to_device_and_process(self, device_ip):
        st = str("Online")
        try:
            r = requests.post("https://["+device_ip+"]/?q=index.login&mimosa_ajax=1", {"username":"configure", "password":self.pas}, verify=False)
        except requests.exceptions.ConnectionError:
            st = str("Offline")
            self.out_q.put([device_ip,st,"","","","","","","","","","","","","","","","","",""])
            return
        finally:
            if 'Online' in st:
                r = requests.get("https://["+device_ip+"]/cgi/dashboard.php", cookies=r.cookies, verify=False)
                if "Response [401]" in str(r):
                    st2 = str("Password Error")
                    self.out_q.put([device_ip,st2,"","","","","","","","","","","","","","","","","",""])
                else:
                    data = json.loads(r.content.decode())
                    output5 = data['config']['Spectrum_Power']
                    self.out_q.put([device_ip,st,output5['Auto_Power'].replace('2', 'Max Power').replace('1', 'Min Power').replace('0', 'off'),output5['AutoConfig']])


def main():
    start = time.time()
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
    pas = input("Enter Device Password:")
    df = pd.DataFrame(columns=["IP","Status","Auto_Power","AutoChannel"])
    q = queue.Queue(len(device_ip))
    for ip in device_ip:
        q.put_nowait(ip)
    out_q = queue.Queue(len(device_ip))
    Writer_Worker(out_q, df).start()
    for _ in range(num_threads):
        Worker(q, out_q, pas).start()
    q.join()
    print(df)
    df.to_excel('iBridge_C5x_Audit_Report.xlsx', sheet_name='Detail', index=False)


if __name__ == "__main__":
    main()

Below is the error I get when running the script. It seems I am unable to log in to this device. Any help is appreciated.

[Screenshot of the error message; image not available]

2
  • Your st2 = str("Password Error") is redundant; "Password Error" is already a string. Commented Jun 14, 2021 at 15:24
  • @Theshape Yes, right, I will make that correction. My main issue is how to make it multithreaded. Commented Jun 14, 2021 at 15:25

2 Answers

3

You should use a thread pool, which distributes the work across a fixed number of threads. ThreadPoolExecutor has been part of Python's standard library (module concurrent.futures) since version 3.2.

  1. from concurrent.futures import ThreadPoolExecutor
  2. Define a function perform(ip) that performs the request for one ip
  3. Set variable numThreads to the number of desired threads
  4. Run the thread-pool executor:
    print(f'Using {numThreads} threads')
    with ThreadPoolExecutor(max_workers=numThreads) as pool:
        success = all(pool.map(perform, ips))
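The steps above can be sketched roughly as follows. This is only a minimal sketch under assumptions, not the asker's finished scraper: perform here is a placeholder that just builds the dashboard URL instead of calling requests, and load_ips and run_audit are hypothetical helper names that do not appear in the question. In the real script, perform would contain the login and dashboard requests and return one row per device.

```python
from concurrent.futures import ThreadPoolExecutor


def load_ips(lines):
    """Strip whitespace and drop empty lines (lines read from a file end with a newline)."""
    return [line.strip() for line in lines if line.strip()]


def perform(ip):
    """Placeholder for the per-device work; returns one result row.

    Assumption: in the real scraper, the requests.post / requests.get
    calls from the question would go here.
    """
    url = "https://[" + ip + "]/cgi/dashboard.php"
    return [ip, "Online", url]


def run_audit(ips, worker=perform, num_threads=2):
    """Run worker(ip) for every ip on a fixed-size pool of threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map yields results in the same order as the input list
        return list(pool.map(worker, ips))


if __name__ == "__main__":
    # 2001:db8::/32 is the IPv6 documentation prefix, used here as dummy data
    rows = run_audit(load_ips(["2001:db8::1\n", "2001:db8::2\n", "\n"]))
    for row in rows:
        print(row)
```

Because every call returns its row to the main thread, the rows can be collected there (for example with pd.DataFrame(rows, columns=[...])), so a separate writer thread and output queue may not be needed.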

Source: https://docs.python.org/3/library/concurrent.futures.html

On that page you find an example even better tailored to your application: https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor-example
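The pattern in that documentation example (submit each job, then handle results as they complete) might look like this when applied to a list of IPs. This is a sketch under assumptions: check_all and fake_worker are hypothetical names, and fake_worker only simulates a device that cannot be reached so the example runs without a network.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def check_all(ips, worker, num_threads=2):
    """Submit worker(ip) for each ip and collect a result or error text per ip."""
    results = {}
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Map each future back to the ip it was submitted for
        future_to_ip = {pool.submit(worker, ip): ip for ip in ips}
        for future in as_completed(future_to_ip):
            ip = future_to_ip[future]
            try:
                results[ip] = future.result()
            except Exception as exc:
                # An exception raised inside the worker is re-raised by result()
                results[ip] = "Error: " + str(exc)
    return results


def fake_worker(ip):
    """Hypothetical stand-in for the real request; fails for one address."""
    if ip == "2001:db8::dead":
        raise ConnectionError("unreachable")
    return "Online"


if __name__ == "__main__":
    print(check_all(["2001:db8::1", "2001:db8::dead"], fake_worker))
```

Compared with pool.map, this form lets one failing device be recorded without aborting the iteration over the remaining results.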


1 Comment

Thanks for the help, but I am unable to adapt this to my requirements. If possible, could you modify my code? Thanks in advance. I am a beginner.
0

from threading import Thread

th = Thread(target=self.fill_imdb, args=(movies_info_part, "thread " + str(count)))
th.start()

fill_imdb is my method
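A fuller version of this approach, with one thread per chunk of work and a join at the end, could look like the sketch below. The body of fill_imdb is not shown in the answer, so fill_part is a hypothetical stand-in that just sums its input, and run_in_threads is likewise an illustrative name.

```python
from threading import Thread


def fill_part(part, name, results):
    """Hypothetical stand-in for fill_imdb: store one result under the thread's name."""
    results[name] = sum(part)


def run_in_threads(parts):
    """Start one thread per part, wait for all of them, and return the results."""
    results = {}
    threads = []
    for count, part in enumerate(parts):
        th = Thread(target=fill_part, args=(part, "thread " + str(count), results))
        th.start()
        threads.append(th)
    # Without join() the main thread could read results before the workers finish
    for th in threads:
        th.join()
    return results


if __name__ == "__main__":
    print(run_in_threads([[1, 2], [3, 4, 5]]))
```

Note that this starts as many threads as there are parts, so for a list of 10,000 IPs the work would need to be split into a small number of chunks first.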
