I made a script to scrape some data from a web site, but it only runs for a few pages and then stops with this message: "'NoneType' object has no attribute 'a'". Another error that appears sometimes is this:
File "scrappy3.py", line 31, in <module> f.writerow(doc_details) File "C:\python\lib\encodings\cp1252.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\u015f' in position 251: character maps to <undefined> Can You please give me an advice how to resolve those errors.This is my script:
```python
import requests
import csv
from bs4 import BeautifulSoup
import re
import time

start_time = time.time()
page = 1
f = csv.writer(open("./doctors.csv", "w", newline=''))
while page <= 5153:
    url = "http://www.sfatulmedicului.ro/medici/n_s0_c0_h_s0_e0_h0_pagina" + str(page)
    data = requests.get(url)
    print('scraping page ' + str(page))
    soup = BeautifulSoup(data.text, "html.parser")
    for liste in soup.find_all('li', {'class': 'clearfix'}):
        doc_details = []
        url_doc = liste.find('a').get('href')
        for a in liste.find_all('a'):
            if a.has_attr('name'):
                doc_details.append(a['name'])
        data2 = requests.get(url_doc)
        soup = BeautifulSoup(data2.text, "html.parser")
        a_tel = soup.find('div', {'class': 'contact_doc add_comment'}).a
        tel_tag = a_tel['onclick']
        tel = tel_tag[tel_tag.find("$(this).html("):tel_tag.find(");")].lstrip("$(this).html(")
        doc_details.append(tel)
        f.writerow(doc_details)
    page += 1
print("--- %s seconds ---" % (time.time() - start_time))
```
`soup.find('div', {'class': 'contact_doc add_comment'})` does not find anything on that page, so it returns `None`, and the `.a` access on it fails.
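One way around both problems is to check the lookup result before touching `.a`, and to open the CSV with an explicit UTF-8 encoding so the Windows default cp1252 codec is never used. Below is a minimal sketch assuming the same page structure as your script; `extract_phone` and the example profile URL are illustrative names I made up, so adapt them to your loop over `url_doc`.

```python
import csv
import requests
from bs4 import BeautifulSoup

def extract_phone(profile_url):
    """Fetch one doctor profile and pull the phone number out of the
    onclick handler, or return None when the contact block is absent."""
    resp = requests.get(profile_url)
    soup = BeautifulSoup(resp.text, "html.parser")

    contact_div = soup.find('div', {'class': 'contact_doc add_comment'})
    if contact_div is None or contact_div.a is None:
        # Not every profile has the contact block; returning None here
        # avoids the "'NoneType' object has no attribute 'a'" crash.
        return None

    onclick = contact_div.a.get('onclick', '')
    start = onclick.find("$(this).html(")
    end = onclick.find(");")
    if start == -1 or end == -1:
        return None
    return onclick[start + len("$(this).html("):end]

# Opening the CSV as UTF-8 lets characters such as '\u015f' be written,
# which is what the cp1252 UnicodeEncodeError is complaining about.
with open("./doctors.csv", "w", newline='', encoding='utf-8') as out_file:
    writer = csv.writer(out_file)
    # Hypothetical profile URL for illustration; in your script this would
    # be each url_doc collected from the listing pages.
    phone = extract_phone("http://www.sfatulmedicului.ro/medici/example")
    if phone is not None:
        writer.writerow([phone])
    else:
        print("no contact block on this page, skipping")
```

With the `None` check in place, pages without a contact block are simply skipped instead of stopping the whole run.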