Not able to click on element in a Selenium using Python automation script

Question

Issue Description:

I am trying to automate a process where I can visit a website, hover over the menu navigation bar and click on each navigation category options from tier 1 dropdown, visit that page and scrape product details of top 20 products on that page and put it in an excel file. If that page does not contain any product, the script will continue scrolling down till it reaches the end of the page and if no product-div is found, it will go back to the top of the page and click on the next category in the navigation panel

I am working with Selenium (using python) for this. I have attached my code below.

scroll_and_click_view_more function is for scrolling down the page, prod_vitals function is for scraping product details specific to each page, and prod_count function is for extracting total count of products on each page and creating a summary of all pages.

Error Description:

When I run the below code, every function is working fine except one. The first page that this code is scrolling down, does not have any product details. The script will scroll down the entire page, print no product tiles found on that page and then is supposed to click on the next category but for some reason it can't click on the next category in the path. It throws timeout exception error and clicks on the next category which is working fine again. This website has two categories where there is no product tile present and for both of these pages, the script is unable to click on the next category available. I am attaching a screenshot of the error.

Output of my code:

['/feature/unlock-your-courage.html', '/shop/new/women', '/shop/women', '/shop/men/bags', '/shop/collection', '/shop/gift/women/bestseller', '/shop/coachworld', '/shop/coachreloved/coach-reloved'] Reached the end of the page and no product tiles were found: /feature/unlock-your-courage.html Element with href /shop/new/women not clickable Link: /shop/women Link: /shop/men/bags Link: /shop/collection Link: /shop/gift/women/bestseller Reached the end of the page and no product tiles were found: /shop/coachworld Element with href /shop/coachreloved/coach-reloved not clickable

If you look at the output, in the first line, it prints all the navigation categories available on the site. After that, the script visits all the URLs in that array is able to click on all the URLs except the second and eighth one. FYI, the first and seventh category does not contain any product tile on that page. Rest all the links are clickable. The clicking on each category and iterating over the loop is taken care inside the WebScraper class.

Resolution Steps:

I have tried adding time.sleep() in between the actions but still this doesn't work. I also added a step where it is taking a screenshot when timeout exception is happening, I can see the category is visible on screen but still it is not clickable.

I am attaching a screenshot of the output on terminal. Screenshot of error

I am attaching my code below:

from selenium import webdriver from selenium.webdriver.common.by import By from selenium.webdriver.chrome.service import Service from selenium.webdriver.support.ui import WebDriverWait from selenium.webdriver.chrome.options import Options from selenium.common.exceptions import TimeoutException from selenium.webdriver.support import expected_conditions as EC from bs4 import BeautifulSoup import pandas as pd import time import re import os import shutil import datetime import openpyxl import chromedriver_autoinstaller from openpyxl import Workbook from openpyxl.styles import PatternFill from openpyxl.utils.dataframe import dataframe_to_rows #custom_path = r"c:\Users\DELL\Documents\Self_Project" # Define the custom path where you want ChromeDriver to be installed #temp_path=chromedriver_autoinstaller.install() # Installs the ChromeDriver to a temporary directory and returns the path to that directory. #print("Temporary path",temp_path) #final_path = os.path.join(custom_path, "chromedriver.exe") # constructs and stores the full path to the ChromeDriver executable in the custom directory. #shutil.move(temp_path, final_path) # Moves the ChromeDriver executable from the temporary directory to the custom directory. #print("ChromeDriver installed at:", final_path) date_time = datetime.datetime.now().strftime("%m%d%Y_%H%M%S") file_name = f'CRTL_JP_staging_products_data_{date_time}.xlsx' products_summary = [] max_count_of_products=20 def scroll_and_click_view_more(driver,href): flag=False last_height = driver.execute_script("return window.pageYOffset + window.innerHeight") while True: try: driver.execute_script("window.scrollBy(0, 800);") time.sleep(4) new_height1 = driver.execute_script("return window.pageYOffset + window.innerHeight") try: WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, 'div.product-tile'))) except Exception as e: new_height = driver.execute_script("return window.pageYOffset + window.innerHeight") if new_height1 == last_height and flag==False: print("Reached the end of the page and no product tiles were found: ",href) return "No product tiles found" else: last_height = new_height continue div_count = 0 flag=True #while div_count >= 0: response = driver.page_source soup = BeautifulSoup(response, 'html.parser') div_elements = soup.find_all('div', class_ = 'product-tile') div_count = len(div_elements) if(div_count > max_count_of_products): return(driver.page_source) driver.execute_script("window.scrollBy(0, 300);") time.sleep(3) new_height = driver.execute_script("return window.pageYOffset + window.innerHeight") #print(new_height) if new_height == last_height: print("Reached the end of the page: ",href) return("Reached the end of the page.") break else: last_height = new_height except Exception as e: print(e) break def prod_vitals(soup,title,url): count_of_items=1 products_data = [] # Array to store all product data for our excel sheet for div in soup.find_all('div', class_ = 'product-tile'): # Iterate over each individual product-tile div tag if count_of_items<=max_count_of_products: #print(title) list_price = 0 # Variable to store list price sale_price = 0 # Variable to store sale price discount1 = 0 # Variable to store discount% that is displayed on the site discount2 = 0 count_of_items = count_of_items+1; # Variable to store discount% calculated manually res = "Incorrect" # Variable to store result of discount1==discount2; initialized with Incorrect #pro_code = div.select('div.css-1fg6eq7 img')[0]['id'] pro_name = div.select('div.product-name a.css-avqw6d p.css-1d5mpur')[0].get_text() pdpurl = div.select('div.css-grdrdu a.css-avqw6d')[0]['href'] pdpurl = url+pdpurl element = div.select('div.salePriceWrapper span.salesPrice') # Extract all the salesPrice span elements inside salePriceWrapper div (Ideally only one should be present) "<span class="chakra-text salesPrice false css-1gi2nbo" data-qa="m_plp_txt_pt_price_upper_rl">¥179000 </span>" if element: # If sale price exists sale_price = float(element[0].get_text().replace('¥', '').replace(',', '')) # Extract the text of the first element in the list (which is the price including the dollar sign), removes the dollar sign with the replace method, and converts the result to a float res="Correct" element = div.select('div.comparablePriceWrapper span.css-l96gil') # Similarly extract list price if element: list_price = float(element[0].get_text().replace('¥', '').replace(',', '')) percent_off = div.select('div.salePriceWrapper span.css-181q1zt') # Similarly extract the DR% off text if percent_off: percent_off = percent_off[0].get_text() discount1 = re.search(r'\d+', percent_off).group() # Extract only the digits from the DR% using the search function from regex library and group them together; return type is a string discount1 = int(discount1) else: percent_off = 0 # Convert the DR% characters into integer discount2 = round(((list_price - sale_price) / list_price) * 100) # Calculate the correct DR% manually using list price and sale price if(discount1 == discount2): # Check if DR% on site matches with the expected DR% or not res = "Correct" # If yes then store result as correct else Incorrect else: res = "Incorrect" products_data.append({'Product Name': pro_name,'Product URL': pdpurl, 'Sale Price': '¥'+format(sale_price, '.2f'), 'List Price': '¥'+format(list_price, '.2f'), 'Discount on site': str(discount1)+'%', 'Actual Discount': str(discount2)+'%', 'Result': res}) # Append the extracted data to the list else: break time.sleep(5) df = pd.DataFrame(products_data, columns=['Product Name', 'Product URL', 'Sale Price', 'List Price', 'Discount on site', 'Actual Discount', "Result" ]) # Convert the array along with specific column names to a pandas DataFrame; A DataFrame is a two-dimensional labeled data structure with columns potentially of different types if os.path.exists(file_name): book = openpyxl.load_workbook(file_name) else: book = Workbook() default_sheet = book.active book.remove(default_sheet) sheet = book.create_sheet(title) for row in dataframe_to_rows(df, index=False, header=True): sheet.append(row) yellow_fill = PatternFill(start_color='FFFF00', end_color='FFFF00', fill_type='solid') green_fill = PatternFill(start_color='00FF00', end_color='00FF00', fill_type='solid') for row in range(2, sheet.max_row + 1): cell = sheet.cell(row=row, column=8) if cell.value == "Correct": cell.fill = green_fill else: cell.fill = yellow_fill book.save(file_name) def prod_count(soup,title): product_count_element = soup.find('p', {'class': 'chakra-text total-count css-120gdxl', 'data-qa': 'plp_txt_resultcount'}) if product_count_element: pro_count_text = product_count_element.get_text() pro_count_text = pro_count_text.replace(',', '') pro_count = re.search(r'\d+', pro_count_text).group() products_summary.append({'Category': title,'Total products available': pro_count, 'Total products scraped': max_count_of_products}) class WebScraper: def __init__(self): self.url = "https://staging1-japan.coach.com/?auto=true" self.reloved_url="https://staging1-japan.coach.com/shop/coachreloved/coach-reloved" self.driver = webdriver.Chrome() #options = Options() #options.add_argument("--lang=en") #self.driver = webdriver.Chrome(service=Service(r"c:\Users\DELL\Documents\Self_Project\chromedriver.exe"), options=options) def scrape(self): self.driver.get(self.url) self.driver.maximize_window() time.sleep(5) nav_count = 0 soup = BeautifulSoup(self.driver.page_source, 'html.parser') links = soup.find('div', {'class': 'css-wnawyw'}).find_all('a', {'class': 'css-ipxypz'}) hrefs = [link.get('href') for link in links] print(hrefs) for i,href in enumerate(hrefs): try: #print(href) element1 = WebDriverWait(self.driver, 30).until(EC.presence_of_element_located((By.CSS_SELECTOR, f'a[href="{href}"]'))) #self.driver.execute_script("arguments[0].scrollIntoView(true);", element1) self.driver.execute_script("window.scrollTo(0, arguments[0].getBoundingClientRect().top + window.scrollY - 100);", element1) time.sleep(10) is_visible = self.driver.execute_script("return arguments[0].offsetParent !== null && arguments[0].getBoundingClientRect().top >= 0 && arguments[0].getBoundingClientRect().left >= 0 && arguments[0].getBoundingClientRect().bottom <= (window.innerHeight || document.documentElement.clientHeight) && arguments[0].getBoundingClientRect().right <= (window.innerWidth || document.documentElement.clientWidth);", element1) #print("Displayed: {element1.is_displayed()}, Visible: {is_visible}") WebDriverWait(self.driver, 30).until(EC.element_to_be_clickable((By.CSS_SELECTOR, f'a[href="{href}"]'))).click() time.sleep(3) response = scroll_and_click_view_more(self.driver,href) time.sleep(3) if(response!="No product tiles found" and response!="Reached the end of the page."): print("Link: \n",href) soup = BeautifulSoup(response, 'html.parser') PLP_title=links[nav_count].get('title') prod_vitals(soup,PLP_title,self.url) time.sleep(5) prod_count(soup,PLP_title) self.driver.execute_script("window.scrollBy(0, -500);") else: self.driver.execute_script("window.scrollTo(0,0);") #element2 = WebDriverWait(self.driver, 15).until(EC.presence_of_element_located((By.CSS_SELECTOR, f'a[href="{hrefs[i+1]}"]'))) #self.driver.execute_script("window.scrollTo(0, arguments[0].getBoundingClientRect().top + window.scrollY - 100);", element2) #time.sleep(3) #is_visible = self.driver.execute_script("return arguments[0].offsetParent !== null && arguments[0].getBoundingClientRect().top >= 0 && arguments[0].getBoundingClientRect().left >= 0 && arguments[0].getBoundingClientRect().bottom <= (window.innerHeight || document.documentElement.clientHeight) && arguments[0].getBoundingClientRect().right <= (window.innerWidth || document.documentElement.clientWidth);", element2) #print(f"Element href: {hrefs[i+1]}, Displayed: {element2.is_displayed()}, Visible: {is_visible}") time.sleep(3) continue except TimeoutException: print(f"Element with href {href} not clickable") self.driver.save_screenshot('timeout_exception.png') except Exception as e: print(f"An error occurred: {e}") nav_count+=1 df = pd.DataFrame(products_summary, columns=['Category', 'Total products available','Total products scraped']) book = openpyxl.load_workbook(file_name) sheet = book.create_sheet('Summary') for row in dataframe_to_rows(df, index=False, header=True): sheet.append(row) book.save(file_name) scraper = WebScraper() scraper.scrape() time.sleep(5) scraper.driver.quit()

Please find my updated code below as per @mehdi-ahmadi's comment and along with it the output and issues I am facing now

I initially tried with your first option but that was not working fine so decided to change the logic instead and tried with second option by getting anchor from nav each time. With this logic, the second link is clickable now ('/shop/new/women'). However, the last link is again getting timeout exception and not able to click on it(/shop/coachreloved/coach-reloved).

Please find the output below:

0 /feature/unlock-your-courage.html Reached the end of the page and no product tiles were found: /feature/unlock-your-courage.html nav_count 1 1 /shop/new/women nav_count 2 2 /shop/women nav_count 3 3 /shop/men/bags nav_count 4 4 /shop/collection nav_count 5 5 /shop/gift/women/bestseller nav_count 6 6 /shop/coachworld Reached the end of the page and no product tiles were found: /shop/coachworld nav_count 7 Element with href /shop/coachreloved/coach-reloved not clickable

I am attaching my updated class also below. Can you please help?

def scrape(self): self.driver.get(self.url) self.driver.maximize_window() time.sleep(5) nav_count = 0 while True: try: # Refresh the page source and parse it soup = BeautifulSoup(self.driver.page_source, 'html.parser') links = soup.find('div', {'class': 'css-wnawyw'}).find_all('a', {'class': 'css-ipxypz'}) hrefs = [link.get('href') for link in links] # Check if nav_count is within the range of hrefs if nav_count < len(hrefs): href = hrefs[nav_count] time.sleep(2) element = WebDriverWait(self.driver, 30).until(EC.presence_of_element_located((By.CSS_SELECTOR, f'a[href="{href}"]'))) self.driver.execute_script("arguments[0].scrollIntoView(true);", element) time.sleep(3) WebDriverWait(self.driver, 30).until(EC.element_to_be_clickable((By.CSS_SELECTOR, f'a[href="{href}"]'))).click() time.sleep(3) print(nav_count, href) response = scroll_and_click_view_more(self.driver, href) time.sleep(3) if response != "No product tiles found" and response != "Reached the end of the page.": #print("Link: \n", href) soup = BeautifulSoup(response, 'html.parser') PLP_title = links[nav_count].get('title') prod_vitals(soup, PLP_title, self.url) time.sleep(5) prod_count(soup, PLP_title) self.driver.execute_script("window.scrollBy(0, -500);") time.sleep(2) else: self.driver.get(self.url) time.sleep(5) continue else: break except TimeoutException: print(f"Element with href {href} not clickable") self.driver.save_screenshot('timeout_exception.png') except Exception as e: print(f"An error occurred: {e}") finally: nav_count += 1 print("nav_count", nav_count)

mehdi_ahmadi · Accepted Answer · 2024-10-01 10:23:06Z

the problem is you get anchor tags from https://staging1-japan.coach.com/?auto=true and you did save them in list but when you are in this page https://staging1-japan.coach.com/feature/unlock-your-courage.html you want to click on anchor tag is in https://staging1-japan.coach.com/?auto=true and this is not possible so Maybe you say that both these anchor refer to the same address or are completely identical. But it has no meaning for the browser. Rather, these are two anchor on two separate pages, and you cannot click on something that is on the other page when you are on different page.

so the one solution is to load the page you read anchor from

in class WebScraper method scrape in for loop for i,href in enumerate(hrefs): you can add this code self.driver.get(self.url) sorry this is massy code i cant write all of that for you this is just part of your code to see how to change it

for i,href in enumerate(hrefs): try: ##########new line added########## self.driver.get(self.url) ################################## #print(href) element1 = WebDriverWait(self.driver, 30).until(EC.presence_of_element_located((By.CSS_SELECTOR, f'a[href="{href}"]'))) #self.driver.execute_script("arguments[0].scrollIntoView(true);", element1) self.driver.execute_script("window.scrollTo(0, arguments[0].getBoundingClientRect().top + window.scrollY - 100);", element1) time.sleep(10)

second solution is to get anchor from nav each time from each page your in if you are sure anchor are in all pages and they are same

third is to open each link in new tab and back to the first tab in the end so your code will be :

def scrape(self): self.driver.get(self.url) # self.driver.maximize_window() time.sleep(5) nav_count = 0 soup = BeautifulSoup(self.driver.page_source, 'html.parser') links = soup.find('div', {'class': 'css-wnawyw'}).find_all('a', {'class': 'css-ipxypz'}) hrefs = [link.get('href') for link in links][-2:] mainWindow = self.driver.window_handles[0] for i,href in enumerate(hrefs): try: #print(href) self.driver.switch_to.window(mainWindow) # abslute_url = self.driver.get+href element1 = WebDriverWait(self.driver, 30).until(EC.presence_of_element_located((By.CSS_SELECTOR, f'a[href="{href}"]'))) #self.driver.execute_script("arguments[0].scrollIntoView(true);", element1) self.driver.execute_script("window.scrollTo(0, arguments[0].getBoundingClientRect().top + window.scrollY - 100);", element1) time.sleep(10) # is_visible = self.driver.execute_script("return arguments[0].offsetParent !== null && arguments[0].getBoundingClientRect().top >= 0 && arguments[0].getBoundingClientRect().left >= 0 && arguments[0].getBoundingClientRect().bottom <= (window.innerHeight || document.documentElement.clientHeight) && arguments[0].getBoundingClientRect().right <= (window.innerWidth || document.documentElement.clientWidth);", element1) #print("Displayed: {element1.is_displayed()}, Visible: {is_visible}") WebDriverWait(self.driver, 30).until(EC.element_to_be_clickable((By.CSS_SELECTOR, f'a[href="{href}"]'))).send_keys(Keys.CONTROL+Keys.ENTER) newTab = self.driver.window_handles[-1] self.driver.switch_to.window(newTab) time.sleep(3) response = scroll_and_click_view_more(self.driver,href) time.sleep(3) if(response!="No product tiles found" and response!="Reached the end of the page."): print("Link: \n",href) soup = BeautifulSoup(response, 'html.parser') PLP_title=links[nav_count].get('title') prod_vitals(soup,PLP_title,self.url) time.sleep(5) prod_count(soup,PLP_title) self.driver.execute_script("window.scrollBy(0, -500);") else: self.driver.execute_script("window.scrollTo(0,0);") #element2 = WebDriverWait(self.driver, 15).until(EC.presence_of_element_located((By.CSS_SELECTOR, f'a[href="{hrefs[i+1]}"]'))) #self.driver.execute_script("window.scrollTo(0, arguments[0].getBoundingClientRect().top + window.scrollY - 100);", element2) #time.sleep(3) #is_visible = self.driver.execute_script("return arguments[0].offsetParent !== null && arguments[0].getBoundingClientRect().top >= 0 && arguments[0].getBoundingClientRect().left >= 0 && arguments[0].getBoundingClientRect().bottom <= (window.innerHeight || document.documentElement.clientHeight) && arguments[0].getBoundingClientRect().right <= (window.innerWidth || document.documentElement.clientWidth);", element2) #print(f"Element href: {hrefs[i+1]}, Displayed: {element2.is_displayed()}, Visible: {is_visible}") time.sleep(3) continue except TimeoutException: print(f"Element with href {href} not clickable") self.driver.save_screenshot('timeout_exception.png') except Exception as e: print(f"An error occurred: {e}") nav_count+=1 self.driver.close() df = pd.DataFrame(products_summary, columns=['Category', 'Total products available','Total products scraped']) book = openpyxl.load_workbook(file_name) sheet = book.create_sheet('Summary') for row in dataframe_to_rows(df, index=False, header=True): sheet.append(row) book.save(file_name)

this give you list of tabs

self.driver.window_handles

use this to switch to other tab

self.driver.switch_to.window(mainWindow)

use this to close current active tab (if current tab is only tab close method work like quit method )

self.driver.close()

update

to open new tab in browser you can use send_keys(Keys.CONTROL+Keys.ENTER) instead of clicke()

WebDriverWait(self.driver, 30).until(EC.element_to_be_clickable((By.CSS_SELECTOR, f'a[href="{href}"]'))).send_keys(Keys.CONTROL+Keys.ENTER)

or you can use execute_script as @Annie says in comments

self.driver.execute_script(f"window.open('{href}', '_blank');")

Collectives™ on Stack Overflow

Not able to click on element in a Selenium using Python automation script

1 Answer 1

update

Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

update

Comments

Related