
I'm working on a Python bot that needs to interact with Google Ads, but I'm encountering a couple of issues:

1. The bot fails to select elements related to Google Ads.
2. The browser window closes automatically after a short period, which is not intended.

Here's what I've tried so far:

- I've used various Selenium methods to locate and interact with elements.
- I've checked for updates on all relevant libraries and ensured that my browser driver is up to date.
- I've implemented exception handling to catch any errors that might indicate what's going wrong.

Despite these efforts, the issues persist. The bot is supposed to run a longer session and interact with Google Ads without manual intervention. Here is a simplified version of my code:

    import time
    import pandas as pd
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.common.exceptions import NoSuchElementException
    from urllib.parse import urlparse

    def is_allowed_domain(url, allowed_domains):
        domain = urlparse(url).netloc
        return any(domain.endswith(allowed_domain) for allowed_domain in allowed_domains)

    def search_and_save_ads_to_excel(search_keyword, chromedriver_path, allowed_domains, output_excel_path):
        # Set Chrome options to open in guest profile
        chrome_options = webdriver.ChromeOptions()
        chrome_options.add_argument("--guest")

        # Set the path to the ChromeDriver executable
        service = Service(chromedriver_path)

        # Open Chrome browser with the specified options
        driver = webdriver.Chrome(service=service, options=chrome_options)

        try:
            # Maximize the browser window
            driver.maximize_window()

            # Navigate to Google
            driver.get("https://www.google.com")

            # Find the search box and enter the search keyword
            search_box = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.NAME, "q")))
            search_box.send_keys(search_keyword)
            search_box.submit()

            # Wait for search results to load
            WebDriverWait(driver, 20).until(EC.visibility_of_any_elements_located((By.CSS_SELECTOR, ".g")))

            # Find all search results
            search_results = driver.find_elements(By.CSS_SELECTOR, ".g")

            ad_data = []

            for result in search_results:
                try:
                    # Look for elements with specific ad attributes (e.g., data-text-ad)
                    ad_elements = result.find_elements(By.XPATH, ".//div[@data-text-ad='1']")
                    if ad_elements:
                        ad_url = result.find_element(By.CSS_SELECTOR, "a").get_attribute("href")
                        if is_allowed_domain(ad_url, allowed_domains):
                            ad_title = result.find_element(By.CSS_SELECTOR, "h3").text
                            ad_data.append({"Title": ad_title, "URL": ad_url})
                            print(f"Found Ad Title: {ad_title}")
                            print(f"Found Ad URL: {ad_url}")

                            # Click on the ad
                            result.find_element(By.CSS_SELECTOR, "a").click()
                            WebDriverWait(driver, 20).until(EC.number_of_windows_to_be(2))  # Wait for ad page to open
                            driver.switch_to.window(driver.window_handles[-1])  # Switch to the new window/tab
                            WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.TAG_NAME, "body")))  # Ensure ad page has loaded
                            time.sleep(10)  # Additional wait for ad page to load
                            driver.close()  # Close the ad page
                            driver.switch_to.window(driver.window_handles[0])  # Switch back to the main search results window
                except (NoSuchElementException, Exception) as e:
                    print(f"Error processing result: {e}")

            # Save ad data to Excel
            df = pd.DataFrame(ad_data)
            df.to_excel(output_excel_path, index=False)
            print(f"Ad data saved to {output_excel_path}")

        finally:
            driver.quit()

    # Example usage (for educational purposes only)
    search_keyword = "Sikka Kaamya Greens"
    chromedriver_path = r"C:/Users/J.A.S/Desktop/New folder/chromedriver.exe"  # Your ChromeDriver path (use raw string literal)
    allowed_domains = ["sikka.ixapl.com", "99acres.com"]  # List of allowed domains
    output_excel_path = "ad_data.xlsx"  # Output Excel file path

    search_and_save_ads_to_excel(search_keyword, chromedriver_path, allowed_domains, output_excel_path)

Can anyone provide guidance on how to resolve this issue? I have been diligently working on it both day and night without success. Any suggestions would be greatly appreciated. Thank you!

  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. Commented Jun 10, 2024 at 6:39

1 Answer

If you aren't getting an exception and don't know how to troubleshoot, try line-by-line debugging. It is easy to use pdb to step through your code, or learn to use your IDE's debugging tools. You can even simply open a Python REPL and paste in your commands one by one. Learning to step through your code will benefit you greatly.
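As a minimal sketch of the stepping approach: Python's built-in `breakpoint()` drops into pdb at the line where it is called. The helper `extract_domain` and its `debug` flag are made up for illustration and are not part of the question's code.

```python
from urllib.parse import urlparse


def extract_domain(url, debug=False):
    """Return the network location of a URL, optionally pausing in pdb first."""
    if debug:
        # breakpoint() opens pdb here: `n` steps to the next line,
        # `p url` prints a variable, `c` continues execution.
        breakpoint()
    return urlparse(url).netloc


print(extract_domain("https://www.google.com/search?q=test"))
```

Calling `extract_domain(..., debug=True)` from a terminal pauses inside the function so each value can be inspected before the return.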


In this case it's pretty simple. Since your program isn't printing anything, there are only three possibilities: is_allowed_domain() never returns True, ad_elements are not being found, or search_results are not being found. The issue is the line ad_elements = result.find_elements(By.XPATH, ".//div[@data-text-ad='1']"): the search returns nothing, so ad_elements is an empty list. This is because you are searching from within a result element, but ads are not children of search result elements; they are search results themselves. (This assumes we are talking about the same thing: I am defining an "ad" as a Google result marked "Sponsored".)
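The first possibility can be ruled out without a browser by calling is_allowed_domain() directly on sample URLs. The sketch below copies the function from the question. The URLs are invented for illustration, and `is_allowed_domain_strict` is a hypothetical stricter variant, added only because a plain suffix match also accepts unrelated hosts that happen to end with an allowed name.

```python
from urllib.parse import urlparse


def is_allowed_domain(url, allowed_domains):
    # Same logic as in the question: suffix match on the URL's netloc.
    domain = urlparse(url).netloc
    return any(domain.endswith(allowed) for allowed in allowed_domains)


def is_allowed_domain_strict(url, allowed_domains):
    # Hypothetical variant: accept only the exact host or a true subdomain.
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


allowed = ["sikka.ixapl.com", "99acres.com"]

print(is_allowed_domain("https://www.99acres.com/some-listing", allowed))
print(is_allowed_domain("https://not99acres.com/", allowed))
print(is_allowed_domain_strict("https://not99acres.com/", allowed))
```

If the function returns True for the URLs you expect, the filter is not the reason nothing is printed, and attention can move to the element lookups.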

To correct this, search for ads then iterate through them:

    search_box.send_keys(search_keyword)
    search_box.submit()
    WebDriverWait(driver, 20).until(EC.visibility_of_any_elements_located((By.CSS_SELECTOR, ".g")))

    ad_elements = driver.find_elements(By.XPATH, ".//div[@data-text-ad='1']")
    for ad_element in ad_elements:
        # note: there may be multiple links in the same ad element
        ad_links = ad_element.find_elements(By.CSS_SELECTOR, "a")
        ad_link = ad_links[0]
        ad_url = ad_link.get_attribute("href")
        if is_allowed_domain(ad_url, allowed_domains):
            # finding the title is not as simple as locating h3 children
            ad_title = ad_element.find_elements(...)
            ...

BTW, I get an ElementClickInterceptedException when trying to click() an <a> element directly on the search results page. I would recommend calling driver.switch_to.new_window() and then driver.get(ad_url) instead.
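A rough sketch of that approach, assuming Selenium 4 or later (where `switch_to.new_window` is available). The helper name `visit_in_new_tab` is made up here. It takes an existing driver, so it does not start a browser itself.

```python
def visit_in_new_tab(driver, url):
    """Open url in a new tab, then close that tab and return to the original one.

    Returns the title of the visited page.
    """
    original = driver.current_window_handle
    # Opens a new tab and switches focus to it.
    driver.switch_to.new_window("tab")
    try:
        driver.get(url)
        return driver.title
    finally:
        # Always close the new tab and go back, even if loading failed.
        driver.close()
        driver.switch_to.window(original)
```

Inside the loop over ad elements this would replace the click() call and the window-handle bookkeeping, for example `visit_in_new_tab(driver, ad_url)`.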


As a side note, the line except (NoSuchElementException, Exception) as e: is redundant, because NoSuchElementException is a subclass of Exception and would be caught by Exception alone. If you want different behavior for NoSuchElementException and other Exception types (which would be a good idea), use separate except clauses:

    except NoSuchElementException as e:
        # do something
    except Exception as e:
        # do something else
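A runnable illustration of the clause ordering. So that it works without Selenium installed, it defines a stand-in class with the same name as Selenium's exception. The `classify` helper and the two failing functions are invented for the example.

```python
class NoSuchElementException(Exception):
    """Stand-in for selenium.common.exceptions.NoSuchElementException."""


def classify(action):
    """Run action and report which except clause handled any failure."""
    try:
        action()
    except NoSuchElementException as e:
        # The more specific clause must come first to be reachable.
        return f"missing element: {e}"
    except Exception as e:
        return f"unexpected {type(e).__name__}: {e}"
    return "ok"


def missing():
    raise NoSuchElementException("h3")


def broken():
    raise ValueError("bad URL")


print(classify(missing))
print(classify(broken))
```

If the clauses were swapped, the Exception clause would catch everything and the specific handler would never run.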