I am trying to search for elements on a webpage and have used various methods, including text and XPath. It seems that the timeout option does not work the way I expected, and no exception is raised even if the text or XPath is not present in the website's HTML. The code below demonstrates why I am confused.

There are three page.find queries. The first one just removes the cookies pop-up, so it can be ignored.

The second query:

page.find("emailOrGhin", timeout = 1)

works, and as shown in the results, it executes in about 0.01 seconds, which is great.

The third query is where I have questions:

page.find("emailOrGhinnowayjunkXyq", timeout=1) 

As shown in the results, this statement takes 1.52 seconds to execute, which is longer than the specified timeout of 1 second. Why does this happen?

Even more interesting, when the query fails to find a matching element, no exception (like ElementNotFound) is raised. I tried both except asyncio.TimeoutError and except Exception as e, but the statement completes successfully and returns None. I understand I can check for None, but I expected an exception to be triggered. Unfortunately, I cannot find a list of Nodriver exceptions anywhere.

I am hoping for insight/advice on two points:

  1. Why the timeout option behaves this way.

  2. What kind of error can be trapped if an element is not found, since the statement currently returns None.

Here is the relevant code:

    from datetime import timedelta
    from time import sleep
    import asyncio
    import datetime
    import sys
    import nodriver as uc

    passed_website = "https://www.google.com"


    # Main Processing Controller! = Mainline
    def a_main_processor():
        loop = asyncio.new_event_loop()
        print("Line 20")
        driver = loop.run_until_complete(b_main_line_processing())
        driver.stop()
        print("Line 28")
        return driver


    # Main Line Processing Controller! = Mainline
    async def b_main_line_processing():
        driver = await uc.start()  # maximize=True)  # Start a new Chrome instance
        await driver.main_tab.maximize()
        sleep(1)
        ghin_url = "http://www.ghin.com/lookup.aspx"
        try:
            page = await driver.get(ghin_url, new_tab=False, new_window=False)
        except asyncio.TimeoutError as e:
            print("C100 - error", e)
            return driver

        # Find and Select Reject All Cookie Settings
        try:
            cookies = await page.find("Reject All")
            await cookies.click()
        except asyncio.TimeoutError as e:
            print("C105 - error:0", e)
            return driver

        # Find UserName Entry field name
        try:
            timepoint = datetime.datetime.now()
            user_name = await page.find("emailOrGhin", timeout = 1)
            print("First checkpoint =", str(datetime.datetime.now() - timepoint)[-9:-4], user_name)
        except Exception as e:
            print("First error took", str(datetime.datetime.now() - timepoint)[-9:-4], e)
            return driver

        # Pass bogus name to trip error
        try:
            timepoint = datetime.datetime.now()
            user_name = await page.find("emailOrGhinnowayjunkXyq", timeout=1)
            print("Second checkpoint =", str(datetime.datetime.now() - timepoint)[-9:-4], user_name)
        #except Exception as e:
        except asyncio.TimeoutError as e:
            print("Second error took", str(datetime.datetime.now() - timepoint)[-9:-4], e)
        return driver


    driver = a_main_processor()
    print("Line 62")

The results are:

    Line 20
    First checkpoint = 00.01 <input type="text" id="emailOrGhin" name="emailOrGhin" aria-describedby="emailError" aria-invalid="false" aria-required="true" value="" style="margin-bottom: 0px;"></input>
    Second checkpoint = 01.52 None
    Line 28
    Line 62
    successfully removed temp profile C:\Users\pinev\AppData\Local\Temp\uc_5yaidgqs

    Process finished with exit code 0

The website is live, and the code should work.

  • This seems like a better question for the module author. The documentation doesn't say much about the timeout option, just that it will retry for that long. Commented 2 days ago

1 Answer

  1. Why the timeout option behaves this way.

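It sleeps for 0.5 seconds between attempts to find the element, so the elapsed time is only compared against the timeout between attempts and a call can run past it: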

https://github.com/ultrafunkamsterdam/nodriver/blob/65562facd0f9d7f659085ba67458ccf5b6d7bdb0/nodriver/core/tab.py#L297

I wouldn't expect a very precise timeout to be achievable. You can override its sleep interval with something lower, but there's probably not much point.
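To illustrate the overshoot, here is a minimal sketch of a generic retry loop with a fixed sleep between attempts. It is not nodriver's actual code: `poll_until`, `never_found`, and the 0.01 second simulated query cost are all made up for the example. The point is only that the timeout is tested between attempts, so the call returns some time after the timeout has passed rather than exactly at it.

```python
import asyncio
import time


async def poll_until(check, timeout=1.0, interval=0.5):
    """Retry check() until it returns something other than None or the timeout passes.

    Hypothetical helper for illustration only; not part of nodriver.
    """
    start = time.monotonic()
    result = await check()
    while result is None:
        # The timeout is only tested here, between attempts.
        if time.monotonic() - start > timeout:
            return None
        await asyncio.sleep(interval)
        result = await check()
    return result


async def never_found():
    """Stand-in for a query that takes a little time and never matches."""
    await asyncio.sleep(0.01)  # assumed cost of one lookup
    return None


async def main():
    start = time.monotonic()
    result = await poll_until(never_found, timeout=1.0, interval=0.5)
    return result, time.monotonic() - start


result, elapsed = asyncio.run(main())
print(result, round(elapsed, 2))
```

The result is None, and the elapsed time comes out above the 1 second timeout by an amount that depends on the sleep interval and on how long each lookup takes. On a real page each lookup is slower than in this sketch, which would push the total further past the timeout.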

  2. What kind of error can be trapped if an element is not found, since the statement currently returns None.

That's just the way the API works. If an element is not found, it returns None:

https://github.com/ultrafunkamsterdam/nodriver/blob/65562facd0f9d7f659085ba67458ccf5b6d7bdb0/nodriver/core/tab.py#L243

You can wrap the find() method in your own function/method and raise an exception if it returns None.
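A minimal sketch of such a wrapper follows. `ElementNotFoundError` and `find_or_raise` are names invented here, not part of nodriver, and `FakePage` is only a stand-in so the example runs without a browser. The one thing assumed about the real API is what the question already shows: `page.find(text, timeout=...)` is awaitable and returns None when nothing matches.

```python
import asyncio


class ElementNotFoundError(Exception):
    """Hypothetical exception type; nodriver does not define this."""


async def find_or_raise(page, text, timeout=10):
    """Call page.find() and turn a None result into an exception."""
    element = await page.find(text, timeout=timeout)
    if element is None:
        raise ElementNotFoundError(
            f"no element matching {text!r} within {timeout}s"
        )
    return element


class FakePage:
    """Stand-in for a nodriver tab, used only to exercise the wrapper."""

    def __init__(self, known):
        self.known = known

    async def find(self, text, timeout=10):
        # Mimics the behaviour described above: None when nothing matches.
        return self.known.get(text)


async def demo():
    page = FakePage({"emailOrGhin": "<input id='emailOrGhin'>"})
    found = await find_or_raise(page, "emailOrGhin", timeout=1)
    try:
        await find_or_raise(page, "emailOrGhinnowayjunkXyq", timeout=1)
    except ElementNotFoundError as exc:
        return found, str(exc)
    return found, None


found, error = asyncio.run(demo())
print(found)
print(error)
```

With a real tab you would pass the `page` object returned by `driver.get(...)` instead of `FakePage`, and catch `ElementNotFoundError` where the question's code currently catches `asyncio.TimeoutError`.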
