How to select elements from the current node with Selenium

Question

I want to scrape contact information with Selenium from the website below:

http://buyersguide.recyclingtoday.com/search.

To keep each piece of information matched to the right company, I want to select the table rows first and then select the information from within each row. My simplified code is below. My question is how to select the information (for example, company name and email) from each row.

Code:

    from time import sleep
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait as wait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.common.exceptions import NoSuchElementException
    import pandas as pd

    driver = webdriver.Chrome('D:\chromedriver_win32\chromedriver.exe')
    driver.get('http://buyersguide.recyclingtoday.com/search')
    rows = driver.find_elements_by_xpath('//*[@id="Body_tbl"]/tbody/tr')
    for row in rows:
        email = row.find_element_by_xpath('//*/tr/td[3]/a').text
        company=row.find_element_by_xpath('//*/tr/td[1]').text

Update: I ran the code suggested in the answers below, but I still have a problem:

    from time import sleep
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait as wait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.common.exceptions import NoSuchElementException
    import pandas as pd

    driver = webdriver.Chrome('D:\chromedriver_win32\chromedriver.exe')
    driver.get('http://buyersguide.recyclingtoday.com/search')
    rows = driver.find_elements_by_xpath('//*[@id="Body_tbl"]/tbody/tr')
    records = []
    for row in rows:
        company=row.find_element_by_xpath('./td[1]').text
        address = row.find_element_by_xpath('./td[2]').text
        contact= row.find_element_by_xpath('./td[3]//a').text
        number= row.find_element_by_xpath('./td[5]').text
        records.append((company,address,contact,number))
    df = pd.DataFrame(records, columns=['company','number','address', 'contact'])

No content selected

Yes, I need the data from all the pages, but the code I wrote does not seem to work.
– Yan Zhang, Commented Sep 7, 2018 at 12:54

Benjamin Loison · Accepted Answer · 2024-10-22 23:33:11Z

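You can get the details as follows.

First locate the rows of the table that hold data, excluding the table header. Then read each cell with an XPath relative to the current row, that is, one starting with './'.

The examples below are written according to your HTML.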

Example using Python:

    rows = driver.find_elements_by_xpath("//td[@style='font-weight:bold;']//parent::tr")
    for row in rows:
        company=row.find_element_by_xpath('./td[1]').text
        address = row.find_element_by_xpath('./td[2]').text
        contact= row.find_element_by_xpath('./td[3]//a').text
        number= row.find_element_by_xpath('./td[5]').text

Example using Java:

    // findElements expects a By locator, so the XPath string is wrapped in By.xpath
    List<WebElement> findData = driver.findElements(By.xpath("//td[@style='font-weight:bold;']//parent::tr"));
    for (WebElement webElement : findData) {
        String getValueofCompany = webElement.findElement(By.xpath("./td[1]")).getText();
        String getValueofAddress = webElement.findElement(By.xpath("./td[2]")).getText();
        String getValueofContact = webElement.findElement(By.xpath("./td[3]//a")).getText();
        String getValueofPhoneNumber = webElement.findElement(By.xpath("./td[5]")).getText();
    }
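The './' prefix is what makes these lookups row-specific. An XPath that begins with '//' is evaluated from the document root even when it is called on a row element, so every iteration returns the first match on the page, whereas './td[1]' is resolved against the current row. The following is a minimal sketch of the per-row pattern using Python's standard-library xml.etree.ElementTree in place of a live browser. The table markup, company names, and addresses are made up for illustration and are not taken from the real site.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the results table; the real page's markup may differ.
TABLE = """
<table id="Body_tbl">
  <tbody>
    <tr><td>Acme Recycling</td><td>1 Main St</td><td><a href="mailto:info@acme.example">Email</a></td><td></td><td>555-0100</td></tr>
    <tr><td>Beta Metals</td><td>2 Oak Ave</td><td><a href="mailto:sales@beta.example">Email</a></td><td></td><td>555-0200</td></tr>
  </tbody>
</table>
"""


def extract_rows(table_markup):
    """Return (company, href) pairs, one per row, using row-relative paths."""
    root = ET.fromstring(table_markup)
    records = []
    for row in root.findall("./tbody/tr"):
        # Both paths start with "./", so they are resolved against this row only.
        company = row.find("./td[1]").text
        link = row.find("./td[3]//a")
        records.append((company, link.get("href")))
    return records


records = extract_rows(TABLE)
for company, href in records:
    print(company, href)
```

ElementTree supports only a subset of XPath and is used here purely to show the relative-path idea; with Selenium the same './td[1]' and './td[3]//a' expressions are passed to the row element's own find method, as in the examples above.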

Reading the email from the link text will not work, because in many cases the text does not contain the email address directly. We need the value of the href attribute instead, e.g. from '//tbody//tr[4]//td[3]/a'.
@Amit Yes, I missed the <a> for emails. I have updated the solution.
We also get the error below:
    File "<ipython-input-62-94e1b71ee87b>", line 10
      rows = driver.find_elements_by_xpath('//td[@style='font-weight:bold;']//parent::tr')
      ^
    SyntaxError: invalid syntax
Updated to use double quotes around the XPath: ("//td[@style='font-weight:bold;']//parent::tr")
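As a sketch of the href point raised in the comments above: once the attribute value has been read (for example with get_attribute("href") on the link element), the address still has to be separated from the link scheme. The helper below assumes the links are ordinary mailto: links, which the thread does not confirm; the function name email_from_href is hypothetical.

```python
from urllib.parse import unquote, urlparse


def email_from_href(href):
    """Return the address from a mailto: link, or None for anything else.

    Assumes the email links use the mailto: scheme (not confirmed for the site).
    """
    if not href:
        return None
    parsed = urlparse(href)
    if parsed.scheme != "mailto":
        return None
    # parsed.path excludes any "?subject=..." query; unquote decodes %-escapes.
    return unquote(parsed.path) or None


print(email_from_href("mailto:info@acme.example?subject=Hello"))
```

Returning None for non-mailto links lets the caller skip rows whose third cell links to something else, such as a contact form.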

Amit Jain · Accepted Answer · 2018-09-07 07:35:43Z

The data you want starts at the third row:

tr[3]//td[1] - contains the company name as text

tr[3]//td[3] - contains the email, but in the href attribute

So loop over the tr elements from index 3 up to the length of the rows list:

    rows = driver.find_elements_by_xpath('//*[@id="Body_tbl"]/tbody/tr')
    # XPath positions are 1-based, and the data rows start at tr[3]
    for index in range(3, len(rows) + 1):
        row_xpath = '//*[@id="Body_tbl"]/tbody/tr[' + str(index) + ']'
        # find_elements returns an empty list instead of raising when nothing matches
        name_cells = driver.find_elements_by_xpath(row_xpath + '/td[1]')
        if name_cells:
            company_name = name_cells[0].text
        email_links = driver.find_elements_by_xpath(row_xpath + '/td[3]/a')
        if email_links:
            # the email, when present, is in the href attribute
            company_email = email_links[0].get_attribute('href')

Note: I was not able to test this code, so please take care of the boundary conditions. Thanks.

Benjamin Loison · Accepted Answer · 2024-10-22 23:33:50Z


You can use something like this:

    for row in rows:
        email = row.find_element_by_xpath('.//td[3]/a').text
        company = row.find_element_by_xpath('.//td[1]').text

edited Oct 22, 2024 at 23:33

Benjamin Loison


answered Sep 7, 2018 at 7:24

Kamal


4 Comments

Yan Zhang Over a year ago

I printed email and company to check, but I got a NameError. Can you tell me why? NameError: name 'company' is not defined

Kamal Over a year ago

print("{} {}".format(email, company)) works for me inside the for loop. Please share your code if you still face the issue.

Yan Zhang Over a year ago

OK, I have shared the full code for reference. It still has the problem; please try running it.

Kamal Over a year ago

As @Amit Jain pointed out, if you check the website, the first two rows and the last two rows do not have any data. So you need to run rows = rows[2:-2] before your for loop. Hope it helps.
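The trimming step in the comment above can be sketched in plain Python. The helper name data_rows and the sample list are hypothetical, and the counts of two leading and two trailing rows come from the comment rather than from a check of the live page.

```python
def data_rows(rows, skip_head=2, skip_tail=2):
    """Drop leading and trailing rows that carry no data.

    Equivalent to rows[2:-2] for the defaults, but also safe when
    skip_tail is 0 or the list is too short to contain any data rows.
    """
    if len(rows) <= skip_head + skip_tail:
        return []
    return rows[skip_head:len(rows) - skip_tail]


# Strings stand in for the row elements returned by Selenium.
rows = ["header", "spacer", "Acme Recycling", "Beta Metals", "pager", "footer"]
trimmed = data_rows(rows)
print(trimmed)
```

Computing the end index from len(rows) avoids the pitfall that rows[2:-0] is an empty list, which matters if the number of trailing rows ever turns out to be zero.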
