I have a table from which I want to collect all the links, follow each link, and scrape the items inside `td class="horse"` on each linked page.
The home page that holds the table of links contains the following markup:
```html
<table border="0" cellspacing="0" cellpadding="0" class="full-calendar">
  <tr>
    <th width="160"> </th>
    <th width="105"><a href="/FreeFields/Calendar.aspx?State=NSW">NSW</a></th>
    <th width="105"><a href="/FreeFields/Calendar.aspx?State=VIC">VIC</a></th>
    <th width="105"><a href="/FreeFields/Calendar.aspx?State=QLD">QLD</a></th>
    <th width="105"><a href="/FreeFields/Calendar.aspx?State=WA">WA</a></th>
    <th width="105"><a href="/FreeFields/Calendar.aspx?State=SA">SA</a></th>
    <th width="105"><a href="/FreeFields/Calendar.aspx?State=TAS">TAS</a></th>
    <th width="105"><a href="/FreeFields/Calendar.aspx?State=ACT">ACT</a></th>
    <th width="105"><a href="/FreeFields/Calendar.aspx?State=NT">NT</a></th>
  </tr>
  <tr class="rows">
    <td>
      <p><span>FRIDAY 13 JAN</span></p>
    </td>
    <td>
      <p>
        <a href="/FreeFields/Form.aspx?Key=2017Jan13,NSW,Ballina">Ballina</a><br>
        <a href="/FreeFields/Form.aspx?Key=2017Jan13,NSW,Gosford">Gosford</a><br>
      </p>
    </td>
    <td>
      <p>
        <a href="/FreeFields/Form.aspx?Key=2017Jan13,VIC,Ararat">Ararat</a><br>
        <a href="/FreeFields/Form.aspx?Key=2017Jan13,VIC,Cranbourne">Cranbourne</a><br>
      </p>
    </td>
    <td>
      <p>
        <a href="/FreeFields/Form.aspx?Key=2017Jan13,QLD,Doomben">Doomben</a><br>
      </p>
    </td>
```

I currently have code that finds the table and prints the links:
```python
import requests
from bs4 import BeautifulSoup
from selenium import webdriver

# path to chromedriver
path_to_chromedriver = '/Users/Kirsty/Downloads/chromedriver'
# ensure the browser is set to Chrome
browser = webdriver.Chrome(executable_path=path_to_chromedriver)

# Racing Australia home page
url = 'http://www.racingaustralia.horse/'
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")

# look up the table and print the link for each page
table = soup.find('table', attrs={"class": "full-calendar"}).find_all('a')
for link in table:
    print(link.get('href'))
```

Can anyone assist with how to get the code to follow each of the links in the table and then do the following on each of those pages:
```python
g_data = soup.find_all("td", {"class": "horse"})
for item in g_data:
    print(item.text)
```

Thanks in advance.
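For what it's worth, here is a minimal sketch of one way this could be wired together with `requests` + BeautifulSoup alone (Selenium shouldn't be needed, since each form page can be fetched with a plain GET). The base URL and the `full-calendar` / `horse` class names come from the question; the function names and overall structure are my own assumptions. Note that the table's `href`s are site-relative (`/FreeFields/Form.aspx?...`), so they need to be joined onto the base URL before requesting them:

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

BASE_URL = "http://www.racingaustralia.horse/"  # from the question


def extract_form_links(html, base_url=BASE_URL):
    """Return absolute URLs for every link inside the full-calendar table."""
    table = BeautifulSoup(html, "html.parser").find(
        "table", attrs={"class": "full-calendar"})
    # hrefs are site-relative, so join them onto the base URL
    return [urljoin(base_url, a.get("href")) for a in table.find_all("a")]


def extract_horses(html):
    """Return the text of every <td class="horse"> on a page."""
    soup = BeautifulSoup(html, "html.parser")
    return [td.get_text(strip=True)
            for td in soup.find_all("td", {"class": "horse"})]


if __name__ == "__main__":
    home = requests.get(BASE_URL).content
    for page_url in extract_form_links(home):
        page = requests.get(page_url).content
        print(page_url)
        for horse in extract_horses(page):
            print(" ", horse)
```

Splitting the parsing into small functions that take raw HTML keeps them easy to test without hitting the network, and the driver loop at the bottom just fetches each page and feeds it through.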