I have the following code to scrape the "href" attribute of every link on this PlayStation Store page:
https://store.playstation.com/#!/es-...s-store%3Ahome
    String url = "https://store.playstation.com/#!/es-es/ps4/cid=STORE-MSF75508-PS4CAT%7Cplatform~ps4%7Cname~asc/";
    String url2 = "?smcid=nav%3Aps-store%3Ahome";
    int juegos_totales = 0;
    ArrayList<String> all_links = new ArrayList<String>();
    int z = 0;

    for (int i = 1; i < 50; i++) {
        String urlPage = url + i + url2;
        System.out.println("Comprobando entrada: " + urlPage);

        if (getStatusConnectionCode(urlPage) == 200) {
            Document document = getHtmlDocument(urlPage);
            Elements entradas = document.select("div.gridViewportPaneWrapper li.cellGridGameStandard");

            // Iterate over each of the entries
            for (Element elem : entradas) {
                Elements links = elem.getElementsByTag("a");
                for (Element link : links) {
                    all_links.add(link.attr("href"));
                    juegos_totales++;
                }
                z++;
            }
            System.out.println("Hay un total de " + juegos_totales + " juegos");
        }
    }

It scrapes nothing and I don't know why. If I try to scrape the title "PS4", it works. This code should scrape all the links on the page.
Comments:

Does document have all the information of the page you want? Check this answer to read the document.

I get the document with Document document = getHtmlDocument(urlPage); but the following line is empty: Elements entradas = document.select("div.gridViewportPaneWrapper li.cellGridGameStandard");. I am using the same code for parsing xbox.com and I don't have any problem, and it also has a login.

[...] document? Please check this answer for more information.
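
One way to follow up on the question of whether document really holds the page content: everything after the "#" in a URL is a fragment, and an HTTP client does not send the fragment to the server. The sketch below is only an illustration, not a confirmed diagnosis from this thread. It takes the url and url2 strings from the question and shows which part of each page URL would actually be requested. The class name FragmentCheck and the helper requestedPart are hypothetical names introduced here.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class FragmentCheck {

    // Hypothetical helper: returns the part of a URL that is sent to the
    // server, i.e. everything before the first '#'. The fragment after '#'
    // is only interpreted on the client side (typically by JavaScript).
    static String requestedPart(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    public static void main(String[] args) {
        // Same strings as in the question.
        String url = "https://store.playstation.com/#!/es-es/ps4/cid=STORE-MSF75508-PS4CAT%7Cplatform~ps4%7Cname~asc/";
        String url2 = "?smcid=nav%3Aps-store%3Ahome";

        // Collect the distinct server-side URLs for all 49 page numbers.
        Set<String> distinct = new LinkedHashSet<>();
        for (int i = 1; i < 50; i++) {
            distinct.add(requestedPart(url + i + url2));
        }

        // Prints: [https://store.playstation.com/]
        System.out.println(distinct);
    }
}
```

If this is what is happening, every iteration of the loop fetches the same base page, and the game grid selected by "div.gridViewportPaneWrapper li.cellGridGameStandard" would only exist after the page's JavaScript has run. Assuming getHtmlDocument is built on jsoup (the Document, Elements and Element types suggest so), it parses the HTML as delivered and does not execute scripts, which would explain why a static element such as the title is found while the list of links is empty.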