Web crawling with Jsoup doesn't scrape what I want in Java

Question

I have the following code to scrape the "href" attribute of every link on the PlayStation Store webpage:

https://store.playstation.com/#!/es-...s-store%3Ahome

    String url = "https://store.playstation.com/#!/es-es/ps4/cid=STORE-MSF75508-PS4CAT%7Cplatform~ps4%7Cname~asc/";
    String url2 = "?smcid=nav%3Aps-store%3Ahome";
    int juegos_totales = 0;
    ArrayList<String> all_links = new ArrayList<String>();
    int z = 0;

    for (int i = 1; i < 50; i++) {
        String urlPage = url + i + url2;
        System.out.println("Comprobando entrada: " + urlPage);
        if (getStatusConnectionCode(urlPage) == 200) {
            Document document = getHtmlDocument(urlPage);
            Elements entradas = document.select("div.gridViewportPaneWrapper li.cellGridGameStandard");
            // Walk through each entry
            for (Element elem : entradas) {
                Elements links = elem.getElementsByTag("a");
                for (Element link : links) {
                    all_links.add(link.attr("href"));
                    juegos_totales++;
                }
                z++;
            }
            System.out.println("Hay un total de " + juegos_totales + " juegos");
        }
    }

It scrapes nothing, and I don't know why. If I try to scrape the title "PS4" instead, it works. This code should scrape all the links on the webpage.

Have you checked what's inside document? Please check this answer for more information.
– eLRuLL, Commented Jan 10, 2017 at 14:45
@eLRuLL After Document document = getHtmlDocument(urlPage); the document contains all the HTML, but the following line returns an empty result: Elements entradas = document.select("div.gridViewportPaneWrapper li.cellGridGameStandard");. I am using the same code to parse xbox.com without any problem, and that site also has a login.
– JetLagFox, Commented Jan 10, 2017 at 15:09
Have you checked that document has all the information of the page you want? Check this answer to see how to read the document.
– eLRuLL, Commented Jan 10, 2017 at 15:25
@eLRuLL You are right, the HTML is incomplete. I am reading the links you shared and trying to solve the problem, but I haven't been able to. What I don't understand is why it works on xbox.com, which also has a login.
– JetLagFox, Commented Jan 10, 2017 at 17:26
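
One possible reason for the incomplete HTML, not confirmed anywhere in this thread: everything after the # in the URL is a fragment, and HTTP clients do not send the fragment to the server. If that is what happens here, every iteration of the loop requests the same resource (the site root), and the game grid is presumably filled in later by JavaScript, which Jsoup does not execute. The sketch below uses only java.net.URI to show which part of each URL would go on the request line. The class name FragmentCheck and the helper requestTarget are hypothetical names introduced for illustration.

```java
import java.net.URI;

public class FragmentCheck {

    // Hypothetical helper: returns path plus query, i.e. the part of the URL
    // that an HTTP client places on the request line. The fragment (the text
    // after '#') is never included.
    static String requestTarget(String url) {
        URI uri = URI.create(url);
        String path = uri.getRawPath();
        String query = uri.getRawQuery();
        String target = (path == null || path.isEmpty()) ? "/" : path;
        return (query == null) ? target : target + "?" + query;
    }

    public static void main(String[] args) {
        // Same URL pieces as in the question.
        String url = "https://store.playstation.com/#!/es-es/ps4/cid=STORE-MSF75508-PS4CAT%7Cplatform~ps4%7Cname~asc/";
        String url2 = "?smcid=nav%3Aps-store%3Ahome";

        for (int i = 1; i <= 3; i++) {
            String urlPage = url + i + url2;
            URI uri = URI.create(urlPage);
            // The page number and the smcid parameter both end up in the fragment.
            System.out.println("request target: " + requestTarget(urlPage));
            System.out.println("fragment:       " + uri.getRawFragment());
        }
    }
}
```

For the URLs in the question, requestTarget returns "/" for every page number, because the page number and the "?smcid=..." suffix both sit after the #. If the target selector only exists after client-side rendering, document.select(...) on the raw response would come back empty, which would match the behaviour described in the comments.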
