I want to scrape some pages of this site: Marketbook.ca. I used Mechanize for that, but it does not load the pages properly: it returns a page with an empty body, as in the following code:

require 'mechanize'
agent = Mechanize.new
agent.user_agent_alias = 'Linux Firefox'
agent.get('http://www.marketbook.ca/list/list.aspx?ETID=1&catid=1001&LP=MAT&units=imperial')

What could be the issue here?

1 Answer

Actually, this page requires a JavaScript engine to display its content:

<noscript>Please enable JavaScript to view the page content.</noscript> 
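One way to confirm this from Ruby is to look for a noscript fallback in the HTML that was fetched. The following is a minimal sketch, not a robust parser: `noscript_messages` is a hypothetical helper name, and the regex is only a rough heuristic (a real HTML parser such as Nokogiri would be more reliable).

```ruby
# Hypothetical helper: return the text of every <noscript> block in an HTML string.
# Regex matching is a heuristic; it is enough for a quick diagnosis only.
def noscript_messages(html)
  html.scan(%r{<noscript[^>]*>(.*?)</noscript>}mi)
      .flatten
      .map(&:strip)
      .reject(&:empty?)
end

# Sample markup mirroring the fallback shown above.
html = '<html><body><noscript>Please enable JavaScript to view the page content.</noscript></body></html>'
puts noscript_messages(html)
```

With Mechanize, the same check could be applied to the body of the page returned by `agent.get`, for example `noscript_messages(page.body)`.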

Mechanize doesn't execute JavaScript, so you're better off choosing another option such as Selenium or Watir. Both drive a real web browser.

Another option is to look through the JS scripts the page includes, figure out where the data comes from, and query that web resource directly, if that's possible.
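As a starting point for that second approach, the script URLs can be listed from the fetched HTML so you know which files to read. This is only a sketch under assumptions: `script_sources` is a hypothetical helper, the sample markup and URLs are made up, and the regex is a heuristic rather than a full HTML parser.

```ruby
# Hypothetical helper: collect the src attribute of every external <script> tag.
# Inline scripts (those without a src attribute) are skipped.
def script_sources(html)
  html.scan(/<script\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/i).flatten.uniq
end

# Made-up sample markup for illustration only.
html = <<~HTML
  <html><head>
    <script src="/js/app.js"></script>
    <script type="text/javascript" src='https://example.com/data.js'></script>
    <script>var inline = true;</script>
  </head><body></body></html>
HTML

puts script_sources(html)
```

Each listed file can then be read for the URL it requests the data from. Whether that resource can be queried directly depends on the site.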

2 Comments

Thank you, I had the same thought, but when I disabled JavaScript in Chrome it kept loading the full page, which confused me. In Firefox it shows the noscript content and does not load the rest of the page. I don't know why that happens in Chrome.
The problem with Chrome was that I used the Web Developer extension to disable JavaScript, which does not work. When I disabled JavaScript directly in the settings, it was disabled properly and the above page did not load.
