Here is a very simple example link: https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm

Even wget, without any header information, can scrape the page successfully.

However, CasperJS does not work with this script:

    var casper = require("casper").create();
    var mouse = require("mouse").create(casper);
    var link = "https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm";

    casper.start().then(function() {
        this.open(link);
        this.wait(5000);
    });

    casper.run(function() {
        this.echo(this.getPageContent()).exit();
    });

It always outputs an empty page:

<html><head></head><body></body></html> 

Adding header information does not help, for example:

    this.open(link, {
        method: 'get',
        authority: 'www.accessdata.fda.gov',
        path: '/scripts/cder/daf/index.cfm',
        scheme: 'https',
        headers: {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36',
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
            'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'en-US,en;q=0.9,zh-TW;q=0.8,zh;q=0.7,zh-CN;q=0.6,ja;q=0.5',
            'cache-control': 'max-age=0',
            'sec-fetch-dest': 'document',
            'sec-fetch-mode': 'navigate',
            'sec-fetch-site': 'none',
            'sec-fetch-user': '?1',
            'upgrade-insecure-requests': '1'
        }
    });

I tried many combinations of headers, but none of them worked.

Note, however, that the CasperJS code above does work for certain websites, such as http://docs.casperjs.org/en/latest/selectors.html

1 Answer

I just noticed that adding the --ssl-protocol=any option solves the issue:

    casperjs --ssl-protocol=any yourScript.js

This related question has more explanation: "CasperJS/PhantomJS doesn't load https page".
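To check whether a failed HTTPS request is really what leaves the page empty, it can help to log resource errors before changing any flags. The following is only a minimal sketch: it assumes a CasperJS version that emits the `resource.error` event (1.1 or later), and `formatResourceError` is a hypothetical helper introduced here for illustration, not part of CasperJS. The likely background, as far as I understand it, is that older PhantomJS builds default to SSLv3, which many servers no longer accept.

```javascript
// Hypothetical helper (not part of CasperJS): turn the error object passed
// to "resource.error" into a readable one-line message.
function formatResourceError(err) {
    return "Failed to load " + err.url +
        " (code " + err.errorCode + "): " + err.errorString;
}

// The `phantom` global only exists when the script runs under CasperJS/PhantomJS.
if (typeof phantom !== "undefined") {
    var casper = require("casper").create();
    var link = "https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm";

    // Emitted when a requested resource cannot be loaded.
    casper.on("resource.error", function (err) {
        casper.echo(formatResourceError(err), "ERROR");
    });

    casper.start(link);

    casper.run(function () {
        this.echo(this.getPageContent()).exit();
    });
}
```

Running this once as `casperjs yourScript.js` and once as `casperjs --ssl-protocol=any yourScript.js` should show whether the error message disappears together with the empty page.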
