Requests post on form not returning the generated page

Question

I'd like to use Python to scrape this website: http://www.ssa.gov/oact/babynames/#ht=1

At the bottom, under the table of names, there are three tabs. I'm looking to POST to the form under the tab "Popular Names by Birth Year."

Here's my code:

    from bs4 import BeautifulSoup
    import requests

    url = "http://www.ssa.gov/oact/babynames/"
    payload = {
        'year': 2010,
        'top': 50
    }
    r = requests.post(url, data=payload)  # returns status 200
    soup = BeautifulSoup(r.text)
    print soup.prettify()

This only returns the original page, not the generated page I'm looking for.

What could be the reason it's not returning the generated page?

THANKS!

alecxe · Accepted Answer · 2014-06-06 17:46:55Z

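You need to change the URL of your POST request to http://www.ssa.gov/cgi-bin/popularnames.cgi, which is where the form is submitted. The URL you posted to only serves the page that displays the form.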

Demo:

    >>> from bs4 import BeautifulSoup
    >>> import requests
    >>> url = "http://www.ssa.gov/cgi-bin/popularnames.cgi"
    >>> payload = {
    ...     'year': 2010,
    ...     'top': 50
    ... }
    >>> r = requests.post(url, data=payload)
    >>> soup = BeautifulSoup(r.text)
    >>> table = soup.find('table', summary='Popularity for top 50')
    >>> for row in table.find_all('tr')[1:4]:
    ...     print [td.text for td in row.find_all('td')]
    ...
    [u'1', u'Jacob', u'Isabella']
    [u'2', u'Ethan', u'Sophia']
    [u'3', u'Michael', u'Emma']
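If you hit the same problem on another site, one way to find the right URL is to read the `action` attribute of the form and resolve it against the page URL. Below is a minimal Python 3 sketch that uses only the standard library. The names `FormActionParser` and `find_form_targets` are made up for this illustration, and `SAMPLE_HTML` is hypothetical markup, not copied from the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormActionParser(HTMLParser):
    """Collect (method, absolute action URL) for every <form> in a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.forms = []

    def handle_starttag(self, tag, attrs):
        if tag != "form":
            return
        attributes = dict(attrs)
        # A missing or empty action means the form submits to the page itself.
        action = attributes.get("action") or self.page_url
        # Browsers default to GET when no method is given.
        method = (attributes.get("method") or "get").lower()
        # Resolve relative actions such as "/cgi-bin/..." against the page URL.
        self.forms.append((method, urljoin(self.page_url, action)))


def find_form_targets(html, page_url):
    """Return a list of (method, url) tuples, one per form found in html."""
    parser = FormActionParser(page_url)
    parser.feed(html)
    parser.close()
    return parser.forms


# Hypothetical markup for illustration only (an assumption, not the real page).
SAMPLE_HTML = """
<form method="post" action="/cgi-bin/popularnames.cgi">
  <input type="text" name="year">
  <input type="text" name="top">
</form>
"""

targets = find_form_targets(SAMPLE_HTML, "http://www.ssa.gov/oact/babynames/")
print(targets)
```

For the sample markup this prints `[('post', 'http://www.ssa.gov/cgi-bin/popularnames.cgi')]`. On a live page you would pass `r.text` from a GET request instead of `SAMPLE_HTML`, then POST your payload to the URL that belongs to the form you care about.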
