I'm using the rvest package in R to scrape data from a table, but the table on the first page includes only about 40% of the total information; the rest is spread over further pages. I followed this blog post, but it doesn't explain how to scrape data when the URL is the same for every page. This website is the one I'm trying to collect job listing data from.

I've successfully retrieved the data on the first page using this code:

job_page <- read_html('page_address')
data_raw <- job_page %>%
  html_node('table') %>%
  html_text()

Is it possible to scrape the site when the URL does NOT change between pages of data? I was hoping to use lapply to iterate over the pages in some way.

1 Answer

Try this URL instead; it should return all the results on one page:

http://explore.msujobs.msstate.edu/cw/en-us/filter/?search-keyword=&job-mail-subscribe-privacy=agree&location=main%20campus%20-%20starkville%20ms&category=faculty&page=1&page-items=100 
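For what it's worth, reading that URL with rvest could look roughly like the sketch below. It assumes the listings sit in the first table element on the page, as in the question's own code, and fetch_jobs_table is just a hypothetical helper name, not something rvest provides. I have not verified what the table contains.

```r
# URL from this answer; page-items=100 asks for up to 100 rows per page.
jobs_url <- paste0(
  "http://explore.msujobs.msstate.edu/cw/en-us/filter/",
  "?search-keyword=&job-mail-subscribe-privacy=agree",
  "&location=main%20campus%20-%20starkville%20ms",
  "&category=faculty&page=1&page-items=100"
)

# Hypothetical helper: download a page and return its first <table>
# as a data frame. Assumes the job listings are in that first table.
fetch_jobs_table <- function(url) {
  page <- rvest::read_html(url)
  tbl <- rvest::html_node(page, "table")
  rvest::html_table(tbl)
}
```

Calling fetch_jobs_table(jobs_url) should then give one data frame instead of the raw text that html_text() returns.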

To find a URL like this yourself, open the developer tools in Chrome and select the Network tab. There you can examine the request the page sends and tweak its search parameters, such as page and page-items in the URL above.
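If you would still rather iterate with lapply, as the question suggests, here is a minimal sketch. It assumes the server honors the page and page-items parameters seen in the URL above, and that every page holds its listings in the first table. The names build_jobs_url and scrape_jobs_page, the default of 20 items, and the page count of 3 are all made up for illustration.

```r
# Hypothetical helper: build the filter URL for one page of results.
build_jobs_url <- function(page, page_items = 20) {
  paste0(
    "http://explore.msujobs.msstate.edu/cw/en-us/filter/",
    "?search-keyword=&job-mail-subscribe-privacy=agree",
    "&location=main%20campus%20-%20starkville%20ms",
    "&category=faculty",
    "&page=", page,
    "&page-items=", page_items
  )
}

# Hypothetical helper: read one page and return its first <table>
# as a data frame.
scrape_jobs_page <- function(page, page_items = 20) {
  html <- rvest::read_html(build_jobs_url(page, page_items))
  tbl <- rvest::html_node(html, "table")
  rvest::html_table(tbl)
}

# Usage (needs network access; the page count of 3 is a guess):
# pages <- lapply(1:3, scrape_jobs_page)
# jobs  <- do.call(rbind, pages)
```

The do.call(rbind, ...) step only works if every page returns the same columns, so check a single page first.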
