Scraping a single table from multiple webpages in one go using R

Question

I'm new to R programming. My code below, which scrapes a single web table, works.

    library(XML)
    balsht <- "http://www.theedgemarkets.com/my/AA/balance_sheet?0=2593&exchange=KLSE"
    balstable <- readHTMLTable(balsht, header=T, which=1,stringsAsFactors=F)
    balstable
    write.table(balstable, "balsht-2593.txt", row.name=FALSE)

I want to fetch 5 tables in one go. Their URLs differ only in the number (i.e. 2593 above); the rest of the URL is the same. I also want to use that number as part of the file name passed to write.table.

For example, say the random numbers are 0081, 0126, 3379, 6149 & 9997.

I tried the approach suggested in "Scraping multiple table out of webpage in R", but got this error: Error in curl::curl_fetch_memory(url, handle = handle) : Timeout was reached

Please shed some light on how to solve this using a loop or any other available command. Thank you.

Jota · Accepted Answer · 2015-07-26 12:44:07Z

You can use lapply:

    tab.nums <- c("0081", "0126", "3379", "6149", "9997")
    # construct urls
    balsht <- paste0("http://www.theedgemarkets.com/my/AA/balance_sheet?0=", tab.nums, "&exchange=KLSE")
    # get list of tables
    balstables <- lapply(balsht, function(x) readHTMLTable(x, header=T, which=1,stringsAsFactors=F))
    # save each table using relevant number
    lapply(seq_along(balsht), function(x) write.table(balstables[[x]], paste0("balsht", tab.nums[x], ".txt"), row.name=FALSE))

Good idea @Frank, it really does away with the "for" loop. Now I can start thinking about code to space out the timing between data reads.
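One possible way to space out the requests, and to keep a single timeout from aborting the whole run, is sketched below. It is a minimal sketch rather than a tested solution for this site. The helper names fetch_tables and save_tables, the delay argument, and its 2-second default are hypothetical choices made for illustration. It assumes the XML package is installed, and it uses the "balsht-<number>.txt" file name pattern from the question.

```r
# Hypothetical helper: fetch one table per number, pausing between requests.
# `fetch` can be swapped out, so the loop can be tried without network access.
fetch_tables <- function(nums, delay = 2,
                         fetch = function(u) XML::readHTMLTable(
                           u, header = TRUE, which = 1, stringsAsFactors = FALSE)) {
  urls <- paste0("http://www.theedgemarkets.com/my/AA/balance_sheet?0=",
                 nums, "&exchange=KLSE")
  out <- lapply(seq_along(urls), function(i) {
    if (i > 1) Sys.sleep(delay)  # wait before every request except the first
    # On an error (e.g. a timeout), report it and return NULL for this number
    tryCatch(fetch(urls[i]),
             error = function(e) {
               message("Skipping ", nums[i], ": ", conditionMessage(e))
               NULL
             })
  })
  names(out) <- nums
  out
}

# Hypothetical helper: write each successfully fetched table to its own file.
save_tables <- function(tables, dir = ".") {
  for (num in names(tables)) {
    if (is.null(tables[[num]])) next  # this download failed, nothing to write
    write.table(tables[[num]],
                file.path(dir, paste0("balsht-", num, ".txt")),
                row.names = FALSE)
  }
  invisible(NULL)
}

# Usage (needs network access, so it is left commented out):
# tables <- fetch_tables(c("0081", "0126", "3379", "6149", "9997"))
# save_tables(tables)
```

The numbers are kept as character strings so that leading zeros such as "0081" survive in both the URL and the file name. Whether a 2-second pause is enough to avoid the timeout is a guess; the delay may need adjusting.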
