Retrieve scrape urls from text file in BeautifulSoup

Question

I have the following script, and I would like to read the URLs from a text file rather than from a hard-coded list. I'm new to Python and keep getting stuck!

    from bs4 import BeautifulSoup
    import requests

    urls = ['URL1', 'URL2', 'URL3']

    for u in urls:
        response = requests.get(u)
        data = response.text
        soup = BeautifulSoup(data, 'lxml')

Have a look into it, stackoverflow.com/a/3277516/4985099

– sushanth, Commented Jul 2, 2020 at 15:01

Stef · Accepted Answer · 2020-07-02 15:07:47Z

Could you please be a little more clear about what you want?

Here is a possible answer which might or might not be what you want:

    from bs4 import BeautifulSoup
    import requests

    with open('yourfilename.txt', 'r') as url_file:
        for line in url_file:
            u = line.strip()
            response = requests.get(u)
            data = response.text
            soup = BeautifulSoup(data, 'lxml')

The file is opened with the open() function; the second argument, 'r', specifies read-only mode. The call to open() is wrapped in a with block so that the file is closed automatically as soon as the block exits. The strip() method removes leading and trailing whitespace (spaces, tabs, newlines) from each line; for instance, ' https://stackoverflow.com '.strip() returns 'https://stackoverflow.com'.
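If you would rather keep a list of URLs, as in your original script, the same open() / with / strip() pattern can be wrapped in a small helper. This is only a sketch: the name read_urls is made up for illustration, and skipping blank lines is an assumption about what your file might contain, not something from the question.

```python
import os
import tempfile


def read_urls(path):
    """Return the stripped, non-empty lines of a text file as a list."""
    urls = []
    with open(path, 'r') as url_file:
        for line in url_file:
            u = line.strip()   # drop surrounding spaces, tabs and the newline
            if u:              # assumption: blank lines should be ignored
                urls.append(u)
    return urls


if __name__ == '__main__':
    # Small demo on a throwaway file so the sketch runs without your data.
    with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
        tmp.write('  https://stackoverflow.com  \n\nhttps://example.com\n')
        demo_path = tmp.name

    print(read_urls(demo_path))
    # prints ['https://stackoverflow.com', 'https://example.com']

    os.remove(demo_path)
```

With that in place, the loop from the question should work unchanged if you replace the hard-coded list with urls = read_urls('yourfilename.txt').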
