I have the following script, and I would like to read the URLs from a text file rather than from a hard-coded list. I'm new to Python and keep getting stuck!

from bs4 import BeautifulSoup
import requests

urls = ['URL1', 'URL2', 'URL3']

for u in urls:
    response = requests.get(u)
    data = response.text
    soup = BeautifulSoup(data, 'lxml')

1 Answer

Could you please be a little more clear about what you want?

Here is a possible answer which might or might not be what you want:

from bs4 import BeautifulSoup
import requests

with open('yourfilename.txt', 'r') as url_file:
    for line in url_file:
        u = line.strip()
        response = requests.get(u)
        data = response.text
        soup = BeautifulSoup(data, 'lxml')

The file is opened with the open() function; the second argument, 'r', opens it in read-only mode. The call to open() is wrapped in a with block, so the file is closed automatically as soon as the block ends. The strip() method removes leading and trailing whitespace (spaces, tabs, newlines) from each line; for instance, ' https://stackoverflow.com '.strip() becomes 'https://stackoverflow.com'.
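If you would rather keep the rest of your script working on a list, as in the question, one option is to read the file into a list first. The sketch below is only an illustration: the helper name read_urls and the temporary demo file urls.txt are made up for this example, and it assumes the file holds one URL per line. Unlike the loop above, it also skips blank lines, which would otherwise be passed to requests.get() as empty strings.

```python
from pathlib import Path
import tempfile


def read_urls(path):
    """Return the non-empty lines of a text file, stripped of whitespace.

    Hypothetical helper for illustration; assumes one URL per line.
    """
    with open(path, 'r') as url_file:
        # strip() drops the trailing newline and any surrounding spaces;
        # the "if" filter skips lines that are empty after stripping.
        return [line.strip() for line in url_file if line.strip()]


# Demo with a throwaway file so the example runs without network access.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / 'urls.txt'  # made-up file name
    demo_file.write_text('  https://stackoverflow.com  \n\nhttps://example.com\n')
    urls = read_urls(demo_file)

print(urls)
```

The resulting urls list can then be used in the original for u in urls: loop without any other changes.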
