I am a beginner in Python and have just written a simple web scraper that saves a webpage article to a text file, using BeautifulSoup and a list.
The code works fine, but I'm wondering whether anybody knows a more efficient way to achieve the same result.
import requests
page = requests.get('https://www.msn.com/en-sg/money/topstories/10-top-stocks-of-2017/ar-BBGgEyA?li=AA54rX&ocid=spartandhp')

# 2. Parsing the page using BeautifulSoup
import pandas as pd
from bs4 import BeautifulSoup
soup = BeautifulSoup(page.content, 'html.parser')

# 3. Write the context to a text file
all_p_tags = soup.findAll('p')      # Put all <p> and their text into a list
number_of_tags = len(all_p_tags)    # No of <p>?
x=0
with open('filename.txt', mode='wt', encoding='utf-8') as file:
    title = soup.find('h1').text.strip()    # Write the <header>
    file.write(title)
    file.write('\n')
    for x in range(number_of_tags):
        word = all_p_tags[x].get_text()     # Write the content by referencing each item in the list
        file.write(word)
        file.write('\n')
file.close()
file.close() is unnecessary, just saying: the with block already closes the file when it exits.
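To illustrate the point about file.close(), here is a minimal sketch that uses only the standard library. The temporary path and the text written to it are hypothetical and exist only for this demonstration; they are not part of the scraper above.

```python
import os
import tempfile

# Hypothetical file location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, mode="wt", encoding="utf-8") as file:
    file.write("title\n")
    inside = file.closed    # the file is still open inside the block

after = file.closed         # the with block has closed the file on exit

file.close()                # allowed, but does nothing on a closed file

print(inside, after)
```

Since the file is already closed once the with block ends, the explicit file.close() in the question can simply be dropped without changing the behavior.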