data scraping with python

Question

Hi, I'm trying to scrape all the data points at this URL: https://m-selig.ae.illinois.edu/ads/coord/a18.dat

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    url = "https://m-selig.ae.illinois.edu/ads/coord/a18.dat"
    page = requests.get(url)
    x = BeautifulSoup(page.content, 'html.parser')
    df = pd.DataFrame(x)
    df.to_excel("air_foil.xlsx")

I've tried this code, but x is just a long list that consists of one element.

dimay · Accepted Answer · 2022-06-10 19:26:45Z

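First of all, you need to fetch the data:

    import requests

    r = requests.get("https://m-selig.ae.illinois.edu/ads/coord/a18.dat")
    print(r.text)

You will see what is inside: one plain-text string.

Then you need to build a list of rows and put it into a DataFrame:

    import pandas as pd

    # skip the first line, then split each remaining line on whitespace
    df = pd.DataFrame([el.split() for el in r.text.splitlines()[1:]])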
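Note that el.split() yields strings, so the DataFrame built this way holds text rather than numbers. A minimal sketch of the same idea with a conversion to float follows. It runs on a made-up sample string instead of the real download, and it assumes the file layout is one title line followed by two whitespace-separated columns. The names SAMPLE_TEXT and parse_coordinates and the column labels "x" and "y" are hypothetical, not from the original answer.

```python
import pandas as pd

# Made-up sample in the assumed layout: a title line, then "x y" pairs.
# These are placeholder values, not the contents of a18.dat.
SAMPLE_TEXT = (
    "EXAMPLE AIRFOIL\r\n"
    "  1.00000  0.00000\r\n"
    "  0.50000  0.05000\r\n"
    "  0.00000  0.00000\r\n"
)


def parse_coordinates(text):
    """Turn whitespace-separated coordinate text into a numeric DataFrame."""
    lines = text.splitlines()[1:]  # drop the title line
    rows = [line.split() for line in lines if line.strip()]  # skip blank lines
    df = pd.DataFrame(rows, columns=["x", "y"])
    return df.astype(float)  # split() produced strings; convert to numbers


df = parse_coordinates(SAMPLE_TEXT)
print(df)
print(df.dtypes)
```

With the real file you would pass r.text instead of SAMPLE_TEXT, provided the layout assumption holds.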

buran · Accepted Answer · 2022-06-10 19:30:23Z

If you are going to use pandas, you can just use pd.read_table(url) or pd.read_csv(url), e.g.

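    import pandas as pd

    url = "https://m-selig.ae.illinois.edu/ads/coord/a18.dat"

    # skip the first line; r'\s+' splits on any run of whitespace
    df = pd.read_csv(url, header=None, skiprows=1, sep=r'\s+', engine='python')
    print(df)
    print(df.dtypes)

    # read_table accepts the same arguments
    df = pd.read_table(url, header=None, skiprows=1, sep=r'\s+', engine='python')
    print(df)
    print(df.dtypes)

    # write only the values, without the index or a header row
    df.to_excel('test.xlsx', index=False, header=False)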
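pd.read_csv also accepts a file-like object, so the parsing options can be tried offline before pointing them at the URL. The sketch below does that with an in-memory string. The sample values are made up, and the names SAMPLE_TEXT and coords and the column labels "x" and "y" are assumptions for illustration only.

```python
import io

import pandas as pd

# Made-up sample: one title line, then two whitespace-separated columns.
# These are placeholder values, not the contents of a18.dat.
SAMPLE_TEXT = (
    "EXAMPLE AIRFOIL\n"
    "  1.00000  0.00000\n"
    "  0.50000  0.05000\n"
    "  0.00000  0.00000\n"
)

coords = pd.read_csv(
    io.StringIO(SAMPLE_TEXT),  # file-like object standing in for the URL
    skiprows=1,                # skip the title line
    header=None,               # the data has no header row
    names=["x", "y"],          # hypothetical column labels
    sep=r"\s+",                # split on any run of whitespace
)

print(coords)
print(coords.dtypes)
```

Because the values are parsed as numbers directly, no separate conversion step is needed before writing the frame out with to_excel.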
