Hi, I'm trying to scrape all the data points at this URL: https://m-selig.ae.illinois.edu/ads/coord/a18.dat

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    url = "https://m-selig.ae.illinois.edu/ads/coord/a18.dat"
    page = requests.get(url)
    x = BeautifulSoup(page.content, 'html.parser')
    df = pd.DataFrame(x)
    df.to_excel("air_foil.xlsx")

I've tried this code, but x is just a long list that consists of one element.

2 Answers

1

First of all, you need to fetch the data:

    import requests

    r = requests.get("https://m-selig.ae.illinois.edu/ads/coord/a18.dat")
    print(r.text)  # the response body as a single string

This shows you what the response contains: one plain string, not HTML.

Then you need to split that string into rows and put them into a DataFrame:

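    import pandas as pd

    # Skip the first (title) line, then split every remaining line on whitespace.
    # splitlines() handles both "\r\n" and "\n" line endings.
    df = pd.DataFrame([el.split() for el in r.text.splitlines()[1:]])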
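Note that splitting this way leaves every value as a string. Here is a minimal sketch of the same idea that also converts the columns to floats. The sample text, the helper name `coords_to_frame`, and the column names `x` and `y` are my own assumptions for illustration; the sample only stands in for `r.text` so the snippet runs without a network call.

```python
import pandas as pd

# Hypothetical sample standing in for r.text: one title line, then x/y pairs.
sample_text = (
    "A18 (original)\r\n"
    "  1.00000  0.00000\r\n"
    "  0.99655  0.00159\r\n"
    "  0.00000  0.00000\r\n"
)


def coords_to_frame(text):
    """Skip the title line, split each non-empty line on whitespace, convert to float."""
    rows = [line.split() for line in text.splitlines()[1:] if line.strip()]
    # Column names "x" and "y" are assumed; the file itself has no header row.
    return pd.DataFrame(rows, columns=["x", "y"]).astype(float)


df = coords_to_frame(sample_text)
print(df.dtypes)
```

With real data you would pass `r.text` instead of `sample_text`. If the file has more than two columns, the `columns=` argument would need adjusting.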

1

If you are going to use pandas anyway, you can read the URL directly with pd.read_table(url) or pd.read_csv(url), e.g.:

    import pandas as pd

    url = "https://m-selig.ae.illinois.edu/ads/coord/a18.dat"

    # Option 1: read_csv, skipping the title line
    df = pd.read_csv(url, header=None, skiprows=1, sep=' ', engine='python')
    print(df)
    print(df.dtypes)

    # Option 2: read_table with the same arguments
    df = pd.read_table(url, header=None, skiprows=1, sep=' ', engine='python')
    print(df)
    print(df.dtypes)

    df.to_excel('test.xlsx', index=False, header=False)
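If the columns turn out to be separated by a varying number of spaces, a regex separator may be more forgiving than a single space. This is only a sketch under that assumption: the sample text below is hypothetical and stands in for the downloaded file, and the column names `x` and `y` are mine.

```python
import io

import pandas as pd

# Hypothetical sample with the assumed layout: one title line, then two numeric columns.
sample = io.StringIO(
    "A18 (original)\n"
    "  1.00000  0.00000\n"
    "  0.99655  0.00159\n"
    "  0.00000  0.00000\n"
)

# sep=r"\s+" treats any run of whitespace as a single delimiter,
# so leading spaces and double spaces do not create empty columns.
coords = pd.read_csv(sample, header=None, skiprows=1, sep=r"\s+", names=["x", "y"])
print(coords)
print(coords.dtypes)
```

To try it on the real file, replace `sample` with the `url` string from the snippet above.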
