Data scraping with Python

Question

Hi, I'm trying to scrape all the data points at this URL: https://m-selig.ae.illinois.edu/ads/coord/a18.dat

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://m-selig.ae.illinois.edu/ads/coord/a18.dat"

page = requests.get(url)
x = BeautifulSoup(page.content, 'html.parser')

df = pd.DataFrame(x)
df.to_excel("air_foil.xlsx")

I've tried this code, but x is just a long list that consists of one element.

dimay · Accepted Answer · 2022-06-10 19:26:45Z

1

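First of all, you need to fetch the data and look at it:

import requests

r = requests.get("https://m-selig.ae.illinois.edu/ads/coord/a18.dat")
print(r.text)

You will see that the response is one plain string.

Then split that string into rows and load them into a DataFrame:

import pandas as pd

# Skip the first line, then split each remaining line on whitespace.
df = pd.DataFrame([el.split() for el in r.text.split("\r\n")[1:]])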
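The split-and-load step above can be sketched as a small helper that works on any text already downloaded. This is only a sketch under assumptions: it assumes the file is one title line followed by two whitespace-separated numeric columns, and the function name `parse_coordinates`, the column names `x` and `y`, and the sample string are all made up for illustration (the sample is not the real file contents). It uses `splitlines()` instead of `split("\r\n")` so that both `\n` and `\r\n` line endings are handled.

```python
import pandas as pd


def parse_coordinates(text: str) -> pd.DataFrame:
    """Turn the raw text of a coordinate file into a two-column DataFrame.

    Hypothetical helper: assumes one title line, then "x y" pairs.
    """
    # splitlines() copes with "\n" and "\r\n" line endings alike.
    lines = text.splitlines()
    rows = []
    for line in lines[1:]:  # skip the first (title) line
        parts = line.split()  # split on any run of whitespace
        if len(parts) == 2:  # ignore blank or malformed lines
            rows.append([float(parts[0]), float(parts[1])])
    return pd.DataFrame(rows, columns=["x", "y"])


# Illustrative sample only, NOT the contents of a18.dat.
sample = (
    "SAMPLE AIRFOIL\r\n"
    "  1.00000  0.00000\r\n"
    "  0.50000  0.05000\r\n"
    "\r\n"
    "  0.00000  0.00000\r\n"
)
df = parse_coordinates(sample)
print(df)
```

With a live request, the same helper would be called as `parse_coordinates(r.text)`. Converting to `float` up front means the columns are numeric rather than strings when written to Excel.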


buran · Accepted Answer · 2022-06-10 19:30:23Z

1

If you are going to use pandas, you can just use pd.read_table(url) or pd.read_csv(url), e.g.

import pandas as pd

url = "https://m-selig.ae.illinois.edu/ads/coord/a18.dat"

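# Option 1: read_csv, skipping the first line of the file
df = pd.read_csv(url, header=None, skiprows=1, sep='  ', engine='python')
print(df)
print(df.dtypes)

# Option 2: read_table with the same arguments gives the same result
df = pd.read_table(url, header=None, skiprows=1, sep='  ', engine='python')
print(df)
print(df.dtypes)

# Write the values to Excel without the index or header row
df.to_excel('test.xlsx', index=False, header=False)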
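As a possible variation on the calls above, the separator can be given as the regular expression `\s+` (any run of whitespace), which does not depend on the exact number of spaces between columns. The sketch below reads from an in-memory string so that it runs without network access. The sample text and the column names `x` and `y` are assumptions made for illustration, not the real file contents.

```python
import io

import pandas as pd

# Illustrative sample only, NOT the contents of a18.dat.
sample = (
    "SAMPLE AIRFOIL\n"
    "  1.00000  0.00000\n"
    "  0.50000  0.05000\n"
    "  0.00000  0.00000\n"
)

df = pd.read_csv(
    io.StringIO(sample),  # a URL string could be passed here instead
    sep=r"\s+",           # split on any run of whitespace
    header=None,          # the file has no column header row
    skiprows=1,           # skip the first (title) line
    names=["x", "y"],     # hypothetical column names
)
print(df)
print(df.dtypes)
```

Passing the URL in place of `io.StringIO(sample)` should work the same way, provided the real file follows the assumed layout.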

edited Jun 10, 2022 at 19:30

answered Jun 10, 2022 at 19:20
