I use Beautiful Soup to parse data from a website. My code is:

    import requests
    import pandas as pd
    from bs4 import BeautifulSoup

    response = requests.get('https://vneconomy.vn/tim-kiem.htm?q=doanh%20thu')
    htmlcontent = response.content
    results_soup = BeautifulSoup(htmlcontent, 'html.parser')
    # print(results_soup)

    Title = []  # collected titles
    search_results = results_soup.find('div', class_="story__header")
    if search_results is not None:
        for result in search_results:
            Title.append(result.find("h3"))

    df = pd.DataFrame({'Title': Title})
    print(df)

What I want is the titles from the search results page, such as: "Apple đạt doanh thu gần 1 tỷ USD/ngày; Doanh thu tài chính đột biến, HNG báo lãi tăng 132%..."

But it returns no data. Could you please advise on this case? Thank you!

  • This data could be retrieved with Selenium. Commented May 24, 2021 at 9:05
  • No need for Selenium. As another user's answer shows, it can be done by finding the XHR request in Chrome's developer tools. Commented May 24, 2021 at 9:34
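A quick way to tell which of the two approaches in the comments above applies is to check whether the target element exists in the raw HTML at all. The sketch below uses only the standard library; `ClassCounter`, `count_class`, and the two sample HTML strings are hypothetical and stand in for the real page. If the count is 0 for the HTML that `requests` returns, the content is probably filled in later by JavaScript.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html_text, class_name):
    """Return how many elements in html_text carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html_text)
    return parser.count


# Hypothetical samples: a page shell without results, and a rendered result.
static_html = '<div id="app"></div><script src="app.js"></script>'
rendered_html = '<div class="story__header"><h3>Title</h3></div>'

print(count_class(static_html, "story__header"))    # 0
print(count_class(rendered_html, "story__header"))  # 1
```

In practice you would pass `response.text` from the question's request instead of the sample strings.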

1 Answer

Use the URL that fetches the data directly:

    import requests
    import pandas as pd

    url = 'https://search.hemera.com.vn/search/1/doanh%20thu/1'
    jsonData = requests.get(url).json()

    df = pd.DataFrame(jsonData)

Output:

    print(df['Title'])
    0     Apple đạt <em>doanh</em> <em>thu</em> gần 1 tỷ...
    1     <em>Doanh</em> <em>thu</em> tài chính đột biến...
    2     Covid-19 là "lửa <em>thử</em> vàng" cho các <e...
    3     Phim hay nhất Oscar 2021 đạt <em>doanh</em> <e...
    4     <em>Doanh</em> <em>thu</em> mảng xây dựng giảm...
    5     Quý 1/2021 Masan Group đạt <em>doanh</em> <em>...
    6     Bán thương hiệu smartphone Honor, <em>doanh</e...
    7     <em>Doanh</em> <em>thu</em> cải thiện, Sabeco ...
    8     <em>Doanh</em> <em>thu</em> tăng, Habeco báo l...
    9     Facebook lo <em>doanh</em> <em>thu</em> sụt gi...
    10    Hộ, cá nhân kinh <em>doanh</em> phải nộp thuế ...
    11    Viettel Global: <em>Doanh</em> <em>thu</em> qu...
    12    Không hợp nhất <em>doanh</em> <em>thu</em> từ ...
    13    Tp.HCM tăng <em>thu</em> phí hạ tầng cảng biển...
    14    Hoạt động đa cấp đạt <em>doanh</em> <em>thu</e...
    15    Kiểm soát Covid-19 tốt, <em>doanh</em> <em>thu...
    16    Quý 1/2021, <em>doanh</em> <em>thu</em> vàng m...
    17    <em>Doanh</em> <em>thu</em> smartphone bùng nổ...
    18    Thị trường tăng, tiền mới nhiều, <em>doanh</em...
    19    <em>Doanh</em> <em>thu</em> của Vinhomes tăng ...
    Name: Title, dtype: object

Found here:

[Image: browser developer tools, Network tab, showing the request named 1]
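The titles in the output above still carry the `<em>` tags that the search endpoint uses to highlight matched words, while the question asks for plain titles. A minimal sketch of one way to remove them follows, using only the standard library. The helper name `strip_highlight` and the sample strings are illustrative, and it assumes `<em>` is the only markup present in the titles.

```python
import html
import re

# Matches opening and closing <em> tags only (assumed to be the sole markup).
_EM_TAG = re.compile(r"</?em>")


def strip_highlight(title: str) -> str:
    """Remove <em> highlight tags and decode any HTML entities."""
    return html.unescape(_EM_TAG.sub("", title))


# Illustrative samples in the same shape as the Title column.
samples = [
    "Apple đạt <em>doanh</em> <em>thu</em> gần 1 tỷ USD/ngày",
    "<em>Doanh</em> <em>thu</em> tài chính đột biến",
]
cleaned = [strip_highlight(t) for t in samples]
print(cleaned)
```

With the DataFrame from the answer, the same helper could be applied column-wide with `df['Title'] = df['Title'].map(strip_highlight)`.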


5 Comments

Please tell me, how did you find this URL? @chitown88
Go to dev tools (Ctrl+Shift+I). Under Network -> XHR (it's usually under XHR, but you might find it under the other tabs), you can see all the requests being made and where they are made from. Look in Preview and try to find the data you want. If you find it, look in Headers and you'll find what you need to make the request.
@chitown88 Thank you for your solution. However, I cannot find this URL (I went to Network already but could not find it). Could you please give me more guidance? I am very new to development.
Did you refresh the page? Look for the request with the Name 1. See the image in the post.
Be sure to accept the solution if it fits your needs.
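Once the request has been located in the Network tab as described in the comments above, its URL can be rebuilt in code for other search terms. The sketch below is a guess at the URL layout, not a documented API: it assumes the segment after `search/1/` is the URL-encoded search term and that the trailing `1` is a page number. The helper `build_search_url` and its `page` parameter are hypothetical.

```python
from urllib.parse import quote

# Base taken from the answer's URL; the meaning of the "1" here is unknown.
BASE = "https://search.hemera.com.vn/search/1"


def build_search_url(term: str, page: int = 1) -> str:
    """Build a search URL.

    ASSUMPTION: the last path segment is a page number and the segment
    before it is the percent-encoded search term. Verify both against the
    Headers tab in dev tools before relying on this.
    """
    return f"{BASE}/{quote(term)}/{page}"


print(build_search_url("doanh thu"))
# https://search.hemera.com.vn/search/1/doanh%20thu/1
```

The resulting string can be passed to `requests.get(...).json()` exactly as in the answer.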
