
I'm using BeautifulSoup and requests for web scraping. I know how to extract the text between tags, but what I want is the number '4.31' below, which sits inside the tag itself as an attribute value. Any idea how to get it?

<div class="starRating" title="4.31"> <svg ... </svg> </div> 

I've tried:

soup.find('div',{'class':'starRating'})
soup.find('title')

which returns nothing, so the number is basically part of the tag itself...

2 Answers

1

You can read the value of the title attribute like this:

from bs4 import BeautifulSoup

response = """
<html>
<div class="starRating" title="4.31">
<svg>
</svg>
</div>
</html>
"""

soup = BeautifulSoup(response, 'lxml')
print(soup.find('div', {'class': 'starRating'})['title'])

Outputs:

4.31 

See https://www.crummy.com/software/BeautifulSoup/bs4/doc/#attributes

A tag may have any number of attributes. The tag <b id="boldest"> has an attribute “id” whose value is “boldest”. You can access a tag’s attributes by treating the tag like a dictionary.
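To make the dictionary-style access concrete, here is a minimal sketch. It assumes the built-in "html.parser" backend rather than lxml, and the attribute name "data-votes" is hypothetical, used only to show what happens when an attribute is absent.

```python
from bs4 import BeautifulSoup

html = '<div class="starRating" title="4.31"><svg></svg></div>'
# Assumption: the built-in parser is used here so lxml is not required.
soup = BeautifulSoup(html, "html.parser")

div = soup.find("div", class_="starRating")

# Dictionary-style access: raises KeyError if the attribute is missing.
rating_text = div["title"]

# .get() returns None for a missing attribute instead of raising.
# "data-votes" is a made-up attribute name for illustration.
missing = div.get("data-votes")

# .attrs exposes all of the tag's attributes as a dict.
all_attrs = div.attrs

# Attribute values are strings, so convert before doing arithmetic.
rating = float(rating_text)
print(rating, missing, all_attrs)
```

Indexing with div["title"] suits cases where the attribute must exist, while div.get("title") is the gentler choice when it may be absent.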


0

You can use a lambda to query elements with the matching title attribute, then use the ["title"] key to extract the data you want:

>>> soup.find(lambda x: x.name == "div" and "title" in x.attrs)["title"]
'4.31'

Or use a CSS selector:

>>> soup.select_one("div[title]")
<div class="starRating" title="4.31"></div>

Even easier, use the target attribute as a kwarg:

>>> soup.find("div", title=True)
<div class="starRating" title="4.31"></div>

Attempting to pull the title attribute out of an element that doesn't have it will raise a KeyError, so it's worth filtering ahead of time. Use find_all or select if you'd like an iterable of multiple results.
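As a rough illustration of filtering ahead of time and collecting multiple results, the sketch below runs against a made-up three-div document (the second div deliberately has no title). The sample HTML, the "noSuchClass" class name, and the use of "html.parser" are assumptions for the example, not part of the original question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the middle div has no title attribute.
html = """
<div class="starRating" title="4.31"><svg></svg></div>
<div class="starRating"><svg></svg></div>
<div class="starRating" title="3.9"><svg></svg></div>
"""
soup = BeautifulSoup(html, "html.parser")

# title=True keeps only divs that carry a title attribute,
# so indexing with ["title"] cannot raise KeyError here.
ratings = [
    float(div["title"])
    for div in soup.find_all("div", class_="starRating", title=True)
]

# The same filter written as a CSS selector.
ratings_css = [float(div["title"]) for div in soup.select("div.starRating[title]")]

# find() returns None when nothing matches, so guard before indexing.
first = soup.find("div", class_="noSuchClass", title=True)
first_rating = float(first["title"]) if first is not None else None

print(ratings, ratings_css, first_rating)
```

Filtering in the query keeps the extraction loop free of try/except blocks; the None check covers the separate case where no element matches at all.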
