How to access tag's attribute value with BeautifulSoup

Question

I'm using BeautifulSoup and requests for web scraping. I know how to extract the text between tags, but what I want here is the number '4.31', which sits inside the tag itself. Any idea how to get it?

<div class="starRating" title="4.31"> <svg ... </svg> </div>

I've tried:

soup.find('div',{'class':'starRating'})
soup.find('title')

which returns nothing, so the number is basically part of the tag itself...

Dan-Dev · Accepted Answer · 2019-12-28 21:10:13Z

You can read the value of the title attribute like this:

from bs4 import BeautifulSoup

response = """
<html>
<div class="starRating" title="4.31">
<svg>
</svg>
</div>
</html>
"""

soup = BeautifulSoup(response, 'lxml')
print(soup.find('div', {'class': 'starRating'})['title'])

Outputs:

4.31

See https://www.crummy.com/software/BeautifulSoup/bs4/doc/#attributes

A tag may have any number of attributes. The tag <b id="boldest"> has an attribute “id” whose value is “boldest”. You can access a tag’s attributes by treating the tag like a dictionary.
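As a rough sketch of that dictionary-style access, the snippet below reuses the markup from the question. The html.parser backend is chosen here only so the example runs without lxml installed, and the "data-foo" attribute name is made up to show what happens when an attribute is absent.

```python
from bs4 import BeautifulSoup

# Markup from the question, with the svg contents left empty.
html = '<div class="starRating" title="4.31"><svg></svg></div>'
soup = BeautifulSoup(html, "html.parser")

div = soup.find("div", class_="starRating")

# Subscript access works like a dict lookup and raises KeyError if the
# attribute is missing.
rating = div["title"]

# .get() returns None instead of raising when the attribute is absent.
# "data-foo" is a hypothetical attribute name used only for illustration.
missing = div.get("data-foo")

# .attrs exposes all of the tag's attributes as a plain dict.
all_attrs = div.attrs

print(rating, missing, all_attrs)
```

The attribute value comes back as a string, so wrap it in float() if you need the rating as a number.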

ggorlen · Accepted Answer · 2019-12-28 21:22:05Z

You can use a lambda to query elements with the matching title attribute, then use the ["title"] key to extract the data you want:

>>> soup.find(lambda x: x.name == "div" and "title" in x.attrs)["title"]
'4.31'

Or use a CSS selector:

>>> soup.select_one("div[title]")
<div class="starRating" title="4.31"></div>

Even easier, use the target attribute as a kwarg:

>>> soup.find("div", title=True)
<div class="starRating" title="4.31"></div>

Attempting to pull the title attribute out of an element that doesn't have it will raise a KeyError, so it's worth filtering ahead of time. Use find_all or select if you'd like an iterable of multiple results.
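Here is a small sketch of filtering ahead of time when there are several matches. The three-div markup and the variable names are invented for illustration, and html.parser is used only to avoid depending on lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the middle div has no title attribute.
html = """
<div class="starRating" title="4.31"></div>
<div class="starRating"></div>
<div class="starRating" title="3.90"></div>
"""
soup = BeautifulSoup(html, "html.parser")

# title=True keeps only the divs that actually carry a title attribute,
# so the subscript lookup cannot raise KeyError.
ratings = [float(div["title"]) for div in soup.find_all("div", title=True)]

# The same filter expressed as a CSS selector.
ratings_css = [float(div["title"]) for div in soup.select("div.starRating[title]")]

# Without filtering, .get() yields None for the div that lacks the attribute.
raw = [div.get("title") for div in soup.find_all("div", class_="starRating")]

print(ratings, ratings_css, raw)
```

Filtering in the query keeps the extraction step simple; using .get() instead is handy when you still want a placeholder for elements that lack the attribute.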
