3

I have a div whose id is "img-cont"

<div class="img-cont-box" id="img-cont" style='background-image: url("http://example.com/example.jpg");'> 

I want to extract the URL from background-image using Beautiful Soup. How can I do it?

2 Answers

5

You can use find_all, or find for the first match.

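import re
from bs4 import BeautifulSoup

soup = BeautifulSoup(html_str, 'html.parser')
# Match the div by id and require that it has a style attribute.
result = soup.find('div', attrs={'id': 'img-cont', 'style': True})
if result is not None:
    # findall returns a list of the captured URLs.
    url = re.findall(r'\("(http.*)"\)', result['style'])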

3 Comments

I have done that part. How do I extract the URL from the result variable?
Thanks, it worked! Could you please explain this part: "url = re.findall('("(http.*)")',result['style'])"?
result['style'] returns the string 'background-image: url("http://example.com/example.jpg");', and re.findall() runs a regex search over that string. To read more about regex, see docs.python.org/2/library/re.html
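
To make the explanation above concrete, here is a minimal sketch that runs the same pattern against the style string on its own, without Beautiful Soup. The variable names are only illustrative, and the style value is copied from the question.

```python
import re

# The value that result['style'] holds for the div in the question.
style = 'background-image: url("http://example.com/example.jpg");'

# \( and \) match literal parentheses; the unescaped ( ) form a capture group.
# With exactly one capture group, findall returns a list of the captured strings.
pattern = r'\("(http.*)"\)'
urls = re.findall(pattern, style)

print(urls)  # ['http://example.com/example.jpg']

# findall always returns a list, so take the first item (if any) to get one URL.
first_url = urls[0] if urls else None
print(first_url)  # http://example.com/example.jpg
```

Note that this pattern only matches URLs that start with http and are wrapped in double quotes.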
1

Try this:

import re
from bs4 import BeautifulSoup

html = '''\
<div class="img-cont-box" \
id="img-cont" \
style='background-image: url("http://example.com/example.jpg");'>\
'''

soup = BeautifulSoup(html, 'html.parser')
div = soup.find('div', id='img-cont')
print(re.search(r'url\("(.+)"\)', div['style']).group(1))
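
Both patterns in this thread assume the URL is wrapped in double quotes, and re.search returns None when nothing matches, so calling .group(1) would raise an AttributeError. As a rough sketch, a slightly more tolerant version could look like the following. The helper name extract_background_url is hypothetical (it is not part of Beautiful Soup or re), and it takes the style string, for example div['style'], rather than the tag.

```python
import re

# Optional quote (single, double, or none), a lazy capture of the URL,
# then the same quote again via the \1 backreference.
URL_PATTERN = re.compile(r'''url\(\s*(['"]?)(.*?)\1\s*\)''', re.IGNORECASE)


def extract_background_url(style):
    """Return the first url(...) value in a CSS style string, or None.

    Hypothetical helper for illustration only. It does not handle
    unquoted URLs that contain a closing parenthesis.
    """
    match = URL_PATTERN.search(style)
    return match.group(2) if match else None


double_quoted = extract_background_url(
    'background-image: url("http://example.com/example.jpg");'
)
single_quoted = extract_background_url("background-image: url('a.png')")
unquoted = extract_background_url('background: url(http://example.com/b.png)')
missing = extract_background_url('color: red;')

print(double_quoted, single_quoted, unquoted, missing)
```

For anything beyond simple inline styles, a real CSS parser is likely a safer choice than a regex.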
