EchoShoot/parsel

 
 

Parsel

Parsel is a BSD-licensed Python library to extract data from HTML, JSON, and XML documents.

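It supports:

  • CSS and XPath expressions for HTML and XML documents
  • JMESPath expressions for JSON documents
  • Regular expressions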

Find the Parsel online documentation at https://parsel.readthedocs.org.

Example (open online demo):

>>> from parsel import Selector
>>> text = """
        <html>
            <body>
                <h1>Hello, Parsel!</h1>
                <ul>
                    <li><a href="http://example.com">Link 1</a></li>
                    <li><a href="http://scrapy.org">Link 2</a></li>
                </ul>
                <script type="application/json">{"a": ["b", "c"]}</script>
            </body>
        </html>"""
>>> selector = Selector(text=text)
>>> selector.css('h1::text').get()
'Hello, Parsel!'
>>> selector.xpath('//h1/text()').re(r'\w+')
['Hello', 'Parsel']
>>> for li in selector.css('ul > li'):
...     print(li.xpath('.//@href').get())
http://example.com
http://scrapy.org
>>> selector.css('script::text').jmespath("a").get()
'b'
>>> selector.css('script::text').jmespath("a").getall()
['b', 'c']
