html2text is a very simple script that uses DOM methods to convert HTML into a format similar to what would be rendered by a browser - perfect for places where you need a quick text representation. For example:

    <html>
    <title>Ignored Title</title>
    <body>
      <h1>Hello, World!</h1>

      <p>This is some e-mail content.
      Even though it has whitespace and newlines, the e-mail converter
      will handle it correctly.

      <p>Even mismatched tags.</p>

      <div>A div</div>
      <div>Another div</div>
      <div>A div<div>within a div</div></div>

      <a href="http://foo.com">A link</a>

    </body>
    </html>

Will be converted into:

    Hello, World!

    This is some e-mail content. Even though it has whitespace and newlines, the e-mail converter will handle it correctly.

    Even mismatched tags.

    A div
    Another div
    A div
    within a div

    [A link](http://foo.com)

See the original blog post or the related StackOverflow answer.
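
To illustrate the DOM-based approach mentioned above, here is a minimal sketch using PHP's built-in `DOMDocument`. It is a simplified illustration, not html2text's actual implementation: the function names `sketch_node_to_text()` and `sketch_html_to_text()` are hypothetical, only a handful of tags are handled, and blank lines between blocks are collapsed.

```php
<?php
// Simplified illustration only; NOT the library's implementation.
// sketch_node_to_text() and sketch_html_to_text() are hypothetical names.

function sketch_node_to_text(DOMNode $node): string
{
    // Text nodes: collapse runs of whitespace into a single space.
    if ($node instanceof DOMText) {
        return preg_replace('/\s+/u', ' ', $node->nodeValue);
    }
    // Comments and other non-element nodes contribute nothing.
    if (!($node instanceof DOMElement)) {
        return '';
    }

    $name = strtolower($node->nodeName);

    // Skip elements a browser would not render in the page body.
    if (in_array($name, ['head', 'title', 'script', 'style'], true)) {
        return '';
    }

    // Recurse into the children and concatenate their text.
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= sketch_node_to_text($child);
    }

    // Render links in [text](href) form.
    if ($name === 'a' && $node->hasAttribute('href')) {
        return '[' . trim($inner) . '](' . $node->getAttribute('href') . ')';
    }

    // Put block-level elements on their own line.
    if (in_array($name, ['p', 'div', 'h1', 'h2', 'h3', 'br', 'li'], true)) {
        return "\n" . trim($inner) . "\n";
    }

    return $inner;
}

function sketch_html_to_text(string $html): string
{
    if (trim($html) === '') {
        return '';
    }

    $doc = new DOMDocument();
    // Buffer parser warnings instead of emitting them as PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc->documentElement === null) {
        return '';
    }

    $text = sketch_node_to_text($doc->documentElement);
    // Collapse whitespace around line breaks into a single newline.
    return trim(preg_replace('/\s*\n\s*/', "\n", $text));
}
```

Calling `sketch_html_to_text($html)` on a small snippet returns one line per block-level element. The sketch ignores character-set handling, which `DOMDocument::loadHTML()` does not infer for input without a declared encoding.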
You can use Composer to add the package to your project:

    {
      "require": {
        "soundasleep/html2text": "~1.1"
      }
    }

And then use it quite simply:

    $text = \Soundasleep\Html2Text::convert($html);

You can also include the supplied `html2text.php` and use `$text = convert_html_to_text($html);` instead.

| Option | Default | Description |
|---|---|---|
| ignore_errors | false | Set to true to ignore any XML parsing errors. |
| drop_links | false | Set to true to render links as just `My Link` rather than `[My Link](http://foo.com)`. |
| char_set | 'auto' | Character set to use. Pass multiple comma-separated character sets to detect the encoding; the default is ASCII,UTF-8. |
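
As background for the `ignore_errors` option, the sketch below shows how parse errors can be buffered and then either raised or ignored in plain PHP. It assumes `DOMDocument`-based parsing with libxml error buffering. The function name `sketch_parse_html()` and the use of `RuntimeException` are hypothetical, and this is not necessarily how html2text handles errors internally.

```php
<?php
// Illustration of libxml error handling in plain PHP.
// sketch_parse_html() is a hypothetical name, not part of html2text.

function sketch_parse_html(string $html, bool $ignoreErrors): DOMDocument
{
    $doc = new DOMDocument();

    // Collect libxml errors in a buffer instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    // Note: loadHTML() expects a non-empty string.
    $doc->loadHTML($html);

    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Either surface the first buffered error or silently continue.
    if (!$ignoreErrors && count($errors) > 0) {
        throw new RuntimeException('HTML parse error: ' . trim($errors[0]->message));
    }

    return $doc;
}
```

Which inputs count as errors depends on the installed libxml version, so the same markup may parse cleanly on one system and raise on another.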
Pass along options as a second argument to convert, for example:

    $options = array(
      'ignore_errors' => true,
      // other options go here
    );

    $text = \Soundasleep\Html2Text::convert($html, $options);

Some very basic tests are provided in the `tests/` directory. Run them with `composer install && vendor/bin/phpunit`.

You need to install the PHP XML extension for your PHP version, e.g. `apt-get install php7.4-xml`.
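
A quick way to check for the requirement above from PHP itself is sketched below. It assumes the relevant extensions are named `dom` and `libxml`, and `sketch_missing_extensions()` is a hypothetical helper, not part of html2text.

```php
<?php
// Hypothetical helper (not part of html2text): returns the names of
// required extensions that are not loaded in the running PHP binary.
// The default list ('dom', 'libxml') is an assumption.

function sketch_missing_extensions(array $required = ['dom', 'libxml']): array
{
    return array_values(array_filter(
        $required,
        fn (string $ext): bool => !extension_loaded($ext)
    ));
}

$missing = sketch_missing_extensions();
if ($missing !== []) {
    echo 'Missing PHP extensions: ' . implode(', ', $missing) . PHP_EOL;
}
```

The command line and web server may load different PHP configurations, so it is worth running the check in the environment that will actually perform the conversion.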
html2text is licensed under MIT, making it suitable for both Eclipse and GPL projects.
Also see html2text_ruby, a Ruby implementation.
