Is there a way to get the body of an HTML page without the HTML tags?

curl and wget return the response, but it still contains the HTML tags. The tags can be stripped with sed and awk, but I am looking for an existing tool that can do this without them.

lynx is an option, but it does not come pre-installed.

Thanks !!

2 Answers

Why the aversion to installing an appropriate tool?

As an alternative to lynx, try w3m, e.g.

w3m -dump http://google.com 

1 Comment

I don't have an aversion to installing a tool. I just need to know whether an existing tool can do this before installing another package.

"Converting HTML to plain text in PHP for e-mail" lists a few tools, as does "How can I Convert HTML to Text in C#?". However, if lynx -dump does what you want, then that may be the best tool to install.
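
If installing anything is off the table but Python 3 happens to be present, its standard-library html.parser module can do a rough version of the same job. The following is only a minimal sketch, not a replacement for lynx or w3m: the names TextExtractor and html_to_text are made up for illustration, each text fragment ends up on its own line, and no attempt is made to reproduce page layout.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping <script>/<style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside script/style elements.
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.parts.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

To use it on a live page, the output of curl or wget could be read from standard input and passed to html_to_text. Inline elements such as bold text are split onto separate lines, so the result is cruder than what lynx -dump or w3m -dump produce.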
