Many sources recommend a "filter input, escape output" approach to handling content. I have followed this fairly consistently, but I have run into a situation where I might want to break the rule, and I am not sure what the costs of doing so would be.

I have a pubsub websocket server (node.js) whose purpose is to read data from another server and then send it to subscribed clients through a websocket.

Currently, when a client receives data from the websocket server, the client-side JavaScript escapes it for HTML. This seems fragile. I have limited security expertise, so having client-side JavaScript do the encoding may introduce a vulnerability I am unaware of. As a hypothetical example, maybe the client is an older browser that does not fully support the escaping function, leaving it vulnerable. It also seems strange in general to make a browser client do this sort of escaping.

I am considering moving the HTML escaping from the client to the websocket server. The only encoding needed is converting strings to HTML entities, but the strings would have to be parsed out of each message, encoded, and then re-serialized before being sent to the client. This might add significant overhead, since each string can be several KB long. The websocket server is supposed to be just a pipe between a publisher and the subscribers, so it is meant to be as fast as possible, but security is even more important.
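To make that per-message cost concrete, here is a minimal sketch of the server-side option. It assumes the messages are JSON, which the question does not state, and `escapeHtml`, `escapeStrings`, and `escapeMessage` are hypothetical names made up for this illustration.

```javascript
// Hypothetical helper: replaces the characters that are significant in
// HTML text and in quoted attribute values.
const HTML_ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Recursively escape every string value in a parsed payload.
// Object keys are left as-is in this sketch.
function escapeStrings(value) {
  if (typeof value === 'string') return escapeHtml(value);
  if (Array.isArray(value)) return value.map(escapeStrings);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, val]) => [key, escapeStrings(val)])
    );
  }
  return value; // numbers, booleans, and null pass through unchanged
}

// The work added to every message (ASSUMPTION: JSON payloads):
// parse, escape, serialize again.
function escapeMessage(rawJson) {
  return JSON.stringify(escapeStrings(JSON.parse(rawJson)));
}
```

Every subscriber then receives text that is only correct for an HTML context, which is the trade-off being weighed here.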

Clients of the websocket server will ONLY need to display the data as HTML, so I am confident that no client will need to do anything different with the data. (If a new kind of client must consume the data differently, I would set up a new websocket server or a new subscription that does not encode the data before sending it.)

This doesn't quite violate the "filter input, escape output" idea because the original data remains untouched. The intermediate "client," the websocket server, would receive and secure the data before sending it to a browser, which is like how a normal webserver operates when it fetches from the database and then sends HTML to the browser.

The options I am considering now are:

  1. Keep encoding the data as html entities in the client-side javascript
  2. Encode as html entities on the websocket server before sending it to clients

Encoding on the websocket server might incur some overhead, but it seems like the safest thing to do since it happens server-side. Erring on the side of safety makes me want to move away from encoding HTML in client-side JS.

Encoding on the client seems the riskiest option to me due to being full of unknowns.

I use Underscore's escape function (https://underscorejs.org/docs/modules/escape.html) to escape HTML in client-side JS.
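For reference, the client-side option looks roughly like the sketch below. `escapeHtml` is a stand-in for `_.escape` so the snippet runs without Underscore loaded, and `renderMessage`, `socket`, and `list` are hypothetical names, not taken from my actual code.

```javascript
// Stand-in for _.escape (ASSUMPTION: not Underscore's exact implementation).
const HTML_ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Turn one raw message into markup; the escape happens right where the
// text enters an HTML context.
function renderMessage(rawText) {
  return '<li class="message">' + escapeHtml(rawText) + '</li>';
}

// In the browser this would be wired up roughly as:
//   socket.onmessage = (event) => {
//     list.insertAdjacentHTML('beforeend', renderMessage(event.data));
//   };
```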

Should I move the html escaping to the websocket server or keep it in client-side javascript? Are there any vulnerabilities I should be aware of in keeping it in client-side javascript?

1 Answer

Simple Rule: Escape if and when you are displaying it and specifically for the context being used.

In other words, send it as plain text, and escape it as HTML only if you're displaying it in an HTML context.

Premature escaping is how you end up with garbage like &amp;amp; showing up in your HTML because you lost track of whether it was already escaped and escaped it a second time.
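A small sketch of how that happens, using a hypothetical `escapeHtml` helper (any HTML-escaping function behaves the same way when applied twice):

```javascript
// Hypothetical helper for illustration.
const HTML_ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

const raw = 'Tom & Jerry';

// Escaped once: the markup contains "&amp;", the page shows "Tom & Jerry".
const once = escapeHtml(raw);

// Escaped again (say, on the server and then in the client): the markup
// contains "&amp;amp;", and the page shows the literal text "Tom &amp; Jerry".
const twice = escapeHtml(once);

console.log(once);  // Tom &amp; Jerry
console.log(twice); // Tom &amp;amp; Jerry
```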

In practice this means that your input, transport, persistence in the database, and retransmission steps are all "raw". The final step where you insert that content into an HTML document is when you escape.

3 Comments

Thanks for the reply. This is the first time I have had to escape in client-side javascript. Is it appropriate to do it in this context? Sending potentially malicious text to be inserted into the page that then relies on an unknown client escaping it sounds a bit dangerous to me. But I recognize I am not sure of any actual problems this process might face.
The difference between a typical webserver and a websocket server is that the webserver is creating an HTML document and the websocket server is not. But if I escape on the websocket server, I KNOW the data is safe for the client, whereas if I do it in the client, I am not sure.
How is it "unknown client escaping"? Unless the client is running some extremely broken implementation of JavaScript then it'll be fine. If their JavaScript is broken then they likely have far bigger issues than XSS problems. Let the client decide if/when to escape. Remember with WebSocket the client might not actually be a web browser, it could very easily be a native application not using HTML at all, so let the client do what the client needs to do for the display method it's actually using.
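As a rough illustration of "let the client decide", here is the same raw message handled by two hypothetical consumers. The `renderers` object and `escapeHtml` helper are made up for this sketch.

```javascript
// Hypothetical helper for illustration.
const HTML_ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Each consumer applies the encoding its own output context needs.
const renderers = {
  // Browser client inserting into an HTML document: escape here.
  html: (message) => '<p>' + escapeHtml(message) + '</p>',
  // Client that shows plain text (e.g. a native UI label): nothing is
  // parsed as HTML, so the raw text is displayed unchanged.
  plainText: (message) => message,
};

const message = '5 < 6 & 7';
console.log(renderers.html(message));      // <p>5 &lt; 6 &amp; 7</p>
console.log(renderers.plainText(message)); // 5 < 6 & 7
```

Had the server escaped the message before sending, the plain-text consumer would display the entities literally.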