How to run untrusted code serverside?

Question

I'm trying to run untrusted JavaScript code on Linux + Node.js with the sandbox module, but it's broken. All I need is to let users write JavaScript programs that print out some text. No other I/O is allowed, and only plain JavaScript is to be used, with no other Node modules. If that's not really possible, what other language do you suggest for this kind of task? The minimal feature set I need is some math, regexes, string manipulation, and basic JSON functions. Scripts will run for, let's say, 5 seconds tops, and then the process would be killed. How can I achieve that?

You could use Ruby. I could help you sandbox that. It has all of the features (regex, math, strings, and a JSON library). You could also always use a lower-level approach to sandboxing: either normal permissions or SELinux (but that seems to be going WAY overboard).
– Linuxios, Commented Jun 7, 2012 at 18:51
I'd prefer JavaScript, but could you please explain how you'd go about sandboxing Ruby code?
– AlfredoVR, Commented Jun 7, 2012 at 19:18
Here's a basic Gist: gist.github.com/2890984 that I just wrote. Ruby's global $SAFE variable, when set to 4 (its highest), will prevent just about everything besides what you want to allow. It will disallow I/O, networking, most access to other objects that it didn't create, etc. And then we can safely use the dreaded eval. The Thread part is because unless you do the sandboxing in another thread, your main thread will be subject to the same restrictions of $SAFE level 4.
– Linuxios, Commented Jun 7, 2012 at 19:23
What do you mean by 'sandbox module is broken'? Are you sure you're referring to this module: http://gf3.github.com/sandbox/ ?
– alessioalex, Commented Jun 8, 2012 at 7:24
Can you clarify how that module's behaviour is erratic? If you can post up the JS code you're using with it at the moment, perhaps that will help.
– halfer, Commented Jun 9, 2012 at 11:55

Jerska · Accepted Answer · 2021-09-09 09:25:59Z

All libraries I've seen mentioned in such questions (vm2, jailed) try to isolate the node process itself. Those kinds of "jails" are constantly broken, and they depend on future upgrades to node's standard library not exposing another attack vector.

An alternative would be to use the v8::Isolate class directly. It is meant to isolate JavaScript in Google Chrome & node, so you can expect it to be fully maintained, and more secure than anything you, I or a single library maintainer would ever be able to build. This class is only able to run "pure" JavaScript. It has the full ECMAScript implementation, but no browser API or node API.
v8::Isolate is also what Cloudflare uses for their Workers product.

Deno, the new runtime developed by node's creator, aims to sandbox by default using exactly the same thing, exposing parts of the standard library depending on the flags you enable.

In a node environment, you can use isolated-vm. It's an amazing library that creates v8::Isolate instances for the code you want to run in isolation.

It provides methods to pass values and functions to the isolate and back. This is not as trivial to use as most of the "jailing" libraries, but it gives you actual sandboxing of the JavaScript code.
As it's "pure" JavaScript, the only escapes are the ones you provide in the form of injected functions.
Also, it gets automatically updated with each node version, as it uses node's own v8::Isolate.
One of the main pains is that if you want to inject libraries into your script, you will likely need a package bundler like webpack to bundle everything into a single script that the library can run.
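To make the "injected functions" idea concrete, here is a minimal sketch, assuming isolated-vm is installed from npm and a Node version it supports (check the library's README for current requirements). The helper name `runInIsolate`, the injected `print` function, and the default limits are illustrative assumptions, not part of the library.

```javascript
// Hypothetical helper: runs `source` in a fresh V8 isolate and returns
// whatever the script passed to the injected `print` function.
function runInIsolate(source, { timeoutMs = 5000, memoryLimitMb = 32 } = {}) {
  // Loaded lazily so this file can be loaded even where the add-on is missing.
  const ivm = require('isolated-vm');
  const isolate = new ivm.Isolate({ memoryLimit: memoryLimitMb });
  try {
    const context = isolate.createContextSync();
    const output = [];
    // The only capability handed to the script: a function that records text.
    context.global.setSync('print', (text) => {
      output.push(String(text));
    });
    // Compile and run with a wall-clock timeout; a runaway script throws here.
    isolate.compileScriptSync(source).runSync(context, { timeout: timeoutMs });
    return output;
  } finally {
    // Free the isolate's memory whether the script finished or not.
    isolate.dispose();
  }
}
```

With the library installed, calling `runInIsolate('print(JSON.stringify({ n: Math.sqrt(16) }))')` should return the printed lines as an array. The script sees ECMAScript built-ins and `print`, and nothing from node.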

I personally use it in a crawler to run user-provided code that extracts information from a webpage, and it works wonders.

"This is what is used by Cloudflare"... What? Do you mean V8::Isolate? In that case the paragraph spacing should be fixed.

asvd · Accepted Answer · 2014-10-16 10:31:15Z

I've recently created a library for sandboxing untrusted code. It seems to fit the requirements: it executes the code in a restricted process in the case of Node.js, and in a Worker inside a sandboxed iframe in a web browser:

https://github.com/asvd/jailed

You can export a given set of methods from the main application into the sandbox, thus providing any custom API and set of privileges (that feature was actually the reason I decided to write a library from scratch). The mentioned math, regexp and string-related stuff is provided by JavaScript itself; anything additional may be explicitly exported from outside (like some function for communicating with the main application).

Tippa Raj · Accepted Answer · 2013-12-05 06:24:01Z

Docker.io is an awesome new kid on the block, which uses LXC and cgroups to create sandboxes.

Here is one implementation of an online gist (similar to codepad.org) using Docker and Go.

This just goes to demonstrate that one can safely run untrusted code written in many programming languages inside Docker containers, including node.js.

4 Comments

AmitA Over a year ago

Thank you for sharing! I bundled the gist you mentioned into a Ruby gem here: github.com/vaharoni/trusted-sandbox. It allows running untrusted code with Ruby, though can be easily used to run JS by using the ruby racer. It allows setting disk quotas, limiting memory, CPU sharing, etc.

mac Over a year ago

Docker is not designed as an isolation tool. It is designed to ease the "packaging" and distribution of applications. That is not to say that it is totally insecure (see here for example), but rather that security is not one of its design principles, and it is just plain wrong to present it as a "sandboxing tool". To the best of my knowledge, the standard way to sandbox untrusted code is to use virtual machines or, for those interpreted languages that support it (e.g. pypy), the interpreter's "sandboxing option"...

Yahya Uddin Over a year ago

Is this still true in 2018?

Vicary Over a year ago

This answer should be removed to prevent confusion, containers are already defined unfit as an isolation tool for anything security related.

user588779 · Accepted Answer · 2012-06-09 11:50:16Z

The basic idea of sandboxes is this: a script needs variables predefined as globals to do stuff, so if you deny it those globals by unsetting them, or by replacing them with controlled ones, it cannot escape. As long as you don't forget anything.

First, deny require() or replace it with something controlled. Don't forget about process and "global" a.k.a. "root". The difficult thing is not to forget anything, which is why it's good to rely on someone else having built a sandbox ;-)
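The "don't forget anything" caveat is easy to underestimate. The sketch below uses node's built-in vm module (my choice for illustration, not something this answer names) to give a script only a controlled `print` global. It then shows that the script can still reach the host through that one injected function.

```javascript
const vm = require('node:vm');

// Collect output instead of exposing console; `print` is the only global injected.
const output = [];
const context = vm.createContext({
  print: (text) => { output.push(String(text)); },
});

// Math, regex, string, and JSON built-ins are available inside the context.
vm.runInContext('print(JSON.stringify({ root: Math.sqrt(16) }))', context, { timeout: 1000 });

// Node globals such as `process` are not defined inside the context...
const direct = vm.runInContext('typeof process', context);

// ...but `print` was created by the host, so its constructor is the host's
// Function constructor, which evaluates code outside the context.
const escaped = vm.runInContext('print.constructor("return typeof process")()', context);

console.log(output, direct, escaped);
```

Here `direct` is 'undefined' while `escaped` is 'object': the forgotten path was the prototype chain of the injected function. Node's own documentation states that the vm module is not a security mechanism.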

Rajagopal S · Accepted Answer · 2017-08-11 10:06:59Z

I know it's pretty late to answer this question, but the tool below might add value, since it isn't mentioned in the answers and comments above.

I'm trying to implement a similar use case. After going through the available web resources, https://www.npmjs.com/package/vm2 seems to handle the sandbox environment (nodejs) pretty well.

It pretty much covers the sandboxing features, like restricting access to built-in or external modules, exchanging data with the sandbox, etc.

mpartel · Accepted Answer · 2013-08-02 10:10:39Z

If you can afford the performance hit, you could run the JS in a throwaway virtual machine with the appropriate CPU and memory limits.

Of course, then you are trusting the security of the VM solution. By using it together with an ordinary JS sandbox, you'd have two layers of security.

For an additional layer, put the sandbox on a different physical machine than your main app.
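The time and memory limits can also be sketched at the process level. This is an assumption-laden illustration, not a virtual machine and not a sandbox on its own: the child process still has full access to node's APIs, so it only shows the limit-and-kill part. The helper name `runWithLimits` and its defaults are hypothetical.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: run `source` in a separate node process that is
// killed after `timeoutMs` and has a capped V8 heap.
function runWithLimits(source, { timeoutMs = 5000, maxHeapMb = 64 } = {}) {
  const result = spawnSync(
    process.execPath,
    [`--max-old-space-size=${maxHeapMb}`, '-'], // '-' means: read the script from stdin
    { input: source, timeout: timeoutMs, killSignal: 'SIGKILL', encoding: 'utf8' }
  );
  // spawnSync reports a timeout through an error with code ETIMEDOUT.
  const timedOut = result.error !== undefined && result.error.code === 'ETIMEDOUT';
  return { stdout: result.stdout, status: result.status, timedOut };
}

const ok = runWithLimits('console.log(1 + 1)');
const runaway = runWithLimits('while (true) {}', { timeoutMs: 500 });
console.log(ok.stdout.trim(), runaway.timedOut);
```

Combined with one of the isolation layers discussed in this thread, this gives the "kill after N seconds" behaviour asked for in the question.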

Nikolay Tsenkov · Accepted Answer · 2014-02-03 09:45:23Z

I am facing a similar problem right now and I'm reading only bad things about the sandbox module.

If you don't need anything specific to the node environment, I think the best approach is to use a headless browser such as PhantomJS or Chimera as the sandbox environment.

godzilla · Accepted Answer · 2018-02-24 10:16:27Z

A late answer but maybe an interesting idea.

Static code analysis => AST manipulation => Code generating

Static analysis parses the source code into an AST. The AST provides a common data structure that lets us traverse and modify the source code.
Via AST manipulation, we can find every identifier that refers to a sensitive variable in the outer scopes. If needed, we can re-declare and initialize those identifiers at the beginning of the function body, so as to overwrite them. Thus all references from the inside to the outside are under control.
Generating code from the AST is easy as well.

For instance, a function is as shown below:

    function () {
      a = 1;
      window.b = 1;
      eval('window.c()');
    }

Static analysis based on JS code parser enables us to insert variable declaration statements before the original function body:

    function () {
      var a, window = {}, eval = function () {}; // variable overwriting
      a = 1;
      window.b = 1;
      eval('window.c()');
    }

That's it.

More overwritings should be considered, such as eval(), new Function() and other global objects or APIs. And warnings during parsing should be well organized and reported.
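To illustrate why "more overwritings should be considered", here is a simplified sketch. It skips the parser entirely and just prepends the shadowing declarations as a string, so it is a stand-in for the AST approach rather than an implementation of it. The helper name `shadowGlobals` is hypothetical.

```javascript
// Hypothetical helper: build a function whose body starts by shadowing the
// given global names with local `var` declarations.
function shadowGlobals(body, names) {
  const declarations = names.map((name) => `${name} = undefined`).join(', ');
  return new Function(`var ${declarations};\n${body}`);
}

const hidden = ['process', 'require', 'global'];

// A direct reference now resolves to the local, undefined variable.
const guarded = shadowGlobals('return typeof process;', hidden);

// But the Function constructor is reachable from any literal, and code it
// creates runs in the global scope, where nothing is shadowed.
const bypass = shadowGlobals(
  'return [].constructor.constructor("return typeof process")();',
  hidden
);

console.log(guarded(), bypass());
```

`guarded()` yields 'undefined' and `bypass()` yields 'object' under node, so constructor chains have to be handled during analysis as well as plain identifiers.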

Some related work in order:

esprima, ECMAScript parsing infrastructure for multipurpose analysis.
estraverse, ECMAScript JS AST traversal functions.
escope, ECMAScript scope analyzer.
escodegen, ECMAScript code generator.

My practice based on the above is function-sandbox.

Sascha Reuter · Accepted Answer · 2022-08-16 13:17:47Z

We were running into the same problem while working on one of our products. We wanted to allow users to provide their own custom (untrusted) code that we would run at specific key events of the product, e.g. a task being completed. Pretty much a better alternative to webhooks!

What we've ended up with was building a separate service using a combination of AWS Lambda, Rust & V8::Isolate and some other bits to make it not only secure but also really fast. We've also added our own integrations of fetch() and such, as V8 doesn't support Web or Node-specific APIs. This furthermore allowed us to do some neat stuff, like restricting the endpoints a script could talk to and even pre-authenticate requests by injecting a pre-configured Authorization header for specific requests/domains.

Instead of open-sourcing our work, we opted to offer the service to others as a hosted offering. The service is globally deployed, requires no setup, and is completely stateless by default! You can check it out at https://scriptable.run.

Aaron Digulla · Accepted Answer · 2012-06-09 12:49:18Z

Ask yourself these questions:

Are you one of the smartest persons on the planet?
Do you turn down job offers by Google, Mozilla and Kaspersky Lab routinely because it would bore you?
Does the "untrusted code" come from people working at the same company as you or from criminals and bored computer kids all over the globe?
Are you sure that node.js has no security holes that could leak through your sandbox?
Can you write perfect 100% error free code?
Do you know everything about JavaScript?

As you already know from your experiments with the sandbox module, writing your own sandbox isn't trivial. The main problem with sandboxes is that you must get everything right. One mistake will ruin your security completely, which is why browser developers fight a constant battle with crackers all over the globe.

That said, simple sandboxes are pretty easy to do. First, you'll need to write your own JavaScript interpreter because you can't use the one from node.js because of eval() and require() (both would allow crackers to escape your sandbox).

The interpreter must make sure that the interpreted code cannot access anything besides the few global symbols that you provide. This means there can't be an eval() function, for example (or you must make sure that this function is only evaluated in the context of your own JavaScript interpreter).

Drawback of this approach: A lot of work and if you make a mistake in your interpreter, the crackers can leave the sandbox.

Another approach is to clean the code and run it with node.js's eval(). You can clean existing code by running a bunch of regexps over it, for example replacing every match of /eval\s*[(]/g with an empty string, to remove malicious code parts.

Drawback of this approach: It's easy to make a mistake that will leave you vulnerable to an attack. For example, there might be a mismatch between what the regexp and what node.js consider to be "whitespace". Some obscure Unicode whitespace might be accepted by the interpreter but not by the regexp, which would allow an attacker to run eval().
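A small sketch of how fragile the cleaning approach is. It uses a different bypass from the whitespace mismatch described above: the function name is assembled at runtime, so the regexp never sees the text it is looking for. `stripEval` is a hypothetical name for the cleaning step.

```javascript
// The cleaning step: delete every `eval(` from the source text.
function stripEval(code) {
  return code.replace(/eval\s*[(]/g, '');
}

// The obvious case is caught (and leaves broken code behind).
const naive = stripEval('eval("1 + 1")');

// Here the literal text `eval(` never appears, so the filter changes nothing.
const evasive = 'globalThis["ev" + "al"]("1 + 1")';
const cleaned = stripEval(evasive);

// The "cleaned" code still reaches eval when it is executed.
const result = new Function(`return ${cleaned};`)();

console.log(naive, cleaned === evasive, result);
```

The filter works on the text of the program, while the interpreter works on its meaning, and an attacker only needs one spelling the filter does not know.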

My suggestion: Write a small demo test case that shows how the sandbox module is broken and have it fixed. It will save you a lot of time and effort and if there is a bug in the sandbox, it won't be your fault (well, not entirely at least).

eval = null; require = null; After this there is no way left to run either function. I won't say building a sandbox is easy; you are right that if you forget one thing (like setTimers with implicit evals), boom! But you're overcomplicating things quite a bit.
Well, sure. The problem is that the OP is looking for a cheap way out, which simply doesn't exist. Security is always expensive and bothersome. Your solution looks good until a customer shows up and demands to be able to call eval(). Some "smart guy" "fixes" the issue by adding var __ev = eval; eval = null; because "no one will ever figure that out" (security by obscurity).
Why is eval actually bad? The eval'ed code can't access require and process either if they're set to null or undefined first. So... I don't get why you think that code must be prechecked to run in a sandbox. The sandbox must "just" be tight with regard to the environment it can access.
"Are you one of the smartest persons on the planet?" These questions are unnecessarily derogatory and don't add to your answer whatsoever. You don't need to be a genius to find a way to run foreign code serverside (you may notice a lot of people run web hosting services).
Also, security isn't "expensive", and rarely not worth the performance impact. It's precise, not difficult.
