Monday, December 12, 2011

Node.js and process.nextTick - why you don't use it

Lately I have been messing with a new tool in my hypothetical toolbox: Node.js.  Node.js is a platform for developing server-side applications in JavaScript, built on the V8 JavaScript engine used in Google Chrome.  The prevalent paradigm in Node is event-driven programming.  A Node process runs your JavaScript on a single thread, and all of the IO calls (network, file access, database, etc.) are asynchronous.  Most of this is done "under the hood."  One thing that Node does NOT do, however, is run your code in parallel, the way languages with multithreading support can.  This has the benefit of freeing you from worrying about shared resources (read: memory), but the drawback that CPU-intensive work blocks everything else in the process until it finishes.
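To make the blocking behavior concrete, here is a minimal sketch.  The busyWait helper is a hypothetical stand-in for any CPU-heavy function, and the 200 ms figure is arbitrary.  A timer that is due immediately still cannot fire until the synchronous work has finished.

```javascript
// Sketch: a CPU-bound loop holds up a timer that was due immediately.
const events = [];

// Hypothetical stand-in for CPU-heavy work: spin for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // burn CPU; nothing else in this process can run meanwhile
  }
}

const scheduledAt = Date.now();

// Ask for a callback "as soon as possible".
setTimeout(function () {
  events.push('timer');
  console.log('timer fired after ' + (Date.now() - scheduledAt) + ' ms');
}, 0);

// The single thread is busy here, so the timer has to wait.
busyWait(200);
events.push('busy work done');
```

The timer callback reports a delay of at least the length of the busy loop, even though it was scheduled with a delay of 0.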

Why is this important?  Well, one of the mantras of Node.js is to make sure you don't write blocking code.  Some developers hear this and try to find a way to make CPU-intensive work run asynchronously.  Browsing the documentation, they come across a method on the process object called "nextTick", which takes a callback.  The documentation says that this pushes the execution of that callback to the next trip around the event loop.  This often gets interpreted as "runs the function in parallel."  False.  It simply defers the execution of that callback until the current execution is finished (read: finished blocking the CPU).  Once the callback starts, it blocks the thread exactly like any other code.  This means that if you have some really CPU-intensive code, don't attempt to use process.nextTick to prevent it from blocking requests.  There are some ways to mitigate this, such as spawning a new Node process (not terribly efficient, but it gets the job done).
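A small sketch of that point, assuming a current Node.js release (where nextTick callbacks run before timers).  The busyWait helper is again a hypothetical stand-in for heavy work.  Wrapping the work in process.nextTick changes when it starts, not whether it blocks.

```javascript
// Sketch: process.nextTick defers heavy work but does not run it in parallel.
const order = [];

// Hypothetical stand-in for CPU-heavy work: spin for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // burn CPU
  }
}

// Something else that wants to run soon, e.g. an incoming request handler.
setTimeout(function () {
  order.push('timer');
  console.log(order.join(' -> '));
}, 0);

// "Defer" the heavy work. It still runs on the same thread.
process.nextTick(function () {
  busyWait(200);
  order.push('deferred heavy work');
});

order.push('current code finished');
```

The timer callback only runs after the deferred work has completed, so the recorded order is: current code finished, deferred heavy work, timer.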

One important thing to note is that control flow libraries like async.js have some misleading method names.  For example, async.js has a method called "parallel."  This is very misleading, because it does not run anything on separate threads; it only schedules callbacks on the same event loop, using mechanisms like process.nextTick.  Parallel is really used to coordinate the execution of several functions that make asynchronous IO calls.  The code itself, however, is always single threaded (although you cannot guarantee the order in which the IO callbacks complete).  So if you are trying to parallelize simple CPU-heavy code blocks using this library, well, you are out of luck.
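To illustrate, here is a sketch using a hypothetical runAll helper.  It is NOT async.js's implementation, only a simplified coordinator in the same spirit, with error handling and the empty-task-list case left out.  The cpuTask factory is also made up for this example.  Given purely CPU-bound tasks, the tasks run strictly one after another.

```javascript
// Hypothetical coordinator (not async.js): start every task and
// call `done` once all of them have reported a result.
function runAll(tasks, done) {
  const results = new Array(tasks.length);
  let pending = tasks.length;
  tasks.forEach(function (task, i) {
    task(function (err, value) {
      results[i] = value; // error handling omitted for brevity
      pending -= 1;
      if (pending === 0) done(null, results);
    });
  });
}

const log = [];

// Hypothetical task factory: pure CPU work, no IO involved.
function cpuTask(name) {
  return function (callback) {
    log.push(name + ' start');
    let sum = 0;
    for (let i = 0; i < 1e7; i++) sum += i;
    log.push(name + ' end');
    callback(null, sum);
  };
}

runAll([cpuTask('a'), cpuTask('b')], function (err, results) {
  log.push('all done');
  console.log(log.join(', '));
});
```

Task 'b' cannot start until task 'a' has finished, because there is no IO wait during which the event loop could switch to something else.  A coordinator like this only pays off when the tasks spend their time waiting on asynchronous IO.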

So, to sum this all up, if you are using process.nextTick to keep CPU-heavy code from blocking, there is a better than good chance you are looking at your project the wrong way.  Remember that while IO is asynchronous, your Node.js code is not, even though it's typically very, very fast :)  I'm quickly growing to love this platform for development, and it scales very well.  Nginx uses a very similar event-driven structure to produce an extremely resource-efficient, highly scalable server.

Happy coding.

3 comments:

  1. In the code below, the event emitted by s.emit('abc') is not blocked by the loop but appears every 2 ms in the console. However, if process.nextTick(loop) is replaced by loop(), then the event is blocked and never seen.

    So in this example the CPU intensive loop does not block events as you suggest if process.nextTick is used. Loop execution sleeps between loop end and beginning to allow events to be processed.

    Therefore your article is entirely misleading.

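    var EventEmitter = require('events').EventEmitter;
    var s = new EventEmitter();

    // Log every 'abc' event that gets delivered.
    s.on('abc', function () {
      console.log('abc');
    });

    // Emit 'abc' every 2 ms.
    setInterval(function () { s.emit('abc'); }, 2);

    // Print "1", then reschedule itself with process.nextTick.
    function loop() {
      console.log("1");
      process.nextTick(loop);
    }

    loop();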

  2. As a general rule of thumb, node developers are not writing code that loops every 2 milliseconds while also expecting to appropriately serve web requests in the same application. The spirit of the article was to make sure people knew about the single-threaded nature of node.js, and understood that if you have TRULY CPU-intensive code, you would want to separate that from your web application code, as opposed to being misled into thinking process.nextTick is going to save you. Think about a web application that handles 5k requests per second while also running a function every 2 milliseconds. That would be a terrible application.

    Therefore your comment is entirely annoying. And not very practical.

  3. Also, to put another dagger into your argument, node v0.10.0 has altered process.nextTick so that its callbacks run prior to the end of the current tick, as opposed to on the next trip 'round the event loop. So, if you plug your code into node v0.10.0, it now creates a recursive call to loop, which will crash v8. Go ahead, I tried it. I'll be writing up another post about this soon.
