Monday, December 12, 2011

Node.js and process.nextTick - why you don't use it

Lately I have been messing with a new tool in my hypothetical toolbox - node.js.  Node.js is a platform for developing server-side applications in javascript, built on the V8 javascript engine used in Google Chrome.  The prevalent paradigm in node is event driven programming.  Node is designed so that your code in a node process runs on a single thread, and all of the IO calls (network, file access, database, etc) are asynchronous.  Most of this is done "under the hood."  One thing that node does NOT do, however, is run your code in parallel, the way languages with threading support can.  This has the benefit of freeing you from worrying about shared resources (read: memory), but the drawback that CPU intensive work blocks everything else in the process until it finishes.
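To make the blocking concrete, here is a minimal sketch.  The busyWait helper and the timings are made up for illustration; it just spins the CPU to stand in for real work.  A timer asks to fire after 10 ms, but it cannot run until the busy loop gives the thread back.

```javascript
const start = Date.now();
let timerDelay = null;

// Ask for a callback in 10 ms.
setTimeout(() => {
  timerDelay = Date.now() - start;
  console.log(`timer fired after ${timerDelay} ms (asked for 10)`);
}, 10);

// Hypothetical CPU-bound work: spin until `ms` milliseconds have passed.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // burn CPU; nothing else in this process can run meanwhile
  }
}

// While this loop runs, the timer above cannot fire.
busyWait(200);
```

On my understanding of the event loop, the timer should report a delay of roughly 200 ms or more rather than 10, because the callback has to wait for the single thread to become free.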

Why is this important?  Well, one of the mantras of node.js is to make sure you don't write blocking code.  Some developers hear this and try to find a way to make CPU intensive work run asynchronously.  Browsing the documentation, they come across a method on the process object called "nextTick", which you can pass a callback to.  The documentation says that this pushes the execution of that callback to the next pass around the event loop.  This often gets interpreted as "runs the function in parallel."  False.  It simply defers the callback until the code that is currently running has finished (read:  finished blocking the CPU).  Once the callback does run, it blocks the thread exactly as it would have if you had called it directly.  This means that if you have some really CPU intensive code, don't attempt to use process.nextTick to prevent it from blocking requests.  There are some ways to mitigate this, such as spawning a new node process (not terribly efficient, but it gets the job done).
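Here is a small sketch of what "deferred, not parallel" looks like.  The heavy function and the labels are hypothetical stand-ins for CPU-bound work.  If nextTick really ran things in parallel, the two chunks of work would overlap; instead the deferred one only starts after the main code is done, and a waiting timer is held up behind both.

```javascript
const order = [];

// Hypothetical CPU-bound task: spin for `ms` milliseconds, then record a label.
function heavy(label, ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // burn CPU
  }
  order.push(label);
}

// A stand-in for an incoming request that is ready to be handled right away.
setTimeout(() => order.push('timer'), 0);

// Deferred, but still on the same thread.
process.nextTick(() => heavy('deferred work', 100));

heavy('main work', 100);
order.push('main done');

// Print the observed order once everything has settled.
setTimeout(() => console.log(order), 500);
```

The logged order should be 'main work', 'main done', 'deferred work', 'timer'.  The deferred work did not overlap with anything, and the timer still had to wait for it.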

One important thing to note is that control flow libraries like async.js have some misleading method names.  For example, async.js has a method called "parallel."  This is very misleading, because at its core, it uses process.nextTick.  Parallel is really used to coordinate the execution of several functions that make asynchronous IO calls:  it starts them all and calls you back once they have all finished.  The code itself, however, is always single threaded (although you cannot guarantee the order in which the functions complete).  So if you are trying to parallelize simple CPU heavy code blocks using this library, well, you are out of luck.
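To show what that kind of coordination amounts to, here is a rough sketch of a "parallel" style helper.  This is my own simplified version, not the async.js source, and the fakeIo tasks are hypothetical stand-ins that use setTimeout in place of real IO.

```javascript
// Hypothetical helper, NOT the async.js implementation: start every task,
// store each result at its task's index, and call `done` exactly once.
function parallel(tasks, done) {
  const results = new Array(tasks.length);
  let pending = tasks.length;
  let failed = false;
  if (pending === 0) return done(null, results);

  tasks.forEach((task, i) => {
    task((err, value) => {
      if (failed) return;
      if (err) {
        failed = true;
        return done(err);
      }
      results[i] = value;
      if (--pending === 0) done(null, results);
    });
  });
}

const finished = [];
let finalResults = null;

// Stand-in for an asynchronous IO call that takes `ms` milliseconds.
function fakeIo(name, ms) {
  return (cb) => setTimeout(() => {
    finished.push(name);
    cb(null, name);
  }, ms);
}

parallel([fakeIo('slow', 60), fakeIo('fast', 20)], (err, results) => {
  finalResults = results;
  console.log('finished in order:', finished);
  console.log('results:', results);
});
```

Notice that the tasks are started one after another on the same thread; only the waiting overlaps.  Here 'fast' finishes before 'slow', while the results array keeps the order the tasks were given in.  Swap fakeIo for a busy loop and each task would simply run to completion before the next one starts.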

So, to sum this all up, if you are using process.nextTick to keep CPU heavy code from blocking, there is a better than good chance you are looking at your project the wrong way.  Remember that while IO is asynchronous, your own node.js code is not:  it runs one piece at a time on a single thread, even though it's typically very, very fast :)  I'm quickly growing to love this platform for development, and it scales very well.  Nginx uses a very similar event driven structure to produce an extremely resource efficient, highly scalable server.

Happy coding.