For several months now, JavaScript has been the most active language on GitHub, far ahead of Java and PHP. In addition to its strong presence on the “front” side of applications, JavaScript is beginning to earn its spurs on the “server” side.
This is evidenced by the numerous projects based on Node.js, which is increasingly chosen to experiment with real-time or near-real-time use cases.
The aim of these projects is to offer rich user experiences in highly connected environments (video games, mobile, consumer surveys, notifications, etc.). For example, Facebook’s chat and search engine are partly developed with Node.js, and LinkedIn’s mobile application is largely based on Node.js.
Initial feedback shows that this technology is still in its infancy, and therefore complex to master, particularly on production projects, but very promising. It has already found an audience and a passionate, dynamic community.
Node.js for “real-time” projects, and more…
V8, Chrome’s JavaScript engine, is at the heart of Node, and is extremely fast thanks to its JIT (just-in-time) compiler. In Node.js, it is combined with libuv, an asynchronous I/O library. The result is a high-performance runtime capable of processing thousands of requests simultaneously. Node.js’s asynchronous design avoids “blocking” sequential programming and minimizes time spent waiting on input/output (filesystem, HTTP calls, etc.). This asynchronous, event-driven model is not new; it is even at the heart of Nginx, one of the most popular web servers on the Internet.
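To make this concrete, here is a minimal sketch using only the standard fs module: the read is handed off to libuv, and the program keeps running until the callback fires (the file path is just an example).

var fs = require('fs');

fs.readFile('/etc/hosts', 'utf8', function (err, content) {
  if (err) throw err;
  console.log('read finished: ' + content.length + ' characters');
});

console.log('this line runs first: nothing waits for the read');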
Beyond pure performance, it’s above all the very good performance-to-resource-utilization ratio that’s new. This makes Node.js capable of handling large volumes of requests on modest machines. For example, during the French presidential elections, AF83 developed a polling solution in Node.js that handled 20,000 connections per second with a single process attached to one vCPU. What’s more, Node.js’s processing capacity can easily be distributed across several vCPUs thanks to cluster mode.
Reduced processing times and the ability to handle large numbers of requests are the main criteria that lead a project team to choose Node.js. Until now, it was necessary to work in C, C++ or Java to take on such projects. Today, even if Node.js can’t compete with these languages for certain ultra-real-time needs, it puts such projects within reach of many more teams.
Beware, however: it’s easy to do things poorly in Node.js, and quite possible to write blocking code without realizing it.
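For example, a hypothetical handler like the following blocks the entire process while the synchronous read runs (the file path is made up):

var http = require('http');
var fs = require('fs');

http.createServer(function (req, res) {
  // readFileSync blocks the event loop: no other request is served meanwhile
  var data = fs.readFileSync('/tmp/big-report.csv'); // hypothetical file
  res.end('lines: ' + data.toString().split('\n').length);
}).listen(8080);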
Finally, the libraries and modules exposed by the Node.js ecosystem facilitate the implementation of REST APIs capable of interacting in multi-device universes: mobile, desktop, TV, etc. This advantage, coupled with a language that runs on the server, in the browser, or on any terminal equipped with a JS engine, makes the Node.js/JavaScript pair a preferred choice for certain multi-device or pure web applications.
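For instance, express is one of the most popular ecosystem modules for this kind of API. A minimal sketch, assuming it has been installed with npm install express (the route and data are illustrative):

var express = require('express');
var app = express();

app.get('/api/users/:id', function (req, res) {
  // a real application would query a datastore; here we just echo the id
  res.json({ id: req.params.id, name: 'demo user' });
});

app.listen(3000);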
Its qualities
Performance, or rather the ability to handle large numbers of simultaneous requests,
Event-driven / asynchronous model,
Native support for HTTP and, more generally, all network protocols (TCP/UDP),
Multi-platform (mobile, desktop, TV),
Portability of JavaScript code on both server and client sides,
A very active community.
Node.js requires a good command of the JavaScript language: out with the (Java)Script-kiddies.
Node.js is JavaScript. But true JavaScript skills are rare: knowing how to use jQuery doesn’t mean being proficient in JavaScript. A competent JavaScript developer is at least capable of object-oriented programming (using prototypes), functional programming and event-driven programming.
The learning curve is therefore long, as Node.js and JavaScript are full of subtleties (scope, functions, this, events).
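A classic illustration of the “this” subtlety, as a minimal sketch:

var counter = {
  count: 0,
  start: function () {
    setTimeout(function () {
      // inside this plain callback, "this" is no longer counter: prints NaN
      console.log(++this.count);
    }, 100);
  },
  startFixed: function () {
    var self = this; // capture the object in a closure
    setTimeout(function () {
      console.log(++self.count); // prints 1
    }, 100);
  }
};

counter.start();
counter.startFixed();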
Getting up to speed is time-consuming. For programming novices, it takes 4 to 5 months to master the environment.
JavaScript’s history as a language for animating HTML pages has made it unpopular with experienced developers. However, even if there are still a lot of misconceptions about the language, it is normally very accessible to them. After experimenting with a first project, they are often convinced, and even enthusiastic about spreading the word!
The limits of Node.js
– The complexity of event-driven programming
The asynchronous model is a very important concept: it’s omnipresent in the way Node.js works.
Even after some practice, it’s easy to fall into asynchronous design traps.
For example, even a port listening function is asynchronous:
var server = require('http').createServer(function (req, res) { res.end('hello'); });
server.listen(8080, function () { console.log('listening on 8080'); });
console.log('hello');
will give
hello
listening on 8080
This event-driven programming makes debugging more complicated: you need to write callbacks, pass contexts around and handle exceptions… above all without losing the context in which they were thrown. When it comes to exception handling, developers have two options: catch all exceptions and handle them properly, or simply let the process die and restart it. At Fasterize, all exceptions are caught without stopping the process, and are then reported. Fortunately, tools such as node-inspector, based on Chrome’s JavaScript debugger (breakpoints, stacks, etc.), make the task easier.
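As a minimal sketch of both options (the handleError helper is hypothetical):

var fs = require('fs');

// option 1: handle errors where they occur, via the error-first callback convention
fs.readFile('config.json', function (err, data) {
  if (err) return handleError(err); // hypothetical application-level handler
  console.log('loaded ' + data.length + ' bytes');
});

// option 2: catch everything at the top level, then log and carry on, or restart
process.on('uncaughtException', function (err) {
  console.error('uncaught exception:', err.stack);
  // carrying on is risky (state may be corrupt); many teams prefer process.exit(1)
});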
– An overdue v1
A new HTTP library, high-performance SSL, a cleaned-up cluster module, synchronous child-process execution and tons of performance-oriented improvements: to say this new version has been eagerly awaited (and for longer than its predecessors) would be an understatement!
– Too many modules kill the module
From 5,000 to 65,000 in two years! Almost 300 new modules a day! The module ecosystem is fantastic: you can find everything you want, but the downside is that you have to look for, and find, the right module – one that has been tested, used and maintained.
– Windows?
The vast majority of Node.js developers are on Mac or Linux, and Windows is surprisingly the poor relation in this world. Windows developers regularly struggle with modules and features not designed for their OS…
Architecture
– Software architecture adapted to Node.js
Node.js calls for simple, specialized applications in order to master the complexity induced by asynchronous processing; the software architecture must be designed accordingly, dividing complex actions into simple, specialized applications. The philosophy is quite similar to that of Unix/Linux: a multitude of small programs, each doing one thing well, with one input and one output, that can be chained together (cf. awk | grep | sort | uniq). For example, Fasterize designed its SaaS web performance engine by distributing tasks over asynchronous workers and, at the proxy level, by injecting middleware, each responsible for modifying part of the request.
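As a minimal sketch of the chaining idea (the middleware names are illustrative, not Fasterize’s actual API):

// each middleware does one thing to the request/response, then hands over
function decompress(req, res, next) { /* inflate the body */ next(); }
function rewriteHtml(req, res, next) { /* modify the markup */ next(); }
function compress(req, res, next) { /* re-gzip the body */ next(); }

function chain(middlewares) {
  return function (req, res) {
    var i = 0;
    (function next() {
      var mw = middlewares[i++];
      if (mw) mw(req, res, next);
    })();
  };
}

var pipeline = chain([decompress, rewriteHtml, compress]);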
In fact, it doesn’t take long to create and assemble your own modules. As with all languages, dependency management is complex, but Node.js has succeeded in making it simple in most cases thanks to npm (Node Package Manager), Node’s integrated module management tool. The fact remains that this management is not perfect, and there is some debate about the use of npm / git / node_modules.
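In practice, dependencies are declared in a package.json manifest that npm reads; a minimal example (names and versions are illustrative), after which npm install fetches everything into node_modules:

{
  "name": "my-app",
  "version": "0.1.0",
  "dependencies": {
    "express": "~3.4.0",
    "async": "~0.2.0"
  }
}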
– In production
In production, as with any process server, you need a monitoring mechanism. At Fasterize, given the immaturity of the tools at the time, all the classic wiring was redeveloped: management of PIDs, logs, services, etc. Today, many modules exist for this purpose.
You still need to be very vigilant about module updates. The community is (too) quick to enrich the ecosystem with ready-to-use modules, and an upgrade can have an impact on performance or even functionality. You must therefore remain cautious about upgrades and have a rigorous testing policy.
Scaling Node.js applications is no more complicated than in other languages: you just need to design your application to be distributed, stateless and asynchronous. Within a single machine, the cluster module lets Node.js take advantage of multiple CPU cores. Since each Node.js process is single-threaded, it runs on a single core by default; the cluster module lets you run the same process on all the machine’s cores. However, this module isn’t fully polished yet: process distribution isn’t perfect (a round-robin distribution option is on the way), and a lot still has to be managed by hand (no integrated graceful restart, for example).
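A minimal sketch of cluster mode, forking one worker per core (the restart-on-exit logic is a common pattern, not something the module does for you):

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork(); // one worker per core
  }
  cluster.on('exit', function (worker) {
    console.log('worker ' + worker.process.pid + ' died, forking a new one');
    cluster.fork(); // rudimentary supervision, no graceful restart here
  });
} else {
  http.createServer(function (req, res) {
    res.end('handled by worker ' + process.pid);
  }).listen(8080); // workers share the port opened by the master
}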
Debugging/profiling is not easy to set up, but integrated tools such as StrongOps are beginning to appear, after Nodetime and TraceGL. In production (and Node.js is magical for this), you can connect to the process directly and inspect everything that has been exposed (node-repl), which is extremely useful for understanding what only happens in production!
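A minimal sketch of the idea with the standard repl and net modules (the stats object and socket path are illustrative):

var net = require('net');
var repl = require('repl');

var stats = { requests: 0 }; // some internal state worth inspecting live

net.createServer(function (socket) {
  var r = repl.start({ prompt: 'app> ', input: socket, output: socket });
  r.context.stats = stats; // expose internals to the REPL session
  socket.on('error', function () {}); // ignore abrupt disconnects
}).listen('/tmp/app-repl.sock');

// then, from a shell on the server, e.g.: socat - UNIX-CONNECT:/tmp/app-repl.sock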
Deploying a Node.js project
There are two main ways to deploy a Node.js project: the first and simplest is to use a PaaS; the second is to use your own servers and wire everything up yourself.
PaaS platforms that support Node.js include Heroku, Nodejitsu, CleverCloud, dotCloud, Azure and Engine Yard.
Each of these platforms lets you develop locally, deploy very simply (either with a git push or a specific client) and then configure how your applications are launched.
Without a PaaS, there are a few deployment tools for Node, but none has really emerged, so you often have to either use tools that are not Node.js-specific or write your own. At Fasterize, Capistrano was adopted and adapted, as it had already proved its worth in Ruby on Rails environments (rollback capability, deployment versioning, etc.).
Best practices
First of all, the best practice that every developer should respect: write unit tests. There are many frameworks (mocha, sinon, should, etc.), they work very well and are well documented, so it’s mandatory!
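A minimal sketch with mocha and should (run with the mocha command; the test is illustrative):

// test/array.test.js
require('should');

describe('Array#indexOf', function () {
  it('returns -1 when the value is absent', function () {
    [1, 2, 3].indexOf(4).should.equal(-1);
  });
});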
Node.js is asynchronous, and it’s very easy to fall into the “race condition” trap, even after years of practice. In complex cases, it can be useful to use flow-control libraries such as async, or promises.
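For example, a minimal sketch with the async module, running two simulated I/O tasks in parallel and collecting the results in task order:

var async = require('async');

async.parallel([
  function (cb) { setTimeout(function () { cb(null, 'A'); }, 100); }, // simulated I/O
  function (cb) { setTimeout(function () { cb(null, 'B'); }, 50); }
], function (err, results) {
  if (err) return console.error(err);
  console.log(results); // ['A', 'B']: results follow task order, not completion order
});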
Secondly, Node.js is JavaScript, so you need to be careful about scope, this, closures, errors due to dynamic typing, etc. It’s JavaScript running on V8, and developers often tend to make micro-optimizations that are counter-productive given the way the compiler works. So you have to test and measure to prove any gain.
In fact, it’s a constant: measure as much as possible. Memory consumption to detect memory leaks, time spent in the event loop to detect blockages, etc. For example, this is how Fasterize detected memory leaks in the HTTP and gzip libraries.
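A minimal sketch of both measurements, using only timers and process.memoryUsage() (the thresholds are arbitrary):

// log heap usage every 5 seconds to spot steady growth (a possible leak)
setInterval(function () {
  var mem = process.memoryUsage();
  console.log('heapUsed: ' + Math.round(mem.heapUsed / 1048576) + ' MB');
}, 5000);

// detect event-loop lag: if our 1-second timer fires late, something was blocking
var last = Date.now();
setInterval(function () {
  var lag = Date.now() - last - 1000;
  if (lag > 100) console.warn('event loop lag: ' + lag + ' ms');
  last = Date.now();
}, 1000);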
The Node.js community and longevity
Many people in the community take great pleasure in experimenting and building around the project. It’s incredibly dynamic. Modules arrived very early and are a real driving force for the community. Developers are very enthusiastic about them, which boosts participation, so development is both rapid and controlled. For example, development of the core is kept separate from that of the modules, which ensures that the heart of Node.js remains stable.
Today, Node.js is like a new continent to explore, with contributors taking the liberty of inventing or reinventing concepts. It’s hardly surprising that it’s one of the most active communities on GitHub. However, like any nascent community, it needs to be structured in order to mature, and in recent weeks the first dramas between major players in this community have appeared…
——————————————————————————————————–
Our technical stack
[Figures: an engine diagram, and a more general view of the stack.]
This article was written in collaboration with our partner Oxalide, a leading provider of outsourcing and hosting services.