Sunday, March 3, 2013

My experience with Nodejs

After being a Java developer for many years, I recently got a chance to develop a web application on Node.js. This post is about my good and bad experiences with Node.js on that project. Node is a JavaScript platform for server-side programming. Earlier, I thought of JavaScript as a superb tool for UI programming, where it really does amazing things. Node changed that perspective: it lets JavaScript run on the server side. Those hearing about Node for the first time may think, "WHAT? JavaScript can run on the server side?" (I thought the same.) But it really is a cool technology for server-side programming. It uses Google's V8 engine (the one used by Chrome) as its runtime, which is known for its fast performance. In Node, a single process serves all incoming requests. It uses asynchronous callbacks, which make sure that the single process does not get stuck on one request when I/O is slow and can process other requests in the meantime.
Before writing up my findings about Node, I want to add a disclaimer: for brilliant programmers it does not matter which technology they use, since they will always make things run smoothly. But for an average programmer like me (and many others), there is always a good and a not-so-good programming paradigm for solving a given problem. OK, after this disclaimer I think good programmers won't beat me up after reading my thoughts about Node :)

Let's come to the findings:

  • In Node, you write code in JavaScript. Almost every programmer who has worked on a web application knows at least a little JavaScript. Its syntax is simple and similar to that of many other programming languages. Overall, learning this programming paradigm won't take much time if you have a programming background in any language. Once you are comfortable writing asynchronous blocks, you are done.
  • Node servers are very lightweight. You don't need an external web server; a few lines of code are enough. Below is a sample Node server:
      var http = require("http");
      http.createServer(function(request, response) {
          response.writeHead(200, {"Content-Type": "text/plain"});
          response.write("Hello World...");
          response.end();
      }).listen(8080);
      console.log("server has started...");

Save the above lines in a file named “sample.js”. Run it by typing node sample.js and your server is ready. Go to your browser and open http://localhost:8080, and you will see “Hello World...” displayed on your screen. While running the code you might have observed that http.createServer() is executed and registers the callback function, which takes request and response as arguments. Execution then moves on to the next statement and prints “server has started...” without waiting for any HTTP request. Whenever you hit the URL “http://localhost:8080”, Node calls the callback function, which returns the response. So execution does not get stuck at createServer(); it continues to the next statement, and the callback handles HTTP requests whenever they arrive. That is the power of asynchronous callback programming: it makes sure that your single process does not get stuck anywhere. Because your code runs in a single process, you don't need to worry about multithreading issues; you only need to write your logic in the correct asynchronous blocks.
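
The same ordering can be seen without a server. Here is a minimal sketch, assuming a hypothetical `order` array that exists only to record what ran when. The callback handed to `setTimeout` is registered, execution continues to the next statement, and the callback runs later, much like the request callback in the server above.

```javascript
// Hypothetical recorder, used only for illustration.
var order = [];

// Register a callback; like createServer(), this call returns immediately.
setTimeout(function () {
  order.push("callback ran");
  console.log(order.join(" -> "));
}, 0);

// This statement runs before the callback, even with a 0 ms delay.
order.push("next statement ran");
```

The callback cannot run until the current block of synchronous code has finished, which is why the second entry is recorded first.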

  • Thanks to its very low memory consumption per connection and its asynchronous I/O approach, Node scales well under high load. It can handle a large number of connections with very few servers deployed, which also lowers your hardware cost.
  • As you might have observed, all I/O is done asynchronously. Your server makes an I/O request and goes on to process other tasks; no CPU time is wasted waiting for I/O. But for CPU-intensive work, which happens inside the server without much I/O, your single process keeps doing that work until it finishes and cannot do anything else, which keeps other requests waiting during that period. So, if your application does far more I/O than long-running CPU-intensive work, Node will perform very well for it.
  • There is one major disadvantage of being a single-process application. If an error occurs at run time and is not handled, the Node process stops, which brings your server down, and you need to start it again.
  • npm (https://npmjs.org/) is a repository of Node modules with thousands of libraries for different purposes. You can also develop your own library and publish it there. I have used many libraries from npm, but I felt that most of them were not mature enough to be used in a big project. So, before using any of those libraries, please analyze and test them carefully.
  • To develop a large enterprise-level application, you will need many general-purpose frameworks, for example: a good MVC framework, logging, ORM, web services (REST/SOAP) frameworks, a unit test framework, and many others specific to your application. I found a few very good Node frameworks and used them without any issue. But in a few areas I could not find a mature framework, and I ended up extending an available library to make it suitable for my application. The frameworks I used earlier for those purposes in other programming languages are much more mature than the Node libraries and cover all the general requirements, maybe because they are developed and maintained by well-known organizations or groups and go through a proper review cycle. I feel this problem will go away with time: once more tech companies start using Node, more mature frameworks and libraries will start coming out. But at this time, of all the Node libraries I tried, I found many not very useful.
  • Any big project goes through many changes, and you will keep refactoring the code to keep it clean. Because JavaScript is dynamically typed and has no compile step, mistakes such as a misspelled function name or a wrong argument are not reported until that line actually runs, so you may not learn about an error until it happens in production. It is therefore highly desirable to have unit test coverage as high as possible; otherwise code refactoring will be a nightmare for you. High coverage at least makes sure that every line of your code has actually been executed once. Personally, I would prefer a statically typed language over a dynamically typed one for big projects, keeping maintenance in mind.
  • Many times you will want to write sequential logic, where the outcome of the first statement is used in the second, the outcome of the second in the third, and so on. For example:
      i = fun1();
      j = fun2(i);
      k = fun3(j);

If all of the above functions are asynchronous, then your code in Node will look like this:

      fun1(function(i) {
          fun2(i, function(j) {
              fun3(j, function(k) {
                  // do some operation on k
              });
          });
      });
Having this many levels of nested asynchronous blocks does not make your code look good.
Let’s see another example, where one function returns an array and another function has to work on every element of the returned array:
      fun1(function(error, results) {
          var completedTasks = 0;
          for (var i = 0; i < results.length; i++) {
              fun2(results[i], function(j) {
                  // do something with j
                  completedTasks++;
                  // the last callback to finish reports completion
                  if (completedTasks == results.length) {
                      console.log("completed all tasks");
                  }
              });
          }
      });
This kind of sequential logic does not look good in the asynchronous style of coding. There are design patterns that make the implementation of this type of logic look much more decent than this approach. I won’t discuss those design patterns here, but you can use them to make your code look better. Personally, I didn't like some of the code I wrote in the asynchronous way.
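
Just to give a flavour of what such a pattern can look like, here is a minimal sketch, not a recommendation of any particular library. `runInSequence` is a hypothetical helper written for this post, and the `fun1`, `fun2` and `fun3` below are stand-ins that I assume to be asynchronous. Unlike the earlier example, `fun1` takes an input value here so that all steps have the same shape. Error handling is left out to keep it short.

```javascript
// Hypothetical helper: runs callback-style steps one after another,
// passing each step's result to the next one.
function runInSequence(steps, initial, done) {
  var index = 0;
  function next(value) {
    if (index === steps.length) {
      return done(value);
    }
    var step = steps[index++];
    step(value, next);
  }
  next(initial);
}

// Stand-ins for fun1, fun2 and fun3 (assumed to be asynchronous).
function fun1(input, callback) { setImmediate(function () { callback(input + 1); }); }
function fun2(i, callback) { setImmediate(function () { callback(i * 2); }); }
function fun3(j, callback) { setImmediate(function () { callback(j - 3); }); }

// The three steps now read top to bottom instead of nesting.
runInSequence([fun1, fun2, fun3], 1, function (k) {
  console.log("k =", k);
});
```

The nesting is gone because the helper owns the chaining, so adding a fourth step means adding one more entry to the array.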

  • You need to be careful when writing synchronous blocks of code. If synchronous code in your application takes a long time, Node’s single process is busy during that time and cannot serve any other job. That slows down the overall performance of your application. So, it is better to get used to writing asynchronous blocks of code; otherwise the wrong coding style will make your single-process Node application perform slower.
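
One way to keep a long computation from hogging the single process is to split it into slices and give control back between slices. The sketch below is only an illustration under my own assumptions: `sumInChunks` is a hypothetical function, and summing an array stands in for whatever CPU-heavy work your application does. It uses `setImmediate`, which is part of Node, to schedule the next slice.

```javascript
// Hypothetical example: sum a large array in slices so the single
// Node process can run other callbacks between slices.
function sumInChunks(numbers, chunkSize, done) {
  var total = 0;
  var position = 0;

  function processChunk() {
    var end = Math.min(position + chunkSize, numbers.length);
    for (; position < end; position++) {
      total += numbers[position];
    }
    if (position < numbers.length) {
      // Yield to the event loop before continuing with the next slice.
      setImmediate(processChunk);
    } else {
      done(total);
    }
  }

  processChunk();
}

// Build some sample data (assumed, for illustration only).
var data = [];
for (var n = 1; n <= 100000; n++) {
  data.push(n);
}

sumInChunks(data, 1000, function (total) {
  console.log("total:", total);
});
```

The total work is the same, but other requests get a turn after every slice instead of waiting for the whole loop to finish.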

Finally, I would say every technology choice depends on the nature of the problem. I would really prefer Node for applications that require more I/O than CPU-intensive work. A few examples are chat applications and server-push applications. These kinds of applications must support a large number of user connections but mostly perform I/O operations for those requests. In these scenarios, I would prefer Node over other technologies. But for large web applications that require a lot of logic to be written, have many important problems to solve beyond slow I/O, and keep the CPU busy serving user requests, I would prefer other stable web technologies over Node.