Tuesday, January 17, 2012

Server Push/Comet Programming/Long Polling


What is Server Push?
In a normal client-server architecture, the client requests web pages from the server and gets back a response. So the client actually pulls data from the server. In some cases, we may want the server to push data to the browser as soon as it is available, without the client explicitly requesting it.
Examples:
1) In email sites, when a new email arrives in the inbox, the server pushes the new mail to the browser without the user refreshing the page.
2) In social networking sites, new notifications are shown to the user without a page refresh.
3) In chat applications, the chat message is pushed to the targeted user.

In all above scenarios, server pushes the data back to browser without explicit client request. This approach is called Server Push.

In the scenario I am explaining here, the client requests the server to execute a task, and the status of that task is expected to be available in 2-5 minutes. The actual task execution happens in an external system. The web application submits the task to that external system and waits until the external system notifies our application with a status message.
With the Server Push approach, after receiving the client request the server releases the request thread without returning any status, and lets "Server Push" handle pushing the data (the task execution status) to the browser once it is available.
I shall explain the solution to this problem in detail in the sections below.

Why Server Push?
While looking for a solution for this kind of long-running task, I found the 3 approaches below:

Approach 1: One approach is to use the normal client-server request-response cycle. In this approach the client requests the server for a response. After receiving the request, the server creates a thread and keeps it waiting until the response is available. With this traditional approach the execution takes place as below:

step 1: Server creates a thread after receiving the client request.
step 2: Server starts task execution.
step 3: Inside the service() method of the servlet:
do {
    wait for task completion
    waiting....
    waiting....
    waiting...
    still waiting...... :(
    give up if the timeout occurred
} while (task is not completed);
step 4: Server sends the response back to the browser and releases the thread.

This approach keeps one server thread busy for the total task execution time (in this case 2-5 minutes). With many concurrent users, that is not a scalable solution.

Approach 2: Another approach for this kind of long-running task is to refresh the page using a <meta> tag. In this approach, we refresh the page multiple times and keep polling the server for the task execution status. Each new page refresh is done after a certain time interval. But even this approach does not look very scalable. We are opening and closing connections multiple times, and for the duration of each connection one server thread is busy polling the status.

Approach 3 (Server push/Comet Programming/Long polling/Reverse Ajax): In this approach, after receiving the client request the server submits the task for execution and releases the current request thread immediately. It saves the HttpResponse object in its memory. Once the status is available, it uses the HttpResponse object (saved earlier) to send the response back to the browser. On the browser side, JavaScript captures the status message and displays it in the browser. The execution flow is explained below:

step 1: Server creates a thread after receiving the client request.
step 2: Server starts task execution.
step 3: Server returns the thread to the thread pool.
step 4: Once the status is available, server takes one of the threads from the thread pool and commits the status to the HttpResponse object.
step 5: JavaScript uses this status message to display the status text in the browser.
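
The steps of approach 3 can be sketched without any servlet API. This is only an illustration, not the code used in this post: the class LongPollSketch, the methods handleRequest() and onStatus(), and the id "user-42" are made-up names, and a CompletableFuture stands in for the saved HttpResponse object.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class LongPollSketch {
    // Parked "responses", keyed by the unique id of the waiting client.
    private static final Map<String, CompletableFuture<String>> PENDING =
            new ConcurrentHashMap<>();

    // Steps 1-3: park the pending response and return at once,
    // so the request thread can go back to the pool.
    public static CompletableFuture<String> handleRequest(String uniqueId) {
        CompletableFuture<String> parked = new CompletableFuture<>();
        PENDING.put(uniqueId, parked);
        return parked;
    }

    // Step 4: called when the external system reports a status.
    // Returns false if no client is waiting under that id.
    public static boolean onStatus(String uniqueId, String status) {
        CompletableFuture<String> parked = PENDING.remove(uniqueId);
        return parked != null && parked.complete(status);
    }

    public static void main(String[] args) {
        CompletableFuture<String> parked = handleRequest("user-42");
        System.out.println("request thread released, done = " + parked.isDone());
        onStatus("user-42", "SUCCESS");
        // Step 5: the browser side would now display the status.
        System.out.println("pushed status = " + parked.join());
    }
}
```

No thread is blocked between handleRequest() and onStatus(); only a small map entry is kept per waiting client.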

Looking at the execution steps of all 3 approaches discussed above, approach 3 seems the most scalable, because no server thread is held while the task is running.


JBoss support for Server Push
I used JBoss 6 as the application server for this feature. JBoss Web can use the Apache Portable Runtime (APR) to provide superior scalability, performance, and better integration with native server technologies. APR is a highly portable library and it also provides advanced I/O functionality. Server push can be done using APR connections, because they support asynchronous, non-blocking I/O (typically responding to an event raised from some other source).
JBoss Web has the org.jboss.servlet.http.HttpEventServlet interface, which can be used to support server push functionality. The server push servlet implements HttpEventServlet and overrides the event() method of this interface. This interface allows servlets to process I/O asynchronously. For such a servlet, the event() method is invoked instead of the service() method of HttpServlet.
The following event types exist:
-       EventType.BEGIN - will be called at the beginning of the processing of the connection.
-       EventType.READ - indicates that input data is available.
-       EventType.END - called to end the processing of the request. After this event has been processed, the request and response objects, as well as all their dependent objects will be recycled.
-       EventType.EOF - indicates that the end of the input has been reached and no further data is available (relevant, for example, with chunked data transfer).
-       EventType.ERROR - called in case of an I/O error. Request and response objects, as well as all their dependent objects, will be recycled and used to process other requests.
-       EventType.TIMEOUT - the connection timed out according to the timeout value which has been set. This timeout value is set in the JBoss configuration (in jbossweb/server.xml) and can be overridden inside the event() method.

The typical lifecycle of a request consists of a series of events such as:
BEGIN -> READ -> READ -> READ -> TIMEOUT -> END.

In our case, the lifecycle will be:
BEGIN -> "Push Data to Browser" -> END
        Or,
BEGIN -> TIMEOUT -> BEGIN -> TIMEOUT -> BEGIN -> TIMEOUT ... until data is available to push back to the browser.

For details about APR connections and HttpEventServlet, you can go through the JBoss documentation below:
http://docs.jboss.org/jbossweb/3.0.x/apr.html
http://docs.jboss.org/jbossweb/3.0.x/aio.html



Work-flow explanation






Step # 1 – Client connects to Server
-        Client JavaScript makes an asynchronous HTTP request to the server once the user requests any long-running task.
-        Server stores the HttpResponse object (key = any unique id that identifies the user) in application memory, but releases the thread immediately.
Step # 2 – Task execution status is received from external source
-        When the status is received from the external source, the server creates a new thread.
-        This thread retrieves the status details from the external source.
-        Then this thread retrieves the stored HttpResponse object from application memory, using the "unique id" as key, and pushes the status data to the response object.
-        After pushing the data, the connection is closed.
     
Step # 3 – Client displays data
-        Client displays the data to the user inside the correct "div" in the web page.
-        Client re-connects to the server if any more task statuses are pending, and the loop repeats.

Note - The HttpResponse objects need to be kept in memory that is shared across the whole application, because this store is accessed both by browser requests and by requests from the external source. We can use a cache or some other kind of storage for this purpose. Depending on the complexity of your requirement, you can choose a different kind of component for storing HttpResponse objects.
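
As an illustration of such a store, here is a minimal sketch built on ConcurrentHashMap. The class name ResponseStore is made up; the method names simply mirror the memoryManager calls used in the pseudo code below, and the real component may look quite different. The type parameter T would be HttpServletResponse in the servlet; a String is used in main() only to keep the sketch runnable on its own.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory store shared by browser requests and
// by the callback from the external source.
public class ResponseStore<T> {
    private final ConcurrentMap<String, T> responses = new ConcurrentHashMap<>();

    // Parks the response object under the client's unique id.
    public void addToMemory(String uniqueId, T response) {
        responses.put(uniqueId, response);
    }

    // Returns the parked object, or null if nothing is stored for that id.
    public T getFromMemory(String uniqueId) {
        return responses.get(uniqueId);
    }

    // Removes and returns the parked object in one atomic step,
    // or returns null if it was already removed.
    public T removeFromMemory(String uniqueId) {
        return responses.remove(uniqueId);
    }

    public static void main(String[] args) {
        ResponseStore<String> store = new ResponseStore<>();
        store.addToMemory("user-42", "response-object");
        System.out.println(store.removeFromMemory("user-42"));
        System.out.println(store.removeFromMemory("user-42"));
    }
}
```

Because removal is atomic, a TIMEOUT event and a status push that arrive at the same moment cannot both obtain the same response object; whichever calls removeFromMemory() second gets null.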

Pseudo code with explanation

Step 1: JBoss configuration
Add the entry below to the jbossweb.sar/server.xml file to enable the APR protocol:
<Connector connectionTimeout="20000" port="8484" protocol="org.apache.coyote.http11.Http11AprProtocol"
enableLookups="false" address="${jboss.bind.address}" redirectPort="${jboss.web.https.port}" />
With the above configuration, every request that comes in through port 8484 will use an APR connection. The connection timeout will be 20000 milliseconds. This timeout value can be overridden programmatically inside the HttpEventServlet implementation.
After this, the Tomcat Native package needs to be installed on the server. The installation details are available at http://docs.jboss.org/jbossweb/3.0.x/apr.html.



Step 2: Creating Http APR Connection:
From the browser, JavaScript opens the APR connection to the server. Below is the JavaScript function which does this:
openConnection: function () {
       $.ajax({
           type: "GET",
           dataType: 'jsonp',
           async: true,
           jsonp: 'jsonp_callback',
           url: '/myApp/taskStatusServlet'
       });
   }
In this function, I have set the data type to 'jsonp'. Use this data type if you are using port 8484 for the APR connection and some other port for normal HTTP connections. Because of the same-origin policy, browsers do not allow a normal Ajax request to fetch data from a different origin, and a different port counts as a different origin. JSONP is a solution to this problem.

After this, we need to write a servlet which implements HttpEventServlet. This servlet handles all the APR connection requests.

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

import org.jboss.servlet.http.HttpEvent;
import org.jboss.servlet.http.HttpEventServlet;

// Note: Logger and memoryManager are application-specific helpers and are not shown here.
public class TaskStatusServlet extends HttpServlet implements HttpEventServlet {

    /**
     * Process the given event.
     *
     * @param event
     *            The event that will be processed
     * @throws IOException
     * @throws ServletException
     */
    public void event(HttpEvent event) throws IOException, ServletException {
        HttpServletRequest request = event.getHttpServletRequest();
        HttpServletResponse response = event.getHttpServletResponse();
        HttpSession session = request.getSession(false);
        // If the session has expired, return from here
        if (null == session) {
            event.close();
            return;
        }

        String sessionId = session.getId();
        // get the unique id from the session; it is needed by both BEGIN and TIMEOUT
        String uniqueId = (String) session.getAttribute("uniqueId");

        switch (event.getType()) {
        case BEGIN:
            Logger.info("BEGIN for session: " + sessionId, this.getClass());
            // store the response object in application memory
            memoryManager.addToMemory(uniqueId, response);
            event.setTimeout(120000); // set the connection timeout to 2 minutes
            break;
        case ERROR:
            Logger.info("ERROR for session: " + sessionId, this.getClass());
            event.close();
            break;
        case END:
            Logger.info("END for session: " + sessionId, this.getClass());
            event.close();
            break;
        case EOF:
            Logger.info("EOF for session: " + sessionId, this.getClass());
            event.close();
            break;
        case TIMEOUT:
            Logger.info("TIMEOUT for session: " + sessionId, this.getClass());
            // remove the unique id and HttpResponse object from application memory
            memoryManager.removeFromMemory(uniqueId);
            response.setContentType("text/javascript");
            PrintWriter writer = response.getWriter();
            // call the JavaScript function again to open another long polling (APR) connection
            writer.println("openConnection()");
            writer.flush();
            writer.close();
            event.close();
            break;
        case READ:
            Logger.info("READ for session: " + sessionId, this.getClass());
            // This event will never occur in our scenario
            /*
             * InputStream is = request.getInputStream(); byte[] buf = new
             * byte[512]; while (is.available() > 0) { int n = is.read(buf);
             * //can throw an IOException if (n > 0) { log("Read " + n +
             * " bytes: " + new String(buf, 0, n) + " for session: " +
             * sessionId); } else { //error(event, request, response); return; }
             * }
             */
            break;
        default:
            // other event types are not used in this scenario
            break;
        }
    }

}

Step 3: Receiving response from external source
For this component, we exposed a web service that is called by the external source to push the task execution status.

// look up the response object stored for this client (null if the connection already timed out)
HttpServletResponse res = memoryManager.getFromMemory(uniqueId);
if (res != null) {
    // send the response back to the browser
    res.setContentType("text/javascript");
    try {
        PrintWriter writer = res.getWriter();
        // calls the JavaScript function that updates the status in the browser
        writer.println("updateStatus('" + status + "')");
        writer.flush();
        writer.close();
    } catch (IOException ioe) {
        Logger.error("Exception while ..", ioe, this.getClass());
    }
}

// remove the HTTP response object from application memory after sending the response back to the browser
memoryManager.removeFromMemory(uniqueId);
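
One thing to watch in the snippet above: the status text is placed directly inside a JavaScript string, so a quote or line break in the status would break the generated call. Below is a minimal sketch of escaping the value first. JsCallBuilder, escapeForJs() and buildCall() are made-up names, and the escaping covers only the most common characters; it is not a complete JavaScript escaper.

```java
public class JsCallBuilder {
    // Escapes a value so it can sit inside a single-quoted JavaScript string.
    // Only backslash, single quote, and line breaks are handled here.
    public static String escapeForJs(String value) {
        StringBuilder out = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '\'': out.append("\\'"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Builds the JavaScript call that is written to the response.
    public static String buildCall(String functionName, String argument) {
        return functionName + "('" + escapeForJs(argument) + "')";
    }

    public static void main(String[] args) {
        System.out.println(buildCall("updateStatus", "Task 'import' finished"));
    }
}
```

With such a helper, the println call could become writer.println(JsCallBuilder.buildCall("updateStatus", status)).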


Step 4: Displaying status message in browser
In the above step we invoked the JavaScript function updateStatus() to display data in the browser. I am not giving a code snippet for this JavaScript function, as it depends completely on the HTML structure of your web page. You only need to select the correct <div> and add the message passed from the server side inside this div.

Friday, January 13, 2012

Utility to generate POJO classes for Json req/resp

JSON is widely used as the request/response type in web services. We normally create POJO classes manually to hold the request/response JSON data.
Attached is a utility class that generates VO classes (POJO classes) from a sample JSON request/response. If it works as expected, it may save us from manually creating multiple classes for a single request/response.
This code assumes that the attributes of those classes can be of the 8 types below:
1) String
2) Long
3) Double
4) Boolean
5) Object of custom classes
6) List of strings
7) List of numbers
8) List of custom objects
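
To make the list above concrete, here is a rough sketch of how a value from the sample JSON could be mapped to one of these types. It is not the code from the attached utility: JsonTypeSketch and javaTypeFor() are made-up names, and the sketch assumes the JSON has already been parsed into plain Java objects (String, Number, Boolean, Map, List).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class JsonTypeSketch {
    // Maps an already-parsed JSON value to the type of the generated field.
    // fieldName is used to name the class generated for a nested object.
    public static String javaTypeFor(String fieldName, Object value) {
        if (value == null) {
            throw new IllegalArgumentException("Cannot infer a type for null: " + fieldName);
        }
        if (value instanceof String) return "String";
        if (value instanceof Boolean) return "Boolean";
        if (value instanceof Integer || value instanceof Long) return "Long";
        if (value instanceof Number) return "Double";
        if (value instanceof Map) {
            // nested object: a custom class named after the field
            return Character.toUpperCase(fieldName.charAt(0)) + fieldName.substring(1);
        }
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            if (list.isEmpty()) {
                throw new IllegalArgumentException("Cannot infer element type of empty list: " + fieldName);
            }
            // the element type is taken from the first element
            return "List<" + javaTypeFor(fieldName, list.get(0)) + ">";
        }
        throw new IllegalArgumentException("Unsupported value for " + fieldName);
    }

    public static void main(String[] args) {
        Map<String, Object> sample = new LinkedHashMap<>();
        sample.put("name", "Alice");
        sample.put("age", 30L);
        sample.put("tags", Arrays.asList("a", "b"));
        sample.put("address", new LinkedHashMap<String, Object>());
        for (Map.Entry<String, Object> e : sample.entrySet()) {
            System.out.println(e.getKey() + " -> " + javaTypeFor(e.getKey(), e.getValue()));
        }
    }
}
```

A null value is rejected because no type can be inferred from it.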

Steps to use this utility:
1) Unzip the zip file (the link to this zip file is given at the end of this blog) to any location on your machine.
2) Create a folder named "output" inside the root folder (the same directory where the class file is present). This is a one-time activity. If the "output" folder already exists, delete all the files in it.
3) Copy the sample JSON request/response and paste it into the json.txt file.
4) Open the json.bat file in edit mode and add the parent class file name. For example, if you want to generate the class with the file name WebServiceInput, then the content of this bat file should be
java JsonGenerator WebServiceInput
5) Run the json.bat file; it will generate all the VOs inside the output folder.
6) Copy all the VOs into the correct package folder and add the package statement to all the class files.
7) This utility writes the contents of each POJO class on one single line (without any formatting); it can later be formatted using Eclipse or any other tool.

I have also included the source code in the zip file, so that you can change the code to suit your needs.

Note -
1) This utility may give an error when run with wrongly formatted JSON. Please make sure that the JSON format is correct. It will also give an error if any key-value pair has null as its value, since with "null" as the value we can't figure out the data type of the key.
2) I am converting JSON property names to camel case to follow the Java naming convention. For example, for the element "name", the POJO class will have an attribute named "name" (same as the JSON element), but for the JSON element "Name", the generated class will also have an attribute named "name" (changed to camel case). But most mappers require the JSON element name and the POJO class attribute name to be exactly the same. In those cases, you may need either to remove the camel case conversion logic from the utility source code, or to add some kind of mapping in the generated POJO class. The Jackson mapper provides annotations to do this kind of mapping.
@JsonProperty("Name")
private String name;
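
The name conversion and the annotation can be combined when the field is generated. The sketch below is not taken from the attached utility: PropertyNames, toFieldName() and fieldDeclaration() are made-up names, and it assumes the conversion only lower-cases the first character.

```java
public class PropertyNames {
    // Lower-cases the first character: "Name" -> "name", "name" -> "name".
    public static String toFieldName(String jsonName) {
        if (jsonName.isEmpty()) {
            return jsonName;
        }
        return Character.toLowerCase(jsonName.charAt(0)) + jsonName.substring(1);
    }

    // Emits the field declaration, adding @JsonProperty only when the
    // Java field name differs from the JSON element name.
    public static String fieldDeclaration(String jsonName, String javaType) {
        String fieldName = toFieldName(jsonName);
        String annotation = fieldName.equals(jsonName)
                ? ""
                : "@JsonProperty(\"" + jsonName + "\") ";
        return annotation + "private " + javaType + " " + fieldName + ";";
    }

    public static void main(String[] args) {
        System.out.println(fieldDeclaration("Name", "String"));
        System.out.println(fieldDeclaration("name", "String"));
    }
}
```

This way the generated class keeps Java-style field names while a mapper such as Jackson can still match the original JSON element names.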





Please try this utility and let me know about any issues you face while using it.

Link of zip file: https://docs.google.com/open?id=0B8O-miA80x0gOWNTckRsZG5tWnM