WebSockets vs. Regular Sockets

In a recent blog post we benchmarked WebSockets vs. regular old HTTP requests. Today we move in a different direction on the network stack and benchmark WebSockets vs. regular old TCP sockets, all on the JVM.

This may all seem quite academic; after all, WebSockets were made for browser-to-server communication, where you have no access to raw TCP sockets anyway. On the other hand, a few non-browser WebSocket client libraries are popping up, and you may be tempted to use one of them for server-to-server communication if you already have a WebSocket interface running for browser clients.

This is not a scientific experiment – so take it with a grain of salt – but it should give you some idea of when to use WebSockets vs. TCP when you have a choice, at least when working with Play!/Netty.

We are interested in processing overhead, not network latency, so all experiments were performed on a single machine, with client and server connected through the local loopback interface. The machine in question was a 2.7 GHz MacBook Pro.

  • For the TCP benchmark we wrote a simple client and server building directly on java.net sockets. The client repeatedly sends the string “Ping” to the server, which replies with the string “Pong”. (A sketch of what the server could look like follows this list.)
  • For the WebSockets benchmark we used the Play! framework on the server – specifically the built-in asynchronous WebSockets handler, also sketched below – while the client was based on this gist: https://gist.github.com/casualjim/1819496 (which uses Netty). Just as with TCP, the client sends the string “Ping” to the server, which replies with the string “Pong”.
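
For reference, here is a minimal sketch of what such a java.net ping/pong server could look like. The PongServer name and the one-thread-per-connection structure are our illustration, not necessarily the exact benchmark code:

import java.io.{BufferedReader, InputStreamReader, PrintWriter}
import java.net.ServerSocket

// Minimal blocking ping/pong server: one thread per connection,
// replying "Pong" to every line it receives.
object PongServer extends App {
  val server = new ServerSocket(1201)
  while (true) {
    val socket = server.accept()
    new Thread(new Runnable {
      def run(): Unit = {
        val in  = new BufferedReader(new InputStreamReader(socket.getInputStream))
        val out = new PrintWriter(socket.getOutputStream, true) // autoflush on println
        var line = in.readLine()
        while (line != null) {
          out.println("Pong")
          line = in.readLine()
        }
        socket.close()
      }
    }).start()
  }
}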

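On the Play! side, a handler along these lines would do the job. This is a sketch against the Play 2.x iteratee-based WebSocket API; the object and action names are ours:

import play.api.mvc._
import play.api.libs.iteratee._

object PingPongController extends Controller {
  // For every frame that arrives on the socket, push "Pong" back out.
  def ws = WebSocket.using[String] { request =>
    val (out, channel) = Concurrent.broadcast[String]
    val in = Iteratee.foreach[String] { _ => channel.push("Pong") }
    (in, out)
  }
}
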
For the actual benchmark the client sends a “Ping” to the server many millions of times, split over 50 concurrent connections (for both TCP and WebSockets), and we compute the average time it takes for the response to arrive. In both cases the main “loop” in the client, which computes the actual benchmark times, looks something like this:

import scala.concurrent._
import scala.concurrent.ExecutionContext.Implicits.global

val totalPings = 1000000
val concurrentConnections = 50
val perConnection: Int = totalPings / concurrentConnections
val actualTotalPings: Int = perConnection * concurrentConnections

val t0 = System.currentTimeMillis().toDouble

val futures = (0 until concurrentConnections).map { i =>
  future {
    // this opens one connection and sends 'perConnection' pings
    ping("localhost", "1201", perConnection)
  }
}

Future.sequence(futures).onSuccess {
  case _ =>
    val t1 = System.currentTimeMillis().toDouble
    val speed = (t1 - t0) / actualTotalPings // milliseconds per round trip
    println(speed * 1000) // print microseconds per round trip
}
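
The ping function referenced above is not shown in the original; for the TCP case it could look roughly like this (a hypothetical helper that opens one blocking socket and performs count round trips):

import java.io.{BufferedReader, InputStreamReader, PrintWriter}
import java.net.Socket

// Hypothetical helper: opens one connection and performs 'count'
// ping/pong round trips, blocking on each reply.
def ping(host: String, port: String, count: Int): Unit = {
  val socket = new Socket(host, port.toInt)
  val out = new PrintWriter(socket.getOutputStream, true) // autoflush on println
  val in  = new BufferedReader(new InputStreamReader(socket.getInputStream))
  try {
    for (_ <- 0 until count) {
      out.println("Ping") // send the request line
      in.readLine()       // block until the "Pong" reply arrives
    }
  } finally {
    socket.close()
  }
}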

WebSockets performs quite well, with an average round-trip time of about 20 microseconds (0.02 milliseconds), but straight TCP still beats it handily, with an average round-trip time of about 2 microseconds (0.002 milliseconds) – an order of magnitude less.

A few other observations:

  • Memory footprint was also significantly lower for TCP (tens of MBs vs. hundreds of MBs on the server). This is likely mostly due to the (excellent!) Play Framework, which does a lot more than just handle WebSockets, so the comparison isn’t entirely fair.
  • TCP latency was not significantly affected as the number of concurrently active connections rose to several hundred. Beyond that it began to suffer a little, going up to about 4 microseconds. As for WebSockets, the underlying Netty code spawns a lot of threads, and we could not test with more than about 100 concurrently active connections.
  • With the 50 concurrent connections used for the benchmark, neither the TCP nor the WebSockets version maxed out the 4 available physical cores, although CPU utilization was fairly high (slightly higher for WebSockets).

Thus, in most cases it’s probably not a good idea to use WebSockets instead of direct TCP, since you’ll get about 10 times higher message throughput with the latter (note that this is not the same as data throughput).
The only scenarios where it probably doesn’t matter are when very few messages are exchanged, or when per-message computation far dominates the processing time.
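
To put that in concrete terms: at an average of 2 microseconds per round trip, the 50 connections together complete roughly 500,000 pings per second (1 / 0.000002 s), versus roughly 50,000 per second at 20 microseconds.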

That’s it for now. Stay tuned for a future post where we go further down the rabbit hole and benchmark the JVM TCP stack versus some other languages and runtimes.

We wrote this post while working on Kifi — tools that help people outsmart information overload together.

3 comments
andreagosto

I agree with nowucca: your article is pretty interesting, but you'd need to provide a more balanced version of the test to draw a fair conclusion! I'll wait for it; I'm interested.

nowucca

What this is doing is testing the efficiency of the Play/Netty call stack vs. a raw Java socket call stack. If you implemented a thin client and a small WebSocket handshaking server on top of Java sockets then you would be closer to the mark. Drawing a general conclusion about WebSockets ("it’s probably not a good idea to use WebSockets instead of direct TCP") is not really justified with this approach - all we can really say with the approach here is that Play's use of Netty has overhead that chews up threads and throughput.

Stephen Kemmerling

You are right, this is mostly a comparison of Play!/Netty WebSockets vs. TCP. I mentioned this at the beginning.

I'm sure you could narrow the performance gap with a very thin custom WebSockets implementation, but I would argue that if you are doing this purely for inter-server communication you might as well write a custom protocol on top of TCP.