CoralQueue Performance Numbers

In this article we present CoralQueue performance numbers for four benchmarks: message-sender latency, message-transit latency, message-sender throughput and message-transit throughput. All benchmarks use the standard setup of one producer (message sender) and one consumer (message receiver), in two configurations: producer and consumer pinned to the same physical core (hyper-threading) and producer and consumer pinned to different physical cores (no hyper-threading).

Message-sender Latencies

In this test we measure the time it takes for the message-sender (i.e. the producer) to dispatch a message. In other words, we do not care how long the message takes to reach the consumer, only how long the producer takes to hand it off to the queue. We use the AtomicLong lazySet operation to further reduce the message-sender latency, even though it increases the message-transit latency. The benchmark source code can be seen here.
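
To illustrate the trade-off, here is a minimal sketch of a producer publishing its sequence with lazySet instead of a plain volatile set. This is not CoralQueue's actual code; the class and field names are hypothetical.

    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical producer-side publication sketch, not CoralQueue's internals.
    public class LazyPublishSketch {

        private final AtomicLong offerSequence = new AtomicLong(-1);

        public void publish(long seq) {
            // ... the message would already have been written into the slot for 'seq' ...

            // lazySet performs an ordered store without the full fence of a volatile
            // write, so the producer returns sooner (lower message-sender latency),
            // but the consumer may observe the new sequence slightly later
            // (higher message-transit latency).
            offerSequence.lazySet(seq);

            // A plain volatile store would make the value visible sooner, at the
            // cost of a more expensive fence on the producer side:
            // offerSequence.set(seq);
        }

        public long lastPublished() {
            return offerSequence.get(); // what the consumer spins on
        }
    }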

  • With hyper-threading:
    Messages: 1,100,000
    Avg Time: 14.74 nanos
    Min Time: 7.0 nanos
    Max Time: 5.331 micros
    75% = [avg: 13.0 nanos, max: 15.0 nanos]
    90% = [avg: 13.0 nanos, max: 16.0 nanos]
    99% = [avg: 13.0 nanos, max: 29.0 nanos]
    99.9% = [avg: 14.0 nanos, max: 304.0 nanos]
    99.99% = [avg: 14.0 nanos, max: 366.0 nanos]
    99.999% = [avg: 14.0 nanos, max: 610.0 nanos]
    
  • Without hyper-threading:
    Messages: 1,100,000
    Avg Time: 29.54 nanos
    Min Time: 6.0 nanos
    Max Time: 4.963 micros
    75% = [avg: 26.0 nanos, max: 28.0 nanos]
    90% = [avg: 26.0 nanos, max: 29.0 nanos]
    99% = [avg: 28.0 nanos, max: 132.0 nanos]
    99.9% = [avg: 29.0 nanos, max: 226.0 nanos]
    99.99% = [avg: 29.0 nanos, max: 287.0 nanos]
    99.999% = [avg: 29.0 nanos, max: 1.0 micros]
    

Message Transit Latencies

In this test we measure the total transit time of a message, in other words, the time between the producer dispatching the message and the consumer receiving it. The AtomicLong lazySet operation is not used here because we want the consumer to be notified of new messages as soon as possible. The benchmark source code can be seen here.
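
A minimal sketch of how such a transit-latency measurement can be done, assuming a generic blocking queue as a stand-in for CoralQueue and ignoring warm-up, pinning and percentile bookkeeping: the producer stamps System.nanoTime() into each message and the consumer subtracts that timestamp on receipt.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Transit-latency measurement sketch; ArrayBlockingQueue stands in for the
    // queue under test.
    public class TransitLatencySketch {

        public static void main(String[] args) throws InterruptedException {
            final int messages = 1_000_000;
            final BlockingQueue<Long> queue = new ArrayBlockingQueue<>(1024);

            Thread consumer = new Thread(() -> {
                long totalNanos = 0;
                try {
                    for (int i = 0; i < messages; i++) {
                        long sentAt = queue.take();               // blocks until a message arrives
                        totalNanos += System.nanoTime() - sentAt; // dispatch-to-receive time
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                System.out.println("Avg transit: " + (totalNanos / (double) messages) + " nanos");
            });
            consumer.start();

            for (int i = 0; i < messages; i++) {
                queue.put(System.nanoTime()); // timestamp taken right before dispatch
            }
            consumer.join();
        }
    }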

  • With hyper-threading:
    Messages: 10,000,000
    Avg Time: 52.97 nanos
    Min Time: 32.0 nanos
    Max Time: 9.052 micros
    75% = [avg: 51.0 nanos, max: 56.0 nanos]
    90% = [avg: 52.0 nanos, max: 58.0 nanos]
    99% = [avg: 52.0 nanos, max: 61.0 nanos]
    99.9% = [avg: 52.0 nanos, max: 66.0 nanos]
    99.99% = [avg: 52.0 nanos, max: 287.0 nanos]
    99.999% = [avg: 52.0 nanos, max: 1.27 micros]
    
  • Without hyper-threading:
    Messages: 10,000,000
    Avg Time: 88.18 nanos
    Min Time: 64.0 nanos
    Max Time: 5.961 micros
    75% = [avg: 84.0 nanos, max: 94.0 nanos]
    90% = [avg: 86.0 nanos, max: 98.0 nanos]
    99% = [avg: 87.0 nanos, max: 109.0 nanos]
    99.9% = [avg: 88.0 nanos, max: 134.0 nanos]
    99.99% = [avg: 88.0 nanos, max: 236.0 nanos]
    99.999% = [avg: 88.0 nanos, max: 1.198 micros]
    

Message-sender Throughput

In this test the producer dispatches messages as fast as it can, so we can measure how long the message-sender thread (i.e. the producer) takes to hand off a large batch of messages; from that we calculate the messages-per-second rate. We make 20 passes and compute the average. As in the message-sender latency test, the AtomicLong lazySet operation is used to reduce latency on the producer side. The benchmark source code can be seen here.
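
The shape of one such throughput pass can be sketched as follows. The dispatch callback is a placeholder for the queue's actual send operation, and warm-up, pinning and garbage avoidance are ignored; only the timing and the messages-per-second arithmetic are meant to mirror the benchmark.

    import java.util.function.LongConsumer;

    // Sketch of a message-sender throughput measurement: time how long the
    // producer takes to hand off N messages, then derive messages per second.
    public class SenderThroughputSketch {

        static long measurePass(int messages, LongConsumer dispatch) {
            long start = System.nanoTime();
            for (long i = 0; i < messages; i++) {
                dispatch.accept(i); // producer-side hand-off only; no consumer involved
            }
            return System.nanoTime() - start;
        }

        public static void main(String[] args) {
            final int messages = 10_000_000;
            final int passes = 20;

            long totalNanos = 0;
            for (int p = 0; p < passes; p++) {
                totalNanos += measurePass(messages, seq -> { /* dispatch into the queue here */ });
            }

            double avgNanos = totalNanos / (double) passes;
            double messagesPerSecond = messages * 1_000_000_000.0 / avgNanos;
            System.out.printf("Avg time: %.3f millis | Messages per second: %,.0f%n",
                    avgNanos / 1_000_000.0, messagesPerSecond);
        }
    }

For example, with the hyper-threaded average of 102,702,560 nanos for 10,000,000 messages, the rate works out to 10,000,000 / 0.10270256 s, or roughly 97.4 million messages per second, matching the figure reported below.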

  • With hyper-threading:
    Passes: 20 | Avg Time: 102.703 millis | Min Time: 102.32 millis | Max Time: 109.105 millis
    Average time to send 10,000,000 messages: 102,702,560 nanos
    Messages per second: 97,368,556
    
  • Without hyper-threading:
    Passes: 20 | Avg Time: 139.555 millis | Min Time: 137.649 millis | Max Time: 145.542 millis
    Average time to send 10,000,000 messages: 139,555,424 nanos
    Messages per second: 71,656,118
    

Message Transit Throughput

In this test we measure the maximum number of messages that can travel from producer to consumer in one second. We make 20 passes and compute the average. As in the transit-latency test, the AtomicLong lazySet operation is not used because we want the consumer to be notified of new messages as soon as possible. The benchmark source code can be seen here.
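
A rough sketch of one transit-throughput pass, again using a generic blocking queue as a stand-in for CoralQueue: the clock runs from the first dispatch until the consumer has taken the last message, so the rate reflects the full producer-to-consumer path.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Transit-throughput measurement sketch; ArrayBlockingQueue stands in for
    // the queue under test.
    public class TransitThroughputSketch {

        public static void main(String[] args) throws InterruptedException {
            final int messages = 10_000_000;
            final BlockingQueue<Long> queue = new ArrayBlockingQueue<>(1024);

            Thread consumer = new Thread(() -> {
                try {
                    for (int i = 0; i < messages; i++) {
                        queue.take(); // drain everything the producer sends
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            long start = System.nanoTime();
            consumer.start();
            for (long i = 0; i < messages; i++) {
                queue.put(i);
            }
            consumer.join(); // stop the clock only after the last message is consumed
            long elapsed = System.nanoTime() - start;

            System.out.printf("Messages per second: %,.0f%n",
                    messages * 1_000_000_000.0 / elapsed);
        }
    }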

  • With hyper-threading:
    Passes: 20 | Avg Time: 117.343 millis | Min Time: 117.273 millis | Max Time: 117.435 millis
    Average time to send 10,000,000 messages: 117,342,956 nanos
    Messages per second: 85,220,283
    
  • Without hyper-threading:
    Passes: 20 | Avg Time: 234.781 millis | Min Time: 233.626 millis | Max Time: 236.014 millis
    Average time to send 10,000,000 messages: 234,781,092 nanos
    Messages per second: 42,592,867