Building a Transparent Batching Layer for Storm

Abstract: Storm is a distributed intra-node-parallel stream processing system built for very low-latency processing. One major drawback of Storm is its relatively low throughput. To address this, we designed a batching layer that improves Storm's throughput significantly. To achieve high user acceptance, we did not modify Storm but built the batching layer "on top" of it. The layer is transparent to the Storm system as well as to the user code, i.e., the user-defined functions. Thus, existing Storm programs (so-called topologies) can benefit from our batching layer without modification. In this document, we describe the design of the batching layer and provide insight into some implementation details.

41 pages

  • External Posting Date: July 6, 2014 [Fulltext]. Approved for External Publication
  • Internal Posting Date: July 6, 2014 [Fulltext]
