Peer-Reviewed Journal Details
Mandatory Fields
Asad, Z; Chaudhry, MAR; Malone, D
2016
May
IEEE Journal on Selected Areas in Communications
Greener Data Exchange in the Cloud: A Coding-Based Optimization for Big Data Processing
Published
Optional Fields
NETWORK PERSPECTIVE INFORMATION ETHERNET
34
1360
1377
The rise of the cloud and distributed data-intensive (big data) applications puts pressure on data center networks due to the movement of massive volumes of data. Reducing the volume of communication is pivotal for embracing greener data exchange by efficient utilization of network resources. This paper proposes the use of a mixing technique, spate coding, working in tandem with software-defined network control as a means of dynamically controlled reduction in the volume of communication. We introduce motivating real-world use cases, and present a novel spate coding algorithm for data center networks. We also analyze the computational complexity of the general problem of minimizing the volume of communication in a distributed data center application without degrading the rate of information exchange, and provide theoretical limits of such schemes. Moreover, we proceed to bridge the gap between theory and practice by performing a proof-of-concept implementation of the proposed system in a real-world data center. We use Hadoop MapReduce, the most widely used big data processing framework, as our target. The experimental results employing two industry-standard benchmarks show the advantage of our proposed system compared to a vanilla Hadoop implementation, an in-network combiner, and Combine-N-Code. The proposed coding-based scheme shows performance improvements in terms of volume of communication (up to 62%), goodput (up to 76%), disk utilization (up to 38%), and the number of bits that can be transmitted per Joule of energy (up to 200%).
PISCATAWAY
0733-8716
10.1109/JSAC.2016.2520245
Grant Details