Category BigData

HBase Architecture

After working on HBase for the past year and a half, I decided to share my understanding. In this blog post I will try to describe the high-level functioning of HBase and the different components involved. HBase – The Basics: HBase is an open-source, NoSQL, distributed, column-oriented data store modeled on Google's Bigtable […]

Implementing Partitioners and Combiners for MapReduce

Partitioners and Combiners in MapReduce: Partitioners are responsible for dividing up the intermediate key space and assigning intermediate key-value pairs to reducers. In other words, the partitioner specifies the task to which an intermediate key-value pair must be copied. Within each reducer, keys are processed in sorted order. Combiners are an optimization in MapReduce that […]
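
To make the partitioner's role concrete, here is a minimal sketch in plain Python rather than the Hadoop API. It assumes a simple hash-modulo scheme; the names `partition` and `shuffle` are hypothetical and chosen only for illustration. Each intermediate key-value pair is assigned to one reducer, and each reducer then sees its keys in sorted order.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    """Pick the reducer for a key (assumed scheme: stable hash modulo reducer count)."""
    # crc32 is used so the same key always maps to the same reducer across runs.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(pairs, num_reducers: int):
    """Group intermediate (key, value) pairs per reducer, with keys sorted."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)][key].append(value)
    # Within each reducer, keys are processed in sorted order.
    return [sorted(bucket.items()) for bucket in buckets]


if __name__ == "__main__":
    intermediate = [("banana", 1), ("apple", 1), ("banana", 1), ("cherry", 1)]
    for reducer_id, items in enumerate(shuffle(intermediate, num_reducers=2)):
        print(reducer_id, items)
```

Because the assignment depends only on the key, all values for a given key end up at the same reducer, which is what lets the reducer aggregate them.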