Apache Flume is a distributed, fault-tolerant data collection framework for Apache Hadoop and Apache HBase. It is designed to move massive volumes of event data into HDFS or HBase in a highly scalable way. Flume is configured declaratively, and it can easily be deployed to a large number of machines using configuration management systems like Puppet or Cloudera Manager.
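
To illustrate what declarative configuration means here, the following is a minimal sketch of a Flume agent properties file that wires one source through one channel into an HDFS sink. The agent and component names (agent1, src1, ch1, sink1), the port, and the HDFS path are assumptions chosen for illustration only; they do not come from any particular deployment.

```properties
# Hypothetical agent named "agent1" with one source, one channel, one sink.
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1

# Source: listens on a TCP port and turns each line of text into an event.
agent1.sources.src1.type = netcat
agent1.sources.src1.bind = localhost
agent1.sources.src1.port = 44444
agent1.sources.src1.channels = ch1

# Channel: buffers events in memory between the source and the sink.
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 1000
agent1.channels.ch1.transactionCapacity = 100

# Sink: writes events to HDFS (the path below is a placeholder).
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://namenode.example.com/flume/events
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.channel = ch1
```

Assuming the file is saved as example.conf, an agent could be started with something like `bin/flume-ng agent --conf conf --conf-file example.conf --name agent1`. Because the whole pipeline lives in a single text file, a configuration management system can distribute the same file to many machines.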