Friday, 12 April 2013

Streaming Data Access | Hadoop Tutorial pdf

Applications that run on HDFS need streaming access to their data sets. They are not the general-purpose applications that typically run on general-purpose file systems. HDFS is designed for batch processing rather than interactive use. The emphasis is on high throughput of data access rather than low latency. POSIX imposes many hard requirements that applications targeted at HDFS do not need, so POSIX semantics in a few key areas have been traded away to increase data throughput rates.
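
To make the streaming access pattern concrete, here is a minimal Python sketch of a batch-style job that reads a data set front to back in large chunks, with no seeks and no in-place updates. It is only an illustration of the pattern: it reads an ordinary local file rather than HDFS, and the names `stream_chunks`, `count_newlines`, and `make_sample_file`, along with the 4 MB chunk size, are assumptions made for this example and are not part of any Hadoop API.

```python
import os
import tempfile


def stream_chunks(path, chunk_size=4 * 1024 * 1024):
    """Yield the file's bytes front to back in large fixed-size chunks.

    Hypothetical helper: it stands in for a streaming read and uses a
    local file, not HDFS.
    """
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            yield chunk


def count_newlines(path, chunk_size=4 * 1024 * 1024):
    """Process the whole data set in one sequential pass."""
    return sum(chunk.count(b"\n") for chunk in stream_chunks(path, chunk_size))


def make_sample_file(num_lines):
    """Write a small sample data set to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_lines):
            f.write(b"record %d\n" % i)
    return path


if __name__ == "__main__":
    sample = make_sample_file(1000)
    try:
        print(count_newlines(sample))
    finally:
        os.remove(sample)
```

The job never jumps around inside the file or rewrites part of it. Each chunk is read once, processed, and discarded, which is the kind of access that favors throughput over the latency of any single read.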
