About the Author
Arun Murthy (California) has contributed to Apache Hadoop full-time since the inception of the project in early 2006. He is a long-term Hadoop Committer and a member of the Apache Hadoop Project Management Committee. Previously, he was the architect and lead of the Yahoo Hadoop Map-Reduce development team and was ultimately responsible, technically, for providing Hadoop Map-Reduce as a service for all of Yahoo - currently running on nearly 50,000 machines! Arun is the Founder and Architect of the Hortonworks Inc., a software company that is helping to accelerate the development and adoption of Apache Hadoop. Hortonworks was formed by the key architects and core Hadoop committers from the Yahoo! Hadoop software engineering team in June 2011 in order to accelerate the development and adoption of Apache Hadoop. Funded by Yahoo! and Benchmark Capital, one of the preeminent technology investors, their goal is to ensure that Apache Hadoop becomes the standard platform for storing, processing, managing and analyzing big data. He lives in Silicon Valley in California.
Douglas Eadline (Pennsylvania), PhD, began his career as a practitioner and a chronicler of the Linux Cluster HPC revolution and now documents big data analytics. Starting with the first Beowulf How To document, Dr. Eadline has written hundreds of articles, white papers, and instructional documents covering virtually all aspects of HPC computing. Prior to starting and editing the popular ClusterMonkey.net web site in 2005, he served as Editorinchief for ClusterWorld Magazine, and was Senior HPC Editor for Linux Magazine. Currently, he is a consultant to the HPC industry and writes a monthly column in HPC Admin Magazine. Both clients and readers have recognized Dr. Eadline's ability to present a "technological value proposition" in a clear and accurate style. He has practical hands on experience in many aspects of HPC including, hardware and software design, benchmarking, storage, GPU, cloud, and parallel computing.
发表于2024-12-29
Apache Hadoop YARN 2024 pdf epub mobi 电子书
图书标签: hadoop yarn 大数据 计算机 Hadoop apache BigData
Apache Hadoop is right at the heart of the Big Data revolution. In the brand-new Release 2, Hadoop’s data processing has been thoroughly overhauled. The result is Apache Hadoop YARN, a generic compute fabric providing resource management at datacenter scale, and a simple method to implement distributed applications such as MapReduce to process petabytes of data on Apache Hadoop HDFS. Apache Hadoop 2 and YARN truly deserve to be called breakthroughs.
In Apache Hadoop YARN , key YARN developer Arun Murthy shows how the key design changes in Apache Hadoop lead to increased scalability and cluster utilization, new programming models and services, and the ability to move beyond Java and batch processing within the Hadoop ecosystem. Readers also learn to run existing applications like Pig and Hive under the Apache Hadoop 2 MapReduce framework, and develop new applications that take absolutely full advantage of Hadoop YARN resources. Drawing on insights from the entire Apache Hadoop 2 team, Murthy and Dr. Douglas Eadline:
Review Apache Hadoop YARN’s goals, design, architecture, and components
Guide you through installation and administration of the new YARN architecture,
Demonstrate how to optimize existing MapReduce applications quickly
Identify the functional requirements for each element of an Apache Hadoop 2 application
Walk you through a complete sample application project
Offer multiple examples and case studies drawn from their cutting-edge experience
http://yarn-book.com
评分不仅介绍了YARN的核心基础概念及运行机制,还介绍了安装、运行、管理YARN(及HDFS)~ 更深入点的东西源码见~
评分不仅介绍了YARN的核心基础概念及运行机制,还介绍了安装、运行、管理YARN(及HDFS)~ 更深入点的东西源码见~
评分概述性的介绍架构,非常清楚
评分http://yarn-book.com
Apache Hadoop YARN 2024 pdf epub mobi 电子书