About the Authors
Arun Murthy (California) has contributed to Apache Hadoop full-time since the inception of the project in early 2006. He is a long-term Hadoop Committer and a member of the Apache Hadoop Project Management Committee. Previously, he was the architect and lead of the Yahoo Hadoop Map-Reduce development team and was ultimately responsible, technically, for providing Hadoop Map-Reduce as a service for all of Yahoo, currently running on nearly 50,000 machines. Arun is the Founder and Architect of Hortonworks Inc., a software company formed in June 2011 by the key architects and core Hadoop committers from the Yahoo! Hadoop software engineering team to accelerate the development and adoption of Apache Hadoop. Funded by Yahoo! and Benchmark Capital, one of the preeminent technology investors, Hortonworks aims to ensure that Apache Hadoop becomes the standard platform for storing, processing, managing, and analyzing big data. Arun lives in Silicon Valley, California.
Douglas Eadline (Pennsylvania), PhD, began his career as a practitioner and a chronicler of the Linux Cluster HPC revolution and now documents big data analytics. Starting with the first Beowulf How To document, Dr. Eadline has written hundreds of articles, white papers, and instructional documents covering virtually all aspects of HPC. Prior to starting and editing the popular ClusterMonkey.net web site in 2005, he served as Editor-in-Chief for ClusterWorld Magazine and was Senior HPC Editor for Linux Magazine. Currently, he is a consultant to the HPC industry and writes a monthly column in HPC Admin Magazine. Both clients and readers have recognized Dr. Eadline's ability to present a "technological value proposition" in a clear and accurate style. He has practical, hands-on experience in many aspects of HPC, including hardware and software design, benchmarking, storage, GPU, cloud, and parallel computing.
Apache Hadoop is right at the heart of the Big Data revolution. In the brand-new Release 2, Hadoop’s data processing has been thoroughly overhauled. The result is Apache Hadoop YARN, a generic compute fabric providing resource management at datacenter scale, and a simple method to implement distributed applications such as MapReduce to process petabytes of data on Apache Hadoop HDFS. Apache Hadoop 2 and YARN truly deserve to be called breakthroughs.
In Apache Hadoop YARN, key YARN developer Arun Murthy shows how the design changes in Apache Hadoop lead to increased scalability and cluster utilization, new programming models and services, and the ability to move beyond Java and batch processing within the Hadoop ecosystem. Readers also learn to run existing applications like Pig and Hive under the Apache Hadoop 2 MapReduce framework, and to develop new applications that take full advantage of Hadoop YARN resources. Drawing on insights from the entire Apache Hadoop 2 team, Murthy and Dr. Douglas Eadline:
Review Apache Hadoop YARN’s goals, design, architecture, and components
Guide you through installation and administration of the new YARN architecture
Demonstrate how to optimize existing MapReduce applications quickly
Identify the functional requirements for each element of an Apache Hadoop 2 application
Walk you through a complete sample application project
Offer multiple examples and case studies drawn from their cutting-edge experience
Published on 2024-11-13
Book tags: hadoop, yarn, big data, computing, Hadoop, apache, BigData
An overview-style introduction to the architecture; very clear.
Review: http://yarn-book.com
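Review: My team lead bought the definitive guide to Hadoop 1 just a few days ago, so why is nobody reading the definitive guide to YARN? YARN is actually the future of big data frameworks. Chapters 4 and 7, which cover the architecture, are the core of the book; the rest can be skimmed. The book is still well worth reading.
Review: This is currently the best book on how YARN works. There are a few books published in China that introduce YARN, but they fall far short of this one. The writing is fluent, the structure is well organized, and the explanations go deep.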