Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He works for Cloudera, a company set up to offer Hadoop support and training. Previously he was an independent Hadoop consultant, working with companies to set up, use, and extend Hadoop. He has written numerous articles for O'Reilly, java.net, and IBM's developerWorks, and has spoken at several conferences, including ApacheCon 2008, where he spoke on Hadoop. Tom has a Bachelor's degree in Mathematics from the University of Cambridge and a Master's in Philosophy of Science from the University of Leeds, UK.
Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You’ll learn about recent changes to Hadoop, and explore new case studies on Hadoop’s role in healthcare systems and genomics data processing.
Learn fundamental components such as MapReduce, HDFS, and YARN
Explore MapReduce in depth, including steps for developing applications with it
Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN
Learn two data formats: Avro for data serialization and Parquet for nested data
Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)
Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop
Learn the HBase distributed database and the ZooKeeper distributed configuration service
Posted on 2025-02-08
Hadoop: The Definitive Guide 2025 pdf epub mobi e-book download
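The book does not reveal much about implementation architecture. Instead, it covers Hadoop from a user's perspective, including MapReduce, HDFS, Hive, Pig, HBase, and ZooKeeper. It touches on nearly everything about using Hadoop, including installation and day-to-day use. You can even install Hadoop on your own computer and work through the book's examples hands-on...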
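I bought the first edition but was too pressed for time to read it. Then a second edition came out, billed as revised and upgraded, and I bought that too without hesitation. Later I heard the second edition was better translated than the first, and I was quietly pleased. Then I actually read the second edition and was shocked: what a damn fool I was, chasing the trend and buying the Chinese edition instead of reading the perfectly good English one. In this magical country, the milk is laced with melamine, the ham...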
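I entered the Douban China-pub prize draw and was lucky enough to win this second Chinese edition of Hadoop: The Definitive Guide. Compared with the first edition, it adds chapters on Hive and Sqoop, the translation quality has improved considerably, and the English index is retained. The book's coverage of Hadoop is fairly comprehensive; readers itching to get hands-on can basically take the book, pair it with Google and Baidu, and make it happen right away. Personally I feel “...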
The official site of Cobub Razor, an app analytics tool, has an article on choosing and using Hadoop YARN schedulers. I think it is well written and recommend it: http://www.cobub.com/the-selection-and-use-of-hadoop-yarn-scheduler/
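I read a few chapters of the Chinese edition. It is full of errors, many of them elementary, and I really could not keep going. I recommend reading the original instead. The translators have some nerve: failing to understand the English is one thing, but even the Chinese does not read smoothly. Aren't they embarrassed? What is this “but, ......, but” construction, some kind of “but style”?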
Book tags: Hadoop, Big Data, BigData, Computers, Distributed Systems, hadoop, Machine Learning, O'Reilly
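I read Parts 1 and 2, which gave me a basic understanding of Hadoop; next I need to consolidate it on real projects. I also need to learn related technologies such as Hive, HBase, and Spark.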
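The fourth edition is based entirely on Hadoop 2. Compared with the previous edition, it makes some important additions and reorders the material, so there is no need to read earlier editions. It remains as comprehensive, thorough, and practical as ever.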
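So damn long. It covers most of the tools in the ecosystem and works well for review and summary. Readers without hands-on experience should read the first two parts, which cover the MapReduce and YARN core, and then skim the rest to learn what each tool is for.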
A good book for getting started with Hadoop.