Nathan Marz is an engineer at Twitter. He was previously Lead Engineer at BackType, a marketing intelligence company that was acquired by Twitter in July 2011. He is the author of two major open source projects: Storm, a distributed realtime computation system, and Cascalog, a tool for processing data on Hadoop. He is a frequent speaker and writes a blog at nathanmarz.com.
Sam Ritchie is an engineer at Twitter who uses Cascalog and ElephantDB to process and analyze many terabytes of data in near real-time. He is also the lead developer on FORMA, an open-source deforestation monitoring system in use by a number of top research institutions. He is a committer on Cascalog, ElephantDB, Pallet and a number of other open source Clojure projects.
Services like social networks, web analytics, and intelligent e-commerce often need to manage data at a scale too big for a traditional database. Complexity increases with scale and demand, and handling big data is not as simple as just doubling down on your RDBMS or rolling out some trendy new technology. Fortunately, scalability and simplicity are not mutually exclusive—you just need to take a different approach. Big data systems use many machines working in parallel to store and process data, which introduces fundamental challenges unfamiliar to most developers.
Big Data teaches you to build these systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they're built.
Big Data shows you how to build the back-end for a real-time service called SuperWebAnalytics.com, our version of Google Analytics. As you read, you'll discover that many standard RDBMS practices become unwieldy with large-scale data. To handle the complexities of big data and distributed systems, you must drastically simplify your approach. This book introduces a general framework for thinking about big data, and then shows how to apply technologies like Hadoop, Thrift, and various NoSQL databases to build simple, robust, and efficient systems to handle it.
Published on 2024-05-20
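I had heard of the famous Lambda Architecture long ago, but never understood what it actually meant. Even after reading Wikipedia ( https://en.wikipedia.org/wiki/Lambda_architecture ), I grasped only the surface and not the substance. Fortunately, this book, "Big Data - Principles and Best Practices of Scalable Realtime Data Systems", gives...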
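1. A book by the author of the famous Lambda Architecture; 2. I like this kind of methodical, point-by-point reasoning; 3. Human-fault tolerance is not optional; 4. The examples are somewhat superfluous, and the information is rather redundant; 5. The Lambda Architecture serving layer really does handle normalization/denormalization well; 6. If I could have read this book when I first encountered big data, ...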
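A few days ago I saw an architecture diagram for an industry-related cloud platform solution. At a glance, it seemed to be built on a classic big data stack, so I decided to sit down and, in 2019, at a point when the big data hype had already cooled, do some archaeology on big data architecture and catch up on my own. After searching around, this is the only book on big data architecture that is reasonably good; the others...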
Book tags: bigdata, data mining, big data, computer science, data, manning, programming, big
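A rating of 8.9!? To those who gave it 5 stars: have you actually read this book? Or rather, do you actually work on distributed systems? If so, all I can say is that you are too amateurish. This book does not even qualify as an introduction!!!!!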
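I skimmed through it. My thinking is a bit clearer, but my understanding is still not deep enough; I need to look into each thing it mentions a little more…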
This book has finally been released in full; I have been following it for almost a year.
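I have finished the parts written so far. It gives a high-level introduction to building a data architecture that meets the requirements of concurrency, stability, flexibility, and fault tolerance. I must write a review!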
Not very good. The lambda concept went out of date long ago, and it is also hard to put into practice.