Marko Bonaći has worked with Java for 13 years. He currently works as the IBM Enterprise Content Management team lead at SV Group. Petar Zečević is the CTO at SV Group. During the last 14 years he has worked on various projects as a Java developer, team leader, consultant, and software specialist. He is the founder and, with Marko, organizer of the popular Spark@Zg meetup group.
Working with big data can be complex and challenging, in part because of the multiple analysis frameworks and tools required. Apache Spark is a big data processing framework well suited to analyzing near-real-time streams and discovering historical patterns in batched data sets. But Spark goes much further than other frameworks. By including machine learning and graph processing capabilities, it makes many specialized data processing platforms obsolete. Spark's unified framework and programming model significantly lower the initial infrastructure investment, and Spark's core abstractions are intuitive for most Scala, Java, and Python developers.
Spark in Action teaches you to use Spark for stream and batch data processing. It starts with an introduction to the Spark architecture and ecosystem, followed by a taste of Spark's command-line interface. You then discover the most fundamental concepts and abstractions of Spark, particularly Resilient Distributed Datasets (RDDs) and the basic data transformations that RDDs provide. The first part of the book also introduces you to writing Spark applications using the core APIs. Next, you learn about the different Spark components: how to work with structured data using Spark SQL, how to process near-real-time data with Spark Streaming, how to apply machine learning algorithms with Spark MLlib, and how to apply graph algorithms to graph-shaped data using Spark GraphX. The book closes with a clear introduction to Spark clustering.
Posted on 2024-11-21
Spark in Action 2024 pdf epub mobi e-book download
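First, the translation does not read smoothly, and many technical terms are translated incorrectly. The coverage of Spark's components, and of the overall flow after a job is submitted, is not detailed enough; every topic is only touched on briefly, which is a bit of a pity. When reading a given chapter, you can go deeper with the official documentation or blog posts. You can also supplement it with other books, such as Hadoop: The Definitive Guide. The part of the appendix that covers MapReduce, I originally...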
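The original book is fine, but the translation is garbage. For example, in Chapter 5, where table metadata for DataFrames is introduced, "surviving Spark context restarts" is translated as ‘幸存的上下文重新啓動’ (roughly, "the surviving context restarts"), whereas the original means that the table metadata still exists after Spark restarts. Mindless, mechanical translations like this are everywhere in the book. Just as the translator says in the preface, you really have let down your husband and children,...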
Book tags: Spark, Big Data, Data Mining, Distributed, Big_Data, Software Engineering, Programmer, Programming
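For someone like me who has never worked on a big data project, this is a decent introduction. I could not really follow the two chapters on ML, so it is time to review the fundamentals.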