Jure Leskovec is Assistant Professor of Computer Science at Stanford University. His research focuses on mining large social and information networks. The problems he investigates are motivated by large-scale data, the Web and online media. This research has won several awards, including a Microsoft Research Faculty Fellowship, an Alfred P. Sloan Fellowship, an Okawa Foundation Fellowship, and numerous best paper awards. His research has also been featured in popular press outlets such as the New York Times, the Wall Street Journal, the Washington Post, MIT Technology Review, NBC, BBC, CBC and Wired. Leskovec has also authored the Stanford Network Analysis Platform (SNAP, http://snap.stanford.edu), a general-purpose network analysis and graph mining library that easily scales to massive networks with hundreds of millions of nodes and billions of edges. You can follow him on Twitter at @jure.
Written by leading authorities in database and Web technologies, this book is essential reading for students and practitioners alike. The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be applied successfully to even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. Other chapters cover the PageRank idea and related tricks for organizing the Web, and the problems of finding frequent itemsets and clustering. This second edition includes new and extended coverage of social networks, machine learning and dimensionality reduction.
Published on 2024-11-30
Mining of Massive Datasets 2024 pdf epub mobi e-book download
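After seeing the two examples at the start of the book, one applying map-based cluster analysis to the London virus outbreak and the other an example from probability and statistics, I had fairly high hopes for it. But from chapter 3 onward, well... damn, the whole book is just a table of contents. Its structure throughout is: topic, summary, a bizarre example, exercises. Then another topic, and another, and another... If it is just to pick up some talking points for everyday chat, occasionally...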
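The poor rating is for the Chinese translation. The Chinese edition was translated by Wang Bin of the Institute of Computing Technology, Chinese Academy of Sciences, but the translation is terrible. I suspect that Professor Wang handed the English manuscript to his students to translate; given the quality, I really cannot praise it. All of the above was written purely to vent my frustration, because the translator's preface says he translated the book on his own, over more than seven months and through many rounds of revision. If...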
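I have only read two chapters, so it is honestly hard to give a score. This is really a mathematics book, and an introductory one. Its target readers are not engineers but master's or PhD students. If you already have a background in data mining or machine learning, or simply like mathematics a lot, I would still recommend this book; learning new things is always fun.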
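What characterizes Web data mining, and what theory and techniques does it add compared with ML? (1) The book covers roughly 20 papers, presented in a unified language and at a uniform mathematical depth. (2) Hashing is used very heavily, in a variety of ways, as follows. a. Speeding up retrieval, e.g. an index. b. Randomly partitioning data into groups. c. Defining mappings on the data and applying these mappings repeatedly. This is the most basic function. But for new data, the mapping will...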
Book tags: data mining, computers, machine learning, Data, Coursera, CS, data analysis, software engineering
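The writing flows well. I saw many people below complaining about the translation, so I recommend the original edition. Paired with the online course, it is quite accessible, with plenty of examples, so self-study is feasible. The steps are written out in detail, so if you are able to, you can code along directly. It is not obscure; as a beginner I like it a lot.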
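I spent six months reading it on and off, and the ideas of hashing and approximation were a real eye-opener. My first read-through was rather rushed; this book is worth rereading and putting into practice.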
This is the reference textbook for a course next semester. I hear the professor is quite good, so I plan to study this course properly.