Jure Leskovec is Assistant Professor of Computer Science at Stanford University. His research focuses on mining large social and information networks. Problems he investigates are motivated by large-scale data, the Web and on-line media. This research has won several awards including a Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, the Okawa Foundation Fellowship, and numerous best paper awards. His research has also been featured in popular press outlets such as the New York Times, the Wall Street Journal, the Washington Post, MIT Technology Review, NBC, BBC, CBC and Wired. Leskovec has also authored the Stanford Network Analysis Platform (SNAP, http://snap.stanford.edu), a general-purpose network analysis and graph mining library that easily scales to massive networks with hundreds of millions of nodes and billions of edges. You can follow him on Twitter at @jure.
Published on 2024-12-26
Mining of Massive Datasets 2024 pdf epub mobi e-book
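I have only read two chapters, so I honestly cannot give it a fair rating. This is really a math book, and an introductory one at that. Its target readers are not engineers but master's or PhD students. If you already have a background in data mining or machine learning, or simply enjoy math, I would still recommend it; learning something new is always fun.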
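This is not a traditional "data mining" textbook. It is more about how data mining is applied on the Internet, the problems encountered there (very large data volumes), and the solutions. To be honest, though, the book is quite hard to follow. I picked up a few good ideas. Idea 1: always make full use of the "sparsity" of the data; for example, when the data is sufficiently sparse, hashing can be used to "aggregate" the data into "effective...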
Book tags: Data Mining, Computers, Machine Learning, Data, Coursera, CS, Data Analysis, Software Engineering
Written by leading authorities in database and Web technologies, this book is essential reading for students and practitioners alike. The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be applied successfully to even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. Other chapters cover the PageRank idea and related tricks for organizing the Web, the problems of finding frequent itemsets and clustering. This second edition includes new and extended coverage on social networks, machine learning and dimensionality reduction.
This is the reference textbook for a course next semester. I hear the professor is quite good, so I plan to study this course properly.
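It is full of errors, and there is nowhere to report them. It is extremely painful to read, and I forget the earlier parts as I read on. Perhaps the algorithms in it are just like that by nature. Bottom line: at least the research results of the last 15 years are strung together in a way that even undergraduates can follow.
It took me six months of on-and-off reading to finish, and the ideas of hashing and approximation were a real eye-opener. My first read was rather rushed; this book deserves repeated reading and plenty of practice.
The content is good, but for a technical book it stays somewhat superficial.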