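Christopher D. Manning graduated from the Australian National University in 1989 and received his PhD in linguistics from Stanford University in 1995. He taught linguistics at Carnegie Mellon University and the University of Sydney, and since 1999 has been an associate professor of computer science and linguistics at Stanford University. His main research interests are statistical natural language processing, information extraction and representation, text understanding, and text mining.
Prabhakar Raghavan graduated from the Indian Institute of Technology and later received his PhD in computer science from the University of California, Berkeley. Since 2005 he has headed Yahoo! Research, and he is also a consulting professor of computer science at Stanford University. His main research interests are text and Web data mining, combinatorial optimization, and randomized algorithms. He was previously CTO of Verity and held management positions at IBM Research.
Hinrich Schütze holds a PhD from Stanford University and is currently chair of theoretical computational linguistics at the Institute for Natural Language Processing, University of Stuttgart. He worked in Silicon Valley for many years and served as chief scientist at Enkata.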
Class-tested and coherent, this groundbreaking new textbook teaches classical and web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.
Contents
1. Information retrieval using the Boolean model; 2. The dictionary and postings lists; 3. Tolerant retrieval; 4. Index construction; 5. Index compression; 6. Scoring and term weighting; 7. Vector space retrieval; 8. Evaluation in information retrieval; 9. Relevance feedback and query expansion; 10. XML retrieval; 11. Probabilistic information retrieval; 12. Language models for information retrieval; 13. Text classification and Naive Bayes; 14. Vector space classification; 15. Support vector machines and kernel functions; 16. Flat clustering; 17. Hierarchical clustering; 18. Dimensionality reduction and latent semantic indexing; 19. Web search basics; 20. Web crawling and indexes; 21. Link analysis.
Reviews
“This is the first book that gives you a complete picture of the complications that arise in building a modern web-scale search engine. You'll learn about ranking SVMs, XML, DNS, and LSI. You'll discover the seedy underworld of spam, cloaking, and doorway pages. You'll see how MapReduce and other approaches to parallelism allow us to go beyond megabytes and to efficiently manage petabytes.” -Peter Norvig, Director of Research, Google Inc.
"Introduction to Information Retrieval is a comprehensive, up-to-date, and well-written introduction to an increasingly important and rapidly growing area of computer science. Finally, there is a high-quality textbook for an area that was desperately in need of one." -Raymond J. Mooney, Professor of Computer Sciences, University of Texas at Austin
“Through compelling exposition and choice of topics, the authors vividly convey both the fundamental ideas and the rapidly expanding reach of information retrieval as a field.” -Jon Kleinberg, Professor of Computer Science, Cornell University
Published on 2024-11-22
Introduction to Information Retrieval 2024 pdf epub mobi ebook download
An introductory book on search engines. It covers every aspect of the field and is rigorous yet easy to understand. A classic introduction.
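For beginners in search engines, this book is absolutely worth reading. The authors start from the simplest Boolean retrieval and build up to a complete search engine, going deeper step by step and guiding the reader to think along the way. The book touches on the architecture and algorithms needed to build a large-scale search engine. After reading it you will have a general picture of search engines and some understanding of their basic principles. Search engines are not...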
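The most important takeaway is a high-level view of an information retrieval system (search engine). Broadly, it should be viewed along two dimensions. The first is the query dimension, whose core is two index structures: one is the dictionary, and the other is the inverted postings lists together with the forward index. The dictionary's job is to turn a query into a term set, and several techniques are used along the way, such as semantic expansion (synonyms, ...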
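I first came across this book the year before last, when it was still only a draft in electronic form. It covered essentially everything IR involves, and fairly comprehensively. If your English reading ability is decent, I recommend reading it; you will certainly come away with a fairly complete understanding of IR.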
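Tags: information retrieval, IR, search engines, computing, machine learning, natural language processing, artificial intelligence, computer science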
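Finally finished the damn thing...
I read the electronic version several times. Encyclopedic, and it explains deep ideas in simple terms.
A conscientious piece of work.
Concise and easy to read.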