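Christopher D. Manning graduated from the Australian National University in 1989 and received his PhD in linguistics from Stanford University in 1995. He taught linguistics at Carnegie Mellon University and the University of Sydney, and since 1999 has been an associate professor of computer science and linguistics at Stanford University. His main research interests are statistical natural language processing, information extraction and representation, text understanding, and text mining.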
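Prabhakar Raghavan graduated from the Indian Institute of Technology and later received his PhD in computer science from the University of California, Berkeley. Since 2005 he has headed Yahoo! Research, and he is also a consulting professor of computer science at Stanford University. His main research interests are text and Web data mining, combinatorial optimization, and randomized algorithms. He was previously CTO of Verity and held management positions at IBM Research.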
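Hinrich Schütze holds a PhD from Stanford University and is now Chair of Theoretical Computational Linguistics at the Institute for Natural Language Processing, University of Stuttgart. He worked in Silicon Valley for many years and served as chief scientist at Enkata.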
Class-tested and coherent, this groundbreaking new textbook teaches classic web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.
Contents
1. Information retrieval using the Boolean model; 2. The dictionary and postings lists; 3. Tolerant retrieval; 4. Index construction; 5. Index compression; 6. Scoring and term weighting; 7. Vector space retrieval; 8. Evaluation in information retrieval; 9. Relevance feedback and query expansion; 10. XML retrieval; 11. Probabilistic information retrieval; 12. Language models for information retrieval; 13. Text classification and Naive Bayes; 14. Vector space classification; 15. Support vector machines and kernel functions; 16. Flat clustering; 17. Hierarchical clustering; 18. Dimensionality reduction and latent semantic indexing; 19. Web search basics; 20. Web crawling and indexes; 21. Link analysis.
Reviews
“This is the first book that gives you a complete picture of the complications that arise in building a modern web-scale search engine. You'll learn about ranking SVMs, XML, DNS, and LSI. You'll discover the seedy underworld of spam, cloaking, and doorway pages. You'll see how MapReduce and other approaches to parallelism allow us to go beyond megabytes and to efficiently manage petabytes.” -Peter Norvig, Director of Research, Google Inc.
"Introduction to Information Retrieval is a comprehensive, up-to-date, and well-written introduction to an increasingly important and rapidly growing area of computer science. Finally, there is a high-quality textbook for an area that was desperately in need of one." -Raymond J. Mooney, Professor of Computer Sciences, University of Texas at Austin
“Through compelling exposition and choice of topics, the authors vividly convey both the fundamental ideas and the rapidly expanding reach of information retrieval as a field.” -Jon Kleinberg, Professor of Computer Science, Cornell University
Posted on 2025-01-30
This is a good book. Worth a read.
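Not bad as an introductory book. It covers several important concepts in information retrieval: inverted indexes and search engines; tf-idf weighting; the vector space model; evaluation of information retrieval, including measures for ranked results such as MAP, ROC curves, and NDCG; relevance feedback and pseudo-relevance feedback; probabilistic retrieval models and the BM25 algorithm; retrieval models based on language modeling; various text...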
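Stanford's introductory IR book. Both CMU and Stanford use it as their introductory IR text, which is very nice. Some chapters are easier to follow if you have a background in statistics.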
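The most important takeaway is a big-picture understanding of information retrieval systems (search engines). Broadly speaking, they need to be viewed along two dimensions. The first is the query dimension, whose core is two index structures: one is the dictionary, and the other is the inverted postings lists and the forward index. The dictionary's job is to turn a query into a term set, and it uses a variety of techniques along the way, such as semantic expansion (synonyms, ...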
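Book tags: information retrieval, IR, search engines, computers, machine learning, natural language processing, artificial intelligence, computer science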
My boss says it's good.
A Stanford textbook. A fairly comprehensive introductory text, but introductory is all it is.
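I took the course, and this was the first time I read the book fairly thoroughly. It gave me a more intuitive understanding of search engines.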
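Apart from a number of data and ML concepts I already knew, I didn't seem to gain anything deep from it. It is a bit too shallow; perhaps it is good for first-year students who are complete beginners. It is also possible that I simply didn't understand it.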
A good book, comprehensive and easy to understand. The references and further reading at the end of each chapter are especially good.