In this fully updated second edition of the highly acclaimed Managing Gigabytes, authors Witten, Moffat, and Bell continue to provide unparalleled coverage of state-of-the-art techniques for compressing and indexing data. Whatever your field, if you work with large quantities of information, this book is essential reading: an authoritative theoretical resource and a practical guide to meeting the toughest storage and access challenges. It covers the latest developments in compression and indexing and their application on the Web and in digital libraries. It also details dozens of powerful techniques supported by mg, the authors' own system for compressing, storing, and retrieving text, images, and textual images. The mg source code is freely available on the Web.
The second edition provides up-to-date coverage of new text compression algorithms such as block sorting, approximate arithmetic coding, and fat Huffman coding. It includes new sections on content-based index compression and distributed querying, with two new data structures for fast indexing. It adds coverage of image coding, including descriptions of de facto standards in use on the Web (GIF and PNG), information on CALIC, the new proposed JPEG Lossless standard, and JBIG2. It also includes new information on the Internet and WWW, digital libraries, web search engines, and agent-based retrieval. The book is accompanied by a public domain system called MG, a fully worked-out operational example of the advanced techniques developed and explained in the book, and a new appendix describes an existing digital library system that uses the MG software.
Posted on 2024-06-09
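In this era of big data, managing massive amounts of data is an essential skill and the technical foundation for data mining, statistical data analysis, information retrieval, and data-driven operations. As the preferred textbook for Stanford University's information retrieval and mining course, this book gives weight to both theory and practice and presents, in an accessible way, a complete solution for processing massive volumes of information, covering every aspect of compression, indexing, and querying. Its most...
A very old book, but it truly lives up to its title: the content is detailed and comprehensive, and the translation is quite good too. When I first read it I happened to be studying the Lucene source code, and the material helped me a great deal. Before 《信息檢索導論》 (Introduction to Information Retrieval) appeared, 《深入搜索引擎》 was probably the best book offering a comprehensive introduction to information retrieval.
The content is a classic textbook on data processing, but buyers should take care not to buy it twice: this book is identical in content to 《深入搜索引擎》, published in 2009 by 電子工業齣版社 (Publishing House of Electronics Industry). Link to the earlier book: http://book.douban.com/subject/3729518/ The differences between the two books: 1. Price 2. The date on the translator's preface: one is 2009, the other is 2013 3...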
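Book tags: search engines, large-scale data processing, information retrieval, computers, Information+Retrieval, IR, Search, Data
Focuses fairly heavily on compression algorithms.
Did not read the image chapters.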
Written by my idol! It sits on my shelf like a shrine piece!