Scrapy documentation

The initial translation of this documentation was migrated from the Scrapy 0.24 translation; thanks to the original translators.

This documentation contains everything you need to know about Scrapy.

Getting help

Having trouble? We’d like to help!

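Frequently Asked Questions: Get answers to the most commonly asked questions.
Debugging Spiders: Learn how to debug common problems with your Scrapy spiders.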
Spiders Contracts: Learn how to use contracts for testing your spiders.
Common Practices: Get familiar with some Scrapy common practices.
Broad Crawls: Tune Scrapy for crawling many domains in parallel.
Using Firefox for scraping: Learn how to scrape with Firefox and some useful add-ons.
Using Firebug for scraping: Learn how to scrape efficiently using Firebug.
Debugging memory leaks: Learn how to find and get rid of memory leaks in your crawler.
Downloading and processing files and images: Download files and/or images associated with your scraped items.
Ubuntu packages: Install the latest Scrapy packages easily on Ubuntu.
Deploying Spiders: Deploy your Scrapy spiders and run them on a remote server.
AutoThrottle extension: Adjust crawl rate dynamically based on load.
Benchmarking: Check how Scrapy performs on your hardware.
Jobs: pausing and resuming crawls: Learn how to pause and resume crawls for large spiders.

Architecture overview: Understand the Scrapy architecture.
Downloader Middleware: Customize how pages get requested and downloaded.
Spider Middleware: Customize the input and output of your spiders.
Extensions: Extend Scrapy with your custom functionality.
Core API: Use it in extensions and middlewares to extend Scrapy functionality.
Signals: See all available signals and how to work with them.
Item Exporters: Quickly export your scraped items to a file (XML, CSV, etc).