scrapy
https://github.com/scrapy/scrapy
Python
Scrapy, a fast high-level web crawling & scraping framework for Python.
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Python not yet supported
160 Subscribers
Help out
- Issues
- feat: add disallowed_domains option to OffsiteMiddleware
- Fix “not enough values to unpack” when parsing headers
- Add item_processors feature to Scrapy FeedExporter extension
- GCP Service account can be set as string in settings.py
- Add possibility to use Selector (bytes) added in parsel 1.8.0.
- Update tutorial.rst
- add parse_with_rules method and log a warning when _parse_response is…
- Setting a cookie for a different domain does not work
- Implementation for "abstract" methods causes static type checkers to fail
- Deprecate or remove `KEEP_ALIVE`, document how to tell the shell apart
- Docs
- Python not yet supported