Easy Spider is a Free, Open-source, Self-hosted Distributed Web Crawler
Easy Spider is a project created in 2006 for distributed web crawling. It is written in Perl and designed to crawl web pages, send the crawled data to a server, and generate XML files from that data. What makes Easy Spider a great tool