Easyspider is a Free Open-source Self-hosted Distributed Web Crawler


Easy Spider is a distributed web crawling project created in 2006. It is written in Perl and is designed to crawl web pages, send the crawled data to a server, and generate XML files from it. It runs on both Windows and Linux.
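
To make the last step of that pipeline concrete, here is a minimal sketch of turning crawled page records into XML. It is written in Python for illustration and is not Easy Spider's Perl code; the element names (`pages`, `page`, `title`, `text`) and the record fields are assumptions, since the project's actual XML layout is not described here.

```python
import xml.etree.ElementTree as ET


def pages_to_xml(pages):
    """Serialize a list of crawled page records into one XML string."""
    root = ET.Element("pages")  # hypothetical root element name
    for page in pages:
        # One <page> element per crawled URL (hypothetical layout).
        node = ET.SubElement(root, "page", url=page["url"])
        ET.SubElement(node, "title").text = page["title"]
        ET.SubElement(node, "text").text = page["text"]
    # encoding="unicode" returns a str; special characters are escaped.
    return ET.tostring(root, encoding="unicode")


sample = [
    {"url": "http://example.com/", "title": "Example", "text": "Hello & welcome"},
]
print(pages_to_xml(sample))
```

Building the tree with a library rather than concatenating strings means characters such as `&` in page text are escaped automatically.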

The project uses a client/server architecture: the client side collects the data and stores it on the server. This is particularly useful for large data sets, since the crawled data does not pile up on the client. Because the data lives on the server, users can also reach it from any location that can connect to that server.
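
The client/server split can be sketched roughly as follows. This is an in-memory Python illustration, not Easy Spider's actual protocol: the class names `CrawlServer` and `CrawlClient`, the idea that the server hands out URLs, and the `fake_fetch` helper are all assumptions made so the example runs without network access.

```python
from collections import deque


class CrawlServer:
    """Central store: hands out URLs and keeps everything clients send back."""

    def __init__(self, seed_urls):
        self.pending = deque(seed_urls)
        self.results = {}

    def next_url(self):
        # Return the next URL to crawl, or None when no work is left.
        return self.pending.popleft() if self.pending else None

    def submit(self, url, content):
        # Crawled data is stored on the server, not on the client.
        self.results[url] = content


class CrawlClient:
    """Worker: fetches one page at a time and ships the result to the server."""

    def __init__(self, name, server, fetch):
        self.name = name
        self.server = server
        self.fetch = fetch  # injected so the sketch needs no real HTTP

    def run_once(self):
        url = self.server.next_url()
        if url is None:
            return False
        self.server.submit(url, self.fetch(url))
        return True


def fake_fetch(url):
    # Stand-in for a real HTTP request (hypothetical helper).
    return f"<html>content of {url}</html>"


server = CrawlServer(
    ["http://example.com/a", "http://example.com/b", "http://example.com/c"]
)
clients = [CrawlClient("c1", server, fake_fetch), CrawlClient("c2", server, fake_fetch)]

# Let every client take a turn per round until the server has no work left.
while any([client.run_once() for client in clients]):
    pass

print(sorted(server.results))
```

The clients hold no crawl results of their own, which is the property described above: storage grows on the server while each client only handles one page at a time.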

By splitting crawling and storage this way, Easy Spider aims to make gathering and storing web data simpler and more efficient.

Features

  • Client/Server Distributed Crawling
  • Config File Support
  • PDF, DOC, XLS, PPT Extraction Support
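
As an illustration of what config file support might look like, the sketch below reads crawler settings from an INI-style file using Python's standard `configparser`. Easy Spider's real config format and option names are not documented here, so every section and key (`crawler`, `start_url`, `max_depth`, `extract_types`, `server`, `host`, `port`) is hypothetical.

```python
import configparser

# Hypothetical config contents; not Easy Spider's actual file format.
SAMPLE_CONFIG = """
[crawler]
start_url = http://example.com/
max_depth = 2
extract_types = pdf, doc, xls, ppt

[server]
host = localhost
port = 8080
"""


def load_settings(text):
    """Parse INI-style text into a plain dictionary of typed settings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    crawler = parser["crawler"]
    return {
        "start_url": crawler["start_url"],
        "max_depth": crawler.getint("max_depth"),
        # Split the comma-separated list of document types to extract.
        "extract_types": [t.strip() for t in crawler["extract_types"].split(",")],
        "server": (parser["server"]["host"], parser["server"].getint("port")),
    }


settings = load_settings(SAMPLE_CONFIG)
print(settings["extract_types"])
```

Keeping the server address and the list of document types in a file means the same client code could be pointed at a different server or extraction set without changes.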


License

  • GNU Library or Lesser General Public License version 2.0 (LGPLv2)

Resources

Easyspider - Distributed Web Crawler
Download Easyspider - Distributed Web Crawler for free. Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code for crawling webpages, distributing the data to a server, and generating XML files from it.