Xidel is an open-source data extraction tool
Xidel is a command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
It is cross-platform and runs on Windows, Linux, macOS, and Android.
- Easy to set up and use
- Zero configuration required
- Works smoothly on Windows, Linux, macOS and Android
- Well documented
- Packed with dozens of examples
- Lightweight package
Xidel supports the following kinds of expressions
- CSS 3 Selectors: to extract elements unchanged
- XPath 3.0: to extract values and calculate things with them.
- XQuery 3.0: to create new documents from the extracted values and to build Turing-complete scripts.
- Pattern matching: to extract several values at once, using an annotated copy of the input page as the pattern.
- XPath 2.0/XQuery 1.0: compatibility mode for old XPath/XQuery versions.
- JSONiq: to work with JSON APIs (deprecated in favor of XPath 3.1)
Xidel follows redirects, links, and forms
- HTTP Codes: Redirections like 30x are automatically followed, while keeping things like cookies.
- Links: It can follow (all) links on a page, meta refreshes, or any extracted value.
- HTML Forms: It can fill in arbitrary data in the input elements and submit the form.
- Arbitrary HTTP requests: In any query, you can call a function to make other requests.
Xidel supports the following output formats
- Adhoc: prints the data in a human-readable format.
- XML: encodes the data as XML.
- HTML: encodes the data as HTML.
- JSON: encodes the data as JSON.
- bash/cmd: exports the data as shell variables.
- fn:serialize: implements the W3C XQuery Serialization standard.
Xidel supports the following input sources
- Connections: HTTP / HTTPS, as well as local files or stdin.
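
The features listed above can be combined in a single call. The following is a minimal sketch, not an official wrapper, of how a script might assemble and run a Xidel command line from Python. The helper names `build_xidel_args` and `run_xidel` are hypothetical. The options used (`--extract`, `--follow`, `--output-format`, `--silent`, and `-` for stdin) are those found in Xidel's command-line help as far as we know; check `xidel --help` for your installed version before relying on them.

```python
import shutil
import subprocess


def build_xidel_args(source, extract, follow=None, output_format="adhoc"):
    """Assemble a Xidel command line as an argument list (hypothetical helper).

    source        -- a URL, a local file path, or "-" to read from stdin
    extract       -- the expression passed to --extract (e.g. an XPath)
    follow        -- optional expression selecting links to follow first
    output_format -- assumed values include "adhoc", "xml", "html",
                     "json-wrapped", "bash" and "cmd"
    """
    args = ["xidel", source, "--silent"]
    if follow is not None:
        # Placed before --extract so the extraction applies to the followed pages.
        args += ["--follow", follow]
    args += ["--extract", extract, f"--output-format={output_format}"]
    return args


def run_xidel(args, stdin_text=None):
    """Run the command if the binary is on PATH; return stdout, or None if it is missing."""
    if shutil.which(args[0]) is None:
        return None
    result = subprocess.run(args, input=stdin_text, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Read HTML from stdin, extract the title, and request JSON output.
    cmd = build_xidel_args("-", "//title", output_format="json-wrapped")
    print(cmd)
    print(run_xidel(cmd, stdin_text="<html><head><title>Demo</title></head></html>"))
```

Building the arguments as a list rather than a single shell string avoids quoting problems with expressions that contain spaces or quotes. `run_xidel` returns `None` instead of raising when Xidel is not installed, so the sketch can be tried safely on any machine.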
Xidel is released under the GNU General Public License v3.0.