htmlq - A command-line tool for extracting data from HTML

I've already covered the jc command in an earlier article. As a reminder, jc converts the text output of commands and scripts into structured data such as JSON.

Today, I'd like to talk about htmlq, which works on the same principle as jq, except that the input is HTML rather than JSON. The tool lets you select and extract elements from an HTML document using CSS selectors.

To make this concrete, here's how to retrieve the HTML of every element that has the class post (CSS selector .post):

curl --silent https://tech2geek.net/ | htmlq '.post'

To output all the links on a page:

curl --silent https://tech2geek.net/ | htmlq --attribute href a
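One thing to watch for: href values are printed as they are written in the page, so relative links stay relative. Here is a minimal sketch of resolving them, assuming your htmlq build has the --base option listed in its --help output. The sample HTML is made up for illustration, and the script skips the call when htmlq is not installed:

```shell
#!/bin/sh
# Made-up sample with one relative and one absolute link.
html='<a href="/about">About</a> <a href="https://example.com/">Example</a>'

links=""
if command -v htmlq >/dev/null 2>&1; then
    # --base supplies the URL that relative href values are resolved against.
    links=$(printf '%s\n' "$html" | htmlq --base https://tech2geek.net --attribute href a)
    printf '%s\n' "$links"
else
    echo "htmlq is not installed, skipping" >&2
fi
```

When piping from curl instead of a variable, the same options apply after the pipe.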

Or to retrieve only the text content (without HTML tags):

curl --silent https://tech2geek.net | htmlq --text .post

This makes it easy to do a lot of extraction work straight from the shell, without having to write code or wrangle XPath expressions.
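If you want to experiment without hitting a live site, the same selectors work on a local file. The sketch below writes a small, made-up page to a temporary file and queries it. It assumes the --filename and --remove-nodes options shown in htmlq's --help output. The selector is placed before --remove-nodes because that option can take several values:

```shell
#!/bin/sh
# Write a small made-up page to a temporary file so the example runs offline.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT   # remove the temporary file when the script ends

cat > "$sample" <<'EOF'
<html><body>
  <div class="post"><h2>First post</h2><p>Hello <a href="/one">one</a></p></div>
  <div class="post"><h2>Second post</h2><script>var x = 1;</script><p>World</p></div>
</body></html>
EOF

if command -v htmlq >/dev/null 2>&1; then
    # Titles only: a descendant selector narrows the match to <h2> inside .post.
    htmlq --filename "$sample" --text '.post h2'

    # Text of each post, with <script> nodes removed before output.
    htmlq --filename "$sample" --text '.post' --remove-nodes script
else
    echo "htmlq is not installed, skipping" >&2
fi
```

Any CSS selector can stand in for '.post h2' here, which is what makes the tool handy for quick one-off extractions.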

Installation depends on your OS and package manager:

Cargo (Linux, or any system with a Rust toolchain):

  cargo install htmlq

FreeBSD:

  pkg install htmlq

Homebrew (macOS):

  brew install htmlq

Scoop (Windows):

  scoop install htmlq
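Whichever method you pick, a quick check confirms that the binary is reachable from your shell. This is only a sketch: the status variable is a name chosen for illustration, and the hint about ~/.cargo/bin applies to Cargo installs with the default install location:

```shell
#!/bin/sh
# Report whether htmlq is reachable on PATH and, if so, print its version.
if command -v htmlq >/dev/null 2>&1; then
    status="installed"
    htmlq --version
else
    status="missing"
    echo "htmlq not found on PATH; for Cargo installs, check that ~/.cargo/bin is on PATH" >&2
fi
```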

For all the details, I invite you to read the documentation on GitHub.




About the Author

Mohamed SAKHRI
