CSS selectors for parsing html on the command line

Slice and dice html on the command line using CSS selectors.

Quick start

Let’s say you want to grab all the links on http://example.com/foo/bar:

$ curl http://example.com/foo/bar | que "a->href"

Let’s say that gave you 3 lines that looked like this:

/some/url?val=1
/some/url2?val=2
/some/url3?val=3

Ugh, that’s not very helpful, so let’s modify our argument a bit:

$ curl http://example.com/foo/bar | que "a->http://example.com{href}"

Now, that will print:

http://example.com/some/url?val=1
http://example.com/some/url2?val=2
http://example.com/some/url3?val=3
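To make the idea concrete, here is a rough Python sketch of what the command above does conceptually: collect the href of every a tag, then drop each value into a larger string. It uses only the standard library, the LinkCollector class and the sample HTML are made up for illustration, and it is not que's actual implementation.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples for the tag
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


# Sample markup standing in for the page fetched by curl
html = '<p><a href="/some/url?val=1">one</a> <a href="/some/url2?val=2">two</a></p>'

collector = LinkCollector()
collector.feed(html)

# Same idea as "a->http://example.com{href}": embed each href in a template
absolute = ["http://example.com{href}".format(href=h) for h in collector.hrefs]
```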

Selecting

Not sure how to use CSS Selectors?

The argument is divided into two parts separated by ->. The first part is a traditional CSS selector, and the second part is the attributes you want to print to the screen for each match:

$ css.selector->attribute,selector

The attribute part uses Python’s string formatting syntax, so you can embed the attributes you want within a larger string (for example, http://example.com{href}).
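As a sketch of how such an argument could be handled, the text after -> can be treated as a Python format string that is filled in from each matched element's attributes. The split_argument and render helpers below are hypothetical names, and treating a bare attribute name as shorthand for a one-field template is an assumption; que itself may work differently.

```python
def split_argument(argument):
    """Split 'selector->template' into its two parts (hypothetical helper)."""
    selector, _, template = argument.partition("->")
    return selector.strip(), template.strip()


def render(template, attrs):
    """Fill a template from one element's attributes.

    Assumption: a bare name such as 'href' is shorthand for '{href}'.
    """
    if "{" not in template:
        template = "{" + template + "}"
    return template.format(**attrs)


selector, template = split_argument("a->http://example.com{href}")

# One matched element, represented as a plain dict of its attributes
line = render(template, {"href": "/some/url?val=1"})
```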

Examples

Find all the “Download” links on a page:

que supports the non-standard :contains CSS selector:

$ curl http://example.com | que "a:contains(Download)->href"

Select all the links whose data attribute is exactly “foo” or starts with “foo-” (in standard CSS, |= is the dash-separated match; ^= is the plain prefix match):

$ curl http://example.com | que "a[data|=foo]->href"
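In standard CSS, the |= operator used above matches a value that is exactly the given string or that begins with it followed by a hyphen, while ^= is the plain "starts with" operator. The difference is easy to miss, so here is a minimal Python sketch of the two rules. The function names are hypothetical, and this only illustrates the CSS semantics, not how que evaluates selectors.

```python
def dash_match(value, target):
    """CSS [attr|=target]: value equals target or starts with target + '-'."""
    return value == target or value.startswith(target + "-")


def prefix_match(value, target):
    """CSS [attr^=target]: value starts with target (empty target never matches)."""
    return bool(target) and value.startswith(target)


# "foobar" starts with "foo" but is not a dash-separated match
results = {
    value: (dash_match(value, "foo"), prefix_match(value, "foo"))
    for value in ["foo", "foo-bar", "foobar", "bar"]
}
```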

Installation

Use pip to install the stable release:

$ pip install que

or the latest and greatest (which might differ from what’s on PyPI):

$ pip install git+https://github.com/jaymon/que#egg=que

Notes

  • If you need a more fully featured HTML command-line parser, try hq.

Source Distribution

que-0.0.2.tar.gz (3.5 kB)