webstemmer 0.6.0

A web crawler and HTML layout analyzer

Latest Version: 0.7.1

Webstemmer is a web crawler and HTML layout analyzer. It extracts articles from news sites as plain text and automatically removes banners, ads, and navigation links. You only need to supply the URL of a site's top page; from there it runs almost fully automatically, with little human intervention.
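
To make the idea of automatic banner and navigation removal concrete, here is a toy sketch in Python. It is not Webstemmer's actual algorithm or API: the function name `strip_shared_lines`, the `min_share` parameter, and the example URLs are all hypothetical. It assumes that text repeated across many pages of one site (menus, ad labels) is boilerplate, and that text unique to a page is article content.

```python
from collections import Counter


def strip_shared_lines(pages, min_share=0.5):
    """Drop lines that occur on more than min_share of the given pages.

    pages maps a URL to that page's plain text (hypothetical input format).
    """
    # Count each distinct non-empty line once per page.
    seen = Counter()
    for text in pages.values():
        seen.update({line.strip() for line in text.splitlines() if line.strip()})

    limit = min_share * len(pages)
    cleaned = {}
    for url, text in pages.items():
        # Keep only the lines that are rare across the site.
        kept = [
            line.strip()
            for line in text.splitlines()
            if line.strip() and seen[line.strip()] <= limit
        ]
        cleaned[url] = "\n".join(kept)
    return cleaned


# Made-up pages sharing a navigation bar and an ad label.
pages = {
    "https://example.com/a": "Home | World | Sports\nStorm hits the coast\nAdvertisement",
    "https://example.com/b": "Home | World | Sports\nLocal team wins final\nAdvertisement",
    "https://example.com/c": "Home | World | Sports\nNew library opens\nAdvertisement",
}
articles = strip_shared_lines(pages)
```

In this sketch the navigation bar and the ad label appear on all three pages, so they are dropped and only the one line unique to each page is kept. A real layout analyzer works on the HTML structure rather than on raw lines of text.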
