WebSec - Webpage Change Notification

What is WebSec?

Web Secretary is web page change monitoring software. It detects changes based on content analysis, making sure that it's not just the HTML that changed, but the actual content. You can tell it what to ignore in the page (hit counters and such), and it can mail you the document with the changes highlighted or load the highlighted page in a browser.
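As a rough illustration of content-based change detection with ignore rules, here is a minimal Python sketch. It is not how websec itself is implemented (websec is written in Perl); the function names and the example "Visitors" pattern are hypothetical and chosen only for this illustration.

```python
import hashlib
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document and drops the tags.

    Note: script and style bodies are also collected; a fuller version
    would skip them.
    """

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def content_fingerprint(html, ignore_patterns=()):
    """Return a hash of the page text after removing ignored parts.

    ignore_patterns is a hypothetical list of regular expressions for
    noise such as hit counters.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    for pattern in ignore_patterns:
        text = re.sub(pattern, "", text)
    # Collapse whitespace so pure layout changes do not count as changes.
    text = " ".join(text.split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def content_changed(old_html, new_html, ignore_patterns=()):
    """True if the two pages differ in text, ignoring markup and noise."""
    return (content_fingerprint(old_html, ignore_patterns)
            != content_fingerprint(new_html, ignore_patterns))


if __name__ == "__main__":
    old = "<p>Hello world</p><p>Visitors: 1041</p>"
    new = "<div><b>Hello</b> world</div><p>Visitors: 1042</p>"
    # Markup and the counter changed, but the real content did not.
    print(content_changed(old, new, [r"Visitors: \d+"]))
```

Because text nodes are joined with spaces, a tag inserted in the middle of a word would still register as a change; that trade-off keeps the sketch short.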

Web Secretary is actually a suite of two Perl scripts called websec and webdiff. websec retrieves web pages and emails them to you based on a URL list that you provide. webdiff compares two web pages (current and archive) and creates a new page based on the current page but with all the differences highlighted using a predefined color.
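The idea behind the highlighting step can be sketched in a few lines of Python using the standard difflib module. This is only an illustration that works on plain text at word level; the real webdiff is a Perl script that operates on HTML pages, and the function name and default color below are assumptions made for the example.

```python
import difflib
import html


def highlight_changes(archive_text, current_text, color="#ffff66"):
    """Return the current text as HTML with new or changed words highlighted.

    The result is based on the current text only; words that exist only
    in the archived version are not shown.
    """
    old_words = archive_text.split()
    new_words = current_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        segment = html.escape(" ".join(new_words[j1:j2]))
        if not segment:
            continue  # pure deletion: nothing from the current text to emit
        if tag == "equal":
            out.append(segment)
        else:
            # "replace" or "insert": wrap the new words in the highlight color
            out.append('<span style="background-color: %s">%s</span>'
                       % (color, segment))
    return " ".join(out)


if __name__ == "__main__":
    print(highlight_changes("price is 10 dollars", "price is 12 dollars"))
```

Working on words rather than characters keeps each highlight readable, at the cost of marking a whole word when only one letter in it changed.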

For example, you can look at the Web Secretary page as it was monitored:

Personally, I put Web Secretary in my crontab to monitor a large number of web pages. When the highlighted pages are delivered to me, I use procmail to sort them and file them into another folder. Sometimes, when I am busy, I do not have time to access the web for a few days. However, with Web Secretary, I can always access the "archive" that it has created for me at my leisure.

The man pages can be found online: websec(1), url.list(5), ignore.list(5), webdiff(1).

Summary

You can download from the Savannah project file section. The latest versions are:

Contacts:

Installation

Are there any dependencies?

Only Perl 5 and the LWP module, which should be standard with all Perl distributions.

How to install?

Simply unpack the archive and modify the configuration file to your heart's content. There is no GUI to configure this program; it's all in the text files.

How to use?

Just run the program and it will do its magic. The best approach is to put it in a cron job for automatic daily checks, which works well if you are connected all the time.

If you are connected by dial-up, you may want to make it run automatically upon connection. How to do this differs between operating systems and distributions, so you will need to find the exact instructions on your own.

If you want to have different sites checked at different intervals, you can look at the advanced setup that Jani Uusitalo made for websec and cron.

How do I get help? How do I help?

You can subscribe to the mailing list. Post messages to request help and offer help. You can suggest ideas and even provide patches to implement them :-)

Please share with us the web pages that you have monitored using Web Secretary, as well as tips and tricks for maximizing the signal-to-noise ratio.

There are several other facilities to help each other:

Authors

The original author is Chew Wei Yih (also known as Victor Chew), with the help of several contributors (see the README file).

Baruch Even picked up the program when it was mostly unmaintained to give it a new public home on Savannah.

'Competitors'

Maybe you are not happy with Web Secretary (please tell me why!), or maybe you want to look at other options. It's a Free Software world, after all! The following is a list of programs I've found that are somewhat related to what WebSec does.

KWebWatch
Based on Web Secretary, but written in C++ and with a KDE GUI. Development has resumed, and it now offers a KDE3 user interface.
urlchange
Checks only timestamps, not actual page changes. Its main advantage is that it is written in Python.
wrep
A program that uses regular expressions to extract interesting parts of pages. It doesn't check for changes; it simply extracts the interesting parts.
There appear to be two programs named wrep, from two different authors, that work in totally different ways.
WebReporter
A language for munging web pages into something else. It can be used to create a summary page out of changes from other sites.