Project History
Robotcop started in response to excellent discussions on Evolt.org and
Slashdot.org which outlined the need for the software to exist. It was
written by three software developers in Northern California.
We looked around for existing solutions to the problem and felt that an Apache module would be another great weapon, in particular because most of the existing tools are CGI scripts, which cannot review every request to the site. We think Robotcop is currently the best way to protect a website against misbehaving spiders.
Technical Overview
Robotcop is a module written in C which is hooked into the access control API
of the webserver. Every request to the site is subject to a number of checks
before Robotcop allows it to proceed. If a check fails, Robotcop takes control
of the request to counter-attack or ban the spider, and the IP address of the
spider is added to an intercept list so that further requests from that
address are caught immediately. The IP address is removed from the list once
it has sent no further requests for a configurable period.
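The intercept list described above can be pictured with a small sketch. This is not Robotcop's actual source: the names (intercept_list, intercept_add, intercept_check), the fixed capacity, and the IPv4-only key are all assumptions made for illustration. It only shows the idea of an entry that stays alive while the spider keeps sending requests and expires after a configurable idle period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define MAX_INTERCEPTS 64            /* hypothetical fixed capacity */

typedef struct {
    uint32_t ip;                     /* IPv4 address (assumption: IPv4 only) */
    time_t   last_seen;              /* time of the most recent request */
    int      in_use;                 /* non-zero while the slot holds an entry */
} intercept_entry;

typedef struct {
    intercept_entry entries[MAX_INTERCEPTS];
    double idle_timeout;             /* configurable idle period, in seconds */
} intercept_list;

/* An entry is stale once it has been silent for the whole idle period. */
static int entry_expired(const intercept_list *list,
                         const intercept_entry *e, time_t now)
{
    return difftime(now, e->last_seen) >= list->idle_timeout;
}

void intercept_init(intercept_list *list, double idle_timeout)
{
    assert(list != NULL);
    for (size_t i = 0; i < MAX_INTERCEPTS; i++)
        list->entries[i].in_use = 0;
    list->idle_timeout = idle_timeout;
}

/* Called when a check fails: add the address, or refresh it if present.
 * Returns 0 on success, -1 if no slot is available. */
int intercept_add(intercept_list *list, uint32_t ip, time_t now)
{
    intercept_entry *slot = NULL;

    assert(list != NULL);
    for (size_t i = 0; i < MAX_INTERCEPTS; i++) {
        intercept_entry *e = &list->entries[i];
        if (e->in_use && e->ip == ip) {
            e->last_seen = now;      /* already listed: just refresh it */
            return 0;
        }
        /* Remember the first free or stale slot in case we need it. */
        if (slot == NULL && (!e->in_use || entry_expired(list, e, now)))
            slot = e;
    }
    if (slot == NULL)
        return -1;
    slot->ip = ip;
    slot->last_seen = now;
    slot->in_use = 1;
    return 0;
}

/* Called for every request: returns 1 if the address is on the list and
 * the request should be caught immediately, 0 otherwise. A caught request
 * restarts the idle timer; a stale entry is dropped. */
int intercept_check(intercept_list *list, uint32_t ip, time_t now)
{
    assert(list != NULL);
    for (size_t i = 0; i < MAX_INTERCEPTS; i++) {
        intercept_entry *e = &list->entries[i];
        if (!e->in_use || e->ip != ip)
            continue;
        if (entry_expired(list, e, now)) {
            e->in_use = 0;           /* silent long enough: remove it */
            return 0;
        }
        e->last_seen = now;
        return 1;
    }
    return 0;
}
```

In this sketch a request handler would call intercept_check first and only run the full set of checks when it returns 0, calling intercept_add when one of them fails. A real module serving concurrent requests would also need locking or shared memory, which is left out here.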
These are the checks applied to all requests to the webserver protected by Robotcop:
Project Direction
Robotcop is in active development right now. Have a feature suggestion? Want
to help out with development or serious testing? Try out the software on your
own website and join our mailing list. Here are the major improvements we
intend to add to the software: