robots.txt Information
robotstxt.org
This site is a great starting point for learning about spiders.
robotstxt.org
This page in particular has everything you need to know to configure
your robots.txt file.
RoboGen
If you don't want to create your robots.txt by hand, this software can help
you write it.
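As a rough illustration of what a robots.txt file expresses, the sketch below parses a small example with Python's standard urllib.robotparser module and asks whether a given spider may fetch a given path. The rules, the paths, and the "BadBot" and "GoodBot" names are hypothetical and chosen only for illustration; they are not taken from any of the sites listed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (illustrative paths and bot names only).
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# BadBot is excluded from the whole site.
print(parser.can_fetch("BadBot", "/index.html"))
# Any other spider falls under the "*" rules.
print(parser.can_fetch("GoodBot", "/index.html"))
print(parser.can_fetch("GoodBot", "/private/data.html"))
```

Keep in mind that robots.txt is only a request: a well-behaved spider runs a check like this before fetching a page, while a bad robot simply ignores the file.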
News and Discussions
No Bots Allowed!
News article on ZDNet about the eBay vs. Bidder's Edge lawsuit, which
involved the legal authority of the robots.txt file.
Tough Times for Data Robots
News article in the New York Times about the Register.com vs. Verio decision, which
also involved robots ignoring the robots.txt file.
Search Engines and Legal Issues
Great collection of news links on Searchenginewatch.com regarding spiders and crawling websites without permission.
The honest truth about bad robots
Great article and discussion on the subject of blocking bad robots on Evolt.org.
How to Defeat Bad Web Robots With Apache
Lee Killough puts together an excellent summary of the many techniques for
blocking bad bots.
Stopping SpamBots With Apache
A good discussion at Slashdot.org on techniques for dealing with unwelcome
spiders.
Other Projects
Wpoison
A similar project that provides a CGI script for generating endless lists of fake
e-mail addresses to feed to spiders. It could be a good alternative for people
who can't use Robotcop.
LaBrea
A project applying similar principles to fighting network scanners.