Scatter/Gather thoughts

by Johan Petersson

Spiderspotting: WWWC/1.04

One of my web pages (not part of this blog) is regularly fetched by a user agent that identifies itself only as WWWC/1.04. It took some digging before I found what must be the official WWWC site. It's written entirely in Japanese, but I'm not one to let mundane things like language barriers impede my investigation.

I don't know enough Japanese to talk myself out of a rice paper bag, but according to the abysmal translation services of Google Translate, WWWC is "the renewal checker of the Web page which operates with the Windows". Furthermore, "registering the URL of the Web page which is seen carefully, being something it checks renewal, it notifies the Web page which is renewed, it does."

Yoda couldn't have said it better. It looks like a client application that automatically checks web pages for updates. That's fine by me, although I wish the software were a tad better behaved. Since it's automatic it should check robots.txt, and using conditional GETs rather than repeatedly fetching the entire page would save resources for both me and its users. Then again, WWWC does not appear to be popular enough to be a problem.
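
For the curious, here is roughly what I mean by well-behaved. This is only a sketch in Python using the standard library, and it says nothing about how WWWC itself works: the ExampleChecker user agent and all the function names are made up for illustration. The idea is to consult robots.txt before fetching, and to send back the validators (ETag and Last-Modified) from the previous fetch so the server can answer with a tiny 304 Not Modified instead of the whole page.

```python
import urllib.error
import urllib.request
import urllib.robotparser

# Hypothetical user agent string, used only for this illustration.
USER_AGENT = "ExampleChecker/0.1"


def allowed_by_robots(robots_txt, url, agent=USER_AGENT):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def conditional_headers(etag=None, last_modified=None, agent=USER_AGENT):
    """Build headers that make a GET conditional on the cached validators."""
    headers = {"User-Agent": agent}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def check_for_update(url, etag=None, last_modified=None):
    """Fetch `url` only if it changed.

    Returns (changed, etag, last_modified, body). On a 304 response the
    old validators are returned unchanged and body is None.
    """
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return (
                True,
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
                response.read(),
            )
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError.
        if err.code == 304:
            return (False, etag, last_modified, None)
        raise
```

A checker built along these lines would store the ETag and Last-Modified values from each successful fetch and pass them to check_for_update on the next round, so an unchanged page costs only a handful of header bytes.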

13 February, 2005