Google Web Accelerator – so much for not being evil.

Well, it looks like Google may have done a bad thing. OK, the latest in a series of kinda silly things, and maybe one bad enough for some of the shine to come off their star. The /. crowd has traditionally been big Google fans, but even that is starting to turn. The issue? The Google Web Accelerator (GWA), or as I have named it (in the same snarky way many say Micro$oft) the Google Web Annihilator, is out there and doing enough damage that I have blocked it from accessing my web server.

“See, Google isn’t serving web pages faster, its serving other people’s versions of the web page faster. What does that mean? Try using Web Accelerator on a forum site, one with lots of geeks who love Google and probably already have Web Accelerator installed. Why, if you’re lucky, you’ll be logged in as someone else, as the folks at SomethingAwful.com discovered.” – quote in context

What is the problem? In short, if you install GWA it sets your browser up to use Google as your proxy for web requests. All your web traffic will go through their servers first. If the GWA decides that the data in their cache is relevant to you, you get that copy instead of the page that lives on the website you were trying to go to. While this sounds like the caching your browser already does, this one is cross-person and largely outside the control of the users (except to turn it off).
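To make the "cross-person" part concrete, here is a minimal sketch of the failure mode the SomethingAwful quote describes. It is not how Google's cache actually works (I have no idea how they key it); `render_page` and `SharedCache` are hypothetical names, and the only assumption is a proxy cache that stores pages by URL without regard to who asked.

```python
from typing import Callable, Dict


def render_page(url: str, user: str) -> str:
    """Hypothetical origin server: the page depends on who is logged in."""
    return f"{url}: logged in as {user}"


class SharedCache:
    """A naive proxy cache shared by every user and keyed by URL alone."""

    def __init__(self, origin: Callable[[str, str], str]) -> None:
        self.origin = origin
        self.store: Dict[str, str] = {}

    def fetch(self, url: str, user: str) -> str:
        # The user is passed to the origin on a miss, but is NOT part of
        # the cache key, so the first visitor's copy is served to everyone.
        if url not in self.store:
            self.store[url] = self.origin(url, user)
        return self.store[url]


cache = SharedCache(render_page)
first = cache.fetch("/forum", "alice")   # cache miss, rendered for alice
second = cache.fetch("/forum", "bob")    # cache hit, bob gets alice's page
print(first)
print(second)
```

Your own browser cache never has this problem because there is only one of you behind it.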

Ed. note: It seems that ASP.NET applications built using the standard tools already use a combination of JavaScript and form submission that is immune to the GWA.

Your basic “nuke my data” concerns

“We discovered this yesterday when a few people were reporting that their Backpack pages were “disappearing.” We were stumped until we dug a little deeper and discovered this Web Accelerator behavior. Once we figured this out we added some code to prevent Google from prefetching the pages and clicking the links, but it was quite disconcerting.” – quote in context

This is the big one, so I’ll mention it first. Lots and lots of web applications have admin screens that use links to kick off functions. Those links sometimes have names like “delete” on them. Since the GWA automatically “clicks” every link on every page you visit to try to preload data for you, it has already been seen to silently delete information. Obviously links like those have always been cause for worry, but the password mechanisms that usually protect administration pages kept the search engine spiders outside. No more, because the GWA makes its requests with your username and password, impersonating you.

Note: the prefetching can be turned off in the GWA preferences, but I doubt most
people will do that.
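Here is a toy version of the "delete link" problem, as a sketch only. The admin app, its `/delete?id=` URL scheme, and the `prefetch_all_links` function are all made up for illustration; the real GWA prefetcher is certainly more involved. The point is just that anything which follows every link on a page will also follow the destructive ones.

```python
from html.parser import HTMLParser
from typing import Dict, List
from urllib.parse import parse_qs, urlparse

# Hypothetical data behind a password-protected admin screen.
records: Dict[int, str] = {1: "first note", 2: "second note"}


def handle_get(url: str) -> str:
    """Hypothetical admin app where a plain GET link performs the delete."""
    parsed = urlparse(url)
    if parsed.path == "/delete":
        record_id = int(parse_qs(parsed.query)["id"][0])
        records.pop(record_id, None)
        return "deleted"
    # Any other path renders the listing, one delete link per record.
    rows = "".join(
        f'<li>{text} <a href="/delete?id={rid}">delete</a></li>'
        for rid, text in records.items()
    )
    return f"<ul>{rows}</ul>"


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def prefetch_all_links(page_html: str) -> List[str]:
    """Stand-in for a prefetcher: request every link found on the page."""
    collector = LinkCollector()
    collector.feed(page_html)
    for link in collector.links:
        handle_get(link)
    return collector.links


page = handle_get("/admin")          # the user merely looks at the list
followed = prefetch_all_links(page)  # the "accelerator" does the rest
print(followed, records)
```

The user only viewed the admin page, yet every record is gone by the time the prefetcher has finished "helping".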

Privacy concerns

Are you really comfortable with Google having access to every URL, every page and every screen of data you browse? Not to mention having a temporary (or not so temporary) copy of your access cookies, usernames and passwords? Let’s assume for a moment that mistakes never happen… I still don’t want all that information going through Google.

Of course the potential for mistakes is huge. The number of access and protection mechanisms employed by websites is large. Some or many of them may prove to be incompatible with the way GWA handles caching and thus not only is the exposure of private data possible… it has already been seen in the wild.

Information integrity concerns

Anyone who is a webmaster of a medium to large organization or a web developer already knows what a problem bad caching is. Out of date pages, incomplete loading and mutually contradictory information presented to the user are common problems. When the end user’s local cache is the only problem, this can often be solved easily. Heck, most tech-savvy people I know run with the browser cache turned off whenever possible.

There is already a whole layer of complexity thanks to the fact that AOL runs as a proxy / cache server for AOL browser users. It causes all sorts of havoc for webmasters this way. Old pages, broken pages, incomplete loading… not to mention the trouble with out of date DNS entries. Long after the whole web has your new domain / IP mapping, AOL can make a request for an IP days or weeks out of date. While we are discussing it, GWA does already change your website some.

Access control concerns

"In addition I have found it impossible to use my
web mail with this running because as soon as I sign in
Google Web Accelerator is “clicking” on the sign-out
link, and killing my session.

You’d have thought Google would think these things
through a bit more, considering all the great minds
they’re supposed to have working for them" – quote in context (by "Mike")

Let’s be honest… most of the people making the big $$$ in password protected websites are in the adult entertainment trade. Among adult sites password trading is a serious problem. The most important ways to detect this are:

  • Multiple logins for the same user coming from different IP addresses simultaneously or in a short time
  • Many connections from the same user for many pages at the same moment, even from the same IP

See the problem? The GWA will make most logins using it look like they come from the same IP address, no matter who is using it. And yes, we have the same challenge with AOL. Also, the GWA preloads pages for the user in such a way that when you go to a page, the spider runs off and hits every link on the page while you read. Given the size of the server farm Google uses, the resulting storm could easily look like a piracy attempt.
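As a rough sketch of the first heuristic, and of how a shared proxy blinds it: the function below flags any user seen from too many distinct IPs inside a time window. The name `flag_shared_accounts`, the thresholds, and the sample addresses (documentation-range IPs) are all assumptions made for illustration, not anyone's production rules.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Login = Tuple[int, str, str]  # (unix timestamp, username, source IP)


def flag_shared_accounts(
    logins: List[Login], window_seconds: int = 600, max_ips: int = 2
) -> Set[str]:
    """Flag users seen from more than max_ips distinct IPs in one window."""
    by_user: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for timestamp, user, ip in logins:
        by_user[user].append((timestamp, ip))

    flagged: Set[str] = set()
    for user, events in by_user.items():
        events.sort()
        for i, (start, _) in enumerate(events):
            # Distinct IPs seen within window_seconds of this login.
            ips = {ip for t, ip in events[i:] if t - start <= window_seconds}
            if len(ips) > max_ips:
                flagged.add(user)
                break
    return flagged


# One traded password used from three different places within two minutes.
direct = [
    (0, "shared", "203.0.113.1"),
    (60, "shared", "203.0.113.2"),
    (120, "shared", "203.0.113.3"),
]
# The same three logins, but every request arrives via one proxy address.
proxied = [(t, user, "198.51.100.7") for t, user, _ in direct]

print(flag_shared_accounts(direct))   # the traded account is caught
print(flag_shared_accounts(proxied))  # behind the proxy it looks like one person
```

The second heuristic (many pages at the same moment) fails in the opposite direction: an honest user whose pages are all being prefetched can look exactly like a ripper.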

Revenue Concerns

Many forms of revenue generation on the web depend on “click through” (how many times an ad generates a user click) in some form or another for traffic counting. It may be a top list, a pay per click ad program (like Google’s own AdSense) or some other mechanism that is expecting human clicks. Granted, this is mostly a solved problem for services like this, but it will cause issues.

This is all quite aside from the fact that the GWA makes many types of site statistics (how often someone logs in, when, where from and so on) impossible to collect reliably.

Remedies

  • You can block or redirect access by the GWA. You may send it to a page
    explaining to users why there is a problem. This is probably for the best
    all around… there are many concerns with this product and just locking it
    out will solve all of them.
  • No destructive changes, and probably no data changes at all, should be
    triggered by GET style links and forms. Use POST.
  • GWA stays away from anything under HTTPS. It is probably true that your
    admin interfaces should be under HTTPS anyway.
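The first two remedies can be sketched as a small WSGI application. Two loud assumptions: first, that GWA prefetch requests carry an `X-moz: prefetch` header, which is what was reported at the time but which you should verify against your own logs; second, the `/delete` path and the response texts, which are invented for the example. Treat it as a starting point, not a complete defense.

```python
from typing import Callable, Dict, List, Optional, Tuple


def app(environ: Dict[str, str], start_response: Callable) -> List[bytes]:
    """Refuse prefetch requests and refuse destructive actions over GET."""
    plain = ("Content-Type", "text/plain")

    # Assumption: prefetchers announce themselves with "X-moz: prefetch".
    # WSGI exposes that request header as HTTP_X_MOZ.
    if environ.get("HTTP_X_MOZ", "").strip().lower() == "prefetch":
        start_response("403 Forbidden", [plain])
        return [b"Prefetching is not allowed here.\n"]

    path = environ.get("PATH_INFO", "/")
    method = environ.get("REQUEST_METHOD", "GET")

    # Hypothetical destructive endpoint: only POST may change data.
    if path == "/delete":
        if method != "POST":
            start_response("405 Method Not Allowed", [plain, ("Allow", "POST")])
            return [b"Use POST to delete.\n"]
        start_response("200 OK", [plain])
        return [b"Deleted.\n"]

    start_response("200 OK", [plain])
    return [b"Hello.\n"]


def call(
    method: str, path: str, headers: Optional[Dict[str, str]] = None
) -> Tuple[str, bytes]:
    """Tiny helper that invokes the app directly, without a web server."""
    environ = {"REQUEST_METHOD": method, "PATH_INFO": path}
    environ.update(headers or {})
    statuses: List[str] = []
    body = b"".join(app(environ, lambda status, hdrs: statuses.append(status)))
    return statuses[0], body


print(call("GET", "/", {"HTTP_X_MOZ": "prefetch"}))
print(call("GET", "/delete"))
print(call("POST", "/delete"))
```

Rejecting the header only stops polite prefetchers that identify themselves; moving destructive actions to POST is the part that protects you regardless of who is clicking.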
