{"id":2516,"date":"2005-05-08T03:05:11","date_gmt":"2005-05-08T11:05:11","guid":{"rendered":"http:\/\/www.soulhuntre.com\/items\/date\/2005\/05\/08\/google-web-accelerator-so-much-for-not-being-evil\/"},"modified":"2005-05-08T03:05:11","modified_gmt":"2005-05-08T11:05:11","slug":"google-web-accelerator-so-much-for-not-being-evil","status":"publish","type":"post","link":"http:\/\/legacyiamsenseiken.local\/2005\/05\/08\/google-web-accelerator-so-much-for-not-being-evil\/","title":{"rendered":"Google Web Accelerator – so much for not being evil."},"content":{"rendered":"

Well, it looks like Google<\/a> may have done a bad thing. OK, the latest in a series of kinda silly things that might be bad enough for some of the shine to come off their star. The \/. crowd has traditionally been big Google fans, but even that is starting to turn. The issue? The Google Web Accelerator<\/a> (GWA<\/a>), or as I have named it (in the same snarky way many say Micro$oft) the Google Web Annihilator<\/i><\/b>, is out there and doing enough damage that I have blocked it from accessing my web server<\/a>.<\/p>\n

\n

“See, Google isn’t serving web pages faster, its serving other people’s versions of the web page faster. What does that mean? Try using Web Accelerator on a forum site, one with lots of geeks who love Google and probably already have Web Accelerator installed. Why, if you’re lucky, you’ll be logged in as someone else, as the folks at SomethingAwful.com discovered<\/a>.” – quote in context<\/a><\/i><\/p>\n<\/blockquote>\n

<\/p>\n

What is the problem? In short, if you install GWA it sets your browser up to use Google as your proxy for web requests. All your web traffic will go through their servers first. If the GWA decides that the data in its cache is relevant to you, you get that copy instead of the page that lives on the website you were trying to reach. While this sounds like the caching your browser already does, this cache is shared across users and largely outside their control (except to turn it off).<\/p>\n
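For site owners, the standard HTTP answer to a shared cache is to mark per-user pages as private. The sketch below shows the idea in Python. It assumes the proxy honors ordinary `Cache-Control` directives, which this post has not verified for the GWA, and the helper name `cache_headers` and its parameters are made up for illustration.

```python
def cache_headers(personalized: bool, max_age: int = 300) -> dict:
    """Pick response headers depending on whether a page is per-user.

    Hypothetical helper: the name and the default max_age are illustrative.
    """
    if personalized:
        # "private" tells shared caches (proxies) not to store the response;
        # "no-store" asks every cache, including the browser, to keep no copy.
        # "Vary: Cookie" marks the response as dependent on the session cookie.
        return {"Cache-Control": "private, no-store", "Vary": "Cookie"}
    # Anonymous, identical-for-everyone pages can safely be shared.
    return {"Cache-Control": f"public, max-age={max_age}"}
```

A logged-in forum page would be served with `cache_headers(True)`, while a static "about" page could use `cache_headers(False)`.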

Ed. note: It seems that ASP.NET applications built using the standard
\ntools already use a combination of JavaScript and form submission that is immune
\nto the GWA.<\/i><\/b><\/p>\n

Your basic “nuke my data” concerns<\/h2>\n
\n

“We discovered this yesterday when a few people were reporting that their Backpack<\/a> pages were “disappearing.” We were stumped until we dug a little deeper and discovered this Web Accelerator behavior. Once we figured this out we added some code to prevent Google from prefetching the pages and clicking the links, but it was quite disconcerting.” – quote in context<\/a><\/i><\/p>\n<\/blockquote>\n

This is the big one, so I’ll mention it first. Lots and lots of web applications have admin screens that use links to kick off functions. Those links sometimes have names like “delete” on them. Since the GWA automatically “clicks” every link on every page you visit to try to preload data for you, it has already been seen to silently delete information<\/i><\/b>. Obviously links like those have always been cause for worry, but the password mechanisms that usually protect administration pages kept the search engine spiders out. No more, because the GWA uses your password and username to impersonate you.<\/p>\n

note: the prefetching can be turned off in the GWA preferences, but I doubt most
\npeople will do that.<\/i><\/p>\n
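On the server side, the usual defense has two parts: never let a plain link (a GET request) perform a destructive action, and refuse requests that announce themselves as prefetches. Here is a minimal Python sketch of that decision. It assumes the accelerator marks its speculative requests with an `X-moz: prefetch` header, as Mozilla-style link prefetching does. The function names and the choice of status codes are illustrative and not taken from any framework.

```python
from typing import Mapping

# Methods that must never change server state.
SAFE_METHODS = {"GET", "HEAD"}


def is_prefetch(headers: Mapping[str, str]) -> bool:
    """True if the request is marked as a prefetch (header names are case-insensitive)."""
    for name, value in headers.items():
        if name.lower() == "x-moz" and value.strip().lower() == "prefetch":
            return True
    return False


def decide(method: str, headers: Mapping[str, str], destructive: bool) -> int:
    """Return the HTTP status to answer with before doing any real work."""
    if is_prefetch(headers):
        return 403  # refuse speculative fetches outright
    if destructive and method.upper() in SAFE_METHODS:
        return 405  # deletes must arrive as a form POST, never as a followed link
    return 200
```

With this in place, a prefetcher following a “delete” link gets a 403 or 405 and nothing is removed. A real user submitting a form via POST is unaffected.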

Privacy concerns<\/h2>\n

Are you really comfortable with Google having access to every URL, every page and every screen of data you browse? Not to mention having a temporary (or not so temporary) copy of your access cookies, usernames and passwords? Let’s assume for a moment that mistakes never happen… I don’t want all that information going through Google.<\/p>\n

Of course the potential for mistakes is huge. The number of access and protection mechanisms employed by websites is large. Some or many of them may prove to be incompatible with the way GWA handles caching and thus not only is the exposure of private data possible… it has already been seen in the wild<\/a>.<\/p>\n

Information integrity concerns<\/h2>\n

Anyone who is a webmaster of a medium to large organization or a web developer already knows what a problem bad caching is. Out-of-date pages, incomplete loading and mutually contradictory information presented to the user are common problems. When the end user’s local cache is the only problem this can often be solved easily. Heck, most tech-savvy people I know run with the browser cache turned off whenever possible.<\/p>\n

There is already a whole layer of complexity thanks to the fact that AOL runs as a proxy \/ cache server for AOL browser users, and it causes all sorts of havoc for webmasters: old pages, broken pages, incomplete loading… not to mention the trouble with out-of-date DNS entries. Long after the rest of the web has your new domain \/ IP mapping, AOL can make a request to an IP days or weeks out of date. While we are discussing it, GWA does already change your website<\/a> some.<\/p>\n

Access control concerns<\/h2>\n
\n

“In addition I have found it impossible to use my web mail with this running because as soon as I sign in Google Web Accelerator is “clicking” on the sign-out link, and killing my session.<\/i><\/p>\n

You’d have thought Google would think these things through a bit more, considering all the great minds they’re supposed to have working for them” – quote in context (by “Mike”)<\/a><\/i><\/p>\n<\/blockquote><\/div>\n<\/p><\/div>\n<\/p><\/div>\n<\/p><\/div>\n<\/div>\n

Let’s be honest… most of the people making the big $$$ in password-protected websites are in the adult entertainment trade. Among adult sites, password trading is a serious problem. The most important ways to detect this are:<\/p>\n