Who’s Googling Who? Log Parsing II
Wouldn’t it be neat to run a script against your web server’s access logs to see which Google queries bring visitors to your site? The output from such a log-scanning script might look like [this] page, which lists the timestamp and URL of each Google-driven visit. Each URL is a link; clicking it shows the original Google search results as seen by the person who typed the keywords into Google’s search page. A page like this can help a site owner understand, among other things, which topics are most popular and which pages are structured effectively.
***
In Apache Log Parsing, I described a method to unpack web server log files so they could be analyzed for visitor traffic patterns. Since then, I’ve used the parser on my own web server logs and have been looking through the output.
***
After a bit, I noticed something surprising in the logs. The referer field of certain log entries contained what appeared to be a Google search URL. I confirmed this by copying and pasting a few of them into my browser. Each URL produced a Google search results page, and sure enough, somewhere on that page was a link to IPhone Cafe.
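Here is a minimal sketch of that check, written in Python rather than the Perl the original script uses. It assumes the server writes Apache's "combined" log format, and the sample log line, IP address, and function name are made up for illustration. It pulls the referer out of one log line and, if the referer looks like a Google search, returns the timestamp, the URL, and the search terms from the `q` parameter.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout: Apache "combined" format, where the referer is the
# second-to-last quoted field on the line.
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def google_query(line):
    """Return (timestamp, referer, search terms) for a Google-driven hit, else None."""
    m = COMBINED.match(line)
    if not m:
        return None
    referer = m.group("referer")
    parts = urlsplit(referer)
    host = (parts.hostname or "").lower()
    # Loose test: any host with a "google" label (google.com, google.co.uk, ...).
    # A stricter script would compare against a list of known Google domains.
    if "google" not in host.split("."):
        return None
    terms = parse_qs(parts.query).get("q")
    if not terms:
        return None
    return m.group("time"), referer, terms[0]

# Hypothetical log line, not taken from the real logs.
sample = ('203.0.113.7 - - [21/Feb/2008:04:21:00 -0800] '
          '"GET /archives/12 HTTP/1.1" 200 5120 '
          '"http://www.google.co.uk/search?q=iphone+disable+autocorrect&hl=en" '
          '"Mozilla/5.0"')
print(google_query(sample))
```

Lines whose referer is empty, is `-`, or points somewhere other than Google simply come back as `None`, so the function can be run over a whole log file as a filter.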
***
I thought, “Wow, assuming these aren’t robot transactions, I can now see which IPhone Cafe content is of greatest interest to real people.” Disabling the iPhone’s autocorrect seems to be the number one topic, and many of the autocorrect searches originate in Europe. Moving contact information from an old phone to an iPhone is another popular topic.
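Ranking topics like this can be done by tallying the search phrases instead of reading them one at a time. The following is only a rough Python sketch: the referer URLs are invented examples, `top_searches` is a hypothetical helper name, and it does nothing to filter out robot traffic.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

def top_searches(referers, n=5):
    """Count the Google 'q' search phrases in a list of referer URLs."""
    counts = Counter()
    for url in referers:
        terms = parse_qs(urlsplit(url).query).get("q")
        if terms:
            # Lowercase and collapse whitespace so near-identical phrases group together.
            counts[" ".join(terms[0].lower().split())] += 1
    return counts.most_common(n)

# Invented referers for illustration only.
referers = [
    "http://www.google.com/search?q=iphone+disable+autocorrect",
    "http://www.google.de/search?q=iPhone+disable+autocorrect&hl=de",
    "http://www.google.com/search?q=move+contacts+to+iphone",
]
for phrase, hits in top_searches(referers):
    print(hits, phrase)
```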
***
I couldn’t afford to spend a bunch of time manually eyeballing access logs, so I modified the log parsing script to open an output file and fill it with lines containing Google search URLs. As it writes each line, the script wraps an HTML anchor tag around the search URL. When it finishes, it uploads the resulting HTML file by FTP to a folder on my ISP’s server.
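The two steps, wrapping each search URL in an anchor tag and then sending the page up by FTP, might look roughly like this. It is a Python sketch and not the Perl script itself; the function names, the `searches.htm` file name, and the sample hits are all assumptions, and the upload function is defined but not called because it needs a real server and login.

```python
import html
from ftplib import FTP
from io import BytesIO

def build_page(hits, title="Google searches"):
    """Build an HTML page from (timestamp, search_url) pairs, one link per hit."""
    rows = []
    for stamp, url in hits:
        # Escape & and quotes so the URL is safe inside the href attribute.
        safe_url = html.escape(url, quote=True)
        rows.append(f'<p>{html.escape(stamp)} <a href="{safe_url}">{safe_url}</a></p>')
    return ("<html><head><title>" + html.escape(title) + "</title></head>\n<body>\n"
            + "\n".join(rows) + "\n</body></html>\n")

def upload_page(page, host, user, password, remote_name="searches.htm"):
    """Upload the page over FTP. Host, credentials, and file name are placeholders."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.storbinary(f"STOR {remote_name}", BytesIO(page.encode("utf-8")))

# Invented hits for illustration only.
hits = [
    ("21/Feb/2008:04:21:00 -0800",
     "http://www.google.co.uk/search?q=iphone+disable+autocorrect&hl=en"),
    ("21/Feb/2008:05:02:13 -0800",
     "http://www.google.com/search?q=move+contacts+to+iphone"),
]
page = build_page(hits)
print(page)
```

Escaping the URL matters because Google search URLs usually contain `&`, which has to appear as `&amp;` inside an HTML attribute.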
***
The resulting uploaded Google search file is located [here]. The large fonts are formatted for the iPhone’s Safari browser, but the page still looks pretty much okay in a desktop browser. Clicking a link displays the original Google search from my web server access log. Pretty neat.
***
The Perl script, which parses the log, formats the output, and does the FTP upload, should work for just about any web site, iPhone related or not, so I’ll post the complete script [here] after cleaning it up a bit.
Tony
IPhone Cafe » Apache Log Parsing said,
February 21, 2008 @ 4:21 am
[...] Visit Log Parsing II for an example showing how web access log information can be used. – Tony [...]
IPhone Cafe » Log Parsing III said,
February 23, 2008 @ 8:26 pm
[...] In an earlier post, Log Parsing II, I described scanning the Apache access log with Perl to build an html file containing Google searches of your site. Here’s a link to the complete example script: Example Script – Googles Searches [...]