Since the dawn of computer networks, servers have recorded logs of communications activity. You might be surprised to find out that email servers keep logs of every email sent, including who it came from and who it was sent to. Many large organizations even save copies of those emails.
When it comes to sharing files, FTP servers and file sharing services like Dropbox keep track of which files were uploaded and downloaded.
Then, of course, all website servers keep log files detailing visitor activity. Every time you visit a website, the server it's hosted on tracks this information:
- Your IP address
- The date and time of the visit
- The file requested
- Whether the request was completed (the status code)
- The number of bytes that were transferred
- The referral URL you came from
- The keywords you used to find the site, if any were sent
- The web browser you used
- The operating system you used
Here's an example of what gets recorded in the log file:
220.127.116.11 - - [17/Apr/2015:09:33:14 -0400] "GET /jewelry-catalog/rings.html HTTP/1.1" 200 14896 "http://www.bing.com/search?q=gold+rings+14k&src=IE-SearchBox&FORM=IE8SRC" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0)"
Even if you are trained in reading a log entry like that one, you would still need a program to decipher the thousands of entries that are collected daily.
What I see in there is that someone using Windows XP with Internet Explorer 7 was searching on Bing for the keyword phrase "gold rings 14k." From the Bing search results, they clicked on the link for the /jewelry-catalog/rings.html page.
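To show how a program might pull those same details out of the entry, here is a minimal Python sketch. It assumes the log uses the common "combined" layout seen in the example above; the names `LOG_PATTERN` and `parse_log_line` are my own, not part of any log analyzer.

```python
import re
from datetime import datetime

# Assumed layout: host ident user [time] "request" status bytes "referrer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the text fields that are really a timestamp and numbers.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


# The entry from the example above.
sample = ('220.127.116.11 - - [17/Apr/2015:09:33:14 -0400] '
          '"GET /jewelry-catalog/rings.html HTTP/1.1" 200 14896 '
          '"http://www.bing.com/search?q=gold+rings+14k&src=IE-SearchBox&FORM=IE8SRC" '
          '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0)"')

entry = parse_log_line(sample)
print(entry["ip"], entry["path"], entry["status"], entry["bytes"])
```

Lines that don't fit the assumed layout come back as `None`, so a real log with a different format would need the pattern adjusted.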
Furthermore, I can tell you that this person was in Greenville, South Carolina, U.S. at the time of their search, which was April 17, 2015 at 9:33 a.m. Eastern. I also know that they were using a cable or fiber internet connection.
I'm using the IP address to look up the town name and the broadband type. You can find out how to do that yourself here.
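As a rough starting point, here is a Python sketch of a reverse DNS lookup, which asks which host name is registered for an IP address. This is not the same thing as the town and broadband lookup I described, which needs a geolocation database or service that isn't shown here. The function names `ptr_query_name` and `reverse_lookup` are my own.

```python
import ipaddress
import socket


def ptr_query_name(ip):
    """Build the DNS name that a reverse (PTR) lookup queries for this address."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip):
    """Return the host name registered for ip, or None if there is none."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        # No reverse record, or the network/DNS is unavailable.
        return None
    return hostname


print(ptr_query_name("220.127.116.11"))  # 11.116.127.220.in-addr.arpa
```

Calling `reverse_lookup` needs a working network connection, and many addresses have no reverse record at all, so treat a `None` result as "unknown" rather than as an error.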
You can use software like AWStats and Webalizer to turn these log files into usable information very quickly. Your web server probably already has one of those log analyzers installed and running.
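To give a feel for what such an analyzer does at its simplest, here is a small Python sketch that tallies page hits, status codes, and bytes sent. It is nowhere near what AWStats or Webalizer do, and the log lines and the `summarize` function are made up for illustration.

```python
import re
from collections import Counter

# Matches the quoted request plus the status code and byte count that follow it.
REQUEST_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def summarize(lines):
    """Count hits per page and per status code, and total the bytes sent."""
    pages = Counter()
    statuses = Counter()
    total_bytes = 0
    for line in lines:
        match = REQUEST_PATTERN.search(line)
        if match is None:
            continue  # skip lines that don't look like a request
        pages[match["path"]] += 1
        statuses[match["status"]] += 1
        if match["bytes"] != "-":
            total_bytes += int(match["bytes"])
    return pages, statuses, total_bytes


# Hypothetical log lines, using reserved documentation IP addresses.
sample_lines = [
    '203.0.113.5 - - [17/Apr/2015:09:33:14 -0400] "GET /jewelry-catalog/rings.html HTTP/1.1" 200 14896 "-" "-"',
    '203.0.113.9 - - [17/Apr/2015:09:40:02 -0400] "GET /jewelry-catalog/rings.html HTTP/1.1" 200 14896 "-" "-"',
    '203.0.113.9 - - [17/Apr/2015:09:41:10 -0400] "GET /missing.html HTTP/1.1" 404 512 "-" "-"',
]

pages, statuses, total_bytes = summarize(sample_lines)
print(pages.most_common(1))
print(dict(statuses), total_bytes)
```

With a real log you would pass the open file itself to `summarize`, since it only needs something that yields one line at a time.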
With the popularity and versatility of Google Analytics, most website owners don't bother with the crude stats collected by the web server. For all its power, though, Google Analytics is not allowed to record IP addresses, because they are considered personally identifiable information (PII). IP address analysis is very important for the higher-level marketing analysis that helps you determine how your out-of-home marketing is working. You can read more about the PII topic here.
I used Bing in the example above to illustrate how search engine keywords are recorded. While Bing still allows us to see these keywords, Google does not. In October 2011 Google started obfuscating these keywords and reporting them as "(not provided)" in the Analytics reports. At first Google hid only some of the keywords, but by September 2013 it was hiding all keywords from Google SERPs.
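Here is a short Python sketch of how the keywords can be read out of a referral URL when they are present. It assumes the search phrase travels in a `q` parameter, as it does in the Bing example above. The bare Google URL is a made-up stand-in for a referral that carries no keywords, and `search_keywords` is my own name.

```python
from urllib.parse import parse_qs, urlsplit


def search_keywords(referrer):
    """Return the search phrase carried in a referrer URL, or None if absent."""
    query = parse_qs(urlsplit(referrer).query)
    values = query.get("q")  # assumption: the phrase is sent as "q"
    return values[0] if values else None


bing_referrer = "http://www.bing.com/search?q=gold+rings+14k&src=IE-SearchBox&FORM=IE8SRC"
google_referrer = "https://www.google.com/"  # hypothetical referral with no keywords

print(search_keywords(bing_referrer))    # gold rings 14k
print(search_keywords(google_referrer))  # None
```

When the function returns `None`, the log simply has no keywords to report, which is the situation that shows up as "(not provided)" in Analytics.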
Reviewing website log files is a task best left to the professionals, although it doesn't hurt to take a look at your own AWStats or Webalizer stats every once in a while. Your web programmer should be able to provide you with directions on how to read them.
Of all the data reported in the log files, I find the referral information to be the most interesting. Although Google Analytics also records referrals, the referrals recorded in your server log file are still more accurate than those in Google Analytics. I'll talk more about log file referrals tomorrow.
In closing for today, I want to alert you to a potential security risk regarding your log files. The log files are stored as massive text files. Most web servers will save them in hidden hard drive directories that can't be accessed without FTP software or a special web browser interface. Make sure your own log files can't be accessed without a username and password.