Monitoring visits

Create site12 by copying site11.

  1. /cms
    1. ...
    2. site11
    3. site12

In this chapter, we program the logging of every request to the site, both in a file and in the database.

To test the result online, enter http://www.frasq.org/cms/site12 in the address bar of your browser.

Add a folder called log directly under the root of the site:

  1. /cms/site12
    1. log

IMPORTANT: Make sure the Apache process is allowed to write in the folder:

$ chgrp www-data log
$ chmod 775 log

NOTE: The name of the group of the Apache process is defined by the Group directive. The local and the remote servers can be configured differently, depending on your hosting provider.

Define the global variable $log_dir in config.inc:

  global $log_dir;

  $log_dir = ROOT_DIR . DIRECTORY_SEPARATOR . 'log';

Add the files clientipaddress.php, validateipaddress.php and log.php in the folder library with the following contents:

  1. /cms/site12
    1. library
      1. clientipaddress.php
      2. validateipaddress.php
      3. log.php

  function client_ip_address() {
      return $_SERVER['REMOTE_ADDR'];
  }

client_ip_address returns the PHP variable $_SERVER['REMOTE_ADDR']. NOTE: $_SERVER['HTTP_X_FORWARDED_FOR'], whose value has been added by a proxy server, and $_SERVER['HTTP_CLIENT_IP'], which is assigned directly by the client, are not reliable.

  function validate_ip_address($ipaddress) {
      // Each dot is escaped so that it matches only a literal '.' between the numbers.
      return preg_match('/^(([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$/', $ipaddress);
  }

validate_ip_address returns 1 (true) if $ipaddress is a valid IPv4 address in dotted-decimal notation, 0 (false) otherwise.
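
As a point of comparison, the same check can be sketched with PHP's filter extension instead of a regular expression. The function name validate_ipv4_address is hypothetical and isn't part of the site's library; it only illustrates which kinds of values are accepted and rejected.

```php
<?php
// Hypothetical alternative to validate_ip_address, for illustration only.
function validate_ipv4_address($ipaddress) {
    // FILTER_FLAG_IPV4 restricts the check to dotted-decimal IPv4 addresses.
    return filter_var($ipaddress, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
}

var_dump(validate_ipv4_address('192.168.1.10'));  // bool(true)
var_dump(validate_ipv4_address('256.1.1.1'));     // bool(false), 256 is out of range
var_dump(validate_ipv4_address('1a2b3c4'));       // bool(false), no dots
```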

  require_once 'clientipaddress.php';
  require_once 'validateipaddress.php';

Loads the functions client_ip_address and validate_ip_address.

  function write_log($logfile, $textline=false) {
      global $log_dir;

      $ipaddress = client_ip_address();

      if (!validate_ip_address($ipaddress)) {
          return false;
      }

      // date() produces the same format as strftime(), which is deprecated since PHP 8.1.
      $timestamp = date('Y-m-d H:i:s');

      $logmsg = "$timestamp $ipaddress";
      if ($textline) {
          $logmsg .= "\t$textline";
      }

      $file = isset($log_dir) ? ($log_dir . DIRECTORY_SEPARATOR . $logfile) : $logfile;

      $r = @file_put_contents($file, array($logmsg, "\n"), FILE_APPEND);

      return $r;
  }

write_log obtains the IP address of the client and validates it. If the address is invalid, it returns false without writing anything. Otherwise, it builds a message with the current date and time and the IP address, separated by spaces, followed by $textline, if any, preceded by a tab character. The message is then appended to the file $logfile in the directory designated by the global variable $log_dir, if it's defined. write_log returns the number of bytes written, or false in case of error.
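
To make the layout of a log line concrete, here is a minimal sketch which builds a line the same way and splits it back into its fields. format_log_line is a hypothetical helper written for this example; the date, the IP address and the request are made-up values.

```php
<?php
// Hypothetical helper: same line layout as write_log, without writing to disk.
function format_log_line($timestamp, $ipaddress, $textline=false) {
    $logmsg = "$timestamp $ipaddress";
    if ($textline) {
        $logmsg .= "\t$textline";
    }
    return $logmsg;
}

$line = format_log_line('2024-05-01 10:15:30', '203.0.113.7', '/en/home');

// The first tab-separated field holds the date, the time and the IP address.
list($stamp, $request) = explode("\t", $line);
list($date, $time, $ip) = explode(' ', $stamp);

echo $ip, ' ', $request, "\n";  // 203.0.113.7 /en/home
```

Keeping the tab between the stamp and the text is what later allows a tool like cut to isolate the request.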

The logging parameters of a request are defined in config.inc:

  global $track_db, $track_log;
  global $track_visitor, $track_visitor_agent;

  $track_db=true;
  $track_log=true;
  $track_visitor=true;
  $track_visitor_agent=true;

Setting $track_visitor to true turns on the logging of requests. $track_visitor_agent adds the content of the User-Agent header to the logged data. $track_db gives the name of the DB table which contains the log, track by default if $track_db is just true. $track_log gives the name of the file which contains the log, track.log by default if $track_log is just true, in the folder defined by $log_dir. If both $track_db and $track_log are false, no logging is performed.
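
The rule which turns a setting into a name can be sketched as follows. track_target is a hypothetical helper, not a function of the site's library, and visits.log is just an example of a custom file name.

```php
<?php
// Hypothetical helper: maps a tracking setting to a concrete table or file name.
function track_target($setting, $default) {
    if ($setting === true) {
        return $default;    // just true: use the default name
    }
    return $setting ? $setting : false;    // a name, or false when disabled
}

var_dump(track_target(true, 'track.log'));          // "track.log"
var_dump(track_target('visits.log', 'track.log'));  // "visits.log"
var_dump(track_target(false, 'track.log'));         // false
```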

NOTE: You can easily register other pieces of information like the visitors' languages.
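
As an illustration of this note, the languages announced by a client could be read from the Accept-Language header. Both functions below are hypothetical and modelled on the user agent functions of this chapter; storing the value would also require an extra field in the log and in the table, which isn't shown here.

```php
<?php
// Hypothetical function: returns the Accept-Language header, or false if absent.
function accept_language() {
    if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
        return $_SERVER['HTTP_ACCEPT_LANGUAGE'];
    }

    return false;
}

// Hypothetical validator: accepts only characters expected in a header
// such as fr-FR,fr;q=0.9,en;q=0.8
function validate_accept_language($languages) {
    return preg_match('/^[a-zA-Z0-9 ,;=.*-]+$/', $languages);
}

// Simulated header, for the example only.
$_SERVER['HTTP_ACCEPT_LANGUAGE'] = 'fr-FR,fr;q=0.9,en;q=0.8';

$languages = accept_language();
echo validate_accept_language($languages) ? $languages : 'rejected', "\n";
```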

Logging requests is managed by the dispatch function in engine.php:

  require_once 'track.php';

Loads the track function.

      global $track_visitor, $track_visitor_agent;

      $req = $base_path ? substr(request_uri(), strlen($base_path)) : request_uri();

      if ($track_visitor) {
          track($req, $track_visitor_agent);
      }

Calls track with the client's request $req and the global option $track_visitor_agent if the global variable $track_visitor is true.
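
The way $req is computed can be checked in isolation. In this sketch, strip_base_path is a hypothetical function which applies the same expression to a URI passed as an argument; the paths are examples.

```php
<?php
// Hypothetical function: removes the base path of the site from a request URI.
function strip_base_path($uri, $base_path) {
    return $base_path ? substr($uri, strlen($base_path)) : $uri;
}

echo strip_base_path('/cms/site12/en/home', '/cms/site12'), "\n";  // /en/home
echo strip_base_path('/en/home', false), "\n";                     // /en/home
```

With this, the same page is logged under the same request whether the site is installed at the root of the server or in a subfolder.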

Add the files useragent.php, validateuseragent.php and track.php in the folder library with the following contents:

  1. /cms/site12
    1. library
      1. useragent.php
      2. validateuseragent.php
      3. track.php

  function user_agent() {
      if (isset($_SERVER['HTTP_USER_AGENT'])) {
          return $_SERVER['HTTP_USER_AGENT'];
      }

      return false;
  }

user_agent returns the PHP variable $_SERVER['HTTP_USER_AGENT'], or false if the client didn't send a User-Agent header.

  function validate_user_agent($agent) {
      return preg_match('/^[a-zA-Z0-9 \;\:\.\-\)\(\/\@\]\[\+\~\_\,\?\=\{\}\*\|\&\#\!]+$/', $agent);
  }

validate_user_agent returns 1 (true) if $agent contains only characters expected in a User-Agent header, 0 (false) otherwise.
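
A quick way to see what the filter lets through is to run it on a few sample values. The function is copied from above so that the example runs on its own; the agent strings are made up for the test.

```php
<?php
// Copy of validate_user_agent, repeated here so the example is self-contained.
function validate_user_agent($agent) {
    return preg_match('/^[a-zA-Z0-9 \;\:\.\-\)\(\/\@\]\[\+\~\_\,\?\=\{\}\*\|\&\#\!]+$/', $agent);
}

$agents = array(
    'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0',
    '<script>alert(1)</script>',    // < and > are not in the accepted set
    '',                             // an empty agent is rejected too
);

foreach ($agents as $agent) {
    echo validate_user_agent($agent) ? 'accepted' : 'rejected', "\n";
}
// accepted
// rejected
// rejected
```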

  require_once 'clientipaddress.php';
  require_once 'validateipaddress.php';
  require_once 'requesturi.php';
  require_once 'useragent.php';
  require_once 'validateuseragent.php';

Loads the functions client_ip_address, validate_ip_address, request_uri, user_agent and validate_user_agent.

  function track($request_uri=false, $track_agent=false) {
      global $track_log, $track_db;

      if (! ($track_log or $track_db) ) {
          return true;
      }

      if (!$request_uri) {
          $request_uri=request_uri();
      }

      if (!$request_uri) {
          return false;
      }

      $user_agent=$track_agent ? user_agent() : false;
      if (!validate_user_agent($user_agent)) {
          $user_agent=false;
      }

      $r = true;

track returns immediately if the global variables $track_log and $track_db are both false. If the parameter $request_uri isn't defined, track initializes it by calling the function request_uri. If $request_uri is still not defined, track returns false. If the parameter $track_agent is true, the variable $user_agent is initialized by calling the function user_agent, then validated. An invalid agent is discarded.

      if ($track_log) {
          require_once 'log.php';

          $logmsg = $request_uri;
          if ($user_agent) {
              $logmsg .= "\t" . $user_agent;
          }

          $r = write_log($track_log === true ? 'track.log' : $track_log, $logmsg);
      }

If the global variable $track_log is true, track loads the function write_log and asks it to log the request and possibly the agent in the file whose name is defined by $track_log or called track.log by default.

      if ($track_db) {
          $ip_address=client_ip_address();

          if (!validate_ip_address($ip_address)) {
              return false;
          }

          $sqlipaddress=db_sql_arg($ip_address, false);
          $sqlrequesturi=db_sql_arg($request_uri, true);
          $sqluseragent=db_sql_arg($user_agent, true, true);

          $tabtrack=db_prefix_table($track_db === true ? 'track' : $track_db);

          $sql="INSERT $tabtrack (ip_address, request_uri, user_agent) VALUES ($sqlipaddress, $sqlrequesturi, $sqluseragent)";

          $r = db_insert($sql);
      }

      return $r;
  }

If the global variable $track_db is true, track obtains the IP address of the client and validates it, then prepares and executes an SQL statement which records the parameters of the request in the table whose name is defined by $track_db, or track by default.

Add the table track to the DB of the site:

$ mysql -u root -p
mysql> use frasqdb2;
mysql> CREATE TABLE track (
  track_id int(10) unsigned NOT NULL AUTO_INCREMENT,
  time_stamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  ip_address varchar(15) NOT NULL,
  request_uri varchar(255) NOT NULL,
  user_agent varchar(255) DEFAULT NULL,
  PRIMARY KEY (track_id)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8;
mysql> quit

Check in config.inc that the parameters $track_log and $track_db are set to true. Enter http://localhost/cms/site12/en/home in the address bar of your browser, go to the contact page, then switch languages.

In the folder log, display the last lines of the log file:

$ tail track.log

To obtain the number of distinct visitors, identified by their IP addresses:

$ cut -f 1 track.log | cut -d' ' -f 3 | sort | uniq | wc -l

To list the 10 most requested pages:

$ cut -f 2 track.log | sort | uniq -c | sort -rn | head -10
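
The same two figures can also be computed in PHP, for instance to display them in an administration page. This is only a sketch: summarize_log is a hypothetical function, and the sample lines are invented in the format written by write_log.

```php
<?php
// Hypothetical function: counts the distinct IP addresses and the requests per page.
function summarize_log($lines, $top=10) {
    $visitors = array();
    $pages = array();

    foreach ($lines as $line) {
        // Fields are separated by tabs: stamp, request, optional agent.
        $fields = explode("\t", rtrim($line, "\r\n"));
        if (count($fields) < 2) {
            continue;
        }

        // The stamp is made of the date, the time and the IP address.
        $stamp = explode(' ', $fields[0]);
        if (count($stamp) < 3) {
            continue;
        }

        $visitors[$stamp[2]] = true;

        $uri = $fields[1];
        $pages[$uri] = isset($pages[$uri]) ? $pages[$uri] + 1 : 1;
    }

    arsort($pages);    // most requested pages first

    return array(count($visitors), array_slice($pages, 0, $top, true));
}

$lines = array(
    "2024-05-01 10:15:30 203.0.113.7\t/en/home\tMozilla/5.0",
    "2024-05-01 10:15:42 203.0.113.7\t/en/contact",
    "2024-05-01 10:16:03 198.51.100.4\t/en/home",
);

list($count, $pages) = summarize_log($lines);

echo $count, "\n";    // 2
foreach ($pages as $uri => $n) {
    echo $n, ' ', $uri, "\n";
}
// 2 /en/home
// 1 /en/contact
```

On a real log, the lines could be read with file(), e.g. summarize_log(file('track.log')).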

Check the DB:

mysql> SELECT * FROM track;

To obtain the number of distinct visitors, identified by their IP addresses:

mysql> SELECT COUNT(DISTINCT ip_address) FROM track;

To list the 10 most requested pages:

mysql> SELECT request_uri, COUNT(request_uri) AS count FROM track GROUP BY request_uri ORDER BY count DESC LIMIT 10;

IMPORTANT: The amount of data generated can rapidly fill up the DB and the log file. Choose only one mode by setting either $track_db or $track_log to false. Once a campaign for analyzing the types of clients (browsers, mobiles, robots, etc.) is over, set the parameter $track_visitor_agent back to false.
