Syntax coloring of source files

Coloring source files when you display them makes them more readable and more attractive. Nothing could be simpler with GeSHi and a short piece of PHP code.

Download GeSHi - Generic Syntax Highlighter. Open the archive geshi-*.bz2.

Organize the content of your site by creating two folders at its root: includes for included files and library for your own PHP code. Copy the geshi directory and the geshi.php file from the archive into /includes.

/
    includes
        geshi
        geshi.php
    library
        prettyfile.php

Copy the following code into a file called prettyfile.php in /library.

prettyfile.php defines two functions: read_file and pretty_file. read_file reads an entire file, or a part of it, into a string. pretty_file returns the content of a file, or a part of it, highlighted for a given computer language.

require_once 'geshi.php';

The code starts by loading geshi.php, which defines the GeSHi class. NOTE: Configure your site so the directories /includes and /library are listed in the PHP include path. Add the following lines to your code in the root directory of the site:

define('ROOT_DIR', dirname(__FILE__));

set_include_path(get_include_path() . PATH_SEPARATOR . ROOT_DIR . DIRECTORY_SEPARATOR . 'library');
set_include_path(get_include_path() . PATH_SEPARATOR . ROOT_DIR . DIRECTORY_SEPARATOR . 'includes');
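
The two calls to set_include_path can also be folded into one helper which avoids listing a directory twice. This is only a sketch: add_include_dirs is a hypothetical name which belongs to neither iZend nor GeSHi, and the code assumes it runs from a file placed in the root directory of the site.

```php
<?php
// Assumption: this file sits in the root directory of the site.
if (!defined('ROOT_DIR')) {
    define('ROOT_DIR', dirname(__FILE__));
}

// Hypothetical helper: appends directories under ROOT_DIR to the include path,
// skipping the ones which are already listed.
function add_include_dirs(array $names) {
    $paths = explode(PATH_SEPARATOR, get_include_path());

    foreach ($names as $name) {
        $dir = ROOT_DIR . DIRECTORY_SEPARATOR . $name;

        if (!in_array($dir, $paths, true)) {
            $paths[] = $dir;
        }
    }

    set_include_path(implode(PATH_SEPARATOR, $paths));

    return get_include_path();
}

add_include_dirs(array('library', 'includes'));
```

Calling the helper a second time with the same names leaves the include path unchanged.
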

read_file
function read_file($file, $startline=0, $endline=0) {
    if ($startline or $endline) {
        // Read all the lines, each one with its end of line character.
        $lines=@file($file);

        if (false === $lines) {
            return false;
        }

        // Line numbers start at 1, array indexes at 0.
        $offset=$startline ? $startline-1 : 0;

        if ($endline) {
            $length=$startline ? $endline - $startline + 1 : $endline;
            $lines = array_slice($lines, $offset, $length);
        }
        else {
            $lines = array_slice($lines, $offset);
        }

        $s=implode('', $lines);
    }
    else {
        // No range given: read the entire file.
        $s=@file_get_contents($file);

        if (false === $s) {
            return false;
        }
    }

    // Remove trailing whitespace, including empty lines.
    $s=rtrim($s);

    return $s;
}

If $startline and $endline are both 0, we read the entire file with file_get_contents.

To extract a part of the file, we read it with file, which returns all the lines in an array. Each line in $lines includes the end of line character, except maybe the last one. $lines is reduced with array_slice to the wanted portion of the file. If $startline is 0, the first $endline lines are extracted. If $endline is 0, all the lines from $startline to the end of the file are extracted. If $startline and $endline are both different from 0, the lines from $startline to $endline inclusive are extracted. The resulting array is changed into a single character string with implode.

Finally, rtrim removes all the whitespace at the end of the text, including empty lines.

read_file returns the lines read in $file in a string or false in case of error.
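
The three ways of selecting lines can be tried on an array without reading any file. The following sketch isolates the offset and length arithmetic of read_file in a hypothetical helper called slice_lines, which is not part of prettyfile.php.

```php
<?php
// Hypothetical helper mirroring the offset/length arithmetic of read_file.
// Line numbers start at 1; 0 means "not specified".
function slice_lines(array $lines, $startline=0, $endline=0) {
    $offset = $startline ? $startline - 1 : 0;

    if ($endline) {
        $length = $startline ? $endline - $startline + 1 : $endline;

        return array_slice($lines, $offset, $length);
    }

    return array_slice($lines, $offset);
}

// Same shape as the array returned by file(): every line keeps its end of
// line character, except the last one.
$lines = array("one\n", "two\n", "three\n", "four\n", "five");

echo implode('', slice_lines($lines, 0, 2)), PHP_EOL;   // the first 2 lines
echo implode('', slice_lines($lines, 4, 0)), PHP_EOL;   // from line 4 to the end
echo implode('', slice_lines($lines, 2, 3)), PHP_EOL;   // lines 2 and 3
```
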

pretty_file
function pretty_file($file, $language, $startline=0, $endline=0) {
    $s=read_file($file, $startline, $endline);

    // Read error or empty content.
    if (!$s) {
        return false;
    }

    // No language: return the text as it is.
    if (!$language) {
        return $s;
    }

    $output = false;

    switch ($language) {
        case 'plain':
            $s = preg_replace("/\]\=\>\n(\s+)/m", "] => ", $s);
            $s = htmlentities($s, ENT_COMPAT, 'UTF-8');

            $output = '<pre class="plain">' . PHP_EOL . $s . '</pre>' . PHP_EOL;
            break;
        default:
            $geshi = new GeSHi($s, $language);
            $geshi->enable_classes(true);
            $geshi->set_header_type(GESHI_HEADER_DIV);
            $geshi->enable_line_numbers(GESHI_NORMAL_LINE_NUMBERS);
            $geshi->start_line_numbers_at($startline > 0 ? $startline : 1);
            $geshi->enable_keyword_links(false);
            $geshi->set_tab_width(4);

            // Uncomment the next line to display the style sheet for $language.
//          echo '<pre>' . PHP_EOL .$geshi->get_stylesheet( ). '</pre>' . PHP_EOL;

            $output = $geshi->parse_code();

            if ($geshi->error()) {
                return false;
            }
    }

    return $output;
}

pretty_file starts by calling read_file to read $file from $startline to $endline. In case of error, it returns false.

If $language is not specified, the content of $file is returned unformatted. NOTE: Avoid reading this way a file whose content can be interpreted by a browser, since the text is returned without any escaping.

If $language is 'plain', the content of $file is returned in a <pre> block with its class attribute set to plain for CSS. Each ]=> followed by a line break and an indent, as in the output of var_dump, is rewritten as ] => on a single line, and special characters are converted to HTML entities with htmlentities. Note that the text is expected to be encoded in UTF-8.
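
The effect of the 'plain' case can be checked on a short string. The sketch below copies the two statements of that case into a hypothetical function called pretty_plain, which is not part of prettyfile.php. The sample text imitates the layout printed by var_dump.

```php
<?php
// Hypothetical helper reproducing the 'plain' case of pretty_file on a string.
function pretty_plain($s) {
    // Bring the value printed on the line after "]=>" back on the same line.
    $s = preg_replace("/\]\=\>\n(\s+)/m", "] => ", $s);
    // Convert the characters which are special in HTML.
    $s = htmlentities($s, ENT_COMPAT, 'UTF-8');

    return '<pre class="plain">' . PHP_EOL . $s . '</pre>' . PHP_EOL;
}

// Sample text laid out like the output of var_dump.
$dump = "array(1) {\n  [\"tag\"]=>\n  string(3) \"<b>\"\n}";

echo pretty_plain($dump);
```

Because the > of => is itself escaped, the source of the returned HTML contains =&gt; while the browser displays =>.
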

All other values of $language are passed to GeSHi. $geshi is initialized with the lines to format and the computer language. Then, before calling parse_code, the object is configured in order to obtain the smallest output without any CSS code: with enable_classes(true), GeSHi writes class names in the HTML instead of inline styles.

pretty_file returns $output or false in case of error.

Write a short document which will test pretty_file by coloring itself. Save the following code in a file called geshitest.php in the root directory of the site.

<html>
<head>
<link href="/css/html4strict.css" rel="stylesheet" type="text/css" media="screen" />
<title>Geshi</title>
</head>
<body>
<p>
<?php
define('ROOT_DIR', dirname(__FILE__));

set_include_path(get_include_path() . PATH_SEPARATOR . ROOT_DIR . DIRECTORY_SEPARATOR . 'library');
set_include_path(get_include_path() . PATH_SEPARATOR . ROOT_DIR . DIRECTORY_SEPARATOR . 'includes');

require_once 'prettyfile.php';

echo pretty_file('geshitest.php', 'html4strict');
?>
</p>
</body>
</html>

If we access the document with a browser, we don't see exactly the expected result.

  1. <html>
  2. <head>
  3. <link href="/css/html4strict.css" rel="stylesheet" type="text/css" media="screen" />
  4. <title>Geshi</title>
  5. </head>
  6. <body>
  7. <p>
  8. <?php
  9. define('ROOT_DIR', dirname(__FILE__));
  10.  
  11. set_include_path(get_include_path() . PATH_SEPARATOR . ROOT_DIR . DIRECTORY_SEPARATOR . 'library');
  12. set_include_path(get_include_path() . PATH_SEPARATOR . ROOT_DIR . DIRECTORY_SEPARATOR . 'includes');
  13.  
  14. require_once 'prettyfile.php';
  15.  
  16. echo pretty_file('geshitest.php', 'html4strict');
  17. ?>
  18. </p>
  19. </body>
  20. </html>

Notice the <link> tag which tries to include a style sheet file named /css/html4strict.css. We need to build this file.

Styling

The HTML returned by pretty_file doesn't contain any CSS. To properly style your pages, ask GeSHi to generate the style sheet and save it in a separate file which you will include in the document. pretty_file can do it. Just uncomment the line which outputs the style sheet.

//          echo '<pre>' . PHP_EOL .$geshi->get_stylesheet( ). '</pre>' . PHP_EOL;

Reload geshitest.php in your browser. The source of the style sheet is displayed at the beginning.

/**
 * GeSHi (C) 2004 - 2007 Nigel McNie, 2007 - 2008 Benny Baumann
 * (http://qbnz.com/highlighter/ and http://geshi.org/)
 */
.html4strict {font-family:"Courier New", Courier, monospace; font-size:10pt;}
.html4strict .de1, .html4strict .de2 {margin:0; padding:0; background:none; vertical-align:top;}
.html4strict .imp {font-weight: bold; color: red;}
.html4strict li, .html4strict .li1 {font-weight: normal; vertical-align:top;}
.html4strict .ln {width:1px;text-align:right;margin:0;padding:0 2px;vertical-align:top;}
.html4strict .kw2 {color: #000000; font-weight: bold;}
.html4strict .kw3 {color: #000066;}
.html4strict .es0 {color: #000099; font-weight: bold;}
.html4strict .br0 {color: #66cc66;}
.html4strict .sy0 {color: #66cc66;}
.html4strict .st0 {color: #ff0000;}
.html4strict .nu0 {color: #cc66cc;}
.html4strict .sc-1 {color: #808080; font-style: italic;}
.html4strict .sc0 {color: #00bbdd;}
.html4strict .sc1 {color: #ddbb00;}
.html4strict .sc2 {color: #009900;}
.html4strict span.xtra { display:block; }

  1. <html>
  2. <head>
  3. <link href="/css/html4strict.css" rel="stylesheet" type="text/css" media="screen" />
  4. <title>Geshi</title>
  5. </head>
  6. <body>
  7. <p>
  8. <?php
  9. define('ROOT_DIR', dirname(__FILE__));
 10.
 11. set_include_path(get_include_path() . PATH_SEPARATOR . ROOT_DIR . DIRECTORY_SEPARATOR . 'library');
 12. set_include_path(get_include_path() . PATH_SEPARATOR . ROOT_DIR . DIRECTORY_SEPARATOR . 'includes');
 13.
 14. require_once 'prettyfile.php';
 15.
 16. echo pretty_file('geshitest.php', 'html4strict');
 17. ?>
 18. </p>
 19. </body>
 20. </html>

The output might differ depending on the version of GeSHi which is installed. Feel free to change some properties like font-family or font-size.

Copy the CSS and paste it in a file named after the language of the code, html4strict.css in the example. Save the file in a css folder under the root of your site.

/
    includes
        geshi
        geshi.php
    library
        prettyfile.php
    css
        html4strict.css
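
Instead of copying and pasting, the style sheet can be written to its file by a few lines of PHP. The following is a sketch under assumptions: save_stylesheet is a hypothetical helper, and a one-line placeholder stands in for the string returned by $geshi->get_stylesheet() so that the code runs without GeSHi. The example writes to the temporary directory of the system; on a real site, the target would be the css folder under the root.

```php
<?php
// Hypothetical helper: writes a style sheet to <dir>/<language>.css.
// $css is expected to be the string returned by $geshi->get_stylesheet().
function save_stylesheet($css, $language, $dir) {
    // Keep only the characters which are safe in a file name.
    $name = preg_replace('/[^A-Za-z0-9_-]/', '', $language);

    if ($name === '' || (!is_dir($dir) && !@mkdir($dir, 0755, true))) {
        return false;
    }

    $file = $dir . DIRECTORY_SEPARATOR . $name . '.css';

    return (false === @file_put_contents($file, $css)) ? false : $file;
}

// Placeholder CSS so the sketch runs without GeSHi installed.
$css = '.html4strict .kw2 {color: #000000; font-weight: bold;}' . PHP_EOL;

$file = save_stylesheet($css, 'html4strict', sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'css');
```

save_stylesheet returns the name of the file which was written, or false if the language name is empty once filtered or if the file cannot be created.
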

After you have generated and saved all the CSS files for all the different languages you are publishing, comment out the line in pretty_file which outputs the stylesheet.

Now, if you reload the document, the CSS file is found and the source code looks fine.

  1. <html>
  2. <head>
  3. <link href="/css/html4strict.css" rel="stylesheet" type="text/css" media="screen" />
  4. <title>Geshi</title>
  5. </head>
  6. <body>
  7. <p>
  8. <?php
  9. define('ROOT_DIR', dirname(__FILE__));
  10.  
  11. set_include_path(get_include_path() . PATH_SEPARATOR . ROOT_DIR . DIRECTORY_SEPARATOR . 'library');
  12. set_include_path(get_include_path() . PATH_SEPARATOR . ROOT_DIR . DIRECTORY_SEPARATOR . 'includes');
  13.  
  14. require_once 'prettyfile.php';
  15.  
  16. echo pretty_file('geshitest.php', 'html4strict');
  17. ?>
  18. </p>
  19. </body>
  20. </html>

Application

The function from iZend which formats comments with some help from GeSHi:

require_once 'geshi.php';

function bbcode($s) {
    static $bbcode = array(
            '#\[br\]#is'                            => '<br />',
//          '#\[(h[1-6])\](.+?)\[/\1\]#is'          => '<\1>\2</\1>',
            '#\[(b|i|u|s)\](.+?)\[/\1\]#is'         => '<\1>\2</\1>',
            '#\[(p|pre)\](.+?)\[/\1\]#is'           => '<\1>\2</\1>',
            '#\[quote\](.+?)\[/quote\]#is'          => '<blockquote>\1</blockquote>',
            '#\[(url)\=(.+?)\](.*?)\[/\1\]#ise'     => "filter_var('\\2', FILTER_VALIDATE_URL) ? '<a href=\"\\2\" target=\"_blank\">\\3</a>' : '\\0'",
            '#\[(url)](.*?)\[/\1\]#ise'             => "filter_var('\\2', FILTER_VALIDATE_URL) ? '<a href=\"\\2\" target=\"_blank\">\\2</a>' : '\\0'",
            '#\[(e?mail)\=(.+?)\](.*?)\[/\1\]#ise'  => "filter_var('\\2', FILTER_VALIDATE_EMAIL) ? '<a href=\"mailto:\\2\">\\3</a>' : '\\0'",
            '#\[(e?mail)\](.*?)\[/\1\]#ise'         => "filter_var('\\2', FILTER_VALIDATE_EMAIL) ? '<a href=\"mailto:\\2\">\\2</a>' : '\\0'",
            '#\[code\=(.+?)\](.+?)\[/code\]#ise'    => "bbcode_highlite('\\2', '\\1')",
            '#\[code\](.+?)\[/code\]#ise'           => "bbcode_highlite('\\1')",
    );

    $s = preg_replace('#\[code([^\]]*?)\](.*?)\[/code\]#ise', "'[code\\1]'.bbcode_protect('\\2').'[/code]'", $s);

    $s = htmlspecialchars($s, ENT_COMPAT, 'UTF-8');

    return preg_replace(array_keys($bbcode), array_values($bbcode), $s);
}

function bbcode_protect($s) {
    return base64_encode(preg_replace('#\\\"#', '"', $s));
}

function bbcode_highlite($s, $language=false) {
    $s = trim(base64_decode($s));

    if (!$language) {
        return '<code>' . htmlspecialchars($s, ENT_COMPAT, 'UTF-8') . '</code>';
    }

    $geshi = new GeSHi($s, $language);
    $geshi->enable_classes(true);
    $geshi->set_header_type(GESHI_HEADER_DIV);
    $geshi->enable_keyword_links(false);
    $geshi->set_tab_width(4);

    $output = $geshi->parse_code();

    if ($geshi->error()) {
        return false;
    }

    head('stylesheet', 'geshi/' . $language, 'screen');

    return '<div class="geshi">' . $output . '</div>';
}
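
The patterns ending with the e modifier make preg_replace evaluate the replacement as PHP code. This modifier was deprecated in PHP 5.5 and removed in PHP 7.0, where preg_replace_callback does the same job. The following sketch shows the idea on a reduced set of tags. It is not the code of iZend: bbcode_sketch and highlight_stub are hypothetical names, and highlight_stub only escapes the code where bbcode_highlite would call GeSHi, so that the example runs on its own.

```php
<?php
// Stand-in for bbcode_highlite: the code is only escaped, GeSHi is not called.
function highlight_stub($code, $language=false) {
    $class = $language ? ' class="' . htmlspecialchars($language, ENT_COMPAT, 'UTF-8') . '"' : '';

    return '<code' . $class . '>' . htmlspecialchars($code, ENT_COMPAT, 'UTF-8') . '</code>';
}

// Hypothetical reduced bbcode: only [b], [i], [u], [s] and [code] are handled.
function bbcode_sketch($s) {
    // Protect the content of [code] blocks from escaping and tag replacement.
    $s = preg_replace_callback('#\[code([^\]]*?)\](.*?)\[/code\]#is', function ($m) {
        return '[code' . $m[1] . ']' . base64_encode($m[2]) . '[/code]';
    }, $s);

    $s = htmlspecialchars($s, ENT_COMPAT, 'UTF-8');

    // Plain substitutions don't need a callback.
    $s = preg_replace('#\[(b|i|u|s)\](.+?)\[/\1\]#is', '<\1>\2</\1>', $s);

    // One callback replaces the two [code] patterns which used the e modifier.
    return preg_replace_callback('#\[code(?:=(.+?))?\](.+?)\[/code\]#is', function ($m) {
        return highlight_stub(trim(base64_decode($m[2])), $m[1] !== '' ? $m[1] : false);
    }, $s);
}

echo bbcode_sketch('[b]Example[/b] [code=php]echo "<b>";[/code]'), PHP_EOL;
```

With a callback, the matched text arrives as it was written, so the quote unescaping done by bbcode_protect is not needed in this version.
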
