Simplest PHP Site Search-engine Using Unix Grep 

by Chief Programmabilities 8/23/04

Synopsis:

You can build the simplest possible search-engine (with no database) for your site with PHP by simply using Unix's grep command instead of writing a lot of PHP code from scratch.
The Article

Grep is a common Unix command for searching text. It searches one or more input files for lines containing a match to a specified pattern. By default, grep prints the matching lines.

PHP can call external programs, including the Unix commands installed on your Linux server. With grep, we can easily make a simple search-engine. We will add a little complexity by putting the form that accepts the search string and the code that displays the results in the same file. (See working example: http://programmabilities.com/php/grep.php)
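As a small illustration of calling an external program from PHP, here is a minimal sketch that asks grep for the names of matching files. The helper name files_matching() is hypothetical and not part of the article's script. It assumes a grep that supports the -r option (GNU and BSD grep both do) and a directory name that does not begin with a dash.

```php
<?php
// Hypothetical helper (not from the article): returns the names of the files
// under $dir that contain $needle, by asking grep for them.
function files_matching(string $needle, string $dir): array
{
    // -r: search the directory recursively, -i: ignore case,
    // -l: print file names only, -e: the next argument is the pattern.
    // escapeshellarg() quotes each value so the shell treats it as data.
    $cmd = 'grep -r -i -l -e ' . escapeshellarg($needle)
         . ' ' . escapeshellarg($dir);

    $lines  = [];
    $status = 0;
    exec($cmd, $lines, $status); // one output line per array element

    // grep exits with 0 on a match, 1 on no match, and 2 or more on error.
    if ($status > 1) {
        throw new RuntimeException("grep failed with exit status $status");
    }

    sort($lines);
    return $lines;
}
```

Calling files_matching('contact', '/var/www/site') would return an array of paths, or an empty array when nothing matches. The path here is only an example.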

Here is the PHP script using grep that includes the PHP code and the HTML search-engine form all in one page (save it in a file with a .php extension):

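<?php
   // Read the search string from the submitted form; it is empty on the
   // first visit. (The old $searchstr and $PHP_SELF globals relied on
   // register_globals, which no longer exists.)
   $searchstr = isset( $_POST['searchstr'] ) ? trim( $_POST['searchstr'] ) : '';
   $self = htmlspecialchars( $_SERVER['PHP_SELF'], ENT_QUOTES );
?>
<html>
 <head><title>Site Grep Search-engine</title></head>
 <body>
  <form action="<?php echo $self; ?>" method="post">
   <p>
    <input type="text" name="searchstr" value="<?php echo htmlspecialchars( $searchstr, ENT_QUOTES ); ?>" size="20" maxlength="30"/>
    <input type="submit" value="Search!"/>
   </p>
  </form>

<?php
   if ( $searchstr !== '' ) {
        // we have a search string, so call grep and display the results.
        echo "<hr/>\n";
        // call grep with case-insensitive search mode on all files.
        // escapeshellarg() keeps the user's input from running as shell code,
        // -e marks it as the pattern and -H always prints the filename.
        $cmdstr = "grep -i -H -e " . escapeshellarg( $searchstr ) . " *";
        $fp = popen( $cmdstr, "r" ); // open the output of command as a pipe
        $myresult = array(); // to hold my search results
        while ( $fp && ( $buffer = fgets( $fp, 4096 ) ) !== false ) {
            // grep returns in the format
            // filename:line
            // So, we use explode() to split the data at the first ':'
            // (it replaces split(), which was removed in PHP 7)
            $parts = explode( ":", $buffer, 2 );
            if ( count( $parts ) < 2 )
                continue; // e.g. "Binary file x matches"
            list( $fname, $fline ) = $parts;
            // we take only the first hit per file; strip_tags() does the
            // tag removal that fgetss() did before PHP 8 removed it
            if ( !isset( $myresult[$fname] ) )
                $myresult[$fname] = strip_tags( $fline );
        }
        if ( $fp )
            pclose( $fp );
        // we have results in a hash. let's walk through it & print it
        // (foreach replaces each(), which was removed in PHP 8)
        if ( count( $myresult ) ) {
             echo "<ol>\n";
             foreach ( $myresult as $fname => $fline ) {
                  $fname = htmlspecialchars( $fname, ENT_QUOTES );
                  $fline = htmlspecialchars( $fline, ENT_QUOTES );
                  echo "<li><a href=\"$fname\">$fname</a> : $fline </li>\n";
             }
             echo "</ol>\n";
        } else {
             // no hits
             $shown = htmlspecialchars( $searchstr, ENT_QUOTES );
             echo "Sorry. Search on <strong>$shown</strong> returned no results.<br/>\n";
        }
   }
?>
 </body>
</html>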

 ...And that's it! By using Unix's built-in grep command, you don't have to write reams of PHP code from scratch to handle the search part of your PHP search-engine.

Please note that this is not an optimal way to implement a search-engine, though it is a good way to learn about PHP. It greps every document each time a user initiates a search, which adds overhead and server load. Ideally, one should build a database of keywords and search against that. That is why cleverer flat-file search engines index all pages once and then search only the single file generated from them. Admittedly, you then have to regenerate that file every time the site is updated, but in the long run it puts much less strain on the server.
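The flat-file idea can be sketched roughly as follows. This is only one possible layout, not the method of any particular search engine. The function names build_index() and search_index() and the "filename, tab, text" row format are assumptions made for this illustration, and the sketch assumes that file names contain no tab characters and that tags do not span several lines.

```php
<?php
// Hypothetical sketch: write one "filename<TAB>text" row to $indexFile for
// every non-empty line of every page, with HTML tags stripped.
function build_index(array $files, string $indexFile): void
{
    $out = fopen($indexFile, 'w');
    if ($out === false) {
        throw new RuntimeException("cannot write $indexFile");
    }
    foreach ($files as $file) {
        $lines = file($file, FILE_IGNORE_NEW_LINES);
        if ($lines === false) {
            continue; // unreadable page: skip it
        }
        foreach ($lines as $line) {
            $text = trim(strip_tags($line));
            if ($text !== '') {
                fwrite($out, $file . "\t" . $text . "\n");
            }
        }
    }
    fclose($out);
}

// Scan the single index file and return the first matching line per page,
// keyed by file name. The match is case-insensitive, like grep -i.
function search_index(string $indexFile, string $needle): array
{
    $hits = [];
    if ($needle === '') {
        return $hits;
    }
    $in = fopen($indexFile, 'r');
    if ($in === false) {
        throw new RuntimeException("cannot read $indexFile");
    }
    while (($row = fgets($in)) !== false) {
        $parts = explode("\t", rtrim($row, "\n"), 2);
        if (count($parts) < 2) {
            continue; // malformed row
        }
        list($file, $text) = $parts;
        if (!isset($hits[$file]) && stripos($text, $needle) !== false) {
            $hits[$file] = $text;
        }
    }
    fclose($in);
    return $hits;
}
```

You would run build_index() whenever the site changes, for example from a cron job, and call search_index() on each search, so that only one file is read per query.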

A working example of the PHP search-engine using grep can be found here: http://programmabilities.com/php/grep.php

Notes:

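  • PHP_SELF holds the path of the current script. The old $PHP_SELF global depended on register_globals (removed in PHP 5.4); on current PHP, read $_SERVER['PHP_SELF'] and escape it with htmlspecialchars() before printing it.
  • fgets() reads one line, up to the given length minus one byte (4096 here).
  • fgetss() is just like fgets(), but it also strips HTML and PHP tags from the line. It was removed in PHP 8; fgets() followed by strip_tags() does the same job.
  • split() is called with a limit of 2 because we need only two pieces: the first ':' separates the filename from the line, and any further ':' stay in the line. split() was removed in PHP 7; explode() accepts the same limit.
  • each() returns the next key/value pair of an array, which makes it easy to walk through one. It was removed in PHP 8; a foreach loop does the same.
  • popen() / pclose() work like fopen() / fclose(), but open a pipe to a command instead of a file.
  • Never pass form input to the shell as-is; quote it with escapeshellarg() so that it cannot run as a command.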
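To make the note about splitting by two concrete, here is a minimal sketch of parsing one row of grep output. The function name parse_grep_line() is hypothetical, and it uses explode() and strip_tags(), which exist in all current PHP versions.

```php
<?php
// Hypothetical helper: turn one "filename:line" row of grep output into
// its two parts, or return null when the row has no ':' separator.
function parse_grep_line(string $row): ?array
{
    // A limit of 2 splits at the first ':' only, so any ':' inside the
    // matched line is kept.
    $parts = explode(':', rtrim($row, "\r\n"), 2);
    if (count($parts) < 2) {
        return null;
    }
    return [
        'file' => $parts[0],
        // strip_tags() removes HTML and PHP tags, as fgetss() used to.
        'line' => trim(strip_tags($parts[1])),
    ];
}
```

Given the row "about.html:<p>Open: 9:00</p>", the helper keeps the later colons inside the line text and reports about.html as the file.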



Chief Programmabilities
http://programmabilities.com