Simplest PHP Site Search-engine Using Unix Grep 

by Chief Programmabilities, 8/23/04

Synopsis:

You can build the simplest possible search-engine for your site in PHP, with no database, by calling Unix's grep command instead of writing a lot of PHP code from scratch.

The Article

Grep is a standard Unix command for searching text. It scans one or more input files for lines that match a specified pattern and, by default, prints each matching line. When it searches more than one file, it prefixes each line with the name of the file the line came from.

PHP can call external programs, including the Unix commands installed on your Linux server, so grep can serve as the core of a simple search-engine. We will add a little complexity by putting the form that accepts the search string and the code that displays the results in the same file. (See working example: http://programmabilities.com/php/grep.php)
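
As a rough illustration of calling an external command from PHP, here is a minimal sketch that runs grep through exec(). The helper name grep_lines() and the sample file are hypothetical and exist only for this example. It assumes a grep binary is on the server's PATH and that exec() has not been disabled in the PHP configuration.

```php
<?php
// Hypothetical helper (not part of the article's script):
// run grep on one file and return the matching lines.
function grep_lines(string $pattern, string $file): array
{
    // escapeshellarg() quotes each argument so the shell treats it as literal text.
    $cmd = 'grep -i -e ' . escapeshellarg($pattern) . ' ' . escapeshellarg($file);
    $lines = [];
    $status = 0;
    // exec() appends each output line to $lines and stores grep's exit code in $status.
    exec($cmd, $lines, $status);
    // grep exits with 0 when it found a match, 1 when it found none, >1 on error.
    return $status === 0 ? $lines : [];
}

// Usage: write a small sample file, search it, then clean up.
$sample = tempnam(sys_get_temp_dir(), 'grep');
file_put_contents($sample, "Hello World\nnothing here\nhello again\n");
print_r(grep_lines('hello', $sample));
unlink($sample);
```

With a single input file grep prints only the matching lines, so the sketch returns them without a filename prefix.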

Here is the PHP script using grep that includes the PHP code and the HTML search-engine form all in one page (save it in a file with a .php extension):

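<?php
// register_globals no longer exists, so read the search string from $_POST
$searchstr = isset( $_POST['searchstr'] ) ? trim( $_POST['searchstr'] ) : '';
// escaped copy that is safe to print into the HTML page
$safestr = htmlspecialchars( $searchstr, ENT_QUOTES );
?>
<html>
 <head><title>Site Grep Search-engine</title></head>
 <body>
  <p>
   <form action="<?= htmlspecialchars( $_SERVER['PHP_SELF'], ENT_QUOTES ) ?>" method="post">
    <input type="text" name="searchstr" value="<?= $safestr ?>" size="20" maxlength="30"/>
    <input type="submit" value="Search!"/>
   </form>
  </p>

<?php
   if ( $searchstr !== '' ) {
        // we have a search string, so call grep and display the results
        echo "<hr/>\n";
        // call grep with case-insensitive search mode on all files;
        // escapeshellarg() stops the search string from injecting shell commands
        $cmdstr = "grep -i -e " . escapeshellarg( $searchstr ) . " *";
        $fp = popen( $cmdstr, "r" ); // open the output of command as a pipe
        $myresult = array(); // to hold my search results
        while ( $fp && ( $buffer = fgets( $fp, 4096 ) ) !== false ) {
            // grep returns in the format
            // filename: line
            // skip anything else (e.g. binary-file notices)
            if ( strpos( $buffer, ":" ) === false )
                continue;
            // explode() replaces the removed split(); limit 2 splits at the first ':' only
            list( $fname, $fline ) = explode( ":", $buffer, 2 );
            // we take only the first hit per file
            // isset() replaces defined(), which tests constants, not array keys
            if ( !isset( $myresult[$fname] ) )
                // strip_tags() does the tag stripping that the removed fgetss() used to do
                $myresult[$fname] = htmlspecialchars( strip_tags( $fline ) );
        }
        if ( $fp )
            pclose( $fp );
        // we have results in a hash. let's walk through it & print it
        if ( count( $myresult ) ) {
             echo "<ol>\n";
             // foreach replaces the removed each()
             foreach ( $myresult as $fname => $fline ) {
                  $fname = htmlspecialchars( $fname, ENT_QUOTES );
                  echo "<li><a href=\"$fname\">$fname</a> : $fline </li>\n";
             }
             echo "</ol>\n";
        } else {
             // no hits
             echo "Sorry. Search on <strong>$safestr</strong> returned no results.<br/>\n";
        }
   }
?>
 </body>
</html>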

 ...And that's it! By using Unix's built-in grep command, you don't have to write reams of PHP code from scratch for the search part of your PHP search-engine program.

Please note that this is not an optimal way to implement a search-engine, although it is a useful way to learn about PHP. The script greps every document each time a user initiates a search, and that overhead puts load on the server. Ideally, one should build a database of keywords and search against that. This is why cleverer flat-file search engines index all pages once and then search a single file generated from them. Admittedly, this means you have to regenerate that file every time the site is updated, but in the long run it puts far less strain on the server.
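
One possible reading of the single-index-file idea is sketched below. The function names build_index() and search_index(), the tab-separated layout (one line per page: filename, a tab, then the page text), and the assumption that filenames contain no tab characters are all choices made for this illustration, not something the article specifies. The search here is done in PHP with stripos() rather than with grep.

```php
<?php
// Hypothetical helpers: build a one-line-per-page index, then search only that file.

function build_index(array $files, string $indexFile): void
{
    $out = fopen($indexFile, 'w');
    foreach ($files as $file) {
        // Strip markup and collapse whitespace so each page becomes a single line.
        $text = strip_tags(file_get_contents($file));
        $text = trim(preg_replace('/\s+/', ' ', $text));
        fwrite($out, $file . "\t" . $text . "\n");
    }
    fclose($out);
}

function search_index(string $indexFile, string $needle): array
{
    $hits = [];
    foreach (file($indexFile, FILE_IGNORE_NEW_LINES) as $line) {
        // Each index line is "filename<TAB>page text".
        list($file, $text) = explode("\t", $line, 2);
        // stripos() is a case-insensitive substring search.
        if (stripos($text, $needle) !== false) {
            $hits[] = $file;
        }
    }
    return $hits;
}
```

A site could call build_index(glob('*.html'), 'search.idx') whenever its pages change and call search_index('search.idx', $searchstr) on each request, so a search reads one file instead of every page.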

A working example of the PHP search-engine using grep can be found here: http://programmabilities.com/php/grep.php

Notes:

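  • PHP_SELF is a variable maintained by PHP that contains the path of the current script. Since register_globals was removed in PHP 5.4, read it as $_SERVER['PHP_SELF'] and escape it before printing it into HTML.
  • fgets() reads one line, up to the specified length (4096 here; it reads at most length - 1 bytes).
  • fgetss() is just like fgets(), but it also strips HTML and PHP tags from the line it reads. It was deprecated in PHP 7.3 and removed in PHP 8.0; strip_tags() applied to the result of fgets() does the same job.
  • split() is called with a limit of 2 because we need only two pieces: the filename and the rest of the line. Any further ':' characters stay in the second piece. split() was removed in PHP 7.0; explode() accepts the same limit.
  • each() is an array function that returns the current key/value pair and advances the array pointer, which makes it easy to walk through an array. It was removed in PHP 8.0; foreach is the usual replacement.
  • popen() / pclose() work like fopen() / fclose(), but operate on a pipe to a process instead of a file.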
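
To make the limit-of-2 split and the first-hit-per-file rule concrete, here is a small sketch that parses grep-style output lines. The helper name first_hits() and the sample lines are made up for this example. It uses explode(), which accepts the same limit argument as split().

```php
<?php
// Hypothetical helper: keep the first matching line per file from grep-style output.
function first_hits(array $lines): array
{
    $hits = [];
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // not a "filename:line" record
        }
        // Limit 2: split at the first colon only; later colons stay in $text.
        list($file, $text) = explode(':', $line, 2);
        // isset() tells us whether this file already has a recorded hit.
        if (!isset($hits[$file])) {
            $hits[$file] = trim($text);
        }
    }
    return $hits;
}

// Usage with made-up grep output.
print_r(first_hits([
    'index.html:<p>Time: 10:30</p>',
    'index.html:second match',
    'about.html:grep is handy',
]));
```

The first sample line contains several colons, but only the first one separates the filename from the text, and the second index.html line is dropped because that file already has a hit.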



Chief Programmabilities
http://programmabilities.com