Grep is a common Unix command for searching text. It searches one or more input files for lines that match a specified pattern and, by default, prints the matching lines.
PHP can call external programs, including the Unix commands installed on your Linux server. With grep, we can easily build a simple search-engine. To make it slightly more interesting, the form that accepts the search string and the code that displays the results will both live in the same file. (See working example: http://programmabilities.com/php/grep.php)
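As a small illustration of calling an external program from PHP, here is a minimal sketch that runs grep through exec(). The helper name grep_files() and the temporary demo files are assumptions made for this example and are not part of the original script. It also assumes grep is installed and that the system temp directory is writable.

```php
<?php
// Hypothetical helper, not part of the original script: run a
// case-insensitive grep over the given files and return its output lines.
function grep_files(string $pattern, array $files): array
{
    if ($pattern === '' || $files === []) {
        return [];
    }
    // escapeshellarg() quotes each value, so characters such as ; | ` $
    // reach grep as literal text instead of being interpreted by the shell.
    // "-e" marks the pattern and "--" ends the options, so input that starts
    // with "-" is not mistaken for a grep flag.
    $cmd = 'grep -i -e ' . escapeshellarg($pattern) . ' -- '
         . implode(' ', array_map('escapeshellarg', $files));

    $lines = [];
    exec($cmd, $lines);   // one array entry per output line, newline removed
    return $lines;
}

// Usage: two throwaway files in the system temp directory (assumed writable).
$dir = sys_get_temp_dir() . '/grep_demo_' . getmypid();
if (!is_dir($dir)) {
    mkdir($dir);
}
file_put_contents("$dir/a.txt", "Hello World\nsecond line\n");
file_put_contents("$dir/b.txt", "nothing to see\n");

foreach (grep_files('hello', glob("$dir/*.txt")) as $line) {
    echo $line, "\n";   // "<path to a.txt>:Hello World"
}
```

Quoting the search string matters because it comes straight from a web form. Without it, a visitor could append shell commands to the grep call.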
Here is the PHP script using grep that includes the PHP code and the HTML search-engine form all in one page (save it in a file with a .php extension):
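    <html>
    <head><title>Site Grep Search-engine</title></head>
    <body>
    <?php
    // Read the search string from the submitted form ($searchstr is only set
    // automatically on old servers with register_globals enabled).
    $searchstr = isset( $_POST['searchstr'] ) ? trim( $_POST['searchstr'] ) : '';
    ?>
    <form action="<?php echo htmlspecialchars( $_SERVER['PHP_SELF'] ); ?>" method="post">
    <input type="text" name="searchstr" value="<?php echo htmlspecialchars( $searchstr ); ?>" size="20" maxlength="30"/>
    <input type="submit" value="Search!"/>
    </form>
    <?php
    if ( $searchstr !== '' ) {
        // we have a search string: call grep and display the results
        echo "<hr/>\n";
        // call grep with case-insensitive search mode on all files;
        // escapeshellarg() keeps the user input from being run as shell code
        $cmdstr = "grep -i -e " . escapeshellarg( $searchstr ) . " -- *";
        $fp = popen( $cmdstr, "r" ); // open the output of command as a pipe
        $myresult = array(); // to hold my search results
        // strip_tags( fgets() ) stands in for fgetss(), removed in PHP 8
        while ( $fp && ( $buffer = fgets( $fp, 4096 ) ) !== false ) {
            // grep returns in the format
            //   filename: line
            // so we split the data in two; explode() replaces split(), removed in PHP 7
            $parts = explode( ":", strip_tags( $buffer ), 2 );
            if ( count( $parts ) < 2 ) continue; // not a "filename:line" row
            list( $fname, $fline ) = $parts;
            // we take only the first hit per file
            if ( !isset( $myresult[$fname] ) ) $myresult[$fname] = $fline;
        }
        // we have results in a hash. let's walk through it & print it
        if ( count( $myresult ) ) {
            echo "<ol>\n";
            // foreach replaces the each() loop; each() was removed in PHP 8
            foreach ( $myresult as $fname => $fline ) {
                $fname = htmlspecialchars( $fname );
                echo "<li><a href=\"$fname\">$fname</a> : " . htmlspecialchars( $fline ) . " </li>\n";
            }
            echo "</ol>\n";
        } else { // no hits
            echo "Sorry. Search on <strong>" . htmlspecialchars( $searchstr ) . "</strong> returned no results.<br/>\n";
        }
        if ( $fp ) pclose( $fp );
    }
    ?>
    </body>
    </html>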
...And that's it! By using Unix's built-in grep command, you don't have to write reams of PHP code from scratch for the search part of your PHP search-engine program.
Please note that this is not an optimal way to implement a search-engine, although it is a useful way to learn about PHP. The problem is the overhead and server load: every document is grepped again each time a user starts a search. Ideally, one should build a database of keywords and search against that. This is exactly why cleverer flat-file search engines index all pages once and then search only the single file generated from them. Admittedly, this means you have to regenerate that file every time the site is updated, but in the long run it puts far less strain on the server.
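The flat-file index idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names build_index() and search_index(), the tab-separated one-line-per-page layout, and the restriction to *.html files are all choices made for this example, not something the original script does.

```php
<?php
// Hypothetical helpers illustrating the "one generated index file" idea.

// Write one line per page: "<file name>\t<plain text of the page>".
// Returns the number of pages indexed.
function build_index(string $dir, string $indexFile): int
{
    $out = fopen($indexFile, 'w');
    $count = 0;
    foreach (glob($dir . '/*.html') ?: [] as $path) {
        $text = strip_tags(file_get_contents($path));
        $text = trim(preg_replace('/\s+/', ' ', $text)); // flatten to one line
        fwrite($out, basename($path) . "\t" . $text . "\n");
        $count++;
    }
    fclose($out);
    return $count;
}

// Return the names of indexed pages whose text contains $needle
// (case-insensitive). Only the index file is read, not the pages.
function search_index(string $indexFile, string $needle): array
{
    $hits = [];
    if ($needle === '') {
        return $hits;
    }
    $in = fopen($indexFile, 'r');
    while (($line = fgets($in)) !== false) {
        list($name, $text) = explode("\t", rtrim($line, "\n"), 2);
        if (stripos($text, $needle) !== false) {
            $hits[] = $name;
        }
    }
    fclose($in);
    return $hits;
}
```

With this layout, build_index() would be run whenever the site changes (for example from a cron job), while each visitor search only calls search_index() on the one generated file.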
A working example of the PHP search-engine using grep can be found here: http://programmabilities.com/php/grep.php
Notes:
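- PHP_SELF is a variable maintained by PHP. It contains the path of the current script relative to the document root. The bare $PHP_SELF form only exists when register_globals is on (removed in PHP 5.4); $_SERVER['PHP_SELF'] works everywhere.
- fgets() reads one line, up to the given length minus one byte (4095 bytes here).
- fgetss() is just like fgets(), but it also strips HTML and PHP tags from the line. It was removed in PHP 8.0; strip_tags( fgets() ) does the same job.
- split() is called with a limit of 2 because we need only two parts: the file name and the rest of the line. Any further ':' stay in the second part. split() was removed in PHP 7.0; explode() accepts the same limit.
- each() returns the current key/value pair of an array and advances to the next one, which makes it easy to walk through an array. It was removed in PHP 8.0; a foreach loop is the replacement.
- popen() / pclose() work like fopen() / fclose(), but operate on a pipe to a running command instead of a file.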
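To make the note about splitting with a limit of 2 concrete, here is a minimal sketch that reduces grep-style "filename:line" output to the first hit per file. The helper name first_hits() and the sample lines are made up for this illustration.

```php
<?php
// Hypothetical helper: keep only the first "filename:line" hit per file.
function first_hits(array $lines): array
{
    $hits = [];
    foreach ($lines as $line) {
        // Limit 2: split at the first ':' only, so any later ':' stays
        // inside the matched text.
        $parts = explode(':', $line, 2);
        if (count($parts) < 2) {
            continue;               // no separator, not a "filename:line" row
        }
        list($name, $text) = $parts;
        if (!isset($hits[$name])) { // isset() tests array keys; defined() is for constants
            $hits[$name] = trim($text);
        }
    }
    return $hits;
}

// Sample input in the shape grep prints when given several files.
$hits = first_hits([
    'about.html:Contact us at 10:30',
    'about.html:Second match in the same file',
    'index.html:Welcome',
    'a row without a separator',
]);

foreach ($hits as $name => $text) {
    echo $name, ' => ', $text, "\n";
}
```

For about.html the stored text is "Contact us at 10:30": the colon inside the time survives because the split stops after the first separator.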
Chief Programmabilities
http://programmabilities.com