Useful Perl Scripts With Regular Expressions 

by Matthew Drouin 11/07/04

Synopsis:

Many people talk about Perl, and many more about regular expressions, but unless you are a programmer you probably never use either. We will discuss a few unique and very useful ways to use both of them.
The Article

Replace On A Single File

Here we hard-code the script to edit a single file, replacing certain text. The script could prompt the user for the file name, but that seemed like overkill, since we would probably have to edit the script to make minor changes anyway. If you do want a prompt, the code below shows the idea; it asks for a directory name, and reading a file name works the same way.

my $dir;
print "Please enter dir name: ";
chomp ( $dir = <STDIN> );
print $dir;

The code above prints "Please enter dir name: " and reads the user's reply from STDIN. We then use chomp, which we will discuss later, to remove the trailing newline that arrives when the user presses Enter.
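As a quick illustration of what chomp does to that input, here is a minimal sketch. The string in $raw is a made-up stand-in for what <STDIN> would return after the user types "mydir" and presses Enter; it is an assumption for the example, not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical input: what <STDIN> hands back for "mydir" plus Enter.
my $raw = "mydir\n";

# chomp removes the trailing newline from the copy in $dir.
chomp( my $dir = $raw );

printf "raw length: %d, chomped length: %d\n", length($raw), length($dir);
```

Without the chomp, the newline would stay on the end of the name and any path built from it would most likely fail to open.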

The code below parses a file that contains a <body ... > tag and removes all of the tag's attributes. We are starting to use cascading style sheets (CSS), so we no longer want attributes in the body tag; they will all be defined in our .css file. Our file starts out with <body bgcolor="green"> and ends up with <body>.

It would not make much sense to use this script on just one file: opening the HTML document and making the change by hand is faster than editing the script and running it. We will be building upon this script, though, so it is not a waste of time.


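#!/usr/bin/perl
use strict;
use warnings;

my $filename = "/home/directory/file.txt";
my @outLines;

# First pass: read the file, rewriting the body tag as we go.
open( my $in, '<', $filename ) or die "Cannot open file: $!";

while ( my $line = <$in> ) {
    # i is case insensitive
    # ([^>]*) match zero or more characters but not '>'
    $line =~ s/<body([^>]*)>/<body>/i;
    push( @outLines, $line );
}

close($in);

# Second pass: reopen the same file for writing and save the result.
open( my $out, '>', $filename ) or die "Cannot open file for writing: $!";
print {$out} @outLines;
close($out) or die "Cannot close file: $!";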

You will notice above that we open the file once to read it and then open it again to write to it. I did this because you may want to write the output to a different file than the original; with two separate open statements, you only need to change the file name in the second one. Note that opening a file with > truncates it, which is why the script reads every line into @outLines before it writes anything back.
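If you do want the output in a different file, a minimal sketch might look like the following. The scratch directory, the names file.txt and file.new.txt, and the sample HTML are all assumptions made up for this example so that it is safe to run; substitute your own paths.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical paths in a scratch directory that is removed on exit.
my $dir     = tempdir( CLEANUP => 1 );
my $infile  = "$dir/file.txt";
my $outfile = "$dir/file.new.txt";

# Create a small sample input file to work on.
open( my $seed, '>', $infile ) or die "Cannot create $infile: $!";
print {$seed} "<html>\n<BODY bgcolor=\"green\">\n<p>Hi</p>\n</body>\n</html>\n";
close($seed) or die "Cannot close $infile: $!";

# With separate input and output files we can write each line as we
# read it, so nothing has to be collected in an array first.
open( my $in,  '<', $infile )  or die "Cannot open $infile: $!";
open( my $out, '>', $outfile ) or die "Cannot open $outfile: $!";

while ( my $line = <$in> ) {
    $line =~ s/<body([^>]*)>/<body>/i;
    print {$out} $line;
}

close($in);
close($out) or die "Cannot close $outfile: $!";

# Read the new file back in one go to show the result.
open( my $check, '<', $outfile ) or die "Cannot open $outfile: $!";
my $result = do { local $/; <$check> };
close($check);

print $result;
```

The original file is left untouched, which also gives you a backup if the substitution does something you did not expect.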

In the above code you will also notice the regular expression ([^>]*), which stops the regular expression from doing a maximum munch; i.e. it tells the match to stop at the first greater-than sign after the body tag. If we had written .* instead, and you can feel free to give this a try, the match would run to the last greater-than sign on the line, since the script reads one line at a time and . does not match a newline. In an HTML document stored as one long line, that would take everything from the body tag to the closing html tag and replace it with just a simple <body> tag.
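To see the difference for yourself, here is a small sketch that runs both patterns against the same string. The sample line in $html is made up for this example; the lazy form .*? is shown only as a common alternative and is not used in the script above.

```perl
use strict;
use warnings;

# Hypothetical one-line sample, not taken from a real file.
my $html = '<body bgcolor="green"><p>Hello</p></body>';

# Negated class: stops at the first '>' after "<body".
( my $negated = $html ) =~ s/<body([^>]*)>/<body>/i;

# Greedy dot: runs on to the last '>' on the line.
( my $greedy = $html ) =~ s/<body.*>/<body>/i;

# Lazy dot: also stops at the first '>'.
( my $lazy = $html ) =~ s/<body.*?>/<body>/i;

print "negated: $negated\n";
print "greedy:  $greedy\n";
print "lazy:    $lazy\n";
```

The negated and lazy versions leave the paragraph and the closing tag in place, while the greedy version reduces the whole line to <body>.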
