Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions
Regular Expressions Introduced - Page 2

July 16, 2001

We will be working with two regular expression operators. The first is the match operator and the second is the search and replace operator. A match expression is written between a pair of forward slashes. The match operator is expressed with an m placed in front of the pair of forward slashes:

m/expression/

Most programmers do not put the m operator in front of a regular expression because Perl automatically recognizes a regular expression when it sees the pair of forward slashes. So for example, if we wanted to search for all occurrences of my first name in a file, it might look like the code below.

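my $counter = 0;
while (<>) {
    # Count every occurrence on the current line, ignoring case.
    $counter++ while m/Jonathan/gi;
}
# Report the total once all lines have been read.
print "Found Jonathan $counter times\n";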

To have the script read through a file, you would send the file to the script's standard input on the command line with the redirection operator, the < character.

match.pl < index.html

The script will read through each line of the index.html file searching for my name. When my name is found, it will increment the $counter variable. After the script has processed all the lines in the file, it will print the number of occurrences of the text string Jonathan.
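The counting logic can also be tried without a file. The following is a minimal sketch, assuming a hypothetical helper named count_jonathan and a few made-up lines standing in for index.html. It uses a while loop so that every occurrence on a line is counted, not just the first.

```perl
use strict;
use warnings;

# Hypothetical helper: count case-insensitive occurrences of the name
# across a list of lines, one line at a time.
sub count_jonathan {
    my @lines   = @_;
    my $counter = 0;
    for my $line (@lines) {
        # With the g modifier, each pass of the loop finds the next match.
        $counter++ while $line =~ /Jonathan/gi;
    }
    return $counter;
}

# Made-up stand-in for the contents of index.html.
my @lines = (
    "<title>Jonathan's page</title>\n",
    "<p>Jonathan wrote this. Email jonathan for details.</p>\n",
    "<p>No name here.</p>\n",
);

my $total = count_jonathan(@lines);
print "Found Jonathan $total times\n";
```

With these sample lines the name appears once on the first line and twice on the second, so the script reports three occurrences.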

There are two characters after the trailing forward slash in the regular expression above (g and i). Those are called modifiers. They have a special meaning and change the way the regular expression works. In this case, the g modifier tells Perl to keep searching for my name on a line even if it has already found an occurrence of my name. Also called the global modifier, it keeps searching for as many matches as it can find in the string. Otherwise, it would simply stop looking after it found the first occurrence. That would give us an inaccurate count. The second modifier, i, tells Perl to ignore the case of the string. So if there was an occurrence of my name without a capital J, it would still find a match. Or if someone made my whole name uppercase, it would still match because the i modifier had been turned on.
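To see what each modifier contributes, here is a small sketch that counts matches in a single made-up string. The sample text and variable names are assumptions for illustration. The "= () =" idiom simply counts how many matches the pattern returns.

```perl
use strict;
use warnings;

# Hypothetical sample text; not taken from any real file.
my $line = "Jonathan met JONATHAN and jonathan.";

# Evaluate each pattern in list context and count the matches returned.
my $g_only  = () = $line =~ /Jonathan/g;    # global, case-sensitive
my $g_and_i = () = $line =~ /Jonathan/gi;   # global, case-insensitive
my $i_only  = () = $line =~ /Jonathan/i;    # stops after the first match

print "g only:  $g_only\n";
print "g and i: $g_and_i\n";
print "i only:  $i_only\n";
```

Only the combination of g and i finds all three spellings. With g alone the two differently cased copies are missed, and with i alone the search stops at the first match.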

Perl regular expressions also allow us to use special character classes that represent word characters, digits, and whitespace. These character classes are represented with a backslash and a character: \w for word characters, \d for digits, and \s for whitespace. Word characters include the letters a through z in either case, the digits 0 through 9, and the underscore. These special character classes can be used as a shorthand for building a regular expression. For example, sometimes people spell my name Jon instead of Jonathan. We don't want to miss a nickname when counting the occurrences of my name, so we should have a regular expression that catches both. Let's change the line containing the regular expression in the example above:

$counter++ while /Jon(\w\w\w\w\w)?/gi;

There, now it will match Jon or Jon plus five characters. There's something else new here. The character classes are surrounded by parentheses and there's a question mark after them. The question mark is called a quantifier because it specifies how many times the expression before it may occur. The question mark means zero or one occurrence. That is, the match will either include exactly five word characters after Jon, or none of them. There are two other common quantifiers. They are + and *. The + quantifier will match one or more instances of the expression and the * quantifier will match zero or more instances of the expression. A quantifier modifies the expression to its immediate left. In this case, that would be the expression inside the parentheses.
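The three quantifiers can be compared side by side. This is a sketch under stated assumptions: the helper matched_text and the sample names are made up for illustration. It prints how much text each pattern actually consumed.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text a pattern matched, or undef.
sub matched_text {
    my ($text, $pattern) = @_;
    # The extra parentheses capture the whole match as the first group.
    my ($whole) = $text =~ /($pattern)/;
    return $whole;
}

# Made-up inputs and one pattern per quantifier.
my @names    = ("Jon", "Jonny", "Jonathan");
my %patterns = (
    'five-or-none (?)' => qr/Jon(\w\w\w\w\w)?/,
    'one-or-more (+)'  => qr/Jon\w+/,
    'zero-or-more (*)' => qr/Jon\w*/,
);

for my $label (sort keys %patterns) {
    for my $name (@names) {
        my $hit = matched_text($name, $patterns{$label});
        printf "%-18s %-9s %s\n", $label, $name,
            defined $hit ? "matched '$hit'" : "no match";
    }
}
```

Two results are worth noting. Jonny has only two characters after Jon, so the ? version skips the five-character group and matches just Jon. The + version does not match a bare Jon at all, because it requires at least one word character after it.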

Another way to perform the match using the + quantifier would be:

$counter++ while /Jon(\w+)?/gi;

That would match Jon plus one or more characters or just Jon. We could have also written it a different way using the * quantifier:

$counter++ while /Jon\w*/gi;

Since the * quantifier was placed after the \w character class, it will match zero or more characters after Jon. Therefore, we don't need the ? quantifier: the pattern already matches Jon alone or Jon plus any number of word characters. Unfortunately, both methods are less than optimal because they might match a name other than Jonathan. There is a way to specify the exact number of characters.

$counter++ while /Jon(\w{5})?/gi;

That will match Jon or Jon plus exactly five characters. Of course, that might not work either. There might be a Jonothon, which is a different person. So to be exact, we actually need to specify the characters that may or may not occur:

$counter++ while /Jon(athan)?/gi;

There we go. Now the only two things that we can match are Jon and Jonathan. Another problem we may need to solve is alternate spellings of my name. Sometimes people will end my name with 'on' instead of 'an'. To make sure we match both the correct and the incorrect ending, we need to add a custom character class. Above we used special character classes that are built into Perl, but a character class can also be a list of characters that you specify. Custom character classes in a regular expression are surrounded by square brackets:

$counter++ while /Jon(ath[ao]n)?/gi;

So in the regular expression above, a match will occur if it finds Jon, Jonathan, or Jonathon. As you can see, regular expressions are very flexible and powerful. There are additional features in regular expressions that allow you to match any expression you can dream up. Later, I'll show you an example of a very complex regular expression that someone developed for parsing XML.
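As a recap, the last three patterns can be run against the same set of spellings. This is only a sketch: the list of names and the %result hash are assumptions made for illustration, and each pattern is wrapped in an extra pair of parentheses so the whole match can be printed.

```perl
use strict;
use warnings;

# Made-up spellings for illustration.
my @names = ("Jon", "Jonathan", "Jonathon", "Jonothon");

# The three most specific patterns from the text, each wrapped in an
# outer group so the complete match is captured first.
my @patterns = (
    [ 'Jon(\w{5})?'     => qr/(Jon(\w{5})?)/i ],
    [ 'Jon(athan)?'     => qr/(Jon(athan)?)/i ],
    [ 'Jon(ath[ao]n)?'  => qr/(Jon(ath[ao]n)?)/i ],
);

my %result;
for my $entry (@patterns) {
    my ($label, $re) = @$entry;
    for my $name (@names) {
        # Take the first capture group, which holds the whole match.
        my ($whole) = $name =~ $re;
        $result{$label}{$name} = $whole;
        printf "%-16s %-9s '%s'\n", $label, $name, $whole;
    }
}
```

The \w{5} version swallows Jonothon whole, which is the problem described above. The two later patterns still find a match in Jonothon, but only the leading Jon, because the optional group is skipped when the remaining letters do not fit.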






