pc, mod, and geek articles
"Every day you may make progress. Every step may be fruitful. Yet there will stretch out before you an ever-lengthening, ever-ascending, ever-improving path. You know you will never get to the end of the journey. But this, so far from discouraging, only adds to the joy and glory of the climb." — Sir Winston Churchill
RegEx Basics for .NET
A regular expression is a pattern that describes a set of strings, used to match text against some specified form. For example, with regular expressions it is just as easy to find all forms of the acronym fubar in a text file as it is to find all links in an HTML document that have the property target="...". Regular expression tools are the Swiss Army knives of text searching.

RegEx is case sensitive by default. Therefore, it is important to address both upper and lower case matching if the capitalization of the input string is unknown, either with character sets such as [aA] or, in .NET, with RegexOptions.IgnoreCase. Here's my own super-simplistic guide for using regular expressions...
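
As a rough illustration of the case-sensitivity point, the sketch below counts matches of the acronym three ways. The input string is made up for the example; the calls are the standard System.Text.RegularExpressions ones.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input containing three capitalizations of the acronym.
string input = "FUBAR, Fubar, fubar";

// Default matching is case sensitive: only the all-lowercase form is found.
int strict = Regex.Matches(input, "fubar").Count;

// RegexOptions.IgnoreCase finds every capitalization.
int relaxed = Regex.Matches(input, "fubar", RegexOptions.IgnoreCase).Count;

// One character set per letter gives the same result without the option.
int bySet = Regex.Matches(input, "[fF][uU][bB][aA][rR]").Count;

Console.WriteLine($"{strict} {relaxed} {bySet}"); // 1 3 3
```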


Basic Construction:


(...)  Logical group
{...}  Explicit quantifier
[...]  Explicit char set


These help order and combine matches. Example: In the string "abcABC123aBcabc", ([aA][bB]c){2} will match "aBcabc" because the square brackets [] treat their contents as an "OR" condition (a OR A, useful for single characters), the parentheses () group the three characters together, and the trailing braces with a number {2} require whatever is in the preceding parentheses () to occur 2 times in a row. ("abcABC" at the start does not qualify, because the pattern's final c is lowercase only.) The explicit character set brackets [] may also be used to specify ranges. Example: [A-Za-z0-9] will match any single alphanumeric character, lower or upper case.
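
The grouping example can be checked with a few lines of C#. This is only a sketch; the second input string is invented to show a range being used.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "abcABC123aBcabc";

// Group of three characters, repeated exactly twice in a row.
Match m = Regex.Match(input, "([aA][bB]c){2}");
string matched = m.Value;   // "aBcabc"
int position = m.Index;     // 9, right after "abcABC123"

// A negated range: strip everything that is not alphanumeric.
// "a-b_c 1!2" is a made-up input for illustration.
string alnumOnly = Regex.Replace("a-b_c 1!2", "[^A-Za-z0-9]", "");

Console.WriteLine($"{matched} at {position}"); // aBcabc at 9
Console.WriteLine(alnumOnly);                  // abc12
```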


^  Start of string
$  End of string


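(^...$) These anchor the match to the start or end of the string. Example: In the string "abcxyz", (^xyz) will match nothing because the string does not start with "xyz"; (xyz) will match just "xyz"; and (xyz$) will also match just "xyz", but only because it sits at the end of the string. To match the entire string, anchor both ends: (^abcxyz$).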
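
A short sketch of how the anchors behave on the "abcxyz" string. Note that the text returned for xyz$ is the three characters at the end, not the whole input.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "abcxyz";

// ^ requires the match to begin at the start of the string.
bool startsWithXyz = Regex.IsMatch(input, "^xyz");   // false

// Unanchored: found wherever it occurs.
Match anywhere = Regex.Match(input, "xyz");          // "xyz" at index 3

// $ requires the match to finish at the end of the string.
Match atEnd = Regex.Match(input, "xyz$");            // "xyz" at index 3

// Both anchors together force the pattern to cover the whole string.
Match whole = Regex.Match(input, "^abcxyz$");        // "abcxyz" at index 0

Console.WriteLine(startsWithXyz);                    // False
Console.WriteLine($"{anywhere.Value} {anywhere.Index}");
Console.WriteLine($"{atEnd.Value} {atEnd.Index}");
Console.WriteLine($"{whole.Value} {whole.Index}");
```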


?  0 or 1 instance (prefers 1 when both are possible; placed after another quantifier, as in *? or +?, it requests minimal matching)
*  0 or more instances
+  1 or more instances


These, as with the other operators below, always apply to the single preceding character or group, not to every character before them. Examples:
(abc?xyz) will match "abxyz" and "abcxyz", but will not match "abccxyz", "axyz", etc.
(abc*xyz) will match "abxyz", "abcxyz", "abcccxyz", etc., but will not match "aaaxyz", etc.
(abc+xyz) will match "abcxyz", "abccxyz", etc., but will not match "abxyz", etc.
To repeat more than one character, group them first: ((abc)+xyz) will match "abcabcxyz".
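
The scope of a quantifier is easy to get wrong, so here is a small sketch that tests whole strings against each pattern. Whole is a hypothetical helper written for this example, not part of .NET; it wraps the pattern in ^(?:...)$ so only full-string matches count.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: true only when the pattern matches the entire input.
static bool Whole(string pattern, string input) =>
    Regex.IsMatch(input, "^(?:" + pattern + ")$");

// ? applies to the c only: zero or one c.
bool q0 = Whole("abc?xyz", "abxyz");     // true
bool q1 = Whole("abc?xyz", "abcxyz");    // true
bool q2 = Whole("abc?xyz", "abccxyz");   // false, two c's
bool q3 = Whole("abc?xyz", "axyz");      // false, the b is still required

// * allows any number of c's, including none.
bool s0 = Whole("abc*xyz", "abxyz");     // true
bool s3 = Whole("abc*xyz", "abcccxyz");  // true

// + needs at least one c.
bool p0 = Whole("abc+xyz", "abxyz");     // false
bool p2 = Whole("abc+xyz", "abccxyz");   // true

// Grouping makes the quantifier cover all three characters.
bool g2 = Whole("(abc)+xyz", "abcabcxyz"); // true

Console.WriteLine($"{q0} {q1} {q2} {q3} {s0} {s3} {p0} {p2} {g2}");
```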



.  Any character except the line feed \n (with RegexOptions.Singleline it matches \n as well)
|  Alternation (the pipe sign is used for logical OR)
\  Escape notation to specify a special character (ex: \s, \w, \' etc.)


Examples:
(ab.xyz) will match "abcxyz", "abgxyz", "abzxyz", etc., but will not match "abccxyz", etc.
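abc(xy|xyz) will match "abcxy", but will not match "abczyx", etc. Alternatives are tried left to right, so in "abcxyz" it stops at "abcxy"; write abc(xyz|xy) to prefer the longer form.
abc\sxyz will match "abc xyz" because \s is the notation for any whitespace character (space, tab, line break).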
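
The following sketch exercises the dot, alternation, and \s on short inputs. The verbatim strings (@"...") are used so the backslash reaches the regex engine untouched.

```csharp
using System;
using System.Text.RegularExpressions;

// The dot stands for exactly one character.
bool dotOne = Regex.IsMatch("abgxyz", "ab.xyz");    // true
bool dotTwo = Regex.IsMatch("abccxyz", "ab.xyz");   // false, two characters between b and x

// Alternatives are tried left to right; the first one that works wins.
string shortFirst = Regex.Match("abcxyz", "abc(xy|xyz)").Value;  // "abcxy"
string longFirst  = Regex.Match("abcxyz", "abc(xyz|xy)").Value;  // "abcxyz"

// \s matches any whitespace character, not just a space.
bool space = Regex.IsMatch("abc xyz", @"abc\sxyz");   // true
bool tab   = Regex.IsMatch("abc\txyz", @"abc\sxyz");  // true

Console.WriteLine($"{dotOne} {dotTwo} {shortFirst} {longFirst} {space} {tab}");
```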


Declaring RegEx in C# and VB.NET: (using basic construction methods from above -- not an accurate e-mail filter)


C# Examples:

// Requires: using System.Text.RegularExpressions;
// IgnorePatternWhitespace is needed so the padding spaces are not matched literally.
Regex reEmail = new Regex(
    "(^             (?# Match from start of the string) " +
    "(.){2}         (?# Find two characters) " +
    ".*             (?# Find zero or more characters) " +
    "(@)            (?# Find a @ symbol) " +
    ".+             (?# Find one or more characters) " +
    "((.*)|([.]*))  (?# Find zero or more characters, or zero or more dots) " +
    "[.]            (?# Find exactly one dot) " +
    "(.){2}         (?# Find two characters) " +
    ".?             (?# Find zero or one characters) " +
    "$)             (?# Match all the way to end of the string)",
    RegexOptions.Singleline | RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);


// Verbatim string (@"...") so the C# compiler leaves the regex backslashes alone.
Regex reWebsite = new Regex(
    @"^([hH][tT][tT][pP][sS]?://)?(www\.|[a-zA-Z0-9].)[a-zA-Z0-9\-\.]+\.[a-zA-Z]*$",
    RegexOptions.Singleline
);
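
Once declared, a Regex object is typically used through IsMatch or Match. The sketch below shows both with a simplified stand-in for the website pattern; the pattern and the sample URLs are assumptions made for this example and, like the declarations above, are not an accurate URL filter.

```csharp
using System;
using System.Text.RegularExpressions;

// Simplified stand-in pattern: optional scheme, optional "www.", then a
// dotted host name ending in letters. Not a real URL validator.
Regex reSite = new Regex(
    @"^([hH][tT][tT][pP][sS]?://)?(www\.)?[a-zA-Z0-9\-\.]+\.[a-zA-Z]+$");

// IsMatch answers yes or no.
bool full   = reSite.IsMatch("http://www.example.com");  // true
bool upper  = reSite.IsMatch("HTTPS://example.org");     // true
bool noDot  = reSite.IsMatch("example");                 // false, no dot
bool spaces = reSite.IsMatch("not a url");               // false, spaces not allowed

// Match exposes the captured groups; group 1 is the scheme here.
Match m = reSite.Match("http://www.example.com");
string scheme = m.Groups[1].Value;                       // "http://"

// An optional group that did not take part reports Success == false.
bool hasScheme = reSite.Match("example.org").Groups[1].Success; // false

Console.WriteLine($"{full} {upper} {noDot} {spaces} {scheme} {hasScheme}");
```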


VB.NET Examples:

' Requires: Imports System.Text.RegularExpressions
' IgnorePatternWhitespace is needed so the padding spaces are not matched literally.
Dim reEmail As New Regex( _
    "(^             (?# Match from start of the string) " & _
    "(.){2}         (?# Find two characters) " & _
    ".*             (?# Find zero or more characters) " & _
    "(@)            (?# Find a @ symbol) " & _
    ".+             (?# Find one or more characters) " & _
    "((.*)|([.]*))  (?# Find zero or more characters, or zero or more dots) " & _
    "[.]            (?# Find exactly one dot) " & _
    "(.){2}         (?# Find two characters) " & _
    ".?             (?# Find zero or one characters) " & _
    "$)             (?# Match all the way to end of the string)", _
    RegexOptions.Singleline Or RegexOptions.IgnoreCase Or RegexOptions.IgnorePatternWhitespace)

Dim reWebsite As New Regex( _
    "^([hH][tT][tT][pP][sS]?://)?(www\.|[a-zA-Z0-9].)[a-zA-Z0-9\-\.]+\.[a-zA-Z]*$", _
    RegexOptions.Singleline _
)


I'd like to thank the authors for their work on the following regular expression articles, guides, and tools. These are excellent web references for regular expressions.


Another confused company
Either Microsoft is paying this company to provide access only for Microsoft Internet Explorer (IE) users, or the company is absolutely clueless. This is the 21st century: when you develop for the web, you should develop non-platform-specific apps and sites.

I just ran across a site (the parent company or organization is mapleglobal.com) that refuses to allow non-IE users to visit. Interesting. The domain, nexon.net, specifically blocks any visitor not using IE (here's their error page). Nexon is a company or group offering free online games (i.e., low-quality MMORPGs). I wonder how the web development team at Nexon would fare if the sites they visit only allowed Mozilla, Firefox, and Opera users? These people need a clue. Millions of individuals and plenty of institutions, including major universities like Penn State, run Firefox as their standard browser.

This almost encourages me to block IE users from my blog. Hmmmm, that thought is worthy of filing away in the back of my mind. *wink* But *sigh*, open source geeks like me must adhere to our high standards of non-platform-specific ideals. Since I loathe Microsoft, I wonder how many more times I can say that with a straight face? Microsoft, it's only a matter of time before your new Vista O/S drives everyone away. Then we'll see how the market share compares when previous Windows users decide to run a Linux distro. The Linux community grows, while Windows becomes less of a power user O/S and more of a consumer electronics O/S.

I encourage you to send your complaints to any company you find whose site specifically blocks non-IE users. Also, visit Mozilla's tech evangelism page to complain about a specific site. This particular site has been entered in Mozilla's Bugzilla as Tech Evangelism bug #316550.