RegEx in 20 minutes...
On your marks, get set, GO!
October 24, 2018
Like any ex, regex evokes emotion. Some good, some bad, but always both... Lots of jokes and puns in here, but the truth is, we will inevitably have to learn how to use regular expressions. So buckle up, sit up straight and learn it!
On Your Marks
The forward slashes / simply define the beginning and ending of your regular expression, like your curly brackets in programming {} or your quotes in a string "blah blah blah."
In JavaScript, the slashes let you put the RegEx directly into built-in string methods like replace or search.
By default, a RegEx stops at the first match it finds. If you want it to keep going and find every match (of course you always do!), then add a g at the end for 'global.'
//javascript RegEx example of replacing all the 1s with 2s
var s1 = 's1111111';
var rx = /1/g; // g = global: replace every match, not just the first
var s2 = s1.replace(rx, '2'); // 's2222222'
alert(s2);
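To see what the g flag actually changes, here is a small sketch that runs the same replacement with and without it. The variable names are made up for illustration, and console.log stands in for alert so the snippet also runs outside a browser.

```javascript
// Same input as the example above; the names are illustrative only.
const input = 's1111111';

// Without g, replace() stops after the first match.
const firstOnly = input.replace(/1/, '2');

// With g, replace() keeps going through the whole string.
const every = input.replace(/1/g, '2');

console.log(firstOnly); // s2111111
console.log(every);     // s2222222
```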
Get Set
The square brackets [] say match any one of the characters listed inside of me, unless there is a hat symbol ^ right after the opening bracket, which is a NOT symbol meaning match anything except these characters.
This is somewhat confusing because the hat symbol ^ also means start of line when it is outside the brackets, like /^[0-9]/. Dumb, I know.
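Here is a quick sketch of the two meanings of the hat symbol side by side. The sample strings and variable names are my own, picked only to make the difference visible.

```javascript
// ^ outside the brackets: the match has to begin at the start of the string.
const startsWithDigit = /^[0-9]/;

// ^ right after the opening bracket: match one character that is NOT a digit.
const hasNonDigit = /[^0-9]/;

console.log(startsWithDigit.test('7 dwarfs')); // true
console.log(startsWithDigit.test('dwarf 7'));  // false
console.log(hasNonDigit.test('12345'));        // false
console.log(hasNonDigit.test('123a5'));        // true
```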
Ranges
The dash - denotes a range, but if you aren't careful (say, you put it first or last inside the brackets), it will just mean a '-' character.
Ranges work for any run of the ASCII table: numbers, letters and other type-able characters. For example, the regex /[a-zA-Z]/ says you are testing for any upper or lower case letter. If you want to match special characters like cartoon cussing: !@#$%, then you can list them specifically, but you will probably have to either escape some of them or use them in a range like this:
/[ -~]/
Let's break that down. The slashes and the brackets are just the containers and the  -~ is a range between space ' ' and tilde ~, which includes every printable ASCII character.
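A rough sketch of both points: the space-to-tilde range, and the dash that quietly turns into a plain '-' character. The test strings and names are assumptions for illustration. The first regex negates the range with a hat so it flags anything outside printable ASCII.

```javascript
// ^ first inside the brackets flips the range:
// match anything that is NOT between space and tilde.
const nonPrintable = /[^ -~]/;

console.log(nonPrintable.test('Cartoon cussing: !@#$%')); // false
console.log(nonPrintable.test('tab\there'));              // true, a tab sits below space in ASCII

// A dash between two characters is a range; a dash at the very end is a literal '-'.
const stripRange = 'a-b-c-d'.replace(/[a-c]/g, '');  // removes a, b and c
const stripBoth = 'a-b-c-d'.replace(/[a-c-]/g, '');  // removes a, b, c and '-'

console.log(stripRange); // ---d
console.log(stripBoth);  // d
```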
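//javascript RegEx example of finding the first digit within a string
var s = 'The quick brown fox jumped over 1 lazy dog.';
var rx = /[0-9]/; // search() ignores the g flag, so it is left off
alert(s.search(rx)); // 32, the index of '1'
//javascript RegEx example of finding the first character that is not a letter
rx = /[^a-zA-Z]/;
alert(s.search(rx)); // 3, the index of the first space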
GO!
Your head is about to start spinning, so before I lose you, I'm going to make sure you are screwed up real good with a couple of quick reference hammers to the head.
Symbol | Meaning |
---|---|
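\ | escape character: treat the next character literally (or give it a special meaning, like \d for a digit) |
+ | one or more of whatever comes right before it |
{3,16} | from 3 to 16 of whatever comes right before it |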
< | nothing - just the actual character (good for parsing HTML) |
> | nothing - just the actual character (good for parsing HTML) |
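A minimal sketch of the symbols in the table, one at a time. The sample strings and variable names are assumptions for illustration. The last pattern also uses $ for end of string, which is not covered above, so that the length limit applies to the whole input.

```javascript
// \. escapes the dot so it means a literal period instead of "any character".
const dotIndex = 'file.txt'.search(/\./);
console.log(dotIndex); // 4

// + means one or more of the thing right before it.
const run = 'aaa bbb'.match(/a+/)[0];
console.log(run); // aaa

// {3,16} means the thing right before it repeats 3 to 16 times.
// ^ and $ pin the pattern to the start and end of the string.
const username = /^[a-z0-9_-]{3,16}$/;
console.log(username.test('my-us3r_n4m3')); // true
console.log(username.test('ab'));           // false, too short

// < and > have no special meaning; they match themselves.
const hasBoldTag = /<b>/.test('<b>hi</b>');
console.log(hasBoldTag); // true
```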
Groups (words, phrases, abbreviations, etc.)
They call stuff in parentheses ( ) "capturing groups." Unfortunately, this is totally confusing. For now, think of parentheses as your "this or that or that" lists.
The difference between brackets [ ] and parentheses ( ) is that the brackets are just looking at 1 character at a time and the parentheses are looking for each 'group' of characters. Each group of characters needs to be separated by a pipe |, which in regex means or.
Check out the following example and don't lose your momentum, you're almost there.
//javascript RegEx example of looking for several words within a string but failing
var s = 'The quick brown fox jumped over 1 orange dog.';
var rx = /(red|green|blue)/;
var f = (s.search(rx) >= 0) ? 'true' : 'false'; // search() returns -1 when nothing matches, and 0 for a match at the very start
alert(f); // 'false'
//javascript RegEx example of looking for several words within a string and succeeding
rx = /(red|brown|blue)/;
f = (s.search(rx) >= 0) ? 'true' : 'false';
alert(f); // 'true'
Bonus Round
If you were lost on this line:
var f = (s.search(rx) >= 0) ? 'true' : 'false';
Please read my Ternary is NOT Explanatory blog. Super short but super helpful.
Also, there really is a reason to call (red|green|blue) a capturing group: the regex remembers whatever the parentheses matched so you can reuse it. It is confusing at first and yet sometimes helpful. So I'll just leave you with an example and an explanation, but don't worry about it if you don't get it:
/(red|green)\1blue/g
The \1 is a placeholder for whatever you matched in your first parentheses list, in this case red or green. So, the regex above would match redredblue or greengreenblue, but not redgreenblue or redblue.
In English: /(red|green)\1blue/g means I am a regular expression looking for substrings within a string that are either redredblue or greengreenblue.
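If you want to check that for yourself, this sketch runs the four strings from above through the regex. The g flag is left off because test only needs a yes or no, and the variable names are just for illustration.

```javascript
// \1 must repeat exactly what the first group matched.
const doubled = /(red|green)\1blue/;

console.log(doubled.test('redredblue'));     // true
console.log(doubled.test('greengreenblue')); // true
console.log(doubled.test('redgreenblue'));   // false
console.log(doubled.test('redblue'));        // false

// The captured text is available at index 1 of the match result.
const remembered = 'paint it greengreenblue'.match(doubled)[1];
console.log(remembered); // green
```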
Now, go start practicing your own RegEx using this: https://www.regextester.com/
Here are some really helpful links:
https://code.tutsplus.com/tutorials/8-regular-expressions-you-should-know--net-6149
https://www.w3schools.com/jsref/jsref_obj_regexp.asp
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions