9 Regular Expressions You Should Know

Regular expressions are a language of their own. Today, we'll review nine regular expressions that you should know for your next coding project.

For a refresher on regular expressions, check out our regex cheat sheet.

What Is a Regular Expression?

Regular expressions (also known as regex) are a concise and flexible way to search and replace within text strings. With a regular expression, you can easily match characters, words, or patterns within text. A really basic example would be the regex /c.t/, where the dot stands for any single character: it would match "cat", "cot", or "cut", but not "pat" or "but".
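To see the difference between the dot and the star in practice, here is a small sketch. I'm using Python's re module purely for illustration (the article doesn't tie itself to one language), and the variable names are my own.

```python
import re

words = ["cat", "cot", "cut", "pat", "but"]

# The dot stands for exactly one character (other than a newline), so
# "c.t" needs a "c", then any single character, then a "t".
dot_pattern = re.compile(r"c.t")

# The star means "zero or more of the preceding character", so "c*t"
# is satisfied by any "t", even with no "c" in front of it.
star_pattern = re.compile(r"c*t")

dot_hits = [w for w in words if dot_pattern.search(w)]
star_hits = [w for w in words if star_pattern.search(w)]

print(dot_hits)   # ['cat', 'cot', 'cut']
print(star_hits)  # ['cat', 'cot', 'cut', 'pat', 'but']
```

The star version finds a match in every word because each one contains a "t", which is why the dot is the right tool for "any one character here".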

The regular expressions I'll show you today will allow you to match:

1. a username
2. a password
3. a hex value
4. a slug
5. an email
6. a URL
7. an IP address
8. an HTML tag
9. dates

As the list goes down, the regular expressions get more and more elaborate.

The key thing to remember about regular expressions is that they are almost read forwards and backwards at the same time. This sentence will make more sense when we talk about matching HTML tags.

The delimiters used in the regular expressions are forward slashes: /. Each pattern begins and ends with a delimiter. If a forward slash appears in a regex, we must escape it with a backslash like so: \/.
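If your language has no regex literals, you have to strip the delimiters yourself. The sketch below shows one way to do that in Python. Both compile_delimited and FLAG_MAP are hypothetical names of mine, not part of any library, and only the i, m, and s flags are handled.

```python
import re

# Hypothetical mapping from trailing flag letters to Python's flags.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def compile_delimited(literal):
    """Compile a pattern written as /body/flags (hypothetical helper)."""
    end = literal.rfind("/")
    if not literal.startswith("/") or end == 0:
        raise ValueError("expected a pattern of the form /body/flags")
    body = literal[1:end]
    flags = 0
    for letter in literal[end + 1:]:
        if letter not in FLAG_MAP:
            raise ValueError(f"unsupported flag: {letter}")
        flags |= FLAG_MAP[letter]
    # Python's re module treats an escaped slash (\/) as a literal slash,
    # so the body can be passed through unchanged.
    return re.compile(body, flags)


slash_pattern = compile_delimited(r"/^\/blog\/\d+$/")
print(bool(slash_pattern.match("/blog/42")))  # True
print(bool(slash_pattern.match("blog/42")))   # False
```

In Python you would normally just write the pattern body as a raw string; the helper only exists to make the role of the delimiters and the escaped slash visible.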

1. Matching a Username

String that matches:

my-USER_n4m3

2. Matching a Password

Pattern:

/^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$^&*()_-]).{8,18}$/

Description:

One way to make passwords hard to guess is to make sure that people include at least one digit, at least one small letter, at least one capital letter, and at least one special character. It also makes sense for us to include some kind of length check. For example, the password needs to have a length between 8 and 18 characters.

Our regex checks for all these conditions. The regex is divided into five distinct parts. The first four parts are positive lookahead expressions: groups in parentheses, prefixed by ?=. Each lookahead is tested from the start of the string and does not consume any characters, so the required characters can appear anywhere and in any order. Let's consider (?=.*\d) as an example. It looks for a digit anywhere in our string. Similarly, (?=.*[a-z]) looks for a small letter, (?=.*[A-Z]) looks for a capital letter, and (?=.*[!@#$^&*()_-]) looks for one of the listed special characters. The fifth part, .{8,18}, is the length check: the whole string must be between 8 and 18 characters long.

Keep in mind that we did not use the expression (?=.*[a-zA-Z]) within our regex. This would have meant that the presence of either a small letter or a capital letter would have given a positive result. We wanted both small and capital letters.

String that matches:

Myp^ssw0rd

String that doesn't match:

Myp^ssword (does not contain any digits)
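Here is a minimal sketch of the password check, assuming Python's re module. The pattern is the one shown above with the delimiters removed; is_strong_password and the extra test strings are my own additions for illustration.

```python
import re

# Four lookaheads (digit, small letter, capital letter, special character)
# followed by the 8-to-18 character length check.
PASSWORD_RE = re.compile(
    r"^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$^&*()_-]).{8,18}$"
)


def is_strong_password(candidate):
    """Return True if the candidate satisfies every condition."""
    return PASSWORD_RE.match(candidate) is not None


print(is_strong_password("Myp^ssw0rd"))  # True
print(is_strong_password("Myp^ssword"))  # False: no digit
print(is_strong_password("myp^ssw0rd"))  # False: no capital letter
print(is_strong_password("My^w0rd"))     # False: only 7 characters
```

Because every lookahead starts from the beginning of the string, the order in which the digit, the letters, and the special character appear makes no difference.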

3. Matching a Hex Value

Pattern:

/^#?([a-f0-9]{6}|[a-f0-9]{3})$/i

Description:

We begin by telling the parser to find the beginning of the string (^). Next, a pound sign is optional because it is followed by a ?. The question mark tells the parser that the preceding character (in this case, a pound sign) is optional, but to be "greedy" and capture it if it's there. Next, inside the first group (the first set of parentheses), we can have two different situations. The first is six characters, each a lowercase letter between a and f or a digit. The | tells us that we can instead have three such characters. Finally, we want the end of the string ($). We also use the case-insensitive flag by adding an i at the end of our expression. This allows us to match #ffffff as well as #FFFFFF.

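The reason that I put the six characters first is so that the parser tries to capture a full hex value like #ffffff before anything shorter. With the ^ and $ anchors in place, reversing the order would still match the same strings, because the parser backtracks to the other alternative when the end of the string isn't reached. Without the anchors, though, putting the three characters first would make the parser pick up only #fff and not the other three 'f's.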

String that matches:

#a3c113

String that doesn't match:

#4d82h4 (contains the letter h)
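The hex pattern can be tried out with a short sketch, again assuming Python's re module. HEX_RE is the pattern from this section. The two unanchored patterns are my own variation, added only to show where the order of the alternatives becomes visible.

```python
import re

# The pattern from this section; re.IGNORECASE plays the role of the i flag.
HEX_RE = re.compile(r"^#?([a-f0-9]{6}|[a-f0-9]{3})$", re.IGNORECASE)

# Unanchored variants (my own), differing only in alternation order.
SIX_FIRST = re.compile(r"#?([a-f0-9]{6}|[a-f0-9]{3})", re.IGNORECASE)
THREE_FIRST = re.compile(r"#?([a-f0-9]{3}|[a-f0-9]{6})", re.IGNORECASE)


def is_hex_color(value):
    """Return True for three- or six-digit hex values, with or without #."""
    return HEX_RE.match(value) is not None


print(is_hex_color("#a3c113"))  # True
print(is_hex_color("#4d82h4"))  # False: contains the letter h
print(is_hex_color("#FFF"))     # True

print(SIX_FIRST.search("#ffffff").group(1))    # ffffff
print(THREE_FIRST.search("#ffffff").group(1))  # fff
```

The last two lines show the effect of ordering: without anchors, the first alternative that succeeds wins, so the three-character version stops early.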

4. Matching a Slug

String that matches:

my-title-here

String that doesn't match:

my_title_here (contains underscores)

5. Matching an Email

String that matches:

john@doe.com

String that doesn't match:

john@doe (TLD is missing)

6. Matching a URL

7. Matching an IP Address

Description:

Now, I'm not going to lie, I didn't write this regex; I got it from RegExr.

The first capture group really isn't a captured group because ?: was placed inside, which tells the parser not to capture this group (more on this in the last regex). We also want this non-captured group to be repeated three times—the {3} at the end of the group. This group contains another group, a subgroup, and a literal dot. The parser looks for a match in the subgroup and then a dot to move on.

The subgroup is another non-capture group. It's just a bunch of character sets (the things inside brackets) which together describe the numbers between 0 and 255: the string "25" followed by a digit between 0 and 5; or the string "2" followed by a digit between 0 and 4 and then any digit; or an optional zero or one followed by two digits, with the second being optional.

After we match three of those, it's onto the next non-capturing group. This one wants: the string "25" followed by a number between 0 and 5; or the string "2" with a number between 0 and 4 and another number at the end; or an optional zero or one followed by two numbers, with the second being optional.

We end this confusing regex with the end of the string.

String that matches:

73.60.124.136

String that doesn't match:

256.60.124.136 (each part must be 255 or less)
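The description above can be turned into a working check. The following is my reconstruction from that description, assuming Python's re module, and not necessarily the exact regex from RegExr; OCTET, IP_RE, and is_ip_address are names I made up for the sketch.

```python
import re

# One number between 0 and 255: "25" plus 0-5, or "2" plus 0-4 plus any
# digit, or an optional 0/1 followed by one or two digits.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Three "number plus dot" groups, then one final number, anchored at
# both ends of the string.
IP_RE = re.compile(r"^(?:" + OCTET + r"\.){3}" + OCTET + r"$")


def is_ip_address(text):
    """Return True if text looks like a dotted IPv4 address."""
    return IP_RE.match(text) is not None


print(is_ip_address("73.60.124.136"))   # True
print(is_ip_address("256.60.124.136"))  # False: 256 is out of range
print(is_ip_address("1.2.3"))           # False: only three parts
```

Building the pattern from a named OCTET piece is a design choice of mine; it keeps the repeated 0 to 255 logic in one place instead of spelling it out twice.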

8. Matching an HTML Tag

9. Matching Dates

Description:

Let's start with the dates. The dates in a month can go from a minimum value of 1 to 31 at the most. Users can also write the dates as 02 instead of just 2 for the day of the month. We cover all these scenarios with the first part of the expression. As you can see, if the first digit is 1 or 2 we allow the second digit to be anything between 0 and 9. If the first digit is 3, the second digit is only allowed to be 0 or 1.

For separators, we want the characters to only be a hyphen, dot, space, or slash. This is put inside a capturing group so that we can check that the same separator is used between the month and the year value.

The month can only go up to 12, so we allow the second digit to only be 0, 1, or 2 if the first digit is 1. There is no restriction on the year number. It can be any four-digit number like 1508 or 9999. We also allow the year to be written with two digits in case someone wants to write the date as 11/09/91.
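Putting the day, separator, month, and year rules together gives something like the sketch below. It is a reconstruction from the description, assuming Python's re module, so treat the exact pattern and the names DATE_RE and looks_like_date as my assumptions.

```python
import re

DATE_RE = re.compile(
    r"^(0?[1-9]|[12][0-9]|3[01])"  # day: 1-31, leading zero allowed
    r"([-. /])"                    # separator, captured as group 2
    r"(0?[1-9]|1[0-2])"            # month: 1-12, leading zero allowed
    r"\2"                          # the same separator again
    r"([0-9]{4}|[0-9]{2})$"        # year: four or two digits
)


def looks_like_date(text):
    """Return True if text has the DD/MM/YYYY shape described above."""
    return DATE_RE.match(text) is not None


print(looks_like_date("11/09/91"))    # True
print(looks_like_date("02-12-2020"))  # True
print(looks_like_date("11/09-1991"))  # False: mixed separators
print(looks_like_date("32/01/2020"))  # False: day out of range
print(looks_like_date("10/13/2020"))  # False: month out of range
```

The backreference \2 is what enforces a consistent separator: it only matches the exact character that the second group captured.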

Remember that the above regex is for dates which follow the format DD/MM/YYYY. Try modifying it for the date format MM/DD/YYYY.

One more thing that I would like to point out is that the above regex will consider 31.02.1991 as a valid date. However, we know that this is an invalid date since February has at most 29 days. We could write our regex to make sure that the number of days in February never exceeds 28 for regular years and 29 for leap years. However, that would make the regex unnecessarily complicated. It is much more practical to use date validation libraries for these edge cases.
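As a rough illustration of handing the calendar rules to a library, here is a sketch that uses datetime from Python's standard library. That choice, the helper name is_real_date, and the fixed DD.MM.YYYY format string are all assumptions of mine.

```python
from datetime import datetime


def is_real_date(text, fmt="%d.%m.%Y"):
    """Return True if text is a date that exists on the calendar."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        # Raised for impossible dates as well as for the wrong format.
        return False


print(is_real_date("31.02.1991"))  # False: February never has 31 days
print(is_real_date("29.02.1991"))  # False: 1991 is not a leap year
print(is_real_date("29.02.1992"))  # True: 1992 is a leap year
```

A reasonable split is to let the regex check the overall shape of the input and let the date library decide whether the day actually exists.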

Conclusion

I hope that you have grasped the ideas behind regular expressions a little bit better. Hopefully you'll be using these regexes in future projects! Many times, you won't need to decipher a regex character by character, but sometimes if you do this, it helps you learn. Just remember, don't be afraid of regular expressions—they might not seem it, but they make your life a lot easier. Just try to pull out a tag's name from a string without regular expressions!

This post has been updated with contributions from Monty Shokeen. Monty is a full-stack developer who also loves to write tutorials and to learn about new JavaScript libraries.