FREE | Lessons: 65 | Length: 7.1 hours


# 7.4 More About Regular Expressions

There’s still a lot to learn about regular expressions. In this lesson, we’ll learn some advanced features of regular expressions, including matching and replacing strings.

Key terms:

• |
• ?
• replace
• ?=
• ?:
• match

### 7.4 More About Regular Expressions

Hi, folks. In this lesson we're going to continue looking at regular expressions, as there are still a lot of things that we haven't covered yet.

One simple special character that we didn't cover in the last lesson is the pipe character, which allows us to specify an or condition within a regular expression. One use for this is to verify whether a string starts with either http or https. The test method in this case should return true, because the test string does begin with http. If we change that to https instead, it still passes, whereas if we remove it completely, we can see that it now fails.

The regular expression is not as concise as it could be, though. We've duplicated the h, t, t and p characters, when really the only difference between http and https is the s character. So let's improve the regular expression slightly. In this case, we've updated the pattern to include a group. The group is the character s, and we're using the question mark to say that the s may or may not occur. This is much more concise than the previous regular expression, but it essentially means the same thing. We find that it's still passing, and if we add an s to the string being tested, we see that it still passes.

I just want to point out as well that, just like the email address validation from the last lesson, the pattern that we're using now is nowhere near complex enough to test for proper URLs. We're just checking the first few letters of the URL.

As well as breaking up patterns to match in a regular expression, parentheses also remember the characters that are matched, and these can be used later by some methods, like the replace method of a string. This can be really useful if we want to change the string that we're testing in some way. So let's say that we wanted to take a camelCased string and insert an underscore character before each capital letter.
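The code shown on screen is not captured in this transcript. A minimal sketch of the examples so far might look like the following. The URLs, the `^` anchors, the variable names, and the test string `camelCasedString` are all assumptions made for illustration, not the lesson's actual code.

```javascript
// Hypothetical reconstruction; names and sample strings are assumptions.

// Alternation with the pipe: the string must start with "http" or "https".
const verbosePattern = /^(http|https)/;

// More concise: the group (s) followed by ? makes the "s" optional.
const concisePattern = /^http(s)?/;

console.log(verbosePattern.test("http://example.com"));  // true
console.log(verbosePattern.test("https://example.com")); // true
console.log(verbosePattern.test("example.com"));         // false
console.log(concisePattern.test("http://example.com"));  // true
console.log(concisePattern.test("https://example.com")); // true

// A capturing group around [A-Z], with $1 in the replacement string
// referring to whatever that group matched.
const testString = "camelCasedString"; // assumed test string
const firstOnly = testString.replace(/([A-Z])/, "_$1");

console.log(firstOnly); // "camel_CasedString"
```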
Let's view the output in a browser first, and then we'll come back and talk about the syntax. So it has kind of worked: it's found the first uppercase letter in the test string, which was an uppercase C, and it has replaced that uppercase C with an uppercase C preceded by an underscore character. It hasn't done the second uppercase letter, the S, and we'll talk about why in just a moment.

Let's go back to the example and take a look at the syntax. This works in a slightly different way from some of the examples that we've looked at previously. This time, instead of storing the pattern in a variable, we have stored the string that we want to test in a variable. The reason is that the replace method is called on strings and not on regular expressions. However, the replace method can accept a regular expression as its first argument, and that's where we actually provide the regular expression.

We still use the forward slash at the start and the end of the regular expression. This time we create a group using the parentheses, and inside that group we specify a character set of any letter between A and Z, in uppercase. In the second argument to replace, we have specified an underscore character as the literal character that we'd like to insert into the string whenever an uppercase letter is found. But we've also used the special $1 token, which is only available when replacing with a regular expression. It means the value remembered by the first set of parentheses within the regular expression. So when it matched the capital C of the word Cased, that capital C was available as $1.

Now, the reason it only replaced the first uppercase character, the C, is that without the global flag the replace method stops after the first match. If we want the regular expression to match all uppercase letters within the test string, we can make the regular expression global, and we do that by specifying the global flag.
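As a rough sketch of the difference the global flag makes, assuming a test string of `camelCasedString` (the transcript does not record the exact string or variable names used on screen):

```javascript
// Assumed test string; the lesson's actual string is not in the transcript.
const testString = "camelCasedString";

// Without the g flag, replace stops after the first match.
const firstOnly = testString.replace(/([A-Z])/, "_$1");

// With the g flag after the closing slash, every match is replaced.
const allMatches = testString.replace(/([A-Z])/g, "_$1");

console.log(firstOnly);  // "camel_CasedString"
console.log(allMatches); // "camel_Cased_String"
```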
The global flag is provided after the regular expression but, weirdly, it is still part of the regular expression, even though it comes after the delimiter that marks the end of the pattern. And now we find that both of the uppercase letters in the string have been replaced with an underscore and the uppercase letter.

So parentheses will automatically remember whatever the group inside them matches. We might not always want that to happen, and if we don't, we can use a question mark and a colon inside the parentheses. We do that at the start of the group, and watch what happens now. The $1 has gone from being a reference to the remembered match to being part of the literal replacement string. Because we've specified the group as a non-capturing group using the question mark and colon, the $1 has stopped working.

One final way that we can use parentheses in regular expressions is for lookaheads. Lookaheads can be either positive or negative, and they match a pattern only if it is followed by, or not followed by, another pattern. Let's see what this outputs in the browser: it outputs false.

We've gone back to specifying the regular expression in a variable, but this time we've used the script variable. We've also used a different flag this time. As you can see, we've passed i after the regular expression, and that makes the regular expression case insensitive. By default, regular expressions are case sensitive, so if we're not worried about case, we can just pass the i flag after the regular expression.

We've gone back to using the test method here, and we've passed the string Java. Even though Java does match the first part of our regular expression, the test fails, because the question mark and the equals sign mean "match only if followed by". We only want to match if the word is JavaScript, not if the word is just Java. So let's go back and add Script to the end of the test string.
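The non-capturing group and the positive lookahead described here can be sketched as follows. The exact patterns, variable names, and test strings are assumptions, since the on-screen code is not part of the transcript.

```javascript
// Assumed test string and patterns, for illustration only.
const testString = "camelCasedString";

// (?:...) groups without remembering the match, so there is no group 1
// and "$1" is inserted as literal text.
const nonCapturing = testString.replace(/(?:[A-Z])/g, "_$1");
console.log(nonCapturing); // "camel_$1ased_$1tring"

// Positive lookahead: match "java" only if it is followed by "script".
// The i flag makes the whole pattern case insensitive.
const followedByScript = /java(?=script)/i;

console.log(followedByScript.test("Java"));       // false
console.log(followedByScript.test("JavaScript")); // true
```

Note that the lookahead checks what comes next without consuming it, so the match itself is still only the four characters of Java.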
And we find that now the regular expression matches. That is a positive lookahead: the pattern will only match if it is followed by the lookahead pattern. If we want to negate that instead, so that we only match if the pattern is not followed by the lookahead pattern, we can change the equals sign to an exclamation mark. Now we find that JavaScript doesn't pass, and if we go back to just the string Java, it does pass. That is a negative lookahead.

So in this lesson we looked at some more aspects of regular expressions, including the special pipe character, which is used as an or condition. We then focused on the different uses of parentheses within regular expressions, including capturing and non-capturing groups, and positive and negative lookaheads. In the next lesson, let's move on to look at classes in JavaScript. Thanks for watching!
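For reference, the negative lookahead from this lesson might look something like this. The pattern and variable name are assumptions, as the transcript does not include the code shown on screen.

```javascript
// Negative lookahead: match "java" only if it is NOT followed by "script".
// Hypothetical pattern and name, for illustration only.
const notFollowedByScript = /java(?!script)/i;

console.log(notFollowedByScript.test("JavaScript")); // false
console.log(notFollowedByScript.test("Java"));       // true
```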
