Regular Expressions Part XI: Real Wildcards .* | LunaMetrics

Now we are (I am) ready for a Google Analytics Regular Expression that is truly a wildcard: .*

Months ago, I wrote a blog post about Regular Expressions Wildcards for Google Analytics. But when I went back to it, it was only semi-intelligible, so I deleted it and created all the Regular Expression building blocks first. If you like, you can read all of them, stretching out over a year:

Backslashes
Dots .
Carets ^
Dollar signs $
Question marks ?
Pipes |
Parentheses ()
Square brackets [] and dashes -
Plus signs +
Stars *
Regular Expressions for Google Analytics: Now let’s Practice
Bad Greed
RegEx and Good Greed
Intro to RegEx
{Braces}
Minimal Matching
Lookahead

Now that you (or perhaps more correctly, I) understand the building blocks, let’s talk about how to create real wildcards.

Most of us are familiar with a star as a wildcard outside of Regular Expressions. We can search for all the .jpg files on our computer with *.jpg, which to us means “get everything.jpg.” In Regular Expressions, however, a star only means “repeat the previous character zero or more times.” To make it mean “get everything,” you have to pair it with a dot, like so: .*
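To see the difference between the two kinds of star, here is a minimal sketch in Python. It uses Python's `fnmatch` and `re` modules as stand-ins for a file search and a Regular Expression engine. The file name is made up, and other RegEx engines (including the one behind Google Analytics) may treat a star at the very start of a pattern differently than Python does.

```python
import fnmatch
import re

filename = "vacation.jpg"  # hypothetical file name, for illustration only

# File-search wildcard: the star alone stands for "anything".
glob_hit = fnmatch.fnmatchcase(filename, "*.jpg")

# The same text used as a regular expression: in Python's re module,
# a star at the start has nothing before it to repeat, so it is rejected.
try:
    re.compile("*.jpg")
    star_alone_is_valid = True
except re.error:
    star_alone_is_valid = False

# Regular-expression spelling of "anything": a dot followed by a star.
regex_hit = re.fullmatch(r".*", filename) is not None

print(glob_hit, star_alone_is_valid, regex_hit)  # True False True
```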

Why? Because a dot means “match any single character,” and a star means “repeat the previous character zero or more times.” So the combination means “repeat any character as often as you like,” i.e., get everything.
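The three pieces can be tried one at a time. This is a small sketch using Python's `re` module, which is an assumption made for illustration; the sample strings are invented, and GA's own engine may differ in small details such as how newlines are handled.

```python
import re

# A star on its own repeats whatever comes right before it, here the letter "b".
star_matches = re.findall(r"ab*", "a ab abbb")
print(star_matches)  # ['a', 'ab', 'abbb']

# A dot stands for exactly one character (any character except a newline, by default).
dot_one_char = re.fullmatch(r"a.c", "abc") is not None
dot_no_char = re.fullmatch(r"a.c", "ac") is not None
print(dot_one_char, dot_no_char)  # True False

# Dot followed by star: any character, repeated zero or more times.
samples = ["", "x", "/images/photo.jpg"]  # hypothetical inputs
dot_star_results = [re.fullmatch(r".*", s) is not None for s in samples]
print(dot_star_results)  # [True, True, True]
```

Note that .* even matches the empty string, because “zero times” counts as a valid number of repeats.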

If we wanted to get every occurrence of a jpg file, we would do it with a RegEx that looked like this:
.*\.jpg

For those of you who are scratching your heads instead of nodding them, here is why: .* tells Google Analytics to match everything (as described above). The next part of the expression, \., tells GA to then match a real dot. Dots are usually wildcards in their own right, but putting a backslash in front of one turns it into an ordinary dot. The last three characters, jpg, tell GA to match the letters jpg. So we end up with “everything.jpg,” which was just what we wanted.
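Here is a short sketch of why the backslash matters, again in Python's `re` module as a stand-in for GA. The paths are hypothetical, and the sketch uses `search`, which accepts a match anywhere in the string; that is an assumption about how the expression is applied.

```python
import re

with_backslash = re.compile(r".*\.jpg")    # backslash: the second dot is a real dot
without_backslash = re.compile(r".*.jpg")  # no backslash: the second dot is a wildcard

# Hypothetical request paths, for illustration only.
paths = ["/images/photo.jpg", "/images/photojpg", "/about.html"]

def matching(pattern, candidates):
    """Return the candidates in which the pattern is found."""
    return [c for c in candidates if pattern.search(c)]

print(matching(with_backslash, paths))     # ['/images/photo.jpg']
print(matching(without_backslash, paths))  # ['/images/photo.jpg', '/images/photojpg']
```

Without the backslash, the expression also accepts “photojpg,” because the unescaped dot is happy to match the letter “o.”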

Robbin
LunaMetrics

Many thanks to Justin and his awesome RegEx Tool (which doesn’t require a download.) Postscript: And of course, thanks to Steve, who taught me Regular Expressions from the beginning and found an error in this original post.

Our owner and CEO, Robbin Steif, started LunaMetrics twelve years ago. She is a graduate of Harvard College and the Harvard Business School, and has served on the Board of Directors for the Digital Analytics Association. Robbin is a winner of a BusinessWomen First award, as well as a recent Diamond Award for business leadership. You should read her letter before you decide to work with us.

  • Anonymous

    Ooo. So close.
    * is zero or more, not one or more. 🙂
    One or more is the plus: +

    – Steve

  • http://www.lunametrics.com/blog LunaMetrics Blog

    Thanks. Good catch, I fixed it. (I have never seen your laconic side before.) Robbin

  • Anonymous

    Laconic?

    😉

    – S

  • Anonymous

    Thanks, your simple 15 part blog allowed me to figure out how to write my GA regex! I had the same exact reaction you did to the GA regex explanations….what?????
