Regular Expressions Part X: Stars *

This is Part X of the long, long series I have been doing on Regular Expressions (RegEx) for Google Analytics. It is the last one I will do that explains what Google says vs. what they mean.

When it comes to stars (or call them asterisks if you like), Google Analytics says this:

* Match zero or more of the previous items

Perfectly reasonable, if you know how to create a list of previous items. If you already read Post IX, use of the plus sign in RegEx, this will be easy, and if not, I’ll try to make it easy.

If the only special character you are using is the star *, then the previous item is simply the previous character. For example, let’s say that my company has five-digit part numbers, and I want to know how many people are searching for part number 34. The problem is all those leading zeros: technically, the part number is PN00034. So I could use the little Google Analytics filter box in my search report with a RegEx like this: PN0*34. That will bring back all the searches for PN034 and PN0034 and PN00034 and PN00000034 and, for that matter, PN34, since the star means that the previous item doesn’t need to be in the search at all (zero or more of the previous items, it says).
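To make the part-number example concrete, here is a minimal sketch in Python. It assumes that Python's re.search, which finds a pattern anywhere in a string, behaves like the filter box for this simple expression; the list of search terms is made up for illustration.

```python
import re

# The pattern from the post: "PN", then zero or more 0s, then "34".
pattern = re.compile(r"PN0*34")

# Hypothetical search terms, made up for this illustration.
searches = ["PN34", "PN034", "PN00034", "PN00000034", "PN0035", "PN1034"]

# re.search looks for the pattern anywhere in the string, which is assumed
# here to mirror how the report filter box applies a RegEx.
matched = [term for term in searches if pattern.search(term)]
print(matched)

# Because nothing anchors the end of the pattern, a longer part number that
# merely starts the same way is matched too.
also_matches = pattern.search("PN00340") is not None
print(also_matches)
```

If that last behavior is unwanted, ending the expression with a dollar sign (PN0*34$) would rule it out.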

Alternatively, we could build a list of previous items using square brackets. As with my post on plus signs, I had a hard time finding a reason someone would want to use this, so again I used the example that Steve gave me. His example was square brackets with a space. So, I could do a search for my company name in the same filter box on the keywords report, like so: Luna[ ]*metrics. That will come back with LunaMetrics (no space at all), or Luna Metrics with one space, or with two spaces, and so on.
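Here is a small sketch of the space-in-brackets example, again in Python. The keyword list is invented, and the IGNORECASE flag is an assumption added so that the lowercase "metrics" in the pattern can match "LunaMetrics", since Python is case-sensitive by default.

```python
import re

# "Luna", then zero or more spaces, then "metrics".
# re.IGNORECASE is an assumption made for this sketch (see the note above).
pattern = re.compile(r"Luna[ ]*metrics", re.IGNORECASE)

# Hypothetical keywords, made up for this illustration.
keywords = [
    "LunaMetrics",
    "Luna Metrics",
    "Luna  Metrics",      # two spaces
    "lunametrics blog",
    "Luna-Metrics",       # a hyphen is not a space, so no match
]

matched = [k for k in keywords if pattern.search(k)]
print(matched)
```

Only the hyphenated keyword is left out, because the bracket list contains nothing but a space.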

For the sake of completeness, I should point out that you can put real characters in the square brackets, like this: b[aeiou]*d, and it matches bad and bed and bid and bod and bud. But for that matter, it also matches baaaad and boud and bd, so I don’t think it is particularly useful. If I really just wanted to see those five examples (bad, bed, bid, bod and bud), I would be smarter to use the OR pipe | and do it like this: b(a|e|i|o|u)d.
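The difference between the two expressions is easy to see side by side. This is a minimal Python sketch; it uses fullmatch so that each word is tested as a whole, which is a choice made for the illustration and not something the filter box does on its own.

```python
import re

star = re.compile(r"b[aeiou]*d")     # zero or more vowels between b and d
pipe = re.compile(r"b(a|e|i|o|u)d")  # exactly one vowel between b and d

words = ["bad", "bed", "bid", "bod", "bud", "baaaad", "boud", "bd"]

# fullmatch requires the whole word to match the pattern.
star_hits = [w for w in words if star.fullmatch(w)]
pipe_hits = [w for w in words if pipe.fullmatch(w)]

print(star_hits)  # every word in the list
print(pipe_hits)  # only the five three-letter words
```

Dropping the star and writing b[aeiou]d gives the same five matches as the pipe version.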

Anyone who has a great example of using a star with square brackets is strongly encouraged to comment.

Backslashes \
Dots .
Carats ^
Dollars signs $
Question marks ?
Pipes |
Parentheses ()
Square brackets [] and dashes -
Plus signs +
Stars *
Regular Expressions for Google Analytics: Now let’s Practice
Bad Greed
RegEx and Good Greed
{Braces}
Minimal Matching
Lookahead

Robbin
LunaMetrics

2 Responses to “Regular Expressions Part X: Stars *”

  1. Christopher Says:

    I appreciate your writing about RegEx! However, could you tell me something? Would this:
    100\.100\.100\..*
    be the same as this IP range:
    100.100.100.0-255
    ?

  2. LunaMetrics Blog Says:

    Hi Christopher. It works, but it is overkill and might be slow. Each set of numbers between the dots in an IP address is a number between 0 and 255 (this we know from reading Steve’s blog), so you don’t need such a wild expression at the end. In fact, you don’t need anything. Regular expressions are so greedy that they match everything they can unless you tell them not to. You can do it like this: 100\.100\.100\.

    Anything after the last dot will get matched, but that is not a problem in your example.

    Robbin

    ps thanks for the vote of confidence, I sometimes think that besides Steve and Justin, I am the only person who is fascinated with this topic.
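Christopher's question and the reply above can be checked with a short Python sketch. The sample addresses are invented, and re.search is assumed to stand in for the way an unanchored expression is applied to the IP address field.

```python
import re

with_star = re.compile(r"100\.100\.100\..*")  # Christopher's version
bare = re.compile(r"100\.100\.100\.")         # the shorter version from the reply

# Hypothetical addresses, made up for this illustration.
ips = ["100.100.100.0", "100.100.100.255", "100.100.101.7", "10.100.100.100"]

hits_with_star = [ip for ip in ips if with_star.search(ip)]
hits_bare = [ip for ip in ips if bare.search(ip)]

print(hits_with_star)
print(hits_bare)
print(hits_with_star == hits_bare)
```

Both expressions pick out the same two addresses in this sample, since the trailing .* only consumes whatever follows the last dot.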
