Analytics Regular Expression Characters


Regular expressions can seem like a foreign language, but once you know how they work, you won’t know how you ever used Google Analytics without them. This post explains how each regular expression character functions and includes examples of how you can use them in your account.

Backslash Analytics Regular Expression

What it does: Turns the character following the backslash into plain text.

How it works:

Say you want to create a goal for the url /thankyou?id=123. In Regex, “?” has another meaning, which we’ll get to in a little bit, but we need it to be plain text since it is part of the url string. To do that, we place a backslash before the ? to tell Analytics to treat it as plain text.

/thankyou\?id=123
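As a rough illustration, here is a minimal Python sketch of the difference the backslash makes. It assumes Python's re.search behaves like Analytics' "contains" style matching for this simple case, and the sample URLs are invented for the example.

```python
import re

# Hypothetical page paths, invented for illustration.
urls = ["/thankyou?id=123", "/thankyouid=123", "/thankyo?id=123"]

# \? is a literal question mark.
escaped = re.compile(r"/thankyou\?id=123")
# Without the backslash, ? makes the preceding "u" optional.
unescaped = re.compile(r"/thankyou?id=123")

escaped_hits = [u for u in urls if escaped.search(u)]
unescaped_hits = [u for u in urls if unescaped.search(u)]

print(escaped_hits)    # ['/thankyou?id=123']
print(unescaped_hits)  # ['/thankyouid=123']
```

Note that the unescaped version never matches the real goal URL, because it no longer looks for a question mark at all.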

Pipe Analytics Regular Expression

What it does: Creates an “or” statement.

a|b will match a or b

 

How it works:

Let’s say you want to find all visits from branded terms for PPC Hero. You can create a custom filter setting a regular expression for all brand keywords.

ppc hero|ppchero

All terms containing ppchero or ppc hero will be returned.
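To see the "or" behavior outside of Analytics, here is a small Python sketch. The keyword list is made up, and re.search is used on the assumption that it mirrors how an Analytics filter looks for the expression anywhere in the keyword.

```python
import re

# The branded expression from the example above.
brand = re.compile(r"ppc hero|ppchero")

# Hypothetical search keywords, invented for illustration.
keywords = ["ppc hero blog", "ppchero adwords tips", "ppc management", "hero ppc"]

branded = [k for k in keywords if brand.search(k)]
print(branded)  # ['ppc hero blog', 'ppchero adwords tips']
```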


Question Mark Analytics Regular Expression

What it does: Tells Analytics that the previous item is optional.

ab?c will match ac or abc

How it works:

This expression comes in handy when you are filtering for keywords that are commonly misspelled. I want to find all visits to our site that contain the term “heroes,” which is often misspelled as “heros.”

heroe?s


This will catch keywords that contain either “heros” or “heroes.”
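The same check can be sketched in Python. The keywords are hypothetical, and re.search stands in for Analytics' matching under the assumption that both look for the expression anywhere in the text.

```python
import re

# "e?" makes the second "e" optional, so both spellings match.
misspelling = re.compile(r"heroe?s")

# Hypothetical search keywords, invented for illustration.
keywords = ["ppc heroes", "ppc heros blog", "ppc hero", "heroic ppc"]

matched = [k for k in keywords if misspelling.search(k)]
print(matched)  # ['ppc heroes', 'ppc heros blog']
```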


Parentheses Analytics Regular Expressions

What they do: Group part of an expression so that other regex characters apply to the whole group. They work the same way as in math.

2 + 3 x 5 = 17, but (2 + 3) x 5 = 25

 

How it works:

You’ll most often see parentheses working in conjunction with pipe bars. I want to find all the searches for Google Display Network. I know people also refer to it as the Google Content Network and I want to include both searches in my results. Without the parentheses, the pipe would split the whole expression in two, and Analytics would return anything containing “Google Display” or “Content Network.”

Google (Display|Content) Network

By including parentheses, this regex will return anything containing “google content network” or “google display network.”
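Here is a short Python sketch comparing the grouped expression with an ungrouped one. The keywords are invented, and re.IGNORECASE is an assumption standing in for case-insensitive matching in Analytics.

```python
import re

# re.IGNORECASE is an assumption: it stands in for case-insensitive matching.
grouped = re.compile(r"Google (Display|Content) Network", re.IGNORECASE)
ungrouped = re.compile(r"Google Display|Content Network", re.IGNORECASE)

# Hypothetical search keywords, invented for illustration.
keywords = [
    "google display network tips",
    "Google Content Network",
    "google display ads",
    "content network targeting",
]

grouped_hits = [k for k in keywords if grouped.search(k)]
ungrouped_hits = [k for k in keywords if ungrouped.search(k)]

print(grouped_hits)    # only the first two keywords
print(ungrouped_hits)  # all four keywords
```

Without the group, the pipe separates "Google Display" from "Content Network", so the last two keywords slip in as well.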


Square Brackets Analytics Regular Expressions

What they do: Create a list of items to match to. The regular expression will only match ONE item in this list.

p[aiu]n will match pan, pin, pun but NOT pain

How it’s used:

I’m interested in how many visitors click through to the 2nd, 3rd, and 4th pages when they come to our blog. The URL for each page x on the blog is /page/x. To find pages 2, 3 and 4, I would set my expression as follows:

/page/[234]

Before we see what the results look like in Analytics, I want to introduce you to the next regex character which helps in creating lists.

Dash Analytics Regular Expressions

What it does: Works with brackets to extend lists.

  • [a-z] matches all lower case letters in the alphabet
  • [A-Z] matches all upper case letters in the alphabet
  • [a-zA-Z0-9] matches lower and upper case letters and digits

How it works:

Let’s use the same example as above. Using a dash in my regular expression, I can quickly include more page numbers for it to match without having to type them all out.

/page/[2-9]

This will return any page whose URL contains /page/2 through /page/9.
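The bracket list and the dash range can be compared side by side in a quick Python sketch. The paths are hypothetical, and re.search is assumed to behave like Analytics' partial matching here.

```python
import re

listed = re.compile(r"/page/[234]")  # exactly one of 2, 3 or 4
ranged = re.compile(r"/page/[2-9]")  # exactly one digit from 2 through 9

# Hypothetical page paths, invented for illustration.
paths = ["/page/2", "/page/3", "/page/9", "/page/1",
         "/category/news/page/4", "/page/12"]

listed_hits = [p for p in paths if listed.search(p)]
ranged_hits = [p for p in paths if ranged.search(p)]

print(listed_hits)  # ['/page/2', '/page/3', '/category/news/page/4']
print(ranged_hits)  # ['/page/2', '/page/3', '/page/9', '/category/news/page/4']
```

Neither expression matches /page/12, because the character right after /page/ is a 1, and both pick up the category page because nothing ties the match to the start of the URL.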

Looking at these results, you might be wondering about two things: what happens when you want to view pages higher than 9, and how do you keep regex from including the category pages? Those questions will be answered as we continue to get to know the rest of the regex characters.

Braces Analytics Regular Expressions

What they do: Braces tell Analytics to repeat the last piece of information a certain number of times.

Braces can be used with one or two numbers.

  • {x,y} – repeat the last item at least x times and no more than y times
  • {z} – repeat the last item exactly z times

How it works:

I can use the braces, combined with brackets and dashes, to include page numbers higher than 9 in the example above. I’ll also need to change the starting number from a 2 to a 0, or else the regex will miss page 1 and any page number containing a zero or a one, such as 10 or 21.

/page/[0-9]{1,2}

This will pull all URLs that contain /page/ followed by one or two digits, such as /page/1 through /page/99.
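A small Python sketch shows what the braces actually capture. The paths are made up, and matched_text is a hypothetical helper written only for this example.

```python
import re

# One or two digits after /page/.
pattern = re.compile(r"/page/[0-9]{1,2}")

def matched_text(path):
    """Return the part of the path the expression matched, or None."""
    m = pattern.search(path)
    return m.group(0) if m else None

# Hypothetical page paths, invented for illustration.
paths = ["/page/7", "/page/42", "/page/", "/page/123"]
results = [matched_text(p) for p in paths]

print(results)  # ['/page/7', '/page/42', None, '/page/12']
```

The last result is worth noticing: /page/123 still counts as a match, because the expression is satisfied by the first two digits and nothing says the URL has to stop there.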

Dot Analytics Regular Expressions

What it does: A dot matches ANY one character. Characters include letters, numbers and symbols. A dot even matches a whitespace.

a.c will match “abc”, “adc”, “a$c”, “a c”, etc. It won’t match “ac” because there is no character between a and c.

 

How it works:

Truth be told, I really don’t use just the dot in analytics much. Even so, it’s still important to know how it functions so you set up your regular expressions correctly.

If I want to see all keywords for which someone included “.com” and I don’t use \ to remove the regex function of the dot, Analytics will find anything that has any character before “com.” Look at the difference in results below when .com is used with and without the \.
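Here is how that difference might look in a Python sketch. The keywords are invented, and re.search is assumed to approximate how Analytics looks for the expression inside each keyword.

```python
import re

unescaped = re.compile(r".com")   # any one character followed by "com"
escaped = re.compile(r"\.com")    # a literal dot followed by "com"

# Hypothetical search keywords, invented for illustration.
keywords = ["ppchero.com", "welcome page", "become a hero", "comics", "ppc hero"]

unescaped_hits = [k for k in keywords if unescaped.search(k)]
escaped_hits = [k for k in keywords if escaped.search(k)]

print(unescaped_hits)  # ['ppchero.com', 'welcome page', 'become a hero']
print(escaped_hits)    # ['ppchero.com']
```

"comics" is left out of the unescaped results because the dot needs some character in front of "com" to match.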

Plus Sign Analytics Regular Expressions

What it does: A plus sign matches one or more of the previous items, and only the previous items.

a+bc will match abc, aabc, aaabc but not bc.

You can also use lists with plus signs to match more than just one previous item.

[abc]+ will match a, ab, abc, acb, c, b, bbbbbbb, etc.

How it works:

Going back to the page number example, we can use + instead of { } to match to pages above 9.

/page/[0-9]+
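As a sketch of the plus sign on the page number example, the Python below assumes the expression is /page/[0-9]+ (the exact expression is an assumption based on the earlier examples). The paths and the matched_text helper are hypothetical.

```python
import re

# Assumed expression: one or more digits after /page/.
page = re.compile(r"/page/[0-9]+")

def matched_text(path):
    """Return the part of the path the expression matched, or None."""
    m = page.search(path)
    return m.group(0) if m else None

# Hypothetical page paths, invented for illustration.
paths = ["/page/7", "/page/42", "/page/123", "/page/"]
results = [matched_text(p) for p in paths]

print(results)  # ['/page/7', '/page/42', '/page/123', None]
```

Unlike {1,2}, the plus sign has no upper limit, so every digit of /page/123 is part of the match.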

Star Analytics Regular Expressions

What it does: A star matches zero or more of the previous items. Similar to plus signs except they allow you to match ZERO or more of the previous items (plus signs require at least one match).

a*bc will match abc, aabc, aaabc AND bc.

How it works:

Let’s take the example above. With a plus sign, it only matches page URLs that have some number after /page/. If I use a star instead, it will match all URLs that contain /page/ with or without a number after it.
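The plus and star versions can be compared directly in Python. Both expressions below are assumptions based on the page number example, and the paths are invented.

```python
import re

plus = re.compile(r"/page/[0-9]+")  # at least one digit required
star = re.compile(r"/page/[0-9]*")  # digits are optional

# Hypothetical page paths, invented for illustration.
paths = ["/page/", "/page/12", "/pages"]

plus_hits = [p for p in paths if plus.search(p)]
star_hits = [p for p in paths if star.search(p)]

print(plus_hits)  # ['/page/12']
print(star_hits)  # ['/page/', '/page/12']

# The same idea with the a*bc example: the star lets "bc" through.
star_matches_bc = re.search(r"a*bc", "bc") is not None
plus_matches_bc = re.search(r"a+bc", "bc") is not None
```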

Dot Star Analytics Regular Expressions

What it does: These two regular expressions put together mean “get everything.”

How it works:

If I want to compare visits for the 2nd page of every piece of content on my site I can set up my regular expression as .*/page/2/.* to catch every 2nd page url on my site.
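To make the effect of .* visible, this Python sketch uses re.fullmatch, which forces the expression to cover the whole URL. That is a choice made for the illustration, not a claim about how Analytics evaluates the expression, and the paths are hypothetical.

```python
import re

with_dot_star = re.compile(r".*/page/2/.*")
without_dot_star = re.compile(r"/page/2/")

# Hypothetical page paths, invented for illustration.
paths = ["/blog/page/2/", "/category/ppc/page/2/", "/page/2/",
         "/page/3/", "/page/20/"]

# fullmatch requires the expression to account for the entire path.
whole_with = [p for p in paths if with_dot_star.fullmatch(p)]
whole_without = [p for p in paths if without_dot_star.fullmatch(p)]

# search only requires the expression to appear somewhere in the path.
partial_without = [p for p in paths if without_dot_star.search(p)]

print(whole_with)       # the first three paths
print(whole_without)    # ['/page/2/']
print(partial_without)  # the first three paths again
```

When the expression has to describe the whole URL, .* soaks up everything before and after /page/2/. When a partial match is enough, the shorter expression finds the same pages.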


Caret Analytics Regular Expressions

What it does: When you use a caret in your regex, you force the expression to match only strings that start exactly the way your regex does.

^abc will match abc, abcd, abcdef but not xabc or bc

How it works:

Let’s revisit my earlier task of wanting to see all the pages in my main blog feed past page 1. Remember, I was getting category pages and not just main pages. Placing the caret at the beginning of my string can solve this problem.

^/page/[1-9]*
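Here is a Python sketch of what the caret changes, comparing the expression above with the same expression minus the caret. The paths are made up for the example.

```python
import re

anchored = re.compile(r"^/page/[1-9]*")   # must start with /page/
unanchored = re.compile(r"/page/[1-9]*")  # /page/ may appear anywhere

# Hypothetical page paths, invented for illustration.
paths = ["/page/2", "/page/15", "/category/ppc/page/2", "/page/"]

anchored_hits = [p for p in paths if anchored.search(p)]
unanchored_hits = [p for p in paths if unanchored.search(p)]

print(anchored_hits)    # ['/page/2', '/page/15', '/page/']
print(unanchored_hits)  # all four paths
```

The caret drops the category page. The bare /page/ path still matches because the star allows zero digits.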


Dollar Sign Analytics Regular Expressions

What it does: Indicates the end of the string. It tells Analytics not to match any target string that has any characters beyond where I have placed the dollar sign in my regular expression.

abc$ will match abc, xabc but not abcd

 

How it works:

Finally we have all of the characters necessary to create an expression that only shows the different pages of the main blog.
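One way such an expression might look is ^/page/[0-9]+$, sketched in Python below. The exact expression is an assumption built from the characters covered in this post, and the paths are hypothetical.

```python
import re

# Assumed combined expression: starts with /page/, then only digits, then ends.
main_pages = re.compile(r"^/page/[0-9]+$")

# Hypothetical page paths, invented for illustration.
paths = ["/page/2", "/page/15", "/category/ppc/page/2",
         "/page/2/comments", "/page/"]

hits = [p for p in paths if main_pages.search(p)]
print(hits)  # ['/page/2', '/page/15']
```

The caret rules out the category page, the dollar sign rules out anything trailing after the number, and the plus sign rules out /page/ with no number at all. If your URLs end in a trailing slash, the expression would need one before the dollar sign.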

Understanding how to use these regular expressions will help you quickly find the information you are looking for in your Analytics account. If you have any questions or comments, please post below!

 


  • http://twitter.com/john4math John Barth

    Wow, what a great guide! I’m definitely bookmarking this. A couple of things:

    - In your braces example, you use /page/[1-9]{1,2} to match all pages from 1 to 99. That expression will exclude all pages with 0s in them, such as 10, 20, etc. You probably want to use /page/[0-9]{1,2}.

    - It looks like you used a “+” instead of a “*” in your * example.  Copy/pasta :)

    • http://www.hanapinmarketing.com Bethany Bey

      Thanks for pointing these out John! I’ve updated the post so it should be correct now.


  • http://twitter.com/torka Diane Aull

    Thanks for one of the clearest explanations of regular expressions I’ve come across in awhile. I’m now all fired up to try ‘em out!

    • http://www.hanapinmarketing.com Bethany Bey

      Thanks Diane! That’s exactly what I was trying to do. It took me awhile to start using regular expressions because I thought they were complicated to understand.

  • Thomas Grübel

    Very understandable explanation. Thanks.


  • Anonymous

    Nice post on regex, which can be very confusing. I understand that this is in reference to analytics, but it may be helpful for other folks who find their way to this post looking for a general regular expressions tutorial to get into some of the find/replace functionality that makes regex so powerful.

  • Brian Jensen

    Great resource here. Regular expressions can get confusing and you did a great job clearly explaining the functionality of each metacharacter alone and in conjunction with each other.