
Archive for the ‘A/B and MVT’ Category

Free beginner seminar: GWO/GA/Webmaster tools

Thursday, June 26th, 2008

Google is doing a triple event. I don’t really think that they need me to blog about it to our 1900 readers (which is why I never blog about new features: everyone has already heard about them by the time I get to my WordPress dashboard). But you do favors for your friends, and this is one of them.

In the PR, which you can also read on the GWO blog, they say that they will:

  • Briefly introduce the products
  • Highlight recent product releases and developments
  • Discuss the benefits of using the products together
  • Answer selected questions that attendees have submitted

So I believe it is a beginner seminar. Just ideal for the person who is beginning to work with one of those tools, or who works with one or two but doesn’t know the value of the others.

Here’s the What/Where/When:

TITLE: The Google Trifecta: Webmaster Tools, Analytics, Website Optimizer
DATE: Tuesday, July 8, 2008
TIME: 9:00 - 10:00 am PT (Pacific Time)
JOIN US: Register to attend

Last chance for NYC Google Analytics Training: Wednesday, June 4

Friday, May 30th, 2008

[Image: Getting Ahead w GA]

I worked and worked and worked with Bernadette at the New York City Harvard Club (where we are having our Google Analytics training on Wednesday, 6/4/08) so that we could accommodate extra people. She is such a gem. I drive her crazy, I know…

If you are coming, now is the time to register. We’ll have two concurrent tracks: implementation and analysis.

Implementation Track: If you are a techie (or even if you aren’t a techie, but need to do the technicals to make your Google Analytics work well), you’ll want to hear John Henson talk about filters and profiles, goals and cross-domain tracking.

Analysis Track: If you’re an analyst (or even if you aren’t, but your company expects you to do that kind of work), you’ll want to hear us talk about how to make sense of all that Google Analytics data. (That’s me — I am speaking for the first three sessions of the day.) Traci Scharf will talk about creating a leaner, meaner AdWords program. Megan Kiel is going to join us from the Google Website Optimizer team to talk about testing your website.

At the end of the day, Jonathan Weber will talk about getting the whole organization on board. And we’re going to work and work and work to be sure your questions are answered.

Don’t worry, you’ll be able to pick and choose among the sessions, and if you have to miss a critical session, we’ll see if we can’t sit with you at lunch and go through the key points. Everyone will get copies of all the handouts, too.

So here’s the link to read all about the training, and here’s the link to look at the agenda.

Robbin

Testing: How does the Website Optimizer calculator work?

Sunday, September 16th, 2007

Don’t you ever wonder about the computations of that little calculator that Google gives you to figure out the length of a multivariate test?

I don’t have any insider knowledge. But I have studied it enough to understand certain issues (and many thanks to Dylan Lewis of the Web Analytics Wiki for confirming my suspicions wrt how it should work.) Specifically, you should need more data to “prove the same thing” if your control has a higher conversion rate, up to a conversion rate of 50%.

So let’s start: why does the GWO calculator ask you to input the conversion rate of your current page? Well, here’s why they care. If you hold everything else the same and tell the calculator that your current conversion rate is 4% instead of 3%, it will want a larger sample (translation: more pageviews, or more time to get those more pageviews) in order to get the statistical significance it needs.

So look at these two examples. All the variables are the same (sort of; I promise I will explain). However, in the examples below, one conversion rate is 3% and the other is 4%. Notice, also (here is the explanation just promised), that I changed the expected increase in conversion rate. With the 4% test, I have it expected to increase by 25% (so that I will get a one point lift in my conversion rate; after all, .25*4=1). And with the 3% test, it’s expected to increase by 33.33% (because 3% times .33333 is also a one point lift):

[Images: gwo-calculator-3.jpg, gwo-calculator-4.jpg]

So when the current conversion rate is higher, and you are looking for the same absolute expected improvement, the test takes longer, so that you can get more pageviews - i.e. get a greater sample size.
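To make that concrete, here is a minimal sketch using the textbook sample-size formula for comparing two proportions. It is an assumption that the GWO calculator does anything like this (I have no access to its innards), and the function name, the 80% power setting and the two-sided test are choices made only for illustration.

```python
from math import ceil
from statistics import NormalDist

def visitors_per_variation(base_rate, new_rate, confidence=0.95, power=0.80):
    """Textbook sample size for comparing two proportions (two-sided test).

    This is NOT the GWO calculator; it is a standard approximation used
    here only to show the direction of the effect.
    """
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Each rate contributes rate * (1 - rate) to the noise in the comparison.
    variance = base_rate * (1 - base_rate) + new_rate * (1 - new_rate)
    return ceil((z_alpha + z_beta) ** 2 * variance / (new_rate - base_rate) ** 2)

# Same one point absolute lift, two different starting conversion rates.
n_at_3 = visitors_per_variation(0.03, 0.04)
n_at_4 = visitors_per_variation(0.04, 0.05)
print(n_at_3, n_at_4)
```

Under these assumptions the 4% control needs more visitors per variation than the 3% control for the same one point lift, which matches the direction the calculator shows.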

Why?

Why do we need more data to prove improvement with a highly-converting page than with a poorly converting page? Here, I will use a more extreme example: an absolute increase of one point is pretty low when you are looking at a page that converts at 25%. So we need lots of data to prove that a test will do better than the 25%. But a one point increase? That’s a whopping increase if your control page converts at 4% right now. So we can prove that our new test is better than our old control with just a little bit of data in that situation.

Here’s the really interesting part: when your control has a conversion rate of 50%, you need the most pageviews, i.e. time to get those pageviews. As you keep going beyond 50%, the time to run your tests starts to decrease. When you get to a conversion rate of 75% for your control, the time it takes for the test should mirror the time it takes at a control conversion rate of 25%. (It’s not perfectly exact for mathematical reasons that are too boring to go into here.) But check it out:

[Images: gwo-calculator-25.jpg, gwo-calculator-75.jpg]

(Notice that 25% * 10% is a 2.5 point lift, and 75% * 3.33333% is also a 2.5 point lift in conversion rate.)
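One plausible reason the 25% and 75% cases are close but not identical can be sketched with the same textbook variance term, conversion times non-conversion, added up for the control and the test. This is an assumption about the kind of math involved, not a description of the calculator itself, and `variance_term` is a name made up for the example.

```python
def variance_term(base_rate, new_rate):
    """Sum of rate * (1 - rate) for the control and the test page."""
    return base_rate * (1 - base_rate) + new_rate * (1 - new_rate)

up_from_25 = variance_term(0.25, 0.275)    # 25% control, 2.5 points better
up_from_75 = variance_term(0.75, 0.775)    # 75% control, 2.5 points better
down_from_75 = variance_term(0.75, 0.725)  # 75% control, 2.5 points worse

print(up_from_25, up_from_75, down_from_75)
```

In this sketch the exact mirror image of "25% going up to 27.5%" is "75% going down to 72.5%". Going up from 75% moves the test page further away from 50%, so its term comes out a little smaller, and the two cases land near each other rather than on top of each other.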

Why?

Why does it all turn around at 50%? And I want to try to explain this without using ps and qs and little hats, since I’m not a statistician. So I won’t use fancy equations. Just simple ones.

All these equations that are behind all these kinds of calculators, they include two events: heads or tails. Conversions or non-conversions. They never say (to the extent that they talk), “Conversion is good.” Only people think that conversion is good and non-conversion is bad. (Those equations also include other stuff, but we don’t have to go there.) In fact, you have to have five conversions and five non-conversions for a combination to show up in the graphical area of the website optimizer (the area where the bars are green and grey and red.)

So when you start playing with conversion times non-conversion, you find out that they multiply out to the largest amount when they are both 50%. Right? .5*.5= .25 but if you now use a little 2% conversion rate instead, you have .02*.98 = .0196. That’s way lower than .25 (and remember — this is not sample size, but is one of the important parts of the sample size equation.)

My fourth grade teacher, Mrs. Petrowski, insisted that I learn all those math laws, and one of them was about “commutativity” — it doesn’t matter what the order is in multiplication, you still get the same answer, she lectured. So we can swap those numbers and say that the conversion rate is 98%, leaving the non-conversion rate to be 2%, and the product is still .0196.
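The arithmetic in the last two paragraphs is easy to check for yourself. This little sketch just multiplies conversion by non-conversion for a handful of rates; the function name `spread` is made up for the example.

```python
def spread(conversion_rate):
    """Conversion rate times non-conversion rate."""
    return conversion_rate * (1 - conversion_rate)

for rate in (0.02, 0.25, 0.50, 0.75, 0.98):
    print(f"{rate:.0%} converts -> product {spread(rate):.4f}")
```

The product peaks at 50%, and each rate gives the same product as its mirror image on the other side of 50% (2% and 98%, 25% and 75%).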

So whether you have a 98% conversion rate or a 2% conversion rate — your sample size is going to be the same. (Remember that there is a lot of other junk that goes into the equations, but this basic principle should hold, even though I don’t have access to the innards of the calculator.) And from all this gobbledygook we learn:

  • To prove that a test is one percentage point better than the control, you need more pageviews if the control has a high conversion rate than you would if the control had a low conversion rate.
  • However, once the control has a conversion rate over 50%, you start needing fewer pageviews.
  • This is a hard topic. If you didn’t understand, please comment and I will do my best.

Whew. This post took me at least two months to write. Many thanks to Dylan, again; to Wendi Malley; to Tom Leung (whom I have driven crazy on this topic); and to EV, the GWO engineer who must be sorry he ever gave me his email address.

Everyone who thinks that change in conversion rate should be viewed as a PERCENT and not as an absolute lift in conversion is welcome to flame in the comments.

Robbin

Wendi Malley vs GWO: who is correct?

Tuesday, July 10th, 2007

How long do you have to run a test to consider it a tie?

You could consider this to be part II of a series, where part I is a post by statistician Wendi Malley. She writes about how many pageviews I need in my sample size before I can call my Google Website Optimizer (GWO) test a tie. You should bear with me even if statistics aren’t your thing, because by the end of this post, I put it in plain English. (OK, en-us.)

If you didn’t read her post, she looked at my GWO tests, which were all running neck and neck for two+ weeks, with a conversion rate of greater than 4% for the control (and the other ones, too). From there, she figured out that I can call it a day (i.e. they are a tie), when I have 1728 pageviews. I only had 783 views of the test page when I sent her the data.

Her answer assumes that I am looking for 95% confidence in my answer and a margin of error of plus or minus 1%. Since it only took 2+ weeks to get 783 views, I figured I only needed another 2+ weeks to go.

But at the same time that I wrote Wendi, I also wrote GWO. On the surface, their answer seemed to be very different from hers:

Given enough time, every test (assuming there are perceptible differences in the variations) will generate a winner in the report. This is because with enough data, even the smallest differences will be discernible. The question is, are those differences worth waiting for? At this point, there aren’t many conversions in your experiment. Because of the low traffic and low conversion rate, you may have to wait for months to get something more definitive.

Hmm, those two things didn’t seem to go together. So I pushed a little harder, and as usual, the GWO people were very responsive, and they came back with this answer:

What Wendi is describing in her blog is a power calculation. This says: if I want to be able to measure a difference of a given size (delta), if I wait so long (n), I will be 95% (alpha) certain that I can see the difference…

My original statement is also correct: If you wait long enough, a difference of any magnitude will be measurable. What Wendi shows is that you qualify that statement with an amount of difference one is interested in, you can calculate the number of impressions required to detect that difference with a given degree of certainty.

So I pushed through the Greek letters (and wrote Wendi) in an effort to really understand her equation. Here is what it means in English - no Greek letters or subscripts (and Wendi, you correct me if I am wrong):

Given a conversion rate we already know (the control) and a confidence that we want (95%), how many views of the test page do we need to have in order to feel that the conversion rate of the other tests will be no more than plus or minus 1%? In the case of my test, how many pageviews do we need to see to feel 95% confident that the conversion rates of the other tests will be between 3.72 and 5.72%? (after all the control has 4.72%, so that’s plus or minus one percentage point, right?)

And in fact, GWO is right also - they *are* both right. We can decrease that “margin of error” (I wish we could call it “conversion rate difference”) to be .01% (that is, .0001 as a proportion) and we will need over 17 million page views to have 95% confidence that there is a tie. Of course, I owe that calculation to Wendi’s spreadsheet.
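For anyone who wants to play with the numbers, here is a minimal sketch of the standard margin-of-error sample size for a single proportion. It is an assumption that Wendi’s spreadsheet uses this exact formula, but with a 4.72% conversion rate, 95% confidence and a margin of plus or minus one percentage point it comes out to 1728 pageviews, the figure quoted above. The function name is made up for the example.

```python
from math import ceil
from statistics import NormalDist

def pageviews_needed(conversion_rate, margin, confidence=0.95):
    """Pageviews needed to pin a conversion rate down to +/- margin.

    Both conversion_rate and margin are proportions (0.01 = one point).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * conversion_rate * (1 - conversion_rate) / margin ** 2)

one_point = pageviews_needed(0.0472, 0.01)       # plus or minus one point
hundredth_point = pageviews_needed(0.0472, 0.0001)  # a far tighter margin
print(one_point, hundredth_point)
```

Shrinking the margin by a factor of 100 multiplies the pageviews by 10,000, which is how a tie that needs under two thousand views at one point needs more than 17 million at a margin of .0001.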

And finally — look! I am starting to see a little spread in the data:

[Image: gwo-blog-shot.JPG]

Greg Niland on testing: Not so excellent

Saturday, June 2nd, 2007

Since it’s the weekend, I thought I could rant a little about Greg Niland’s recent podcast on testing your website. But first, a few nice things about Greg:

  • He has the world’s most adorable laugh. And it is even more fun to hear it over the phone after you have heard him do it on fifty podcasts.
  • Unlike his SEO peers, he does a podcast for newbies. Most advanced web marketing professionals, be they in SEO or web analytics, want to show off their advanced techniques.
  • He devotes a lot of his time to charitable causes.

But on to the main attraction. I only this week had a chance to hear his podcast with Shimon Sandler. The exact topic was “Making money the Unfun way.” Hmm, I thought as I started to listen. What will be so unfun here? Rewriting URIs? Playing with your robots file? Canonical issues?

Oh, I was wrong. The major unfun issue they wanted to talk about was testing.

Testing? Unfun? Well, I guess what is fun to one person might be boring to another.

But then guest Shimon Sandler, who is an SEO, put testing into two categories, A/B and Multivariate, “also known as Taguchi.” Really? Greg asked, I’ve never heard of multivariate testing called Taguchi. Maybe multivariate testing was designed by someone named Taguchi, Shimon answered.

(For the record, very simplified: the Taguchi method is one way of decreasing the number of page views and conversions your test needs, as opposed to a full-factorial analysis, which uses every variation. Offermatica uses Taguchi; Google Website Optimizer uses a full-factorial analysis. The Taguchi Method was designed by someone named Taguchi, but as a way of applying statistics to manufacturing.)

Sandler rattled off a few names in the MVT field, but forgot (or didn’t know) most of the important ones. He knew Offermatica, Vertster and at least one other, but missed Optimost, Sitespect and Website Optimizer. And then, the two of them talked about what you could test. But being SEOs instead of conversion “scientists” or MVT jocks, they could only come up with SEO-type things to test. Your PPC ads. Your urls.

Well, everyone who works in testing knows that to make a difference, you should start by testing BIG things. Your headlines. Your buttons. Your shopping cart. Your call to action.

But what was I expecting? Greg needed to have someone from the testing world on that show. How about Shawn Purtell from ROI Revolution? How about Ophir Prusack? How about Bryan Eisenberg?

Well, in closing my rant, I do want to give credit to Shimon Sandler for addressing the web analytics issue. When I called Greg and pitched him on doing a show on web analytics (back when our company wasn’t a Google Analytics Authorized Consultant, life was sane and I didn’t spend every weekend working), he told me that his listeners “just aren’t interested in logfiles.” (I tried to explain that not all WA is server side, but I don’t think he cared.)

Robbin

101 Things to do with Website Optimizer (and a new blog)

Tuesday, May 22nd, 2007

I stumbled upon (without using Stumble Upon) this really great conversion resource in the UK, Conversion Rate Squirrel. When they have time to write, they do some really cool things. They told me that one of their most popular and appreciated pieces is 101 Things to do with Website Optimizer. (I was particularly interested in points #37 and #59 - those were new for me.) So read and enjoy.

Also, welcome Coremark Analytics to the blogosphere. Judah Phillips convinced me that the author is Wendi Malley, who works with the WAA Research Committee. But, who really believes Judah anyway? ;) Well, Mystery Blogger, I hope you do more work on statistics; we really need a great statistics blog.

Endnotes: Many thanks to Neil Mason, who taught me to stop writing “England” and start writing, “the UK.” Very special thanks to David Meerman Scott, who is publishing a new book, The New Rules of Marketing and PR, and who changed our blog address in his post. (He pointed out to me that the hard copy has already been printed. Ah, paper.)

Website Optimizer: 5 non-conversions required

Wednesday, April 4th, 2007

Did you know that you needed five non-conversions for each of your combinations in Google’s Website Optimizer? Yeah, me neither.

When I was at Google last week, I was showing the results of a 72 way test to Eric, one of the engineers who is behind Google’s Website Optimizer product. (In case you are not familiar with WO, it is Google’s free multivariate software. It enables website owners to test multiple pieces of a page and even a funnel or series of pages to see which combinations of the variables convert the best. Read why multivariate testing is like a game of Clue.)

Well, anyway, I showed it to Eric, and he pointed out that even though we had a few combinations with greater than a 97% chance to beat the original, one variation wasn’t even showing up in the Combinations tab. “And that’s your best one so far,” he said. It converted 14 out of 17 times, the dashboard proclaimed. The reason that the combination data didn’t show, he explained, was that there needed to be enough data in general for that test, and there needed to be at least five conversions and (here is the part I didn’t expect) five non-conversions. In the case of this particular combination, there were only three non-conversions. (17-14, right?)
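The rule as Eric described it can be written down in a few lines. This is only a sketch of the rule as I understood it, not Website Optimizer’s actual code; the function name is made up, and the separate "enough data in general" requirement is left out because I don’t know how it is defined.

```python
def meets_five_and_five(conversions, visitors, minimum=5):
    """True if a combination has at least `minimum` conversions
    and at least `minimum` non-conversions."""
    non_conversions = visitors - conversions
    return conversions >= minimum and non_conversions >= minimum

# The combination from this post: 14 conversions out of 17 visitors.
print(meets_five_and_five(14, 17))
# Two more visitors who don't convert would bring it to five non-conversions.
print(meets_five_and_five(14, 19))
```

With 14 out of 17 there are only three non-conversions, so the check fails even though the conversion rate looks terrific.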

So why the need for non-conversions? After all, the more conversions, the merrier, right? Well in fact, no. “It does best when the conversion rate is 50%” Eric emailed to me.

In order to begin to wrap your head around this (assuming you aren’t a statistician), you have to stop thinking that conversions are good. Instead, there are two states here, a or b. Conversion or non-conversion. Heads or Tails.

So let’s say that we take a finite number of visitors, maybe only 20, and estimate the conversion rate based on the fraction of the 20 that converted. Is that a good estimate of the true conversion rate? The holy “Law of large numbers” in statistics says that the average conversion of a finite set of visitors becomes a good estimate of the true value as the number of trials becomes large. But, the fine print in this law states that the number must be really large when the true mean (the true conversion rate) is very small (very few convert) or large (nearly all convert). In fact, for the estimate based on finite visitors to be good you need to have enough counter examples. “Counter examples” are non-conversions when the conversion rate is high, or conversions when the conversion rate is low.

I know, you want to know where they got the number five from, and why it’s an absolute number and not a percent. Me too. I’m thinking that the issue is, it can’t be a percent, because if you have only 20 visits, you need a high percent, and if you have 100 visits to that combination, you need a low percent. By fixing a specific amount, you make sure you get something, for both conversions and non-conversions. For both heads and tails. But that last part is speculation. Now if you want to learn something really cool about WO, go over to ROI Revolution’s blog and read Shawn Purtell’s magnificent piece on the marriage of GA and Website Optimizer.

Robbin Steif

Analytics: How I cheated on Google AdWords

Saturday, August 19th, 2006

It’s true. I cheated on Google AdWords so that I could measure A/B Google tests in my Google Analytics.

First, the background. Lots of small websites and advertisers aren’t ready to take the multivariate testing plunge, so start by using Google AdWords to A/B test. This is pretty tried and true: You create two ads for the same AdGroup which are absolutely identical, down to the URL that shows on the Google page (SERP). However, when the customer clicks, each ad has a different landing page. Then the advertiser compares conversion rate (or revenue, or average order size) for all the customers who start with Ad1 vs Ad2.
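Once you have clicks and conversions for each ad, the comparison itself can be sketched with a standard two-proportion z-test. The numbers below are hypothetical, made up only to show the mechanics, and the function name is invented for the example; neither AdWords nor Google Analytics is being quoted here.

```python
from math import sqrt
from statistics import NormalDist

def compare_landing_pages(conv_1, clicks_1, conv_2, clicks_2):
    """Two-proportion z-test; returns both rates and a two-sided p-value."""
    rate_1 = conv_1 / clicks_1
    rate_2 = conv_2 / clicks_2
    # Pool the two ads to estimate the noise if there were no real difference.
    pooled = (conv_1 + conv_2) / (clicks_1 + clicks_2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / clicks_1 + 1 / clicks_2))
    z = (rate_2 - rate_1) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_1, rate_2, p_value

# Hypothetical data: Ad1 converts 30 of 1000 clicks, Ad2 converts 50 of 1000.
rate_1, rate_2, p_value = compare_landing_pages(30, 1000, 50, 1000)
print(f"Ad1 {rate_1:.1%}, Ad2 {rate_2:.1%}, p-value {p_value:.3f}")
```

A small p-value suggests the gap between the two landing pages is unlikely to be luck alone; with only a handful of conversions per ad, expect it to stay large for a while.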

Admittedly, it has its limitations. Search engines are demographically skewed, and just because it works for Google customers doesn’t mean that it will work on Yahoo. But it’s way better than saying, “I know what will work. I just know.”

Google has recently made measuring different ad versions easier in the AdWords interface, but with Google Analytics, you still have to know how to pull down the right menus and segment to see what you need. And even then, if the ads have the same name, you can’t tell them apart.

Step 1: Pulling down the right menus. (I can’t remember whether I learned how to do this from Justin’s GA blog or from ROI’s GA blog, and I can’t find the reference.)


Choose Marketing Optimization > Marketing Campaign Results > Campaign Conversion. Left click on the Analysis Options next to one of your Google AdWords campaigns (which you get with the little red circle to the left of the campaigns - follow the top red arrow in my picture); choose Cross Segment performance (that’s the middle red arrow I’ve drawn); finally, choose Content. When you choose Content, you’ll get a list of the different ads that are running for that campaign, by goal.

Step 2: This is where you cheat: Retitle your ads, ever so slightly. When you are using your Google AdWords to do A/B testing, as described above, the ads are identically worded. Google Analytics lists them out by title, which means it can’t tell you that Ad1, titled “Increase your Conversion Rate,” which lands on www.lunametrics.com , is doing terribly, and that Ad2, titled “Increase your Conversion Rate,” which lands on www.lunametrics.com/conversionrate, is doing great. It only sees one ad, called “Increase your Conversion Rate.” You can cheat on Google AdWords by changing the titles very very slightly. In this case, I would change one of the ads to have a capital Y in Your, so that it reads, “Increase Your Conversion Rate.” The difference is slight enough that it shouldn’t matter, and will enable you to read the results in GA.

Robbin Steif
LunaMetrics

ABA testing

Wednesday, April 19th, 2006

I’m at the eMetrics Summit in Santa Barbara, and there are lots of cool lessons to be heard, some of which I’ll write about in the following days. But today, I’m writing about lunch.

I didn’t really notice what I *ate*, but I sat with Matt Roche from Offermatica (whom I referenced a few days ago but only met today) and Bill Bruno from Stratigent. During the course of lunch, I asked them both what they thought about ABA testing, which companies sometimes use when they have too few visitors (too little data) to do standard split-path testing.

With ABA, the company does an A/B test with three groups instead of two. They randomly split visitors into three groups. One of them sees the test page (the B group.) The other two groups both see the same control page (Those are the two A groups.) When the first A group has the same conversion rate as the second A group, the company who is running the test decides that it has collected enough data, and then feels that it is in a position to declare either the control or the new test page a winner.

Matt and Bill both dismissed ABA testing as a not-too-great idea. Matt drew a picture on the back of Bill’s business card, showing how data usually presents itself:

It is never very clean, he said, and it doesn’t usually go in a beautiful curve, but it bounces all over the place. So in the above picture, the two “A” tests are in black and red, and the B test is in blue. At which arrow should we say that the two A tests are the same — the first one, where the blue is a winner, the second one, where they are all tied, or the third one, where the blue line is the loser?
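Matt’s picture is easy to recreate with a small simulation. This is only a sketch with made-up numbers: every group, including B, is given the same true 4% conversion rate, so any "winner" it shows is pure noise. The seed, the visitor count, the warm-up period and the tie threshold are all arbitrary choices for the illustration.

```python
import random

def running_rates(true_rate, visitors, rng):
    """Cumulative conversion rate after each visitor, for one group."""
    conversions = 0
    rates = []
    for seen in range(1, visitors + 1):
        if rng.random() < true_rate:
            conversions += 1
        rates.append(conversions / seen)
    return rates

rng = random.Random(2006)  # fixed seed so the run is repeatable
VISITORS = 3000            # hypothetical visitors per group
a1 = running_rates(0.04, VISITORS, rng)  # first control group
a2 = running_rates(0.04, VISITORS, rng)  # second control group, same page
b = running_rates(0.04, VISITORS, rng)   # test page, truly no better here

# Moments (after a warm-up) when the two A groups look "the same".
tie_moments = [i for i in range(500, VISITORS) if abs(a1[i] - a2[i]) < 0.0005]
b_ahead = sum(1 for i in tie_moments if b[i] > a1[i])
b_behind = sum(1 for i in tie_moments if b[i] < a1[i])
print(len(tie_moments), b_ahead, b_behind)
```

The point of the sketch is the same as the point of the business card: the moment the two A lines happen to touch tells you nothing dependable about where the B line is at that moment.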

Robbin Steif
LunaMetrics

Amazon isn’t always worth copying

Saturday, March 18th, 2006

Today, a colleague paid me (and my designer) the ultimate website compliment. He asked if he could copy my website structure. “I figured you had it all worked out,” he wrote.

There was a lot of truth in this — enough that I could feel good about telling him to go ahead. I really did try to work in best practices, because I had to. How can I tell customers with lead generation sites to put a contact form on every page if I don’t do it? How can I tell them to link to their privacy policies right by the email field if I don’t do the same?

On the other hand, I didn’t win every battle with my designers and still have a lot of work to do on my own. For example, I don’t have a great 404 error page, so that’s still on my list. I don’t have on-site search that can handle spelling errors and stemming — if you type in “emarketing” instead of “e-marketing,” you get a No Results page. (It’s a nice No Results page, but plain old Results would be better.) I didn’t have enough negotiating capital to get my navigation below the banner, where people would actually see it.

Which brings me to my point. Big companies like Amazon or a small web conversion company like mine aren’t always worth emulating. You assume they’ve tested everything — but maybe they haven’t. Sometimes the politics of an organization force upon companies sub-optimal solutions that everyone can live with.

Because my colleague wrote to me first, I was able to point out one change that would improve his site based on the experience I had with my own. But if you’re copying Amazon — well, the employees at Amazon are better about keeping secrets than the CIA is. Even if you do have a friend at Amazon, she won’t tell you anything. In that case, you have to measure and test, test and measure.

Robbin
LunaMetrics