This is post N° 5 in my series about Google Ads step-by-step daily optimization routines (see the introduction post here, the general overview here, the Monday routine here, and the Tuesday routine here).
Well, if it’s Wednesday and your Google Ads account is underperforming, this is what you should do.
This is the day to check any split tests (also called A/B tests) you are running, and see if it’s the right time to make any decisions.
This day is when the “art and science” aspects of Marketing come together, as you will be relying both on math & statistics and your copywriting abilities to optimize your account. It can be time-intensive and will require a good dose of creativity, so be ready to do the work.
As always, we will follow the 80/20 principle, so you shouldn’t go crazy and try to split test ads in all your ad groups one by one. (You can create ad experiments at the account level, which can be effective, and it’s something I’m currently testing.)
I have two ways to pick which ad groups to optimize.
The “quick & dirty” way, when you are short of time, is to just pick the top 20% of campaigns or ad groups that have spent the most over the last 30 days, and optimize those.
The better but more time-consuming process is as follows:
- Find any campaigns that over the last 30 days:
- Are underperforming (remember the twofold criteria I explained in my overview post)
- Have seen negative changes in conversion rate (use the “compare” function in the date selector and then sort by % change in conversion rate)
- Have received a decent amount of traffic (for example, more than 100 clicks, or 1,000 clicks, depending on the scale of your account)
By doing this, you raise the odds that the ad groups you optimize will have enough data for the split test to be valid. Maybe not completely, scientifically valid (because most accounts I have seen are not set up correctly for split testing), but at least somewhat valid.
These selection criteria also raise the odds that any changes you make to the ads will actually help improve your account’s performance.
You can also do this analysis at the ad group level. It depends on how big your account is and how much time you have available.
Regardless of whether you analyze at the campaign or ad group level, just focus on the worst 20% of campaigns or ad groups.
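If you prefer to do this filtering outside the interface, here is a minimal sketch in Python with pandas, assuming you have downloaded a 30-day campaign report as a CSV with a “compare to previous period” column. The file name and column names are placeholders I made up; rename them to match whatever your actual export uses.

```python
import pandas as pd

# Assumed: a Google Ads campaign report exported as CSV, covering the last 30
# days, with conversion rates as numeric values (e.g. 0.035) and a column
# holding the previous period's conversion rate. Adjust names to your export.
df = pd.read_csv("campaign_report_last_30_days.csv")

MIN_CLICKS = 100  # raise to 1,000 or more on larger accounts

candidates = df[
    (df["Clicks"] >= MIN_CLICKS)
    & (df["Conv. rate"] < df["Conv. rate (previous)"])  # conversion rate dropped
]

# Sort by how badly conversion rate changed, worst first, and keep the worst 20%
candidates = candidates.assign(
    conv_rate_change=candidates["Conv. rate"] - candidates["Conv. rate (previous)"]
).sort_values("conv_rate_change")

worst_20_pct = candidates.head(max(1, len(candidates) // 5))
print(worst_20_pct[["Campaign", "Clicks", "Conv. rate", "conv_rate_change"]])
```

You would still apply the underperformance check against your own targets by hand; the script just narrows down the list of candidates worth looking at.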
Once you have identified your targets, open the ad groups and select a relevant time period. (I prefer to use “All Time”, unless a significant website or landing page change was made at some point, which can skew the results. In that case, I would select from that date forward.)
Now you might think that at this point you should just go ahead and pause the worst performers for each ad group you selected.
Wrong.
There’s an additional step you need to take: check for what’s called “statistical significance”.
This concept basically tells you how confident you can be that your split testing results reflect a real difference, and are not just due to blind luck or chance.
In general, the more data you have, and the bigger the difference you see between 2 ads, the more likely it is that they truly perform differently in the long run. That means you can confidently pause the losing ad and keep the winning ad.
If you roll a die 2 times, and it comes up on the number 3 both times, you wouldn’t conclude there is something wrong with the die, right? It was probably just random chance, and you wouldn’t bet money on that happening again if you were to roll the die another 2 times.
This is why it’s dangerous to make decisions when you have too little data.
But if you roll the die 100 times, and it ALWAYS or almost always comes up on the number 3, you would suspect the die is loaded, and it’s very likely it will come up on 3 most of the time again if you were to roll it another 100 times. And you would probably be willing to make a bet on that, and you would probably win.
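If you like seeing that intuition as actual numbers, here is the back-of-the-envelope math for a fair six-sided die (just Python doing the arithmetic):

```python
# Chance that a fair six-sided die shows a 3 on every single roll
p_two_rolls = (1 / 6) ** 2        # ~0.028, roughly 1 in 36: easily just luck
p_hundred_rolls = (1 / 6) ** 100  # ~1.5e-78: effectively impossible unless the die is loaded

print(f"2 rolls, all threes:   {p_two_rolls:.4f}")
print(f"100 rolls, all threes: {p_hundred_rolls:.2e}")
```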
It’s the same with ads.
There is a tool you can use to check the statistical significance of a split test.
After you have identified your ad groups to test, visit this website:
Here you can enter the numbers from your ads.
You can test a maximum of 4 ads per ad group with this tool. If you have more, you can just go ahead and pause the worst performers, or the ones that have the least traffic, and make a vow to yourself never to do that again.
I like testing just 2 ads per ad group, MAYBE 3. The more you have, the longer it will likely take to get a statistically significant result.
With the split testing calculator, you need to enter the number of clicks for each ad in the “trials” column, and the number of conversions in the “successes” column. If you have multiple conversion types, pick the one that is most valuable to your business. You also need to tick the “include” checkbox next to each ad that you want to include in the calculation.
Once you click “Calculate”, the tool will spit out a value for each ad in the “Apprx probability of being best” column. The higher this number, the more you can trust that the winning ad really is the best.
Scientifically and statistically, a 95%+ value is considered good and reliable. But depending on the amount of traffic and the market response, it might take too long to get to 95% certainty for a particular test.
And since we are not doing science here, but making business decisions under uncertainty and risk in a competitive and constantly changing marketplace, in some cases it’s better to settle for a lower value, say 90%, 85% or 80%. I think it’s better to make lots of decisions, even if you sometimes get them wrong, than to sit and wait and do nothing.
The minimum value you choose to accept will depend on how much it bothers you to make the wrong decision occasionally. For example, if you pick a winner with an 80% probability of being best, there is a 20% chance that one of the other ads is actually better.
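By the way, if you ever want to sanity-check the calculator (or it goes offline), here is a rough sketch of how a “probability of being best” number can be computed. It uses a standard Bayesian approach, modeling each ad’s conversion rate as a Beta distribution built from its clicks and conversions; this is my own approximation, not necessarily the exact method behind any particular tool, and the ad names and numbers are made up for illustration.

```python
import random

def prob_being_best(ads, draws=100_000):
    """ads: dict of name -> (clicks, conversions).
    Returns name -> estimated probability of being the best ad.

    Each ad's conversion rate gets a Beta(conversions + 1, clicks - conversions + 1)
    posterior (uniform prior); we sample all of them many times and count how often
    each ad has the highest sampled rate.
    """
    wins = {name: 0 for name in ads}
    for _ in range(draws):
        sampled = {
            name: random.betavariate(conv + 1, clicks - conv + 1)
            for name, (clicks, conv) in ads.items()
        }
        wins[max(sampled, key=sampled.get)] += 1
    return {name: count / draws for name, count in wins.items()}

# Hypothetical numbers: Ad A got 12 conversions from 480 clicks, Ad B got 23 from 510.
print(prob_being_best({"Ad A": (480, 12), "Ad B": (510, 23)}))
# If one ad clears your chosen threshold (95%, 90%, 80%...), pause the other(s).
```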
If you end up pausing the losing ads and keeping a winner, you will now have just one ad in that ad group.
You should always be testing 2 ads in your most important ad groups, but if you have a lot of other ad groups to check and don’t have enough time to write a new ad, you can leave it as it is for now, and add another ad to test when you can.
If you decide to test a new ad, a quick & easy way to do that is to shuffle the order of your headlines & descriptions. Ideally, change just one variable at a time, for example swapping headline 1 for headline 2 and vice versa, while leaving everything else the same.
An important thing to do at this point is change your ad rotation settings. In order to run an effective and fair test, you want the traffic to be split more or less evenly between the different ads you are testing.
However, by default Google uses the “Optimize: Prefer best performing ads” setting, which sends more traffic to the ads it believes will perform better, and less to the others.
This setting is not good for doing a fair test, so you need to change it to “Do not optimize: Rotate ads indefinitely”, either at the ad group or campaign level (I prefer to do it at the campaign level, so I don’t have to worry about this setting again).
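For what it’s worth, if you manage campaigns through the Google Ads API instead of the interface, the same setting lives on the campaign as ad_serving_optimization_status. Here is a rough sketch using the official google-ads Python client; the account and campaign IDs are placeholders, and you should double-check the field and enum names against the current API documentation, since the client library changes between versions.

```python
from google.api_core import protobuf_helpers
from google.ads.googleads.client import GoogleAdsClient

# Assumes a configured google-ads.yaml with your developer token and OAuth credentials.
client = GoogleAdsClient.load_from_storage("google-ads.yaml")
customer_id = "1234567890"   # hypothetical account ID
campaign_id = "9876543210"   # hypothetical campaign ID

campaign_service = client.get_service("CampaignService")
operation = client.get_type("CampaignOperation")
campaign = operation.update
campaign.resource_name = campaign_service.campaign_path(customer_id, campaign_id)

# Equivalent of "Do not optimize: Rotate ads indefinitely" in the interface
campaign.ad_serving_optimization_status = (
    client.enums.AdServingOptimizationStatusEnum.ROTATE_INDEFINITELY
)

# Build the update mask from the fields we changed, then send the mutate request
client.copy_from(
    operation.update_mask, protobuf_helpers.field_mask(None, campaign._pb)
)
response = campaign_service.mutate_campaigns(
    customer_id=customer_id, operations=[operation]
)
print(response.results[0].resource_name)
```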
Another thing I like to do is set up a “triangle” of ads: you create a copy of your old winning ad and, alongside it, a new challenger ad.
You give the 2 new ads (remember, one of them is just a copy of the old winner) a few days to run, and then you pause the old winner.
This way you don’t disrupt the ad group too much: Google can keep sending traffic to the “warmed up” old ad while the new ones get started.
And that’s it, this is how you properly A/B test ads.
You can do this procedure every week if you want, not only if your account is underperforming, as long as you make sure your results are statistically significant before making any changes.
P.S.: If you need help with your Google Ads, go here to book a free consultation with me.