Metrics and Mistakes
Let me tell you two stories about metrics. Stick with me: the results are more interesting than they sound. We’ll cover how to choose what you measure, and what happens once you start acting on what your chosen metrics tell you.
Story 1: Blog Tourism (aka seeking many visitors)
Some years ago I worked at a company that made programming tools, including compilers and database libraries. Our blogs and articles were usually technical: pieces that helped developers learn, explained technology, showed what our new features let them achieve, how-to guides, key reference knowledge, and so forth. Mostly solid technical content.
Some blogs had low hit counts. For example, an article on OS X (as it was then) stack alignment requirements. The details don’t really matter to this story; if you’re not a programmer, the takeaway is that it was useful, specialised information for the audience. Hopefully, whatever field you work in, you value people writing down the useful stuff.
…But if you really want to know: 32-bit OS X required each call stack frame to be 16-byte aligned. This could cause slight performance issues and put a burden on each caller, and it applied regardless of whether the code used MMX/SSE instructions, used 128-bit data, or was intended for external consumption or stack walking. Why? Undocumented, sorry; only informed speculation. At the time this story is set, the post was valuable to our audience: the information was not widely known, was hard to discover, and the reasons were undocumented by Apple, yet it was critical if you happened to hit a situation where it mattered. But also, and this is key, it was not read very often. This is a case of interesting, high value, but rarely visited.
For a while, we had a Marketing lead who believed we should only have blogs that had a high number of visitors. We saw a lot of low-content, high-fluff pieces not very related to our product or technology that gained visitors from generic keywords. These were cases of uninteresting, not relevant, low value, but often visited.
At the same time, the Marketing team was migrating the blog engine, and, as you would not expect from a tech company that made database admin tools and database-oriented programming tools, they were migrating the blogs by hand: overseas workers were contracted to copy-paste text and manually move it from one database to another. At least one member of that Marketing team heavily pushed this approach of manual data migration, to my memory, while working at a database-centric tools company whose core competencies, which they personally marketed, included making tools suitable for exactly this work. That person moved on after this story to the world’s largest developer knowledge company. Eh. Good luck to them.

Since manual migration was slow, error-prone, and expensive, a large number of blog posts needed to be culled to save costs, and the metric used to choose them was the number of visitors to each post. The low-value but often-visited blogs made the cut; the example high-value blog above did not. My team fought hard for this and other interesting or historically valuable content to be preserved in the migration, and much of it was. Sadly, I cannot find that specific blog post on our blog site today.
Here, I’ve spoken of ‘high value’ as meaning interesting technical content. Whether that is a good judgement of value for a programming and developer tools business is a question I’d like to skip past: for this article, accept that yes, it is. What I’d like to focus on instead is the metric chosen to represent each blog post’s value to the business, and that was hit count: the number of visitors.
What happened once that metric was chosen and applied?
Judgements and actions were taken based on that metric.
And that meant the low-value, low-content blogs that gained lots of visitors were kept, while the high-value blogs were deleted.
Now, hit count is not necessarily a bad metric for value. However: when we saw the change in content style driven by the marketing team, and their happy announcements of high hit rates, I asked questions about the behaviour of the visitors who landed on those blogs. Did they stay on the page long? Did they move on to other areas of our website? Did they download a trial? Could we convert them into leads? Did we have any indication from their tracking profiles that they were developers or students? Was there any indication at all that these were potential customers?
The answer was ‘no’ to all of those. If business value is building a community around your product, or driving leads for sales, these blogs failed on both counts. But that didn’t matter. The team had chosen a metric and by that metric were doing very well.
Story 2: Product Page Popularity
Another Marketing team, and a Facebook page for our product. In case anyone tries to draw connections to the deliberately anonymous people in this story: (a) please do not connect this to my current workplace, and (b) I have considerable faith in the team I work with today, whom I like and respect a lot for their capability and professionalism. We pushed news articles and advertisements to that audience. Marketing had metrics of increasing follower growth and increasing article and ad interaction, both of which seem very reasonable.
I noticed that our Facebook follower count was growing rapidly, and that many of these new followers were liking our Facebook posts. This seemed like good news. But when I dug into who these new followers were, they were all from countries where we had low product usage: the nationalities did not seem representative of who we’d expect to follow our product page. That by itself was not conclusive (maybe we had tapped into a previously untapped demographic), but a random sample of followers showed the accounts were likely fake.
What happened when the metric of increasing followers and high follower interaction was chosen and applied?
We saw fake followers, and interactions from those fake followers, which meant less overall business value.
(Why? Suppose it is valuable to a business that your customers see an advertisement you want them to see. If you have ten thousand followers, all of them genuine, and ten percent of them are shown that ad, you have genuine, business-valuable viewership of one thousand people. But if you have twenty thousand followers, half of whom are fake, and ten percent of your audience are shown the ad, you still have genuine viewership of one thousand people; because you are paying to reach two thousand, you got them at twice the cost, i.e. half the business value. Or, if you pay to reach one thousand people, you get genuine viewership of only five hundred: half the value you paid for.)
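The arithmetic above can be sketched in a few lines. The numbers, and the ten-percent reach figure, are the hypothetical ones from this story, not real platform figures:

```python
def genuine_reach(followers, genuine_fraction, reach=0.10):
    """Real people who actually see the ad, assuming the platform
    shows it to `reach` of all followers, fake and genuine alike."""
    return followers * reach * genuine_fraction

def cost_per_genuine_view(followers, genuine_fraction,
                          reach=0.10, cost_per_impression=1.0):
    """What each genuine viewer costs when you pay per impression."""
    shown = followers * reach
    return (shown * cost_per_impression) / genuine_reach(
        followers, genuine_fraction, reach)

# 10,000 followers, all genuine: 1,000 real viewers at cost 1.0 each.
assert genuine_reach(10_000, 1.0) == 1_000
assert cost_per_genuine_view(10_000, 1.0) == 1.0

# 20,000 followers, half fake: still only 1,000 real viewers,
# but at twice the cost per viewer, i.e. half the value per unit spend.
assert genuine_reach(20_000, 0.5) == 1_000
assert cost_per_genuine_view(20_000, 0.5) == 2.0
```

The fake half of the audience inflates the follower metric while leaving the genuine reach unchanged, which is exactly why the cost per real viewer doubles.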
But this was unwelcome news, and it didn’t matter. The team had chosen a metric and by that metric were doing very well. Note that there was no suggestion the team was buying followers: I have no knowledge of where the apparently fake followers came from, and only guesses about why they had a high interaction rate.
Metrics as Proxies
A metric is a proxy. A stand-in, something meant to represent something else. This is forgotten by many people. When that marketing team chose a metric of high blog hit counts, they were originally trying to ensure we had blog content that was interesting to a large number of developers, our product (and therefore sales and therefore revenue) audience. The proxy for being interesting to our audience was how many people visited the blog posts.
There was no actual value in having many people visit the blogs if they were not interested in the blog content, did not stay on the site, and were not someone who might ultimately purchase the software the content is created for.
This means that, by itself, hit count was a poor proxy for the actual goal (a growing developer audience). Facebook follower growth and interaction growth were similarly poor proxies for their goal (growing visibility among developers and potential customers on social media).
When you measure something, if you have no way to directly measure what you’re interested in, you’re forced to measure something else: to choose a measurable proxy that you believe has high correlation to your actual goal or interest. Make sure you choose your proxies well.
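As a toy illustration of a badly chosen proxy (entirely made-up numbers and titles, echoing the first story), here is how a proxy can rank choices differently from the goal it stands in for:

```python
# Hypothetical posts: the proxy (hit count) and the actual goal
# (trial downloads by developers) disagree about which post matters.
posts = [
    {"title": "Ten fluffy keyword-bait listicles", "hits": 50_000, "trials": 2},
    {"title": "OS X stack alignment explained",    "hits": 300,    "trials": 40},
]

best_by_proxy = max(posts, key=lambda p: p["hits"])
best_by_goal  = max(posts, key=lambda p: p["trials"])

# Acting on the proxy keeps the wrong post.
assert best_by_proxy["title"] != best_by_goal["title"]
```

When the proxy and the goal rank your options differently, every decision made on the proxy is a decision made against the goal.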
Acting on Metrics
Because you are almost always measuring a proxy for what you actually want to measure, you can only act on what the proxy tells you, not on what you really want to know. This, too, is forgotten by many people. Once you see data from your metrics, you will take action based on those results. This is reasonable: if your goal is gaining views by developers and you see that certain types of blog posts attract more developers, you’ll try to do more of what was successful.

Note that this itself is a proxy: what certainty do you have that developer visibility correlates with your actual goal of selling more of your developer tool? Sometimes it seems all marketing and outreach behaviour is a proxy for something else, a deep rabbit hole. Sanity comes not from recursively following it down but from returning to: choose your proxies well. Make them as direct as possible. If your metric directly measures your goal, you are fine. But because your metric almost always measures a proxy for your goal, once you take action based on the metric, you are taking action based on something other than your goal.
This marketing team started optimising content based on the hit count metric, and it worked: lots of hits. But those hits had no value for the goal the metric was a poor proxy for: content that was both interesting to many developers and encouraged use of the product we wanted to sell. This meant the marketing team was being ineffective, less successful at its own goal of reaching the potential sales and revenue audience.
But even though they failed their purpose, they had great results for their metrics.
And to get circular: if great results on a team’s metrics is itself the measure by which the team’s success is judged, that too can be a poor proxy for the team’s actual success at its goal. It’s a recursive rabbit hole!
Luckily, in our case, we had management that was insightful enough to view that team’s results skeptically, and today we have different people and a different approach. Not all companies or departments have management wise enough to question the choice of metrics and to question apparent success.
◆ ◆ ◆
Polish-American philosopher Alfred Korzybski said ‘the map is not the territory.’
Poet Wallace Stevens wrote a poem I love for its title alone: ‘Not Ideas About the Thing But the Thing Itself’. How well that title captures the essence of an easy mistake in thinking. ‘It was like / A new knowledge of reality…’
Say it, no ideas but in things—
nothing but the blank faces of the houses
and cylindrical trees
bent, forked by preconception and accident…
— William Carlos Williams, Paterson
Williams was reacting against symbolism and abstraction in poetry and trying to focus on ‘the thing’ itself. See A Place For Abstraction. ‘Forked by preconception and accident’: does this match your use of metrics?
Metrics do not measure your goals. Metrics measure a proxy. The proxy may or may not be aligned so that the measurements return close to what you would see if you could measure your goals themselves.
Never act to optimise a metric. Only act to optimise what the metric is a proxy for: your actual goal. When the metric becomes the goal, you’ve made a mistake.
Many people forget this.
◆ ◆ ◆
There are three takeaways:
1 — Metrics are proxies;
2 — Choose your proxies wisely;
3 — Act on the goal not the metric.