In software development, metrics are always 'interpretable' and prone to what I refer to as (pardon me, but I cannot find a better term) 'technical masturbation'. An interesting metric at work has been the build box % pass rate. I'm not sure what it tells us. Each team has a different culture around how to treat the build box. One team has a pass rate of 30% while another has a pass rate of around 80%. One uses CI and the other does not. So what does the number tell us?
Members of a CI team may find it useful, and acceptable, to "lean on the build box" by not running all the tests prior to committing. This can be productive if running the tests takes longer than about 5% of the time between commits and build box breakages are fixed quickly (is 'quickly' relative to the commit rate?). Can the gains be greater than the cost? Does the % pass rate metric reflect productivity? By 'productivity' I mean minimising time to profitable delivery.
Other teams may consider the build box pristine, never to be broken.
It occurs to me that perhaps the real issue is the time the build spends broken. In companies I've worked at where the build took a long time (e.g. hours), the consequence of a build break was higher, and hence the team usually aspired to a no-breakages policy. If you have such a slow build, fair enough. But a build time of hours is really a smell of tight coupling; I would eliminate the problem of the long build time first.
So, I wonder if a useful metric coming out of this is the % of time the build box is broken, rather than the build box pass rate against commits?
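To make the difference between the two metrics concrete, here is a rough sketch in Python that computes both from a build history. Everything in it is an assumption for illustration: the `Build` record, the function names, the rule that the box counts as broken from a failed build until the next build finishes, and the two made-up histories (`leaning_team` and `pristine_team`), which are not data from any real team.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Build:
    """Hypothetical build record: when the result arrived and whether it passed."""
    finished_at: datetime
    passed: bool


def pass_rate(builds):
    """Fraction of builds that passed (the pass rate against commits)."""
    if not builds:
        return 0.0
    return sum(b.passed for b in builds) / len(builds)


def fraction_time_broken(builds, window_end):
    """Fraction of the observed window during which the latest build was red.

    Assumes each build's status holds until the next build finishes,
    or until window_end for the last build.
    """
    ordered = sorted(builds, key=lambda b: b.finished_at)
    if not ordered:
        return 0.0
    total = window_end - ordered[0].finished_at
    if total <= timedelta(0):
        return 0.0
    broken = timedelta(0)
    for current, nxt in zip(ordered, ordered[1:] + [None]):
        end = nxt.finished_at if nxt is not None else window_end
        if not current.passed:
            broken += end - current.finished_at
    return broken / total


def at(hour, minute=0):
    """Helper for the invented example data: a time on an arbitrary day."""
    return datetime(2024, 1, 1, hour, minute)


# Invented history: breaks often, but each break is fixed within 5 minutes.
leaning_team = [
    Build(at(9), True),
    Build(at(10), False), Build(at(10, 5), True),
    Build(at(11), False), Build(at(11, 5), True),
    Build(at(12), False), Build(at(12, 5), True),
]

# Invented history: breaks once, but the break stays unfixed for an hour.
pristine_team = [
    Build(at(9), True),
    Build(at(10), True),
    Build(at(11), True),
    Build(at(12), False),
]

window_end = at(13)

for name, history in [("leaning", leaning_team), ("pristine", pristine_team)]:
    print(name, pass_rate(history), fraction_time_broken(history, window_end))
```

With these invented numbers, the first team passes 4 of 7 builds (about 57%) yet is broken for only 15 of 240 minutes (about 6%), while the second passes 3 of 4 builds (75%) yet is broken for 60 of 240 minutes (25%). Under these assumptions the two metrics rank the teams in opposite order, which is the point of the question above.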