The Explainability Layer: Making AI Legible

"Your algorithm says this trade is risky. I need to know why." The model had an answer, 72%, and no way to defend it. The compliance officer who asked overrode it. Three weeks later the trade lost $2.1M. The model had been right. The interface just couldn't say so.

The question the system couldn't answer

In 2021 a compliance officer put that question to me directly, and I didn't have a good answer. The risk model flagged a trade at 72%. She wasn't refusing to listen — she was doing her job, which is to never sign off on something she can't explain to an auditor. The system gave her a number and a recommendation and nothing in between. So she did the only defensible thing and waved the model away. The trade went through unsupervised. Three weeks later it cost the firm $2.1M.

Here's the part that stayed with me: the model wasn't wrong. It had caught the risk. The failure was entirely on the interface — nobody had designed the layer that turns a score into a reason. And a score without a reason is, to an expert who's accountable for the decision, just noise with a confident font.

What she actually needed

I used to think explainability meant exposing the machinery: model weights, feature vectors, the architecture. It doesn't. The compliance officer didn't want a lecture on the model. She wanted an answer to one human question: why this, and not that? When I rebuilt the surface, that's the only question I designed around. The recommendation moved down the page. The reasoning moved up: this counterparty has six failed trades in eighteen months; their average position is $500K; this one is $2M, four times their normal appetite. Suddenly she wasn't being asked to trust a number. She was being handed a case she could check, argue with, and put her name next to.
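As a rough illustration of what "a case she could check" might look like in code, here is a minimal Python sketch that turns counterparty facts into plain statements a reviewer can verify. The `CounterpartyHistory` fields, the `build_reasons` helper, and the 1.5x size threshold are all assumptions made for this example, not the schema or rules of the system described here.

```python
from dataclasses import dataclass


@dataclass
class CounterpartyHistory:
    # Hypothetical fields; the article does not specify a schema.
    failed_trades: int        # failed trades inside the lookback window
    lookback_months: int      # length of the lookback window in months
    avg_position_usd: float   # average position size in USD


def build_reasons(history: CounterpartyHistory, trade_usd: float) -> list[str]:
    """Turn raw facts into statements a reviewer can check one by one."""
    reasons = []
    if history.failed_trades > 0:
        reasons.append(
            f"{history.failed_trades} failed trades in the last "
            f"{history.lookback_months} months"
        )
    if history.avg_position_usd > 0:
        ratio = trade_usd / history.avg_position_usd
        if ratio >= 1.5:  # assumed cutoff for "unusually large"
            reasons.append(
                f"This trade (${trade_usd:,.0f}) is {ratio:.1f}x the "
                f"counterparty's average position "
                f"(${history.avg_position_usd:,.0f})"
            )
    return reasons


# Figures taken from the example in the text.
history = CounterpartyHistory(
    failed_trades=6, lookback_months=18, avg_position_usd=500_000
)
reasons = build_reasons(history, trade_usd=2_000_000)
for reason in reasons:
    print("-", reason)
```

Each reason is a sentence built from a number the reviewer can look up independently, which is the point: the output is something to argue with, not something to take on faith.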

Explainability isn't a data-science problem. It's a design problem. You can ship the best model in the world; if the person on the hook for the decision can't see why it decided, they won't use it — and they'll be right not to.

Why I design the explanation first

The most useful habit I took from that engagement is to design the explanation surface before the model is finished. It sounds backwards, but it does something quietly powerful: when the interface has to show why, the model is forced to produce reasons a person can actually read — not just an output, but the handful of signals behind it, and the boundary at which the answer would flip. Demand legibility from the UI and you end up demanding it from the model too.
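One way to picture that contract between interface and model is a sketch like the one below: the model returns a score, the few signals that contributed most, and the value at which each of those signals would flip the answer. Everything here is hypothetical, including the toy logistic scorer, its weights, the feature names, and the 0.5 threshold; it is meant to show the shape of the output, not how any real risk model works.

```python
import math
from dataclasses import dataclass

# Hypothetical weights for a toy logistic risk score (all nonzero by design).
WEIGHTS = {"failed_trades": 0.25, "size_ratio": 0.30, "days_since_last_trade": -0.01}
BIAS = -1.8
THRESHOLD = 0.5  # assumed: scores at or above this are flagged


@dataclass
class Explanation:
    score: float
    flagged: bool
    top_signals: list[tuple[str, float]]  # (feature, contribution to the logit)
    flip_at: dict[str, float]             # feature value where the score hits the threshold


def explain(features: dict[str, float], top_k: int = 2) -> Explanation:
    # Per-feature contribution to the logit of a linear model.
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    logit = BIAS + sum(contributions.values())
    score = 1.0 / (1.0 + math.exp(-logit))

    # Keep only the strongest signals, ranked by absolute contribution.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    top = ranked[:top_k]

    # For each top signal, solve for the value that puts the score exactly
    # on the threshold while every other feature stays fixed.
    target_logit = math.log(THRESHOLD / (1.0 - THRESHOLD))
    flip_at = {
        name: features[name] + (target_logit - logit) / WEIGHTS[name]
        for name, _ in top
    }
    return Explanation(score, score >= THRESHOLD, top, flip_at)


features = {"failed_trades": 6, "size_ratio": 4.0, "days_since_last_trade": 30}
result = explain(features)
print(f"score={result.score:.2f} flagged={result.flagged}")
for name, contribution in result.top_signals:
    print(f"{name}: contribution {contribution:+.2f}, flips at {result.flip_at[name]:.1f}")
```

Designing the interface around a structure like `Explanation` is what puts pressure on the model: if it cannot fill in `top_signals` and `flip_at`, that gap shows up before launch instead of in front of a reviewer.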

The sophisticated systems rarely fail because the math is wrong. They fail at the explanation layer, in front of a person who has to be accountable for what they approve. Interpretability is the bridge between a model that's correct and a model anyone is willing to act on. Most teams build the model and bolt the bridge on later. I'd rather build the bridge first and let it shape the model.

The reference version

Want the playbook, not the story? The ML-explainability patterns I design from — provenance, feature importance, counterfactuals — with the do's, don'ts, and worked examples.

See the pattern reference: ML Explainability Patterns →
