The AI Experimenter: The Most Common Profile in Financial Services Modernization

by Heather Page

The AI Experimenter: What It Means and What to Do About It

Part 1 of 6 in the Six Profiles of Financial Services Modernization series

There's a version of this story that plays out at nearly every mid-market and enterprise financial institution we work with.

The AI team is doing genuinely impressive work. They've built propensity models that outperform legacy rules engines by a wide margin. There's a fraud detection pilot in production that's catching patterns the previous system missed entirely. A customer churn model is sitting in a notebook somewhere that, if operationalized, would save the retail division eight figures a year.

Leadership is invested. The board has seen demos. The CTO has a roadmap. The Chief Data Officer is building a team.

And yet.

The pilots stay pilots. The roadmap keeps extending. The churn model is still in a notebook. When someone asks why the fraud detection system hasn't rolled out to the other three business lines, the answer involves six months of data pipeline work that nobody budgeted for.

This is The AI Experimenter — and it is, by a significant margin, the most common modernization profile we see in financial services.

The Pattern

The AI Experimenter profile has a distinctive shape. When we map institutions across the four pillars of the TribalScale Financial Services Modernization Index™ — Data Foundation & Architecture, Governance & Regulatory Readiness, AI & Real-Time Decisioning, and Customer Intelligence & Personalization — one pillar spikes noticeably above the others.

AI & Real-Time Decisioning scores a 3.5 or higher. There are models in production. There's genuine capability.

But Data Foundation sits below 3. Governance & Regulatory Readiness sits below 3. And that gap — between what the AI team can build in isolation and what the enterprise can actually operationalize — is where the value leaks out.

From the outside, this institution looks like it's ahead. The demos are impressive. The talent is real. The executive narrative is about AI transformation.

From the inside, the people closest to the work know the truth: they're building on sand.

Why It Happens

The AI Experimenter profile isn't a failure of intelligence or ambition. It's the predictable result of how most financial institutions entered the AI era.

AI talent was hired before the architecture was ready. Starting around 2018–2020, most large financial institutions began aggressively hiring data scientists and ML engineers. This was the rational move — the talent market was competitive, and waiting meant falling further behind. But these teams were hired into environments where the data infrastructure was still designed for batch reporting and regulatory compliance, not for training and deploying machine learning models at scale. The talent arrived before the foundation was ready to support what they were being asked to build.

Pilots are funded differently from platforms. An AI pilot can be stood up with a small team, a curated dataset, and a cloud sandbox. The business case is compelling: "Give us six months and $500K, and we'll prove the model works." That's an easy yes. What's harder is the $5M–$15M infrastructure investment to build the unified data layer, feature stores, model serving infrastructure, and CI/CD pipelines that would let that pilot run in production across the enterprise. Pilots get innovation budgets. Platforms require capital allocation decisions. These are governed by different committees, different timelines, and different risk appetites.

Success created its own trap. The cruelest irony of the AI Experimenter profile is that the pilots often genuinely succeed. The fraud model works. The churn predictor is accurate. The next-best-action engine outperforms the rules engine in every A/B test. These successes create executive confidence that the AI strategy is working — which reduces urgency around the foundational investments that would allow those successes to scale. Why approve a $10M data platform overhaul when the AI team keeps delivering wins? The answer, of course, is that those wins are confined to sandboxes and single use cases, and they'll stay there until the platform catches up.

Governance was treated as a later-stage concern. When the AI team was small and the models were experimental, governance was informal. Model validation was a conversation, not a process. Data lineage was tracked in spreadsheets or not at all. As the number of models in production grows, this informality becomes a structural risk — particularly in financial services, where regulatory expectations around model risk management (SR 11-7, SS1/23, OSFI E-23) are explicit and increasingly enforced. The AI Experimenter often discovers their governance gap not through internal audit, but through a regulator asking questions they can't answer quickly.

The Hidden Risk

The immediate risk for the AI Experimenter is well understood: pilot fatigue. Eventually, the board stops being impressed by demos and starts asking why the transformation roadmap keeps slipping. Executive confidence in AI as a strategic lever erodes — not because the technology failed, but because the institution couldn't operationalize it.

But there's a deeper risk that's less obvious.

The talent leaves. The data scientists and ML engineers who were hired to build transformative AI capabilities are, in practice, spending 60–70% of their time on data wrangling, pipeline fixes, and workarounds for infrastructure gaps. This is not what they were recruited to do. Top AI talent has options, and they will migrate to institutions — or to tech companies — where the platform allows them to do the work they were hired for. The AI Experimenter doesn't just fail to scale its models. It loses the people who built them.

Competitors close the gap from below. While the AI Experimenter is stuck in pilot mode, institutions that invested in data foundation first — the Infrastructure-First Institutions and Balanced Modernizers — are reaching the point where their platforms can support AI at scale. They may be 18 months behind on AI talent, but they're 18 months ahead on the architecture that makes AI talent productive. When they deploy, they'll deploy enterprise-wide. The AI Experimenter's early lead in AI capability becomes meaningless if it can't be operationalized before competitors catch up on a more stable foundation.

Regulatory risk compounds quietly. Every model running in production without a mature governance framework is a regulatory exposure. In financial services, this isn't theoretical. Model risk management failures result in consent orders, remediation mandates, and reputational damage. The AI Experimenter's governance gap widens with every new model deployed, and the cost of retrofitting governance increases nonlinearly as the model inventory grows.

The Highest-Leverage Move

If you recognize your institution in this profile, the instinct is to try to fix everything at once — launch a data platform modernization, build a governance framework, and continue the AI roadmap in parallel. This is understandable and almost always wrong. Parallel tracks at this scale compete for the same resources, the same executive attention, and the same change management bandwidth.

The single highest-leverage move for the AI Experimenter is to freeze the model inventory and invest the next two quarters in data foundation and governance.

This sounds counterintuitive. The AI team is your most visible capability. Telling them to pause while the infrastructure catches up feels like slowing down.

It's not. It's removing the bottleneck that's preventing everything they've built from reaching enterprise scale.

Specifically:

Consolidate your data layer into a unified, governed foundation. This doesn't mean a three-year data warehouse migration. It means implementing a modern lakehouse architecture — and this is where partners like Databricks become critical — that can serve both your analytics and AI workloads from a single source of truth. The goal is a feature store and data serving layer that any model can access without the six-month pipeline project that currently blocks every deployment.
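To make the idea concrete, here is a deliberately minimal sketch of what a shared feature serving layer buys you: pipelines publish a feature once, and every model reads the same governed values instead of building its own extract. All names and structure here are illustrative assumptions, not the API of any specific platform.

```python
from dataclasses import dataclass, field

@dataclass
class FeatureStore:
    # Illustrative in-memory store: keys are (entity_id, feature_name) pairs.
    _features: dict = field(default_factory=dict)

    def write(self, entity_id: str, name: str, value: float) -> None:
        """A pipeline publishes a feature once, centrally."""
        self._features[(entity_id, name)] = value

    def read_vector(self, entity_id: str, names: list) -> list:
        """Any model (fraud, churn, next-best-action) reads identical values."""
        return [self._features[(entity_id, n)] for n in names]

store = FeatureStore()
store.write("cust-001", "txn_count_30d", 42.0)
store.write("cust-001", "avg_balance", 12500.0)

# Both a fraud model and a churn model consume the same governed features,
# with no per-deployment pipeline project in between.
vector = store.read_vector("cust-001", ["txn_count_30d", "avg_balance"])
```

In production this role is played by a real feature store on the lakehouse, but the design point is the same: one write path, many governed read paths.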

Build the governance framework in parallel with the data foundation, not after it. Model validation workflows, data lineage tracking, bias monitoring, and regulatory documentation should be embedded into the platform from day one. Retrofitting governance onto a scaling AI operation is an order of magnitude harder than building it alongside the foundation.
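The "embedded from day one" point can be sketched in a few lines: if every model is registered with its owner, lineage, and validation status at deployment time, a regulator's question becomes a query rather than a scramble. The field names below are hypothetical assumptions for illustration, not a reference to any specific model risk management product.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class ModelRecord:
    model_id: str
    owner: str
    data_sources: list                      # lineage: upstream datasets
    last_validated: Optional[date] = None
    validation_status: str = "pending"

@dataclass
class ModelInventory:
    _records: dict = field(default_factory=dict)

    def register(self, record: ModelRecord) -> None:
        """Registration happens at deployment, not retroactively."""
        self._records[record.model_id] = record

    def record_validation(self, model_id: str, when: date) -> None:
        rec = self._records[model_id]
        rec.last_validated = when
        rec.validation_status = "validated"

    def unvalidated(self) -> list:
        """The list an examiner asks for first."""
        return [r.model_id for r in self._records.values()
                if r.validation_status != "validated"]

inv = ModelInventory()
inv.register(ModelRecord("fraud-v2", "ml-risk-team", ["retail_txns"]))
inv.register(ModelRecord("churn-v1", "retail-ds", ["crm_events"]))
inv.record_validation("fraud-v2", date(2025, 3, 1))
# inv.unvalidated() -> ['churn-v1']
```

Retrofitting this onto dozens of already-deployed models means reconstructing lineage and validation history after the fact; building it alongside the foundation means the inventory is complete by construction.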

Then — and only then — resume the AI roadmap. When you do, the models that were stuck in pilot mode will have a platform that can support enterprise deployment. The churn model that's been sitting in a notebook for eighteen months can go to production in weeks instead of quarters. The fraud detection system can roll out to every business line without a custom integration project each time.

The AI Experimenter's assets aren't the problem. The architecture beneath them is. Fix the architecture, and the assets you've already built become the fastest path to enterprise-scale AI in your competitive set.

What This Looks Like in Practice

Consider a mid-market North American bank — $30B in assets, strong retail and commercial divisions. Their Chief Data Officer, hired two years ago from a major tech company, built a data science team of 15 and stood up an ML platform on cloud infrastructure. Within 18 months, they had four models in production: fraud detection, credit risk scoring, customer churn prediction, and a next-best-action engine for the retail branch network.

The results were real. Fraud detection improved by 23%. The churn model identified at-risk customers with 81% accuracy. The CDO presented these wins to the board quarterly, and executive confidence in the AI strategy was high.

The problems surfaced when the bank tried to scale. The next-best-action engine worked in three pilot branches but couldn't be deployed across the 200-branch network because each branch's customer data lived in a different system — a legacy of three acquisitions over the past decade. The fraud model was trained on one division's transaction data and couldn't ingest the commercial division's data without a pipeline that didn't exist. The credit risk model needed retraining on a quarterly cycle, but the data refresh process was manual and took six weeks.

Meanwhile, the model risk management team — two people, reporting into compliance — was reviewing models using spreadsheets and email chains. The regulator's annual examination included, for the first time, specific questions about AI model governance. The bank couldn't produce a complete model inventory, let alone validation documentation, within the examiner's timeline.

The bank's Modernization Index profile: AI & Real-Time Decisioning at 4.0. Data Foundation at 2.3. Governance at 2.0. Customer Intelligence at 2.7. A textbook AI Experimenter.

The resolution didn't start with more AI. It started with a 90-day data foundation sprint — consolidating the three acquisition-legacy data environments into a unified lakehouse, implementing a feature store, and standing up automated data quality monitoring. In parallel, a governance framework was built: a model inventory system, standardized validation workflows, automated documentation generation, and a model monitoring dashboard.

Six months later, the four existing models were redeployed on the new foundation. The next-best-action engine — the same model, unchanged — rolled out to all 200 branches in three weeks. The fraud model ingested commercial transaction data for the first time. The quarterly model retraining cycle dropped from six weeks to four days.

The CDO's team didn't build a single new model during those six months. They didn't need to. The models they'd already built, deployed on a real foundation, delivered more enterprise value in the following quarter than the previous eighteen months of piloting combined.

Find Out Where You Stand

The AI Experimenter is the most common profile we see — and the one that surprises executives most. Institutions that consider themselves AI leaders often discover that their capability is concentrated in pockets that can't scale.

The TribalScale Financial Services Modernization Index™ maps your institution across all four pillars in about five minutes. You'll see your radar profile, your archetype match, and your single highest-leverage move.

The Financial Services Modernization Index™ and the six modernization archetypes are proprietary frameworks developed by TribalScale.


© 2025 TRIBALSCALE INC

💪 Developed by TribalScale Design Team
