The AI Evolution: Approaching Data and Integration
“I’ve seen things you people wouldn’t believe.”
– Roy Batty, Blade Runner
Working in consulting gives you a kind of X-ray vision. You walk into a room with a new client and they start listing all the reasons they’re unique—how no one understands their business, how their systems are one-of-a-kind, how the complexity of what they do defies replication. And sure, some of that is true. Every organization has its quirks and oddities. But once you get past the surface, you usually find something that feels familiar: a recognizable business structure layered with years of adaptations, workarounds, and mismatched systems that were never quite built to talk to each other.
When it comes to AI, this same story plays out over and over again. We start talking about the opportunities—where it could go, what it might unlock—and then we hit the same wall: the data. Or more accurately, the data they think they have.
Here are some common refrains I’ve heard across industries:
• “Those two systems don’t talk to each other.”
• “That data is stored in PDFs we print and file away.”
• “We purge that information every few months because of compliance.”
• “It’s in SharePoint. Somewhere. Maybe.”
• “Our marketing and sales platforms use different ID systems, so we can’t link anything.”
None of these answers are surprising. What’s surprising is how often people are still shocked when their AI project struggles to get off the ground.
In our survey, 44% of business leaders said that their companies are planning to implement data modernization efforts in 2024 to take better advantage of Gen AI.
– PwC, 2024 AI Business Predictions
This chapter is about getting real about your data. Before you can build intelligent systems, you have to integrate the ones you already have. And before you can integrate them, you have to understand what data you have, where it lives, what shape it’s in, and whether it’s even useful in the first place.
Most companies assume their data is more usable than it actually is. I call this the Illusion of Readiness.
They picture their systems like neat rows of filing cabinets, all labeled and accessible. The reality is more like a junk drawer: some useful stuff, some random receipts, and a bunch of keys no one remembers the purpose of.
And here’s the kicker: AI doesn’t just use data. It relies on it. Feeds off it. Becomes it. If you give it bad data, it doesn’t know any better. It won’t tell you it’s confused. It will confidently give you the wrong answer—and that can have consequences.
Before we get into the mechanics of how AI consumes data, we need to talk about what kind of AI we’re actually working with.
The term you’ll hear a lot is foundation model.
These are large, general-purpose AI models trained on vast swaths of data—think billions upon billions of pieces of information. They’ve read the internet. Absorbed the classics. Ingested code repositories, encyclopedias, manuals, blogs, customer reviews, Reddit threads, medical journals, and everything in between. Foundation models like ChatGPT, Claude, Gemini, and Llama are built by major AI labs with enormous compute budgets and access to vast training sets. The result? Models with broad, flexible knowledge and the ability to respond to all sorts of queries, even ones they’ve never explicitly seen before.
To understand how these models work—and how you’ll be charged for them—you need to know about tokens.
A token is a unit of language. It’s not quite a word, and not quite a character. Most AI models split up text into these tokens to process input and generate output. For example, the phrase “foundation models are smart” becomes something like: “foundation,” “models,” “are,” “smart.” Each token costs money to process, both in and out. That means longer prompts, longer documents, and longer replies increase your cost.
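If you want to see exactly how a model carves up your text, you don’t have to guess. Here’s a minimal sketch using tiktoken, OpenAI’s open-source tokenizer library (my choice for illustration; other vendors’ tokenizers split text differently, but the idea is the same):

```python
# Counting tokens with tiktoken (pip install tiktoken).
import tiktoken

# cl100k_base is the encoding used by several recent OpenAI models.
enc = tiktoken.get_encoding("cl100k_base")

text = "foundation models are smart"
token_ids = enc.encode(text)

print(token_ids)                 # the integer IDs the model actually sees
print(len(token_ids), "tokens")  # what you're billed for, in and out
print(enc.decode(token_ids))     # round-trips back to the original text
```

A rough rule of thumb for English text: one token is about four characters, or three-quarters of a word. Run your typical prompts through a counter like this before you estimate costs.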
But it’s not just about billing. Tokens define the model’s short-term memory, called the context window. Each model has a limited number of tokens it can “see” at any given time. Once you exceed that limit, earlier parts of the conversation start to fall out of memory. This is why long chats start to lose focus—and why prompts or instruction sets, RAG results, and injected context have to be compact and relevant. The more efficient your language, the smarter your AI becomes.
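To make that concrete, here’s a sketch of the trimming logic most chat applications apply under the hood. The 8,000-token budget is an arbitrary stand-in for whatever your model’s real limit is, and the function name is mine, not any particular library’s:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
CONTEXT_BUDGET = 8_000  # hypothetical limit; check your model's spec

def trim_history(messages: list[str], budget: int = CONTEXT_BUDGET) -> list[str]:
    """Keep the most recent messages that fit in the token budget.

    Everything older simply falls out of the model's view, which is
    why long chats start to lose the thread.
    """
    kept, used = [], 0
    for msg in reversed(messages):       # walk newest to oldest
        cost = len(enc.encode(msg))
        if used + cost > budget:
            break                        # older messages get dropped here
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order
```

In production you’d also reserve part of the budget for the system prompt, the retrieved context, and the model’s reply, which shrinks the space available for history even further.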
But not every task needs a giant model.
If you’re running a chatbot that answers routine FAQs, sorting support tickets, or parsing form submissions, a smaller and faster model will likely serve you better—and at a much lower cost. Foundation models are impressive, but they’re not always the most efficient tool in the toolbox. The art of modern AI isn’t about grabbing the biggest brain in the room. It’s about choosing the right model for the right job—and knowing when to escalate to something more powerful only when the problem truly demands it.
They’re called “foundation” models for a reason: they serve as the base layer on which other, more specialized AI systems are built.
But here’s the catch: These models know a lot about everything, but nothing about you.
They can answer general questions, draft emails, and summarize the history of jazz, but they don’t know how your company operates, what your customers expect, or how your internal systems are structured. That’s your business’s knowledge. Its edge. And that’s what they’re missing.
So when I talk to clients about working with foundation models, I often use a simple analogy:
Think of a foundation model like a shrink-wrapped college grad.
They’ve spent years absorbing general knowledge—history, math, language, computer science, maybe even a few philosophy electives. They’re smart. Broadly informed. But they don’t yet know how you do things. They’ve never been inside your business, they don’t know your workflows, and they haven’t lived through your weird industry quirks.
They’re ready to learn. But the quality of that learning depends entirely on how you teach them.
Some of the best-performing companies in the world are known for their onboarding—how they train employees on day one to not just do the job, but to do it their way. With AI, the same principle applies. But instead of crafting training programs, you’re curating datasets. Instead of a week-long orientation, you’re creating repeatable processes that teach the model how to think and respond like someone inside your organization.
The tools are powerful. But they’re blank on the most important stuff: your data, your culture, your expectations.
That’s where integration comes in. That’s where the real work starts.
So now, with that in mind, let’s pause and break down the major ways these foundation models actually consume and interact with your data:
• Fine-Tuning: Adjusting a general model with domain-specific data. It’s powerful, but expensive and slow.
• Prompt Injection: Feeding data into the model at runtime, via a prompt. Quick, flexible, great for prototypes.
• RAG (Retrieval-Augmented Generation): Dynamically pulling in relevant documents or facts to answer a question. This is where a lot of real-world business AI is headed—and where integration becomes make-or-break.
Let’s clarify something right out of the gate: you’re not picking and choosing one method from a menu. You’re using all of them—maybe not all at once, but certainly over time, across use cases, or layered within a single product. Each of these approaches—fine-tuning, prompt injection, and RAG—has its strengths, and more importantly, its purpose. Prompt injection can be a great place to prototype or test assumptions. RAG lets you pull in fresh, contextual data in real time. Fine-tuning adds deeper understanding over time. Each method puts different pressure on your data infrastructure, your team, and your expectations. But they all share one common requirement: accessible, well-governed data.
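To make the distinction concrete, here’s a deliberately tiny sketch of the RAG pattern: retrieve something relevant, then inject it into the prompt at runtime. Everything in it is a placeholder I made up for illustration (real systems replace the keyword matching with embeddings and a vector database), but the shape is the same:

```python
# A toy sketch of retrieval plus prompt injection, the heart of RAG.
# DOCS, retrieve(), and build_prompt() are hypothetical placeholders.

DOCS = {
    "refund-policy": "Refunds are issued within 14 days of purchase.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
}

def retrieve(question: str, k: int = 1) -> list[str]:
    """Rank documents by crude keyword overlap and return the top k."""
    q_words = set(question.lower().split())
    ranked = sorted(
        DOCS.values(),
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question: str) -> str:
    """Inject the retrieved context into the prompt at runtime."""
    context = "\n".join(retrieve(question))
    return (
        "Answer using ONLY the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

print(build_prompt("How long do refunds take?"))
```

Notice that the model never gets smarter than whatever retrieve() hands it. Swap the toy store for your real document repositories, and the quality of every answer becomes a direct function of how clean, current, and findable that data is.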
And that’s the part where most companies start to sweat.
But before we get deep into integration strategies or data lake architectures, we need to rewind a bit, because the way we talk about prompting itself is already limiting how we think…
That’s just a slice of the chapter—and a small window into the work ahead.
The AI Evolution isn’t about theory or hype. It’s a real-world guide for leaders who want to build smarter orgs, prep their teams, and actually use AI without the hand-waving.
If this hit home, the full book goes deeper with practical frameworks, strategy shifts, and the patterns I’ve seen across startups, enterprises, and everything in between.
📘 Grab your copy of The AI Evolution here.
⭐️ And if you do, leave a review. It means a lot.