LLMs, Agents, and WTF Is Everything?
LLM 101
One of the challenges in trying to get oriented to the modern AI landscape is mapping the vocabulary to meaning. This is particularly challenging because a lot of the people talking about this topic are using the same terms to mean different things. Since we’re going to talk a lot about these things, I figured it’d be helpful to share the clearest explanations I can for some of the big terms and how the products you see and hear about map to them.
Let’s start with Artificial Intelligence (AI). The term goes way back to the 1950s and has referred to a pretty broad range of things. What’s common to all of the things under that umbrella is that they use technology to do something that wasn’t explicitly spelled out in code in advance.
Regular computer code is deterministic: what it will do given certain inputs is determined when the code is written. Stuff that gets the “AI” label shifts that, to one degree or another, toward the machine doing something that’s determined when the code is run instead. There are things that still sit in a gray area at the edges, but that definition works well enough until you’re deep in the weeds on the topic.
Now, that’s NOT how a lot of people are using the term. They’re often using it as a substitute for something that usually has a more specific term, but the shorthand is, well, handy, so people use it. Which means when you hear it used, you can pretty safely assume it’s a stand-in for some other term, and getting clarity on what they mean would be useful.
Large Language Models (LLMs) are one of the most common things people mean now when they say “AI”. LLMs are statistical models built on a huge pile of text/language. They basically map which words appear close together and how often.
One of the big inflection points in this whole recent uptick in AI was the realization that such a map, because humans use words to represent ideas and objects, gives a really strong way to look at the start of a text document and finish it much the way a person probably would have finished it.
And that’s what the models themselves, like GPT-4, Claude 3.7 Sonnet, and Llama 3.3 (all of the llama references come from the LLM in LLaMA), are and are doing. They add on to a text document based on probabilities.
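If it helps to make that “map of which words appear close together” concrete, here’s a toy sketch in Python. Real LLMs are neural networks trained on enormous amounts of text, nothing like this simple word-pair tally, but the basic move is the same: turn a pile of text into “what probably comes next” statistics.

```python
# Toy sketch of the "which word tends to come next" idea.
# Real LLMs use neural networks over tokens, not word-pair counts,
# but the spirit is the same: text in, next-word probabilities out.
from collections import Counter, defaultdict

training_text = "the cat sat on the mat . the dog sat on the rug ."
pairs = defaultdict(Counter)

words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    pairs[current_word][next_word] += 1

def most_likely_next(word):
    """Return the word that most often followed `word` in the training text."""
    return pairs[word].most_common(1)[0][0]

print(most_likely_next("sat"))  # prints "on", because "sat" was always followed by "on"
```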
However, a model on its own isn’t really usable by people, so these models get wrapped up in a bunch of other software, stacked and combined with other models, and generally bundled up into things that CAN be used by people and by other machines. These stacks of software wrapped around LLMs tend to surface either as user interfaces (UIs), which humans interact with as apps and web apps, or as application programming interfaces (APIs) that code, developers, and computers use.
ChatGPT and similar chatbots are UI applications built around an LLM. When you type something into the box, there’s a hidden document underneath that your text gets added to as something like:
“User: Can you explain LLMs to me?”
The LLM looks at that document and adds:
“Assistant: Sure, they’re statistical models…”
The application translates back and forth between that actual document and the display you are interacting with. But it also actually puts together a document that’s larger than you’d probably expect before your first addition to it. Those extra pieces get called a “system prompt” or “custom instructions” and a variety of other things depending on why they were added to the document.
Those instructions set the LLM up so that the way it adds on to the document makes sense as a chat transcript that helps you (the User) out. And they include the information the LLM needs to add the *right* amount of text to the document so that it feels like a conversation where it takes turns with you. Before stuff like that was included, these models would just add on your response too, and then “their” part, and keep extending the document back and forth on their own.
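Here’s a rough sketch of what a chat app is doing with that hidden document. All of the names are made up, and `call_llm` is a stand-in for whatever model API the product actually uses; the shape, a hidden system prompt plus a growing transcript, is the part that matters.

```python
# Rough sketch of the hidden document a chat app maintains.
# `call_llm` is a placeholder for the real model API, not an actual library call.

SYSTEM_PROMPT = (
    "You are a helpful assistant. Reply as 'Assistant:' and then stop, "
    "so the user can take their turn."
)

transcript = SYSTEM_PROMPT + "\n\n"

def send_message(user_text):
    global transcript
    # Your typed text gets appended to the hidden document...
    transcript += f"User: {user_text}\n"
    # ...the whole document goes to the model, which adds the next chunk...
    reply = call_llm(transcript)          # hypothetical model call
    transcript += f"Assistant: {reply}\n"
    # ...and only the reply is shown to you in the chat window.
    return reply
```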
As that document grows, it eventually reaches a size where the model can’t read it, or can’t read it and still add to it. That size is called the “context window”, and it’s why long conversations in many of the products “forget” early parts of the conversation: those products were/are just cutting off the earliest bits.
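The crude “just forget the beginning” version of handling that looks something like this sketch (again, made-up names and a made-up limit):

```python
# Crude context-window management: once the transcript is too big,
# drop the earliest turns. This is why long chats "forget" the start.
MAX_CHARS = 50_000  # stand-in for the model's real limit, which is counted in tokens

def trim_transcript(turns):
    # `turns` is a list of strings, oldest first
    while sum(len(t) for t in turns) > MAX_CHARS:
        turns.pop(0)  # throw away the oldest turn
    return turns
```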
Lots of techniques have evolved to address that problem. If the software edits or summarizes the “old” parts of the document before adding your most recent addition, it can keep the context window manageable. If it goes and retrieves facts and information from stored previous conversations, or from external documents, it’s doing some form of Retrieval Augmented Generation (RAG), though in the last year or so that term has come to mean one very specific way of doing it rather than the general pattern.
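A bare-bones version of that retrieval idea might look like the sketch below. Real RAG systems typically use embeddings and vector search rather than the crude keyword match here, and `call_llm` is again a placeholder:

```python
# Bare-bones retrieval sketch: fetch notes that look relevant, put them
# into the document, then let the model answer. Real RAG systems use
# embeddings/vector search instead of this crude keyword match.
stored_notes = [
    "Order #1234 shipped on March 3rd.",
    "Refunds are processed within five business days.",
]

def answer_with_retrieval(question):
    words = question.lower().split()
    relevant = [note for note in stored_notes if any(w in note.lower() for w in words)]
    prompt = (
        "Use these notes to answer the question:\n"
        + "\n".join(relevant)
        + f"\n\nUser: {question}\nAssistant:"
    )
    return call_llm(prompt)  # hypothetical model call
```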
So, then WTF is an “agent”?
This is another one where a lot of people use a definition that isn’t rooted in how things actually work and is more about being a “thought leader” talking about the future.
What the agent coding frameworks pretty much all use as a functional definition is that they introduce two things into the LLM chat transcript document:
A loop, where the “Assistant” adds a normal-sized chunk for a “turn” but is then given a chance to take another turn of its own. The system instructions set up agreed-upon ways to indicate that the loop should pause and wait for you to interact, and the software controlling the loop has rules for how many times it’s allowed to run on its own, among other things.
Tools, which are a way for it to run code outside the model. Again, the system instructions provide a way for it to add text to the document in a specific format that the containing code looks for and translates into running that code with the inputs the LLM specified. This means agent tool use is actually the LLM writing out a very specific request that a tool be run, rather than the model being able to run anything itself. (There’s a rough sketch of how the loop and tools fit together just below.)
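Here’s that sketch. All of the names are made up, `call_llm` stands in for the model API, and real frameworks use structured message formats rather than simple text markers like these, but the loop-plus-tools shape is the point:

```python
# Rough sketch of an agent: the loop plus tools. Everything here is
# illustrative; `call_llm` is a placeholder for the model API.
import json
import os

TOOLS = {
    "read_file": lambda path: open(path).read(),
    "list_dir": lambda path: "\n".join(os.listdir(path)),
}

MAX_TURNS = 10  # how long it may run before a human has to step back in

def run_agent(transcript):
    for _ in range(MAX_TURNS):
        reply = call_llm(transcript)          # the Assistant takes a turn
        transcript += reply + "\n"

        if "WAIT_FOR_USER" in reply:          # agreed-upon signal to pause the loop
            break

        if reply.startswith("TOOL_CALL:"):
            # The model only *wrote a request*; this surrounding code is
            # what actually runs the tool and pastes the result back in.
            request = json.loads(reply[len("TOOL_CALL:"):])
            result = TOOLS[request["name"]](request["input"])
            transcript += f"Tool result: {result}\n"

    return transcript
```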
At this point, almost all of the chatbot products like ChatGPT and Claude *are* agents/agentic by that definition. When those products generate code off to the side, they’re using a tool to modify a different set of documents; when they search the web, or display “thinking” before answering, those are all extra loops and tool calls with specific instructions.
Why aren’t those typically called “agents” in the marketing or UI of those products?
Because the implication of where loops and tools lead is the word that most people jump to immediately in their definition: autonomy.
You can probably see how, if you removed the limit on loops and gave sufficient tool access, as well as ways to manage the context window, the loop could go on forever. And if you stated a goal and made the only exit condition the LLM declaring that the goal had been achieved, it could take a shot at looping until it got there. Hence “autonomy”.
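In terms of the earlier agent-loop sketch, that “full autonomy” version is basically just swapping the turn limit for a goal check (again, made-up names):

```python
# "Full autonomy" variant of the loop: no turn limit, and the only exit
# is the model itself declaring the goal achieved.
def run_until_done(transcript, goal):
    transcript += f"Goal: {goal}\n"
    while True:
        reply = call_llm(transcript)   # hypothetical model call
        transcript += reply + "\n"
        if "GOAL_ACHIEVED" in reply:   # the model decides when it's done
            return transcript
        # ...tool handling as in the earlier sketch...
```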
Except an awful lot of the experiments people ran to do just that ended with things going off the rails. They made messes, ran up huge bills in LLM usage, did things like changing the goals or cheating, and almost universally failed at what people *meant* by the stated goals.
What followed was a lot of what modern LLM agent frameworks facilitate: smaller scope, more guardrails, human approvals on tool use, better instructions, better context management, etc., as well as more LLM “workflows” that are less open-ended loops and more pre-determined sequences of LLM calls and tool invocations, all of it steered by “regular” software.
When these more limited agents get called agents, that conflicts with what people imagine when they hear “autonomy”, but they are actually much more useful for real-world tasks today. Many of the workflows DO have small agents given very specific goals as tasks, and each runs a mini loop with tool use to do its step. Chaining those together into workflows is how a lot of business automation with LLMs is happening.
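A sketch of that “chain of small agents” shape, with every name made up for illustration:

```python
# Workflow sketch: "regular" code decides the sequence of steps, and
# each step is a small agent with a narrow goal and a short tool-using
# loop. `run_small_agent` is hypothetical, as is the ticket example.
def handle_support_ticket(ticket_text):
    category = run_small_agent(goal="Classify this ticket", context=ticket_text)
    draft = run_small_agent(goal=f"Draft a reply to a {category} ticket", context=ticket_text)
    review = run_small_agent(goal="Check this draft against policy", context=draft)
    return review  # often with a human approval step before anything goes out
```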
Since Substack limits how long these can be, I’m going to stop here. What other terms around AI would you like clarity on? Things about these explanations that weren’t clear enough? Additional questions these explanations raised for you?

