Core Concept

What Is a Large Language Model?

A large language model (LLM) is an AI system trained on vast amounts of text data to predict and generate human-like text. LLMs are the technology behind ChatGPT, Claude, Gemini, and Copilot. They are powerful, statistically sophisticated, and fundamentally different from how conventional software works.

Definition

Large Language Model (LLM): a foundation model trained on a very large corpus of text to predict and generate human language. It is capable of conversation, reasoning, code generation, and a wide range of other language-related tasks.

LLMs are the most economically significant AI development of the past five years. Frontier LLMs (GPT-5.5, Claude 4 Opus, Gemini 2.5, etc.) are now used in enterprise productivity, customer service, software engineering, and research. Governance attention has focused on data residency, prompt injection vulnerabilities (OWASP LLM Top 10), training data IP exposure, hallucination, and, for the largest models, systemic risk classification under the EU AI Act.

Source: EU AI Act, Articles 51–55; OWASP LLM Top 10

How LLMs actually work

LLMs are trained by processing enormous amounts of text (books, websites, code, scientific papers) and learning statistical patterns in language. They do not store facts like a database. They learn relationships between words and concepts, then use those patterns to predict what text should come next. This is why they can write fluent prose, answer questions, and summarise documents, but also why they can generate confident-sounding content that is factually wrong. The model predicts what is likely, not what is true, and has no built-in way to verify its own output.
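The idea of predicting the next word from learned patterns can be illustrated with a deliberately tiny sketch. This is not how a production LLM is built (real models use neural networks over tokens, not word-pair counts); it is a toy bigram counter, and the corpus, `train_bigram`, and `predict_next` are all invented for illustration. The corpus is intentionally skewed so that the most frequent continuation is the wrong one.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Invented toy corpus: the wrong answer appears more often than the right one.
corpus = (
    "the capital of australia is sydney . "
    "the capital of australia is sydney . "
    "the capital of australia is canberra ."
)
model = train_bigram(corpus)

# The model returns the statistically most common continuation,
# with no notion of which continuation is factually correct.
print(predict_next(model, "is"))
```

The toy model answers with whatever followed "is" most often in its training text. Nothing in it stores or checks the fact itself, which is the same basic reason a far larger model can produce fluent but wrong output.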

The key distinction from conventional software

Traditional software follows deterministic rules: given input X, it always produces output Y. LLMs are probabilistic: the same input can produce different outputs, and the model has no reliable mechanism to verify factual accuracy. This has fundamental governance implications. You cannot test an LLM the way you test a conventional system. A control that works for a spreadsheet does not work for a language model.
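The contrast can be sketched in a few lines. The example below assumes the common approach of sampling the next token from a probability distribution controlled by a temperature setting; the token list, the scores, and the function names are hypothetical and do not come from any particular model or vendor API.

```python
import math
import random

def deterministic_rule(x):
    """Conventional software: the same input always gives the same output."""
    return x * 2

def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(tokens, scores, temperature, rng):
    """Pick one token at random, weighted by its probability."""
    probs = softmax(scores, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical candidate tokens and scores for a single fixed prompt.
tokens = ["approved", "declined", "pending"]
scores = [2.0, 1.5, 0.5]

rng = random.Random()
outputs = {sample_next_token(tokens, scores, 1.0, rng) for _ in range(200)}

print(deterministic_rule(21), deterministic_rule(21))  # identical every time
print(outputs)  # typically more than one distinct answer for the same input
```

Lowering the temperature towards zero concentrates probability on the top-scoring token, which makes the sampling step more repeatable. It does not make the answer more likely to be correct.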

Governance implications for organisations

Hallucination risk

LLMs generate plausible-sounding but factually wrong content. Output requires human review before any professional or consequential use.

Training data opacity

LLMs are trained on vast amounts of internet data, raising copyright, privacy, and bias concerns. Understanding what data a model was trained on matters for responsible procurement.

Non-determinism

The same input does not always produce the same output. LLMs cannot be tested like conventional software, so governance must account for statistical, not binary, behaviour.
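One way to picture statistical rather than binary testing is to run the same prompt many times and compare the pass rate against an agreed threshold. The sketch below is illustrative only: `stub_llm` is a hypothetical stand-in that simulates a model answering correctly about 90% of the time, and the 85% threshold is an arbitrary assumption, not a recommended value.

```python
import random

def stub_llm(prompt, rng):
    """Hypothetical stand-in for a real model call; right about 90% of the time."""
    return "Canberra" if rng.random() < 0.9 else "Sydney"

def pass_rate(model, prompt, is_acceptable, trials, rng):
    """Run the same prompt repeatedly and return the fraction of acceptable outputs."""
    passes = sum(is_acceptable(model(prompt, rng)) for _ in range(trials))
    return passes / trials

THRESHOLD = 0.85  # assumed acceptance threshold, set by the organisation

rng = random.Random(0)  # seeded so this sketch is repeatable
rate = pass_rate(
    stub_llm,
    "What is the capital of Australia?",
    lambda output: output == "Canberra",
    trials=1000,
    rng=rng,
)
print(f"pass rate {rate:.1%}, meets threshold: {rate >= THRESHOLD}")
```

A single passing run says little about a probabilistic system. A measured rate over many runs, tracked over time and across model versions, is closer to what a control for an LLM needs to capture.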

Privacy exposure

Data entered into an LLM may be retained and used for training. Entering personal or confidential information into consumer LLMs creates Privacy Act risk for Australian organisations.

EU AI Act scope

LLMs trained with more than 10²⁵ floating-point operations (FLOPs) of cumulative compute are presumed to be GPAI models with systemic risk under the EU AI Act, subject to specific obligations from August 2025.

Accountability gap

When an LLM produces output that causes harm, responsibility generally rests with the deploying organisation rather than the model vendor. Professional responsibility does not transfer to the AI.

LLMs and regulatory classification

Under the EU AI Act, LLMs deployed as standalone services are classified as General Purpose AI (GPAI) models. Those trained with more than 10²⁵ FLOPs of compute are considered to present systemic risk, subject to additional obligations including adversarial testing, incident reporting, and transparency. These obligations apply to providers, but deployers also have obligations around appropriate use and human oversight.
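To give a sense of scale for the 10²⁵ threshold, the sketch below uses a common rule of thumb from the scaling-law literature: training compute is roughly 6 floating-point operations per parameter per training token. This heuristic is an assumption used here for illustration; it is not the calculation method set out in the EU AI Act, and the two model sizes are hypothetical rather than figures for any real system.

```python
SYSTEMIC_RISK_THRESHOLD_FLOPS = 1e25  # threshold cited in the text above

def estimate_training_flops(parameters, training_tokens):
    """Rough heuristic: about 6 floating-point operations per parameter per token."""
    return 6 * parameters * training_tokens

def exceeds_threshold(flops):
    """True if the estimate is above the 10^25 threshold."""
    return flops > SYSTEMIC_RISK_THRESHOLD_FLOPS

# Hypothetical models, chosen only to show the order of magnitude.
small = estimate_training_flops(parameters=7e9, training_tokens=2e12)
large = estimate_training_flops(parameters=1e12, training_tokens=2e13)

for name, flops in [("small", small), ("large", large)]:
    print(f"{name}: {flops:.1e} FLOPs, above threshold: {exceeds_threshold(flops)}")
```

Note that FLOPs here means a total count of operations used in training, not a speed in operations per second. An estimate like this is only a first-pass screen; the provider's own compute records are what matter for classification.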

In Australia, LLMs are not specifically regulated, but the Privacy Act, AI6 framework, and sector-specific requirements all create obligations that apply to how organisations deploy them. An LLM processing personal information is subject to the Australian Privacy Principles regardless of where the model is hosted.
