LLM Classification Endpoint
A single LLM inference call where the application assembles a prompt, sends it to the model, and consumes the response directly - no tools, no loop, no agent behavior.
Capabilities
- Single inference call
- Structured output constraints
- Guardrails (input/output classifiers)
- Context engineering (prompt assembly)
Trust analysis
This is the simplest AI integration topology. The application developer controls the full context boundary: what enters the prompt and how the output is consumed. All intelligence comes from the model's weights and the quality of the assembled context. There is no tool access, no persistent state, and no ability to take actions beyond generating text.
The prompt is the only input surface, and the output goes directly to the consuming application or user. When the context includes untrusted input (user-supplied text, retrieved documents, third-party data), that input becomes the primary vector for prompt injection. Structured output constraints limit the response format, reducing the range of harmful outputs but not eliminating hallucination exploitation or guardrail bypass risks.
This is the baseline trust model that all more complex systems inherit. Every additional capability (multi-turn conversation, tools, retrieval, agent loops) adds trust surfaces on top of this foundation.
Interaction effects
Minimal - this is the atomic unit. No capabilities interact because there is only one capability (text generation from a prompt). The trust surface is contained entirely within the prompt/response boundary.
Threats
| Threat | Relevance | Note |
|---|---|---|
| Prompt injection | Primary | Untrusted input in assembled context overrides system instructions |
| Hallucination exploitation | Standard | Incorrect classifications, fabricated extractions |
| Guardrail bypass | Standard | Circumventing output format or content restrictions |
| System prompt extraction | Standard | Revealing instructions instead of producing structured output |
| User manipulation | Standard | Classification labels treated as ground truth by downstream systems |
| Misaligned model behaviors | Standard | Systematically biased classifications from sycophancy or shortcut-taking |
| Training data poisoning | Standard | Systematic misclassification of specific input patterns |
Examples
- A content moderation classifier that labels user-submitted text as safe or unsafe.
- A summarization endpoint that condenses a document into a brief summary.
- A translation service that converts text between languages in a single call.
- An extraction endpoint that pulls structured data from unstructured text.