---
title: "Architecture"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Architecture}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = TRUE)
```

```{css, echo = FALSE, eval = TRUE}
.llmshieldr-info-box {
  border-left: 4px solid #2f80ed;
  background: #f3f8ff;
  padding: 1rem 1.15rem;
  margin: 1.5rem 0;
  border-radius: 0.35rem;
}

.llmshieldr-info-box h2,
.llmshieldr-info-box h3,
.llmshieldr-info-box h4 {
  margin-top: 0;
}

.llmshieldr-info-box p:last-child,
.llmshieldr-info-box ul:last-child,
.llmshieldr-info-box ol:last-child {
  margin-bottom: 0;
}
```

This article is a compact, maintainer-oriented map of the package. It explains
how safety decisions are produced, so maintainers do not need a separate design
document at the repository root.

## Mental Model

```text
policy() creates rules, thresholds, controls, and optional rate guards
scan_prompt() checks user input before it reaches a model
scan_context() checks retrieved rows before prompt assembly
scan_conversation() checks role-preserving chat histories
scan_tool_call() and scan_tool_output() guard tool boundaries
scan_stream() scans streamed output with rolling context
scan_output() checks model text before display, storage, or downstream use
secure_chat() orchestrates scanning, chat execution, output scanning, and audit
write_audit_log() persists the end-to-end evidence trail
```

The package keeps the safety path inspectable. Every scanner result is based on
explicit findings. Every finding has a rule id, severity, action, optional OWASP
LLM category, and optional character span. Scanner reports resolve to `allow`,
`redact`, or `block`; orchestration results may also use `refuse` or
`escalate` when policy controls map a block to those outcomes.
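
As a rough illustration of the finding shape described above, the sketch below
builds one finding as a plain list. The helper `new_finding()` and its field
names are hypothetical and chosen for this example only; the package's real
constructors live in `R/rules.R` and may differ.

```r
# Hypothetical helper: field names are illustrative, not the package schema.
new_finding <- function(rule_id, severity, action, owasp = NULL, span = NULL) {
  severity <- match.arg(severity, c("low", "medium", "high", "critical"))
  action <- match.arg(action, c("allow", "redact", "block"))
  if (!is.null(span)) {
    # A span is assumed to be a c(start, end) pair of character positions.
    stopifnot(is.numeric(span), length(span) == 2, span[1] <= span[2])
  }
  list(
    rule_id = rule_id,
    severity = severity,
    action = action,
    owasp = owasp, # optional OWASP LLM category
    span = span    # optional character span
  )
}

finding <- new_finding(
  rule_id = "example.ignore_instructions",
  severity = "high",
  action = "block",
  owasp = "LLM01",
  span = c(14, 42)
)
str(finding)
```

Because every field is explicit, a report can be audited by reading its
findings directly rather than by re-running a scanner.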

## Design Goals

- Keep the first user path simple: choose a built-in policy name and call a
  scanner.
- Keep internals inspectable: policies are lists of explicit rules, not a
  hidden classifier.
- Support local-first safety workflows through deterministic rules, NLP checks,
  and optional Ollama review.
- Stay model-agnostic: any `ellmer` chat, object with `$chat()`, or plain R
  function can be used.
- Separate scanning from orchestration so prompt, context, output, tool, and
  stream checks can be used independently.
- Preserve auditability through scanner reports, final decisions, token
  estimates, and risk summaries.
- Make built-in controls extensible through custom policy objects and custom
  rules.
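
The model-agnostic goal can be pictured with a small adapter. This is a minimal
sketch in base R, not the package's implementation: `call_model()` is a
hypothetical name, and it only assumes that a model is either a plain function
or an object exposing a `$chat()` method.

```r
# Hypothetical adapter; `call_model()` is not a package function.
call_model <- function(model, prompt) {
  # Plain R function: call it directly with the prompt.
  if (is.function(model)) {
    return(model(prompt))
  }
  # List- or environment-like object (for example an R6 chat object)
  # that exposes a `$chat()` method.
  if ((is.list(model) || is.environment(model)) && is.function(model$chat)) {
    return(model$chat(prompt))
  }
  stop("`model` must be a function or an object with a `$chat()` method.")
}

# Two stand-in models used only for demonstration.
echo_fn <- function(prompt) paste("echo:", prompt)
echo_obj <- list(chat = function(prompt) paste("obj:", prompt))

call_model(echo_fn, "hi")
call_model(echo_obj, "hi")
```

Keeping the dispatch this small is what lets scanning stay independent of any
particular model client.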

## Package Layers

1. Rule, report, audit, and result constructors in `R/rules.R`.
2. Built-in policy assembly and policy mutation helpers in `R/policy.R`.
3. Prompt scanning, normalization, scoring, redaction, and reviewer parsing in
   `R/scan_prompt.R`.
4. Context scanning and RAG anomaly/source checks in `R/scan_context.R`.
5. Output scanning in `R/scan_output.R`.
6. Chat orchestration and token accounting in `R/secure_chat.R`.
7. Optional surfaces: conversations, tools, streams, scanner options,
   redaction strategies, audit writing, HTTP reviewers, Ollama, and trust
   boundaries.

## Object Model

```text
shieldr_rule
    id             stable rule identifier
    pattern        regex pattern, or NULL
    fn             R predicate function, or NULL
    owasp          OWASP LLM category
    severity       low, medium, high, or critical
    action         allow, redact, or block
    description    human-readable explanation

shieldr_policy
    name             policy identifier stored in reports
    rules            list of shieldr_rule objects
    thresholds       redact_at and block_at numeric cutoffs
    rate_guard       optional shieldr_rate_guard environment
    trusted_sources  optional allowlist used by scan_context()
    controls         secure_chat() block/refuse/escalate/drop behavior

shieldr_report
    action        scanner action
    text_clean    normalized and possibly redacted text
    findings      list of finding objects
    risk_score    deterministic severity score
    policy        policy name
    checks        rules, nlp, llm, or both
    metadata      surface-specific operational metadata
```
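
To make the rule fields concrete, here is a stand-in constructor and matcher
written in base R. `sketch_rule()` and `rule_matches()` are hypothetical names
used only in this vignette; they mirror the fields listed above but make no
claim about the signature or validation of the real `shieldr_rule()`.

```r
# Stand-in for illustration only; not the package's shieldr_rule().
sketch_rule <- function(id, pattern = NULL, fn = NULL, owasp = NA_character_,
                        severity = "medium", action = "redact",
                        description = "") {
  stopifnot(is.character(id), length(id) == 1)
  # Assumption: a rule needs at least one way to detect something.
  if (is.null(pattern) && is.null(fn)) {
    stop("Supply `pattern`, `fn`, or both.")
  }
  structure(
    list(
      id = id,
      pattern = pattern,
      fn = fn,
      owasp = owasp,
      severity = match.arg(severity, c("low", "medium", "high", "critical")),
      action = match.arg(action, c("allow", "redact", "block")),
      description = description
    ),
    class = "sketch_rule"
  )
}

# TRUE when the regex matches or the predicate returns TRUE for one string.
rule_matches <- function(rule, text) {
  hit_pattern <- !is.null(rule$pattern) &&
    grepl(rule$pattern, text, perl = TRUE)
  hit_fn <- !is.null(rule$fn) && isTRUE(rule$fn(text))
  hit_pattern || hit_fn
}

email_rule <- sketch_rule(
  id = "example.email",
  pattern = "[[:alnum:]._-]+@[[:alnum:].-]+\\.[a-z]{2,}",
  description = "Flags strings that look like email addresses."
)
long_rule <- sketch_rule(
  id = "example.long_input",
  fn = function(text) nchar(text) > 20,
  severity = "low",
  action = "allow"
)

rule_matches(email_rule, "contact me at a@example.org")
rule_matches(long_rule, "short text")
```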

## Scoring and Actions

Severity weights are:

| Severity | Score |
| --- | ---: |
| `low` | 0.1 |
| `medium` | 0.3 |
| `high` | 0.6 |
| `critical` | 1.0 |

Findings are deduplicated before scoring. Findings whose spans overlap and that
share the same source, OWASP category, and action are scored once, at the
weight of the strongest finding, rather than summed. Distinct findings still
accumulate, and the total score is capped at `1.0`. Synthetic scanner or
context findings are tracked separately and capped before they are added to
normal rule evidence.
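
The deduplication and capping steps can be sketched as follows. This is a
simplified illustration, not the package's scoring code: `score_findings()` and
the data frame columns (`source`, `owasp`, `action`, `severity`, `start`,
`end`) are assumptions made for the example, and the separate handling of
synthetic findings is left out.

```r
severity_weight <- c(low = 0.1, medium = 0.3, high = 0.6, critical = 1.0)

# Hypothetical scorer: one row per finding, spans given as start/end.
score_findings <- function(findings) {
  if (nrow(findings) == 0) {
    return(0)
  }
  findings$weight <- unname(severity_weight[as.character(findings$severity)])
  key <- paste(findings$source, findings$owasp, findings$action, sep = "\r")
  total <- 0
  for (group in split(findings, key)) {
    group <- group[order(group$start), ]
    cluster_end <- -Inf
    cluster_max <- 0
    for (i in seq_len(nrow(group))) {
      if (group$start[i] > cluster_end) {
        # Non-overlapping span: bank the previous cluster, start a new one.
        total <- total + cluster_max
        cluster_max <- group$weight[i]
        cluster_end <- group$end[i]
      } else {
        # Overlapping span: keep only the strongest weight in the cluster.
        cluster_max <- max(cluster_max, group$weight[i])
        cluster_end <- max(cluster_end, group$end[i])
      }
    }
    total <- total + cluster_max
  }
  min(total, 1) # cap the total score at 1.0
}

findings <- data.frame(
  source = c("rules", "rules", "rules"),
  owasp = c("LLM01", "LLM01", "LLM01"),
  action = c("redact", "redact", "redact"),
  severity = c("medium", "high", "low"),
  start = c(1, 5, 40),
  end = c(10, 12, 50)
)
score_findings(findings)
```

In this example the first two spans overlap, so they contribute only the
`high` weight (0.6); the third span is distinct and adds 0.1, for a total of
about 0.7.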

Actions are resolved conservatively:

```text
if any finding is critical:
    block
else if any finding action is block:
    block
else if risk_score > block_at:
    block
else if any finding action is redact:
    redact
else if risk_score >= redact_at:
    redact
else:
    allow
```

The strict greater-than comparison for `block_at` keeps a single high-severity
redaction finding from escalating to a block solely because its score equals
the threshold; `redact_at`, by contrast, uses an inclusive comparison.
Explicit `block` rules and critical findings still block immediately.
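
The resolution order above translates almost line for line into R. The
function below is a sketch under stated assumptions: `resolve_action()` is a
hypothetical name, and the thresholds `redact_at = 0.3` and `block_at = 0.6`
are illustrative values, not documented package defaults.

```r
# Hypothetical resolver mirroring the pseudocode; not the package internals.
resolve_action <- function(severities, actions, risk_score,
                           redact_at, block_at) {
  if ("critical" %in% severities) return("block")
  if ("block" %in% actions) return("block")
  if (risk_score > block_at) return("block")    # strict comparison
  if ("redact" %in% actions) return("redact")
  if (risk_score >= redact_at) return("redact") # inclusive comparison
  "allow"
}

# One high-severity redact finding scoring exactly block_at stays a redact.
resolve_action("high", "redact", 0.6, redact_at = 0.3, block_at = 0.6)

# Extra evidence pushes the score past block_at, so the result is a block.
resolve_action(c("high", "low"), c("redact", "redact"), 0.7,
               redact_at = 0.3, block_at = 0.6)

# A score equal to redact_at is enough to redact.
resolve_action("medium", "allow", 0.3, redact_at = 0.3, block_at = 0.6)
```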

## Extension Points

- Add deterministic regex or function rules with `shieldr_rule()` and
  `add_rule()`.
- Configure prompt, context, output, conversation, stream, and tool surfaces
  independently.
- Use `scanner_options()` for local scanners such as encoded payloads, URL host
  policy, language allowlists, topic bans, and token limits.
- Use `redaction_strategy()` for replace, mask, hash, drop, and keep behavior.
- Use `policy_controls()` to choose refuse, escalate, drop, or keep-redacted
  outcomes after scanner blocks.
- Wrap local or remote reviewer models with `ollama_reviewer()` or
  `remote_reviewer()`.

::: {.llmshieldr-info-box}
## Release Hygiene

Before release, regenerate documentation, run the test suite, run
`R CMD check --as-cran`, review examples that require external services, and
update `NEWS.md` and `cran-comments.md`.
:::
