Browser-Safe AI Systems, Part 23: Secure Architecture Principles for Browser-Safe AI
Series: Browser-Safe AI Systems, Part 23 of 32.
This post continues the Browser-Safe AI Systems series by focusing on secure architecture principles for browser-safe ai. The goal is to keep the discussion useful for analysts who investigate alerts, red teams who validate controls, developers who build the pipeline, and technical stakeholders who own risk decisions.
| Series navigation: Previous: Part 22 | Series index | Next: Part 24 |
23. Secure Architecture Principles for Browser-Safe AI
Browser-safe AI systems should be designed as controlled security pipelines.
They should not be designed as black-box AI decision engines.
The difference matters.
A black-box decision engine asks the model what to do.
A controlled security pipeline uses AI as one component inside a larger architecture of evidence collection, minimization, redaction, classification, policy enforcement, logging, and review.
The core principle is:
AI may classify risk, but policy must decide trust.
23.1 AI as Classifier, Not Authority
The model should help interpret hostile browser content.
It should not independently decide whether the user, page, file, or workflow is trusted.
A safe design separates:
- evidence collection
- model classification
- policy decision
- enforcement action
- analyst review
- feedback handling
The model can return a verdict, confidence level, reason code, or structured classification.
Policy code should decide whether to allow, warn, isolate, block, restrict, or escalate.
23.2 Separate Trusted Instruction From Untrusted Content
Browser content is hostile input.
The system must clearly separate:
- trusted system instructions
- policy definitions
- tenant configuration
- user context
- page content
- DOM text
- screenshots
- OCR output
- metadata
- model response
Untrusted page content should never be able to redefine the model task, policy rules, or downstream action.
23.3 Minimize and Redact Before AI Processing
The system should collect only what it needs.
Before model submission, the pipeline should minimize and redact:
- credentials
- tokens
- cookies
- reset links
- OAuth codes
- personal information
- customer identifiers
- internal URLs when not needed
- document contents when not needed
- hidden sensitive form values
Model prompts and responses should be treated as sensitive records.
23.4 Use Structured Model Output
Free-form output is useful for analyst explanation but dangerous for enforcement.
Security-relevant output should be schema-constrained.
Useful fields include:
- verdict
- confidence
- reason codes
- detected workflow
- credential fields present
- QR code present
- brand mismatch detected
- DOM and screenshot mismatch detected
- recommended action
- evidence references
Unsupported fields should be rejected.
Invalid output should fail safely.
23.5 Keep Policy Deterministic
Policy should be explicit, testable, and reviewable.
Policy decisions should account for:
- user group
- device posture
- network context
- page risk
- workflow type
- data sensitivity
- credential presence
- file movement
- SaaS context
- prior behavior
- exception state
The model provides signal.
Policy provides authority.
23.6 Preserve Replayable Evidence
A browser-safe AI system should preserve enough evidence to explain decisions.
Useful evidence includes:
- URL
- timestamp
- user and device context
- rendered screenshot
- DOM snapshot
- OCR output
- QR target
- redirect chain
- iframe tree
- inspected artifacts
- model verdict
- policy decision
- enforcement action
- reason codes
- redaction status
Evidence should be redacted, access-controlled, and retained deliberately.
23.7 Fail Safely
The system should define behavior for:
- model timeout
- missing screenshot
- missing DOM
- invalid model output
- redaction failure
- policy lookup failure
- conflicting evidence
- delayed content
- oversized input
- malformed content
- uncertainty
High-risk workflows should not silently allow when evidence is missing or confidence is low.
23.8 Make Decisions Explainable
Explainability means the system can explain operationally:
- what was inspected
- what was detected
- what policy applied
- what action was taken
- why the action happened
- what evidence supports the action
- whether uncertainty existed
- whether redaction occurred
- whether an exception influenced the result
23.9 Design for Red-Team Regression
Browser-safe AI systems should include regression tests for:
- hidden DOM text
- prompt injection
- screenshot deception
- DOM and render mismatch
- QR handoff
- delayed content
- homograph spoofing
- oversized DOM
- malformed metadata
- seeded sensitive data leakage
- invalid model output
- fail-open behavior
- exception abuse
A control that cannot be tested cannot be trusted.
23.10 Defensive Principle
Browser-safe AI is valuable when it is bounded.
The safest rule is:
Use AI to interpret hostile browser evidence, but keep trust, policy, enforcement, retention, and feedback under explicit architectural control.