Model Engine — Operators Guide¶
Internal documentation for service owners and deployment engineers installing, operating, and debugging model engine in customer environments.
Not the end-user docs? The user-facing Python client documentation is at the root of this site.
Contents¶
| Document | What it covers | When to use it |
|---|---|---|
| Architecture | System overview, component deep-dives, lifecycle flows, autoscaling | Before your first deployment; when debugging unfamiliar failures |
| Helm Values Reference | Every configurable value, organized by concern, with high-risk callouts | During installation and upgrades |
| Smoke Tests | Post-deploy validation checklist (Tier A: CPU, Tier B: GPU+LLM) | After every helm install or helm upgrade |
| Cloud Support Matrix | Per-cloud config reference, behavior differences, image mirroring | When deploying to a specific cloud |
Quick Links¶
- Installing for the first time? → Start with Architecture for the mental model, then Helm Values for configuration.
- Validating a deployment? → Go straight to Smoke Tests.
- Deploying to Azure / GCP / on-prem? → See Cloud Support Matrix for what to configure differently.
- Something broken? → Troubleshooting Guide (Confluence).
Contributing¶
These docs live at docs/internal/ in the repo. Update them when you change behavior or configuration — the PR template includes a reminder.
Rule: If a PR author would need to update this doc when changing the code → it belongs here. Operational notes from specific deployments belong in Confluence.