Hanabi Developer Hub

Architecture

How the host, capability bridge, worker sandbox, and Developer Portal fit together.

[Diagram: Module UI (sandboxed iframe, opaque origin) → Capability bridge (1. declared, 2. granted, 3. scoped, 4. REST call) → Platform REST API. Every call is declared, granted and scoped before it touches the platform.]

Status: design of record for the v5 module-platform unification. This document is the target architecture. Sections tagged [live] are implemented today; [building] is in active development; [planned] is designed but not yet built. The roadmap at the end tracks per-phase status.

1. Why this document exists

Today the word "module" means two unrelated things in the codebase:

  1. Built-in modules (Certificate Builder, Image Converter, Field Ops, Office editors, Calculator, Calendar, …) are hardcoded Svelte panels in +page.svelte backed by bespoke Python services and bespoke routes. They reach straight into platform internals (job_service, vfs_service, FileRecord).
  2. Packaged modules (the Developer Portal upload flow) are a hanabi.module.json + ui/ + worker/ + tests/ zip. They are validated by the CLI, the SDK, and the backend — and then cannot run: the uploaded bytes are discarded, there is no host that loads a module UI, and the capability SDK (HANABI_* postMessage) has no listener in the shell.

The goal of this architecture is to collapse those two into one model: every module — built-in or third-party — is described by the same manifest, launched by the same host, and talks to the platform through the same capability API, gated by the same permission model. The only difference between a trusted built-in and an untrusted upload is which capabilities are granted, not how it is wired.

2. The unified module model

A module has two halves, both optional-by-presence and both declared in the manifest:

text
module package (hanabi.module.json)
  UI half    entrypoints.ui      (HTML/CSS/JS, runs in a sandboxed iframe)
  Worker     entrypoints.worker  (server-side logic, runs in the trusted backend)
      |                                        |
      | postMessage (capability bridge)        | job runner (worker registry)
      v                                        v
Hanabi platform (the shell + API)
  Window manager · Capability bridge · Permission grants · VFS · Jobs · Settings · SSE

  • The UI half is plain web assets. It runs in a sandboxed iframe with an opaque origin (no cookies, no parent DOM access). It can only affect the outside world by sending capability messages to the host. This half is fully portable and safe to run for any module, including untrusted uploads.
  • The worker half is server-side logic that needs elevated or native access the browser can't have: Office COM automation, Pillow/rawpy, openpyxl, filesystem, network. Workers run in the trusted backend, so arbitrary third-party workers are a privileged capability (see §2.1 trust tiers).

This split is the crux of "unify everything onto packages" given that several built-ins depend on native Windows/Office automation: we unify the UI rail and the contract for everyone, and treat the worker rail as a platform-registered, trust-gated capability rather than something we sandbox.

2.1 Trust tiers

| Tier | UI | Worker | Granted to | Example |
| --- | --- | --- | --- | --- |
| Sandboxed | sandboxed iframe, capability bridge only | none (UI-only) | any uploaded/published module | hello-world-webpage, Calculator |
| Trusted-worker | same sandboxed iframe + bridge | platform-registered server worker, or first-party server routes reached via moduleRequest/moduleStream | first-party / admin-blessed modules | Certificate Builder, Image Converter, Field Ops Planner |
| Privileged-host | a built-in Svelte panel instead of an iframe (deep shell integration only) | optional worker | first-party shell surfaces only | File Explorer, Settings, Module Store, Developer Portal |

As of Phase 4 this destination is reached for every app module: all eight first-party app modules (Calculator, Simple Excel, Calendar, Image Converter, Word Editor, Excel Editor, Field Ops Planner, Certificate Builder) ship as packaged sandboxed iframes. Tier "privileged-host" has shrunk to just the genuinely shell-integrated surfaces (File Explorer, Settings, the Store, the Developer Portal itself), which stay built-in panels by design.

3. The manifest is the single contract [live]

hanabi.module.json (schema 5.0) is already well-specified and validated in three places — keep that, extend it carefully. Full field reference lives in manifest-reference.md. The contract owns:

  • Identity: id (lowercase slug), name, version, summary.
  • Entrypoints: entrypoints.ui (required), entrypoints.worker (optional).
  • Permissions: an allowlist the platform enforces (§5).
  • Filesystem: root under Modules/<Name> plus role-tagged folders (input/output/workspace/cache/config). This is the module's only view of the VFS.
  • Dependencies: runtimes (browser/python/node/office), python/node packages, Office apps, named platform services. This is how a module declares requirements (§8).
  • UI window contract: sizing, resize, aspect-lock, multi-window.

Single source of truth: the canonical validator is module_manifest.py. The CLI (module-cli) and SDK (module-sdk) intentionally mirror it so authors get local feedback; a contract test must keep the three in lock-step (see the contract-test parity item in roadmap Phase 11).
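
To make the contract concrete, here is a minimal sketch of the kind of check all three validators would need to agree on. It is not the code in module_manifest.py: the function name `check_manifest`, the slug regex, the `ui/index.html` path, and the example values are assumptions chosen for illustration. Only the field names come from the list above.

```python
import re

# Assumed slug shape: lowercase letters/digits separated by single hyphens.
SLUG_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def check_manifest(manifest: dict) -> list[str]:
    """Return human-readable problems; an empty list means this sketch accepts it."""
    problems = []
    # Identity fields listed in the contract.
    for field in ("id", "name", "version", "summary"):
        if not manifest.get(field):
            problems.append(f"missing identity field: {field}")
    module_id = manifest.get("id")
    if module_id and not (isinstance(module_id, str) and SLUG_RE.match(module_id)):
        problems.append("id must be a lowercase slug")
    # entrypoints.ui is required; entrypoints.worker is optional.
    entrypoints = manifest.get("entrypoints") or {}
    if not entrypoints.get("ui"):
        problems.append("entrypoints.ui is required")
    # Permissions are an allowlist, modelled here as a list of strings.
    if not isinstance(manifest.get("permissions", []), list):
        problems.append("permissions must be a list (an allowlist)")
    return problems


# Illustrative manifest; every value here is made up for the example.
example = {
    "id": "hello-world-webpage",
    "name": "Hello World",
    "version": "1.0.0",
    "summary": "Minimal UI-only module",
    "entrypoints": {"ui": "ui/index.html"},
    "permissions": ["files.read"],
}
```

Returning a list of problems instead of raising on the first one lets a CLI show an author every issue in a single pass, which fits the "local feedback" role described above.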

3.1 Planned manifest additions [planned]

These are additive and backward-compatible (schema stays 5.0 until a breaking change forces 5.1):

  • runtime.kind: "sandboxed-ui" | "trusted-worker" | "privileged-host" — lets the host pick the launch path without hardcoding ids.
  • entrypoints.worker_handler: dotted path the worker registry resolves (replaces the if module_id == … dispatch in write_module_output).
  • capabilities.requests: structured capability requests beyond coarse permissions (e.g. { "files.read": { "roles": ["input"] } }) so grants can be scoped to specific folder roles.
  • platform.min_version / platform.api: the platform contract version the module was built against, for forward-compat gating (§9).
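
As a rough illustration of how a structured capabilities.requests entry could narrow a grant to folder roles, the sketch below evaluates the example shape shown above. The function name `role_allowed` and the deny-by-default behaviour are assumptions; this section is [planned], so the real semantics may differ.

```python
def role_allowed(requests: dict, capability: str, folder_role: str) -> bool:
    """True only if `capability` was requested and its scope lists `folder_role`.

    Assumption: a capability with no "roles" list grants no folder role at all.
    """
    scope = requests.get(capability)
    if scope is None:
        return False  # the capability was never requested
    return folder_role in scope.get("roles", [])


# Shape copied from the manifest example: files.read limited to input folders.
requests = {"files.read": {"roles": ["input"]}}
```

With this shape, `role_allowed(requests, "files.read", "input")` is true, while the same capability against an output folder is refused.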

4. Package lifecycle

text
author → validate → pack → upload → store → extract → install → launch → run    → update
CLI      CLI        CLI    portal   API     API       API       shell    bridge   portal

| Step | Today | Target |
| --- | --- | --- |
| author/validate/pack | [live] hanabi module init/validate/pack | unchanged |
| upload | [live] POST /developer/modules/{id}/package validates the zip | unchanged |
| store | ❌ bytes discarded after validation | [building] persist package as a FileRecord-style blob under storage_root keyed by module_id+version |
| extract | ❌ none | [building] unzip to a per-version served directory; reject unsafe paths on extraction too |
| install | [live] per-user ModuleInstall row; folders created | also materialize granted permissions (§5) |
| launch | ❌ blank window for packaged modules | [live] generic ModuleHost iframe loads entrypoints.ui |
| run | bespoke per-module routes / inline jobs | [live] capability bridge (files/settings); worker registry next for jobs |
| update | [live] new version row + admin review | add package GC of superseded versions |

Package storage design [building]:

  • Persisted blob key: modules/{module_id}/{version}/package.zip under settings.storage_root (same root storage_path() already uses).
  • Extracted assets: modules/{module_id}/{version}/ui/… served read-only.
  • Extraction re-applies the same path-safety and secret/executable checks the validator runs (validate_package), because validation-at-upload and extraction-at-publish are different trust moments.
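
The storage layout and the extraction-time path check can be sketched as follows. This is an illustration under assumptions, not the platform's validate_package: the helper names are hypothetical, and only path safety is covered (the secret/executable checks mentioned above are omitted).

```python
import io
import zipfile
from pathlib import PurePosixPath


def package_key(module_id: str, version: str) -> str:
    """Blob key relative to storage_root, following the layout listed above."""
    return f"modules/{module_id}/{version}/package.zip"


def is_safe_member(name: str) -> bool:
    """Reject absolute paths, drive letters, backslashes, and '..' traversal."""
    if not name or name.startswith("/") or "\\" in name or ":" in name:
        return False
    return ".." not in PurePosixPath(name).parts


def safe_members(package_bytes: bytes) -> list[str]:
    """List archive members, raising ValueError on the first unsafe path."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        names = archive.namelist()
    for name in names:
        if not is_safe_member(name):
            raise ValueError(f"unsafe path in package: {name!r}")
    return names
```

Checking every member name before writing anything to disk means a single bad entry rejects the whole package, so a partially extracted version directory is never served.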

5. Capability API & permission enforcement [live]

The module UI is isolated; everything it wants from the platform goes through a postMessage bridge. The SDK (module-sdk) defines the message shapes, and the host-side bridge (module-bridge.ts) answers them with permission + scope enforcement. The full message catalogue and SDK method list live in capability-api.md.

5.1 Protocol

Module → host (requests), host → module (responses/events), all JSON envelopes:

text
{ "v": 1, "type": "HANABI_READY" }
{ "v": 1, "type": "HANABI_PICK_FILE", "rid": "...", "accepts": ["xlsx"] }
{ "v": 1, "type": "HANABI_READ_FILE", "rid": "...", "fileId": "..." }
{ "v": 1, "type": "HANABI_OPEN_MEDIA", "rid": "...", "fileId": "..." }   // → { url, mime_type, size_bytes } range-capable stream
{ "v": 1, "type": "HANABI_WRITE_FILE", "rid": "...", "folderRole": "output", "name": "...", "bytesB64": "..." }
{ "v": 1, "type": "HANABI_READ_OFFICE", "rid": "...", "fileId": "..." }   // → parsed { kind, sheets | paragraphs }
{ "v": 1, "type": "HANABI_SAVE_OFFICE", "rid": "...", "fileId": "...", "payload": { "kind": "spreadsheet", "sheets": [] } }   // overwrite in place
{ "v": 1, "type": "HANABI_CREATE_JOB", "rid": "...", "options": {...}, "inputRefs": [...] }
{ "v": 1, "type": "HANABI_JOB_PROGRESS", "jobId": "...", "progressPct": 50 }   // host → module event
{ "v": 1, "type": "HANABI_JOB_OUTPUT", "jobId": "...", "seq": 1, "chunk": "…", "encoding": "text" }   // streamed worker output
{ "v": 1, "type": "HANABI_MODULE_REQUEST", "rid": "...", "method": "POST", "path": "generate-stream", "body": {}, "stream": true }   // first-party own route
{ "v": 1, "type": "HANABI_MODULE_STREAM", "rid": "...", "event": {} }   // host → module: one NDJSON line from a streaming route
{ "v": 1, "type": "HANABI_GET_SETTING", "rid": "...", "key": "self.theme" }
{ "v": 1, "type": "HANABI_NOTIFY", "rid": "...", "message": "Done!", "tone": "success" }
{ "v": 1, "type": "HANABI_SEND_TO_MODULE", "rid": "...", "moduleId": "other-mod", "fileId": "..." }

Every request carries a correlation id (rid); the host replies with { "type": "<type>_RESULT", "rid": "...", "ok": true|false, "data"|"error": … }.
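
The request/reply correlation can be sketched in a few lines. The real SDK and bridge are browser code; this Python version only models the envelope shapes. The helper names, the use of a UUID for rid, and including "v" on the reply (taken from the §9 statement that every envelope carries it) are assumptions.

```python
import uuid

PROTOCOL_VERSION = 1


def make_request(msg_type: str, **fields) -> dict:
    """Build a module -> host request envelope with a fresh correlation id."""
    return {"v": PROTOCOL_VERSION, "type": msg_type, "rid": uuid.uuid4().hex, **fields}


def make_result(request: dict, ok: bool, payload) -> dict:
    """Build the host reply: <type>_RESULT, same rid, data on success, error on failure."""
    key = "data" if ok else "error"
    return {
        "v": PROTOCOL_VERSION,
        "type": f"{request['type']}_RESULT",
        "rid": request["rid"],
        "ok": ok,
        key: payload,
    }


class PendingRequests:
    """Module-side table that matches replies to outstanding requests by rid."""

    def __init__(self) -> None:
        self._pending: dict[str, dict] = {}

    def track(self, request: dict) -> None:
        self._pending[request["rid"]] = request

    def resolve(self, reply: dict):
        """Return the original request for a reply, or None if the rid is unknown."""
        return self._pending.pop(reply.get("rid"), None)
```

Removing the entry on resolve means a duplicated or replayed reply with the same rid is ignored rather than delivered twice.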

5.2 Enforcement points (the security spine)

A capability is granted only if all three hold:

  1. The manifest declares the matching permission (files.read etc.).
  2. The installing user granted it (recorded at install time; default-grant for first-party, explicit consent UI for third-party — see limitations.md).
  3. The host scopes the operation: file reads/writes are clamped to the module's declared filesystem folders for that user; a module can never name a fileId/path outside its own Modules/<Name>/… subtree.

The bridge runs in the trusted parent (the shell), calls the existing authenticated, CSRF-protected API on the user's behalf, and never exposes raw storage keys or cross-module/cross-user data. This reuses the boundaries in security-model.md verbatim — the bridge is just a new, permission-checked caller of the same API.
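
The three conditions compose into a single gate. The sketch below is a simplified model, not module-bridge.ts: it uses paths where the real bridge works with file ids, ignores the consent-gated broad-access permission, and the `ModuleGrant` and `authorize` names are hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class ModuleGrant:
    """Hypothetical per-user view of one installed module."""
    module_name: str
    declared: set = field(default_factory=set)  # permissions in the manifest
    granted: set = field(default_factory=set)   # permissions granted at install

    @property
    def root(self) -> PurePosixPath:
        # The module's only view of the VFS: Modules/<Name>/...
        return PurePosixPath("Modules") / self.module_name


def authorize(grant: ModuleGrant, permission: str, path: str) -> tuple[bool, str]:
    """Apply the three checks in order: declared, granted, scoped."""
    if permission not in grant.declared:
        return False, "not declared in manifest"
    if permission not in grant.granted:
        return False, "not granted by user"
    target = PurePosixPath(path)
    # Compare whole path components so Modules/Calc does not match Modules/Calculator.
    if ".." in target.parts or grant.root not in (target, *target.parents):
        return False, "outside module subtree"
    return True, "ok"
```

Running the checks in this order gives the module author the most actionable error first: a missing manifest declaration is reported before a missing grant or a scoping failure.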

5.3 Isolation model

  • UI runs in <iframe sandbox="allow-scripts"> without allow-same-origin, giving it an opaque origin: no access to shell cookies, localStorage, or the parent DOM. Communication is postMessage-only.
  • The host validates event.source === iframe.contentWindow and ignores messages from anything else (the working WordEditorPanel precedent already does this for its bespoke protocol).
  • Module assets are served read-only and, in production, from a distinct origin (HANABI_MODULE_ORIGIN, e.g. modules.hanabimatsuri.net, Phase 11) for defense in depth. First-party module assets are served publicly (they are platform files with no user data, and the opaque iframe can't attach the session cookie to its own sub-resource requests); drafts and third-party packages stay auth-gated.
  • A per-module Content-Security-Policy is injected on the served index.html (no-store, so a policy change propagates on the next load). With a distinct origin, the policy names that origin in its asset directives, because the iframe's opaque origin makes a bare 'self' match nothing for sub-resources. It allows blob: in frame-src and worker-src (for a module-built document preview, or a bundled web worker such as PDF.js rendering a PDF preview to a <canvas>) while keeping object-src 'none'. It locks connect-src to 'self' plus any approved services origins, so a module reaches the platform only through the bridge, never by fetching its own origin.
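
A sketch of how such a policy string might be assembled is shown below. The set of asset directives (script-src, style-src, img-src), the default-src 'none' baseline, and the function name are assumptions; only the blob:, object-src, and connect-src rules are taken from the description above.

```python
def build_module_csp(module_origin: str, approved_services: list[str]) -> str:
    """Compose a per-module CSP header value from the rules described above."""
    # A bare 'self' matches nothing under an opaque origin, so assets name the origin.
    asset_sources = module_origin
    directives = {
        "default-src": "'none'",   # assumption: deny anything not listed
        "script-src": asset_sources,
        "style-src": asset_sources,
        "img-src": asset_sources,
        "frame-src": "blob:",      # module-built document previews
        "worker-src": "blob:",     # bundled web workers
        "object-src": "'none'",
        "connect-src": " ".join(["'self'", *approved_services]),
    }
    return "; ".join(f"{name} {value}" for name, value in directives.items())


# Hypothetical origins, for illustration only.
csp = build_module_csp("https://modules.example.test", ["https://api.example.test"])
```

Building the header from a dictionary keeps one directive per key, so adding an approved service origin only touches connect-src and cannot loosen another directive by accident.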

6. The module host [live]

A single ModuleHost.svelte component replaces the per-id {#if} switch in +page.svelte:

text
openApp(catalogItem)
  if runtime.kind == 'privileged-host' → render the legacy Svelte panel (transitional)
  else                                 → <ModuleHost item={catalogItem} />
                                           - loads entrypoints.ui in a sandboxed iframe
                                           - attaches the capability bridge
                                           - applies ui.window sizing from the manifest

The window manager already reads manifest.ui.window for sizing/aspect-lock, so host windows inherit that for free. Built-in panels keep working during migration; each migrated module flips from the panel branch to the host branch.

7. The worker model & registry [live]

The hardcoded if job.module_id == "certificate-builder": … dispatch is gone. module_workers.py holds a registry; job_service registers the first-party workers and dispatches through it:

python
register_worker("certificate-builder", lambda ctx: generate_certificate_job_output(...))
register_worker("image-converter",     lambda ctx: generate_image_conversion_job_outputs(...))
register_worker("simple-excel",        lambda ctx: generate_simple_excel_job_output(...))
# a module with no registered worker cannot run a server job

  • POST /runtime/modules/{module_id}/jobs is the generic contract for all modules and now recognizes packaged modules (via get_module_catalog_item_from_db); bespoke certificate-builder routes (/preview, /generate-stream, …) are additional module-specific capabilities, not the norm.
  • Workers receive a `WorkerContext` (user, job, module, options, plus log/progress/output_folder helpers) — the seam through which raw Session/Settings access can later be narrowed for less-trusted workers. Today first-party workers still read db/settings directly.
  • Third-party workers are not executed (running arbitrary server Python is a non-goal until a sandboxed worker runtime exists — §10). A packaged module with no registered worker gets a clear error from the job endpoint, not a 404.
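
The registry pattern described above can be sketched on its own. This is not the code in module_workers.py: the `WorkerContext` here is a reduced stand-in with two fields, and `dispatch` and `NoWorkerRegistered` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class WorkerContext:
    """Reduced stand-in; the real context also carries user, job, and helpers."""
    module_id: str
    options: dict = field(default_factory=dict)


class NoWorkerRegistered(Exception):
    """Raised when a job targets a module that has no registered worker."""


_WORKERS: dict[str, Callable[[WorkerContext], object]] = {}


def register_worker(module_id: str, handler: Callable[[WorkerContext], object]) -> None:
    """Associate a module id with the callable that runs its server job."""
    _WORKERS[module_id] = handler


def dispatch(ctx: WorkerContext):
    """Look up the worker for ctx.module_id and run it, or fail with a clear error."""
    handler = _WORKERS.get(ctx.module_id)
    if handler is None:
        raise NoWorkerRegistered(f"module {ctx.module_id!r} has no registered worker")
    return handler(ctx)
```

A dedicated exception type lets the job endpoint translate "no worker" into the clear error mentioned above instead of a generic failure.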

8. How a module declares requirements & interacts with the platform

This is the "if a module has requirements, or needs to interact with Hanabi, there's a way for it" requirement, made concrete:

| The module needs… | It declares… | The platform provides… |
| --- | --- | --- |
| to read user files | permissions: ["files.read"] + an input folder | HANABI_PICK_FILE / HANABI_READ_FILE, scoped to its folders |
| to play/stream media | permissions: ["files.read"] | HANABI_OPEN_MEDIA → a signed, range-capable URL for <video>/<audio>/<img> (seekable, no bytes over the bridge) |
| to save outputs | permissions: ["files.write"] + an output folder | HANABI_WRITE_FILE, lands in Modules/<Name>/Exports |
| to organize its own folders | permissions: ["files.manage"] | HANABI_FS_* (fs.list/stat/makeDir/rename/move/copy/remove), scoped to its folders; deletes go to Trash |
| to integrate with the Desktop | consent-gated desktop.* | HANABI_DESKTOP_*: read the Desktop (desktop.read), place its files there (desktop.workspace), pin a launcher (desktop.shortcuts), or set wallpaper/accent/theme (desktop.personalize). An extensible namespace; the shell owns the actual desktop state |
| to notify the user | permissions: ["notifications"] | HANABI_NOTIFY → a desktop toast + notification-centre entry (title/tone clamped) |
| to open a file type | desktop.file_associations: ["md"] | appears in File Explorer's Open with for .md; launching delivers the file as openedFile, readable for that launch |
| to hand work to another module | permissions: ["modules.use"] | HANABI_LIST_MODULES + HANABI_SEND_TO_MODULE → opens the target module (visibly) with your file as its openedFile |
| heavy/native processing | entrypoints.worker + dependencies.python/office | a registered worker + the job runner + progress events |
| Microsoft Office | dependencies.office: { excel: true, … } | host checks Office is installed; surfaces a clear error if not |
| a Python package | dependencies.python: ["pillow>=10"] | declared & shown at review; install/runtime provisioning (planned) |
| an external service | dependencies.services: ["…"] + (planned) CSP connect-src | reviewed; network allowlisted to declared services |
| its own settings | permissions: ["settings.self"] | HANABI_GET_SETTING/SET_SETTING, namespaced to the module |
| persistent workspace | a workspace folder (and/or, like Field Ops, a state blob) | per-user storage scoped to the module |

The rule: a requirement that isn't declared in the manifest doesn't exist. The host refuses undeclared capabilities, and the review process inspects the manifest's declared requirements. New requirement types are added by extending the manifest contract + the bridge + the validator together (never ad-hoc).
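
One way to picture the rule is a lookup from message type to the permission it needs, with anything unknown refused by default. The mapping below is assembled from the rows of the table in this section and covers only a subset of messages; the table name and the `undeclared` helper are hypothetical.

```python
# Message type -> permission the manifest must declare (subset, from the table above).
REQUIRED_PERMISSION = {
    "HANABI_PICK_FILE": "files.read",
    "HANABI_READ_FILE": "files.read",
    "HANABI_OPEN_MEDIA": "files.read",
    "HANABI_WRITE_FILE": "files.write",
    "HANABI_NOTIFY": "notifications",
    "HANABI_LIST_MODULES": "modules.use",
    "HANABI_SEND_TO_MODULE": "modules.use",
    "HANABI_GET_SETTING": "settings.self",
}


def undeclared(manifest: dict, message_types: list[str]) -> list[str]:
    """Message types the module would use without declaring the matching permission.

    Unknown message types map to None, which is never declared, so they are refused.
    """
    declared = set(manifest.get("permissions", []))
    return [t for t in message_types if REQUIRED_PERMISSION.get(t) not in declared]
```

A review tool could run the same lookup over the message types a module's UI actually sends and flag any that the manifest does not cover.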

9. Versioning & forward-compatibility

  • Manifest schema is 5.0; additive fields don't bump it, breaking ones do.
  • Capability protocol carries "v" on every envelope; the host supports a range and degrades gracefully (unknown message types get an explicit UNSUPPORTED error, never silent drop).
  • Platform API version (platform.api, planned) lets the host warn/refuse when a module targets a newer contract than the running platform.
  • Installed modules pin a version; updates flow through the existing draft→version→review pipeline.
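
The "explicit error, never silent drop" behaviour can be sketched as a triage step that runs before any handler. The supported range, the short list of known types, the `triage` name, and the UNSUPPORTED_VERSION code are assumptions; only the UNSUPPORTED error for unknown message types comes from the text above.

```python
SUPPORTED_VERSIONS = range(1, 2)  # assumption: this host accepts only v1
KNOWN_TYPES = {"HANABI_READY", "HANABI_READ_FILE", "HANABI_WRITE_FILE", "HANABI_NOTIFY"}


def triage(envelope: dict):
    """Return None if the envelope can be handled, else an explicit error reply."""

    def error(code: str) -> dict:
        return {
            "type": f"{envelope.get('type', 'UNKNOWN')}_RESULT",
            "rid": envelope.get("rid"),
            "ok": False,
            "error": code,
        }

    if envelope.get("v") not in SUPPORTED_VERSIONS:
        return error("UNSUPPORTED_VERSION")  # hypothetical code for a version mismatch
    if envelope.get("type") not in KNOWN_TYPES:
        return error("UNSUPPORTED")
    return None
```

Because the error reply echoes the request's rid, a module built against a newer protocol gets a rejected call it can handle instead of a request that never resolves.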

10. Explicit non-goals (for now)

  • Executing untrusted third-party Python workers outside the sandbox. The worker sandbox foundation exists (Phase 9a) and runs an approved worker as a pure bytes → bytes transformation. The container backend that provides the real isolation boundary (Phase 9b) is marked done in the roadmap but ships off by default. Untrusted code never runs in-process with platform access; that stays a hard non-goal. See worker-sandbox-design.md and limitations.md.
  • A public, internet-facing module registry. The portal stays seeded-user/admin-reviewed (see security-model.md).
  • Replacing genuinely shell-integrated surfaces (File Explorer, Settings) with iframes.

11. Product direction: third-party ecosystem (decided)

The platform is steering toward outside developers publishing modules to your users — a real ecosystem, but guardrailed. Decided model:

  • Publishing (self-serve, with review): a new module still goes through manual admin review. Subsequent version updates are handled by an automated check (correct folder structure, required files present, manifest validation, safety scans) — no human in the loop for versions that pass.
  • Outbound network (approval-gated): a module that needs network calls declares it; network access requires approval per module, then its connect-src is allowlisted to the declared services. Default = no network.
  • Third-party server workers (approval-gated, sandboxed): running third-party backend code requires approval, then runs in a resource-limited sandbox with access only to shared resources, prioritized like VMs (CPU/memory/time quotas, no host filesystem/secrets). First-party workers stay privileged.
  • Broad filesystem access (consent-gated): built — a user grants files.read.all at install; never crosses into other modules' folders.
  • Heavier capabilities (built, Phase 12): long-running jobs with live progress, media streaming, desktop integration, inter-module / shared data, plus a structured data store, a background task queue + scheduler, native media compute, broad egress, and a persistent service supervisor — each off by default, declared, admin-granted, quota'd, and audited (see roadmap Phase 12).

The throughline: every elevated thing is opt-in, declared, reviewed/approved, and scoped — never ambient.

12. Roadmap

| Phase | Scope | Status |
| --- | --- | --- |
| 0 | Architecture (this doc) + developer docs + references | done |
| 1 | Package persistence + extraction + serving | done |
| 2 | Generic `ModuleHost` iframe + capability bridge + permission enforcement | done |
| 3 | Worker registry + WorkerContext seam; packaged-module job recognition | done |
| 4 | Svelte → module bundler (sandbox-safe single file) + migrate every first-party app module: Calculator (UI-only), Simple Excel (UI + worker), Calendar (durable settings), Image Converter (campaign UI + Pillow worker + bundled art), Word Editor (self-contained .docx engine), Excel Editor (in-place readOffice/saveOffice), Field Ops Planner (own server routes via moduleRequest), Certificate Builder (streaming moduleStream + a PDF.js <canvas> preview; PowerPoint COM kept server-side). All eight packaged; only shell surfaces stay built-in. | complete |
| 5 | Broad filesystem access (elevated perms + per-user grants + bridge enforcement + no size limit) | done (foundation) |
| 6 | Broad-access UX: install consent prompt + host file-open dialog (pickUserFile) + fallback prompt-on-use (examples/modules/broad-reader) | done |
| 7 | Self-serve publishing guardrails: new-module manual review + automated version checks (structure/files/safety); version updates to a published module auto-approve on passing; failures keep the prior version live | done |
| 8 | Outbound network: network.fetch perm + http(s) origins in dependencies.services → served CSP connect-src allowlist (approved modules only); origin changes re-trigger review; module stays live on its approved origins meanwhile | done |
| 9 | Third-party worker sandbox (approval-gated, resource-limited shared compute). Container-first; pluggable WorkerRuntime. 9a pure bytes→bytes worker contract + invocation protocol + harness + runtime abstraction + worker.execute approval (+ worker-code-change re-review) + tier model + job wiring + local dev backend: done, off by default. 9b container backend: done (docker run boundary: --network none / --cap-drop ALL / --security-opt no-new-privileges / --read-only + tmpfs / memory+cpu+cpu-time+pids limits / code mounted ro; Dockerfile + image-build scripts; needs Docker on the host, off by default; see worker-sandbox-docker-setup.md). 9c worker pool: done (global concurrency cap + per-module fairness + priority-ordered queue with busy timeout). 9d host-mediated egress: done (container stays --network none; approved ctx.fetch shipped over stdio + performed host-side against the allowlist; no-redirect, private-IP block, size/count caps). 9f hardening: done (audit logging of runs + fetches, output file-count cap, base-image digest pin, seccomp-profile hook). 9e WASM deferred (future Docker-free backend). | complete |
| 10 | Capabilities: jobs + live progress ✅ (progress-worker rail: streaming endpoint → bridge → HANABI_JOB_PROGRESS; SDK createJob(opts, refs, onProgress)) → media streaming: inbound playback ✅ (HANABI_OPEN_MEDIA → host mints a signed, per-file, time-boxed ticket → GET /runtime/media/{ticket} serves with HTTP range/206; the sandboxed iframe streams & seeks <video>/<audio>/<img> directly, no bytes over the bridge) + worker→UI output streaming ✅ (a progress worker yields {output, encoding} chunks → NDJSON output events → bridge HANABI_JOB_OUTPUT; live logs / progressive / incremental results, interleaved with progress) → desktop integration ✅ (notifications HANABI_NOTIFY · file associations desktop.file_associations → "Open with" + openedFile launch grant · taskbar badge HANABI_SET_BADGE · start-menu jump-lists ✅ desktop.jump_list → indented quick-action rows under the module in All programs → launches it with the chosen action id as launchAction in context) → inter-module / shared data ✅ (HANABI_LIST_MODULES + HANABI_SEND_TO_MODULE: hand a scoped file to another installed module, which opens visibly with it as its openedFile) → persistent module KV ✅ (server-backed settings.self: a durable, cross-device per-user store strictly scoped to (user, module), value-capped; GET/PUT /runtime/modules/{id}/kv/{key}, with a local write-through cache) | complete |
| 11 | Hardening: package GC ✅ (admin sweep POST /developer/admin/packages/gc removes stored packages with no owning draft/version; busy-safe, dry_run) · contract-test parity ✅ (a test keeps the bridge's capability handlers and capability-api.md in lockstep) · served-module response headers ✅ (Referrer-Policy: no-referrer + a locked-down Permissions-Policy). Separate serving origin + strict per-module CSP ✅ built (gated by HANABI_MODULE_ORIGIN): set it to a distinct origin (e.g. modules.hanabimatsuri.net) and module iframes load from there, get a per-module CSP that names that origin in its asset directives (the iframe's opaque origin makes a bare 'self' match nothing for sub-resources), allows blob: in frame-src (module-built previews) with object-src 'none', and keeps connect-src strict ('self' + approved services). First-party assets are served publicly (no user data); the session cookie is domain-scoped so it reaches the module origin for the parent-initiated navigation. The bridge posts replies to '*' (an opaque origin can't be addressed by a real targetOrigin), but only ever to this iframe's contentWindow. Empty = same-origin fallback (looser CSP). Activation needs only DNS for the host → this machine (vite allowedHosts is already open); GET /runtime/client-config hands the origin to the client. | complete |
| 12 | Capability expansion + governance: open the platform to "heavy" modules for a small trusted group, with every powerful capability off by default, granted per module, quota'd, and audited. G (governance): a capability registry (single source of truth; 24 caps across 4 tiers: baseline / consent / admin / admin-HIGH) → per-module grant + quota store on the draft → `enforce_capability` runtime choke-point + audit → registry-driven re-review of escalations → admin grant endpoint (G3). Heavy compute + client: bigger worker.execute tiers (A1), native-tools image worker.native with ffmpeg/imagemagick/libvips/ghostscript (A2), client.* runtime flags + WASM CSP (F/A5), storage.read (C2). Data: per-module structured data store data.store (C1). Background: task queue jobs.queue, a background thread-pool runner (B1), + scheduler jobs.schedule, APScheduler over a ModuleSchedule source of truth (B2). Egress: broad fetch network.fetch.broad, with admin policy + per-job byte budget + rate limit, still host-mediated/no-LAN (D). Services: persistent service supervisor worker.service, a long-lived container with restart-backoff / crash-loop breaker / idle-stop / host ceiling, off by default (B3). | complete (B3 lifecycle; the in-container service egress channel + A3 streaming I/O + a G3 review-UI panel are documented follow-ups) |

Each phase keeps the app shippable: built-in panels keep working until the module they back is migrated and verified.

13. Where to read next