# Commercial Platform Integration Analysis

**Status:** Draft v0.1 — for internal review
**Scope:** How six major AI-inference redistributors would (or wouldn't) integrate with OMLA, plus a critical callout on whether the hoster-fork economics survive contact with reality.
**TL;DR:** Two of six platforms are plausible early adopters. The hoster-fork design **needs revision** — see the explicit callout at the bottom.

---

## 1. Comparison Matrix

| Platform | Current license posture | Best integration surface | Friction (1–5) | Hoster-fork fits? | Likelihood of adoption |
|---|---|---|---|---|---|
| **Civitai** | Per-model license selector, active enforcement, Buzz economy already exists | **Registry lookup + upload-time SHA check** | 2 | Partially — Civitai already skims, would expect *additional* fee | Medium-high |
| **OpenRouter** | Pass-through routing to providers; no license surfacing | **Inference webhook** (per-request metering exists) | 3 | No — OpenRouter's 5% is on top of provider price; hoster carve-out breaks that model | Medium |
| **Hugging Face** | License metadata field in README.md; no enforcement | **Registry lookup + upload-time SHA check** | 4 | N/A — HF is a weights host, not an inference hoster (Inference Endpoints aside) | Low-medium |
| **Replicate** | License shown on model page; compliance pushed to user via ToS | **Inference webhook** | 3 | Yes — Replicate already pays creators a share; hoster carve-out is recognisable economics | High |
| **Fireworks** | No license surfacing; treats open-weights as commodity inputs | **Inference webhook** | 4 | No — Fireworks' margin is the inference markup itself; carving out 30% destroys unit economics on Llama-class models | Low |
| **Together AI** | Same as Fireworks — commodity inference provider | **Inference webhook** | 4 | No — same unit-economics problem as Fireworks | Low |
| **Modal / RunPod** *(bonus)* | Compute IaaS — agnostic to what weights run on them | **None natural** (customer is the hoster, not the platform) | 5 | N/A | Very low |
| **Groq / Cerebras** *(bonus)* | Hardware-differentiated inference; curated open-weight catalog | **Inference webhook** | 4 | No — same as Fireworks/Together | Low |
| **Anyscale / Baseten** *(bonus)* | Bring-your-own-model deployment | **None natural** — deployer is legally the hoster | 5 | N/A | Very low |

---

## 2. Platforms You Didn't List That Matter

- **Modal, RunPod, Baseten, Anyscale** — compute-as-a-service or bring-your-own-model deployment. The deployer is the legal hoster, not the platform. Out of scope.
- **Groq, Cerebras, SambaNova** — hardware-differentiated inference with curated open-weight catalogs. Economically identical to Fireworks for OMLA purposes.
- **Novita, DeepInfra, Lepton** — Fireworks/Together clones; same economics.
- **Poe, Perplexity** — consumer wrappers; they're commercial *users*, not hosters. Compliance targets, not integration partners.
- **Featherless, Arli** — fine-tune-focused small-model hosts. Exactly the crowd that'd care most about royalty routing (community LoRA heritage); worth seeding as first movers.
- **Cloudflare Workers AI** — the interesting omission. Catalog-style open-weight inference with enterprise compliance tooling and legal appetite for licensing paperwork Fireworks lacks.

---

## 3. Per-Platform Analysis

### 3.1 Civitai

**A. Current licensing model.** The only platform here with fine-grained per-model license surfacing. Creators pick commercial-use / derivatives / attribution flags at upload; the result appears on the model page; Civitai actively removes models that violate upstream licenses (Stability AI policy changes drove platform-wide cleanups in 2024–25). The Buzz point economy with creator payouts means a royalty-flow mental model already exists.

**B. Where OMLA slots in.** Registry lookup at upload time is the obvious surface: when a user uploads a LoRA, hash the safetensors, hit OMLA, and if it's a known OMLA-licensed base or derivative, auto-populate the license field and link to the `omla1…` address. The inference webhook is *less* relevant because Civitai's on-site generator is a small fraction of how the models get used — the real commercial use happens off-platform.

**C. Integration shape.**
```http
# Upload-time hash check
POST https://api.omla-ai.org/verify
Content-Type: application/json

{ "sha256": "c0ffee…", "uploader_fingerprint": "civitai:user/1234" }

# Response
{ "status": "registered",
  "omla_address": "omla1qxy…",
  "parent_lineage": ["omla1abc…"],
  "license_terms": { "non_commercial": "free", "commercial_royalty_pct": 30 } }
```

**D. Friction: 2.** Civitai has an upload pipeline, already hashes files for dedup, and already has a license UI. Adding an OMLA lookup is a cron-job-grade change.

**E. Does hoster-fork fit?** Partially. Civitai's on-site generator *is* inference, and they already skim Buzz spend. A carve-out from the creator's 30% would feel like OMLA is reaching into revenue Civitai thinks it owns. They accept only if OMLA's share stacks *on top of* their existing take — user pays X Buzz, Civitai keeps its margin, OMLA's 30% is routed from the portion the creator would otherwise receive. That actually works because Civitai's creator payouts are already carved from user spend — OMLA just sits between user spend and creator payout. **Hoster-fork compatible with a trivial reframe.**
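The reframe is easiest to see as arithmetic. A minimal sketch with illustrative numbers (a 50% Civitai platform margin and the 30% OMLA royalty are placeholders, not Civitai's real figures):

```python
def split_buzz_spend(user_spend: float, civitai_margin_pct: float = 50.0,
                     omla_royalty_pct: float = 30.0) -> dict:
    """Route OMLA's royalty out of the creator-allocated slice, not Civitai's margin.

    Percentages are illustrative, not real Civitai economics.
    """
    civitai_take = user_spend * civitai_margin_pct / 100
    creator_gross = user_spend - civitai_take   # what the creator would have received
    omla_royalty = creator_gross * omla_royalty_pct / 100
    creator_net = creator_gross - omla_royalty
    return {"civitai": civitai_take, "omla": omla_royalty, "creator": creator_net}

# 1000 Buzz of user spend: Civitai's margin is untouched; OMLA's 30% comes
# out of the 500 Buzz the creator would otherwise have been paid.
print(split_buzz_spend(1000))  # {'civitai': 500.0, 'omla': 150.0, 'creator': 350.0}
```

The point of the sketch: Civitai's take is computed first and never reduced, which is exactly the reframe that makes the carve-out palatable to them.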

**F. Reference snippet.**
```ts
// civitai: upload pipeline, after safetensors hash
const { sha256 } = await hashSafetensors(file);
const omla = await fetch("https://api.omla-ai.org/verify", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ sha256 }),
}).then(r => r.json());
if (omla.status === "registered") {
  model.license = "OMLA-1.0";
  model.royaltyAddress = omla.omla_address;
  model.requiresCommercialReport = true; // block monetized endpoints until compliant
}
```

---

### 3.2 OpenRouter

**A. Current licensing model.** None. OpenRouter is a router — it shows provider pricing, passes traffic through, and takes ~5% off the top at credit purchase time. The model license page on OpenRouter is, charitably, a link to the upstream model card. Enforcement: zero. They don't have a concept of a model creator being a separate economic actor from the model host.

**B. Where OMLA slots in.** Inference webhook only. OpenRouter meters every request for provider reconciliation — forking a signed per-model summary to OMLA quarterly is small work. Hoster reimbursement would flow to the *upstream provider* OpenRouter routed to (Fireworks, Together, Lepton), not to OpenRouter itself.

**C. Integration shape.**
```http
POST https://api.omla-ai.org/hoster/report
Authorization: Signature ed25519=…
Content-Type: application/json

{ "quarter": "2026Q1",
  "hoster_id": "omla1openrouter…",
  "reports": [
    { "model_hash": "c0ffee…", "requests": 1234567, "total_run_cost_usd": 4219.30 }
  ] }
```

**D. Friction: 3.** The metering exists. The political problem is: OpenRouter has no contractual relationship with model *creators*, only with *providers*. Asking them to add an OMLA-aware routing tag is easy; getting them to care enough to ship it is a product-priority fight.

**E. Does hoster-fork fit?** **No.** OpenRouter's 5% is a thin margin on the provider's price; the *creator* isn't in the price stack today. Pushing a creator royalty onto upstreams raises their required price to stay whole — so OpenRouter's pass-through prices (and their 5%) go up. They don't lose money, but become the messenger for a price hike. Expect them to demand the creator royalty sits *additional to* their fee — which, structurally, is already how it would land.

**F. Reference snippet.**
```ts
// openrouter: after response, inside billing pipeline
await omlaReporter.record({
  modelHash: route.upstream.weightsSha256,
  hosterId: route.upstream.omlaHosterId, // if provider registered with OMLA
  requests: 1,
  runCostUsd: route.upstream.billedUsd,
});
// flushed quarterly as one signed batch
```

---

### 3.3 Hugging Face

**A. Current licensing model.** License field in README.md metadata, a small dropdown of ~70 license identifiers. Gated-repo access-request mechanism. *No* enforcement — public research has repeatedly shown that only ~35% of models even declare a license, and of those, license inheritance through fine-tunes is mostly broken. HF's Inference Endpoints and Spaces are the only places they're themselves a hoster, and those are niche revenue lines.

**B. Where OMLA slots in.** Registry lookup and upload-time SHA check. HF already computes LFS blob SHA-256s; plumbing them into an OMLA verify call on `push` is a Hub-side change. The `license:` field in the model card metadata should accept `omla-1.0` as a valid identifier, and the website should render the `omla1…` address and compliance state on the model page. The inference webhook only applies to HF's own Inference Endpoints product, which is commercially immaterial.

**C. Integration shape.**
```http
# As part of HF Hub push webhook
POST https://api.omla-ai.org/verify/batch
{ "shas": ["…", "…"], "repo": "meta-llama/Llama-4-70B" }
```
Rendered on the model page as a badge, similar to how HF already renders carbon footprint and eval results.

**D. Friction: 4.** Not because it's technically hard — because HF is a standards player, not an enforcer. Anything that looks like they're taking sides in a licensing dispute is politically painful for them. They'll add OMLA as *an* option in the license dropdown within weeks of asking; they won't add compliance enforcement or blocking until OMLA is a named industry standard with at least one major model-creator brand behind it.

**E. Does hoster-fork fit?** Mostly N/A — HF is a weights host, not an inference hoster, and the hoster-fork carve-out is compensation for *reporting inference volume*. HF has no inference volume to report for 99% of their catalog. For Inference Endpoints, yes, the economics are Replicate-shaped and the carve-out works.

**F. Reference snippet.**
```yaml
# At top of Hugging Face model card (README.md)
---
license: omla-1.0
omla_address: omla1qxy…
omla_verified_hash: c0ffee…
---
```
```ts
// On HF backend, renders model page
const omla = await fetch(`https://api.omla-ai.org/model/${repo.omla_address}`)
  .then(r => r.json());
if (omla.compliance === "BLACKLISTED") banner("This model has unresolved royalty obligations.");
```

---

### 3.4 Replicate

**A. Current licensing model.** Per-model license displayed on the model page; terms-of-service explicitly push compliance to the user. Replicate *does* have a creator-revenue mechanism (creators who publish public models on Replicate get a share of the per-prediction revenue for usage of their model), which is the closest existing analogue to OMLA's royalty flow.

**B. Where OMLA slots in.** Inference webhook is the cleanest fit. Replicate already meters predictions per model, already has a revenue-share ledger, and a creator payout system. Routing "this model has an OMLA address — send a signed report quarterly + fork X% to the address rather than the creator's Replicate-registered bank account" is a ledger-row change, not an architectural one.

**C. Integration shape.**
```http
POST https://api.omla-ai.org/hoster/report
{ "quarter": "2026Q1",
  "hoster_id": "omla1replicate…",
  "reports": [ { "model_hash": "…", "predictions": 98421,
                 "total_run_cost_usd": 12481.04 } ] }
```
Plus a registry lookup on model push from the `cog push` CLI.

**D. Friction: 3.** Technical fit is excellent; the political cost is that Replicate would need to pick a side on which licensing framework to integrate first, and every framework author (RAIL, OpenRAIL, Fair AI licenses, etc.) is going to ask for the same shelf space. **Likely adopter if given a light-touch pilot.**

**E. Does hoster-fork fit?** **Yes.** Replicate's existing economics already accept that creator royalties come out of the user's bill; OMLA is just a formalisation and an external settlement system. The hoster-fork carve-out maps cleanly to "Replicate keeps its current margin, OMLA's 30% comes from the creator-allocated slice, and Replicate gets a small hoster reimbursement for the reporting work." This is the cleanest fit of any platform on the list.

**F. Reference snippet.**
```python
# replicate: in prediction billing pipeline
if model.license == "OMLA-1.0":
    omla_ledger.record(
        quarter=current_quarter(),
        hoster_id=REPLICATE_OMLA_ID,
        model_hash=model.weights_sha256,
        predictions=1,
        run_cost_usd=prediction.billed_usd,
    )
    # Commercial callers additionally get flagged for quarterly invoice reconciliation
```

---

### 3.5 Fireworks AI

**A. Current licensing model.** Zero license surfacing. Fireworks' product is "here's the Llama / Qwen / DeepSeek catalog at aggressive per-token prices." They rely on the *model's own license* (Llama Community License, Apache 2.0, etc.) covering them, and they don't think of the creator as an economic party. Serverless pricing starts at $0.10 / M tokens for sub-4B models — the margin is already thin.

**B. Where OMLA slots in.** Only the inference webhook, conceptually. In practice, their unit economics actively *repel* the integration.

**C. Integration shape.** Same `/hoster/report` POST as Replicate and OpenRouter.

**D. Friction: 4.** Not technical — commercial. Fireworks' model of the world is "open weights = free input; our margin is the inference stack." Anything that carves a royalty out of the token price is an existential threat to their per-token pricing. Expect open resistance until (a) a model creator they care about (say, Meta for Llama 5) mandates OMLA, or (b) one of their competitors (Together, Cloudflare) differentiates on "OMLA-compliant" and starts winning enterprise deals. Until one of those happens, integrating is a pure cost.

**E. Does hoster-fork fit?** **No.** If a Llama-class model's token price floor is ~$0.20 / M tokens and a 30% royalty goes to the creator, Fireworks is now running at cost. The carve-out *must* be additional to, not subtractive from, their margin — or there's no business. This is the core tension with the current OMLA design: the platforms where OMLA matters most (commodity inference hosts) are exactly the platforms where "carve-out from the 30%" is economically infeasible.
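The unit-economics objection can be made concrete. A sketch with hypothetical figures (the price, cost floor, and royalty base are illustrative, not Fireworks' actual numbers):

```python
def margin_after_royalty(price_per_m: float, cost_per_m: float,
                         royalty_pct: float) -> float:
    """Per-M-token margin if a creator royalty is carved from gross revenue.

    All inputs are illustrative placeholders, not Fireworks' real economics.
    """
    royalty = price_per_m * royalty_pct / 100  # royalty assessed on gross
    return price_per_m - cost_per_m - royalty

# A $0.25/M price over a ~$0.20/M cost floor leaves $0.05 of margin;
# a 30% royalty on gross ($0.075) pushes the hoster below cost.
print(round(margin_after_royalty(0.25, 0.20, 30), 4))  # -0.025
```

Negative margin at any plausible royalty rate is why "carve it from gross" is a non-starter for this platform class, independent of goodwill.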

The alternative framing — "OMLA royalty is paid by the *commercial user*, not the hoster; hoster gets a flat reimbursement for reporting work" — is the only one that survives here. This is what OMLA's spec already says on paper (commercial user pays), but the hoster carve-out from the *creator's* 30% is confusing: if the commercial user pays, the hoster reimbursement should come from OMLA's operational budget (which is zero), not from the creator's royalty. See §6.

**F. Reference snippet.**
```python
# fireworks: in serverless billing metadata write
meta["omla_hash"] = MODEL_REGISTRY[model_id].weights_sha256
meta["omla_licensed"] = MODEL_REGISTRY[model_id].omla_registered
# nightly cron aggregates meta[] and POSTs to api.omla-ai.org/hoster/report
```

---

### 3.6 Together AI

Structurally identical to Fireworks: same commodity-inference model, same per-token margin, same economic objection to a creator-side carve-out. Slightly more enterprise-sales-heavy, which means they *might* pilot an OMLA-compliant tier as a differentiator for regulated customers — but only as a marketing position, not a default policy. Together has been louder in public about "open-source AI as infrastructure," which rhymes with OMLA's framing. Pitch at their BD team on the regulatory-inevitability angle (EU AI Act Article 53 downstream obligations), not on creator-goodwill grounds.

---

## 4. GDPR, Data Residency, and Sanctions Screening

Three compliance concerns cut across the whole integration:

1. **GDPR for commercial user registration.** OMLA's registry stores company-level identifying info (contact email, tax ID, wallet, quarterly revenue). When HF or Replicate hands OMLA a user fingerprint at upload (§3.1, §3.3), that's a cross-border personal-data transfer that is unlawful under GDPR Article 44 absent UK/EU representation and an SCC-style transfer mechanism. OMLA needs a DPA template before any EU-based hoster integrates. Matters *immediately* for HF, given its Paris operations and EU-heavy user base.

2. **Data residency for quarterly usage reports.** Reports from Fireworks or OpenRouter contain per-customer inference volume by model — commercially sensitive for the hoster's *customers*. OMLA's Supabase instance needs a documented data-residency story (EU mirror, customer-held row-level keys) before enterprise hosters hand the data over. Civitai and Replicate are less exposed (mostly hobbyist users).

3. **Sanctions screening.** Ed25519/Bech32m identities are pseudonymous by design. Royalty paid from a US commercial user to a sanctioned-entity `omla1…` exposes both the user and OMLA to OFAC liability. **Required before launch:** sanctions screening (Chainalysis / TRM Labs, or minimum SDN-list cross-check) in the `REGISTERED` state transition and at each payout. Non-negotiable for any US hoster to legally remit through OMLA rails.
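As a sketch of where the gate would sit, assuming a local SDN-name list and a hypothetical registration hook (neither exists in the OMLA spec as written; a real deployment would use a screening provider and fuzzy matching):

```python
# Stand-in for a real OFAC SDN-list feed; exact-match lookup is illustrative only.
SDN_NAMES = {"blocked entity ltd"}

def screen_registration(legal_name: str) -> str:
    """Minimum-viable sanctions gate on the REGISTERED state transition.

    Hypothetical hook: the state names mirror the doc's REGISTERED state;
    BLOCKED_PENDING_REVIEW is an assumed quarantine state, not spec'd.
    """
    if legal_name.strip().lower() in SDN_NAMES:
        return "BLOCKED_PENDING_REVIEW"
    return "REGISTERED"

print(screen_registration("Acme AI GmbH"))        # REGISTERED
print(screen_registration("Blocked Entity Ltd"))  # BLOCKED_PENDING_REVIEW
```

The same check must run again at each payout, since SDN designations change between registration and settlement.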

---

## 5. Who Adopts First (and Who Breaks the Design)

**Easy wins, in order:**

1. **Civitai** — existing license UI + existing royalty ledger. Pitch as "OMLA is your commercial-tier license upgrade path." Medium-high adoption odds within 6 months if approached at the VP level. Their on-platform generator is a ready-made hoster-fork pilot.
2. **Replicate** — creator-revenue flows already exist; OMLA is a plug-in settlement provider. Small-but-meaningful adoption as an opt-in creator feature.
3. **Hugging Face** — will add `omla-1.0` as a license identifier almost immediately with the right GitHub PR; enforcement comes later, driven by a brand-name creator requiring it.

**Hard sells:** OpenRouter, Fireworks, Together, Groq. These integrate only when pushed by either (a) a major model creator mandating OMLA or (b) regulatory pressure (EU AI Act downstream-obligations clauses). None of them are adopting on goodwill.

**Won't integrate:** Modal, RunPod, Anyscale, Baseten. Architecturally wrong audience.

---

## 6. Do the Hoster-Fork Economics Need Revision?

**YES.**

The current design — "hoster_share_pct is a carve-out from the creator's 30% royalty" — has a structural problem that only surfaces when you look at it from the commodity-inference host's side:

- For **Replicate and Civitai**, the carve-out works because they already deduct a platform fee from what the creator would otherwise earn. OMLA is just replacing or supplementing their internal share. No revision needed.

- For **Fireworks, Together, OpenRouter, Groq** — the commodity inference hosters, where *the 30% royalty comes out of their per-token gross revenue entirely* — a carve-out from the creator's 30% is economically nonsensical. The creator isn't in their revenue stack today. Asking these platforms to "give us a cut of the creator's 30% in exchange for reporting" is offering them a share of a royalty their customers don't currently pay, and that they'd now have to collect on behalf of the creator — a net-new liability with a small reward.

The design that works for both classes of hoster is one of:

**Option A (recommended):** Keep the royalty at 30% of the commercial user's revenue/cost, paid *directly by the commercial user* to the `omla1…` address. The hoster's job is purely *reporting*, and the reimbursement for reporting is a flat per-report fee (say, 1% of reported volume, or a fixed USD amount per report) paid from a separate operational pool. That pool is funded by a small (e.g., 2%) levy on royalty flow that OMLA *does* take — which contradicts the "OMLA takes 0%" claim but is honest about who funds compliance infrastructure.

**Option B:** Keep "OMLA takes 0%," but fund hoster reimbursement entirely from the creator's royalty (i.e., the current design). Accept that commodity inference platforms won't integrate, and position OMLA as a creator-first / curator-first framework for the Civitai / Replicate tier only. The downside: the biggest commercial-use volume (Fireworks, Together) stays unregulated.

**Option C:** Split the royalty into "creator share" and "hoster share" that are *both* paid by the commercial user, not carved from each other. E.g., "30% to creator + 2% to hoster of record, both owed by the commercial user when using an OMLA-licensed model commercially." This is the cleanest economically but raises the commercial user's total obligation above 30%, which breaks OMLA's marketing.
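A side-by-side of what each party pays or keeps under the three options, with hypothetical figures (a 1000-unit royalty base; the 2% levy and hoster shares are the placeholder rates from the options above, and Option B's 5% carve-out is an assumed value, not the spec's `hoster_share_pct` default):

```python
def option_a(base: float, royalty_pct=30, levy_pct=2) -> dict:
    """User pays the royalty; OMLA levies a cut to fund hoster reimbursement."""
    royalty = base * royalty_pct / 100
    levy = royalty * levy_pct / 100
    return {"user_pays": royalty, "creator": royalty - levy, "omla_pool": levy}

def option_b(base: float, royalty_pct=30, hoster_share_pct=5) -> dict:
    """Hoster reimbursement carved from the creator's royalty (current design)."""
    royalty = base * royalty_pct / 100
    hoster = royalty * hoster_share_pct / 100
    return {"user_pays": royalty, "creator": royalty - hoster, "hoster": hoster}

def option_c(base: float, creator_pct=30, hoster_pct=2) -> dict:
    """Creator and hoster shares both owed by the commercial user."""
    return {"user_pays": base * (creator_pct + hoster_pct) / 100,
            "creator": base * creator_pct / 100,
            "hoster": base * hoster_pct / 100}

print(option_a(1000))  # {'user_pays': 300.0, 'creator': 294.0, 'omla_pool': 6.0}
print(option_b(1000))  # {'user_pays': 300.0, 'creator': 285.0, 'hoster': 15.0}
print(option_c(1000))  # {'user_pays': 320.0, 'creator': 300.0, 'hoster': 20.0}
```

Only Option C raises the commercial user's total obligation above 30%; A and B hold it at 30% and differ only in whose slice funds the reimbursement.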

My recommendation: **Option A**, and revise the "OMLA takes 0%" claim to "OMLA takes a small operational levy (e.g., 2%) from royalty flow to fund compliance infrastructure and hoster reimbursement." Honesty about running costs is worth the narrative hit; the alternative is a design that excludes the platforms that host the majority of commercial open-weight inference.

---

*Open questions for next revision: reference hoster SDK (TS + Python) to cut integration from a week to an afternoon; monthly or streaming reporting cadence instead of quarterly for commodity inference hosters; whether `hoster_share_pct = 0` as default actively discourages adoption (for commodity hosters, a non-zero default is the whole point).*

---

## 7. Addendum — Multi-currency reimbursement (post-review addition)

The original draft above assumes payout is in USD and implicitly asks platforms to pay US dollars out of band. That's wrong for Civitai (Buzz is their internal currency and creators already accept it), wrong for OpenRouter (Credits), and awkward for Replicate. The `010_currency` migration adds the mechanism described below.

### 7.1 Creator-declared currency preferences

Every registered model carries an ordered list of accepted currencies in `creator_currency_prefs`. At least one entry must be marked `is_backup = true`, and that entry's currency must have `kind = 'crypto'` in the `currency_registry`. This is enforced by a deferred constraint trigger, so batch-setting prefs at registration doesn't trip the check midway.

### 7.2 Platform-declared native currencies and conversion fees

Each commercial user / platform (and each hoster) declares what currencies it can natively remit via `remitter_currencies` and what it charges to convert any of its natives into a given crypto via `remitter_conversion_fees`. Fees are capped at 50%, publicly readable, and must be declared ahead of time — no silent skim.

### 7.3 Resolution algorithm

`resolve_payout_currency(model_id, remitter_kind, remitter_id)` returns `(currency_code, fee_pct, route)` per payout:

1. Walk the creator's prefs in order.
2. First intersection with remitter's native list → return with 0% fee (route = `native`).
3. If no intersection, return the backup crypto with the remitter's declared conversion fee (route = `backup_with_fee`).
4. If remitter declared no natives, return the backup with 0% (route = `backup_direct`).
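The four steps above can be sketched as plain logic (a standalone illustration; the real mechanism is the SQL function named in the heading, and the argument shapes here are assumptions):

```python
def resolve_payout_currency(prefs: list, backup: str,
                            natives: set, conversion_fees: dict) -> tuple:
    """Pick a payout currency per the four-step walk.

    prefs: creator's ordered currency codes; backup: the mandatory crypto
    backup; natives: remitter's declared native currencies;
    conversion_fees: {crypto_code: fee_pct} declared by the remitter.
    """
    if not natives:                       # step 4: remitter declared no natives
        return (backup, 0.0, "backup_direct")
    for code in prefs:                    # step 1: walk prefs in order
        if code in natives:               # step 2: first intersection wins
            return (code, 0.0, "native")
    # step 3: no intersection -> backup crypto at the declared fee
    return (backup, conversion_fees.get(backup, 0.0), "backup_with_fee")

# Civitai-shaped remitter: the BUZZ native match fires first, so the
# declared 30% BUZZ->BTC conversion fee never applies.
print(resolve_payout_currency(["BUZZ", "USD", "BTC"], "BTC",
                              {"BUZZ", "USD"}, {"BTC": 30.0}))
# ('BUZZ', 0.0, 'native')
```

Note that the backup crypto can also match as a native (step 2) at 0%, which is the intended behaviour: the fee route only fires when conversion is genuinely required.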

### 7.4 How this lands on each platform

- **Civitai.** Registers `BUZZ` and `USD` as natives. Declares `BUZZ → BTC` at (say) 30% to account for their real conversion overhead and liquidity cost. Most Civitai creators will list `BUZZ` as their top preference, so the 30% fee rarely fires — most payouts stay inside the Civitai economy with zero conversion loss. That directly addresses the §3.1 friction complaint: Civitai isn't asked to give up its internal economy; it's asked to settle that economy against OMLA's ledger.
- **OpenRouter.** Registers `CREDITS-OR` (their internal denomination) plus `USD`. Creators that want to accept OpenRouter credits get a 0% native match; others fall through to `USD` or the backup crypto.
- **Hugging Face.** No internal currency. Registers `USD` + `EUR`. Likely pays the backup crypto through a conversion partner and declares a sensible (≤ 3%) fee.
- **Replicate, Fireworks, Together.** Register `USD` (and whatever their payout operations actually support — Stripe Connect supports many). Most creators list `USD` early enough that the backup crypto rarely fires.

### 7.5 The upside for commodity inference hosters

The §3 analysis was pessimistic about Fireworks/Together/Groq because carving from the 30% creator royalty is an odd ask. The currency mechanism softens this: a commodity inference host can offer 0% conversion overhead in their preferred currency in exchange for a creator opting them in as a verified hoster. Whether the commodity host accepts the hoster-pool carve-out at all is still an open question (see §6 above). But the currency flexibility moves the conversation from "30% of your margin, forever" to "a currency you already settle in, with declared and capped conversion fees."

### 7.6 What the platform integration actually looks like with currency

```http
# Platform declares currency config once.
POST /api/remitter/currency-config
{
  "commercial_user_id": "<civitai-uuid>",
  "native_currencies":  ["BUZZ", "USD"],
  "conversion_fees":    [{ "from": "BUZZ", "to": "BTC", "fee_pct": 30.00 }]
}

# At payout time the resolver runs per creator payout:
SELECT * FROM resolve_payout_currency(
  model_id       := '<model-uuid>',
  remitter_kind  := 'commercial_user',
  remitter_id    := '<civitai-uuid>'
);
-- -> currency_code='BUZZ', fee_pct=0, route='native'
```

### 7.7 Revised fairness claim

The prior draft flagged "OMLA takes 0% is probably unsustainable." That tension is unchanged — operational costs still need funding somewhere, and optional donations (Task 7) are the current plan. The currency mechanism doesn't alter that. It does, however, remove a fairness complaint from the platforms: they're no longer forced into USD-denominated settlement with an off-platform liquidity provider they didn't pick.
