We don’t wrap ChatGPT. Here’s what we actually build.
The state of AI services in 2026
There are roughly three kinds of “AI agencies” in the market right now.
The first kind sells you a ChatGPT wrapper with their logo on it. You pay them a five-figure retainer to give you something you could have built yourself with a weekend and the OpenAI dashboard. The interface is theirs. The intelligence is rented. When the API price changes or the model deprecates, your product changes with it, and you find out by reading the same release notes everyone else does.
The second kind sells you “prompt engineering.” It’s a real discipline, just usually the wrong one for the job. They tune system prompts and string together API calls and call the result a custom solution. It is a custom workflow. It is not a custom model. The difference matters more than the marketing makes it sound.
The third kind — vanishingly rare — builds proprietary pipelines. Trains on data the client owns or generates. Fine-tunes models that didn’t exist before the engagement started. Operates infrastructure that produces outputs no off-the-shelf API can replicate, because the model behind the output isn’t an off-the-shelf model.
Riamona is the third kind. This post is about what that actually means, why it costs more, and why the difference shows in the first frame of output.
What “wrapping” looks like
Let’s be specific about the failure mode, because the language has gotten slippery.
A wrapper is a thin layer on top of someone else’s intelligence. You send a prompt to a hosted model, you get a response, you display it inside your branded interface. The intelligence is doing the work. You are doing the styling.
This is a legitimate product category for some things. A customer-support chatbot built on a hosted LLM, deployed in three days, costing nothing to maintain — that’s a wrapper, and it’s the right answer when the use case is generic and the data is public. We have built wrappers ourselves when the brief called for one. There’s no shame in fitting the solution to the problem.
The shame is in pricing a wrapper like a build, and in selling a wrapper as a moat. Because a wrapper isn’t a moat. The moment a competitor wants to copy your product, they wrap the same model and ship the same thing on a Tuesday. Your differentiation lasts as long as the competitor’s procurement cycle.
This is the trap most “AI for X” companies are sitting in right now without knowing it. The product feels like magic on launch day. Eighteen months later, the magic is a commodity, and the question becomes: what did we actually build?
What we build instead
When a Riamona project calls for AI — and not every project does — the work happens at a layer beneath the prompt.
We start with the question of where the proprietary value lives. Sometimes it’s in the data: a client has years of catalog photography, internal documents, customer interactions, or production knowledge that no foundation model has ever seen. Sometimes it’s in the workflow: the client has a sequence of decisions that compound, where the third step depends on the second in ways a single model call can’t capture. Sometimes it’s in the output specification: the result has to look or sound a specific way that generic models miss by a mile.
Then we build accordingly.
Custom-trained models. When the data exists, we fine-tune. Sometimes a full fine-tune of an open-weights image model on a client’s product photography. Sometimes a LoRA adapter that captures a brand’s visual signature without retraining the base. Sometimes a fine-tuned LLM on a client’s writing style, technical documentation, or domain language. The point is that the model itself — not just the prompt — has been changed by the engagement. After we leave, the model still belongs to the client, still runs on their infrastructure, still produces outputs that get better the longer it lives.
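To make the adapter idea concrete, here is a minimal sketch of attaching a LoRA adapter to an open-weights language model with the Hugging Face peft library. The base model name, target modules, and hyperparameters are illustrative placeholders rather than a specific client configuration; whether a project calls for a full fine-tune or an adapter depends on the data and the budget.

```python
# Minimal LoRA sketch (illustrative): the base weights stay frozen and only small
# low-rank adapter matrices are trained, which is what lets a brand's signature be
# captured without retraining the whole model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "some-org/open-weights-7b"  # hypothetical placeholder, not a real model id
tokenizer = AutoTokenizer.from_pretrained(base_id)  # used to tokenize the client corpus
base = AutoModelForCausalLM.from_pretrained(base_id)

config = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically a small fraction of the base model's weights

# ...train on the client's corpus with a standard training loop, then:
model.save_pretrained("client-voice-adapter")  # the client keeps this artifact
```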
Production pipelines. A single model call almost never solves a real problem. A pipeline does. We chain image generators with computer-vision validators with layout engines with copy models with quality-control filters. Each stage feeds the next. Each stage can be tuned independently. Each stage’s output gets logged and re-trained on. The pipeline is the product, not the model — and the pipeline is what no competitor can copy with a weekend of prompt engineering.
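To make the chaining concrete, here is a toy sketch of the stage-by-stage structure. The Asset record and the stage functions are hypothetical simplifications; in a real engagement each stage wraps its own model, validator, or rule set, and the logged intermediates feed the next training run.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Asset:
    sku: str
    payload: Any
    history: list = field(default_factory=list)  # logged intermediates become training data

Stage = Callable[[Asset], Asset]

def run_pipeline(asset: Asset, stages: list[tuple[str, Stage]]) -> Asset:
    for name, stage in stages:
        asset = stage(asset)                         # each stage feeds the next
        asset.history.append((name, asset.payload))  # every stage output is logged
    return asset

# Hypothetical stages; each can be tuned, retrained, or replaced independently.
def generate_scene(a: Asset) -> Asset:
    return a  # placeholder: calls the fine-tuned image model

def validate_geometry(a: Asset) -> Asset:
    return a  # placeholder: computer-vision check that rejects warped proportions

def write_copy(a: Asset) -> Asset:
    return a  # placeholder: copy model tuned on the brand's voice

result = run_pipeline(
    Asset(sku="SKU-0001", payload="raw product photo"),
    [("scene", generate_scene), ("geometry", validate_geometry), ("copy", write_copy)],
)
```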
Domain-specific evaluators. This is the unglamorous secret of production AI. Models hallucinate. Generators produce slop. The difference between a demo and a production system is the evaluator that sits between them and the user — a model or rule set that knows what good looks like for this specific domain and rejects anything that isn’t. We build evaluators trained on real client data, calibrated to the client’s quality bar, deployed as gates inside the pipeline. The user never sees the rejected outputs. They only see the work that cleared the bar.
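As an illustration of the gating idea, here is a small sketch of an evaluator sitting between the generators and the user. The scoring function and the threshold are stand-ins; in practice the evaluator is itself a model or rule set trained and calibrated on the client’s labeled examples.

```python
from typing import Iterable

QUALITY_BAR = 0.85  # calibrated offline against client-labeled examples (illustrative value)

def score_fidelity(candidate: dict) -> float:
    """Stand-in for a learned evaluator; returns a domain-specific quality score in [0, 1]."""
    return candidate.get("score", 0.0)

def log_for_retraining(rejected: list[dict]) -> None:
    pass  # persist rejects with their scores; they become negative examples for the next run

def gate(candidates: Iterable[dict]) -> list[dict]:
    passed, rejected = [], []
    for c in candidates:
        (passed if score_fidelity(c) >= QUALITY_BAR else rejected).append(c)
    log_for_retraining(rejected)
    return passed  # the user only ever sees work that cleared the bar
```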
Full ecosystem platforms. When the engagement justifies it, the result is a complete platform. Internal interfaces for operators. APIs for downstream systems. Dashboards for the client to see what their AI is doing and why. Audit logs. Cost monitoring. Failover. The boring infrastructure that makes the difference between an experiment and a system.
What this looks like on the work
Our own catalog pipeline is the proof-of-life. We run it on our own work every day, and we run the same architecture for client engagements.
A single product photo enters the pipeline. By the time it exits, there are anywhere from twelve to fifty assets attached to that SKU — photoreal lifestyle scenes in different rooms, 360° spins, comparative scale shots, detail crops, designed tear sheets, SEO-tuned copy in three voices, channel-ready exports for Shopify and Amazon and Etsy and the brand’s own catalog PDF.
The pipeline runs in days, not weeks. The outputs are consistent, because the models behind them have been trained on real rug fiber and real jewelry geometry, not on Pinterest. The cost per SKU drops as the pipeline runs more times on similar material, because every output becomes training data for the next iteration.
A competitor with the same brief and a ChatGPT wrapper would produce something that looks plausible at thumbnail size and falls apart at full resolution. We’ve seen the comparisons. We invite them.
Why this costs more — and why it’s worth more
A wrapper costs what a wrapper costs: a markup over the API calls plus a fee for the interface. Cheap to start, cheap to scale, cheap to copy.
A proprietary pipeline costs what real engineering costs: research time, training compute, evaluator data labeling, infrastructure setup, ongoing tuning. The first project bears the brunt of that cost. Every subsequent project benefits from the foundation. The third client to use the same pipeline pays meaningfully less than the first.
The economics are inverted from the wrapper model. Wrappers are cheap upfront and expensive forever — you keep paying the API tax. Pipelines are expensive upfront and cheap forever — you own the system. Over twenty-four months, the math flips so hard that the cheap option becomes the expensive one.
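For a feel of where the curves cross, here is the break-even arithmetic with purely illustrative numbers; the figures below are placeholders, not our pricing or any client’s actual costs.

```python
# Illustrative only: a wrapper with a low build cost and a recurring API tax,
# versus a pipeline with a high build cost and a low run cost.
wrapper_build, wrapper_monthly = 15_000, 6_000      # hypothetical figures
pipeline_build, pipeline_monthly = 120_000, 1_500   # hypothetical figures

for month in range(1, 37):
    wrapper_total = wrapper_build + wrapper_monthly * month
    pipeline_total = pipeline_build + pipeline_monthly * month
    if pipeline_total <= wrapper_total:
        print(f"Pipeline becomes the cheaper option around month {month}")  # month 24 here
        break
```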
But the real reason it’s worth more isn’t the cost curve. It’s the output. A wrapped product looks like every other wrapped product. A proprietary pipeline produces work that has a signature — outputs a competitor cannot replicate by signing up for the same API, because the API isn’t where the value lives.
In a category where every brand is suddenly using AI, the brands with proprietary pipelines look different. The brands with wrappers look like each other.
What we won’t build
In the interest of honesty: we turn down AI engagements often. When the use case is genuinely generic — when a wrapper around a hosted model would actually serve the client better than a custom pipeline — we say so. When the client’s data isn’t sufficient to train on, we say that too. When the budget can’t carry the foundational work, we recommend a different path rather than sell a worse version of the right answer.
The shortest way to ruin a proprietary-pipeline reputation is to ship one that should have been a wrapper. So we don’t.
The first frame test
Here’s the simplest way to know whether your AI vendor is building or wrapping. Ask them to show you the first frame of output — not the carousel, not the highlight reel, the first frame the system produces before any human cleanup.
A wrapper’s first frame is generic. It looks like the model. It could have come from anyone running the same prompt.
A proprietary pipeline’s first frame looks like the brand. It has the materials right. The light right. The proportions right. The voice right. It looks made, not generated.
That difference is what we sell. Generic AI gives you templates. We give you outputs nothing else can produce. The first frame is the test, and we pass it every time — because we built the system that makes the first frame look that way.
Everything else is hope, and a recurring API bill.