Updates

v2.65.0

This release brings powerful filtering and sorting to the jobs page, lets you open input images in a lightbox, and adds support for element inputs when running jobs directly from the feed.

Multi-select filters and sorting for jobs

You can now filter your job history by multiple statuses, models, output media types, and folders simultaneously. You can also include or exclude specific tags. A new sort menu lets you order jobs by newest, oldest, shortest duration, or longest duration. All filter and sort selections are saved in the URL, so your view persists across page reloads.
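To illustrate the general idea of keeping a view in the URL, here is a minimal Python sketch that round-trips multi-select filters and a sort order through a query string. The parameter names (`status`, `media`, `sort`) and the default sort of `newest` are assumptions made for the example; the actual query format used by the jobs page is not documented here.

```python
from urllib.parse import parse_qs, urlencode


def encode_view(filters: dict[str, list[str]], sort: str) -> str:
    """Serialize multi-select filters and a sort order into a query string.

    Each selected value becomes its own key=value pair, so a filter with
    several selections repeats its key. Key names are hypothetical.
    """
    pairs = []
    for key in sorted(filters):  # stable order keeps URLs comparable
        for value in filters[key]:
            pairs.append((key, value))
    pairs.append(("sort", sort))
    return urlencode(pairs)


def decode_view(query: str) -> tuple[dict[str, list[str]], str]:
    """Rebuild the filters and sort order from a query string."""
    parsed = parse_qs(query)
    # Assumed default for this sketch: fall back to "newest" if no sort is set.
    sort = parsed.pop("sort", ["newest"])[0]
    return parsed, sort


query = encode_view({"status": ["failed", "running"], "media": ["video"]}, "longest")
print(query)  # media=video&status=failed&status=running&sort=longest
```

Because the whole view lives in the query string, reloading or sharing the URL is enough to restore the same filters and sort order.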

View input images at full size

Clicking on an input image thumbnail within a job card now opens a full-size lightbox directly on the page, allowing you to inspect the source material without navigating away. You can cycle through all media inputs for that job using the previous and next buttons.

Run element-based nodes from the jobs page

Nodes that require an element input, such as Kling reference or video generation nodes, can now be configured and run directly from the jobs feed. A new element picker lets you select from your existing library or create a new element inline.

Quality of life

  • Group node background colors are now more distinct for easier visual identification.
  • The Expand button on job output media is consistently positioned in the top-right corner.
  • 3D model and Gaussian splat outputs now display as thumbnail cards in the jobs feed.
  • Nodes created near a reference node that is inside a group are now placed at the correct absolute position.
  • Copy and paste operations now preserve a node’s output preview, matching the behavior of duplication.
  • Group node titles now scale with zoom level consistently with other node titles.

Object placement in perspective space

The problem with most AI generation workflows is that they treat each output as a standalone artifact. You generate, you evaluate, you discard or keep — and then you start again from scratch. What gets lost is the system: the relationship between prompt structure, model behavior, and the specific visual language you’re trying to develop.

Building a production-level workflow means treating your prompts, references, and outputs as a connected body of work — not a series of isolated experiments. The canvas is where that system lives.

Start with the structure, not the detail

The most common mistake is front-loading specificity. Writers know this problem as polishing sentences before there is a working outline. In generative work, it shows up as obsessing over lighting descriptors when the composition itself isn’t right yet.

Work in passes. Your first generation should answer only the compositional questions: does the subject read clearly against the background, and is the perspective plausible? Only once the structural read is right do you start layering in lighting, texture, and material detail.

The best creative directors in this space don’t think in individual prompts. They think in systems — a vocabulary of references, constraints, and combinatorial rules that produce consistent results at scale.

Perspective as a first-class constraint

Object placement is fundamentally a perspective problem. A generated scene has an implied camera position — focal length, height, angle — and any object you introduce needs to be consistent with that implied camera, or it will read as wrong even if the viewer can’t immediately articulate why.

The practical approach: identify the vanishing points in your scene before you try to place anything. Describe the camera position in your prompt as concretely as you describe the subject — “shot at knee height, 35mm equivalent, slight upward tilt” gives the model something to anchor against.
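One way to find a vanishing point is to trace two scene edges that are parallel in the real world (for example, the two sides of a floor or tabletop) and intersect them in image space. The sketch below does this with homogeneous coordinates; the pixel coordinates are made-up example values, and the function names are illustrative only.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(edge_a, edge_b, eps=1e-9):
    """Intersect two image-space edges, each given as a pair of points.

    Returns (x, y) in pixels, or None when the edges are also parallel
    in the image, meaning the vanishing point is at infinity.
    """
    x, y, w = cross(line_through(*edge_a), line_through(*edge_b))
    if abs(w) < eps:
        return None
    return (x / w, y / w)


# Hypothetical floor edges traced by hand on a 1000 x 700 image.
left_edge = ((100, 600), (300, 400))
right_edge = ((900, 600), (700, 400))
print(vanishing_point(left_edge, right_edge))  # (500.0, 200.0)
```

Any object placed on that floor should have its receding edges converge toward the same point. If they converge somewhere else, the object implies a different camera than the scene does.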

Grounding with reference geometry

For product placement and architectural integration — the cases where technical accuracy matters most — a geometry pass before the generation pass makes a significant difference. Rough 3D blockouts, even at low fidelity, give the model a structural skeleton to work against.

OTOY Canvas supports this with its 3D input nodes: you can bring in a rough mesh or a splat scene and use it as structural reference, then pipe the result into an image refinement model. The perspective comes from real geometry, not from the model’s guess.

The shadow test

A quick diagnostic for placement accuracy: does the shadow match? Shadows encode the light direction, the camera angle, and the relationship between the object and the ground plane. If any of those are inconsistent, the shadow will look wrong before anything else does. Use it as an early-exit check.
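When a rough 3D blockout is available, the shadow test can be made numeric. The sketch below assumes scene coordinates with z pointing up and the ground plane at z = 0, a single directional light, and an observed shadow tip that has already been mapped onto the ground plane. All of these are simplifying assumptions for illustration, and the function names are hypothetical.

```python
import math


def shadow_tip(top, light_dir):
    """Project the top of an object along the light onto the ground (z = 0).

    top is (x, y, z); light_dir is the direction the light travels and
    must point downward (negative z component).
    """
    x, y, z = top
    dx, dy, dz = light_dir
    if dz >= 0:
        raise ValueError("light must travel downward (dz < 0)")
    t = -z / dz  # ray parameter at which the ray reaches z = 0
    return (x + t * dx, y + t * dy)


def shadow_angle_error(base, top, light_dir, observed_tip):
    """Angle in degrees between the expected and observed shadow directions.

    base is the (x, y) ground contact point of the object.
    """
    ex, ey = shadow_tip(top, light_dir)
    expected = (ex - base[0], ey - base[1])
    observed = (observed_tip[0] - base[0], observed_tip[1] - base[1])
    cross_z = expected[0] * observed[1] - expected[1] * observed[0]
    dot = expected[0] * observed[0] + expected[1] * observed[1]
    return abs(math.degrees(math.atan2(cross_z, dot)))


# A 2-unit-tall object at the origin, lit from a 45-degree elevation.
print(shadow_tip((0.0, 0.0, 2.0), (1.0, 0.0, -1.0)))  # (2.0, 0.0)
```

A large angle error suggests the placed object disagrees with the scene's light direction or ground plane. What counts as "large" is a judgment call and depends on the shot.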

Building a consistent pipeline

Once you have a placement approach that works, the goal is to make it reproducible. That means saving the prompt structure, the reference set, the model configuration, and the node graph as a named workflow — not just the final output.
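As a rough sketch of what "saving the whole workflow" could look like outside any particular tool, the record below bundles the four pieces named above into one JSON document. The class and field names are hypothetical and do not reflect OTOY Canvas's actual save format; the example values are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class WorkflowRecord:
    """Hypothetical container for everything needed to reproduce a result."""

    name: str
    prompt_template: str
    reference_files: list[str] = field(default_factory=list)
    model_settings: dict[str, object] = field(default_factory=dict)
    # Adjacency list: node id -> ids of the nodes it feeds into.
    node_graph: dict[str, list[str]] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize with sorted keys so saved files diff cleanly."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "WorkflowRecord":
        """Rebuild a record from its JSON form."""
        return cls(**json.loads(text))


record = WorkflowRecord(
    name="product-placement-v1",
    prompt_template="{subject}, shot at knee height, 35mm equivalent, slight upward tilt",
    reference_files=["blockout.glb"],
    model_settings={"seed": 1234},
    node_graph={"mesh_input": ["generate"], "generate": ["refine"]},
)
restored = WorkflowRecord.from_json(record.to_json())
print(restored == record)  # True
```

Keeping the prompt as a template with named slots, rather than a finished sentence, is what lets someone else reuse the structure with a different subject.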

The workflows that scale are the ones where a new team member can open the graph, understand the intent from the structure, and produce a consistent result without asking what settings were used. That’s the difference between a creative system and a lucky generation.