v0.7.0 — Asynchronous requests
New features
- Asynchronous mode on `/edit`, `/generate`, `/enhance`, and `/remix`. Pass `async: true` (with `response_mode: "urls"`) and the call returns `202` with a `job_id` instead of blocking; poll `GET /v1/phota/jobs/{job_id}` for the result. Your client submits the job and then polls, or receives a callback when it is ready, so there is no long-lived connection to hold open and less overhead for your integration. See Asynchronous requests.
- Job callbacks — pass an optional `callback_url` to be notified when a job finishes. Callback bodies are HMAC-signed (`X-Phota-Signature`); fetch your signing secret from `GET /v1/phota/webhook-secret`. Polling remains authoritative.
- Idempotent submits — pass an optional `client_request_id`; a retried submit with the same value returns the original job instead of creating (or charging for) a duplicate.
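As a rough illustration of the callback signing described above, the sketch below checks an `X-Phota-Signature` value against the raw callback body. The changelog only says bodies are HMAC-signed; the hash (SHA-256), the hex encoding, and signing the raw body bytes are all assumptions here, so confirm them against the Asynchronous requests guide before relying on this.

```python
import hashlib
import hmac


def verify_callback(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Return True if the signature header matches the raw callback body.

    ASSUMPTION: HMAC-SHA256 over the raw body, hex-encoded. The changelog
    does not specify the algorithm or encoding.
    """
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature_header.strip().encode("utf-8")
    )
```

Since polling remains authoritative, a failed or missing callback can be treated as a hint to poll the job rather than as a final result.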
Behavior changes
- Graceful handling of long synchronous requests. A synchronous request that runs past its time limit now returns a clean `504` and automatically refunds the reserved cost, instead of dropping the connection. For long-running edits, prefer `async: true`, which runs the work in the background with no connection to hold open.
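The async flow recommended above might be wired up roughly as follows. `get_job` is a hypothetical wrapper around `GET /v1/phota/jobs/{job_id}` that you supply, and the `status` field with its `queued` / `running` values is an assumption, since the changelog does not name the job states.

```python
import time
import uuid
from typing import Any, Callable, Dict

Json = Dict[str, Any]


def build_async_payload(params: Json, client_request_id: str = "") -> Json:
    """Add the async fields named in the changelog to a request body."""
    payload = dict(params)
    payload["async"] = True
    payload["response_mode"] = "urls"  # async mode is documented with "urls"
    # Reuse the same id when retrying a submit so no duplicate job is created.
    payload["client_request_id"] = client_request_id or str(uuid.uuid4())
    return payload


def poll_job(
    get_job: Callable[[str], Json],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Json:
    """Poll until the job leaves its pending states or the timeout elapses.

    ASSUMPTION: the job body has a "status" field whose pending values are
    "queued" and "running".
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = get_job(job_id)
        if job.get("status") not in ("queued", "running"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still pending after {timeout_s}s")
        sleep(interval_s)
```

Generating `client_request_id` once per logical request, and reusing it on retries, is what makes a retried submit safe.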
v0.6.0 — Remix is generally available
New features
- `POST /remix` — restyle an image to match a reference image. Provide an `input_image` (the content) and a `reference_image` (the style); the output keeps the input’s content rendered in the reference’s look. Optional `profile_ids` preserve recognized identities through the restyle, and an optional `prompt` steers the result. Output resolution is fixed at 2K. Billed like Enhance: $0.12 per output image, $0.15 with identity. See the Remix guide.
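A minimal sketch of a Remix request body and a cost estimate, using the field names and rates listed above. Splitting the bill into plain and with-identity images is an assumption about how the two rates combine; the Remix guide and Pricing page are the authority.

```python
from decimal import Decimal
from typing import Any, Dict, List, Optional

REMIX_RATE = Decimal("0.12")                # per output image
REMIX_RATE_WITH_IDENTITY = Decimal("0.15")  # per output image with identity


def remix_payload(
    input_image: str,
    reference_image: str,
    profile_ids: Optional[List[str]] = None,
    prompt: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a /remix body; optional fields are omitted when unset."""
    body: Dict[str, Any] = {
        "input_image": input_image,          # the content
        "reference_image": reference_image,  # the style
    }
    if profile_ids:
        body["profile_ids"] = list(profile_ids)
    if prompt:
        body["prompt"] = prompt
    return body


def remix_cost(plain_images: int, identity_images: int = 0) -> Decimal:
    """Estimate cost in USD. ASSUMPTION: each image bills at exactly one rate."""
    return REMIX_RATE * plain_images + REMIX_RATE_WITH_IDENTITY * identity_images
```

`Decimal` is used so that sums of cent amounts stay exact.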
v0.5.0 — Quick Train tier for profiles
New features
- `training_tier` parameter on `POST /profiles/add`. Optional, accepts `"standard"` (default) or `"fast"`.
  - `"standard"` (Full Train) — – images, $2.90 per profile, highest fidelity.
  - `"fast"` (Quick Train) — – images, $1.49 per profile, faster turnaround at lower cost.
  - Image count is validated per tier; out-of-range counts return HTTP 422.
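The per-tier prices above can be captured in a small lookup, sketched here. The function name is illustrative, and the per-tier image-count limits are deliberately not checked because this page does not state them; the API remains the validator (HTTP 422).

```python
from decimal import Decimal

# Per-profile prices from the changelog entry above.
TIER_PRICES = {
    "standard": Decimal("2.90"),  # Full Train
    "fast": Decimal("1.49"),      # Quick Train
}


def training_cost(num_profiles: int, training_tier: str = "standard") -> Decimal:
    """Estimate the USD cost of training `num_profiles` profiles on one tier."""
    if training_tier not in TIER_PRICES:
        raise ValueError(f"unknown training_tier: {training_tier!r}")
    if num_profiles < 0:
        raise ValueError("num_profiles must be non-negative")
    return TIER_PRICES[training_tier] * num_profiles
```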
Behavior changes
- Full Train (`standard`) minimum image count lowered from 30 to . Existing requests sending 30–50 images are unaffected.
- Quick Train completes in ~3 minutes, Full Train in ~8 minutes (both excluding queue time). Full Train length now scales with image count (previously fixed at the -image equivalent) — requests sending 30– images will see slightly shorter, proportional training. For the strongest profile, send images.
v0.4.0 — Quality selection on GPT Image 2
New features
- `quality` parameter on `/edit` and `/generate`. Optional, accepts `"auto"`, `"low"`, `"medium"`, or `"high"`. Only supported when `base_model` is `gpt-image-2`; sending `quality` for any other model returns HTTP 400. Settlement remains token-driven, so partners pay for the actual output tokens regardless of tier.
Behavior changes
- Default quality on `gpt-image-2` is now `"auto"` — Requests that omit `quality` previously landed on the `"high"` tier; they now land on OpenAI’s auto tier. Pass `"quality": "high"` explicitly if you want the previous behavior.
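A client-side guard for the `quality` rules in this release could look like the sketch below. The helper name is hypothetical; it only mirrors the documented constraints so that an avoidable HTTP 400 is caught before the request is sent.

```python
from typing import Any, Dict, Optional

QUALITY_VALUES = ("auto", "low", "medium", "high")


def with_quality(payload: Dict[str, Any], quality: Optional[str] = None) -> Dict[str, Any]:
    """Return a copy of `payload` with `quality` set, if it is allowed.

    `quality` is only accepted when base_model is "gpt-image-2"; an omitted
    base_model means the default model, so quality is rejected there too.
    """
    body = dict(payload)
    if quality is None:
        return body  # the server applies its default ("auto" on gpt-image-2)
    if quality not in QUALITY_VALUES:
        raise ValueError(f"quality must be one of {QUALITY_VALUES}")
    if body.get("base_model") != "gpt-image-2":
        raise ValueError("quality is only supported when base_model is gpt-image-2")
    body["quality"] = quality
    return body
```

Passing `quality="high"` explicitly pins the pre-0.4.0 behavior.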
v0.3.0 — Base model selection
New features
- `base_model` parameter on `/edit` and `/generate`. Pick between `nb2` (Nano Banana 2, default), `gpt-image-2` (GPT Image 2), `qwen-image-2`, `flux-2`, and `reve`. Phota’s identity layer composes on top of any of them. Unknown ids return HTTP 422; unsupported resolutions (e.g. 4K on `flux-2` or `qwen-image-2`) return HTTP 400. See Base model selection.
- Per-model pricing — Each base model has its own per-image rate plus a with-identity rate that applies when a trained profile is recognized in the output. GPT Image 2 is billed per token. See Pricing.
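To make the two rejection cases above concrete, here is a small pre-flight check. It encodes only what this entry states: the five model ids, and the two 4K examples. The `"4K"` string value for the resolution is an assumption, and other unsupported combinations may exist that this sketch does not know about.

```python
from typing import Optional

BASE_MODELS = ("nb2", "gpt-image-2", "qwen-image-2", "flux-2", "reve")
# Only the examples named in the changelog; not an exhaustive list.
NO_4K_MODELS = ("flux-2", "qwen-image-2")


def expected_rejection(base_model: str = "nb2", resolution: Optional[str] = None) -> Optional[int]:
    """Return the HTTP status the API is documented to answer with, or None."""
    if base_model not in BASE_MODELS:
        return 422  # unknown model id
    if resolution == "4K" and base_model in NO_4K_MODELS:
        return 400  # unsupported resolution for this model
    return None
```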
Improvements
- Less wardrobe and background leakage on `/generate` — Clothing, accessories, and background from the profile’s training photos are less likely to bleed into `/generate` outputs. Prompts that specify wardrobe or setting are followed more reliably.
v0.2.1
New features
- Enhance prompts — The `/enhance` endpoint now accepts an optional `prompt` field with text instructions to guide the enhancement. When omitted, enhancement parameters continue to be inferred automatically.
v0.2.0
New features
- Output format selection — New `output_format` parameter on edit, generate, and enhance endpoints. Choose between `"png"` (lossless) and `"jpg"` (smaller files). Default: `"png"`.
- Response mode — New `response_mode` parameter (`"bytes"` or `"urls"`) on edit, generate, and enhance endpoints. Default: `"bytes"` (base64-encoded images). Set to `"urls"` to receive signed CDN download URLs (24-hour expiry) instead. Each mode populates its respective response field; the other is returned empty.
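Handling the two response modes might look like the following sketch. The response field names `images` and `download_urls` are placeholders, since this entry does not name the fields; substitute the names from the API reference.

```python
import base64
from typing import Any, Dict, List, Union


def extract_outputs(
    response_body: Dict[str, Any], response_mode: str = "bytes"
) -> Union[List[bytes], List[str]]:
    """Return decoded image bytes or signed URLs, depending on the mode.

    ASSUMPTION: "images" holds base64 strings and "download_urls" holds URLs.
    """
    if response_mode == "bytes":
        return [base64.b64decode(item) for item in response_body.get("images", [])]
    if response_mode == "urls":
        # Signed URLs expire after 24 hours, so download or store them promptly.
        return list(response_body.get("download_urls", []))
    raise ValueError(f"unknown response_mode: {response_mode!r}")
```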
Deprecation notices
- Default format change — The default `output_format` will change from `"png"` to `"jpg"` on 2026-05-08. Set `output_format` explicitly in your requests to avoid disruption. Responses include an `X-Phota-Deprecation-Date: 2026-05-08` header.
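One way to prepare for the default change is to pin `output_format` on every request and surface the deprecation header in logs. The helper names below are illustrative; the header name and date format come from the notice above.

```python
from datetime import date
from typing import Any, Dict, Mapping, Optional


def pin_output_format(payload: Dict[str, Any], output_format: str = "png") -> Dict[str, Any]:
    """Return a copy of `payload` with output_format set explicitly."""
    if output_format not in ("png", "jpg"):
        raise ValueError("output_format must be 'png' or 'jpg'")
    body = dict(payload)
    body.setdefault("output_format", output_format)  # keep a caller's choice
    return body


def deprecation_date(headers: Mapping[str, str]) -> Optional[date]:
    """Parse X-Phota-Deprecation-Date (YYYY-MM-DD) if the response carries it."""
    for name, value in headers.items():
        if name.lower() == "x-phota-deprecation-date":
            return date.fromisoformat(value.strip())
    return None
```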
v0.1.1
New features
- Profile tags — Profiles now accept an optional `tag` field on creation, allowing you to namespace profiles by your own identifiers (e.g., end-user IDs). `GET /profiles/ids` accepts a `?tag=` query parameter to filter profiles by tag.
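Building the filtered listing URL is mostly a matter of encoding the tag safely, as in this sketch. The base URL in the usage below is a placeholder, not the real API host.

```python
from urllib.parse import urlencode


def profile_ids_url(base_url: str, tag: str = "") -> str:
    """Return the GET /profiles/ids URL, optionally filtered by tag."""
    url = base_url.rstrip("/") + "/profiles/ids"
    if tag:
        # urlencode escapes characters that are unsafe in a query string.
        url += "?" + urlencode({"tag": tag})
    return url
```

For example, `profile_ids_url("https://api.example.com/v1/phota", "user:42")` yields a URL ending in `/profiles/ids?tag=user%3A42`.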
Improvements
- Image format handling — Input images exceeding 4096px are automatically resized to 4K. Supported formats (JPEG, PNG, WebP, HEIC/HEIF) are now documented in the OpenAPI spec field descriptions. PNGs are automatically transcoded to optimized JPG.
- Request tracing — Every response now includes an `X-Request-Id` header, and error responses include a `request_id` field. Use these when reporting issues for faster debugging.
- Request timeout — Maximum request duration increased from 360s to 600s, fixing 504 timeouts on long-running generation requests.
- Content moderation errors — Content moderation blocks now return HTTP 400 with error type `CONTENT_MODERATION` instead of HTTP 500.
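The tracing and error changes above suggest a small logging helper along these lines. The top-level `type` key for the error type is an assumption about the error body shape; `request_id` and the header name are taken from the entries above.

```python
from typing import Any, Mapping, Optional


def request_id_from(headers: Mapping[str, str], body: Mapping[str, Any]) -> Optional[str]:
    """Prefer the X-Request-Id header, falling back to the body's request_id."""
    for name, value in headers.items():
        if name.lower() == "x-request-id":
            return value
    return body.get("request_id")


def describe_error(status: int, headers: Mapping[str, str], body: Mapping[str, Any]) -> str:
    """Format an error for logs or a support ticket."""
    # ASSUMPTION: the error type lives under a top-level "type" key.
    error_type = body.get("type", "UNKNOWN")
    rid = request_id_from(headers, body) or "n/a"
    if status == 400 and error_type == "CONTENT_MODERATION":
        hint = "blocked by content moderation; retrying unchanged will not help"
    elif status == 504:
        hint = "request timed out"
    else:
        hint = "see error body"
    return f"HTTP {status} {error_type} (request_id={rid}): {hint}"
```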
Bug fixes
- Training error messages — When no subject is detected in training images, the API now returns a descriptive HTTP 400 error instead of a generic 500 error.
v0.1.0 — Initial release
This is the first public release of the Phota API.
Endpoints
Studio
- `POST /edit` — Edit images using a text prompt and optional profile references. Supports base64 and URL input images.
- `POST /generate` — Generate new images from a text prompt alone, without any input image. Supports profile references for identity preservation.
- `POST /enhance` — Automatically enhance a photo without a text prompt.

Profiles
- `POST /profiles/add` — Create a new profile from – reference image URLs. Training runs asynchronously.
- `GET /profiles/{profile_id}/status` — Poll the training status of a profile (`IN_PROGRESS`, `READY`, or `ERROR`).
- `GET /profiles/ids` — List all profile IDs for the authenticated account.
- `GET /profiles/{profile_id}/profile_picture` — Retrieve a profile picture as a JPEG image.
- `DELETE /profiles/{profile_id}` — Permanently delete a profile and all associated data.
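Because training runs asynchronously, a client typically waits on the status endpoint. The sketch below assumes a caller-supplied `get_status` function (hypothetical) that wraps `GET /profiles/{profile_id}/status` and returns one of the three documented status strings; the interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable


def wait_for_profile(
    get_status: Callable[[str], str],
    profile_id: str,
    interval_s: float = 10.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Block until the profile is READY; raise on ERROR or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status(profile_id)
        if status == "READY":
            return status
        if status == "ERROR":
            raise RuntimeError(f"training failed for profile {profile_id}")
        if status != "IN_PROGRESS":
            raise ValueError(f"unexpected status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"profile {profile_id} not ready after {timeout_s}s")
        sleep(interval_s)
```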
Features
- Identity preservation — Train profiles from reference photos and use them across edits and generations. Reference profiles in prompts with the `[[profile_id]]` syntax to control where specific people appear.
- Pro mode — Available on edit and generate endpoints. Enables better instruction following and quality in exchange for a higher cost. Unlocks aspect ratio and resolution control.
- Multi-image output — Generate up to 4 output images per request.
- Flexible input — Accept images as raw base64 strings or publicly accessible URLs.
- Aspect ratio and resolution control — Choose from 11 aspect ratios and up to 4K resolution when pro mode is enabled.
- Billing metadata — Responses include `known_subjects` with counts of known subjects generated per profile, used for billing purposes.
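The `[[profile_id]]` prompt syntax listed under Features can be produced and inspected with a couple of helpers, sketched below. The helper names and the profile ids in the usage are made up for illustration.

```python
import re
from typing import List

# Matches [[...]] with no nested brackets inside.
PROFILE_REF = re.compile(r"\[\[([^\[\]]+)\]\]")


def profile_ref(profile_id: str) -> str:
    """Wrap a profile id in the [[profile_id]] prompt syntax."""
    return f"[[{profile_id}]]"


def referenced_profiles(prompt: str) -> List[str]:
    """Return the distinct profile ids referenced in a prompt, in order."""
    seen: List[str] = []
    for match in PROFILE_REF.finditer(prompt):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

For instance, `f"{profile_ref('abc')} shaking hands with {profile_ref('def')}"` produces a prompt that places two specific people in the scene.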
Authentication
- All endpoints require the `X-API-Key` header. See Authentication for details.
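A minimal sketch of attaching the `X-API-Key` header with the Python standard library. The JSON request body and `Content-Type` header are assumptions about the request format, and the URL passed in is whatever endpoint URL you use; the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Any, Dict


def build_request(url: str, api_key: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": api_key,                # required on all endpoints
            "Content-Type": "application/json",  # ASSUMPTION: JSON bodies
        },
        method="POST",
    )
```

Sending it is then a matter of passing the result to `urllib.request.urlopen`, ideally with an explicit timeout.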
