Tokensmart Team · 2 min read

Image Generation Is Here: From a Sentence to a Picture

announcement · product · image

From chatting to painting, one sentence away

Until today, your Tokensmart API key could call GPT, Claude, and Gemini. Now it can paint too — same account, same balance, same request log.

What shipped

  • A new Image Generation page, ready to use once you log in
  • The gpt-image family of models, with more to come
  • An OpenAI-compatible POST /v1/images/generations endpoint: existing OpenAI SDK code works once base_url and the API key point at Tokensmart
  • Per-call billing, itemized invoices, and request logs fully unified with the text-model pipeline

Using the web page

Open the dashboard and click "Images" in the sidebar:

  1. Type a one-sentence description — "A cozy orange cat in a wizard's hat, sitting on a rooftop under moonlight"
  2. Size, quality, background, and batch count default to Auto. To change any of them, click the chips above the input
  3. Press Enter to send, Shift+Enter for a new line. The input is IME-friendly: pressing Enter to confirm a candidate during Chinese pinyin input does not submit the prompt

Within seconds the image lands. The result card shows elapsed time and cost; click the image to zoom, or use the top-right button to download. History is archived automatically, and images can be downloaded one at a time or as a batch ZIP.

Using the API

The protocol is fully OpenAI-compatible. Just point base_url at https://api.tokensmart.ai/v1:

from openai import OpenAI

client = OpenAI(
    api_key="pk_live_...",
    base_url="https://api.tokensmart.ai/v1",
)

result = client.images.generate(
    model="gpt-image-2",
    prompt="A cozy cabin in a snowy forest at dusk",
    size="1024x1024",
)
# Assumes the response carries a hosted URL; if only b64_json is returned, url will be None
print(result.data[0].url)

Or plain curl:

curl https://api.tokensmart.ai/v1/images/generations \
  -H "Authorization: Bearer pk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-image-2","prompt":"A cozy cabin in a snowy forest at dusk","size":"1024x1024"}'
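
The Python example above reads result.data[0].url. In OpenAI's Images API an image item can carry either url or b64_json, and this post does not say which one Tokensmart populates for the gpt-image models. A minimal sketch that handles both cases, where resolve_image is a hypothetical helper name and not part of any SDK:

```python
import base64
from typing import Optional, Tuple, Union


def resolve_image(
    url: Optional[str], b64_json: Optional[str]
) -> Tuple[str, Union[str, bytes]]:
    """Return ("bytes", data) for inline base64 payloads, or ("url", link) for hosted images.

    Hypothetical helper: which field Tokensmart fills in is an assumption to verify.
    """
    if b64_json:
        # Inline payload: decode to raw image bytes, ready to write to disk.
        return "bytes", base64.b64decode(b64_json)
    if url:
        # Hosted image: the caller downloads it separately.
        return "url", url
    raise ValueError("image item has neither url nor b64_json")
```

With the SDK client shown earlier, this could be called as resolve_image(result.data[0].url, result.data[0].b64_json). When the first element is "bytes", write the second to a file opened in binary mode.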

Why per-call billing

Text models are priced per token, but image models have a different cost structure: for the same prompt, output resolution and quality drive upstream GPU time, not "word count." So image models use per-call pricing, where one call = one charge, in line with industry-standard image APIs.

Rates live in the "Per Generation" column on the pricing page, and each model's price is printed right next to its name in the composer's model dropdown — no guessing.

Blocked prompts never cost you

If your prompt is flagged by the upstream policy check (sexual content, real-person likeness, violence, etc.), we:

  • Show a friendly inline yellow notice above the composer rather than a red 500 error
  • Never deduct balance — the pre-held amount is refunded within seconds

In short: if your description fails the check, just rewrite it. You pay nothing.
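
The behavior above is described for the web composer; how a blocked prompt surfaces over the API is not stated here. Assuming the endpoint follows OpenAI's convention of an HTTP 400 whose error object carries a code such as content_policy_violation or moderation_blocked, a client could separate "rewrite the prompt" from other failures roughly like this. The code set and the is_policy_block helper are illustrative assumptions, not documented Tokensmart behavior:

```python
from typing import Any, Mapping

# Assumed codes, borrowed from OpenAI's image API; Tokensmart's actual values may differ.
POLICY_CODES = {"content_policy_violation", "moderation_blocked"}


def is_policy_block(status_code: int, body: Mapping[str, Any]) -> bool:
    """True when an error response looks like a policy rejection rather than an outage."""
    error = body.get("error")
    if not isinstance(error, Mapping):
        return False
    # A policy rejection is a client-side (400) error with a recognized code.
    return status_code == 400 and error.get("code") in POLICY_CODES
```

A caller that gets True here would prompt the user to rephrase instead of retrying the same request.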

What's next

Upcoming capabilities on our roadmap, prioritized by feedback:

  • Image-to-image — start from a reference image to generate variants
  • Inpainting — select a region and regenerate just that part
  • More image models (Flux, SDXL, etc.) plus early work on video models

Ideas or issues? Drop us a note in the enterprise WeChat group or email support@tokensmart.ai 🎨