IMA Voice AI Creation
Scope & Dependencies (Declared for Transparency)
- Credentials: This skill requires an IMA API key at runtime (IMA_API_KEY or --api-key). The key is sent only to api.imastudio.com. Obtain keys at https://imastudio.com. Declared in the registry as required.
- Optional dependency: When ima-knowledge-ai is installed, this skill may instruct the agent to read that skill's reference files (~/.openclaw/skills/ima-knowledge-ai/references/*) for workflow and model-selection guidance. This skill is self-contained: it works fully without ima-knowledge-ai. Reading another skill's files is optional and only for complex or multi-step tasks; users who do not have or trust ima-knowledge-ai can ignore those steps and use this skill's built-in defaults and 📥 User Input Parsing tables.
- Local paths: This skill reads/writes ~/.openclaw/memory/ima_prefs.json (preferences) and ~/.openclaw/logs/ima_skills/ (logs; auto-deleted after 7 days). Users can delete these anytime.
Optional: Read Knowledge Base (When ima-knowledge-ai Is Installed)
If ima-knowledge-ai is not installed: Skip this section. Use only this SKILL's default models and the 📥 User Input Parsing tables for model_id and parameters.
When ima-knowledge-ai is installed and the task is complex, you may optionally read its reference files for better workflow and model choice:
Workflow complexity: Read ima-knowledge-ai/references/workflow-design.md if:
- User mentions: "MV", "配乐" (score), "完整作品" (complete work), "多步骤" (multi-step), "soundtrack"
- Task involves: video + music coordination, multi-track production, integrated workflows
- Complex requirements that need task decomposition
Model selection: Read ima-knowledge-ai/references/model-selection.md if:
- Unsure which model to use (Suno vs DouBao BGM vs DouBao Song)
- Need cost/quality trade-off guidance
- User specifies budget or quality requirements
Why this is optional:
- Music generation is often part of a larger workflow (video + music, story + soundtrack)
- For simple single-track requests, proceed directly with this skill's defaults
- For complex workflows, reading the knowledge base can improve task decomposition and model choice
Example workflow case (when using optional knowledge base):
User: "帮我做个产品宣传MV,有背景音乐" ("Make me a product promo MV with background music")
❌ Wrong: Generate the music directly (music alone, no coordination with video)
✅ Right (if ima-knowledge-ai available):
1. Read workflow-design.md
2. Decompose: Script → Video shots → Background music (matching video duration/mood)
3. Generate video first (get duration)
4. Generate BGM with matching duration and style
How to check (optional):
# Only if ima-knowledge-ai is installed and task is complex
if ima_knowledge_ai_installed and (complex_workflow or multi_step):
    read("~/.openclaw/skills/ima-knowledge-ai/references/workflow-design.md")
if ima_knowledge_ai_installed and unsure_model_choice:
    read("~/.openclaw/skills/ima-knowledge-ai/references/model-selection.md")
# Choose model (this skill's logic works with or without knowledge base)
# `request` is the normalized (lowercased) user request text
if any(k in request for k in ("background music", "bgm", "instrumental")):
    use_doubao_bgm()  # 30pts, pure instrumental
elif any(k in request for k in ("song", "lyrics", "vocals")):
    use_suno_sonic()  # 25pts, full-featured with lyrics
else:
    use_suno_sonic()  # Default: most versatile
For simple requests: Proceed directly with this skill's defaults. No need to read other skills' files.
📥 User Input Parsing (Model & Parameter Recognition)
Purpose: So that any agent (Claude or other models) parses user intent consistently, follow these rules when deriving model_id and task type from natural language. Normalize first, then map.
1. User phrasing → model selection (model_id)
| User intent / phrasing | model_id | Notes |
|---|---|---|
| BGM / 背景音乐 / 纯音乐 / 无人声 / instrumental / 配乐 | GenBGM | DouBao BGM, 30 pts, ~30s |
| 歌 / 歌曲 / 带歌词 / 人声 / song / lyrics / 有唱 | sonic or GenSong | Suno (25 pts, ~2min) or DouBao Song (30 pts, ~30s) |
| Suno / 苏诺 / sonic | sonic | Full-featured, lyrics, vocal_gender, 25 pts |
| 豆包 BGM / DouBao BGM / BGM | GenBGM | 30 pts |
| 豆包 歌曲 / DouBao Song / 豆包歌 | GenSong | 30 pts |
| 最便宜 / 最省钱 / cheapest / budget | GenBGM or GenSong (6 pts tier if available) | Only if user explicitly asks for cheapest |
| 最好 / 最全功能 / best / 带歌词可调 | sonic | Suno default |
If the user does not specify, default to Suno (sonic) for versatility. For "背景音乐"/"BGM"/"配乐" only → use DouBao BGM (GenBGM).
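The mapping in the table above could be sketched as a small keyword matcher. This is an illustration only, not the bundled script's logic: the function name select_model_id is hypothetical, the match order (explicit model names before generic phrasing) is an assumption, generic song phrasing is resolved to sonic because that is the stated default, and the "cheapest" row is left out because it depends on runtime pricing.

```python
def select_model_id(user_text: str) -> str:
    """Map normalized user phrasing to a model_id (hypothetical helper)."""
    text = user_text.strip().lower()  # normalize first, then map

    def has(*keys: str) -> bool:
        return any(k in text for k in keys)

    # Explicit model names win over generic phrasing (assumed precedence).
    if has("豆包歌", "豆包 歌曲", "doubao song"):
        return "GenSong"
    if has("suno", "苏诺", "sonic"):
        return "sonic"
    # Background-music phrasing, including an explicit "DouBao BGM".
    if has("bgm", "背景音乐", "纯音乐", "无人声", "instrumental", "配乐"):
        return "GenBGM"
    # Song phrasing and unspecified requests fall back to the default.
    return "sonic"
```

Checking explicit names first keeps a request such as "Use Suno for an instrumental track" on Suno, where the instrumental wish can then be expressed through Suno's own parameters.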
2. Music-specific parameters (Suno)
| User says (examples) | Parameter | Action |
|---|---|---|
| 无人声 / 纯音乐 / no vocals / instrumental | make_instrumental | true |
| 女声 / 女声演唱 / female vocals | vocal_gender | "female" (custom_mode true) |
| 男声 / 男声演唱 / male vocals | vocal_gender | "male" (custom_mode true) |
| 我写歌词 / 自定义歌词 / custom lyrics | custom_mode + lyrics | Provide lyrics in request |
When using Suno with lyrics or vocal control, set custom_mode: true and pass lyrics / vocal_gender per API docs.
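A minimal sketch of how the Suno rows above might be turned into a parameter dict. The helper name suno_params is hypothetical, and the key name lyrics is assumed from the table; check the API docs for the exact field names before relying on it.

```python
from typing import Optional


def suno_params(user_text: str, lyrics: Optional[str] = None) -> dict:
    """Derive Suno-specific parameters from user phrasing (hypothetical helper)."""
    text = user_text.lower()
    params: dict = {}
    if any(k in text for k in ("无人声", "纯音乐", "no vocals", "instrumental")):
        params["make_instrumental"] = True
    # Test "female" before "male": the string "female vocals" contains "male vocals".
    if any(k in text for k in ("女声", "female vocals")):
        params["vocal_gender"] = "female"
    elif any(k in text for k in ("男声", "male vocals")):
        params["vocal_gender"] = "male"
    if lyrics:
        params["lyrics"] = lyrics  # assumed key name
    # Lyrics and vocal control both require custom mode.
    if "vocal_gender" in params or "lyrics" in params:
        params["custom_mode"] = True
    return params
```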
⚙️ How This Skill Works
For transparency: This skill uses a bundled Python script (scripts/ima_voice_create.py) to call the IMA Open API. The script:
- Sends your prompt to https://api.imastudio.com (IMA's servers)
- Uses --user-id only locally as a key for storing your model preferences
- Returns a music URL when generation is complete
- NEW (v1.1.0): Automatic reflection mechanism — if generation fails, the script automatically retries up to 3 times with smart parameter adjustments
🧠 Reflection Mechanism (Automatic Error Recovery)
This skill now includes an intelligent reflection system that automatically recovers from common errors:
3-Layer Retry Strategy:
Attempt 1: Original Parameters
- Uses your provided parameters with smart credit_rule selection
- Most tasks succeed on first try
Attempt 2: Strict Match (Error 6009 Fix)
- Automatically removes unsupported parameters
- Only keeps parameters in credit_rules.attributes
- Example: Removes unsupported Suno parameters if not in model config
Attempt 3: Fallback to Default (Error 6010 Fix)
- Uses model's default configuration
- Uses credit_rules[0] (first rule = safest default)
- Maximizes compatibility
Common Errors Fixed Automatically:
- Error 6009: "No exact rule match found for parameters" → removes unsupported params
- Error 6010: "Attribute ID does not match" → corrects attribute_id to match params
- Invalid product attribute: Uses default rule configuration
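The three attempts described above could be sketched roughly as follows. This is not the bundled script's code: plan_attempts, create_with_reflection, and the create_task callable are hypothetical names, the shape of each rule's "attributes" entry (a mapping keyed by supported parameter name) is an assumption, and the sketch always uses credit_rules[0] rather than the script's rule selection.

```python
def plan_attempts(user_params: dict, credit_rules: list) -> list:
    """Return (label, rule, params) tuples in the order they should be tried."""
    rule = credit_rules[0]
    allowed = set(rule.get("attributes", {}))  # assumed shape
    strict = {k: v for k, v in user_params.items() if k in allowed}
    return [
        ("original", rule, dict(user_params)),  # attempt 1: as provided
        ("strict_match", rule, strict),         # attempt 2: drop unsupported params
        ("default", rule, {}),                  # attempt 3: model defaults only
    ]


def create_with_reflection(create_task, user_params: dict, credit_rules: list):
    """Try each planned attempt; return (task_id, reflection_log)."""
    log = []
    for label, rule, params in plan_attempts(user_params, credit_rules):
        try:
            return create_task(rule, params), log
        except RuntimeError as exc:  # create_task is assumed to raise on API errors
            log.append((label, str(exc)))
    raise RuntimeError(f"All attempts failed: {log}")
```

The returned log plays the role of the reflection log shown to the user when a retry was needed.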
User Experience:
- ✅ Transparent: Shows reflection log when retries happen
- ✅ Fast: Most tasks succeed on first attempt (no delay)
- ✅ Smart: Learns from errors and adjusts automatically
- ✅ User-friendly: Provides helpful suggestions if all 3 attempts fail
Example Output (with reflection):
🚀 Creating music generation task…
🧠 反省日志 (2 次尝试):
❌ [尝试 1] Invalid product attribute → 移除不支持的参数: ['unsupported_param']
✅ [尝试 2] ✅ 成功(尝试 2)
✅ Task created: task_abc123xyz
What gets sent to IMA servers:
- ✅ Your music prompt/description
- ✅ Model selection (Suno/DouBao)
- ❌ NO API key in prompts (key is used for authentication only)
- ❌ NO user_id (it's only used locally)
What's stored locally:
- ~/.openclaw/memory/ima_prefs.json: Your model preferences (< 1 KB)
- ~/.openclaw/logs/ima_skills/: Generation logs (auto-deleted after 7 days)
Agent Execution (Internal Reference)
Note for users: You can review the script source at scripts/ima_voice_create.py anytime.
This skill uses only api.imastudio.com (no image upload needed for music generation, unlike image/video skills which also use imapi.liveme.com).
Use the bundled script internally:
# Generate music — Suno sonic-v5
python3 {baseDir}/scripts/ima_voice_create.py \
--api-key $IMA_API_KEY \
--task-type text_to_music \
--model-id sonic \
--prompt "upbeat lo-fi hip hop, 90 BPM, no vocals" \
--user-id {user_id} \
--output-json
# DouBao BGM
python3 {baseDir}/scripts/ima_voice_create.py \
--api-key $IMA_API_KEY \
--model-id GenBGM \
--prompt "calm ambient piano background" \
--user-id {user_id} \
--output-json
The script outputs JSON — parse it to get the result URL and pass it to the user via the UX protocol messages below.
Overview
Call IMA Open API to create AI-generated music/audio. All endpoints require an ima_* API key. The core flow is: query products → create task → poll until done.
🔒 Security & Transparency Policy
This skill is community-maintained and open for inspection.
🌐 Network Architecture
This skill uses a simpler network architecture than image/video skills:
| Skill Type | Domains Used | Why |
|---|---|---|
| ima-voice-ai (this skill) | ✅ api.imastudio.com only | Music generation doesn't require image uploads |
| ima-image-ai, ima-video-ai | api.imastudio.com + imapi.liveme.com | Image/video tasks need image upload service |
Why the difference?
- Music generation (text_to_music) only needs text prompts → single API endpoint
- Image/video generation (i2i, i2v tasks) needs image file uploads → requires separate upload service
Security verification:
# Verify this skill only uses api.imastudio.com:
grep -n "https://" scripts/ima_voice_create.py
# Expected output:
# Only https://api.imastudio.com (no imapi.liveme.com)
✅ What Users CAN Do
Full transparency:
- ✅ Review all source code: Check scripts/ima_voice_create.py and ima_logger.py anytime
- ✅ Verify network calls: This skill uses only api.imastudio.com (music generation doesn't require image uploads). Verify by running: grep -n "https://" scripts/ima_voice_create.py
- ✅ Inspect local data: View ~/.openclaw/memory/ima_prefs.json and log files
- ✅ Control privacy: Delete preferences/logs anytime, or disable file writes (see below)
Configuration allowed:
- ✅ Set API key in environment or agent config:
  - Environment variable: export IMA_API_KEY=ima_your_key_here
  - OpenClaw/MCP config: Add IMA_API_KEY to agent's environment configuration
  - Get your key at: https://imastudio.com
- ✅ Use scoped/test keys: Test with limited API keys, rotate after testing
- ✅ Disable file writes: Make prefs/logs read-only or symlink to /dev/null
Data control:
- ✅ View stored data: cat ~/.openclaw/memory/ima_prefs.json
- ✅ Delete preferences: rm ~/.openclaw/memory/ima_prefs.json (resets to defaults)
- ✅ Delete logs: rm -rf ~/.openclaw/logs/ima_skills/ (auto-cleanup after 7 days anyway)
⚠️ Advanced Users: Fork & Modify
If you need to modify this skill for your use case:
- Fork the repository (don't modify the original)
- Update your fork with your changes
- Test thoroughly with limited API keys
- Document your changes for troubleshooting
Note: Modified skills may break API compatibility or introduce security issues. Official support only covers the unmodified version.
❌ What to AVOID (Security Risks)
Actions that could compromise security:
- ❌ Sharing API keys publicly or in skill files
- ❌ Modifying API endpoints to unknown servers
- ❌ Disabling SSL/TLS certificate verification
- ❌ Logging sensitive user data (prompts, IDs, etc.)
- ❌ Bypassing authentication or billing mechanisms
Why this matters:
- API Compatibility: Skill logic aligns with IMA Open API schema
- Security: Malicious modifications could leak credentials or bypass billing
- Support: Modified skills may not be supported
- Community: Breaking changes affect all users
📁 File System Access (Declared)
This skill reads/writes the following files:
| Path | Purpose | Size | Auto-cleanup | User Control |
|---|---|---|---|---|
| ~/.openclaw/memory/ima_prefs.json | User model preferences | < 1 KB | No | Delete anytime |
| ~/.openclaw/logs/ima_skills/ | Generation logs | ~10-50 KB/day | 7 days | Delete anytime |
What's stored:
- ✅ Model preferences (e.g., "last used: Suno sonic-v5")
- ✅ Timestamps (e.g., "2026-02-27 12:34:56")
- ✅ Task IDs and HTTP status codes
- ❌ NO API keys
- ❌ NO personal data
- ❌ NO prompts or generated content
Full transparency: See the complete data flow and privacy policy in the skill documentation above.
📋 Privacy & Data Handling Summary
What this skill does with your data:
| Data Type | Sent to IMA? | Stored Locally? | User Control |
|---|---|---|---|
| Music prompts | ✅ Yes (required for generation) | ❌ No | None (required) |
| API key | ✅ Yes (authentication header) | ❌ No | Set via env var |
| user_id (optional CLI arg) | ❌ Never (local preference key only) | ✅ Yes (as prefs file key) | Change --user-id value |
| Model preferences | ❌ No | ✅ Yes (~/.openclaw) | Delete anytime |
| Generation logs | ❌ No | ✅ Yes (~/.openclaw) | Auto-cleanup 7 days |
Privacy recommendations:
- Use test/scoped API keys for initial testing
- Note: --user-id is never sent to IMA servers; it's only used locally as a key for storing preferences in ~/.openclaw/memory/ima_prefs.json
- Review source code at scripts/ima_voice_create.py to verify network calls (search for the create_task function)
- Rotate API keys after testing or if compromised
Get your IMA API key: Visit https://imastudio.com to register and get started.
🔧 For Skill Maintainers Only
Version control:
- All changes must go through Git with proper version bumps (semver)
- CHANGELOG.md must document all changes
- Production deployments require code review
File checksums (optional):
# Verify skill integrity
sha256sum SKILL.md scripts/ima_voice_create.py
If users report issues, verify file integrity first.
🧠 User Preference Memory
User preferences override recommended defaults. If a user has generated before, use their preferred model — not the system default.
Storage: ~/.openclaw/memory/ima_prefs.json
{
  "user_{user_id}": {
    "text_to_music": { "model_id": "sonic", "model_name": "Suno", "credit": 25, "last_used": "..." }
  }
}
If the file or key doesn't exist, fall back to the ⭐ Recommended Defaults below.
When to Read (Before Every Generation)
- Load ~/.openclaw/memory/ima_prefs.json (silently, no error if missing)
- Look up user_{user_id}.text_to_music
- If found → use that model; mention it:
  🎵 根据你的使用习惯,将用 [Model Name] 帮你生成音乐…
  • 模型:[Model Name](你的常用模型)
  • 预计耗时:[X ~ Y 秒]
  • 消耗积分:[N pts]
- If not found → use the ⭐ Recommended Default (Suno sonic-v5)
When to Write (After Every Successful Generation)
Save the used model to ~/.openclaw/memory/ima_prefs.json under user_{user_id}.text_to_music.
See ima-image-ai/SKILL.md → "User Preference Memory" for the full Python write snippet.
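For readers without that other skill at hand, the read and write steps might look like the sketch below. It follows the JSON layout shown under Storage; the function names load_pref and save_pref are hypothetical, and the ISO-8601 format used for last_used is an assumption, since the source elides that value.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

PREFS_PATH = Path.home() / ".openclaw" / "memory" / "ima_prefs.json"


def _read_prefs(path: Path) -> dict:
    """Load the prefs file; a missing or unreadable file is treated as empty."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def load_pref(user_id: str, path=PREFS_PATH):
    """Return the saved text_to_music preference, or None if there is none."""
    return _read_prefs(Path(path)).get(f"user_{user_id}", {}).get("text_to_music")


def save_pref(user_id: str, model_id: str, model_name: str, credit: int, path=PREFS_PATH) -> None:
    """Record the model used by a successful generation."""
    path = Path(path)
    prefs = _read_prefs(path)
    prefs.setdefault(f"user_{user_id}", {})["text_to_music"] = {
        "model_id": model_id,
        "model_name": model_name,
        "credit": credit,
        "last_used": datetime.now(timezone.utc).isoformat(),  # assumed format
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(prefs, ensure_ascii=False, indent=2), encoding="utf-8")
```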
When to Update (User Explicitly Changes Model)
| Trigger | Action |
|---|---|
| 用XXX / 换成XXX | Switch + save as new preference |
| 以后都用XXX / always use XXX | Save + confirm: ✅ 已记住!以后音乐生成默认用 [XXX] |
| 用便宜的 / cheapest | Use DouBao BGM/Song; do NOT save unless user says "以后都用" |
⭐ Recommended Defaults
These are fallback defaults — only used when no user preference exists.
Always default to the newest and most popular model. Do NOT default to the cheapest.
| Task | Default Model | model_id | model_version | Cost | Why |
|---|---|---|---|---|---|
| text_to_music | Suno (sonic-v5) | sonic | sonic | 25 pts | Latest Suno engine, best quality |
| text_to_music (BGM only) | DouBao BGM | GenBGM | GenBGM | 30 pts | Background music |
| text_to_music (song) | DouBao Song | GenSong | GenSong | 30 pts | Song generation |
Selection guide by use case:
- Custom song with lyrics, vocals, style → Suno sonic-v5 (default)
- Background music / ambient loop → DouBao BGM
- Simple song generation → DouBao Song
- User explicitly asks for cheapest → DouBao BGM/Song (6pts each) — only if explicitly requested
⚠️ For Suno: model_version inside parameters (e.g. sonic-v5) is different from the outer model_version field (which is sonic). Always set both.
💬 User Experience Protocol (IM / Feishu / Discord) v1.1 🆕
v1.1 Update: Added Step 0 to ensure correct message ordering in group chats (learned from ima-image-ai v1.2).
Music generation completes in 10~45 seconds. Never let users wait in silence.
Always follow all 5 steps below, every single time.
🚫 Never Say to Users
| ❌ Never say | ✅ What users care about |
|---|---|
| ima_voice_create.py / 脚本 / script | - |
| 自动化脚本 / automation | - |
| 自动处理产品列表 / 查询接口 | - |
| 自动解析参数 / 智能轮询 | - |
| attribute_id / model_version / form_config | - |
| API 调用 / HTTP 请求 / 任何技术参数名 | - |
Only tell users: model name · estimated time · credits · result (audio file/player) · plain-language status.
Estimated Generation Time per Model
| Model | Estimated Time | Poll Every | Send Progress Every |
|---|---|---|---|
| DouBao BGM | 10~25s | 5s | 10s |
| DouBao Song | 10~25s | 5s | 10s |
| Suno (sonic-v5) | 20~45s | 5s | 15s |
estimated_max_seconds = upper bound (e.g. 45 for Suno, 25 for DouBao BGM/Song).
Step 0 — Initial Acknowledgment Reply (Normal Reply) 🆕
⚠️ CRITICAL: This step ensures correct message ordering in group chats.
Before doing anything else, reply to the user with a friendly acknowledgment message using your normal reply (not message tool). This reply will automatically appear FIRST in the conversation.
Example acknowledgment messages:
好的!马上帮你生成轻松的背景音乐 🎵
收到!用 Suno 为你创作一首歌 🎶
OK! Starting music generation with DouBao BGM 🎵
Rules:
- Keep it short and warm (< 15 words)
- Match the user's language (Chinese/English)
- Include relevant emoji (🎵/🎶/🎸/🎹)
- This is your ONLY normal reply; all subsequent updates use the message tool
Why this matters:
- Normal replies automatically appear FIRST in the conversation thread
- message tool pushes appear in chronological order AFTER your initial reply
- This ensures users see: "好的!" → "🎵 开始生成..." → "✅ 成功!" (correct order)
Step 1 — Pre-Generation Notification (Push via message tool)
After Step 0 reply, use the message tool to push a notification immediately to the group/channel:
# Agent implementation
message(
    action="send",
    target=group_id,  # or channel_id for Discord
    message="""🎵 开始生成音乐,请稍候…
• 模型:[Model Name]
• 预计耗时:[X ~ Y 秒]
• 消耗积分:[N pts]"""
)
User-facing message template:
🎵 开始生成音乐,请稍候…
• 模型:[Model Name]
• 预计耗时:[X ~ Y 秒]
• 消耗积分:[N pts]
Cost transparency:
- Balanced (Suno 25 pts): "使用 Suno(25 积分,功能最全)"
- DouBao alternatives (30 pts each): "使用 DouBao BGM(30 积分)" — only if user explicitly requests DouBao or background music type
Adapt language to match the user. English → 🎵 Starting music generation, please wait [X~Y] seconds…
Step 2 — Progress Updates
Poll the task detail API every 5s.
Send a progress update every [Send Progress Every] seconds per the table above.
⏳ 音乐生成中… [P]%
已等待 [elapsed]s,预计最长 [max]s
Progress formula:
P = min(95, floor(elapsed_seconds / estimated_max_seconds * 100))
- Cap at 95%: never show 100% until the API returns success
- If elapsed > estimated_max: keep P at 95% and append 「快好了,稍等…」
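The progress formula and its two rules can be written out as a short sketch. The function names are hypothetical; the message text is the template from this step.

```python
import math


def progress_percent(elapsed_seconds: float, estimated_max_seconds: float) -> int:
    """P = min(95, floor(elapsed / estimated_max * 100))."""
    # Multiply before dividing to avoid float rounding on exact ratios.
    return min(95, math.floor(elapsed_seconds * 100 / estimated_max_seconds))


def progress_message(elapsed_seconds: float, estimated_max_seconds: float) -> str:
    """Build the Step 2 progress text, adding the overtime note when needed."""
    p = progress_percent(elapsed_seconds, estimated_max_seconds)
    text = (
        f"⏳ 音乐生成中… {p}%\n"
        f"已等待 {int(elapsed_seconds)}s,预计最长 {int(estimated_max_seconds)}s"
    )
    if elapsed_seconds > estimated_max_seconds:
        text += "\n快好了,稍等…"
    return text
```

For Suno (estimated_max_seconds = 45), 15 seconds of waiting gives 33%, and anything from 43 seconds onward stays at the 95% cap until the API reports success.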
Step 3 — Success Notification (Push audio via message tool)
When task status = success, use the message tool to send the generated audio directly (not as a text URL):
Agent implementation:
# Get result URL from script output or task detail API
result = get_task_result(task_id)
audio_url = result["medias"][0]["url"]
# Push audio + caption to group/channel
message(
    action="send",
    target=group_id,
    media=audio_url,  # Feishu/Discord will render the audio
    caption=f"""✅ 音乐生成成功!
• 模型:[Model Name]
• 耗时:预计 [X~Y]s,实际 [actual]s
• 消耗积分:[N pts]
🔗 原始链接:{audio_url}"""
)
User-facing message:
✅ 音乐生成成功!
• 模型:[Model Name]
• 耗时:预计 [X~Y]s,实际 [actual]s
• 消耗积分:[N pts]
🔗 原始链接:https://ws.esxscloud.com/.../audio.wav
[音频直接显示为文件卡片,可点击播放]
Platform-specific notes:
- Feishu: message(action=send, media=url, caption="...") (caption appears with the audio file card)
- Discord: Audio embeds automatically from URL; caption can be in message text
- Telegram: Use message(action=send, media=url, caption="...")
⚠️ Important:
- Always send audio via the media parameter (file card/player) + include URL in caption text
- Do NOT use local file paths like /tmp/audio.wav; use the HTTP URL from the API
- Users expect: (1) clickable audio file card + (2) raw URL link for sharing/downloading
- Format: media=audio_url + caption="...🔗 原始链接:{audio_url}"
Step 4 — Failure Notification (Push via message tool)
When task status = failed or any API/network error, push a failure message with alternative suggestions:
Agent implementation:
message(
    action="send",
    target=group_id,
    message="""❌ 音乐生成失败
• 原因:[natural_language_error_message]
• 建议改用:
  - [Alt Model 1]([特点],[N pts])
  - [Alt Model 2]([特点],[N pts])
需要我帮你用其他模型重试吗?"""
)
⚠️ CRITICAL: Error Message Translation
NEVER show technical error messages to users. Always translate API errors into natural language.
API key & credits: Keys and subscriptions are managed at imaclaw.ai (same IMA platform as imastudio.com).
| Technical Error | ❌ Never Say | ✅ Say Instead (Chinese) | ✅ Say Instead (English) |
|---|---|---|---|
| 401 Unauthorized 🆕 | Invalid API key / 401 Unauthorized | ❌ API密钥无效或未授权 💡 生成新密钥: https://www.imaclaw.ai/imaclaw/apikey | ❌ API key is invalid or unauthorized 💡 Generate API Key: https://www.imaclaw.ai/imaclaw/apikey |
| 4008 Insufficient points 🆕 | Insufficient points / Error 4008 | ❌ 积分不足,无法创建任务 💡 购买积分: https://www.imaclaw.ai/imaclaw/subscription | ❌ Insufficient points to create this task 💡 Buy Credits: https://www.imaclaw.ai/imaclaw/subscription |
| "Invalid product attribute" / "Insufficient points" | Invalid product attribute | 生成参数配置异常,请稍后重试 | Configuration error, please try again later |
| Error 6006 (credit mismatch) | Error 6006 | 积分计算异常,系统正在修复 | Points calculation error, system is fixing |
| Error 6010 (attribute_id mismatch) | Attribute ID does not match | 模型参数不匹配,请尝试其他模型 | Model parameters incompatible, try another model |
| error 400 (bad request) | error 400 / Bad request | 音乐参数设置有误,请调整描述后重试 | Music parameter error, adjust description and retry |
| resource_status == 2 | Resource status 2 / Failed | 音乐生成遇到问题,建议换个模型试试 | Music generation failed, try another model |
| status == "failed" (no details) | Task failed | 这次生成没成功,要不换个模型试试? | Generation unsuccessful, try a different model? |
| timeout | Task timed out / Timeout error | 音乐生成时间过长已超时,建议用更快的模型 | Music generation took too long, try a faster model |
| Network error / Connection refused | Connection refused / Network error | 网络连接不稳定,请检查网络后重试 | Network connection unstable, check network and retry |
| Rate limit exceeded | 429 Too Many Requests / Rate limit | 请求过于频繁,请稍等片刻再试 | Too many requests, please wait a moment |
| Model unavailable | Model not available / 503 Service Unavailable | 当前模型暂时不可用,建议换个模型 | Model temporarily unavailable, try another model |
| Lyrics format error (Suno only) | Invalid lyrics format | 歌词格式有误,请调整后重试 | Lyrics format error, adjust and retry |
| Prompt too short/long | Prompt length invalid | 音乐描述过短或过长,请调整到合适长度 | Music description too short or long, adjust length |
Generic fallback (when error is unknown):
- Chinese: 音乐生成遇到问题,请稍后重试或换个模型试试
- English: Music generation encountered an issue, please try again or use another model
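One way to apply the translation table with the generic fallback is a plain lookup, sketched below with a few of the rows. The lookup keys ("6006", "timeout", and so on) and the function name user_message are hypothetical; the message strings are taken from the table.

```python
GENERIC = {
    "zh": "音乐生成遇到问题,请稍后重试或换个模型试试",
    "en": "Music generation encountered an issue, please try again or use another model",
}

# A subset of the table; the keys are illustrative identifiers, not API fields.
MESSAGES = {
    "6006": {"zh": "积分计算异常,系统正在修复", "en": "Points calculation error, system is fixing"},
    "6010": {"zh": "模型参数不匹配,请尝试其他模型", "en": "Model parameters incompatible, try another model"},
    "timeout": {"zh": "音乐生成时间过长已超时,建议用更快的模型", "en": "Music generation took too long, try a faster model"},
    "429": {"zh": "请求过于频繁,请稍等片刻再试", "en": "Too many requests, please wait a moment"},
}


def user_message(error_key, lang: str = "zh") -> str:
    """Return the plain-language text for an error, never the raw error itself."""
    entry = MESSAGES.get(str(error_key), GENERIC)
    return entry.get(lang, entry["en"])
```

Unknown errors fall through to the generic text, so a raw error code never reaches the user.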
Best Practices:
- Focus on user action: Tell users what to do next, not what went wrong technically
- Be reassuring: Use phrases like "建议换个模型试试" instead of "生成失败了"
- Avoid blame: Never say "你的描述有问题" → say "描述需要调整一下"
- Provide alternatives: Always suggest 1-2 alternative models in the failure message
- Music-specific:
- For Suno lyrics errors, suggest simplifying lyrics or using auto-generated lyrics
- For prompt length errors, give example length (e.g., "建议20-100字")
- For BGM requests, recommend DouBao BGM over Suno
- 🆕 Include actionable links (v1.0.8+): For 401/4008 errors, provide clickable links to API key generation or credit purchase pages
🆕 Enhanced Error Handling (v1.0.8):
Music generation uses direct error handling (no Reflection mechanism due to simpler parameters):
- 401 Unauthorized: System provides clickable link to API key generation page
- 4008 Insufficient Points: System provides clickable link to credit purchase page
- Other errors: Clear natural language explanations with alternative model suggestions
Error messages are user-friendly and actionable — users receive clear next steps for resolution.
Failure fallback table:
| Failed Model | First Alt | Second Alt |
|---|---|---|
| Suno | DouBao BGM(30pts,背景音乐) | DouBao Song(30pts,歌曲生成) |
| DouBao BGM | DouBao Song(30pts) | Suno(25pts,功能最强) |
| DouBao Song | DouBao BGM(30pts) | Suno(25pts,功能最强) |
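Combining the fallback table with the Step 4 template, a failure message could be assembled as in this sketch. The names FALLBACKS and failure_message are hypothetical, the credit values are the table's snapshot and may differ at runtime, and the feature labels for the DouBao rows are borrowed from the Suno row where the table omits them.

```python
# (alternative model, feature label, credits) per failed model.
FALLBACKS = {
    "Suno": [("DouBao BGM", "背景音乐", 30), ("DouBao Song", "歌曲生成", 30)],
    "DouBao BGM": [("DouBao Song", "歌曲生成", 30), ("Suno", "功能最强", 25)],
    "DouBao Song": [("DouBao BGM", "背景音乐", 30), ("Suno", "功能最强", 25)],
}


def failure_message(failed_model: str, reason: str) -> str:
    """Build the user-facing failure text with up to two alternatives."""
    lines = ["❌ 音乐生成失败", f"• 原因:{reason}", "• 建议改用:"]
    for name, feature, pts in FALLBACKS.get(failed_model, []):
        lines.append(f"  - {name}({feature},{pts} pts)")
    lines.append("需要我帮你用其他模型重试吗?")
    return "\n".join(lines)
```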
Step 5 — Done (No Further Action Needed) 🆕
v1.1 Note: After completing Steps 0-4:
- ✅ Step 0 already sent your normal reply (appears FIRST in chat)
- ✅ Steps 1-4 pushed all updates via the message tool (appear in order)
- ✅ No further action needed; conversation is complete
Do NOT:
- ❌ Reply again with NO_REPLY (you already replied in Step 0)
- ❌ Send duplicate confirmation messages
- ❌ Use the message tool to send the same content twice
Why this works:
User: "帮我生成一段轻松的背景音乐"
↓
[Step 0] Your normal reply: "好的!马上帮你生成轻松的背景音乐 🎵" ← Appears FIRST
↓
[Step 1] message tool push: "🎵 开始生成音乐..." ← Appears SECOND
↓
[Step 2] message tool push: "⏳ 正在生成中… 45%" ← (if task takes >15s)
↓
[Step 3] message tool push: "✅ 音乐生成成功! [Audio File]" ← Appears LAST
↓
[Step 5] Done. No further replies.
Supported Models
text_to_music (3 models)
| Name | model_id | version_id | Cost | Key form_config |
|---|---|---|---|---|
| Suno | sonic | sonic | 25 pts | model_version=sonic-v5 (latest), custom_mode=true, make_instrumental, auto_lyrics, tags, negative_tags, vocal_gender, title |
| DouBao BGM | GenBGM | GenBGM | 30 pts | - |
| DouBao Song | GenSong | GenSong | 30 pts | - |
Model guidance:
- Suno: Most powerful option. Supports full custom mode with genre tags, explicit instrumental toggle, vocal gender selection, and negative tags to exclude unwanted styles.
- DouBao BGM: Lightweight background music generation. Ideal for ambient / background tracks.
- DouBao Song: Song generation. Good for structured vocal compositions.
What you can generate:
- Background music (lo-fi, ambient, cinematic, electronic, jazz, classical…)
- Custom jingles or theme songs with specific BPM and key
- Vocal or instrumental tracks with mood direction
- Short loops or full-length compositions
Prompt writing tips (for Suno gpt_description_prompt):
- Genre: "lo-fi hip hop", "orchestral cinematic", "upbeat pop", "dark ambient"
- Tempo: "80 BPM", "fast tempo", "slow ballad"
- Vocals: "no vocals" → set make_instrumental=true; "female vocals" → vocal_gender="female"
- Mood: "happy and energetic", "melancholic", "tense and dramatic"
- Negative: negative_tags="heavy metal, distortion" to exclude styles
- Duration hint: "60 seconds", "30 second loop"
Environment
Base URL: https://api.imastudio.com
Required/recommended headers for all /open/v1/ endpoints:
| Header | Required | Value | Notes |
|---|---|---|---|
| Authorization | ✅ | Bearer ima_your_api_key_here | API key authentication |
| x-app-source | ✅ | ima_skills | Fixed value; identifies skill-originated requests |
| x_app_language | recommended | en / zh | Product label language; defaults to en if omitted |
Authorization: Bearer ima_your_api_key_here
x-app-source: ima_skills
x_app_language: en
⚠️ MANDATORY: Always Query Product List First
CRITICAL: You MUST call /open/v1/product/list BEFORE creating any task.
The attribute_id field is REQUIRED in the create request. If it is 0 or missing, you get: "Invalid product attribute" → "Insufficient points" → task fails completely.
NEVER construct a create request from the model table alone. Always fetch the product first.
How to get attribute_id
# Step 1: Query product list
GET /open/v1/product/list?app=ima&platform=web&category=text_to_music
# Step 2: Walk the tree to find your model
for group in response["data"]:
    for version in group.get("children", []):
        if version["type"] == "3" and version["model_id"] == target_model_id:
            attribute_id = version["credit_rules"][0]["attribute_id"]
            credit = version["credit_rules"][0]["points"]
            model_version = version["id"]
            model_name = version["name"]
Quick Reference: Known attribute_ids
⚠️ Production warning: attribute_id and credit values change frequently. Always call /open/v1/product/list at runtime; table below is pre-queried reference (2026-02-27).
| Model | model_id | attribute_id | credit | Notes |
|---|---|---|---|---|
| Suno (sonic-v4) | sonic | 2370 | 25 pts | Default |
| DouBao BGM | GenBGM | 4399 | 30 pts | BGM only |
| DouBao Song | GenSong | 4398 | 30 pts | Songs only |
| All others | - | → query /open/v1/product/list | - | Always runtime query |
Common Mistakes (and resulting errors)
| Mistake | Error |
|---|---|
| attribute_id is 0 or missing | "Invalid product attribute" → Insufficient points |
| attribute_id outdated (production changed) | Same errors; always query product list first |
| prompt at outer level | Prompt ignored |
| cast missing from inner parameters | Billing failure |
| Suno: model_version in parameters not set to sonic-v5 | Wrong engine used |
Core Flow
1. GET /open/v1/product/list?app=ima&platform=web&category=text_to_music
→ REQUIRED: Get attribute_id, credit, model_version, form_config defaults
2. POST /open/v1/tasks/create
→ Must include: attribute_id, model_name, model_version, credit, cast, prompt (nested!)
3. POST /open/v1/tasks/detail {task_id: "..."}
→ Poll every 3–5s until medias[].resource_status == 1
→ Extract url from completed media (mp3)
Supported Task Types
| category | Capability | Input |
|---|---|---|
| text_to_music | Text → Music | prompt |
Detail API status values
| Field | Type | Values |
|---|---|---|
| resource_status | int or null | 0=processing, 1=available, 2=failed, 3=deleted; treat null as 0 |
| status | string | "pending", "processing", "success", "failed" |
| resource_status | status | Action |
|---|---|---|
| 0 or null | pending / processing | Keep polling |
| 1 | success (or completed) | Stop when all medias are 1; read url |
| 1 | failed | Stop, handle error |
| 2 / 3 | any | Stop, handle error |
Important: Treat resource_status: null as 0. Stop only when all medias have resource_status == 1. Check status != "failed" when resource_status is 1.
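The decision table above can be condensed into one function. This is a sketch under assumptions: the name next_action is hypothetical, status is passed in separately because the source does not say whether it lives on the task or on each media, and a "failed" status is treated as an error at any resource_status.

```python
def next_action(status: str, medias: list) -> str:
    """Return "poll", "done", or "error" for one task-detail response."""
    # null resource_status counts as 0 (still processing).
    states = [0 if m.get("resource_status") is None else m["resource_status"] for m in medias]
    if any(s in (2, 3) for s in states):  # failed or deleted media
        return "error"
    if status == "failed":  # covers resource_status=1 combined with status="failed"
        return "error"
    if states and all(s == 1 for s in states):  # every media must be ready
        return "done"
    return "poll"
```

A response with one ready media and one still processing keeps polling, which matches the rule that all medias must reach resource_status == 1.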
API 1: Product List
GET /open/v1/product/list?app=ima&platform=web&category=text_to_music
Returns a V2 tree structure: type=2 nodes are model groups, type=3 nodes are versions (leaves). Only type=3 nodes contain credit_rules and form_config.
How to pick a version:
- Traverse nodes to find type=3 leaves
- Use model_id and id (= model_version) from the leaf
- Pick credit_rules[].attribute_id
- Use form_config[].value as default parameters values
API 2: Create Task
POST /open/v1/tasks/create
text_to_music
No image input. src_img_url: [], input_images: [].
{
  "task_type": "text_to_music",
  "enable_multi_model": false,
  "src_img_url": [],
  "parameters": [{
    "attribute_id": "<from credit_rules>",
    "model_id": "<model_id>",
    "model_name": "<model_name>",
    "model_version": "<version_id>",
    "app": "ima",
    "platform": "web",
    "category": "text_to_music",
    "credit": "<points>",
    "parameters": {
      "prompt": "upbeat electronic, 120 BPM, no vocals",
      "n": 1,
      "input_images": [],
      "cast": {"points": "<points>", "attribute_id": "<attribute_id>"}
    }
  }]
}
Prompt tips for music generation:
- Genre: "upbeat electronic", "classical piano", "ambient chill"
- Tempo: "120 BPM", "slow tempo"
- Vocals: "no vocals", "male vocals", "female vocals"
- Mood: "happy", "melancholic", "energetic"
- Duration hint: "60 seconds", "short loop"
Key fields:
| Field | Required | Description |
|---|---|---|
| parameters[].credit | ✅ | Must equal credit_rules[].points. Error 6006 if wrong. |
| parameters[].parameters.prompt | ✅ | Prompt must be nested here, NOT at top level. |
| parameters[].parameters.cast | ✅ | {"points": N, "attribute_id": N}, a mirror of credit. |
| parameters[].parameters.n | ✅ | Number of outputs (usually 1). |
Response: data.id = task ID for polling.
API 3: Task Detail (Poll)
POST /open/v1/tasks/detail
{"task_id": "<id from create response>"}
Poll every 3–5s. Completed response:
{
  "id": "task_abc",
  "medias": [{
    "resource_status": 1,
    "url": "https://cdn.../output.mp3",
    "duration_str": "60s",
    "format": "mp3"
  }]
}
Output fields: url (mp3), duration_str, format.
Common Mistakes
| Mistake | Fix |
|---|---|
| Placing prompt at param top-level | prompt must be inside parameters[].parameters |
| Wrong credit value | Must exactly match credit_rules[].points (error 6006) |
| Missing app / platform in parameters | Required; use ima / web |
| Single-poll instead of loop | Poll until resource_status == 1 for ALL medias |
| Not checking status != "failed" | resource_status=1 + status="failed" = actual failure |
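Several of these mistakes can be caught locally before the request is sent. The checker below is a sketch: find_create_body_mistakes is a hypothetical helper, and rule stands for the credit_rules entry chosen from the product list.

```python
def find_create_body_mistakes(body: dict, rule: dict) -> list:
    """List the request-shape mistakes from the table above found in `body`."""
    problems = []
    if "prompt" in body:
        problems.append("prompt is at the top level")
    for entry in body.get("parameters", []):
        inner = entry.get("parameters") or {}
        if "prompt" in entry or "prompt" not in inner:
            problems.append("prompt must be inside parameters[].parameters")
        if entry.get("credit") != rule["points"]:
            problems.append("credit must equal credit_rules[].points")
        if not entry.get("app") or not entry.get("platform"):
            problems.append("app / platform missing")
        if not entry.get("attribute_id"):
            problems.append("attribute_id is 0 or missing")
        if "cast" not in inner:
            problems.append("cast missing from inner parameters")
    return problems
```

An empty list means none of these shape errors were found; it does not guarantee that the server accepts the request, since attribute_id and credit values can change in production.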
Python Example
import time
import requests

BASE_URL = "https://api.imastudio.com"
API_KEY = "ima_your_key_here"
HEADERS = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
    "x-app-source": "ima_skills",
    "x_app_language": "en",
}


def get_products(category: str) -> list:
    """Returns flat list of type=3 version nodes from V2 tree."""
    r = requests.get(
        f"{BASE_URL}/open/v1/product/list",
        headers=HEADERS,
        params={"app": "ima", "platform": "web", "category": category},
    )
    r.raise_for_status()
    nodes = r.json()["data"]
    versions = []
    for node in nodes:
        for child in node.get("children") or []:
            if child.get("type") == "3":
                versions.append(child)
            for gc in child.get("children") or []:
                if gc.get("type") == "3":
                    versions.append(gc)
    return versions


def create_music_task(prompt: str, product: dict) -> str:
    """Returns task_id."""
    rule = product["credit_rules"][0]
    # Use form_config values as parameter defaults
    form_defaults = {
        f["field"]: f["value"]
        for f in product.get("form_config", [])
        if f.get("value") is not None
    }
    nested_params = {
        "prompt": prompt,
        "n": 1,
        "input_images": [],
        "cast": {"points": rule["points"], "attribute_id": rule["attribute_id"]},
        **form_defaults,
    }
    body = {
        "task_type": "text_to_music",
        "enable_multi_model": False,
        "src_img_url": [],
        "parameters": [{
            "attribute_id": rule["attribute_id"],
            "model_id": product["model_id"],
            "model_name": product["name"],
            "model_version": product["id"],
            "app": "ima",
            "platform": "web",
            "category": "text_to_music",
            "credit": rule["points"],
            "parameters": nested_params,
        }],
    }
    r = requests.post(f"{BASE_URL}/open/v1/tasks/create", headers=HEADERS, json=body)
    r.raise_for_status()
    return r.json()["data"]["id"]


def _rs(media: dict) -> int:
    """resource_status with null treated as 0 (still processing)."""
    value = media.get("resource_status")
    return value if value is not None else 0


def poll(task_id: str, interval: int = 3, timeout: int = 300) -> dict:
    deadline = time.time() + timeout
    while time.time() < deadline:
        r = requests.post(f"{BASE_URL}/open/v1/tasks/detail", headers=HEADERS, json={"task_id": task_id})
        r.raise_for_status()
        task = r.json()["data"]
        medias = task.get("medias", [])
        if medias:
            if any(m.get("status") == "failed" for m in medias):
                raise RuntimeError(f"Task failed: {task_id}")
            if any(_rs(m) == 2 for m in medias):
                raise RuntimeError(f"Task failed: {task_id}")
            if all(_rs(m) == 1 for m in medias):
                return task
        time.sleep(interval)
    raise TimeoutError(f"Task timed out: {task_id}")


# text_to_music
products = get_products("text_to_music")
task_id = create_music_task("upbeat electronic, 120 BPM, no vocals", products[0])
result = poll(task_id)
print(result["medias"][0]["url"])           # mp3 URL
print(result["medias"][0]["duration_str"])  # e.g. "60s"
Supported Models & Search Terms
Models: Suno sonic v4, Suno sonic v5, DouBao BGM (GenBGM), DouBao Song (GenSong)
Capabilities: music generation, text-to-music, AI music, background music, BGM, soundtrack, jingle, song with lyrics, vocal, instrumental, ambient music, audio generation