Skip to main content
Generaldvcrn

IMA Studio Music Generation

Top-tier AI music generation with models: Suno sonic v4, Suno sonic v5, DouBao BGM (GenBGM), DouBao Song (GenSong). One-stop text-to-music with custom mode, lyrics, vocal control, and style tags. Requires IMA API key. Optional: ima-knowledge-ai for workflow and model guidance when installed. Use for: music generation, text-to-music, background music, songs with lyrics, soundtrack, jingles, ambient music, vocal and instrumental tracks. Output: MP3/WAV.

Stars
15
Source
dvcrn/openclaw-skills-marketplace
Updated
2026-05-29
Slug
dvcrn--openclaw-skills-marketplace--ima-studio-music-generation
View on GitHubRaw SKILL.md

// install — copy + paste into any project

mkdir -p .claude/skills && curl -fsSL https://raw.githubusercontent.com/dvcrn/openclaw-skills-marketplace/HEAD/plugins/allenfancy-gan--ima-voice-ai/skills/ima-studio-music-generation/SKILL.md -o .claude/skills/ima-studio-music-generation.md

Drops the SKILL.md into .claude/skills/ima-studio-music-generation.md. Works with Claude Code, Cursor, and any agent that loads SKILL.md files from .claude/skills/.

IMA Voice AI Creation

Scope & Dependencies (Declared for Transparency)

  • Credentials: This skill requires an IMA API key at runtime (IMA_API_KEY or --api-key). The key is sent only to api.imastudio.com. Obtain keys at https://imastudio.com. Declared in registry as required.
  • Optional dependency: When ima-knowledge-ai is installed, this skill may instruct the agent to read that skill's reference files (~/.openclaw/skills/ima-knowledge-ai/references/*) for workflow and model-selection guidance. This skill is self-contained — it works fully without ima-knowledge-ai. Reading another skill's files is optional and only for complex or multi-step tasks; users who do not have or trust ima-knowledge-ai can ignore those steps and use this skill's built-in defaults and 📥 User Input Parsing tables.
  • Local paths: This skill reads/writes ~/.openclaw/memory/ima_prefs.json (preferences) and ~/.openclaw/logs/ima_skills/ (logs; auto-deleted after 7 days). User can delete these anytime.

Optional: Read Knowledge Base (When ima-knowledge-ai Is Installed)

If ima-knowledge-ai is not installed: Skip this section. Use only this SKILL's default models and the 📥 User Input Parsing tables for model_id and parameters.

When ima-knowledge-ai is installed and the task is complex, you may optionally read its reference files for better workflow and model choice:

  1. Workflow complexity — Read ima-knowledge-ai/references/workflow-design.md if:

    • User mentions: "MV"、"配乐"、"完整作品"、"多步骤"、"soundtrack"
    • Task involves: video + music coordination, multi-track production, integrated workflows
    • Complex requirements that need task decomposition
  2. Model selection — Read ima-knowledge-ai/references/model-selection.md if:

    • Unsure which model to use (Suno vs DouBao BGM vs DouBao Song)
    • Need cost/quality trade-off guidance
    • User specifies budget or quality requirements

Why this is optional:

  • Music generation is often part of a larger workflow (video + music, story + soundtrack)
  • For simple single-track requests, proceed directly with this skill's defaults
  • For complex workflows, reading the knowledge base can improve task decomposition and model choice

Example workflow case (when using optional knowledge base):

User: "帮我做个产品宣传MV,有背景音乐"

❌ Wrong: 直接生成音乐 (music alone, no coordination with video)

✅ Right (if ima-knowledge-ai available): 
  1. Read workflow-design.md
  2. Decompose: Script → Video shots → Background music (matching video duration/mood)
  3. Generate video first (get duration)
  4. Generate BGM with matching duration and style

How to check (optional):

# Only if ima-knowledge-ai is installed and task is complex
if ima_knowledge_ai_installed and (complex_workflow or multi_step):
    read("~/.openclaw/skills/ima-knowledge-ai/references/workflow-design.md")

if ima_knowledge_ai_installed and unsure_model_choice:
    read("~/.openclaw/skills/ima-knowledge-ai/references/model-selection.md")

# Choose model (this skill's logic works with or without knowledge base)
if "background music" or "BGM" or "instrumental":
    use_doubao_bgm()  # 30pts, pure instrumental
elif "song" or "lyrics" or "vocals":
    use_suno_sonic()  # 25pts, full-featured with lyrics
else:
    use_suno_sonic()  # Default: most versatile

For simple requests: Proceed directly with this skill's defaults. No need to read other skills' files.


📥 User Input Parsing (Model & Parameter Recognition)

Purpose: So that any agent (Claude or other models) parses user intent consistently, follow these rules when deriving model_id and task type from natural language. Normalize first, then map.

1. User phrasing → model selection (model_id)

User intent / phrasing model_id Notes
BGM / 背景音乐 / 纯音乐 / 无人声 / instrumental / 配乐 GenBGM DouBao BGM, 30 pts, ~30s
歌 / 歌曲 / 带歌词 / 人声 / song / lyrics / 有唱 sonic or GenSong Suno (25 pts, ~2min) or DouBao Song (30 pts, ~30s)
Suno / 苏诺 / sonic sonic Full-featured, lyrics, vocal_gender, 25 pts
豆包 BGM / DouBao BGM / BGM GenBGM 30 pts
豆包 歌曲 / DouBao Song / 豆包歌 GenSong 30 pts
最便宜 / 最省钱 / cheapest / budget GenBGM or GenSong (6 pts 档 if available) Only if user explicitly asks for cheapest
最好 / 最全功能 / best / 带歌词可调 sonic Suno default

If the user does not specify, default to Suno (sonic) for versatility. For "背景音乐"/"BGM"/"配乐" only → use DouBao BGM (GenBGM).

2. Music-specific parameters (Suno)

User says (examples) Parameter Action
无人声 / 纯音乐 / no vocals / instrumental make_instrumental true
女声 / 女声演唱 / female vocals vocal_gender "female" (custom_mode true)
男声 / 男声演唱 / male vocals vocal_gender "male" (custom_mode true)
我写歌词 / 自定义歌词 / custom lyrics custom_mode + lyrics Provide lyrics in request

When using Suno with lyrics or vocal control, set custom_mode: true and pass lyrics / vocal_gender per API docs.


⚙️ How This Skill Works

For transparency: This skill uses a bundled Python script (scripts/ima_voice_create.py) to call the IMA Open API. The script:

  • Sends your prompt to https://api.imastudio.com (IMA's servers)
  • Uses --user-id only locally as a key for storing your model preferences
  • Returns a music URL when generation is complete
  • NEW (v1.1.0): Automatic reflection mechanism — if generation fails, the script automatically retries up to 3 times with smart parameter adjustments

🧠 Reflection Mechanism (Automatic Error Recovery)

This skill now includes an intelligent reflection system that automatically recovers from common errors:

3-Layer Retry Strategy:

  1. Attempt 1: Original Parameters

    • Uses your provided parameters with smart credit_rule selection
    • Most tasks succeed on first try
  2. Attempt 2: Strict Match (Error 6009 Fix)

    • Automatically removes unsupported parameters
    • Only keeps parameters in credit_rules.attributes
    • Example: Removes unsupported Suno parameters if not in model config
  3. Attempt 3: Fallback to Default (Error 6010 Fix)

    • Uses model's default configuration
    • Uses credit_rules[0] (first rule = safest default)
    • Guarantees maximum compatibility

Common Errors Fixed Automatically:

  • Error 6009: "No exact rule match found for parameters" → removes unsupported params
  • Error 6010: "Attribute ID does not match" → corrects attribute_id to match params
  • Invalid product attribute: Uses default rule configuration

User Experience:

  • Transparent: Shows reflection log when retries happen
  • Fast: Most tasks succeed on first attempt (no delay)
  • Smart: Learns from errors and adjusts automatically
  • User-friendly: Provides helpful suggestions if all 3 attempts fail

Example Output (with reflection):

🚀 Creating music generation task…

🧠 反省日志 (2 次尝试):
   ❌ [尝试 1] Invalid product attribute → 移除不支持的参数: ['unsupported_param']
   ✅ [尝试 2] ✅ 成功(尝试 2)

✅ Task created: task_abc123xyz

What gets sent to IMA servers:

  • ✅ Your music prompt/description
  • ✅ Model selection (Suno/DouBao)
  • ❌ NO API key in prompts (key is used for authentication only)
  • ❌ NO user_id (it's only used locally)

What's stored locally:

  • ~/.openclaw/memory/ima_prefs.json - Your model preferences (< 1 KB)
  • ~/.openclaw/logs/ima_skills/ - Generation logs (auto-deleted after 7 days)

Agent Execution (Internal Reference)

Note for users: You can review the script source at scripts/ima_voice_create.py anytime.
This skill uses only api.imastudio.com (no image upload needed for music generation, unlike image/video skills which also use imapi.liveme.com).

Use the bundled script internally:

# Generate music — Suno sonic-v5
python3 {baseDir}/scripts/ima_voice_create.py \
  --api-key  $IMA_API_KEY \
  --task-type text_to_music \
  --model-id  sonic \
  --prompt   "upbeat lo-fi hip hop, 90 BPM, no vocals" \
  --user-id  {user_id} \
  --output-json

# DouBao BGM
python3 {baseDir}/scripts/ima_voice_create.py \
  --api-key  $IMA_API_KEY \
  --model-id  GenBGM \
  --prompt   "calm ambient piano background" \
  --user-id  {user_id} \
  --output-json

The script outputs JSON — parse it to get the result URL and pass it to the user via the UX protocol messages below.


Overview

Call IMA Open API to create AI-generated music/audio. All endpoints require an ima_* API key. The core flow is: query products → create task → poll until done.


🔒 Security & Transparency Policy

This skill is community-maintained and open for inspection.

🌐 Network Architecture

This skill uses a simpler network architecture than image/video skills:

Skill Type Domains Used Why
ima-voice-ai (this skill) api.imastudio.com only Music generation doesn't require image uploads
ima-image-ai, ima-video-ai api.imastudio.com + imapi.liveme.com Image/video tasks need image upload service

Why the difference?

  • Music generation (text_to_music) only needs text prompts → single API endpoint
  • Image/video generation (i2i, i2v tasks) needs image file uploads → requires separate upload service

Security verification:

# Verify this skill only uses api.imastudio.com:
grep -n "https://" scripts/ima_voice_create.py

# Expected output:
# Only https://api.imastudio.com (no imapi.liveme.com)

✅ What Users CAN Do

Full transparency:

  • Review all source code: Check scripts/ima_voice_create.py and ima_logger.py anytime
  • Verify network calls: This skill uses only api.imastudio.com (music generation doesn't require image uploads). Verify by running: grep -n "https://" scripts/ima_voice_create.py
  • Inspect local data: View ~/.openclaw/memory/ima_prefs.json and log files
  • Control privacy: Delete preferences/logs anytime, or disable file writes (see below)

Configuration allowed:

  • Set API key in environment or agent config:
    • Environment variable: export IMA_API_KEY=ima_your_key_here
    • OpenClaw/MCP config: Add IMA_API_KEY to agent's environment configuration
    • Get your key at: https://imastudio.com
  • Use scoped/test keys: Test with limited API keys, rotate after testing
  • Disable file writes: Make prefs/logs read-only or symlink to /dev/null

Data control:

  • View stored data: cat ~/.openclaw/memory/ima_prefs.json
  • Delete preferences: rm ~/.openclaw/memory/ima_prefs.json (resets to defaults)
  • Delete logs: rm -rf ~/.openclaw/logs/ima_skills/ (auto-cleanup after 7 days anyway)

⚠️ Advanced Users: Fork & Modify

If you need to modify this skill for your use case:

  1. Fork the repository (don't modify the original)
  2. Update your fork with your changes
  3. Test thoroughly with limited API keys
  4. Document your changes for troubleshooting

Note: Modified skills may break API compatibility or introduce security issues. Official support only covers the unmodified version.

❌ What to AVOID (Security Risks)

Actions that could compromise security:

  • ❌ Sharing API keys publicly or in skill files
  • ❌ Modifying API endpoints to unknown servers
  • ❌ Disabling SSL/TLS certificate verification
  • ❌ Logging sensitive user data (prompts, IDs, etc.)
  • ❌ Bypassing authentication or billing mechanisms

Why this matters:

  1. API Compatibility: Skill logic aligns with IMA Open API schema
  2. Security: Malicious modifications could leak credentials or bypass billing
  3. Support: Modified skills may not be supported
  4. Community: Breaking changes affect all users

📁 File System Access (Declared)

This skill reads/writes the following files:

Path Purpose Size Auto-cleanup User Control
~/.openclaw/memory/ima_prefs.json User model preferences < 1 KB No Delete anytime
~/.openclaw/logs/ima_skills/ Generation logs ~10-50 KB/day 7 days Delete anytime

What's stored:

  • ✅ Model preferences (e.g., "last used: Suno sonic-v5")
  • ✅ Timestamps (e.g., "2026-02-27 12:34:56")
  • ✅ Task IDs and HTTP status codes
  • ❌ NO API keys
  • ❌ NO personal data
  • ❌ NO prompts or generated content

Full transparency: See the complete data flow and privacy policy in the skill documentation above.

📋 Privacy & Data Handling Summary

What this skill does with your data:

Data Type Sent to IMA? Stored Locally? User Control
Music prompts ✅ Yes (required for generation) ❌ No None (required)
API key ✅ Yes (authentication header) ❌ No Set via env var
user_id (optional CLI arg) Never (local preference key only) ✅ Yes (as prefs file key) Change --user-id value
Model preferences ❌ No ✅ Yes (~/.openclaw) Delete anytime
Generation logs ❌ No ✅ Yes (~/.openclaw) Auto-cleanup 7 days

Privacy recommendations:

  1. Use test/scoped API keys for initial testing
  2. Note: --user-id is never sent to IMA servers - it's only used locally as a key for storing preferences in ~/.openclaw/memory/ima_prefs.json
  3. Review source code at scripts/ima_voice_create.py to verify network calls (search for create_task function)
  4. Rotate API keys after testing or if compromised

Get your IMA API key: Visit https://imastudio.com to register and get started.

🔧 For Skill Maintainers Only

Version control:

  • All changes must go through Git with proper version bumps (semver)
  • CHANGELOG.md must document all changes
  • Production deployments require code review

File checksums (optional):

# Verify skill integrity
sha256sum SKILL.md scripts/ima_voice_create.py

If users report issues, verify file integrity first.


🧠 User Preference Memory

User preferences override recommended defaults. If a user has generated before, use their preferred model — not the system default.

Storage: ~/.openclaw/memory/ima_prefs.json

{
  "user_{user_id}": {
    "text_to_music": { "model_id": "sonic", "model_name": "Suno", "credit": 25, "last_used": "..." }
  }
}

If the file or key doesn't exist, fall back to the ⭐ Recommended Defaults below.

When to Read (Before Every Generation)

  1. Load ~/.openclaw/memory/ima_prefs.json (silently, no error if missing)
  2. Look up user_{user_id}.text_to_music
  3. If found → use that model; mention it:
    🎵 根据你的使用习惯,将用 [Model Name] 帮你生成音乐…
    • 模型:[Model Name](你的常用模型)
    • 预计耗时:[X ~ Y 秒]
    • 消耗积分:[N pts]
    
  4. If not found → use the ⭐ Recommended Default (Suno sonic-v5)

When to Write (After Every Successful Generation)

Save the used model to ~/.openclaw/memory/ima_prefs.json under user_{user_id}.text_to_music.
See ima-image-ai/SKILL.md → "User Preference Memory" for the full Python write snippet.

When to Update (User Explicitly Changes Model)

Trigger Action
用XXX / 换成XXX Switch + save as new preference
以后都用XXX / always use XXX Save + confirm: ✅ 已记住!以后音乐生成默认用 [XXX]
用便宜的 / cheapest Use DouBao BGM/Song; do NOT save unless user says "以后都用"

⭐ Recommended Defaults

These are fallback defaults — only used when no user preference exists.
Always default to the newest and most popular model. Do NOT default to the cheapest.

Task Default Model model_id model_version Cost Why
text_to_music Suno (sonic-v5) sonic sonic 25 pts Latest Suno engine, best quality
text_to_music (BGM only) DouBao BGM GenBGM GenBGM 30 pts Background music
text_to_music (song) DouBao Song GenSong GenSong 30 pts Song generation

Selection guide by use case:

  • Custom song with lyrics, vocals, style → Suno sonic-v5 (default)
  • Background music / ambient loop → DouBao BGM
  • Simple song generation → DouBao Song
  • User explicitly asks for cheapest → DouBao BGM/Song (6pts each) — only if explicitly requested

⚠️ For Suno: model_version inside parameters (e.g. sonic-v5) is different from the outer model_version field (which is sonic). Always set both.


💬 User Experience Protocol (IM / Feishu / Discord) v1.1 🆕

v1.1 Update: Added Step 0 to ensure correct message ordering in group chats (learned from ima-image-ai v1.2).

Music generation completes in 10~45 seconds. Never let users wait in silence.
Always follow all 5 steps below, every single time.

🚫 Never Say to Users

❌ Never say ✅ What users care about
ima_voice_create.py / 脚本 / script
自动化脚本 / automation
自动处理产品列表 / 查询接口
自动解析参数 / 智能轮询
attribute_id / model_version / form_config
API 调用 / HTTP 请求 / 任何技术参数名

Only tell users: model name · estimated time · credits · result (audio file/player) · plain-language status.


Estimated Generation Time per Model

Model Estimated Time Poll Every Send Progress Every
DouBao BGM 10~25s 5s 10s
DouBao Song 10~25s 5s 10s
Suno (sonic-v5) 20~45s 5s 15s

estimated_max_seconds = upper bound (e.g. 45 for Suno, 25 for DouBao BGM/Song).


Step 0 — Initial Acknowledgment Reply (Normal Reply) 🆕

⚠️ CRITICAL: This step ensures correct message ordering in group chats.

Before doing anything else, reply to the user with a friendly acknowledgment message using your normal reply (not message tool). This reply will automatically appear FIRST in the conversation.

Example acknowledgment messages:

好的!马上帮你生成轻松的背景音乐 🎵
收到!用 Suno 为你创作一首歌 🎶
OK! Starting music generation with DouBao BGM 🎵

Rules:

  • Keep it short and warm (< 15 words)
  • Match the user's language (Chinese/English)
  • Include relevant emoji (🎵/🎶/🎸/🎹)
  • This is your ONLY normal reply — all subsequent updates use message tool

Why this matters:

  • Normal replies automatically appear FIRST in the conversation thread
  • message tool pushes appear in chronological order AFTER your initial reply
  • This ensures users see: "好的!" → "🎵 开始生成..." → "✅ 成功!" (correct order)

Step 1 — Pre-Generation Notification (Push via message tool)

After Step 0 reply, use the message tool to push a notification immediately to the group/channel:

# Agent implementation
message(
    action="send",
    target=group_id,  # or channel_id for Discord
    message="""🎵 开始生成音乐,请稍候…
• 模型:[Model Name]
• 预计耗时:[X ~ Y 秒]
• 消耗积分:[N pts]"""
)

User-facing message template:

🎵 开始生成音乐,请稍候…
• 模型:[Model Name]
• 预计耗时:[X ~ Y 秒]
• 消耗积分:[N pts]

Cost transparency:

  • Balanced (Suno 25 pts): "使用 Suno(25 积分,功能最全)"
  • DouBao alternatives (30 pts each): "使用 DouBao BGM(30 积分)" — only if user explicitly requests DouBao or background music type

Adapt language to match the user. English → 🎵 Starting music generation, please wait [X~Y] seconds…


Step 2 — Progress Updates

Poll the task detail API every 5s.
Send a progress update every [Send Progress Every] seconds per the table above.

⏳ 音乐生成中… [P]%
已等待 [elapsed]s,预计最长 [max]s

Progress formula:

P = min(95, floor(elapsed_seconds / estimated_max_seconds * 100))
  • Cap at 95% — never show 100% until the API returns success
  • If elapsed > estimated_max: keep P at 95% and append 「快好了,稍等…」

Step 3 — Success Notification (Push audio via message tool)

When task status = success, use the message tool to send the generated audio directly (not as a text URL):

Agent implementation:

# Get result URL from script output or task detail API
result = get_task_result(task_id)
audio_url = result["medias"][0]["url"]

# Push audio + caption to group/channel
message(
    action="send",
    target=group_id,
    media=audio_url,  # Feishu/Discord will render the audio
    caption=f"""✅ 音乐生成成功!
• 模型:[Model Name]
• 耗时:预计 [X~Y]s,实际 [actual]s
• 消耗积分:[N pts]

🔗 原始链接:{audio_url}"""
)

User-facing message:

✅ 音乐生成成功!
• 模型:[Model Name]
• 耗时:预计 [X~Y]s,实际 [actual]s
• 消耗积分:[N pts]

🔗 原始链接:https://ws.esxscloud.com/.../audio.wav

[音频直接显示为文件卡片,可点击播放]

Platform-specific notes:

  • Feishu: message(action=send, media=url, caption="...") — caption appears with audio file card
  • Discord: Audio embeds automatically from URL; caption can be in message text
  • Telegram: Use message(action=send, media=url, caption="...")

⚠️ Important:

  • Always send audio via media parameter (file card/player) + include URL in caption text
  • Do NOT use local file paths like /tmp/audio.wav — use HTTP URL from API
  • Users expect: (1) clickable audio file card + (2) raw URL link for sharing/downloading
  • Format: media=audio_url + caption="...🔗 原始链接:{audio_url}"

Step 4 — Failure Notification (Push via message tool)

When task status = failed or any API/network error, push a failure message with alternative suggestions:

Agent implementation:

message(
    action="send",
    target=group_id,
    message="""❌ 音乐生成失败
• 原因:[natural_language_error_message]
• 建议改用:
  - [Alt Model 1]([特点],[N pts])
  - [Alt Model 2]([特点],[N pts])

需要我帮你用其他模型重试吗?"""
)

⚠️ CRITICAL: Error Message Translation

NEVER show technical error messages to users. Always translate API errors into natural language.
API key & credits: 密钥与积分管理入口为 imaclaw.ai(与 imastudio.com 同属 IMA 平台)。Key and subscription management: imaclaw.ai (same IMA platform as imastudio.com).

Technical Error ❌ Never Say ✅ Say Instead (Chinese) ✅ Say Instead (English)
401 Unauthorized 🆕 Invalid API key / 401 Unauthorized ❌ API密钥无效或未授权
💡 生成新密钥: https://www.imaclaw.ai/imaclaw/apikey
❌ API key is invalid or unauthorized
💡 Generate API Key: https://www.imaclaw.ai/imaclaw/apikey
4008 Insufficient points 🆕 Insufficient points / Error 4008 ❌ 积分不足,无法创建任务
💡 购买积分: https://www.imaclaw.ai/imaclaw/subscription
❌ Insufficient points to create this task
💡 Buy Credits: https://www.imaclaw.ai/imaclaw/subscription
"Invalid product attribute" / "Insufficient points" Invalid product attribute 生成参数配置异常,请稍后重试 Configuration error, please try again later
Error 6006 (credit mismatch) Error 6006 积分计算异常,系统正在修复 Points calculation error, system is fixing
Error 6010 (attribute_id mismatch) Attribute ID does not match 模型参数不匹配,请尝试其他模型 Model parameters incompatible, try another model
error 400 (bad request) error 400 / Bad request 音乐参数设置有误,请调整描述后重试 Music parameter error, adjust description and retry
resource_status == 2 Resource status 2 / Failed 音乐生成遇到问题,建议换个模型试试 Music generation failed, try another model
status == "failed" (no details) Task failed 这次生成没成功,要不换个模型试试? Generation unsuccessful, try a different model?
timeout Task timed out / Timeout error 音乐生成时间过长已超时,建议用更快的模型 Music generation took too long, try a faster model
Network error / Connection refused Connection refused / Network error 网络连接不稳定,请检查网络后重试 Network connection unstable, check network and retry
Rate limit exceeded 429 Too Many Requests / Rate limit 请求过于频繁,请稍等片刻再试 Too many requests, please wait a moment
Model unavailable Model not available / 503 Service Unavailable 当前模型暂时不可用,建议换个模型 Model temporarily unavailable, try another model
Lyrics format error (Suno only) Invalid lyrics format 歌词格式有误,请调整后重试 Lyrics format error, adjust and retry
Prompt too short/long Prompt length invalid 音乐描述过短或过长,请调整到合适长度 Music description too short or long, adjust length

Generic fallback (when error is unknown):

  • Chinese: 音乐生成遇到问题,请稍后重试或换个模型试试
  • English: Music generation encountered an issue, please try again or use another model

Best Practices:

  1. Focus on user action: Tell users what to do next, not what went wrong technically
  2. Be reassuring: Use phrases like "建议换个模型试试" instead of "生成失败了"
  3. Avoid blame: Never say "你的描述有问题" → say "描述需要调整一下"
  4. Provide alternatives: Always suggest 1-2 alternative models in the failure message
  5. Music-specific:
    • For Suno lyrics errors, suggest simplifying lyrics or using auto-generated lyrics
    • For prompt length errors, give example length (e.g., "建议20-100字")
    • For BGM requests, recommend DouBao BGM over Suno
  6. 🆕 Include actionable links (v1.0.8+): For 401/4008 errors, provide clickable links to API key generation or credit purchase pages

🆕 Enhanced Error Handling (v1.0.8):

Music generation uses direct error handling (no Reflection mechanism due to simpler parameters):

  • 401 Unauthorized: System provides clickable link to API key generation page
  • 4008 Insufficient Points: System provides clickable link to credit purchase page
  • Other errors: Clear natural language explanations with alternative model suggestions

Error messages are user-friendly and actionable — users receive clear next steps for resolution.

Failure fallback table:

Failed Model First Alt Second Alt
Suno DouBao BGM(30pts,背景音乐) DouBao Song(30pts,歌曲生成)
DouBao BGM DouBao Song(30pts) Suno(25pts,功能最强)
DouBao Song DouBao BGM(30pts) Suno(25pts,功能最强)

Step 5 — Done (No Further Action Needed) 🆕

v1.1 Note: After completing Steps 0-4:

  • Step 0 already sent your normal reply (appears FIRST in chat)
  • Steps 1-4 pushed all updates via message tool (appear in order)
  • No further action needed — conversation is complete

Do NOT:

  • ❌ Reply again with NO_REPLY (you already replied in Step 0)
  • ❌ Send duplicate confirmation messages
  • ❌ Use message tool to send the same content twice

Why this works:

User: "帮我生成一段轻松的背景音乐"
  ↓
[Step 0] Your normal reply:  "好的!马上帮你生成轻松的背景音乐 🎵"  ← Appears FIRST
  ↓
[Step 1] message tool push:  "🎵 开始生成音乐..."  ← Appears SECOND
  ↓
[Step 2] message tool push:  "⏳ 正在生成中… 45%"  ← (if task takes >15s)
  ↓
[Step 3] message tool push:  "✅ 音乐生成成功! [Audio File]"  ← Appears LAST
  ↓
[Step 5] Done. No further replies.

Supported Models

text_to_music (3 models)

Name model_id version_id Cost Key form_config
Suno sonic sonic 25 pts model_version=sonic-v5 (latest), custom_mode=true, make_instrumental, auto_lyrics, tags, negative_tags, vocal_gender, title
DouBao BGM GenBGM GenBGM 30 pts
DouBao Song GenSong GenSong 30 pts

Model guidance:

  • Suno: Most powerful option. Supports full custom mode with genre tags, explicit instrumental toggle, vocal gender selection, and negative tags to exclude unwanted styles.
  • DouBao BGM: Lightweight background music generation. Ideal for ambient / background tracks.
  • DouBao Song: Song generation. Good for structured vocal compositions.

What you can generate:

  • Background music (lo-fi, ambient, cinematic, electronic, jazz, classical…)
  • Custom jingles or theme songs with specific BPM and key
  • Vocal or instrumental tracks with mood direction
  • Short loops or full-length compositions

Prompt writing tips (for Suno gpt_description_prompt):

  • Genre: "lo-fi hip hop", "orchestral cinematic", "upbeat pop", "dark ambient"
  • Tempo: "80 BPM", "fast tempo", "slow ballad"
  • Vocals: "no vocals" → set make_instrumental=true; "female vocals"vocal_gender="female"
  • Mood: "happy and energetic", "melancholic", "tense and dramatic"
  • Negative: negative_tags="heavy metal, distortion" to exclude styles
  • Duration hint: "60 seconds", "30 second loop"

Environment

Base URL: https://api.imastudio.com

Required/recommended headers for all /open/v1/ endpoints:

Header Required Value Notes
Authorization Bearer ima_your_api_key_here API key authentication
x-app-source ima_skills Fixed value — identifies skill-originated requests
x_app_language recommended en / zh Product label language; defaults to en if omitted
Authorization: Bearer ima_your_api_key_here
x-app-source: ima_skills
x_app_language: en

⚠️ MANDATORY: Always Query Product List First

CRITICAL: You MUST call /open/v1/product/list BEFORE creating any task.
The attribute_id field is REQUIRED in the create request. If it is 0 or missing, you get:
"Invalid product attribute""Insufficient points" → task fails completely.
NEVER construct a create request from the model table alone. Always fetch the product first.

How to get attribute_id

# Step 1: Query product list
GET /open/v1/product/list?app=ima&platform=web&category=text_to_music

# Step 2: Walk the tree to find your model
for group in response["data"]:
    for version in group.get("children", []):
        if version["type"] == "3" and version["model_id"] == target_model_id:
            attribute_id  = version["credit_rules"][0]["attribute_id"]
            credit        = version["credit_rules"][0]["points"]
            model_version = version["id"]
            model_name    = version["name"]

Quick Reference: Known attribute_ids

⚠️ Production warning: attribute_id and credit values change frequently. Always call /open/v1/product/list at runtime; table below is pre-queried reference (2026-02-27).

Model model_id attribute_id credit Notes
Suno (sonic-v4) sonic 2370 25 pts Default
DouBao BGM GenBGM 4399 30 pts BGM专用
DouBao Song GenSong 4398 30 pts 歌曲专用
All others → query /open/v1/product/list Always runtime query

Common Mistakes (and resulting errors)

Mistake Error
attribute_id is 0 or missing "Invalid product attribute" → Insufficient points
attribute_id outdated (production changed) Same errors; always query product list first
prompt at outer level Prompt ignored
cast missing from inner parameters Billing failure
Suno: model_version in parameters not set to sonic-v5 Wrong engine used

Core Flow

1. GET /open/v1/product/list?app=ima&platform=web&category=text_to_music
   → REQUIRED: Get attribute_id, credit, model_version, form_config defaults

2. POST /open/v1/tasks/create
   → Must include: attribute_id, model_name, model_version, credit, cast, prompt (nested!)

3. POST /open/v1/tasks/detail  {task_id: "..."}
   → Poll every 3–5s until medias[].resource_status == 1
   → Extract url from completed media (mp3)

Supported Task Types

category Capability Input
text_to_music Text → Music prompt

Detail API status values

Field Type Values
resource_status int or null 0=处理中, 1=可用, 2=失败, 3=已删除;null 当作 0
status string "pending", "processing", "success", "failed"
resource_status status Action
0 or null pending / processing Keep polling
1 success (or completed) Stop when all medias are 1; read url
1 failed Stop, handle error
2 / 3 any Stop, handle error

Important: Treat resource_status: null as 0. Stop only when all medias have resource_status == 1. Check status != "failed" when rs=1.


API 1: Product List

GET /open/v1/product/list?app=ima&platform=web&category=text_to_music

Returns a V2 tree structure: type=2 nodes are model groups, type=3 nodes are versions (leaves). Only type=3 nodes contain credit_rules and form_config.

How to pick a version:

  1. Traverse nodes to find type=3 leaves
  2. Use model_id and id (= model_version) from the leaf
  3. Pick credit_rules[].attribute_id
  4. Use form_config[].value as default parameters values

API 2: Create Task

POST /open/v1/tasks/create

text_to_music

No image input. src_img_url: [], input_images: [].

{
  "task_type": "text_to_music",
  "enable_multi_model": false,
  "src_img_url": [],
  "parameters": [{
    "attribute_id":  "<from credit_rules>",
    "model_id":      "<model_id>",
    "model_name":    "<model_name>",
    "model_version": "<version_id>",
    "app":           "ima",
    "platform":      "web",
    "category":      "text_to_music",
    "credit":        "<points>",
    "parameters": {
      "prompt":       "upbeat electronic, 120 BPM, no vocals",
      "n":            1,
      "input_images": [],
      "cast":         {"points": "<points>", "attribute_id": "<attribute_id>"}
    }
  }]
}

Prompt tips for music generation:

  • Genre: "upbeat electronic", "classical piano", "ambient chill"
  • Tempo: "120 BPM", "slow tempo"
  • Vocals: "no vocals", "male vocals", "female vocals"
  • Mood: "happy", "melancholic", "energetic"
  • Duration hint: "60 seconds", "short loop"

Key fields:

Field Required Description
parameters[].credit Must equal credit_rules[].points. Error 6006 if wrong.
parameters[].parameters.prompt Prompt must be nested here, NOT at top level.
parameters[].parameters.cast {"points": N, "attribute_id": N} — mirror of credit.
parameters[].parameters.n Number of outputs (usually 1).

Response: data.id = task ID for polling.


API 3: Task Detail (Poll)

POST /open/v1/tasks/detail
{"task_id": "<id from create response>"}

Poll every 3–5s. Completed response:

{
  "id": "task_abc",
  "medias": [{
    "resource_status": 1,
    "url":          "https://cdn.../output.mp3",
    "duration_str": "60s",
    "format":       "mp3"
  }]
}

Output fields: url (mp3), duration_str, format.


Common Mistakes

Mistake Fix
Placing prompt at param top-level prompt must be inside parameters[].parameters
Wrong credit value Must exactly match credit_rules[].points (error 6006)
Missing app / platform in parameters Required — use ima / web
Single-poll instead of loop Poll until resource_status == 1 for ALL medias
Not checking status != "failed" resource_status=1 + status="failed" = actual failure

Python Example

import time
import requests

BASE_URL = "https://api.imastudio.com"
API_KEY  = "ima_your_key_here"
HEADERS  = {
    "Authorization":  f"Bearer {API_KEY}",
    "Content-Type":   "application/json",
    "x-app-source":   "ima_skills",
    "x_app_language": "en",
}


def get_products(category: str) -> list:
    """Returns flat list of type=3 version nodes from V2 tree."""
    r = requests.get(
        f"{BASE_URL}/open/v1/product/list",
        headers=HEADERS,
        params={"app": "ima", "platform": "web", "category": category},
    )
    r.raise_for_status()
    nodes = r.json()["data"]
    versions = []
    for node in nodes:
        for child in node.get("children") or []:
            if child.get("type") == "3":
                versions.append(child)
            for gc in child.get("children") or []:
                if gc.get("type") == "3":
                    versions.append(gc)
    return versions


def create_music_task(prompt: str, product: dict) -> str:
    """Returns task_id."""
    rule = product["credit_rules"][0]
    form_defaults = {f["field"]: f["value"] for f in product.get("form_config", []) if f.get("value") is not None}

    nested_params = {
        "prompt": prompt,
        "n":      1,
        "input_images": [],
        "cast":   {"points": rule["points"], "attribute_id": rule["attribute_id"]},
        **form_defaults,
    }

    body = {
        "task_type":          "text_to_music",
        "enable_multi_model": False,
        "src_img_url":        [],
        "parameters": [{
            "attribute_id":  rule["attribute_id"],
            "model_id":      product["model_id"],
            "model_name":    product["name"],
            "model_version": product["id"],
            "app":           "ima",
            "platform":      "web",
            "category":      "text_to_music",
            "credit":        rule["points"],
            "parameters":    nested_params,
        }],
    }
    r = requests.post(f"{BASE_URL}/open/v1/tasks/create", headers=HEADERS, json=body)
    r.raise_for_status()
    return r.json()["data"]["id"]


def poll(task_id: str, interval: int = 3, timeout: int = 300) -> dict:
    deadline = time.time() + timeout
    while time.time() < deadline:
        r = requests.post(f"{BASE_URL}/open/v1/tasks/detail", headers=HEADERS, json={"task_id": task_id})
        r.raise_for_status()
        task   = r.json()["data"]
        medias = task.get("medias", [])
        if medias:
            if any(m.get("status") == "failed" for m in medias):
                raise RuntimeError(f"Task failed: {task_id}")
            rs = lambda m: m.get("resource_status") if m.get("resource_status") is not None else 0
            if any(rs(m) == 2 for m in medias):
                raise RuntimeError(f"Task failed: {task_id}")
            if all(rs(m) == 1 for m in medias):
                return task
        time.sleep(interval)
    raise TimeoutError(f"Task timed out: {task_id}")


# text_to_music
products = get_products("text_to_music")
task_id  = create_music_task("upbeat electronic, 120 BPM, no vocals", products[0])
result   = poll(task_id)
print(result["medias"][0]["url"])          # mp3 URL
print(result["medias"][0]["duration_str"]) # e.g. "60s"

Supported Models & Search Terms

Models: Suno sonic v4, Suno sonic v5, DouBao BGM (GenBGM), DouBao Song (GenSong)

Capabilities: music generation, text-to-music, AI music, background music, BGM, soundtrack, jingle, song with lyrics, vocal, instrumental, ambient music, audio generation