byted-mediakit-videoedit
AI Video Intelligent Editing Skill. Takes one or more video file paths, plus optional danmaku and subtitle file paths, and combines the danmaku and subtitle content to understand the video context. It then automatically extracts the time segments that match the user's editing request (for example, "extract all highlight moments" or "cut out the part explaining xxx"), splices them with transition effects, and synthesizes the output video using FFmpeg.
Install
claude skill add byted-mediakit-videoedit --from https://github.com/volcengine/agentkit-samples/tree/main/skills/byted-mediakit-videoedit
Skill Documentation (Original)
AI Video Intelligent Editing
This Skill helps users understand video context by analyzing danmaku and subtitle content, automatically extracts and splices video clips based on editing requests, and uses FFmpeg to complete transition effects and final synthesis.
Input Specifications
- Video files (required, supports multiple): Local video file paths, supports formats like .mp4, .flv, .mkv, etc.
- Danmaku files (optional, one per video): XML format danmaku files (supports Bilibili format), corresponding to video files in order one-to-one
- Subtitle files (optional, one per video): .srt/.ass/.json format subtitle files, corresponding to video files in order one-to-one; leave empty for videos without subtitles
Note: Subtitles and danmaku are the only basis for understanding video content. If neither is provided for a video, its content cannot be understood, and only explicit time segment instructions from the user can be executed.
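The one-to-one, in-order pairing described above can be sketched as follows. This is a minimal illustration, not the Skill's own code: the function name `pair_inputs` and the returned dictionary keys are hypothetical, and an empty string or a missing trailing entry is assumed to mean "not provided".

```python
def pair_inputs(videos, danmaku=(), subtitles=()):
    """Pair each video with its danmaku/subtitle file by position.

    Hypothetical helper: empty strings and missing trailing entries
    are treated as "not provided" (an assumption, not documented behavior).
    """
    if len(danmaku) > len(videos) or len(subtitles) > len(videos):
        raise ValueError("more danmaku/subtitle files than videos")
    paired = []
    for i, video in enumerate(videos):
        d = danmaku[i] if i < len(danmaku) else None
        s = subtitles[i] if i < len(subtitles) else None
        paired.append({"video": video, "danmaku": d or None, "subtitle": s or None})
    return paired
```

For example, `pair_inputs(["ep1.mp4", "ep2.mp4"], ["ep1.xml", "ep2.xml"], ["ep1.srt", ""])` leaves the second video with danmaku only.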
Workflow
Step 0: Dependency Verification
Before performing any operations, verify that the runtime environment meets the requirements.
Verification Commands:
python --version
ffmpeg -version 2>&1 | head -1
ffprobe -version 2>&1 | head -1
node --version
Acceptance Criteria and Fixing Guidelines:
| Dependency | Minimum Requirement | Verification Method | Installation Command When Not Met |
| ---------- | ------------------- | ------------------- | --------------------------------- |
| Python | 3.9+ | python --version | macOS: brew install python@3.11 · Linux: sudo apt install python3.11 |
| ffmpeg | Any version | ffmpeg -version | macOS: brew install ffmpeg · Linux: sudo apt install ffmpeg |
| ffprobe | Included with ffmpeg | ffprobe -version | Installed with ffmpeg, no separate operation needed |
| Node.js | 18+ | node --version | macOS: brew install node · or nodejs.org |
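The version checks can also be scripted. The sketch below is one possible way to do it, assuming the minimum versions listed above (Python 3.9+, Node.js 18+, any ffmpeg/ffprobe); the helper names `parse_version`, `meets_minimum`, and `check_tool` are hypothetical and are not part of the Skill's scripts.

```python
import re
import shutil
import subprocess

# Minimum versions taken from the requirements above; tools not listed accept any version.
MINIMUMS = {"python": (3, 9), "node": (18, 0)}

def parse_version(text):
    """Extract the first 'major.minor' pair from a version string."""
    match = re.search(r"(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None

def meets_minimum(version_output, minimum):
    """Return True if the version found in the output is at least `minimum`."""
    version = parse_version(version_output)
    return version is not None and version >= minimum

def check_tool(name, flag="--version"):
    """Return True if `name` is on PATH and satisfies its minimum version, if any."""
    if shutil.which(name) is None:
        return False
    result = subprocess.run(
        [name, flag], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    if result.returncode != 0:
        return False
    minimum = MINIMUMS.get(name)
    return minimum is None or meets_minimum(result.stdout, minimum)
```

Note that ffmpeg and ffprobe take a single-dash flag, so they would be checked with `check_tool("ffmpeg", "-version")`.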
Text Effect Rendering Dependency (Remotion) Installation:
Check if byted-mediakit-videoedit/template/node_modules exists:
ls byted-mediakit-videoedit/template/node_modules/@remotion/renderer 2>/dev/null && echo "Already installed" || echo "Need to install"
If not installed, execute:
cd byted-mediakit-videoedit/template && npm install
Step 1: Check Understandability of Each Video
Before any analysis, confirm individually whether each video has subtitles or danmaku:
| Video | Has Subtitles | Has Danmaku | Understandability |
| ----- | ------------- | ----------- | ----------------- |
| video_A.mp4 | Yes | -- | Intelligent Analysis: Can understand video based on content semantics, supports content requests like "find highlights" |
| video_B.mp4 | -- | Yes | Intelligent Analysis (downgraded): Infer content through danmaku, accuracy lower than subtitles |
| video_C.mp4 | -- | -- | Only Explicit Commands: Cannot understand content, can only cut according to precise time segments provided by user |
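The three understandability levels amount to a small decision rule. The sketch below illustrates it; the function name and the returned labels are hypothetical, and it assumes that a video with subtitles takes the main path whether or not danmaku is also present.

```python
def understandability(has_subtitles, has_danmaku):
    """Classify a video by the context files available for it.

    Hypothetical labels; subtitles are assumed to take priority over danmaku.
    """
    if has_subtitles:
        return "intelligent"             # semantic analysis from subtitle text
    if has_danmaku:
        return "intelligent-downgraded"  # content inferred from danmaku only
    return "explicit-only"               # only user-specified time segments
```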
Step 3: Parse Danmaku and Subtitles
Run the parsing script to convert danmaku and subtitles into timeline-based text summaries for video content analysis.
Single Video:
python byted-mediakit-videoedit/scripts/parse_media_info.py \
--video ep1.mp4 \
--danmaku ep1.xml \
--subtitle ep1.srt \
--output /tmp/media_timeline.json
Multiple Videos:
python byted-mediakit-videoedit/scripts/parse_media_info.py \
--video ep1.mp4 --danmaku ep1.xml --subtitle ep1.srt \
--video ep2.mp4 --danmaku ep2.xml \
--output /tmp/media_timeline.json
Step 4: Analyze Video Content, Understand Editing Requests
Read /tmp/media_timeline.json, combine with user's editing needs, and infer the list of time segments to be extracted.
With Subtitles (Main Path):
- Understand the video content through the subtitle text, and locate the topics, key sentences, and paragraphs the user is interested in
- Align clip boundaries strictly with subtitle sentence boundaries
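Aligning a rough time range to subtitle sentence boundaries can be sketched as follows. This is an illustration under assumptions: subtitle cues are represented as `(start, end)` pairs in seconds, sorted by start time, and the helper `snap_to_cues` is hypothetical. The range is expanded outward so that no sentence is cut in half.

```python
def snap_to_cues(start, end, cues):
    """Expand [start, end] to cover every subtitle cue it overlaps.

    cues: list of (cue_start, cue_end) in seconds, sorted by cue_start.
    Returns None when the range overlaps no cue.
    """
    overlapping = [c for c in cues if c[1] > start and c[0] < end]
    if not overlapping:
        return None
    return overlapping[0][0], overlapping[-1][1]
```

With cues `[(0, 2.5), (2.5, 6), (6.2, 9), (9.5, 12)]`, a rough range of 3.0 to 8.0 seconds snaps to `(2.5, 9)`.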
Without Subtitles (Downgraded Path):
- Read type=danmaku_summary entries, judge popularity by density (counts per minute)
- density_peaks has already pre-identified intervals with highest density
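To make "density (counts per minute)" concrete, the sketch below counts danmaku per one-minute bucket and ranks the busiest buckets. It illustrates the idea only: how parse_media_info.py actually computes density and density_peaks is not documented here, and the function names and the list-of-timestamps input are assumptions.

```python
from collections import Counter

def density_per_minute(timestamps):
    """Count danmaku per one-minute bucket; timestamps are seconds from video start."""
    counts = Counter(int(t // 60) for t in timestamps)
    return dict(sorted(counts.items()))

def top_peaks(timestamps, k=3):
    """Return the k densest minutes as (start_sec, end_sec, count), densest first."""
    counts = Counter(int(t // 60) for t in timestamps)
    # Sort by count descending, then by minute ascending for stable ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(minute * 60, (minute + 1) * 60, n) for minute, n in ranked[:k]]
```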
Step 5: Select Transition Effects
If the user has explicitly specified transition effects, use them directly. If not, infer the most appropriate effect from the danmaku and subtitle content.
| Content Feature | Recommended Transition | Reason |
| --------------- | ---------------------- | ------ |
| Excited danmaku (哈哈/666/牛), fast pace | none | Hard cut matches highlight moments |
| Game/sports, danmaku with 冲/gkd | wipeleft | Horizontal wipe has dynamic feel |
| Knowledge explanation/tutorial, rational content | dissolve | Smooth transition for info-dense content |
| Emotional/vlog/life content | fade | Soft, emotional continuity |
| Mixed content or difficult to judge | fade | Most universal fallback |
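The recommendations above could be approximated with a simple keyword heuristic. The sketch below is only an illustration: the Skill describes this step as an inference from content understanding, not a fixed rule, and the function name, the `content_type` parameter, and the priority order among rules are all assumptions.

```python
EXCITED = ("哈哈", "666", "牛")  # excited danmaku keywords from the table
ACTION = ("冲", "gkd")           # game/sports danmaku keywords from the table

def pick_transition(danmaku, content_type=None, user_choice=None):
    """Pick a transition name; an explicit user choice always wins."""
    if user_choice:
        return user_choice
    text = " ".join(danmaku)
    excited = sum(text.count(k) for k in EXCITED)
    action = sum(text.count(k) for k in ACTION)
    if action > 0 and action >= excited:
        return "wipeleft"   # dynamic horizontal wipe
    if excited > 0:
        return "none"       # hard cut for highlight moments
    if content_type == "tutorial":
        return "dissolve"   # smooth transition for info-dense content
    return "fade"           # universal fallback
```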
Step 6: Present Editing Plan to User and Wait for Confirmation
After completing the analysis and transition selection, present the complete plan to the user in table form, then stop and wait for the user's explicit confirmation before continuing.
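A plan table of this kind could be rendered as follows. This is a sketch with hypothetical field names (`video`, `start`, `end`, `transition`, `reason`); the actual plan layout and the /tmp/clips.json schema are not specified here.

```python
def fmt_time(seconds):
    """Format seconds as HH:MM:SS, rounded to the nearest second."""
    s = int(round(seconds))
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}"

def format_plan(clips):
    """Render clips (a list of dicts with hypothetical keys) as a pipe table."""
    rows = [
        "| # | Video | Start | End | Transition | Reason |",
        "| - | ----- | ----- | --- | ---------- | ------ |",
    ]
    for i, clip in enumerate(clips, 1):
        rows.append(
            f"| {i} | {clip['video']} | {fmt_time(clip['start'])} | "
            f"{fmt_time(clip['end'])} | {clip['transition']} | {clip['reason']} |"
        )
    return "\n".join(rows)
```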
Step 7: Execute Editing
After user confirms the plan, write the final plan to /tmp/clips.json, run the editing script:
python byted-mediakit-videoedit/scripts/cut_and_merge.py \
--clips-json /tmp/clips.json \
--output /path/to/output.mp4
Step 8 (Optional): Add Text Effects
After video editing is complete, ask the user whether they want to add text effects (danmaku burst animations, chapter titles, memorable-quote cards, etc.).
Effect types include: chapterTitles, keyPhrases, danmakuBursts, lowerThirds, quotes.
Optional Themes: douyin (default), notion, cyberpunk, aurora, apple.
Step 9: Display Results
Inform the user of the final output file path, explain which segments were cut and total duration. If there are effects, explain which text effects were added.
Error Handling
- Danmaku file parsing failure: Check if it's standard XML format (supports Bilibili format)
- Video file not found: Prompt user to check the path
- FFmpeg not installed: Prompt installation command
- Segment time exceeds video duration: Automatically crop to video end
- Remotion rendering failure: Check if node_modules exists, confirm Node.js >= 18
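The "automatically crop to video end" rule for out-of-range segments can be sketched as a small clamp. The function name is hypothetical, and dropping a segment that starts at or after the end of the video is an assumption about the intended behavior.

```python
def clamp_segment(start, end, duration):
    """Crop a segment to [0, duration]; return None if nothing remains."""
    start = max(0.0, start)
    end = min(end, duration)
    if start >= end:
        return None  # segment lies entirely outside the video
    return start, end
```

For a 60-second video, a requested segment of 50 to 70 seconds becomes 50 to 60 seconds.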
Related Skills
byted-las-video-edit
Extracts and clips video segments from long videos using natural language descriptions. AI-powered smart video editing, video trimming, and …
byted-las-video-inpaint
Removes unwanted visual elements from videos using AI-powered inpainting via Volcengine LAS. Video watermark removal, subtitle removal, logo…
byted-mediakit-voiceover-editing
Volcano Engine AI MediaKit talking-head video editing Skill: a one-stop workflow from environment setup through media management, audio proc…