Research Engineer & Tech Lead, Skywork AI
I've worked across the full pipeline of foundation model development: pre-training, supervised fine-tuning, RL alignment, evaluation, and production deployment. My work spans:

| Project | Area | Description |
|---|---|---|
| SkyReels V4 | Video | Multimodal video generation with full-modal RL. Ranked #1 on Artificial Analysis for text-to-video with audio. |
| Matrix-Game 3.0 | World | Memory-augmented interactive world model for real-time streaming. [arXiv] |
| SkyClaw-v1.0 | Agent | Agent model with million-token context for tool use and code generation. |
| R1V Series | VLM | 38B VLM with multimodal chain-of-thought reasoning. [arXiv] |
| UniPic Series | Unified | 1.5B unified model for image understanding, generation, and editing. [arXiv] |
| VL Reward Model | Reward | Multimodal reward model for RL alignment of MLLMs. [arXiv] |
| Super Agents | Agent | End-to-end agent system for autonomous task execution. |
| RED Recommendation | RecSys | Large-scale ranking, retrieval, and multi-objective optimization at Xiaohongshu. |