Some of the best video models, like tencent/hunyuan-video, are open-source, and the community has been hard at work building on top of them. We've adapted the Musubi Tuner by @kohya_tech to run on Replicate, so you can fine-tune HunyuanVideo on your own visual content.
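Here's a minimal sketch of what kicking off a fine-tune looks like with the Replicate Python client. The trainer slug, version hash, and input parameter names are placeholders, not the trainer's actual API; check the model page on Replicate for the real values.

```python
import replicate

# Start a training job. replicate.trainings.create() is the standard way
# to launch a fine-tune on Replicate; the identifiers below are hypothetical.
training = replicate.trainings.create(
    # Placeholder trainer identifier; substitute the actual HunyuanVideo
    # LoRA trainer and its current version hash from the model page.
    version="your-username/hunyuan-video-lora:<version-hash>",
    input={
        # Zip archive of captioned video clips (see the dataset model below).
        "input_videos": "https://example.com/my-dataset.zip",
        "trigger_word": "MYSTYLE",  # token used in captions and prompts (assumed name)
        "epochs": 16,               # assumed tunable hyperparameter
    },
    # Destination model on your Replicate account that receives the weights.
    destination="your-username/my-hunyuan-lora",
)
print(training.status)
```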
HunyuanVideo is good at capturing the style of its training data: not only the visual appearance and color grading of the imagery, but also the motion of the camera and the way characters move.
This style transfer extends to motion itself: because the fine-tuning runs on video clips rather than still images, the model learns how a scene moves, something that video models fine-tuned only on images cannot capture.
Preparing a training dataset, slicing footage into clips and writing a caption for each one, can be time-consuming, so we've created a model to make it easier: zsxkib/create-video-dataset takes a video file or YouTube URL as input, slices it into smaller clips, and generates captions for each clip.
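For illustration, a sketch of calling the dataset model with the Replicate Python client. The exact input field names are assumptions; consult the model's API page for the real schema, and pin a version hash in real use.

```python
import replicate

# Run the dataset-preparation model. replicate.run() resolves the model's
# latest version when no hash is given; pin one for reproducibility.
output = replicate.run(
    "zsxkib/create-video-dataset",
    input={
        "video_url": "https://www.youtube.com/watch?v=<id>",  # assumed field name
        "autocaption": True,  # assumed flag for auto-generated captions
    },
)

# The output should be an archive of clips with matching captions,
# ready to pass to the trainer as its training-data input.
print(output)
```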
Fine-tuning video models is still in its early days, so we don't yet know the full range of what's possible, or what the community will build on top of it.