This post details a quick test merging Google’s NotebookLM with HeyGen’s lip-sync capabilities. While the overall concept is exciting, there are still notable challenges. HeyGen’s video editing tool, for instance, could benefit from features like support for multiple audio tracks or the ability to assign specific avatars to individual audio tracks.
Despite these hurdles, the journey towards fully automated, “press-a-button” AI production continues. We might not be there yet, but progress is steady.
Workflow Breakdown:
- Podcast created using NotebookLM
- Male voice track processed locally using RVC
- Audio edited in Audacity to realign the female and male voice tracks
- Finished audio uploaded to HeyGen
- Video generated by mixing and matching two of HeyGen’s standard avatars
- The final video is complete!
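The realignment step in this workflow was done by hand in Audacity. For anyone who would rather script it, here is a minimal sketch of one possible approach: estimate how far a processed track has drifted from the original it was derived from, using cross-correlation, then shift it back. It assumes NumPy, mono tracks already loaded as float arrays at the same sample rate, and a single constant offset. The function names `estimate_offset` and `shift` are made up for illustration, and the synthetic noise signal stands in for real audio.

```python
import numpy as np


def estimate_offset(reference, other):
    """Return the lag in samples by which `other` trails `reference`.

    A positive result means `other` starts later than `reference`.
    """
    # Full cross-correlation; output index i corresponds to lag i - (len(reference) - 1).
    corr = np.correlate(other, reference, mode="full")
    return int(np.argmax(corr)) - (len(reference) - 1)


def shift(signal, lag):
    """Delay (lag > 0) or advance (lag < 0) a signal, padding with silence.

    The output has the same length as the input.
    """
    out = np.zeros_like(signal)
    if lag > 0:
        out[lag:] = signal[:-lag]
    elif lag < 0:
        out[:lag] = signal[-lag:]
    else:
        out[:] = signal
    return out


# Synthetic stand-in for a voice track (assumption: real audio would be loaded from a file).
rng = np.random.default_rng(0)
original = rng.standard_normal(2000)

# Pretend the processing step introduced a 150-sample delay.
converted = shift(original, 150)

lag = estimate_offset(original, converted)
realigned = shift(converted, -lag)
print("estimated lag in samples:", lag)
```

Direct cross-correlation is fine for short clips like this one. For a full-length podcast, an FFT-based correlation or a comparison limited to the first few seconds would be much faster, and a voice-converted track may correlate less cleanly with its source than this toy signal does.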
“And that’s why it’s not a ‘press-a-button’ AI world we live in... yet.”