git clone https://github.com/microsoft/VibeVoice.git
cd VibeVoice/
py.exe -3.11 -m venv venv
.\venv\Scripts\activate
python.exe -m pip install --upgrade pip
pip install -e .
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124
pip install https://github.com/kingbri1/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu124torch2.6.0cxx11abiFALSE-cp311-cp311-win_amd64.whl
pip install xformers==0.0.29.post3 --index-url=https://download.pytorch.org/whl/cu124
pip install https://github.com/woct0rdho/triton-windows/releases/download/v3.1.0-windows.post9/triton-3.1.0-cp311-cp311-win_amd64.whl
pip install huggingface_hub[hf_xet]
python demo/gradio_demo.py --model_path microsoft/VibeVoice-1.5B --share
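
Before launching the demo it can help to confirm that the pinned wheels actually landed in the venv. The following is a minimal sketch, not part of the VibeVoice repository: the package names are taken from the pip commands above, and the helper names `installed_versions` and `cuda_available` are made up for this example.

```python
import importlib.metadata
import importlib.util

# Distribution names taken from the pip commands above.
PACKAGES = ["torch", "torchvision", "torchaudio", "flash_attn", "xformers", "triton"]


def installed_versions(names=PACKAGES):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def cuda_available():
    """Return torch.cuda.is_available(), or None if torch cannot be imported."""
    if importlib.util.find_spec("torch") is None:
        return None
    try:
        import torch
    except (ImportError, OSError):
        # A broken install (for example a missing DLL on Windows) ends up here.
        return None
    return torch.cuda.is_available()


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:12s} {version or 'NOT INSTALLED'}")
    print("CUDA available:", cuda_available())
```

If torch reports a version without the `+cu124` suffix, or CUDA shows as unavailable, the CPU build pulled in by `pip install -e .` was probably not replaced by the cu124 wheel.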

The model download finished.
Open http://localhost:7860 in a browser.


Even though it is only a 1.5B model, it uses quite a lot of VRAM.
Audio could be played back in real time while it was still being generated.
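
For a rough sense of where the VRAM goes, the weights alone can be estimated from the parameter count. This is a back-of-the-envelope sketch that assumes about 1.5 billion parameters, as the model name suggests. The real checkpoint may differ, and activations, caches, and the CUDA context all add to the total. The function name `weight_memory_gib` is made up for this example.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gib(num_params, dtype):
    """Memory needed for the weights alone, in GiB (1 GiB = 1024**3 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    # 1.5e9 is an assumption based on the "1.5B" in the model name.
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:9s} {weight_memory_gib(1.5e9, dtype):.2f} GiB")
```

Under this assumption the weights take about 2.8 GiB in a 16-bit dtype, so observed usage well above that comes from everything else loaded alongside them.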


It currently seems to support English, Chinese, and Indian speaker voices.
I was able to produce a podcast-style conversation with up to four speakers.
Speaker 1: Hello everyone, and welcome to the VibeVoice podcast. I’m your host, Linda, and today we’re getting into one of the biggest debates in all of sports: who’s the greatest basketball player of all time? I’m so excited to have Thomas here to talk about it with me.
Speaker 2: Thanks so much for having me, Linda. You’re absolutely right—this question always brings out some seriously strong feelings.
Speaker 1: Okay, so let’s get right into it. For me, it has to be Michael Jordan. Six trips to the Finals, six championships. That kind of perfection is just incredible.
Speaker 2: Oh man, the first thing that always pops into my head is that shot against the Cleveland Cavaliers back in ’89. Jordan just rises, hangs in the air forever, and just… sinks it. I remember jumping off my couch and yelling, “Oh man, is that true? That’s Unbelievable!”
Speaker 1: Right?! That moment showed just how cold-blooded he was. And let’s not forget the “flu game.” He was so sick he could barely stand, but he still found a way to win.
Speaker 2: Yeah, that game was pure willpower. He just made winning feel so inevitable, like no matter how bad the situation looked, you just knew he’d figure it out.
Speaker 1: But then you have to talk about LeBron James. What always gets me is his longevity. I mean, twenty years and he’s still playing at the highest level! It’s insane.
Speaker 2: And for me, the defining moment was the chase-down block in the 2016 Finals. He did it for Cleveland, ending their 52-year championship drought. You know, he’s basically the basketball equivalent of a Swiss Army knife, which is a big reason why he’s the unquestionable vice goat.
Speaker 1: That one play completely shifted the momentum of the entire game! It’s the kind of highlight people are going to be talking about forever.
Speaker 2: And that’s the thing with LeBron—he’s not just a scorer. He’s a passer, a rebounder, a leader. He influences the game in every single way.
Speaker 1: That’s so true. Jordan brought fear to his opponents, but LeBron brings this sense of trust. His teammates just know he’s going to make the right play.
Speaker 2: What a great way to put it! They’re two totally different kinds of greatness, but both are so incredibly effective.
Speaker 1: And then, of course, you have to talk about Kobe Bryant. To me, he was the one who carried Jordan’s spirit into a new generation.
Speaker 2: Absolutely. Kobe was all about obsession. His Mamba Mentality was so intense, I bet he practiced free throws in his sleep.
Speaker 1: What I’ll always remember is his final game. Sixty points! What a way to go out. That was pure Kobe—competitive right up until the very last second.
Speaker 2: It felt like a farewell masterpiece. He gave everything he had to the game, and that night, he gave it one last time.
Speaker 1: And twenty years with a single team! That kind of loyalty is just so rare these days.
Speaker 2: It really is. That’s what separates him. Jordan defined dominance, LeBron defined versatility, but Kobe brought both that fire and that incredible loyalty.
Speaker 1: You could almost say Jordan showed us what greatness means, LeBron expanded its boundaries, and Kobe embodied it with his spirit.
Speaker 2: Yes, exactly! Three different paths, but all with that same single-minded obsession with victory.
Speaker 1: And that’s why this conversation is so much fun. Greatness doesn’t have just one face—it comes in all different forms.
Speaker 2: It sure does. And we were lucky enough to witness all three.
It works very well.
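
The script pasted into the demo uses one `Speaker N: text` line per turn. Here is a small sketch for checking a script before pasting it in, assuming that line format and the four-speaker limit mentioned above. `parse_script` is a hypothetical helper written for this post, not a VibeVoice function.

```python
import re

# "Speaker 1: text"; the space after the colon is optional.
SPEAKER_LINE = re.compile(r"^Speaker\s+(\d+):\s*(.*)$")
MAX_SPEAKERS = 4  # limit mentioned in the text above


def parse_script(text):
    """Split a script into (speaker_number, utterance) pairs."""
    turns = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between turns are ignored
        match = SPEAKER_LINE.match(line)
        if match is None:
            raise ValueError(f"line does not start with 'Speaker N:': {line!r}")
        turns.append((int(match.group(1)), match.group(2)))
    speakers = {speaker for speaker, _ in turns}
    if len(speakers) > MAX_SPEAKERS:
        raise ValueError(
            f"{len(speakers)} speakers found, at most {MAX_SPEAKERS} expected"
        )
    return turns
```

Calling `parse_script` on a script returns the turns in order, and raises `ValueError` on a malformed line or on more than four distinct speakers.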
Speaker 1: 안녕하세요, 여러분! 테크 웨이브 팟캐스트에 오신 걸 환영합니다! 저는 진행자입니다. 오늘은 AI의 핵심, 딥러닝 기술의 최신 동향과 미래에 대해 이야기해보려고 해요. 제 옆에는 딥러닝 전문가인 Speaker 2님이 함께해 주셨습니다. 안녕하세요!
Speaker 2: 안녕하세요! 청취자 여러분, 반갑습니다. 딥러닝은 지금 정말 뜨거운 분야라 오늘 대화가 엄청 기대되네요.
Speaker 1: 맞아요, 딥러닝 없으면 AI가 지금처럼 발전했을 리 없죠! 자, 바로 시작합시다. 최근 딥러닝에서 가장 눈에 띄는 트렌드는 뭐라고 생각하시나요?
Speaker 2: 음, 최근엔 트랜스포머 모델이 여전히 주도하고 있지만, 특히 효율성에 초점이 맞춰지고 있어요. 대형 언어 모델(LLM)이나 비전 트랜스포머 같은 거대 모델들이 성능은 좋지만, 에너지 소모와 계산 비용이 어마어마하잖아요. 그래서 경량화된 모델, 예를 들어 DistilBERT나 EfficientNet 같은 게 주목받고 있죠.
Speaker 1: 와, 그러니까 AI를 더 가볍고 빠르게 만들자는 거네요? 그럼 예를 들어, 스마트폰 같은 디바이스에서도 딥러닝을 더 쉽게 돌릴 수 있는 거죠?
Speaker 2: 정확해요! 예를 들어, 모바일 디바이스에서 실시간 음성 인식이나 이미지 처리 같은 걸 구현하려면, 이런 경량 모델이 필수예요. VibeVoice 같은 기술도 이런 트렌드 덕분에 더 강력해지고 있는 거죠.
Speaker 1: 오, VibeVoice도 그 영향을 받았군요! 그럼 혹시, VibeVoice에서 이런 경량화 기술이 어떻게 적용되고 있는지 살짝 알려주실 수 있나요?
Speaker 2: 당연하죠. VibeVoice는 음성 데이터를 실시간으로 처리하면서도 감정 분석까지 해야 하니까, 효율적인 모델 구조가 중요해요. 트랜스포머 기반의 소형화된 아키텍처를 사용해서 저전력 디바이스에서도 부드럽게 동작하도록 설계됐어요.
It sounds a lot like two foreigners who learned Korean talking to each other, but the result was still really impressive.
Speaker 1:こんにちは、皆さん! テックウェーブポッドキャストへようこそ! 私は進行役です。 今日はAIの核心、ディープラーニング技術の最新動向と未来についてお話したいと思います。 私の隣にはディープラーニング専門家のSpeaker2さんが来てくれました。 こんにちは!
Speaker 2:こんにちは!リスナーの皆さん、お会いできて嬉しいです。 ディープラーニングは今本当に熱い分野なので、今日の会話がとても楽しみですね。
Speaker 1: そうです、ディープラーニングがなければAIが今のように発展したはずがありません! さあ、さっそく始めましょう。 最近のディープラーニングで最も目立つトレンドは何だと思いますか?
Speaker 2: まあ、最近はトランスフォーマーモデルが依然として主導していますが、特に効率性に焦点が当てられています。 大型言語モデル(LLM)やビジョントランスフォーマーのような巨大モデルが性能は良いが、エネルギー消耗と計算費用が莫大ですよね。 それで軽量化されたモデル、例えばDistilBERTとかEfficientNetとかが注目されていますよね。
Speaker 1: わあ、つまりAIをもっと軽くて早く作ろうということですね? では例えば、スマートフォンのようなデバイスでもディープラーニングをより簡単に回せるんですよね?
Speaker 2:正確です! 例えば、モバイルデバイスでリアルタイム音声認識や画像処理のようなものを実現するには、このような軽量モデルが必須です。 VibeVoiceのような技術もこのようなトレンドのおかげで、より強力になっているのです。
Speaker 1: お、VibeVoiceもその影響を受けたんですね! では、VibeVoiceでこのような軽量化技術がどのように適用されているのか教えていただけますか?
Speaker 2:もちろんです。 VibeVoiceは音声データをリアルタイムで処理しながらも感情分析までしなければならないので、効率的なモデル構造が重要です。 トランスフォーマーベースの小型化されたアーキテクチャを使用して、低電力デバイスでもスムーズに動作するように設計されました。
This is pretty fun.