Runtime Text To Speech (Real-Time, Offline, Streaming TTS)

Georgy Dev

Unreal Engine Versions

5.5-5.8

Distribution Method

Plugin

Asset Version

N/A

Description

Transform your game with real-time, offline, cross-platform text-to-speech synthesis! No internet, no subscriptions, no privacy risks.

Add powerful offline text-to-speech capabilities to your project with 51 languages and 2800+ voices featuring 75 voice qualities. Synthesize speech in real-time without internet connectivity, powered by Piper, Kokoro and ONNX Runtime.

Now featuring Kokoro voice models – high-quality, open-source TTS architectures with studio-level voice synthesis. Includes 151 models across 8 languages, offering natural and expressive speech output.

Key features:

🎯 Core Capabilities:

Complete offline text-to-speech synthesis
51 languages supported
2800+ unique voices available
75 voice qualities
Cross-platform support: Windows, Linux, Mac, Android (including Meta Quest), iOS
Experimental support for Apple Vision Pro

⚡ Voice System:

One-click voice model downloads through editor interface
In-editor voice preview and testing
Runtime voice model selection
Custom voice model importing (Piper & Kokoro formats)
Raw PCM float audio output
Flexible integration with any audio playback solution
Built-in compatibility with Runtime Audio Importer
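
As an illustration of what raw PCM float output allows, the sketch below converts float samples to signed 16-bit PCM, a format many playback and file-writing paths expect. It is plain standard C++ and does not call the plugin's API. The function name FloatToPcm16 is hypothetical, and the assumption that samples are normalized to the range [-1.0, 1.0] should be checked against the plugin documentation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the plugin: converts float PCM samples
// (ASSUMED to be normalized to [-1.0, 1.0]) to signed 16-bit PCM.
std::vector<std::int16_t> FloatToPcm16(const std::vector<float>& Samples)
{
    std::vector<std::int16_t> Out;
    Out.reserve(Samples.size());
    for (float Sample : Samples)
    {
        // Treat NaN as silence so the rounding below is always well defined.
        if (std::isnan(Sample))
        {
            Out.push_back(0);
            continue;
        }
        // Clamp first so out-of-range input saturates instead of wrapping.
        const float Clamped = std::clamp(Sample, -1.0f, 1.0f);
        Out.push_back(static_cast<std::int16_t>(std::lround(Clamped * 32767.0f)));
    }
    return Out;
}
```

Scaling by 32767 keeps the output symmetric around zero; sample rate and channel count are carried separately and are not changed by this conversion.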

🛠️ Development Features:

Full Blueprint and C++ API support
Regular and streaming synthesis modes
Real-time audio chunk processing
Synthesis cancellation support
Easy voice model management and packaging
Comprehensive voice metadata access
Simple voice model selection via dropdown
Automated voice model packaging with projects
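
To make the streaming and cancellation features more concrete, here is a minimal sketch of a thread-safe buffer that collects audio chunks as they arrive and stops accepting them once synthesis is cancelled. It is plain standard C++ under stated assumptions: the class FStreamingPcmBuffer and its methods are hypothetical and are not the plugin's API, and the plugin's actual chunk delivery mechanism may differ.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical consumer-side buffer for streamed float PCM chunks.
class FStreamingPcmBuffer
{
public:
    // Called once per incoming chunk; returns false after cancellation.
    bool AppendChunk(const std::vector<float>& Chunk)
    {
        if (bCancelled.load())
        {
            return false;
        }
        std::lock_guard<std::mutex> Lock(Mutex);
        Samples.insert(Samples.end(), Chunk.begin(), Chunk.end());
        return true;
    }

    // Marks the stream as cancelled; already buffered audio is kept.
    void Cancel() { bCancelled.store(true); }

    // Number of samples currently waiting to be consumed.
    std::size_t Available() const
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        return Samples.size();
    }

    // Removes and returns up to MaxSamples from the front of the buffer.
    std::vector<float> Consume(std::size_t MaxSamples)
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        const std::size_t Count = std::min(MaxSamples, Samples.size());
        std::vector<float> Out(Samples.begin(), Samples.begin() + Count);
        Samples.erase(Samples.begin(), Samples.begin() + Count);
        return Out;
    }

private:
    mutable std::mutex Mutex;
    std::vector<float> Samples;
    std::atomic<bool> bCancelled{false};
};
```

A playback callback could call Consume with the number of samples it needs per audio frame, while the synthesis side calls AppendChunk for each chunk. A production version would likely use a ring buffer instead of erasing from the front of a vector.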

🌍 Supported languages:

Some voice models also support multiple speakers, which significantly increases the variety of available voices. For example, English LibriTTS alone includes more than 900 different speakers.

🇺🇸 English (United States)
🇬🇧 English (British)
🇨🇳 Simplified Chinese (简体中文)
🇲🇽 Spanish (Mexican / Español Mexicano)
🇪🇸 Spanish (European / Español Europeo)
🇦🇷 Spanish (Argentinian / Español Argentino)
🇨🇴 Spanish (Colombian / Español Colombiano)
🇰🇷 Korean (한국어)
🇷🇺 Russian (Русский)
🇧🇷 Portuguese (Brazil / Português do Brasil)
🇵🇹 Portuguese (Portugal / Português de Portugal)
🇮🇳 Hindi (हिन्दी)
🇮🇳 Malayalam (മലയാളം)
🇮🇳 Telugu (తెలుగు)
🇩🇪 German (Deutsch)
🇫🇷 French (Français)
🇸🇦 Arabic (العربية)
🇹🇷 Turkish (Türkçe)
🇵🇱 Polish (Polski)
🇮🇹 Italian (Italiano)
🇺🇦 Ukrainian (Українська)
🇦🇩 Catalan (Català)
🇧🇬 Bulgarian (Български)
🇨🇿 Czech (Čeština)
🏴󠁧󠁢󠁷󠁬󠁳󠁿 Welsh (Cymraeg)
🇩🇰 Danish (Dansk)
🇬🇷 Greek (Ελληνικά)
🇵🇰 Urdu (اردو)
🇮🇷 Persian / Farsi (فارسی)
🇫🇮 Finnish (Suomi)
🇪🇸 Basque (Euskara)
🇭🇺 Hungarian (Magyar)
🇮🇸 Icelandic (Íslenska)
🇮🇩 Indonesian (Bahasa Indonesia)
🇬🇪 Georgian (ქართული ენა)
🇰🇿 Kazakh (Қазақша)
🇱🇺 Luxembourgish (Lëtzebuergesch)
🇱🇻 Latvian (Latviešu)
🇳🇵 Nepali (नेपाली)
🇧🇪 Dutch (Belgium / Vlaams)
🇳🇱 Dutch (Netherlands / Nederlands)
🇳🇴 Norwegian (Bokmål / Nynorsk)
🇷🇴 Romanian (Română)
🇸🇰 Slovak (Slovenčina)
🇸🇮 Slovenian (Slovenščina)
🇷🇸 Serbian (Srpski)
🇸🇪 Swedish (Svenska)
🇦🇱 Albanian (Shqip)
🇰🇪 Swahili (Kiswahili)
🇹🇷 Kurdish Kurmanji (Kurdî)
🇻🇳 Vietnamese (Tiếng Việt)

🎮 Perfect for:

Accessible game interfaces
Dynamic NPC conversations
Voice-driven tutorials and hints
Procedurally generated content
Localization solutions
Assistive technologies
Interactive storytelling
Educational applications

Technical details

Features:

Simple, intuitive setup
High-quality synthesized speech
Rapid synthesis speed
Support for 51 languages, 2800+ voices, and 75 voice qualities
Automatic downloading and packaging of voice models via the editor
Easy management of voice models in the editor (e.g. download and delete)
Voice model preview via the editor tool
Cross-platform compatibility: Windows, Linux, Mac, Android (including Meta Quest), iOS

Documentation: https://docs.georgy.dev/runtime-text-to-speech/overview