
MOSS-TTSD: Bilingual Dialogue Speech Synthesis

Product Introduction

MOSS-TTSD is an open-source model for bilingual (Chinese-English) dialogue speech synthesis. It turns dialogue scripts into high-quality, expressive audio, which makes it well suited to podcast production and AI-driven conversational applications. The model draws on large-scale language and speech datasets to improve the naturalness and accuracy of the generated speech.

Key Features

  • Bilingual Support: Generates speech in both Chinese and English.
  • Zero-Shot Voice Cloning: Clones a speaker's voice from a reference sample, with no speaker-specific training.
  • Long-Duration Speech: Suitable for extended audio like podcasts.
  • High Expressiveness: Delivers human-like conversational tones.
  • Flexible Deployment: Supports local and API-based inference.
  • Batch Processing: Handles multiple generation requests simultaneously.
  • Podcast Tools: Converts long texts or web content into audio.
  • Customization: Includes fine-tuning scripts for model adaptation.

Product Data

  • Target Audience: Developers, content creators, and researchers in voice synthesis and podcasting.
  • Use Cases: Podcasts, online education, entertainment applications.
  • Technical Requirements: Python environment, JSONL input files, XY Tokenizer weights.
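
The JSONL input mentioned above is a plain-text format with one JSON object per line. The sketch below shows how a batch of dialogue scripts could be written to and read back from such a file. It uses only the Python standard library; the `text` field name and the `[S1]`/`[S2]` speaker tags are assumptions made for illustration and should be checked against the MOSS-TTSD repository before use.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical records: the "text" key and the [S1]/[S2] speaker tags are
# illustrative assumptions, not a confirmed MOSS-TTSD input schema.
EXAMPLE_DIALOGUES = [
    {"text": "[S1] Welcome back to the show. [S2] Thanks, glad to be here."},
    {"text": "[S1] 今天我们聊聊语音合成。[S2] 好的，我们开始吧。"},
]


def write_jsonl(records, path):
    """Write each record as one JSON object per line (JSONL)."""
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            # ensure_ascii=False keeps Chinese text readable in the file.
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Round-trip the example records through a temporary file.
with tempfile.TemporaryDirectory() as tmp_dir:
    jsonl_path = Path(tmp_dir) / "dialogues.jsonl"
    write_jsonl(EXAMPLE_DIALOGUES, jsonl_path)
    round_trip = read_jsonl(jsonl_path)
```

Keeping one dialogue per line means a batch run can stream the file and process each script independently, which fits the batch-processing feature listed earlier.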

For more details, visit MOSS-TTSD.


