Tongyi Tingwu
Introduction
Key Features
- Multilingual and Dialect Transcription: The tool transcribes audio and video files in multiple languages, including English, Chinese, and Cantonese. It also supports the transcription of mixed-language speech, such as a combination of Chinese and English, providing word-level timestamps for precision.
- Real-Time Transcription and Translation: Tongyi Tingwu offers instantaneous transcription during live events like meetings and lectures. It can also provide real-time translation between languages, displaying the output as parallel subtitles to facilitate cross-language communication.
- AI-Powered Summarisation: The system uses a large language model to generate several types of summaries from transcribed content. These summaries include chapter overviews, key point reviews, and analyses of speaker contributions, and can identify action items and implicit questions from the dialogue.
- Speaker Diarisation: The tool automatically identifies and separates different speakers within an audio or video file, a process known as speaker diarisation. This feature is critical for accurately documenting multi-person conversations, and users can manually edit the assigned speaker labels for correctness.
User Guide
This guide outlines how to transcribe audio and video files, perform speaker diarisation, generate AI-powered summaries, and export the final content using Tongyi Tingwu.
(Access Date: 22 July 2025)
Access Tongyi Tingwu Website
| a. | Navigate to the website: https://tingwu.aliyun.com/home; |
| b. | Log in or register using an Alibaba Cloud account or a phone number; |
| c. | Enter your credentials to access the main dashboard. |
Transcribe Audio and Video Files
| a. | Click 首頁 (Home Page) from the left-side menu on the main dashboard; |
| b. | Click 上傳音視頻 (Upload Audio/Video File) and select either 上傳本地音視頻文件 (Upload Local Audio/Video File) or 導入阿里雲盤文件 (Import from Alibaba Cloud Drive) (Note: You can upload up to 50 files); |
| c. | Select the original language of the file; |
| d. | Select the option to translate the original language if required; |
| e. | Select the number of speakers for diarisation; |
| f. | Click the 開始轉寫 (Start Transcription) button; |
| g. | Wait for the transcription process to complete. |
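When a collection of recordings exceeds the 50-file limit noted in step b, it can help to plan the uploads in groups beforehand. The following is a minimal sketch, assuming the limit applies per upload; the file names and the `batch_files` helper are hypothetical and are not part of Tongyi Tingwu.

```python
from typing import Iterator, List

# Assumption: the 50-file limit from the note in step b applies per upload.
MAX_FILES_PER_UPLOAD = 50


def batch_files(paths: List[str], batch_size: int = MAX_FILES_PER_UPLOAD) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(paths), batch_size):
        yield paths[start:start + batch_size]


# Hypothetical file names, for illustration only.
recordings = [f"lecture_{i:03d}.mp4" for i in range(1, 121)]
batches = list(batch_files(recordings))
print([len(batch) for batch in batches])  # prints [50, 50, 20]
```

Each batch can then be selected in the upload dialog in turn.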
Perform Speaker Diarisation
| a. | Click 我的記錄 (My History) from the left-side menu on the main dashboard; |
| b. | Click the processed task to open the file's detail page; |
| c. | Review the automatically applied speaker labels; |
| d. | Click on the speaker labels to manually edit and rename them. |
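The effect of renaming speaker labels can be pictured as a mapping applied to every segment of the transcript. The sketch below is illustrative only: the `Segment` structure, the default labels, and the names are assumptions, not Tongyi Tingwu's actual data format.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Segment:
    """One diarised transcript segment (hypothetical structure)."""
    speaker: str
    start_s: float
    text: str


def rename_speakers(segments: List[Segment], names: Dict[str, str]) -> List[Segment]:
    """Return new segments with labels replaced via `names`; unknown labels are kept."""
    return [replace(seg, speaker=names.get(seg.speaker, seg.speaker)) for seg in segments]


# Hypothetical automatic labels and text, for illustration only.
segments = [
    Segment("Speaker 1", 0.0, "Welcome, everyone."),
    Segment("Speaker 2", 4.2, "Thank you for the invitation."),
    Segment("Speaker 1", 9.8, "Let us begin."),
]
renamed = rename_speakers(segments, {"Speaker 1": "Prof. Chan", "Speaker 2": "Guest"})
for seg in renamed:
    print(f"[{seg.start_s:5.1f}s] {seg.speaker}: {seg.text}")
```

Renaming one label updates every segment attributed to that speaker, which is why correcting a label once is enough for the whole transcript.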
Generate AI Summaries
| a. | Select the 章節速覽 (Chapter Overview), 發言總結 (Speaker Summary), or 要點回顧 (Key Points Review) tab on the transcript page; |
| b. | Review the generated breakdown, overview, keywords, and action items; |
| c. | Click the copy button in the upper-right corner of any tab to copy the content to the notes area (Note: To copy all content to the notes area at once, click the 批量提取 (Batch Extract) button in the top panel). |
Export and Share
| a. | Click the 保存 (Save) button in the left-side panel; |
| b. | Click 導出 (Export) to export the file in various formats. |
Educational Scenarios
Lecture Summary
A professor records guest lectures in English for a multilingual class, and Tongyi Tingwu generates synchronised bilingual transcripts with speaker labels. The bilingual transcripts support linguistic diversity and help all students, whatever their language proficiency, engage fully with the lecture content. They also act as a scaffold: students can switch between languages to reinforce their understanding of complex concepts. Speaker labels attribute each insight and perspective to an individual speaker, which is particularly valuable in discussions with multiple contributors. This level of detail aids the development of critical listening skills and helps students contextualise information within the broader scope of the lecture. The transcripts also support differentiated instruction by letting students review content at their own pace, accommodating different learning preferences.
Research Interview Analysis
A sociology team conducts fieldwork interviews in regional dialects, and Tongyi Tingwu transcribes and translates the recordings. Transcribing and translating dialect interviews brings diverse voices and perspectives into the analysis, which is essential for comprehensive sociological work. Timestamped notes let researchers cross-reference quotes with field observations efficiently, improving the accuracy and depth of the analysis and giving a more nuanced picture of the social contexts under study. Because the tool streamlines transcription and translation, researchers can devote more time to data interpretation and theory development.
Conference Summary
An academic committee uses the tool to transcribe panel discussions from an international conference. Summaries are distributed to attendees, and keyword trends inform next year's agenda planning. Accurate transcripts let participants revisit and reflect on the discussions, fostering a deeper understanding of the topics presented. The summaries also make the conference content accessible to those unable to attend, extending the event's reach beyond its original time and place and allowing continued engagement with the ideas presented. Analysing keyword trends reveals emerging academic interests and concerns, so organisers can plan future events that remain relevant and responsive to the evolving needs of the academic community. This reflects the growing importance of data-driven decision-making in academic contexts.
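Keyword trend analysis of this kind can be approximated on exported transcripts with a simple frequency count. The sketch below is a minimal illustration, assuming the transcripts are available as plain text; the panel names, sample sentences, keyword list, and `keyword_counts` helper are hypothetical, and this is not how Tongyi Tingwu itself extracts keywords.

```python
import re
from collections import Counter
from typing import Dict, Sequence


def keyword_counts(transcripts: Dict[str, str], keywords: Sequence[str]) -> Counter:
    """Count whole-phrase, case-insensitive keyword occurrences across all transcripts."""
    totals: Counter = Counter()
    for text in transcripts.values():
        lowered = text.lower()
        for keyword in keywords:
            pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
            totals[keyword] += len(re.findall(pattern, lowered))
    return totals


# Hypothetical transcript excerpts, for illustration only.
panels = {
    "Panel A": "Open access is changing peer review. Open access mandates are growing.",
    "Panel B": "Generative AI raises questions for peer review and assessment.",
}
counts = keyword_counts(panels, ["open access", "peer review", "generative ai"])
for keyword, total in counts.most_common():
    print(f"{keyword}: {total}")
```

Comparing such counts from one year's panels to the next gives a rough indication of which topics are gaining attention.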
Multilingual Study Groups
International engineering students collaborate on a capstone project, and Tongyi Tingwu transcribes their brainstorming sessions. Non-native speakers use translated transcripts to clarify tasks and responsibilities. Real-time transcription and translation help reduce the language barrier, a common challenge in multicultural education and in intercultural communication within global engineering education. Because every team member has clear access to project discussions, the tool promotes equitable participation and helps mitigate misunderstandings, which in turn improves the quality and depth of the project. The process also supports language acquisition: non-native speakers gain exposure to technical vocabulary and conversational nuances, building both linguistic proficiency and subject-matter expertise.
Thesis Defence Practice
PhD candidates practise their defence presentations while Tongyi Tingwu transcribes their speech, and then review the transcripts to identify misspoken words. An accurate transcript of a practice presentation enables detailed self-assessment, a critical component of metacognitive learning strategies. By spotting misspoken words and reviewing how they articulate complex ideas, candidates can refine their presentation skills and the clarity of their academic communication. Reviewing both content and delivery also helps them ensure that what they say complements their visual aids and written work, leading to a more cohesive and impactful defence. This kind of rehearsal can help reduce anxiety and build confidence, so that candidates deliver their arguments effectively and convincingly.
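If a candidate has a written script, misspoken words can be located by comparing the script with the exported transcript word by word. The sketch below uses Python's standard `difflib` module for this; the sample sentences and the `find_deviations` helper are hypothetical, and the approach assumes both texts are available as plain text.

```python
import difflib
import re
from typing import List, Tuple


def words(text: str) -> List[str]:
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_deviations(script: str, transcript: str) -> List[Tuple[str, str, str]]:
    """Return (operation, scripted words, spoken words) for each non-matching span."""
    scripted, spoken = words(script), words(transcript)
    matcher = difflib.SequenceMatcher(a=scripted, b=spoken, autojunk=False)
    deviations = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            deviations.append((tag, " ".join(scripted[i1:i2]), " ".join(spoken[j1:j2])))
    return deviations


# Hypothetical script and transcript, for illustration only.
script = "Our results show a significant correlation between sleep and memory."
transcript = "Our results show a significant causation between sleep and memory."
for tag, scripted, spoken in find_deviations(script, transcript):
    print(f"{tag}: scripted '{scripted}' / spoken '{spoken}'")
# prints: replace: scripted 'correlation' / spoken 'causation'
```

A deviation may also reflect a transcription error rather than a slip of the tongue, so each flagged span is best checked against the recording.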
Bridging Language Barriers for International Students
An international student at a university in Hong Kong uses Tongyi Tingwu to transcribe their Cantonese-speaking professor's lectures and translate them into English. After class, the student reviews the transcript, using the keyword highlighting feature to identify key concepts. Accurate transcription and translation give students access to course content that language differences might otherwise make difficult to follow. Keyword highlighting adds further support by directing attention to essential concepts and terminology, aiding comprehension and retention; this is particularly helpful for international students who are grappling with complex content and a new language at the same time. This use of technology also promotes learner autonomy, a crucial factor in successful language acquisition and academic achievement: students can review lecture content at their own pace and develop personalised study strategies.
