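Create audio from text by using client libraries

This quickstart walks you through using the client libraries to send a request to Text-to-Speech and create audio from text.

To learn more about the fundamental concepts in Text-to-Speech, see Text-to-Speech basics. To see which synthetic voices are available for your language, see the supported voices and languages page.

Before you begin

Before you can send a request to the Text-to-Speech API, you must have completed the following actions. See the before you begin page for details.

Install the client library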

Go

go get cloud.google.com/go/texttospeech/apiv1

Java

If you are using Maven, add the following to your pom.xml file. For more information about BOMs, see The Google Cloud Platform Libraries BOM.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.65.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-texttospeech</artifactId>
  </dependency>
</dependencies>

If you are using Gradle, add the following to your dependencies:

implementation 'com.google.cloud:google-cloud-texttospeech:2.71.0'

If you are using sbt, add the following to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-texttospeech" % "2.71.0"

If you're using Visual Studio Code, IntelliJ, or Eclipse, you can add client libraries to your project using an IDE plugin for that editor. These plugins provide additional functionality, such as key management for service accounts. Refer to each plugin's documentation for details.

Node.js

Before installing the library, make sure you've prepared your environment for Node.js development.

npm install @google-cloud/text-to-speech

Python

Before installing the library, make sure you've prepared your environment for Python development.

pip install --upgrade google-cloud-texttospeech
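After the install command finishes, it can be useful to confirm that the package is visible to the interpreter you plan to run the sample with. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of any Google library.

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed in this environment,
# or None if it is not.
print(installed_version("google-cloud-texttospeech"))
```

If this prints `None`, the package was probably installed into a different Python environment than the one running the script.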

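Additional languages

C#: Follow the C# setup instructions on the client libraries page, and then visit the Text-to-Speech reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page, and then visit the Text-to-Speech reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page, and then visit the Text-to-Speech reference documentation for Ruby.

Create audio data

Now you can use Text-to-Speech to create an audio file of synthetic human speech. Use the following code to send a synthesize request to the Text-to-Speech API.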

Go

// Command quickstart generates an audio file with the content "Hello, World!".
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	texttospeech "cloud.google.com/go/texttospeech/apiv1"
	"cloud.google.com/go/texttospeech/apiv1/texttospeechpb"
)

func main() {
	// Instantiates a client.
	ctx := context.Background()

	client, err := texttospeech.NewClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Perform the text-to-speech request on the text input with the selected
	// voice parameters and audio file type.
	req := texttospeechpb.SynthesizeSpeechRequest{
		// Set the text input to be synthesized.
		Input: &texttospeechpb.SynthesisInput{
			InputSource: &texttospeechpb.SynthesisInput_Text{Text: "Hello, World!"},
		},
		// Build the voice request, select the language code ("en-US") and the SSML
		// voice gender ("neutral").
		Voice: &texttospeechpb.VoiceSelectionParams{
			LanguageCode: "en-US",
			SsmlGender:   texttospeechpb.SsmlVoiceGender_NEUTRAL,
		},
		// Select the type of audio file you want returned.
		AudioConfig: &texttospeechpb.AudioConfig{
			AudioEncoding: texttospeechpb.AudioEncoding_MP3,
		},
	}

	resp, err := client.SynthesizeSpeech(ctx, &req)
	if err != nil {
		log.Fatal(err)
	}

	// The resp's AudioContent is binary.
	filename := "output.mp3"
	err = os.WriteFile(filename, resp.AudioContent, 0644)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Audio content written to file: %v\n", filename)
}

Java

// Imports the Google Cloud client library
import com.google.cloud.texttospeech.v1.AudioConfig;
import com.google.cloud.texttospeech.v1.AudioEncoding;
import com.google.cloud.texttospeech.v1.SsmlVoiceGender;
import com.google.cloud.texttospeech.v1.SynthesisInput;
import com.google.cloud.texttospeech.v1.SynthesizeSpeechResponse;
import com.google.cloud.texttospeech.v1.TextToSpeechClient;
import com.google.cloud.texttospeech.v1.VoiceSelectionParams;
import com.google.protobuf.ByteString;
import java.io.FileOutputStream;
import java.io.OutputStream;

/**
 * Google Cloud TextToSpeech API sample application. Example usage: mvn package exec:java
 * -Dexec.mainClass='com.example.texttospeech.QuickstartSample'
 */
public class QuickstartSample {

  /** Demonstrates using the Text-to-Speech API. */
  public static void main(String... args) throws Exception {
    // Instantiates a client
    try (TextToSpeechClient textToSpeechClient = TextToSpeechClient.create()) {
      // Set the text input to be synthesized
      SynthesisInput input = SynthesisInput.newBuilder().setText("Hello, World!").build();

      // Build the voice request, select the language code ("en-US") and the ssml voice gender
      // ("neutral")
      VoiceSelectionParams voice =
          VoiceSelectionParams.newBuilder()
              .setLanguageCode("en-US")
              .setSsmlGender(SsmlVoiceGender.NEUTRAL)
              .build();

      // Select the type of audio file you want returned
      AudioConfig audioConfig =
          AudioConfig.newBuilder().setAudioEncoding(AudioEncoding.MP3).build();

      // Perform the text-to-speech request on the text input with the selected voice parameters and
      // audio file type
      SynthesizeSpeechResponse response =
          textToSpeechClient.synthesizeSpeech(input, voice, audioConfig);

      // Get the audio contents from the response
      ByteString audioContents = response.getAudioContent();

      // Write the response to the output file.
      try (OutputStream out = new FileOutputStream("output.mp3")) {
        out.write(audioContents.toByteArray());
        System.out.println("Audio content written to file \"output.mp3\"");
      }
    }
  }
}

Node.js

Before running the example, make sure you've prepared your environment for Node.js development.

// Imports the Google Cloud client library
const textToSpeech = require('@google-cloud/text-to-speech');

// Import other required libraries
const {writeFile} = require('node:fs/promises');

// Creates a client
const client = new textToSpeech.TextToSpeechClient();

async function quickStart() {
  // The text to synthesize
  const text = 'hello, world!';

  // Construct the request
  const request = {
    input: {text: text},
    // Select the language and SSML voice gender (optional)
    voice: {languageCode: 'en-US', ssmlGender: 'NEUTRAL'},
    // Select the type of audio encoding
    audioConfig: {audioEncoding: 'MP3'},
  };

  // Performs the text-to-speech request
  const [response] = await client.synthesizeSpeech(request);

  // Save the generated binary audio content to a local file
  await writeFile('output.mp3', response.audioContent, 'binary');
  console.log('Audio content written to file: output.mp3');
}

// Top-level await is not available in a CommonJS module (this file uses
// require), so handle the returned promise explicitly.
quickStart().catch(err => {
  console.error(err);
  process.exitCode = 1;
});

Python

Before running the example, make sure you've prepared your environment for Python development.

"""Synthesizes speech from the input string of text or ssml.
Make sure to be working in a virtual environment.

Note: ssml must be well-formed according to:
    https://www.w3.org/TR/speech-synthesis/
"""
from google.cloud import texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")

# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
)

# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response's audio_content is binary.
with open("output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')

Congratulations! You've sent your first request to Text-to-Speech.
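The docstring in the Python sample notes that SSML input must be well-formed. As a rough illustration of what that means in practice, the sketch below wraps plain text in a `<speak>` element, escapes XML special characters, and checks that the result parses as XML. The helper names `text_to_ssml` and `is_well_formed` are hypothetical, and parsing as XML is only a necessary check, not a full validation against the SSML specification.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def text_to_ssml(text: str) -> str:
    """Wrap plain text in a <speak> element, escaping XML special characters."""
    # escape() replaces &, <, and > so the text cannot break the markup.
    return f"<speak>{escape(text)}</speak>"


def is_well_formed(ssml: str) -> bool:
    """Return True if the string parses as XML with a <speak> root element."""
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError:
        return False
    return root.tag == "speak"


ssml = text_to_ssml("Tom & Jerry say: 1 < 2")
print(ssml)                                # <speak>Tom &amp; Jerry say: 1 &lt; 2</speak>
print(is_well_formed(ssml))                # True
print(is_well_formed("<speak>unclosed"))   # False
```

A string produced this way could then be passed to the client library as `texttospeech.SynthesisInput(ssml=ssml)` in place of the `text=` argument used in the sample.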


Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

What's next

  • To learn more about Cloud Text-to-Speech, see the basics.
  • Review the list of available voices that you can use for synthetic speech.
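
The list of voices can also be retrieved programmatically. The following is a minimal sketch, assuming `client` is a `texttospeech.TextToSpeechClient` instance from the `google-cloud-texttospeech` package, whose `list_voices` method accepts a `language_code` argument and returns a response with a `voices` field. The helper name `voice_names` is hypothetical.

```python
def voice_names(client, language_code: str) -> list[str]:
    """Return the sorted names of the voices that support a language code.

    `client` is assumed to be a texttospeech.TextToSpeechClient instance;
    any object with a compatible list_voices method also works.
    """
    response = client.list_voices(language_code=language_code)
    return sorted(voice.name for voice in response.voices)


# Example usage (requires the client library and application credentials):
#
#     from google.cloud import texttospeech
#     for name in voice_names(texttospeech.TextToSpeechClient(), "en-US"):
#         print(name)
```

Passing the client in as a parameter keeps the helper free of import-time dependencies, so it can be exercised with a stand-in object before credentials are configured.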