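Create audio from text by using client libraries

This quickstart walks you through using a client library to send a request to Text-to-Speech that creates audio from text.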

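To learn more about the fundamental concepts in Text-to-Speech, read Text-to-Speech basics. To see which synthetic voices are available for your language, see the supported voices and languages page.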

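Before you begin

Before you can send a request to the Text-to-Speech API, you must complete the following actions. For details, see the before you begin page.

Install the client library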

Go

go get cloud.google.com/go/texttospeech/apiv1

Java

If you are using Maven, add the following to your pom.xml file. For more information about BOMs, see The Google Cloud Platform Libraries BOM.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.65.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-texttospeech</artifactId>
  </dependency>
</dependencies>

If you are using Gradle, add the following to your dependencies:

implementation 'com.google.cloud:google-cloud-texttospeech:2.71.0'

If you are using sbt, add the following to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-texttospeech" % "2.71.0"

If you're using Visual Studio Code, IntelliJ, or Eclipse, you can add client libraries to your project using the following IDE plugins:

The plugins provide additional functionality, such as key management for service accounts. Refer to each plugin's documentation for details.

Node.js

Before installing the library, make sure that you have prepared your environment for Node.js development.

npm install @google-cloud/text-to-speech

Python

Before installing the library, make sure that you have prepared your environment for Python development.

pip install --upgrade google-cloud-texttospeech
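After installing, one quick way to confirm that the package is visible to the interpreter you plan to run the sample with is to look up its installed version. The following is a minimal sketch that uses only the Python standard library; `installed_version` is a hypothetical helper name chosen for illustration, not part of the client library.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution name as used in the pip command above.
version = installed_version("google-cloud-texttospeech")
if version is None:
    print("google-cloud-texttospeech is not installed in this environment")
else:
    print("google-cloud-texttospeech", version)
```

If this reports that the package is missing even though pip succeeded, pip and the interpreter probably belong to different environments.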

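Other languages

C#: Follow the C# setup instructions on the client libraries page, and then visit the Text-to-Speech reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page, and then visit the Text-to-Speech reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page, and then visit the Text-to-Speech reference documentation for Ruby.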

Create audio data

Now you can use Text-to-Speech to create an audio file of synthetic human speech. Use the following code to send a synthesize request to Text-to-Speech.
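Whichever language you pick, the request has the same three parts: the input text, the voice selection, and the audio configuration. As a rough illustration, the sketch below builds the equivalent JSON body and decodes a JSON response, assuming the field names of the REST `text:synthesize` method, where audio is returned base64-encoded. The helper names `build_synthesize_body` and `decode_audio_content` are hypothetical. The client libraries in the samples that follow construct the request for you and return raw audio bytes directly.

```python
import base64
import json


def build_synthesize_body(text, language_code="en-US",
                          ssml_gender="NEUTRAL", audio_encoding="MP3"):
    """Build a JSON-style request body (assumed REST field names)."""
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "ssmlGender": ssml_gender},
        "audioConfig": {"audioEncoding": audio_encoding},
    }


def decode_audio_content(response_json):
    """Decode the base64 audioContent field of a JSON response into bytes."""
    return base64.b64decode(json.loads(response_json)["audioContent"])


body = build_synthesize_body("Hello, World!")
print(json.dumps(body, indent=2))
```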

Go

// Command quickstart generates an audio file with the content "Hello, World!".
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	texttospeech "cloud.google.com/go/texttospeech/apiv1"
	"cloud.google.com/go/texttospeech/apiv1/texttospeechpb"
)

func main() {
	// Instantiates a client.
	ctx := context.Background()

	client, err := texttospeech.NewClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Perform the text-to-speech request on the text input with the selected
	// voice parameters and audio file type.
	req := texttospeechpb.SynthesizeSpeechRequest{
		// Set the text input to be synthesized.
		Input: &texttospeechpb.SynthesisInput{
			InputSource: &texttospeechpb.SynthesisInput_Text{Text: "Hello, World!"},
		},
		// Build the voice request, select the language code ("en-US") and the SSML
		// voice gender ("neutral").
		Voice: &texttospeechpb.VoiceSelectionParams{
			LanguageCode: "en-US",
			SsmlGender:   texttospeechpb.SsmlVoiceGender_NEUTRAL,
		},
		// Select the type of audio file you want returned.
		AudioConfig: &texttospeechpb.AudioConfig{
			AudioEncoding: texttospeechpb.AudioEncoding_MP3,
		},
	}

	resp, err := client.SynthesizeSpeech(ctx, &req)
	if err != nil {
		log.Fatal(err)
	}

	// The resp's AudioContent is binary.
	filename := "output.mp3"
	err = os.WriteFile(filename, resp.AudioContent, 0644)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Audio content written to file: %v\n", filename)
}

Java

// Imports the Google Cloud client library
import com.google.cloud.texttospeech.v1.AudioConfig;
import com.google.cloud.texttospeech.v1.AudioEncoding;
import com.google.cloud.texttospeech.v1.SsmlVoiceGender;
import com.google.cloud.texttospeech.v1.SynthesisInput;
import com.google.cloud.texttospeech.v1.SynthesizeSpeechResponse;
import com.google.cloud.texttospeech.v1.TextToSpeechClient;
import com.google.cloud.texttospeech.v1.VoiceSelectionParams;
import com.google.protobuf.ByteString;
import java.io.FileOutputStream;
import java.io.OutputStream;

/**
 * Google Cloud TextToSpeech API sample application. Example usage: mvn package exec:java
 * -Dexec.mainClass='com.example.texttospeech.QuickstartSample'
 */
public class QuickstartSample {

  /** Demonstrates using the Text-to-Speech API. */
  public static void main(String... args) throws Exception {
    // Instantiates a client
    try (TextToSpeechClient textToSpeechClient = TextToSpeechClient.create()) {
      // Set the text input to be synthesized
      SynthesisInput input = SynthesisInput.newBuilder().setText("Hello, World!").build();

      // Build the voice request, select the language code ("en-US") and the ssml voice gender
      // ("neutral")
      VoiceSelectionParams voice =
          VoiceSelectionParams.newBuilder()
              .setLanguageCode("en-US")
              .setSsmlGender(SsmlVoiceGender.NEUTRAL)
              .build();

      // Select the type of audio file you want returned
      AudioConfig audioConfig =
          AudioConfig.newBuilder().setAudioEncoding(AudioEncoding.MP3).build();

      // Perform the text-to-speech request on the text input with the selected voice parameters and
      // audio file type
      SynthesizeSpeechResponse response =
          textToSpeechClient.synthesizeSpeech(input, voice, audioConfig);

      // Get the audio contents from the response
      ByteString audioContents = response.getAudioContent();

      // Write the response to the output file.
      try (OutputStream out = new FileOutputStream("output.mp3")) {
        out.write(audioContents.toByteArray());
        System.out.println("Audio content written to file \"output.mp3\"");
      }
    }
  }
}

Node.js

Before running the sample, make sure that you have prepared your environment for Node.js development.

// Imports the Google Cloud client library
const textToSpeech = require('@google-cloud/text-to-speech');

// Import other required libraries
const {writeFile} = require('node:fs/promises');

// Creates a client
const client = new textToSpeech.TextToSpeechClient();

async function quickStart() {
  // The text to synthesize
  const text = 'hello, world!';

  // Construct the request
  const request = {
    input: {text: text},
    // Select the language and SSML voice gender (optional)
    voice: {languageCode: 'en-US', ssmlGender: 'NEUTRAL'},
    // select the type of audio encoding
    audioConfig: {audioEncoding: 'MP3'},
  };

  // Performs the text-to-speech request
  const [response] = await client.synthesizeSpeech(request);

  // Save the generated binary audio content to a local file
  await writeFile('output.mp3', response.audioContent, 'binary');
  console.log('Audio content written to file: output.mp3');
}

// Top-level await is not available in a CommonJS module (this file uses
// require), so handle the returned promise explicitly.
quickStart().catch(err => {
  console.error(err);
  process.exitCode = 1;
});

Python

Before running the sample, make sure that you have prepared your environment for Python development.

"""Synthesizes speech from the input string of text or ssml.
Make sure to be working in a virtual environment.

Note: ssml must be well-formed according to:
    https://www.w3.org/TR/speech-synthesis/
"""
from google.cloud import texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")

# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
)

# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response's audio_content is binary.
with open("output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')

Congratulations! You've sent your first request to Text-to-Speech.
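If you want a quick sanity check of the file before opening it in a player, you can inspect its first bytes. The sketch below is a heuristic only, not a full validator: it assumes an MP3 file starts either with an ID3 tag or with an MPEG frame header whose first 11 bits are all set. `looks_like_mp3` is a hypothetical helper name, and `output.mp3` is the filename written by the samples above.

```python
from pathlib import Path


def looks_like_mp3(data: bytes) -> bool:
    """Heuristic: the data starts with an ID3 tag or an MPEG frame sync."""
    if data[:3] == b"ID3":
        return True
    # Frame sync: 11 set bits, i.e. 0xFF followed by a byte whose top 3 bits are set.
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0


path = Path("output.mp3")
if path.exists():
    print(path, "looks like MP3:", looks_like_mp3(path.read_bytes()))
else:
    print(path, "not found; run one of the samples first")
```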

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

What's next