
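Send a processing request

After you set up your Google Cloud account and create a processor, you can send requests to your Document AI processor.

The code for sending a request is the same for every processor. The differences between processors show up in the information that each one returns.

With the v1 version of Document AI, or in the Google Cloud console, you can send a processing request to a specific processor version. If you don't specify a processor version, the default version is used. For more information, see Managing processor versions.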

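Online processing

Online (synchronous) requests let you send a single document for processing. Document AI processes the request immediately and returns a document.

Send a request to a processor

The following code samples show how to send a request to a processor.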

REST

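This example shows how to provide the document content in the rawDocument object: the raw bytes of the document, supplied as a base64-encoded string.

Alternatively, you can specify inlineDocument, which uses the same Document JSON format that Document AI returns. This lets you chain requests by passing the same format back and forth (for example, classifying a document and then extracting its contents).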
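As a rough illustration of chaining, the following Python sketch takes the document field of one process response and places it in the inlineDocument field of the next request body. The helper name build_followup_request and the trimmed classifier_response payload are hypothetical, and whether a returned Document carries everything the next processor needs depends on the processor and on any field mask you applied.

```python
import json


def build_followup_request(first_response: dict, field_mask: str = "") -> dict:
    """Build a request body that reuses the Document from a previous response.

    Hypothetical helper: it only rearranges JSON and sends nothing.
    """
    body = {"inlineDocument": first_response["document"]}
    if field_mask:
        body["fieldMask"] = field_mask
    return body


# Hypothetical, heavily trimmed response from a first (classifier) request.
classifier_response = {
    "document": {
        "mimeType": "application/pdf",
        "text": "Invoice 42",
        "entities": [{"type": "invoice"}],
    }
}

followup = build_followup_request(classifier_response, "text,entities")
print(json.dumps(followup, indent=2))
```

The follow-up body contains inlineDocument instead of rawDocument, so only one of the two document sources is set.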

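Before using any of the request data, make the following replacements:

LOCATION: your processor's location, for example:
- us - United States
- eu - European Union
PROJECT_ID: your Google Cloud project ID.
PROCESSOR_ID: the ID of your custom processor.
skipHumanReview: a boolean that disables human review (supported by Human-in-the-Loop processors only).
- true - skips human review
- false - enables human review (default)
MIME_TYPE†: one of the valid MIME type options.
IMAGE_CONTENT†: the inline document content, represented as a stream of bytes. In the JSON representation, this is the base64 encoding (an ASCII string) of your binary image data. The string should look similar to the following:
- /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
For more information, see the Base64 encoding topic.
FIELD_MASK: specifies which fields to include in the Document output. This is a comma-separated list of fully qualified field names in FieldMask format.
- Example: text,entities,pages.pageNumber
INDIVIDUAL_PAGES: a list of the individual pages to process.
- Alternatively, provide the field fromStart or fromEnd to process a specific number of pages from the beginning or end of the document.

† This content can also be specified using base64-encoded content in the inlineDocument object.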
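The page selection choices in the list above can be sketched as a small Python helper that produces the processOptions part of the request body. The function name build_process_options is hypothetical, and the rule that exactly one of the three choices may be set is an assumption made for this sketch.

```python
from typing import Optional, Sequence


def build_process_options(
    pages: Optional[Sequence[int]] = None,
    from_start: Optional[int] = None,
    from_end: Optional[int] = None,
) -> dict:
    """Return a processOptions object for one page selection choice.

    Hypothetical helper. Assumes the three choices are mutually exclusive.
    """
    chosen = [value for value in (pages, from_start, from_end) if value is not None]
    if len(chosen) != 1:
        raise ValueError("Provide exactly one of pages, from_start, or from_end.")
    if pages is not None:
        return {"individualPageSelector": {"pages": list(pages)}}
    if from_start is not None:
        return {"fromStart": from_start}
    return {"fromEnd": from_end}


# Select pages 1 and 3 individually, or the first five pages of the document.
print(build_process_options(pages=[1, 3]))
print(build_process_options(from_start=5))
```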

HTTP method and URL:

POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process

Request JSON body:

{
  "skipHumanReview": skipHumanReview,
  "rawDocument": {
    "mimeType": "MIME_TYPE",
    "content": "IMAGE_CONTENT"
  },
  "fieldMask": "FIELD_MASK",
  "processOptions": {
    "individualPageSelector": {
      "pages": [INDIVIDUAL_PAGES]
    }
  }
}
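If you prefer to assemble this request body programmatically, a minimal Python sketch using only the standard library might look like the following. The helper name build_process_request and the sample bytes are hypothetical; in practice you would read the bytes from your own file.

```python
import base64
import json
from typing import Optional


def build_process_request(
    content: bytes,
    mime_type: str,
    field_mask: Optional[str] = None,
    skip_human_review: bool = False,
) -> dict:
    """Build a process request body with base64-encoded rawDocument content."""
    body = {
        "skipHumanReview": skip_human_review,
        "rawDocument": {
            "mimeType": mime_type,
            # JSON cannot carry raw bytes, so the content is sent as base64 text.
            "content": base64.b64encode(content).decode("ascii"),
        },
    }
    if field_mask:
        body["fieldMask"] = field_mask
    return body


# Hypothetical stand-in for the bytes of a real document.
sample_bytes = b"%PDF-1.4 sample"
request_body = build_process_request(sample_bytes, "application/pdf", "text,entities")
request_json = json.dumps(request_body, indent=2)
print(request_json)
```

Writing request_json to a file named request.json gives you a body in the shape that the commands in this section expect.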

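To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: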

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process"

PowerShell

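Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: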

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. The response body contains an instance of Document.
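To read the returned Document, the Python sketch below pulls paragraph text out of a parsed JSON response by following each paragraph's textAnchor into the document text, mirroring what the client library samples on this page do. The response payload is a hypothetical, heavily trimmed stand-in, and the sketch assumes that index values may arrive as JSON strings and that startIndex may be absent for the first segment.

```python
def text_from_anchor(text_anchor: dict, full_text: str) -> str:
    """Return the slice(s) of full_text that a textAnchor points to."""
    pieces = []
    for segment in text_anchor.get("textSegments", []):
        # Assumption: a missing startIndex means the segment starts at 0,
        # and indexes may be encoded as strings in JSON.
        start = int(segment.get("startIndex", 0))
        end = int(segment["endIndex"])
        pieces.append(full_text[start:end])
    return "".join(pieces)


# Hypothetical, trimmed response body.
response = {
    "document": {
        "text": "Hello\nWorld\n",
        "pages": [
            {
                "pageNumber": 1,
                "paragraphs": [
                    {"layout": {"textAnchor": {"textSegments": [{"endIndex": "6"}]}}},
                    {
                        "layout": {
                            "textAnchor": {
                                "textSegments": [{"startIndex": "6", "endIndex": "12"}]
                            }
                        }
                    },
                ],
            }
        ],
    }
}

document = response["document"]
paragraphs = [
    text_from_anchor(paragraph["layout"]["textAnchor"], document["text"])
    for page in document.get("pages", [])
    for paragraph in page.get("paragraphs", [])
]
print(paragraphs)
```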

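Send a request to a processor version

Before using any of the request data, make the following replacements:

LOCATION: your processor's location, for example:
- us - United States
- eu - European Union
PROJECT_ID: your Google Cloud project ID.
PROCESSOR_ID: the ID of your custom processor.
PROCESSOR_VERSION: the processor version identifier. For more information, see Select a processor version. For example:
- pretrained-TYPE-vX.X-YYYY-MM-DD
- stable
- rc
skipHumanReview: a boolean that disables human review (supported by Human-in-the-Loop processors only).
- true - skips human review
- false - enables human review (default)
MIME_TYPE†: one of the valid MIME type options.
IMAGE_CONTENT†: the inline document content, represented as a stream of bytes. In the JSON representation, this is the base64 encoding (an ASCII string) of your binary image data. The string should look similar to the following:
- /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
For more information, see the Base64 encoding topic.
FIELD_MASK: specifies which fields to include in the Document output. This is a comma-separated list of fully qualified field names in FieldMask format.
- Example: text,entities,pages.pageNumber

† This content can also be specified using base64-encoded content in the inlineDocument object.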

HTTP method and URL:

POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/PROCESSOR_VERSION:process
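The processor URL and the processor version URL differ only in the optional processorVersions segment. A small Python sketch (the function name process_url is hypothetical) makes that relationship explicit:

```python
from typing import Optional


def process_url(
    project_id: str,
    location: str,
    processor_id: str,
    processor_version: Optional[str] = None,
) -> str:
    """Build the :process URL for a processor or for one of its versions."""
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/processors/{processor_id}"
    )
    if processor_version:
        # Without this segment, the processor's default version handles the request.
        url += f"/processorVersions/{processor_version}"
    return url + ":process"


print(process_url("my-project", "us", "abc123"))
print(process_url("my-project", "eu", "abc123", "stable"))
```

The project and processor IDs shown here are placeholder values.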

Request JSON body:

{
  "skipHumanReview": skipHumanReview,
  "rawDocument": {
    "mimeType": "MIME_TYPE",
    "content": "IMAGE_CONTENT"
  },
  "fieldMask": "FIELD_MASK"
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/PROCESSOR_VERSION:process"

PowerShell

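Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: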

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID/processorVersions/PROCESSOR_VERSION:process" | Select-Object -Expand Content

If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format. The response body contains an instance of Document.

C#

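For more information, see the Document AI C# API reference documentation.

To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.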

using Google.Cloud.DocumentAI.V1;
using Google.Protobuf;
using System;
using System.IO;

public class QuickstartSample
{
    public Document Quickstart(
        string projectId = "your-project-id",
        string locationId = "your-processor-location",
        string processorId = "your-processor-id",
        string localPath = "my-local-path/my-file-name",
        string mimeType = "application/pdf"
    )
    {
        // Create client
        var client = new DocumentProcessorServiceClientBuilder
        {
            Endpoint = $"{locationId}-documentai.googleapis.com"
        }.Build();

        // Read in local file
        using var fileStream = File.OpenRead(localPath);
        var rawDocument = new RawDocument
        {
            Content = ByteString.FromStream(fileStream),
            MimeType = mimeType
        };

        // Initialize request argument(s)
        var request = new ProcessRequest
        {
            Name = ProcessorName.FromProjectLocationProcessor(projectId, locationId, processorId).ToString(),
            RawDocument = rawDocument
        };

        // Make the request
        var response = client.ProcessDocument(request);

        var document = response.Document;
        Console.WriteLine(document.Text);
        return document;
    }
}

Java

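For more information, see the Document AI Java API reference documentation.

To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.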

import com.google.cloud.documentai.v1.Document;
import com.google.cloud.documentai.v1.DocumentProcessorServiceClient;
import com.google.cloud.documentai.v1.DocumentProcessorServiceSettings;
import com.google.cloud.documentai.v1.ProcessRequest;
import com.google.cloud.documentai.v1.ProcessResponse;
import com.google.cloud.documentai.v1.RawDocument;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;

public class ProcessDocument {
  public static void processDocument()
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-project-id";
    String location = "your-project-location"; // Format is "us" or "eu".
    String processorId = "your-processor-id";
    String filePath = "path/to/input/file.pdf";
    processDocument(projectId, location, processorId, filePath);
  }

  public static void processDocument(
      String projectId, String location, String processorId, String filePath)
      throws IOException, InterruptedException, ExecutionException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs
    // to be created once, and can be reused for multiple requests. After completing
    // all of your requests, call the "close" method on the client to safely clean up
    // any remaining background resources.
    String endpoint = String.format("%s-documentai.googleapis.com:443", location);
    DocumentProcessorServiceSettings settings =
        DocumentProcessorServiceSettings.newBuilder().setEndpoint(endpoint).build();
    try (DocumentProcessorServiceClient client = DocumentProcessorServiceClient.create(settings)) {
      // The full resource name of the processor, e.g.:
      // projects/project-id/locations/location/processors/processor-id
      // You must create new processors in the Cloud Console first
      String name =
          String.format("projects/%s/locations/%s/processors/%s", projectId, location, processorId);

      // Read the file.
      byte[] imageFileData = Files.readAllBytes(Paths.get(filePath));

      // Wrap the raw file bytes in a ByteString.
      ByteString content = ByteString.copyFrom(imageFileData);

      RawDocument document =
          RawDocument.newBuilder().setContent(content).setMimeType("application/pdf").build();

      // Configure the process request.
      ProcessRequest request =
          ProcessRequest.newBuilder().setName(name).setRawDocument(document).build();

      // Recognizes text entities in the PDF document
      ProcessResponse result = client.processDocument(request);
      Document documentResponse = result.getDocument();

      // Get all of the document text as one big string
      String text = documentResponse.getText();

      // Read the text recognition output from the processor
      System.out.println("The document contains the following paragraphs:");
      Document.Page firstPage = documentResponse.getPages(0);
      List<Document.Page.Paragraph> paragraphs = firstPage.getParagraphsList();

      for (Document.Page.Paragraph paragraph : paragraphs) {
        String paragraphText = getText(paragraph.getLayout().getTextAnchor(), text);
        System.out.printf("Paragraph text:\n%s\n", paragraphText);
      }

      // Form parsing provides additional output about
      // form-formatted PDFs. You must create a form
      // processor in the Cloud Console to see full field details.
      System.out.println("The following form key/value pairs were detected:");

      for (Document.Page.FormField field : firstPage.getFormFieldsList()) {
        String fieldName = getText(field.getFieldName().getTextAnchor(), text);
        String fieldValue = getText(field.getFieldValue().getTextAnchor(), text);

        System.out.println("Extracted form fields pair:");
        System.out.printf("\t(%s, %s)\n", fieldName, fieldValue);
      }
    }
  }

  // Extract shards from the text field
  private static String getText(Document.TextAnchor textAnchor, String text) {
    if (textAnchor.getTextSegmentsList().size() > 0) {
      int startIdx = (int) textAnchor.getTextSegments(0).getStartIndex();
      int endIdx = (int) textAnchor.getTextSegments(0).getEndIndex();
      return text.substring(startIdx, endIdx);
    }
    return "[NO TEXT]";
  }
}

Node.js

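For more information, see the Document AI Node.js API reference documentation.

To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.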

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION'; // Format is 'us' or 'eu'
// const processorId = 'YOUR_PROCESSOR_ID'; // Create processor in Cloud Console
// const filePath = '/path/to/local/pdf';

const {DocumentProcessorServiceClient} =
  require('@google-cloud/documentai').v1;

// Instantiates a client
const client = new DocumentProcessorServiceClient();

async function processDocument() {
  // The full resource name of the processor, e.g.:
  // projects/project-id/locations/location/processors/processor-id
  // You must create new processors in the Cloud Console first
  const name = `projects/${projectId}/locations/${location}/processors/${processorId}`;

  // Read the file into memory.
  const fs = require('fs').promises;
  const imageFile = await fs.readFile(filePath);

  // Convert the image data to a Buffer and base64 encode it.
  const encodedImage = Buffer.from(imageFile).toString('base64');

  const request = {
    name,
    rawDocument: {
      content: encodedImage,
      mimeType: 'application/pdf',
    },
  };

  // Recognizes text entities in the PDF document
  const [result] = await client.processDocument(request);
  const {document} = result;

  // Get all of the document text as one big string
  const {text} = document;

  // Extract shards from the text field
  const getText = textAnchor => {
    if (!textAnchor.textSegments || textAnchor.textSegments.length === 0) {
      return '';
    }

    // First shard in document doesn't have startIndex property
    const startIndex = textAnchor.textSegments[0].startIndex || 0;
    const endIndex = textAnchor.textSegments[0].endIndex;

    return text.substring(startIndex, endIndex);
  };

  // Read the text recognition output from the processor
  console.log('The document contains the following paragraphs:');
  const [page1] = document.pages;
  const {paragraphs} = page1;

  for (const paragraph of paragraphs) {
    const paragraphText = getText(paragraph.layout.textAnchor);
    console.log(`Paragraph text:\n${paragraphText}`);
  }

  // Form parsing provides additional output about
  // form-formatted PDFs. You must create a form
  // processor in the Cloud Console to see full field details.
  console.log('\nThe following form key/value pairs were detected:');

  const {formFields} = page1;
  for (const field of formFields) {
    const fieldName = getText(field.fieldName.textAnchor);
    const fieldValue = getText(field.fieldValue.textAnchor);

    console.log('Extracted key value pair:');
    console.log(`\t(${fieldName}, ${fieldValue})`);
  }
}

// Run the sample once the variables above are uncommented.
processDocument().catch(err => {
  console.error(err);
  process.exitCode = 1;
});

Python

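For more information, see the Document AI Python API reference documentation.

To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.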

from typing import Optional

from google.api_core.client_options import ClientOptions
from google.cloud import documentai  # type: ignore

# TODO(developer): Uncomment these variables before running the sample.
# project_id = "YOUR_PROJECT_ID"
# location = "YOUR_PROCESSOR_LOCATION" # Format is "us" or "eu"
# processor_id = "YOUR_PROCESSOR_ID" # Create processor before running sample
# file_path = "/path/to/local/pdf"
# mime_type = "application/pdf" # Refer to https://cloud.google.com/document-ai/docs/file-types for supported file types
# field_mask = "text,entities,pages.pageNumber"  # Optional. The fields to return in the Document object.
# processor_version_id = "YOUR_PROCESSOR_VERSION_ID" # Optional. Processor version to use


def process_document_sample(
    project_id: str,
    location: str,
    processor_id: str,
    file_path: str,
    mime_type: str,
    field_mask: Optional[str] = None,
    processor_version_id: Optional[str] = None,
) -> None:
    # You must set the `api_endpoint` if you use a location other than "us".
    opts = ClientOptions(api_endpoint=f"{location}-documentai.googleapis.com")

    client = documentai.DocumentProcessorServiceClient(client_options=opts)

    if processor_version_id:
        # The full resource name of the processor version, e.g.:
        # `projects/{project_id}/locations/{location}/processors/{processor_id}/processorVersions/{processor_version_id}`
        name = client.processor_version_path(
            project_id, location, processor_id, processor_version_id
        )
    else:
        # The full resource name of the processor, e.g.:
        # `projects/{project_id}/locations/{location}/processors/{processor_id}`
        name = client.processor_path(project_id, location, processor_id)

    # Read the file into memory
    with open(file_path, "rb") as image:
        image_content = image.read()

    # Load binary data
    raw_document = documentai.RawDocument(content=image_content, mime_type=mime_type)

    # For more information: https://cloud.google.com/document-ai/docs/reference/rest/v1/ProcessOptions
    # Optional: Additional configurations for processing.
    process_options = documentai.ProcessOptions(
        # Process only specific pages
        individual_page_selector=documentai.ProcessOptions.IndividualPageSelector(
            pages=[1]
        )
    )

    # Configure the process request
    request = documentai.ProcessRequest(
        name=name,
        raw_document=raw_document,
        field_mask=field_mask,
        process_options=process_options,
    )

    result = client.process_document(request=request)

    # For a full list of `Document` object attributes, reference this page:
    # https://cloud.google.com/document-ai/docs/reference/rest/v1/Document
    document = result.document

    # Read the text recognition output from the processor
    print("The document contains the following text:")
    print(document.text)

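Batch processing

Batch (asynchronous) requests let you send multiple documents in a single request. Document AI responds with an operation that you can poll for the status of the request. When the operation finishes, it contains BatchProcessMetadata, which points to the Cloud Storage bucket where the processed results are stored.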
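The polling pattern can be sketched in Python as a loop that repeatedly fetches the operation until its done field is true. This is a generic illustration, not the client library's mechanism: wait_for_operation, the injected fetch callable, and the fake operation payloads are all hypothetical, and a real fetch would perform an authenticated GET on the operation resource.

```python
import time
from typing import Callable


def wait_for_operation(
    fetch: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch() until the returned operation reports done, then return it."""
    deadline = time.monotonic() + timeout_s
    while True:
        operation = fetch()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"Operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("Operation did not finish before the timeout.")
        sleep(interval_s)


# Hypothetical operation states standing in for real API responses.
states = iter(
    [
        {"name": "operations/demo"},
        {"name": "operations/demo"},
        {"name": "operations/demo", "done": True, "metadata": {"state": "SUCCEEDED"}},
    ]
)
calls = []


def fake_fetch() -> dict:
    calls.append(1)
    return next(states)


final = wait_for_operation(fake_fetch, interval_s=0, sleep=lambda _: None)
print(final["metadata"]["state"], "after", len(calls), "polls")
```

Injecting fetch and sleep keeps the loop testable without network access. For real requests, a longer interval or exponential backoff helps avoid needless calls.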

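If the input files that you want to access are in a bucket in a different project, you must grant access to that bucket before you can access the files. See Set up file access.

Send a request to a processor

The following code samples show how to send a batch process request to a processor.

REST

This example shows how to send a POST request to the batchProcess method for asynchronous processing of large documents. The example uses the access token for a service account that is set up for the project using the Google Cloud CLI. For instructions on installing the Google Cloud CLI, setting up a project with a service account, and obtaining an access token, see Before you begin.

A batchProcess request starts a long-running operation and stores the results in a Cloud Storage bucket. This example also shows how to get the status of the long-running operation after it has started.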