
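Get online inferences from a custom trained model

This page shows how to get online (real-time) inferences from your custom trained models by using the Google Cloud CLI or the Vertex AI API.

Format your input for online inference

This section shows how to format and encode your inference input instances as JSON. This is required if you use the predict or explain method. You can skip this step if you use the rawPredict method. For information about which method to choose, see "Send a request to an endpoint".

If you use the Vertex AI SDK for Python to send inference requests, specify the list of instances without the instances field. For example, specify [ ["the","quick","brown"], ... ] instead of { "instances": [ ["the","quick","brown"], ... ] }.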
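The difference between the two shapes can be illustrated with a small helper. This is a minimal sketch using only the standard library; instances_for_sdk is a hypothetical name, not part of the Vertex AI SDK, and it assumes the request was already written in the REST-style form.

```python
import json


def instances_for_sdk(request_json: str) -> list:
    """Return the bare list of instances from a REST-style request body.

    Hypothetical helper: accepts either {"instances": [...]} or a bare list.
    """
    payload = json.loads(request_json)
    if isinstance(payload, dict) and "instances" in payload:
        return payload["instances"]
    return payload


instances = instances_for_sdk('{"instances": [["the", "quick", "brown"]]}')
print(instances)
```

The resulting list is the shape that would be passed as the instances argument of the SDK's predict call.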

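If your model uses a custom container, your input must be formatted as JSON, and your container can use an additional parameters field. Learn more about formatting inference input with custom containers.

Format instances as JSON strings

The basic format for online inference is a list of data instances. Depending on how you configured your inputs in your training application, instances can be either plain lists of values or members of a JSON object. TensorFlow models can accept more complex inputs, while most scikit-learn and XGBoost models expect a list of numbers as input.

This example shows an input tensor and an instance key for a TensorFlow model: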

 {"values": [1, 2, 3, 4], "key": 1}

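The makeup of the JSON string can be more complex, as long as it follows these rules:

- The top level of instance data must be a JSON object, that is, a dictionary of key-value pairs.
- Individual values in an instance object can be strings, numbers, or lists. You can't embed JSON objects.
- Lists must contain only items of the same type (including other lists). You can't mix string and numerical values.

You pass input instances for online inference as the message body of the projects.locations.endpoints.predict call. Learn more about the request body's formatting requirements.

Make each instance an item in a JSON array, and provide the array as the instances field of a JSON object. For example: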

{"instances": [   {"values": [1, 2, 3, 4], "key": 1},   {"values": [5, 6, 7, 8], "key": 2} ]}
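The rules above can be checked locally before a request is sent. The following is a minimal sketch under stated assumptions: build_request_body and its helpers are hypothetical names, not part of any Vertex AI library, and the checks cover only the rules listed in this section. A single-key b64 object is allowed as a value, because that is how binary data is represented (see the next section).

```python
import json


def _kind(value):
    """Classify a JSON value so that list items can be compared by type."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "list"
    if isinstance(value, dict) and set(value) == {"b64"}:
        return "binary"  # base64 wrapper for binary data
    raise ValueError(f"unsupported value in instance: {value!r}")


def _check_value(value):
    """Recursively check that every list holds items of a single type."""
    if isinstance(value, list):
        kinds = {_kind(item) for item in value}
        if len(kinds) > 1:
            raise ValueError(f"list mixes types {sorted(kinds)}: {value!r}")
        for item in value:
            _check_value(item)
    else:
        _kind(value)


def build_request_body(instances):
    """Validate instance objects and wrap them in the instances field."""
    for instance in instances:
        if not isinstance(instance, dict):
            raise ValueError("the top level of an instance must be a JSON object")
        for value in instance.values():
            _check_value(value)
    return json.dumps({"instances": instances})


body = build_request_body([
    {"values": [1, 2, 3, 4], "key": 1},
    {"values": [5, 6, 7, 8], "key": 2},
])
print(body)
```

Passing an instance such as {"values": [1, "a"]} raises ValueError, because the list mixes numbers and strings.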

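Encode binary data for inference input

Binary data can't be formatted as the UTF-8 encoded strings that JSON supports. If your input contains binary data, you must represent it with base64 encoding. The following special formatting is required:

The encoded string must be formatted as a JSON object with a single key named b64. In Python 3, base64 encoding outputs a byte sequence. You must convert this sequence to a string to make it JSON serializable: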
```
{'image_bytes': {'b64': base64.b64encode(jpeg_data).decode()}} 
```
In your TensorFlow model code, you must give the binary input and output tensors aliases that end with "_bytes".
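To make the encoding step concrete, here is a minimal, self-contained sketch. The helper name encode_binary and the placeholder bytes are assumptions for illustration; the image_bytes key follows the "_bytes" alias convention described above.

```python
import base64
import json


def encode_binary(data: bytes) -> dict:
    """Wrap raw bytes in the single-key b64 object that JSON can carry."""
    return {"b64": base64.b64encode(data).decode("utf-8")}


# Placeholder bytes standing in for the contents of a JPEG file.
jpeg_data = b"\xff\xd8\xff\xe0 not a real image"

instance = {"image_bytes": encode_binary(jpeg_data)}
body = json.dumps({"instances": [instance]})

# Decoding the string again returns the original bytes.
encoded = json.loads(body)["instances"][0]["image_bytes"]["b64"]
decoded = base64.b64decode(encoded)
print(decoded == jpeg_data)
```

Calling .decode() on the base64 output is what turns the byte sequence into a str; without it, json.dumps raises a TypeError.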

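Request and response examples

This section describes the format of the inference request body and response body, with examples for TensorFlow, scikit-learn, and XGBoost.

Request body details

TensorFlow

The request body contains data with the following structure (JSON representation):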

{   "instances": [     <value>|<simple/nested list>|<object>,     ...   ] }

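The instances[] object is required, and must contain the list of instances to get inferences for.

The structure of each element of the instances list is determined by your model's input definition. Instances can include named inputs (as objects) or can contain only unlabeled values.

Not all data includes named inputs. Some instances are simple JSON values (boolean, number, or string). However, instances are often lists of simple values, or complex nested lists.

The following are some examples of request bodies.

CSV data with each row encoded as a string value:

 {"instances": ["1.0,true,\"x\"", "-2.0,false,\"y\""]}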

Plain text:

 {"instances": ["the quick brown fox", "the lazy dog"]}

Sentences encoded as lists of words (vectors of strings):

 {   "instances": [     ["the","quick","brown"],     ["the","lazy","dog"],     ...   ] }

Floating point scalar values:

 {"instances": [0.0, 1.1, 2.2]}

Vectors of integers:

 {   "instances": [     [0, 1, 2],     [3, 4, 5],     ...   ] }

Tensors (in this case, two-dimensional tensors):

 {   "instances": [     [       [0, 1, 2],       [3, 4, 5]     ],     ...   ] }

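Images, which can be represented in different ways. In this encoding scheme, the first two dimensions represent the rows and columns of the image, and the third dimension contains lists (vectors) of the R, G, and B values for each pixel: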

 {   "instances": [     [       [         [138, 30, 66],         [130, 20, 56],         ...       ],       [         [126, 38, 61],         [122, 24, 57],         ...       ],       ...     ],     ...   ] }
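As an illustration of the rows, columns, and RGB layout, the following sketch converts a flat pixel buffer into the nested lists shown above. It assumes a row-major buffer with three bytes per pixel; the function name pixels_to_nested_lists is hypothetical, and the pixel values are the ones used in the example.

```python
import json


def pixels_to_nested_lists(rgb: bytes, width: int, height: int) -> list:
    """Convert a flat RGB byte buffer into rows x columns x [R, G, B] lists."""
    if len(rgb) != width * height * 3:
        raise ValueError("buffer size does not match width * height * 3")
    rows = []
    for r in range(height):
        row = []
        for c in range(width):
            start = (r * width + c) * 3
            row.append(list(rgb[start:start + 3]))
        rows.append(row)
    return rows


# A 2 x 2 image: two rows of two pixels each.
rgb = bytes([138, 30, 66, 130, 20, 56, 126, 38, 61, 122, 24, 57])
image = pixels_to_nested_lists(rgb, width=2, height=2)
body = json.dumps({"instances": [image]})
print(body)
```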

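Data encoding

JSON strings must be encoded as UTF-8. To send binary data, you must base64-encode the data and mark it as binary. To mark a JSON string as binary, replace it with a JSON object that has a single attribute named b64: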

{"b64": "..."}

The following example shows two serialized tf.Examples instances, which require base64 encoding (fake data, for illustrative purposes only):

 {"instances": [{"b64": "X5ad6u"}, {"b64": "IA9j4nx"}]}

The following example shows two JPEG image byte strings, which require base64 encoding (fake data, for illustrative purposes only):

 {"instances": [{"b64": "ASa8asdf"}, {"b64": "JLK7ljk3"}]}

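Multiple input tensors

Some models have an underlying TensorFlow graph that accepts multiple input tensors. In this case, use the names of the JSON name/value pairs to identify the input tensors.

For a graph with input tensor aliases "tag" (string) and "image" (base64-encoded string):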

 {   "instances": [     {       "tag": "beach",       "image": {"b64": "ASa8asdf"}     },     {       "tag": "car",       "image": {"b64": "JLK7ljk3"}     }   ] }

For a graph with input tensor aliases "tag" (string) and "image" (3-dimensional array of 8-bit integers):

 {   "instances": [     {       "tag": "beach",       "image": [         [           [138, 30, 66],           [130, 20, 56],           ...         ],         [           [126, 38, 61],           [122, 24, 57],           ...         ],         ...       ]     },     {       "tag": "car",       "image": [         [           [255, 0, 102],           [255, 0, 97],           ...         ],         [           [254, 1, 101],           [254, 2, 93],           ...         ],         ...       ]     },     ...   ] }

scikit-learn

The request body contains data with the following structure (JSON representation):

{   "instances": [     <simple list>,     ...   ] }

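The instances[] object is required, and must contain the list of instances to get inferences for. In the following example, each input instance is a list of floats: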

{   "instances": [     [0.0, 1.1, 2.2],     [3.3, 4.4, 5.5],     ...   ] }

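The dimension of input instances must match what your model expects. For example, if your model requires three features, then the length of each input instance must be 3.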
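A length check like the following can catch mismatched instances before they are sent. This is a minimal sketch; check_feature_count is a hypothetical helper, and the expected feature count is something you know from your own model, not a value the service reports.

```python
def check_feature_count(instances, expected_features):
    """Return the indexes of instances whose length differs from the model's."""
    return [
        index
        for index, instance in enumerate(instances)
        if len(instance) != expected_features
    ]


instances = [[0.0, 1.1, 2.2], [3.3, 4.4, 5.5], [6.6, 7.7]]
bad = check_feature_count(instances, expected_features=3)
print(bad)  # [2]
```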

XGBoost

The request body contains data with the following structure (JSON representation):

{   "instances": [     <simple list>,     ...   ] }

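The instances[] object is required, and must contain the list of instances to get inferences for. In the following example, each input instance is a list of floats: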

{   "instances": [     [0.0, 1.1, 2.2],     [3.3, 4.4, 5.5],     ...   ] }

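The dimension of input instances must match what your model expects. For example, if your model requires three features, then the length of each input instance must be 3.

Vertex AI doesn't support sparse representation of input instances for XGBoost.

The online inference service interprets zeros and NaNs differently. If the value of a feature is zero, use 0.0 in the corresponding input. If the value of a feature is missing, use "NaN" in the corresponding input.

The following example represents an inference request with a single input instance, where the value of the first feature is 0.0, the value of the second feature is 1.1, and the value of the third feature is missing: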

{"instances": [[0.0, 1.1, "NaN"]]}
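When feature rows come from Python data, missing values are often None or float("nan"). The sketch below maps both to the "NaN" string described above; encode_missing is a hypothetical helper. Note that Python's json.dumps would otherwise emit a bare NaN token for float("nan"), which is not valid JSON.

```python
import json
import math


def encode_missing(row):
    """Replace None or float NaN with the string "NaN"; keep zeros as 0.0."""
    encoded = []
    for value in row:
        if value is None or (isinstance(value, float) and math.isnan(value)):
            encoded.append("NaN")
        else:
            encoded.append(float(value))
    return encoded


body = json.dumps({"instances": [encode_missing([0, 1.1, None])]})
print(body)  # {"instances": [[0.0, 1.1, "NaN"]]}
```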

PyTorch

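If your model uses a PyTorch prebuilt container, TorchServe's default handlers expect each instance to be wrapped in a data field. For example:

{   "instances": [     { "data": <value> },     { "data": <value> }   ] }

Response body details

If the call is successful, the response body contains one inference entry for each instance in the request body, given in the same order: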

{   "predictions": [     {       object     }   ],   "deployedModelId": string }

If inference fails for any instance, the response body contains no inferences. Instead, it contains a single error entry:

{   "error": string }

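The predictions[] object contains the list of inferences, one for each instance in the request.

On error, the error string contains a message describing the problem. If an error occurs while processing any instance, the service returns the error instead of a list of inferences.

Even though there is one inference per instance, the format of an inference is not directly related to the format of an instance. Inferences take whatever format is specified in the outputs collection defined in the model. The collection of inferences is returned as a JSON list. Each member of the list can be a simple value, a list, or a JSON object of any complexity. If your model has more than one output tensor, each inference is a JSON object that contains a name/value pair for each output. The names identify the output aliases in the graph.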
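Client code typically has to handle both response shapes. The following is a minimal sketch that works on the JSON text only; extract_predictions is a hypothetical name, the sample payloads are made up for illustration, and pairing by position relies on the ordering guarantee described above.

```python
import json


def extract_predictions(response_text):
    """Return (predictions, deployed_model_id), or raise on an error entry."""
    payload = json.loads(response_text)
    if "error" in payload:
        raise RuntimeError(f"inference failed: {payload['error']}")
    return payload["predictions"], payload.get("deployedModelId")


instances = [[0.0, 1.1, 2.2], [3.3, 4.4, 5.5], [6.6, 7.7, 8.8]]
ok_response = '{"predictions": [5, 4, 3], "deployedModelId": "123456789012345678"}'

predictions, model_id = extract_predictions(ok_response)

# Predictions come back in request order, so they can be paired by position.
paired = list(zip(instances, predictions))
print(paired[0])
```

An error body such as {"error": "Divide by zero"} makes extract_predictions raise RuntimeError instead of returning a partial list.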

Response body examples

TensorFlow

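The following examples show some possible responses:

A simple set of predictions for three input instances, where each prediction is an integer value: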
```
 {"predictions":    [5, 4, 3],    "deployedModelId": 123456789012345678 } 
```
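A more complex set of predictions, each containing two named values that correspond to output tensors named label and scores. The value of label is the predicted category ("car" or "beach"), and scores contains a list of probabilities for that instance across the possible categories.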
```
 {   "predictions": [     {       "label": "beach",       "scores": [0.1, 0.9]     },     {       "label": "car",       "scores": [0.75, 0.25]     }   ],   "deployedModelId": 123456789012345678 } 
```
A response when there is an error processing an input instance:
```
 {"error": "Divide by zero"} 
```

scikit-learn

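The following examples show some possible responses:

A simple set of predictions for three input instances, where each prediction is an integer value: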
```
 {"predictions":    [5, 4, 3],    "deployedModelId": 123456789012345678 } 
```
A response when there is an error processing an input instance:
```
 {"error": "Divide by zero"} 
```

XGBoost

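The following examples show some possible responses:

A simple set of predictions for three input instances, where each prediction is an integer value: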
```
 {"predictions":    [5, 4, 3],    "deployedModelId": 123456789012345678 } 
```
A response when there is an error processing an input instance:
```
 {"error": "Divide by zero"} 
```

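Send a request to an endpoint

There are three ways to send a request:

- Inference request: sends a request to predict to get an online inference.
- Raw inference request: sends a request to rawPredict, which lets you use an arbitrary HTTP payload instead of following the guidelines described in the "Format your input for online inference" section of this page. You might want to get raw inferences in the following cases:
  - You use a custom container that receives requests and sends responses that differ from the guidelines.
  - You need lower latency. rawPredict skips the serialization steps and forwards the request directly to the inference container.
  - You serve inferences with NVIDIA Triton.
- Explanation request: sends a request to explain. If you have configured your Model for Vertex Explainable AI, you can get online explanations. Online explanation requests have the same format as online inference requests, and they return similar responses. The only difference is that online explanation responses include feature attributions as well as inferences.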

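Send an online inference request to a dedicated public endpoint

Dedicated endpoints can communicate over both the HTTP and gRPC protocols. For gRPC requests, you must include the x-vertex-ai-endpoint-id header so that the endpoint is identified correctly. The following APIs are available through dedicated endpoints:

- Predict
- RawPredict
- StreamRawPredict
- Chat Completion (Model Garden only)

Dedicated endpoints use a new URL path. You can retrieve this path from the dedicatedEndpointDns field in the REST API, or from Endpoint.dedicated_endpoint_dns in the Vertex AI SDK for Python. You can also construct the endpoint path manually with the following code: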

f"https://ENDPOINT_ID.LOCATION_ID-PROJECT_NUMBER.prediction.vertexai.goog/v1/projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

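Replace the following:

- ENDPOINT_ID: the ID of the endpoint.
- LOCATION_ID: the region where you are using Vertex AI.
- PROJECT_NUMBER: the project number. This is different from the project ID. You can find the project number on the project's Project Settings page in the Google Cloud console.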
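The path template above can be wrapped in a small function so that the three placeholders are filled in one place. This is a sketch only: dedicated_predict_url is a hypothetical helper name, and the IDs in the example call are made-up values.

```python
def dedicated_predict_url(endpoint_id: str, location_id: str, project_number: str) -> str:
    """Build the predict URL of a dedicated endpoint from the template above."""
    host = f"{endpoint_id}.{location_id}-{project_number}.prediction.vertexai.goog"
    path = (
        f"/v1/projects/{project_number}/locations/{location_id}"
        f"/endpoints/{endpoint_id}:predict"
    )
    return f"https://{host}{path}"


# Made-up identifiers, for illustration only.
url = dedicated_predict_url(
    endpoint_id="1234567890",
    location_id="us-central1",
    project_number="987654321",
)
print(url)
```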

To send an inference request to a dedicated endpoint with the Vertex AI SDK for Python, set the use_dedicated_endpoint parameter to True:

endpoint.predict(instances=instances, use_dedicated_endpoint=True)

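Send an online inference request to a shared public endpoint

gcloud

The following example uses the gcloud ai endpoints predict command:

Write the following JSON object to a file in your local environment. The filename doesn't matter, but for this example, name the file request.json.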
```
{  "instances": INSTANCES } 
```
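Replace the following:
- INSTANCES: a JSON array of instances that you want to get inferences for. The format of each instance depends on the inputs that your trained ML model expects. For more information, see "Format your input for online inference".
Run the following command:
```
gcloud ai endpoints predict ENDPOINT_ID \
  --region=LOCATION_ID \
  --json-request=request.json
```
Replace the following:
- ENDPOINT_ID: the ID of the endpoint.
- LOCATION_ID: the region where you are using Vertex AI.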

REST

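Before using any of the request data, make the following replacements:

- LOCATION_ID: the region where you are using Vertex AI.
- PROJECT_ID: your project ID.
- ENDPOINT_ID: the ID of the endpoint.
- INSTANCES: a JSON array of instances that you want to get inferences for. The format of each instance depends on the inputs that your trained ML model expects. For more information, see "Format your input for online inference".

HTTP method and URL: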

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict

Request JSON body:

 {   "instances": INSTANCES }

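To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or that you are using Cloud Shell, which automatically logs you in to the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: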

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict"

PowerShell

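Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: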

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:predict" | Select-Object -Expand Content

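If successful, you receive a JSON response similar to the following. In the response, expect the following replacements:

- PREDICTIONS: a JSON array of predictions, one for each instance that you included in the request body.
- DEPLOYED_MODEL_ID: the ID of the DeployedModel that served the predictions.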

 {   "predictions": PREDICTIONS,   "deployedModelId": "DEPLOYED_MODEL_ID" }
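The same REST call can be assembled with Python's standard library. The following sketch only builds the request object and does not send it; build_predict_request is a hypothetical helper, the project and endpoint values are made up, and ACCESS_TOKEN stands for a token obtained separately (for example from gcloud auth print-access-token).

```python
import json
import urllib.request


def build_predict_request(location_id, project_id, endpoint_id, instances, access_token):
    """Build (but do not send) a POST request for the shared predict endpoint."""
    url = (
        f"https://{location_id}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location_id}/endpoints/{endpoint_id}:predict"
    )
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


request = build_predict_request(
    location_id="us-central1",
    project_id="my-project",        # made-up project ID
    endpoint_id="1234567890",       # made-up endpoint ID
    instances=[[0.0, 1.1, 2.2]],
    access_token="ACCESS_TOKEN",    # placeholder, not a real token
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; the response body is the JSON object shown above, with predictions and deployedModelId fields.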

Java

Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".

import com.google.cloud.aiplatform.v1.EndpointName;
import com.google.cloud.aiplatform.v1.PredictRequest;
import com.google.cloud.aiplatform.v1.PredictResponse;
import com.google.cloud.aiplatform.v1.PredictionServiceClient;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import com.google.protobuf.ListValue;
import com.google.protobuf.Value;
import com.google.protobuf.util.JsonFormat;
import java.io.IOException;
import java.util.List;

public class PredictCustomTrainedModelSample {
  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    // The instance string must be valid JSON, so the inner quotes are escaped straight quotes.
    String instance = "[{ \"feature_column_a\": \"value\", \"feature_column_b\": \"value\"}]";
    String project = "YOUR_PROJECT_ID";
    String endpointId = "YOUR_ENDPOINT_ID";
    predictCustomTrainedModel(project, endpointId, instance);
  }

  static void predictCustomTrainedModel(String project, String endpointId, String instance)
      throws IOException {
    PredictionServiceSettings predictionServiceSettings =
        PredictionServiceSettings.newBuilder()
            .setEndpoint("us-central1-aiplatform.googleapis.com:443")
            .build();

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (PredictionServiceClient predictionServiceClient =
        PredictionServiceClient.create(predictionServiceSettings)) {
      String location = "us-central1";
      EndpointName endpointName = EndpointName.of(project, location, endpointId);

      // Parse the JSON array of instances into protobuf values.
      ListValue.Builder listValue = ListValue.newBuilder();
      JsonFormat.parser().merge(instance, listValue);
      List<Value> instanceList = listValue.getValuesList();

      PredictRequest predictRequest =
          PredictRequest.newBuilder()
              .setEndpoint(endpointName.toString())
              .addAllInstances(instanceList)
              .build();
      PredictResponse predictResponse = predictionServiceClient.predict(predictRequest);

      System.out.println("Predict Custom Trained model Response");
      System.out.format("\tDeployed Model Id: %s\n", predictResponse.getDeployedModelId());
      System.out.println("Predictions");
      for (Value prediction : predictResponse.getPredictionsList()) {
        System.out.format("\tPrediction: %s\n", prediction);
      }
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see "Set up authentication for a local development environment".

/**
 * TODO(developer): Uncomment these variables before running the sample.
 * (Not necessary if passing values as arguments)
 */

// const filename = "YOUR_PREDICTION_FILE_NAME";
// const endpointId = "YOUR_ENDPOINT_ID";
// const project = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION';
const util = require('util');
const {readFile} = require('fs');
const readFileAsync = util.promisify(readFile);

// Imports the Google Cloud Prediction Service Client library
const {PredictionServiceClient} = require('@google-cloud/aiplatform');

// Specifies the location of the api endpoint
const clientOptions = {
  apiEndpoint: 'us-central1-aiplatform.googleapis.com',
};

// Instantiates a client
const predictionServiceClient = new PredictionServiceClient(clientOptions);

async function predictCustomTrainedModel() {
  // Configure the parent resource
  const endpoint = `projects/${project}/locations/${location}/endpoints/${endpointId}`;
  const parameters = {
    structValue: {
      fields: {},
    },
  };
  const instanceDict = await readFileAsync(filename, 'utf8');
  const instanceValue = JSON.parse(instanceDict);
  const instance = {
    structValue: {
      fields: {
        Age: {stringValue: instanceValue['Age']},
        Balance: {stringValue: instanceValue['Balance']},
        Campaign: {stringValue: instanceValue['Campaign']},
        Contact: {stringValue: instanceValue['Contact']},
        Day: {stringValue: instanceValue['Day']},
        Default: {stringValue: instanceValue['Default']},
        Deposit: {stringValue: instanceValue['Deposit']},
        Duration: {stringValue: instanceValue['Duration']},
        Housing: {stringValue: instanceValue['Housing']},
        Job: {stringValue: instanceValue['Job']},
        Loan: {stringValue: instanceValue['Loan']},
        MaritalStatus: {stringValue: instanceValue['MaritalStatus']},
        Month: {stringValue: instanceValue['Month']},
        PDays: {stringValue: instanceValue['PDays']},
        POutcome: {stringValue: instanceValue['POutcome']},
        Previous: {stringValue: instanceValue['Previous']},
      },
    },
  };

  const instances = [instance];
  const request = {
    endpoint,
    instances,
    parameters,
  };

  // Predict request
  const [response] = await predictionServiceClient.predict(request);

  console.log('Predict custom trained model response');
  console.log(`\tDeployed model id : ${response.deployedModelId}`);
  const predictions = response.predictions;
  console.log('\tPredictions :');
  for (const prediction of predictions) {
    console.log(`\t\tPrediction : ${JSON.stringify(prediction)}`);
  }
}
predictCustomTrainedModel();

Python

To learn how to install or update the Vertex AI SDK for Python, see "Install the Vertex AI SDK for Python". For more information, see the Python API reference documentation.

from google.cloud import aiplatform


def endpoint_predict_sample(
    project: str, location: str, instances: list, endpoint: str
):
    aiplatform.init(project=project, location=location)

    # `endpoint` is the endpoint ID or full resource name.
    endpoint = aiplatform.Endpoint(endpoint)

    prediction = endpoint.predict(instances=instances)
    print(prediction)
    return prediction

Send an online raw inference request

gcloud

The following examples use the gcloud ai endpoints raw-predict command:

To request inferences with the JSON object in REQUEST specified on the command line:

 gcloud ai endpoints raw-predict ENDPOINT_ID \
     --region=LOCATION_ID \
     --request=REQUEST

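To request inferences with an image stored in the file image.jpeg and the appropriate Content-Type header:
```
 gcloud ai endpoints raw-predict ENDPOINT_ID \
     --region=LOCATION_ID \
     --http-headers=Content-Type=image/jpeg \
     --request=@image.jpeg
```
Replace the following:
- ENDPOINT_ID: the ID of the endpoint.
- LOCATION_ID: the region where you are using Vertex AI.
- REQUEST: the contents of the request that you want to get inferences for. The format of the request depends on what your custom container expects, and it is not necessarily a JSON object.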

Python

To learn how to install or update the Vertex AI SDK for Python, see "Install the Vertex AI SDK for Python". For more information, see the Python API reference documentation.

from google.cloud import aiplatform_v1


def sample_raw_predict():
    # Create a client
    client = aiplatform_v1.PredictionServiceClient()

    # Initialize request argument(s); "endpoint_value" is a placeholder
    # for the endpoint's resource name.
    request = aiplatform_v1.RawPredictRequest(
        endpoint="endpoint_value",
    )

    # Make the request
    response = client.raw_predict(request=request)

    # Handle the response
    print(response)

The response includes the following HTTP headers:

- X-Vertex-AI-Endpoint-Id: the ID of the Endpoint that served this inference.
- X-Vertex-AI-Deployed-Model-Id: the ID of the Endpoint's DeployedModel that served this inference.
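Because HTTP header names are case-insensitive, client code should not depend on the exact capitalization shown above. The following sketch reads both headers from a plain dictionary of response headers; served_by is a hypothetical helper, and the header values in the example are made up.

```python
def served_by(headers):
    """Return (endpoint_id, deployed_model_id) from raw-predict response headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return (
        lowered.get("x-vertex-ai-endpoint-id"),
        lowered.get("x-vertex-ai-deployed-model-id"),
    )


# Example headers as a client library might expose them (values are made up).
response_headers = {
    "content-type": "application/json",
    "x-vertex-ai-endpoint-id": "1234567890",
    "X-Vertex-AI-Deployed-Model-Id": "123456789012345678",
}
endpoint_id, deployed_model_id = served_by(response_headers)
print(endpoint_id, deployed_model_id)
```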

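Send an online explanation request

gcloud

The following example uses the gcloud ai endpoints explain command:

Write the following JSON object to a file in your local environment. The filename doesn't matter, but for this example, name the file request.json.
```
{  "instances": INSTANCES } 
```
Replace the following:
- INSTANCES: a JSON array of instances that you want to get inferences for. The format of each instance depends on the inputs that your trained ML model expects. For more information, see "Format your input for online inference".
Run the following command:
```
gcloud ai endpoints explain ENDPOINT_ID \
  --region=LOCATION_ID \
  --json-request=request.json
```
Replace the following:
- ENDPOINT_ID: the ID of the endpoint.
- LOCATION_ID: the region where you are using Vertex AI.
Optionally, to send an explanation request to a specific DeployedModel on the Endpoint, specify the --deployed-model-id flag:
```
gcloud ai endpoints explain ENDPOINT_ID \
  --region=LOCATION_ID \
  --deployed-model-id=DEPLOYED_MODEL_ID \
  --json-request=request.json
```
In addition to the placeholders described previously, replace the following:
- DEPLOYED_MODEL_ID (optional): the ID of the deployed model that you want to get explanations for. The ID is included in the predict method's response. If you need explanations from a particular model and you have more than one model deployed to the same endpoint, use this ID to make sure that the explanations are returned for that model.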

REST

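Before using any of the request data, make the following replacements:

- LOCATION_ID: the region where you are using Vertex AI.
- PROJECT_ID: your project ID.
- ENDPOINT_ID: the ID of the endpoint.
- INSTANCES: a JSON array of instances that you want to get inferences for. The format of each instance depends on the inputs that your trained ML model expects. For more information, see "Format your input for online inference".

HTTP method and URL: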

POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:explain

Request JSON body:

 {   "instances": INSTANCES }

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and run the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:explain"

PowerShell

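Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and run the following command: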

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/endpoints/ENDPOINT_ID:explain" | Select-Object -Expand Content

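If successful, you receive a JSON response similar to the following. In the response, expect the following replacements:

- PREDICTIONS: a JSON array of predictions, one for each instance that you included in the request body.
- EXPLANATIONS: a JSON array of explanations, one for each prediction.
- DEPLOYED_MODEL_ID: the ID of the DeployedModel that served the predictions.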

 {   "predictions": PREDICTIONS,   "explanations": EXPLANATIONS,   "deployedModelId": "DEPLOYED_MODEL_ID" }
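Because there is one explanation per prediction, the two arrays can be walked together. The sketch below ranks features by absolute attribution for each prediction. It assumes a tabular model whose featureAttributions field is a flat map of feature names to numbers; the payload values are made up, and top_features is a hypothetical helper.

```python
import json

# Made-up response payload, for illustration only.
sample = json.loads("""
{
  "predictions": [0.82],
  "explanations": [
    {"attributions": [
      {"baselineOutputValue": 0.5,
       "instanceOutputValue": 0.82,
       "featureAttributions": {"age": 0.2, "balance": 0.12}}
    ]}
  ],
  "deployedModelId": "123456789012345678"
}
""")


def top_features(response, count=1):
    """Pair each prediction with its highest-magnitude feature attributions."""
    results = []
    for prediction, explanation in zip(response["predictions"], response["explanations"]):
        attribution = explanation["attributions"][0]
        ranked = sorted(
            attribution["featureAttributions"].items(),
            key=lambda item: abs(item[1]),
            reverse=True,
        )
        results.append((prediction, ranked[:count]))
    return results


print(top_features(sample))
```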

Python

To learn how to install or update the Vertex AI SDK for Python, see "Install the Vertex AI SDK for Python". For more information, see the Python API reference documentation.

from typing import Dict

from google.cloud import aiplatform


def explain_tabular_sample(
    project: str, location: str, endpoint_id: str, instance_dict: Dict
):
    aiplatform.init(project=project, location=location)

    endpoint = aiplatform.Endpoint(endpoint_id)

    response = endpoint.explain(instances=[instance_dict], parameters={})

    for explanation in response.explanations:
        print(" explanation")
        # Feature attributions.
        attributions = explanation.attributions
        for attribution in attributions:
            print("  attribution")
            print("   baseline_output_value:", attribution.baseline_output_value)
            print("   instance_output_value:", attribution.instance_output_value)
            print("   output_display_name:", attribution.output_display_name)
            print("   approximation_error:", attribution.approximation_error)
            print("   output_name:", attribution.output_name)
            # Use a separate loop variable so the list is not shadowed.
            for index in attribution.output_index:
                print("   output_index:", index)

    for prediction in response.predictions:
        print(prediction)

What's next

Learn about online inference logging.
