金大哥 - gemini-cloud-assist-mcp MCP 详情

article

README

🚀 Gemini Cloud Assist MCP服务器

Gemini Cloud Assist MCP服务器可将模型上下文协议（MCP）客户端（如Gemini CLI）连接到Gemini Cloud Assist API。借助该服务器，你可以直接在本地命令行使用自然语言来理解、管理和排查Google Cloud环境中的问题。

⚠️ 重要提示

《Google Cloud Platform服务条款》（可在https://cloud.google.com/terms/ 查看）和《数据处理与安全条款》（可在https://cloud.google.com/terms/data-processing-terms 查看）不适用于Gemini Cloud Assist MCP服务器软件的任何组件。

要了解有关Gemini Cloud Assist的更多信息，请参阅Google Cloud文档中的Gemini Cloud Assist概述。

✨ 主要特性

创建并运行调查：创建并运行Cloud Assist调查，以找出复杂问题的根本原因。
深入挖掘并迭代调查：获取调查结果的更多详细信息，并添加观察结果以完善分析。
查找并分析Google Cloud资源：快速检查你的Cloud资源和应用程序的状态与性能。

🚀 快速开始

在开始之前，请确保你已完成以下设置：

Node.js（v20或更高版本）。
Git。
已安装并配置好Google Cloud SDK。
拥有一个Google Cloud项目。
用户账户具备以下IAM角色：
- roles/serviceusage.serviceUsageAdmin：启用Cloud Assist API所需。
- roles/geminicloudassist.user：向Cloud Assist API发出请求所需。

步骤1：对Google Cloud进行身份验证

Gemini Cloud Assist MCP服务器使用本地应用默认凭据（ADC）来安全地对Google Cloud进行身份验证。要设置ADC，请运行以下gcloud命令：

# 对gcloud CLI进行用户账户身份验证
gcloud auth login

# 为服务器设置应用默认凭据
# 这允许MCP服务器代表你安全地调用Google Cloud API
gcloud auth application-default login

步骤2：配置你的MCP客户端

以下是你将使用的标准配置片段。它指示客户端使用npx按需下载并运行MCP服务器的最新版本。将MCP配置粘贴到你选择的MCP客户端中。我们建议使用Gemini CLI以获得最佳体验。

MCP配置

"mcpServers" : {
   "GeminiCloudAssist": {
     "command": "npx",
     "args": ["-y", "@google-cloud/gemini-cloud-assist-mcp@latest"],
     "timeout": 600000
   }
}

📦 安装指南

Gemini CLI

选项1（推荐）：扩展安装

将MCP服务器作为Gemini CLI扩展进行安装：

gemini extensions install https://github.com/GoogleCloudPlatform/gemini-cloud-assist-mcp

通过运行以下命令验证安装是否成功：

gemini extensions list

选项2：全局安装

将MCP配置添加到~/.gemini/settings.json中。这样，你在每个Gemini CLI会话中都可以使用MCP工具。

选项3：项目级安装

将MCP配置添加到你的/path/to/project/.gemini/settings.json文件中。这样，你仅在该项目文件夹下创建的Gemini CLI会话中可以使用MCP工具。

Cursor

Claude Desktop

启动Claude Desktop应用程序。
导航到“设置”>“开发者”。
点击“编辑配置”按钮以打开claude_desktop_config.json文件。
将MCP配置添加到claude_desktop_config.json文件中。
保存文件并重启Claude Desktop。

Claude Code

claude mcp add GeminiCloudAssist npx -y @google-cloud/gemini-cloud-assist-mcp@latest

有关更多信息，请参阅Anthropic文档中的详细说明。

Cline

使用MCP配置更新cline_mcp_settings.json。有关更多信息，请参阅Cline网站上的详细说明。

🔧 MCP工具

create_investigation：创建一个新的Gemini Cloud Assist调查。
run_investigation：运行你已创建的现有Gemini Cloud Assist调查。
add_observation：向现有调查中添加新的观察结果，以完善其分析。
fetch_investigation：获取调查详细信息或列出项目中的调查。
search_and_analyze_gcp_resources：使用自然语言查询查找特定的Cloud资源。

💻 使用示例

本节展示了使用Gemini CLI通过Cloud Assist MCP服务器可以执行的任务示例，包括示例提示和响应。

基础用法

创建并运行调查

Using 1 MCP server (Ctrl+T to view descriptions)

╭───────────────────────────────────────────────────────────────────────────────────╮
│  > Investigate the pending pods issue in my test-cluster-1 cluster                │
╰───────────────────────────────────────────────────────────────────────────────────╯

✦ I can help with that. I'll start an investigation into the pending pods issue in
  your test-cluster-1 cluster. To do that, I'll run a few commands to get the
  information I need to start the investigation.
 ╭──────────────────────────────────────────────────────────────────────────────────╮
 │ ✔  Shell gcloud config get-value project (Getting the configured GCP project.)   |
 ...                                                                                |
 ...                                                                                |
 ╰──────────────────────────────────────────────────────────────────────────────────╯

✦ I'll now create an investigation into the pending pods issue in your test-cluster-1 GKE cluster.
 ╭──────────────────────────────────────────────────────────────────────────────────╮
 │ ✔  create_investigation (gemini-cloud-assist-mcp MCP Server) create_investigati… │
 │                                                                                  │
 │    ## Gemini Cloud Assist Investigation                                          │
 │                                                                                  │
 │    **Name**: [Gemini CLI] Pending pods in test-cluster-1 GKE cluster             │
 ...                                                                                |
 ...                                                                                |
 ╰──────────────────────────────────────────────────────────────────────────────────╯

✦ I've created the investigation. Now I'll run it to analyze the issue.
 ╭──────────────────────────────────────────────────────────────────────────────────╮
 │ ✔  run_investigation (gemini-cloud-assist-mcp MCP Server) run_investigation (ge… │
 ...                                                                                |
 ...                                                                                |
 ╰──────────────────────────────────────────────────────────────────────────────────╯

✦ The investigation is complete. Here's what I found:

  The main problem is that your GKE cluster `test-cluster-1` is out of resources. It
  has only one node, and that node is full. This is causing new pods to get stuck in
  a "pending" state because there's no room for them.

  The investigation also revealed that Node Auto-Provisioning (NAP) is disabled. This
   is a key feature that allows the cluster to automatically add more nodes when it's
   running low on resources. Because NAP is disabled, the cluster can't scale up to
  meet the demand.

  I have two recommendations to fix this:
   1. Enable Node Auto-Provisioning for the test-cluster-1 cluster. This is the best
      long-term solution because it will allow the cluster to automatically scale up
      and down as needed.
   2. Manually add more nodes to the cluster. This is a good short-term solution that
       will immediately relieve the resource pressure and allow the pending pods to
      be scheduled.

  I recommend that you start by enabling Node Auto-Provisioning. I can do this for
  you now, or you can do it yourself by following the instructions in the official
  documentation (https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-pro
  visioning#enable).

  What would you like to do?

高级用法

深入挖掘并迭代调查

✦ The investigation is complete. Here's a summary of the findings:

  Primary Cause:

   * Ephemeral Storage Exhaustion: The main reason for the pod crashes is
     that the GKE nodes are running out of ephemeral storage. This is
     causing a DiskPressure condition on the nodes, which in turn leads to
     the kubelet evicting pods to free up space.

 ╭───────────────────────────────────────────────────────────────────────────────────╮
 │  > Do you know which pod is consuming all the storage space?                      │
 ╰───────────────────────────────────────────────────────────────────────────────────╯

✦ That's a great question. The investigation points to the emptyDir volume
  in the gke-001-deployment pods. I can check the storage usage
  inside one of the running pods to confirm this.

  First, I'll list the pods to find one that is currently running.
 ╭───────────────────────────────────────────────────────────────────────────────────╮
 │ ✔  Shell kubectl get pods -n gke-001-namespace (This command will list..          │
 ...                                                                                 |
 ...                                                                                 |
 ╰───────────────────────────────────────────────────────────────────────────────────╯


✦ That's unexpected. The /data directory in that pod is only using 4.0K,
  which is negligible. This contradicts the hypothesis that the emptyDir
  volume is the primary consumer of ephemeral storage.

  This means something else on the node is consuming the storage. It could
  be logs, or another process running on the node that isn't immediately
  obvious.

  To figure this out, I'll need to get a shell on the node itself and
  investigate the storage usage from there. I'll start by getting an
  interactive shell on the node
  gke-001-cluster-gke-001-node-pool-001.
 ╭───────────────────────────────────────────────────────────────────────────────────╮
 │ ✔  Shell gcloud compute ssh gke-001-cluster-gke-001-node-pool-001…                │
 ...                                                                                 |
 ...                                                                                 |
 │    ERROR: (gcloud.compute.ssh) [/usr/local/bin/ssh] exited                        │
 │    with return code [255].                                                        |
 ╰───────────────────────────────────────────────────────────────────────────────────╯

✦ It seems I can't directly SSH into the node. This is likely due to
  firewall rules or the node not having an external IP address.

  However, I can still investigate the node's storage. I'll add an
  observation to the investigation, noting that the emptyDir volume is not
  the cause of the storage issue, and then re-run the investigation. This
  will allow Gemini to analyze the node's logs and other signals to
  pinpoint the true source of the storage consumption.
 ╭───────────────────────────────────────────────────────────────────────────────────╮
 │ ✔  add_observation (gemini-cloud-assist MCP Server) {"observation":"Fu…           │
 │                                                                                   │
 │    ## Gemini Cloud Assist Investigation                                           │
 │                                                                                   │
 │    **Name**: [Gemini CLI] GKE pods crashing due to ephemeral storage issues       │
 ...                                                                                 |
 ...                                                                                 |
 ╰───────────────────────────────────────────────────────────────────────────────────╯

✦ I've added the new information to the investigation. Now I'll re-run the
  analysis. This will take the new observation into account and should help
  us identify the real source of the storage consumption.
 ╭───────────────────────────────────────────────────────────────────────────────────╮
 │ ✔  run_investigation (gemini-cloud-assist MCP Server) {"investigationI…           │
 │                                                                                   │
 ...                                                                                 |
 ...                                                                                 |
 ╰───────────────────────────────────────────────────────────────────────────────────╯


✦ The new investigation results are in.

  Key Findings:

   * `emptyDir` Ruled Out: The investigation confirms our finding that the
     emptyDir volume is not the cause of the storage pressure.
   * Container Logs are the Likely Culprit: The analysis strongly suggests
     that the ephemeral storage is being consumed by container logs. The
     kubelet is failing to reopen log files due to "no space left on
     device," which is a clear indicator of this issue.
   * Another Deployment is Implicated: The investigation also points to
     another deployment, gke-002-deployment, as a potential source of the
     problem. The logs mention that this deployment is "configured to fill
     up disk."

搜索并分析Cloud资源

 ╭───────────────────────────────────────────────────────────────────────────────────╮
 │  > Give me an overview of my GKE node pools. For each one, show me its            │
 │    cluster and MIGs                                                               │
 ╰───────────────────────────────────────────────────────────────────────────────────╯

 ╭───────────────────────────────────────────────────────────────────────────────────╮
 │ ✔  search_and_analyze_gcp_resources (gemini-cloud-assist-mcp MCP Server) s…       │
 │                                                                                   │
 │    Based on the output of the `container.googleapis.com` API                      │
 │    calls, here is an overview of your GKE node pools, including                   │
 │    their associated clusters and Managed Instance Groups (MIGs):                  │
 │                                                                                   │
 │    *   **Cluster: `cluster-1` (Zone: `us-central1-a`)**                           │
 │        *   **Node Pool:** `default-pool`                                          │
 │            *   **Machine Type:** `custom-1-3840`                                  │
 │            *   **Initial Node Count:** 3                                          │
 │            *   **Autoscaling:** Enabled (min: 1, max: 5)                          │
 │            *   **MIGs:**  `gke-cluster-1-default-pool-001-grp`                    │
 │                                                                                   │
 │    *   **Cluster: `anthos-cluster-1` (Zone: `us-central1-c`)**                    |
 │        *   **Node Pool:** `default-pool`                                          │
 │            *   **Machine Type:** `e2-standard-2`                                  │
 │            *   **Initial Node Count:** 3                                          │
 │            *   **Autoscaling:** Enabled (min: 2, max: 3)                          │
 │            *   **MIGs:** `gke-anthos-cluster-default-pool-001-grp`                |
 │                                                                                   │
 │    *   **Cluster: `cluster-2` (Zone: `us-central1-c`)**                           │
 │        *   **Node Pool:** `default-pool-1`                                        │
 │            *   **Machine Type:** `e2-micro`                                       │
 │            *   **Initial Node Count:** 3                                          │
 │            *   **Autoscaling:** Enabled (max: 5)                                  │
 │            *   **MIGs:**  `gke-cluster-2-default-pool-001-grp`                    │
 ╰───────────────────────────────────────────────────────────────────────────────────╯
✦ OK. Here's an overview of your GKE node pools, their clusters, and MIGs. Is
  there anything else I can help you with?
 ╭───────────────────────────────────────────────────────────────────────────────────╮
 │  > Tell me details about each MIG                                                 │
 ╰───────────────────────────────────────────────────────────────────────────────────╯

 ╭───────────────────────────────────────────────────────────────────────────────────╮
 │ ✔  search_and_analyze_gcp_resources (gemini-cloud-assist-mcp MCP Server) s…       │
 │                                                                                   │
 │    I found the following details for the requested Managed                        │
 │    Instance Groups (MIGs):                                                        │
 │                                                                                   │
 │    *   **gke-cluster-1-default-pool-001-grp**:                                    │
 │        *   **Region/Zone**: us-central1-a                                         │
 │        *   **Creation Timestamp**: 2024-03-25T13:57:25.588-07:00                  │
 │        *   **Current Actions**: 2 instances with no current                       │
 │    action                                                                         │
 │        *   **Target Size**: 2                                                     │
 │        *   **Status**: Stable                                                     │
 │        *   **Instance Template**:                                                 │
 │    `gke-cluster-1-default-pool-001`                                               │
 │        *   **Update Policy**: Opportunistic, minimal action                       │
 │    REPLACE, max surge 1, max unavailable 1                                        │
 │                                                                                   │
 │    *   **gke-anthos-cluster-default-pool-001-grp**:                               │
 │        *   **Region/Zone**: us-central1-c                                         │
 │        *   **Creation Timestamp**: 2024-04-03T09:09:32.058-07:00                  │
 │        *   **Current Actions**: 2 instances with no current                       │
 │    action                                                                         │
 │        *   **Target Size**: 2                                                     │
 │        *   **Status**: Stable                                                     │
 │        *   **Instance Template**:                                                 │
 │    `gke-anthos-cluster-default-pool-001`                                          │
 │        *   **Update Policy**: Opportunistic, minimal action                       │
 │    REPLACE, max surge 1, max unavailable 1                                        │
 │        *   **Named Ports**: http (port 3337)                                      │
 │                                                                                   │
 │    *   **gke-cluster-2-default-pool-001-grp**:                                    │
 │        *   **Region/Zone**: us-central1-c                                         │
 │        *   **Creation Timestamp**: 2024-11-13T21:46:01.072-08:00                  │
 │        *   **Current Actions**: 5 instances with no current                       │
 │    action                                                                         │
 │        *   **Target Size**: 5                                                     │
 │        *   **Status**: Stable                                                     │
 │        *   **Instance Template**:                                                 │
 │    `gke-cluster-2-default-pool-001`                                               │
 │        *   **Update Policy**: Opportunistic, minimal action                       │
 │    REPLACE, max surge 1, max unavailable 1                                        │
 │                                                                                   │
 │    These details were retrieved by calling the Compute Engine                     │
 │    API's `instanceGroupManagers.get` and                                          │
 │    `instanceGroupManagers.aggregatedList` methods.                                │
 ╰───────────────────────────────────────────────────────────────────────────────────╯
✦ Here are the details for each MIG. Let me know if you need anything else.

🤝 贡献

如果你遇到了错误，请在我们的GitHub问题页面上提交问题。
在发送拉取请求之前，请查看我们的贡献指南。

📄 许可证

本项目采用Apache 2.0许可证，并按“原样”提供，不提供任何形式的保证或声明适用于任何用途。有关详细信息，请参阅LICENSE文件。

gemini-cloud-assist-mcp