author alextarazanov <alextarazanov@yandex-team.com> 2022-08-10 16:15:22 +0300
committer alextarazanov <alextarazanov@yandex-team.com> 2022-08-10 16:15:22 +0300
commit 0acde0be21b007e0b0da0f728085afa7bc7e4e34
tree 89f2090c1dfa8ce67d20d42f6eba4692fb39abca
parent 50f75419ab107c5898ba27fb8ba565f87d7a3886
[review] Check translate Health Check API
-rw-r--r-- ydb/docs/en/core/reference/ydb-sdk/health-check-api.md 69
-rw-r--r-- ydb/docs/en/core/reference/ydb-sdk/overview-grpc-api.md 7
-rw-r--r-- ydb/docs/en/core/reference/ydb-sdk/toc_i.yaml 6
3 files changed, 82 insertions, 0 deletions
diff --git a/ydb/docs/en/core/reference/ydb-sdk/health-check-api.md b/ydb/docs/en/core/reference/ydb-sdk/health-check-api.md
new file mode 100644
index 0000000000..f27921ed71
--- /dev/null
+++ b/ydb/docs/en/core/reference/ydb-sdk/health-check-api.md
@@ -0,0 +1,69 @@
+# Health Check API
+
+{{ ydb-short-name }} has a built-in self-diagnostic system, which can be used to get a brief report on the database status and information about existing problems.
+
+To initiate the check, call the `SelfCheck` method from `Ydb.Monitoring`. As with other requests, you must also pass the name of the database to check.
+
+The method returns the following structure:
+
+```protobuf
+message SelfCheckResult {
+    SelfCheck.Result self_check_result = 1;
+    repeated IssueLog issue_log = 2;
+}
+```
+
+The `self_check_result` field is an enum that contains the result of the database check:
+
+* `GOOD`: No problems were detected.
+* `DEGRADED`: Degradation of one of the database systems was detected, but the database is still functioning (for example, an allowable number of disks has been lost).
+* `MAINTENANCE_REQUIRED`: Significant degradation was detected, there is a risk of losing availability, and human intervention is required.
+* `EMERGENCY`: A serious problem was detected in the database, with complete or partial loss of availability.
+
+If problems are detected, the `issue_log` field will contain problem descriptions with the following structure:
+
+```protobuf
+message IssueLog {
+    string id = 1;
+    StatusFlag.Status status = 2;
+    string message = 3;
+    Location location = 4;
+    repeated string reason = 5;
+    string type = 6;
+    uint32 level = 7;
+}
+```
+
+* `id`: A unique problem ID within this response.
+* `status`: Status (severity) of the current problem. It can take one of the following values:
+  * `RED`: A component is faulty or unavailable.
+  * `ORANGE`: A serious problem: the database is one step away from losing availability. Intervention may be required.
+  * `YELLOW`: A minor problem that poses no risk to availability. We recommend that you keep monitoring it.
+  * `BLUE`: Temporary minor degradation that does not affect database availability.
+  * `GREEN`: No problems were detected.
+  * `GREY`: Failed to determine the status (a problem with the self-diagnostic mechanism).
+* `message`: [Text that describes the problem](#problems).
+* `location`: Location of the problem.
+* `reason`: IDs of the nested problems, if any, that led to the current problem.
+* `type`: Problem category (by subsystem).
+* `level`: Depth of the problem nesting.
+
+## Possible problems {#problems}
+
+* `Pool usage over 90/95/99%`: The CPU of one of the pools is overloaded.
+* `System tablet is unresponsive / response time over 1000ms/5000ms`: The system tablet is not responding or it takes too long to respond.
+* `Tablets are restarting too often`: Tablets are restarting too often.
+* `Tablets are dead`: Tablets are not started (or cannot be started).
+* `LoadAverage above 100%`: A physical host is overloaded.
+* `There are no compute nodes`: The database has no nodes on which to start tablets.
+* `PDisk state is ...`: Indicates problems with a physical disk.
+* `PDisk is not available`: A physical disk is not available.
+* `Available size is less than 12%/9%/6%`: Free space on the physical disk is running out.
+* `VDisk is not available`: A virtual disk is not available.
+* `VDisk state is ...`: Indicates problems with a virtual disk.
+* `DiskSpace is ...`: Indicates problems with virtual disk space.
+* `Storage node is not available`: A node with disks is not available.
+* `Replication in progress`: Disk replication is in progress.
+* `Group has no redundancy`: A storage group lost its redundancy.
+* `Group failed`: A storage group lost its integrity.
+* `Group degraded`: An allowable number of disks in the group are unavailable.
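
To illustrate the `SelfCheckResult` and `IssueLog` fields documented in health-check-api.md above, here is a minimal Python sketch of how a client might turn `issue_log` into an indented tree by following the `reason` IDs. It is an illustration under assumptions, not part of this patch or of the {{ ydb-short-name }} API: the `Issue` dataclass and the `needs_attention` and `render_issue_tree` helpers are hypothetical names, the sample data is hand-written, and the sketch assumes the response has already been converted into plain Python values.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    """Plain-Python stand-in for an IssueLog entry (hypothetical helper type)."""
    id: str
    status: str
    message: str
    reason: list[str] = field(default_factory=list)


def needs_attention(self_check_result: str) -> bool:
    """Treat every overall result other than GOOD as worth a closer look."""
    return self_check_result != "GOOD"


def render_issue_tree(issues: list[Issue]) -> list[str]:
    """Return indented lines, following `reason` links from root issues down."""
    by_id = {issue.id: issue for issue in issues}
    # An issue is a root if no other issue lists it as a reason.
    referenced = {rid for issue in issues for rid in issue.reason}
    roots = [issue for issue in issues if issue.id not in referenced]

    lines: list[str] = []

    def walk(issue: Issue, depth: int, seen: frozenset[str]) -> None:
        lines.append(f"{'  ' * depth}[{issue.status}] {issue.message}")
        for rid in issue.reason:
            child = by_id.get(rid)
            # Skip dangling IDs and guard against cycles in malformed input.
            if child is not None and rid not in seen:
                walk(child, depth + 1, seen | {rid})

    for root in roots:
        walk(root, 0, frozenset({root.id}))
    return lines


# Hand-written sample data, not real Health Check output.
sample = [
    Issue("0", "ORANGE", "Group degraded", reason=["1"]),
    Issue("1", "RED", "VDisk is not available", reason=["2"]),
    Issue("2", "RED", "PDisk is not available"),
]

if needs_attention("DEGRADED"):
    for line in render_issue_tree(sample):
        print(line)
```

Issues that no other issue names in `reason` are treated as roots, so the top-level symptom is printed first and its nested causes appear indented beneath it. The `level` field is not used here; the depth is derived from the `reason` links instead, which is a design choice of this sketch.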
diff --git a/ydb/docs/en/core/reference/ydb-sdk/overview-grpc-api.md b/ydb/docs/en/core/reference/ydb-sdk/overview-grpc-api.md
new file mode 100644
index 0000000000..b1b1eaa08f
--- /dev/null
+++ b/ydb/docs/en/core/reference/ydb-sdk/overview-grpc-api.md
@@ -0,0 +1,7 @@
+# gRPC API overview
+
+{{ ydb-short-name }} provides a gRPC API that you can use to manage your database [resources](../../concepts/datamodel.md) and data. API methods and data structures are described using [Protocol Buffers](https://developers.google.com/protocol-buffers/docs/proto3) (proto3). For more information, see the [.proto specifications with comments](https://github.com/ydb-platform/ydb-api-protos).
+
+The following services are available:
+
+* [{#T}](health-check-api.md).
diff --git a/ydb/docs/en/core/reference/ydb-sdk/toc_i.yaml b/ydb/docs/en/core/reference/ydb-sdk/toc_i.yaml
index 62ca3cc85e..f277c552e9 100644
--- a/ydb/docs/en/core/reference/ydb-sdk/toc_i.yaml
+++ b/ydb/docs/en/core/reference/ydb-sdk/toc_i.yaml
@@ -11,6 +11,12 @@ items:
     include: { mode: link, path: example/toc_p.yaml }
   - name: Handling errors in the API
     href: error_handling.md
+  - name: gRPC API
+    items:
+      - name: Overview
+        href: overview-grpc-api.md
+      - name: Health Check API
+        href: health-check-api.md
   - name: Code recipes
     include: { mode: link, path: recipes/toc_p.yaml }