author | alextarazanov <alextarazanov@yandex-team.com> | 2022-08-10 16:15:22 +0300 |
---|---|---|
committer | alextarazanov <alextarazanov@yandex-team.com> | 2022-08-10 16:15:22 +0300 |
commit | 0acde0be21b007e0b0da0f728085afa7bc7e4e34 (patch) | |
tree | 89f2090c1dfa8ce67d20d42f6eba4692fb39abca | |
parent | 50f75419ab107c5898ba27fb8ba565f87d7a3886 (diff) | |
download | ydb-0acde0be21b007e0b0da0f728085afa7bc7e4e34.tar.gz |
[review] Check translate Health Check API
-rw-r--r-- | ydb/docs/en/core/reference/ydb-sdk/health-check-api.md | 69 |
-rw-r--r-- | ydb/docs/en/core/reference/ydb-sdk/overview-grpc-api.md | 7 |
-rw-r--r-- | ydb/docs/en/core/reference/ydb-sdk/toc_i.yaml | 6 |
3 files changed, 82 insertions, 0 deletions
diff --git a/ydb/docs/en/core/reference/ydb-sdk/health-check-api.md b/ydb/docs/en/core/reference/ydb-sdk/health-check-api.md
new file mode 100644
index 0000000000..f27921ed71
--- /dev/null
+++ b/ydb/docs/en/core/reference/ydb-sdk/health-check-api.md
@@ -0,0 +1,69 @@
+# Health Check API
+
+{{ ydb-short-name }} has a built-in self-diagnostic system, which can be used to get a brief report on the database status and information about existing problems.
+
+To initiate the check, call the `SelfCheck` method from `Ydb.Monitoring`. You must also pass the name of the checked DB as usual.
+
+Calling the method will return the following structure:
+
+```protobuf
+message SelfCheckResult {
+    SelfCheck.Result self_check_result = 1;
+    repeated IssueLog issue_log = 2;
+}
+```
+
+The `self_check_result` field of the `enum` type contains the DB check result:
+
+* `GOOD`: No problems were detected.
+* `DEGRADED`: Degradation of one of the database systems was detected, but the database is still functioning (for example, allowable disk loss).
+* `MAINTENANCE_REQUIRED`: Significant degradation was detected, there is a risk of accessibility loss, and human intervention is required.
+* `EMERGENCY`: A serious problem was detected in the database, with complete or partial loss of accessibility.
+
+If problems are detected, the `issue_log` field will contain problem descriptions with the following structure:
+
+```protobuf
+message IssueLog {
+    string id = 1;
+    StatusFlag.Status status = 2;
+    string message = 3;
+    Location location = 4;
+    repeated string reason = 5;
+    string type = 6;
+    uint32 level = 7;
+}
+```
+
+* `id`: A unique problem ID within this response.
+* `status`: Status (severity) of the current problem. It can take one of the following values:
+  * `RED`: A component is faulty or unavailable.
+  * `ORANGE`: A serious problem, we are one step away from losing accessibility. Intervention may be required.
+  * `YELLOW`: A minor problem, no risks to accessibility. We recommend you continue monitoring the problem.
+  * `BLUE`: Temporary minor degradation that does not affect database accessibility.
+  * `GREEN`: No problems were detected.
+  * `GREY`: Failed to determine the status (a problem with the self-diagnostic mechanism).
+* `message`: [Text that describes the problem](#problems).
+* `location`: Location of the problem.
+* `reason`: Possible IDs of the nested problems that led to the current problem.
+* `type`: Problem category (by subsystem).
+* `level`: Depth of the problem nesting.
+
+## Possible problems {#problems}
+
+* `Pool usage over 90/95/99%`: One of the pools' CPUs is overloaded.
+* `System tablet is unresponsive / response time over 1000ms/5000ms`: The system tablet is not responding or it takes too long to respond.
+* `Tablets are restarting too often`: Tablets are restarting too often.
+* `Tablets are dead`: Tablets are not started (or cannot be started).
+* `LoadAverage above 100%`: A physical host is overloaded.
+* `There are no compute nodes`: The database has no nodes to start the tablets.
+* `PDisk state is ...`: Indicates problems with a physical disk.
+* `PDisk is not available`: A physical disk is not available.
+* `Available size is less than 12%/9%/6%`: Free space on the physical disk is running out.
+* `VDisk is not available`: A virtual disk is not available.
+* `VDisk state is ...`: Indicates problems with a virtual disk.
+* `DiskSpace is ...`: Indicates problems with virtual disk space.
+* `Storage node is not available`: A node with disks is not available.
+* `Replication in progress`: Disk replication is in progress.
+* `Group has no redundancy`: A storage group lost its redundancy.
+* `Group failed`: A storage group lost its integrity.
+* `Group degraded`: The number of disks allowed in the group is not available.
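
Review note on the `IssueLog` description in health-check-api.md: the `reason` field links a problem to the nested problems that led to it, so a client can turn the flat `issue_log` list into a tree. The Python below is a minimal sketch under stated assumptions. `Issue`, `root_issues`, and `render_tree` are hypothetical names, `Issue` is a hand-written stand-in for a subset of the message fields rather than the class generated from the .proto files, and the sample data (including which problem causes which) is made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Issue:
    """Hypothetical stand-in for a subset of the IssueLog fields."""
    id: str
    status: str
    message: str
    reason: List[str] = field(default_factory=list)


def root_issues(issues: List[Issue]) -> List[Issue]:
    """Return the issues that no other issue lists in its `reason` field."""
    referenced = {rid for issue in issues for rid in issue.reason}
    return [issue for issue in issues if issue.id not in referenced]


def render_tree(issues: List[Issue]) -> List[str]:
    """Render the issue log as indented lines by following `reason` links."""
    by_id: Dict[str, Issue] = {issue.id: issue for issue in issues}
    lines: List[str] = []

    def visit(issue: Issue, depth: int, seen: frozenset) -> None:
        lines.append(f"{'  ' * depth}[{issue.status}] {issue.message}")
        for rid in issue.reason:
            child = by_id.get(rid)
            # Skip IDs missing from the response and guard against cycles.
            if child is not None and rid not in seen:
                visit(child, depth + 1, seen | {rid})

    for root in root_issues(issues):
        visit(root, 0, frozenset({root.id}))
    return lines


# Made-up sample: a failed group traced down to an unavailable disk.
sample = [
    Issue("0", "RED", "Group failed", reason=["1"]),
    Issue("1", "RED", "VDisk is not available", reason=["2"]),
    Issue("2", "RED", "PDisk is not available"),
]

for line in render_tree(sample):
    print(line)
```

Starting from the issues that nothing else references keeps each causal chain in one place, which is usually easier to read than the flat list when several nested problems share a root cause.
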
diff --git a/ydb/docs/en/core/reference/ydb-sdk/overview-grpc-api.md b/ydb/docs/en/core/reference/ydb-sdk/overview-grpc-api.md
new file mode 100644
index 0000000000..b1b1eaa08f
--- /dev/null
+++ b/ydb/docs/en/core/reference/ydb-sdk/overview-grpc-api.md
@@ -0,0 +1,7 @@
+# gRPC API overview
+
+{{ ydb-short-name }} provides the gRPC API, which you can use to manage your DB [resources](../../concepts/datamodel.md) and data. API methods and data structures are described using [Protocol Buffers](https://developers.google.com/protocol-buffers/docs/proto3) (proto 3). For more information, see [.proto specifications with comments](https://github.com/ydb-platform/ydb-api-protos).
+
+The following services are available:
+
+* [{#T}](health-check-api.md).
diff --git a/ydb/docs/en/core/reference/ydb-sdk/toc_i.yaml b/ydb/docs/en/core/reference/ydb-sdk/toc_i.yaml
index 62ca3cc85e..f277c552e9 100644
--- a/ydb/docs/en/core/reference/ydb-sdk/toc_i.yaml
+++ b/ydb/docs/en/core/reference/ydb-sdk/toc_i.yaml
@@ -11,6 +11,12 @@ items:
     include: { mode: link, path: example/toc_p.yaml }
   - name: Handling errors in the API
     href: error_handling.md
+  - name: gRPC API
+    items:
+    - name: Overview
+      href: overview-grpc-api.md
+    - name: Health Check API
+      href: health-check-api.md
   - name: Code recipes
     include: { mode: link, path: recipes/toc_p.yaml }