<doc-view>

<h2 id="_contents">Contents</h2>
<div class="section">
<ul class="ulist">
<li>
<p><router-link to="#_overview" @click.native="this.scrollFix('#_overview')">Overview</router-link></p>

</li>
<li>
<p><router-link to="#maven-coordinates" @click.native="this.scrollFix('#maven-coordinates')">Maven Coordinates</router-link></p>

</li>
<li>
<p><router-link to="#_api" @click.native="this.scrollFix('#_api')">API</router-link></p>

</li>
<li>
<p><router-link to="#_configuration" @click.native="this.scrollFix('#_configuration')">Configuration</router-link></p>

</li>
<li>
<p><router-link to="#_examples" @click.native="this.scrollFix('#_examples')">Examples</router-link></p>

</li>
<li>
<p><router-link to="#_additional_information" @click.native="this.scrollFix('#_additional_information')">Additional Information</router-link></p>

</li>
</ul>

</div>


<h2 id="_overview">Overview</h2>
<div class="section">
<p>It’s a good practice to monitor your microservice’s health, to ensure that it is
available and performs correctly.
Applications implement health checks to expose health status that is collected
at regular intervals by external tooling, such as orchestrators like
Kubernetes. The orchestrator may then take action, such as restarting your
application if the health check fails.</p>

<p>A typical health check combines the statuses of all the dependencies that
affect availability and the ability to perform correctly:</p>

<ul class="ulist">
<li>
<p>network latency</p>

</li>
<li>
<p>storage</p>

</li>
<li>
<p>database</p>

</li>
<li>
<p>other services used by your application</p>

</li>
</ul>

</div>


<h2 id="maven-coordinates">Maven Coordinates</h2>
<div class="section">
<p>To enable Health Checks
add the following dependency to your project&#8217;s <code>pom.xml</code> (see
 <router-link to="/about/managing-dependencies">Managing Dependencies</router-link>).</p>

<markup
lang="xml"

>&lt;dependency&gt;
    &lt;groupId&gt;io.helidon.reactive.health&lt;/groupId&gt;
    &lt;artifactId&gt;helidon-reactive-health&lt;/artifactId&gt;
&lt;/dependency&gt;</markup>

<p>Optional dependency to use built-in health checks:</p>

<markup
lang="xml"

>&lt;dependency&gt;
    &lt;groupId&gt;io.helidon.health&lt;/groupId&gt;
    &lt;artifactId&gt;helidon-health-checks&lt;/artifactId&gt;
&lt;/dependency&gt;</markup>

</div>


<h2 id="_api">API</h2>
<div class="section">
<p>A health check is a Java functional interface that returns a
<code>HealthCheckResponse</code> instance. You can choose to implement a health check
inline with a lambda expression or you can reference a method with the double
colon operator <code>::</code>.</p>

<markup
lang="java"
title="Health check with a lambda expression:"
>HealthCheck hc = () -&gt; HealthCheckResponse
        .named("exampleHealthCheck")
        .up()
        .build();</markup>

<markup
lang="java"
title="Health check with method reference:"
>HealthCheckResponse exampleHealthCheck() {
    return HealthCheckResponse
        .named("exampleHealthCheck")
        .up()
        .build();
}
HealthCheck hc = this::exampleHealthCheck;</markup>

<p><code>HealthSupport</code> is a WebServer service that contains a collection of
registered <code>HealthCheck</code> instances. When queried, it invokes the registered
health checks and returns a response with a status code representing the overall
status of the application.</p>

<div class="block-title"><span>Health status codes</span></div>
<div class="table__overflow elevation-1  flex sm7
">
<table class="datatable table">
<colgroup>
<col style="width: 16.667%;">
<col style="width: 83.333%;">
</colgroup>
<thead>
</thead>
<tbody>
<tr>
<td class=""><code>200</code></td>
<td class="">The application is healthy (with health check details in the response).</td>
</tr>
<tr>
<td class=""><code>204</code></td>
<td class="">The application is healthy (with <em>no</em> health check details in the response).</td>
</tr>
<tr>
<td class=""><code>503</code></td>
<td class="">The application is not healthy.</td>
</tr>
<tr>
<td class=""><code>500</code></td>
<td class="">An error occurred while reporting the health.</td>
</tr>
</tbody>
</table>
</div>

<p>HTTP <code>GET</code> responses include JSON content showing the detailed results of all the health checks which the server executed after receiving the request.
HTTP <code>HEAD</code> requests return only the status with no payload.</p>

<p>The following code snippets show how to register health checks while building an
instance of <code>HealthSupport</code>:</p>

<markup
lang="java"
title="Create the health support service:"
>HealthSupport health = HealthSupport.builder()
    .addLiveness(hc)        // hc created above
    .build();</markup>

<markup
lang="java"
title="Create a custom health check:"
>HealthSupport health = HealthSupport.builder()
    .addLiveness(() -&gt; HealthCheckResponse.named("exampleHealthCheck")
                 .up()
                 .withData("time", System.currentTimeMillis())
                 .build())
    .build();</markup>

<p>The custom health check above returns a status of <code>UP</code> and the current time.
After creating the <code>HealthCheck</code> and registering it in a <code>HealthSupport</code>, we
must add the latter to the WebServer routes as follows:</p>

<markup
lang="java"

>Routing.builder()
        .register(health)
        .build();</markup>

<p>Here is a sample response to the custom health check registered above:</p>

<markup
lang="json"
title="JSON response:"
>{
    "status": "UP",
    "checks": [
        {
            "name": "exampleHealthCheck",
            "status": "UP",
            "data": {
                "time": 1546958376613
            }
        }
    ]
}</markup>

<div class="admonition tip">
<p class="admonition-inline">Balance collecting a lot of information with the need to avoid overloading
the application and overwhelming users.</p>
</div>

<p>The following table provides a summary of the Health Check API classes.</p>

<div class="block-title"><span>Health check API classes</span></div>
<div class="table__overflow elevation-1  ">
<table class="datatable table">
<colgroup>
<col style="width: 40%;">
<col style="width: 60%;">
</colgroup>
<thead>
</thead>
<tbody>
<tr>
<td class=""><code>org.eclipse.microprofile.health.HealthCheck</code></td>
<td class="">Java functional interface representing the logic of a single health check</td>
</tr>
<tr>
<td class=""><code>org.eclipse.microprofile.health.HealthCheckResponse</code></td>
<td class="">Result of a health check invocation that contains a status and a description.</td>
</tr>
<tr>
<td class=""><code>org.eclipse.microprofile.health.HealthCheckResponseBuilder</code></td>
<td class="">Builder class to create <code>HealthCheckResponse</code> instances</td>
</tr>
<tr>
<td class=""><code>io.helidon.reactive.health.HealthSupport</code></td>
<td class="">WebServer service that exposes <code>/health</code> and invokes the registered health
checks</td>
</tr>
<tr>
<td class=""><code>io.helidon.reactive.health.HealthSupport.Builder</code></td>
<td class="">Builder class to create <code>HealthSupport</code> instances</td>
</tr>
</tbody>
</table>
</div>


<h3 id="_built_in_health_checks">Built-in health checks</h3>
<div class="section">
<p>You can use Helidon-provided health checks to report various
common health check statuses:</p>


<div id="built-in-health-checks-table" class="table__overflow elevation-1  ">
<table class="datatable table">
<colgroup>
<col style="width: 4.348%;">
<col style="width: 4.348%;">
<col style="width: 13.043%;">
<col style="width: 65.217%;">
<col style="width: 13.044%;">
</colgroup>
<thead>
<tr>
<th>Built-in health check</th>
<th>Health check name</th>
<th>JavaDoc</th>
<th>Config properties</th>
<th>Default config value</th>
</tr>
</thead>
<tbody>
<tr>
<td class="">deadlock detection</td>
<td class=""><code>deadlock</code></td>
<td class=""><a target="_blank" href="./apidocs/io.helidon.health.checks/io/helidon/health/checks/DeadlockHealthCheck.html"><code>DeadlockHealthCheck</code></a></td>
<td class="">n/a</td>
<td class="">n/a</td>
</tr>
<tr>
<td class="">available disk space</td>
<td class=""><code>diskSpace</code></td>
<td class=""><a target="_blank" href="./apidocs/io.helidon.health.checks/io/helidon/health/checks/DiskSpaceHealthCheck.html"><code>DiskSpaceHealthCheck</code></a></td>
<td class=""><code>helidon.healthCheck.diskSpace.thresholdPercent</code><br>
<code>helidon.healthCheck.diskSpace.path</code></td>
<td class=""><code>99.999</code><br>
<code>/</code></td>
</tr>
<tr>
<td class="">available heap memory</td>
<td class=""><code>heapMemory</code></td>
<td class=""><a target="_blank" href="./apidocs/io.helidon.health.checks/io/helidon/health/checks/HeapMemoryHealthCheck.html"><code>HeapMemoryHealthCheck</code></a></td>
<td class=""><code>helidon.healthCheck.heapMemory.thresholdPercent</code></td>
<td class=""><code>98</code></td>
</tr>
</tbody>
</table>
</div>

<p>The following code adds the default built-in health checks to your application:</p>

<markup
lang="java"

>HealthSupport health = HealthSupport.builder()
    .add(HealthChecks.healthChecks())   <span class="conum" data-value="1" />
    .build();

Routing.builder()
       .register(health)   <span class="conum" data-value="2" />
       .build();</markup>

<ul class="colist">
<li data-value="1">Add built-in health checks using defaults (requires the <code>helidon-health-checks</code>
dependency).</li>
<li data-value="2">Register the created <code>HealthSupport</code> with web server routing (adds the
<code>/health</code> endpoint).</li>
</ul>

<p>You can control the thresholds for built-in health checks in either of two ways:</p>

<ul class="ulist">
<li>
<p>Create the health checks individually
using their builders instead of using the <code>HealthChecks</code> convenience class.
Follow the JavaDoc links in the <router-link to="#built-in-health-checks-table" @click.native="this.scrollFix('#built-in-health-checks-table')">table</router-link> above.</p>

</li>
<li>
<p>Use configuration as explained in <router-link to="#_configuration" @click.native="this.scrollFix('#_configuration')">Configuration</router-link>.</p>

</li>
</ul>

</div>


<h3 id="_kubernetes_probes">Kubernetes probes</h3>
<div class="section">
<p><em>Probe</em> is the term Kubernetes uses for a health check on a container
(<a target="_blank" href="https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes">Kubernetes documentation</a>).</p>

<p>There are three types of probes:</p>

<ul class="ulist">
<li>
<p><em>liveness</em>: Indicates whether the container is running</p>

</li>
<li>
<p><em>readiness</em>: Indicates whether the container is ready to service requests</p>

</li>
<li>
<p><em>startup</em>: Indicates whether the application in the container has started</p>

</li>
</ul>

<p>You can implement probes using the following mechanisms:</p>

<ol style="margin-left: 15px;">
<li>
Running a command inside a container

</li>
<li>
Sending an <code>HTTP</code> request to a container

</li>
<li>
Opening a <code>TCP</code> socket to a container

</li>
</ol>

<p>A microservice exposed to HTTP traffic will typically implement both the
liveness probe and the readiness probe using HTTP requests.
If the microservice takes a significant time to initialize itself, you can also define a startup probe, in which case
Kubernetes does not check liveness or readiness probes until the startup probe returns success.</p>

<p>You can configure several parameters for probes. The following are the most
relevant parameters:</p>


<div class="table__overflow elevation-1  flex sm7
">
<table class="datatable table">
<colgroup>
<col style="width: 28.571%;">
<col style="width: 71.429%;">
</colgroup>
<thead>
</thead>
<tbody>
<tr>
<td class=""><code>initialDelaySeconds</code></td>
<td class="">Number of seconds after the container has started before liveness or readiness
probes are initiated.</td>
</tr>
<tr>
<td class=""><code>periodSeconds</code></td>
<td class="">Probe interval. Defaults to 10 seconds. Minimum value is 1.</td>
</tr>
<tr>
<td class=""><code>timeoutSeconds</code></td>
<td class="">Number of seconds after which the probe times out. Defaults to 1 second.
Minimum value is 1.</td>
</tr>
<tr>
<td class=""><code>failureThreshold</code></td>
<td class="">Number of consecutive failures after which Kubernetes gives up on the probe
and treats the check as failed. Defaults to 3. Minimum value is 1.</td>
</tr>
</tbody>
</table>
</div>


<h4 id="_liveness_probe">Liveness probe</h4>
<div class="section">
<p>The liveness probe is used to detect whether the container has become unresponsive.
For example, it can be used to detect deadlocks or analyze heap usage. When
Kubernetes gives up on a liveness probe, it restarts the corresponding container.</p>

<div class="admonition note">
<p class="admonition-inline">The liveness probe can result in repeated restarts in certain cases.
For example, if the probe is implemented to check all the dependencies
strictly, then it can fail repeatedly for temporary issues. Repeated restarts
can also occur if <code>timeoutSeconds</code> or <code>periodSeconds</code> is too low.</p>
</div>

<p>We recommend the following:</p>

<ul class="ulist">
<li>
<p>Avoid checking dependencies in a liveness probe.</p>

</li>
<li>
<p>Set <code>timeoutSeconds</code> high enough to avoid excessive probe failures.</p>

</li>
<li>
<p>Account for startup time with <code>initialDelaySeconds</code>.</p>

</li>
</ul>

</div>


<h4 id="_readiness_probe">Readiness probe</h4>
<div class="section">
<p>The readiness probe is used to avoid routing requests to the pod until it is
ready to accept traffic. When Kubernetes gives up on a readiness probe, the
pod is not restarted; instead, traffic is no longer routed to it.</p>

<div class="admonition note">
<p class="admonition-inline">In certain cases, the readiness probe can cause all the pods to be removed
from service routing. For example, if the probe is implemented to check all the
dependencies strictly, then it can fail repeatedly for temporary issues. This
issue can also occur if <code>timeoutSeconds</code> or <code>periodSeconds</code> is too low.</p>
</div>

<p>We recommend the following:</p>

<ul class="ulist">
<li>
<p>Be conservative when checking shared dependencies.</p>

</li>
<li>
<p>Be aggressive when checking local dependencies.</p>

</li>
<li>
<p>Set <code>failureThreshold</code> according to <code>periodSeconds</code> in order to accommodate
temporary errors.</p>

</li>
</ul>
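
<p>As a rough illustration of the last recommendation, the fragment below sketches a
readiness probe whose <code>failureThreshold</code> is chosen relative to
<code>periodSeconds</code>. The timing values are examples only, and the path and port
are assumptions borrowed from the Kubernetes example later in this document.</p>

<markup
lang="yaml"
title="Hypothetical readiness probe timing:"
>readinessProbe:
  httpGet:
    path: /ready          # assumed readiness endpoint
    port: 8081            # assumed health port
  periodSeconds: 10       # probe every 10 seconds
  timeoutSeconds: 3       # each probe may take up to 3 seconds
  failureThreshold: 3     # tolerate 3 consecutive failures before giving up</markup>

<p>With these values, a temporary error has to persist for roughly
<code>failureThreshold</code> times <code>periodSeconds</code>, about 30 seconds here,
before the pod is removed from service routing.</p>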

</div>


<h4 id="_startup_probe">Startup probe</h4>
<div class="section">
<p>The startup probe prevents Kubernetes from prematurely checking the other probes if the application takes a long time to start.
Otherwise, Kubernetes might misinterpret a failed liveness or readiness probe and shut down the container when, in fact, the application is still coming up.</p>
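
<p>A startup probe could be declared in the container specification roughly as shown
below. This is a sketch under assumptions, not a definitive configuration: the path
<code>/started</code> and the port <code>8081</code> are hypothetical and must match an
endpoint that your application actually exposes, and the timing values are examples only.</p>

<markup
lang="yaml"
title="Hypothetical startup probe:"
>startupProbe:
  httpGet:
    path: /started        # hypothetical startup endpoint
    port: 8081            # assumed health port
  periodSeconds: 10       # probe every 10 seconds
  failureThreshold: 30    # allow up to 30 * 10 = 300 seconds for startup</markup>

<p>With these values the application has up to about 300 seconds to start. Kubernetes
begins the liveness and readiness probes only after the startup probe succeeds.</p>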

</div>

</div>


<h3 id="_troubleshooting_probes">Troubleshooting probes</h3>
<div class="section">
<p>Failed probes are recorded as events associated with their corresponding pods.
The event message contains only the status code.</p>

<markup
lang="bash"
title="Get the events of a single pod:"
>POD_NAME=$(kubectl get pod -l app=acme -o jsonpath='{.items[0].metadata.name}') <span class="conum" data-value="1" />
kubectl get event --field-selector involvedObject.name=${POD_NAME} <span class="conum" data-value="2" /></markup>

<ul class="colist">
<li data-value="1">Get the effective pod name by filtering pods with the label <code>app=acme</code>.</li>
<li data-value="2">Filter the events for the pod.</li>
</ul>

<div class="admonition tip">
<p class="admonition-inline">Create log messages in your health check implementation when setting a
<code>DOWN</code> status. This allows you to correlate a failed probe with its cause.</p>
</div>

</div>

</div>


<h2 id="_configuration">Configuration</h2>
<div class="section">
<p>Built-in health checks can be configured using the config property keys
described in this
<router-link to="#built-in-health-checks-table" @click.native="this.scrollFix('#built-in-health-checks-table')">table</router-link>. Further, you can suppress one or more of the built-in
health checks by setting the configuration item
<code>helidon.health.exclude</code> to a comma-separated list of the health check names
(from this <router-link to="#built-in-health-checks-table" @click.native="this.scrollFix('#built-in-health-checks-table')">table</router-link>) you want to exclude.</p>
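
<p>As an illustration, the properties above could be set in a YAML configuration
source. This is a minimal sketch, not a definitive setup: it assumes that your
application loads a YAML configuration file (the file name <code>application.yaml</code>
is an assumption) and makes that configuration available to the built-in health checks.
The threshold values, the path <code>/data</code>, and the excluded check are
hypothetical examples.</p>

<markup
lang="yaml"
title="Hypothetical application.yaml:"
>helidon:
  healthCheck:
    diskSpace:
      thresholdPercent: 95.0   # example value; the default is 99.999
      path: "/data"            # hypothetical path; the default is /
    heapMemory:
      thresholdPercent: 90.0   # example value; the default is 98
  health:
    exclude: "deadlock"        # comma-separated list of health check names to suppress</markup>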

</div>


<h2 id="_examples">Examples</h2>
<div class="section">

<h3 id="_json_response_example">JSON response example</h3>
<div class="section">
<p>Accessing the Helidon-provided <code>/health</code> endpoint reports the health of your application
as shown below:</p>

<markup
lang="json"
title="JSON response:"
>{
    "status": "UP",
    "checks": [
        {
            "name": "deadlock",
            "status": "UP"
        },
        {
            "name": "diskSpace",
            "status": "UP",
            "data": {
                "free": "211.00 GB",
                "freeBytes": 226563444736,
                "percentFree": "45.31%",
                "total": "465.72 GB",
                "totalBytes": 500068036608
            }
        },
        {
            "name": "heapMemory",
            "status": "UP",
            "data": {
                "free": "215.15 MB",
                "freeBytes": 225600496,
                "max": "3.56 GB",
                "maxBytes": 3817865216,
                "percentFree": "99.17%",
                "total": "245.50 MB",
                "totalBytes": 257425408
            }
        }
    ]
}</markup>

</div>


<h3 id="_kubernetes_example">Kubernetes example</h3>
<div class="section">
<p>This example shows the usage of the Helidon health API in an application that
implements health endpoints for the liveness and readiness probes. Note that
the application code dissociates the health endpoints from the default routes,
so that the health endpoints are not exposed by the service. An example YAML
specification is also provided for the Kubernetes service and deployment.</p>

<markup
lang="java"
title="Application code:"
>Routing healthRouting = Routing.builder()
        .register(HealthSupport.builder()
                .webContext("/live") <span class="conum" data-value="1" />
                .add(HealthChecks.healthChecks()) <span class="conum" data-value="2" />
                .build())
        .register(HealthSupport.builder()
                .webContext("/ready") <span class="conum" data-value="3" />
                .addReadiness(() -&gt; HealthCheckResponse.named("database").up().build()) <span class="conum" data-value="4" />
                .build())
        .build();

Routing defaultRouting = Routing.builder()
        .any((req, res) -&gt; res.send("It works!")) <span class="conum" data-value="5" />
        .build();

WebServer server = WebServer.builder(defaultRouting)
        .port(8080) <span class="conum" data-value="6" />
        .addSocket("health", SocketConfiguration.builder() <span class="conum" data-value="7" />
                .port(8081)
                .build())
        .addNamedRouting("health", healthRouting) <span class="conum" data-value="8" />
        .build();

server.start();</markup>

<ul class="colist">
<li data-value="1">The health service for the <code>liveness</code> probe is exposed at <code>/live</code>.</li>
<li data-value="2">Using the built-in health checks for the <code>liveness</code> probe.</li>
<li data-value="3">The health service for the <code>readiness</code> probe is exposed at <code>/ready</code>.</li>
<li data-value="4">Using a custom health check for a pseudo database that is always <code>UP</code>.</li>
<li data-value="5">The default route: returns <code>It works!</code> for any request.</li>
<li data-value="6">The server uses port 8080 for the default routes.</li>
<li data-value="7">A socket configuration named <code>health</code> using port <code>8081</code>.</li>
<li data-value="8">Route the health services exclusively on the <code>health</code> socket.</li>
</ul>

<markup
lang="yaml"
title="Kubernetes descriptor:"
>kind: Service
apiVersion: v1
metadata:
  name: acme <span class="conum" data-value="1" />
  labels:
    app: acme
spec:
  type: NodePort
  selector:
    app: acme
  ports:
  - port: 8080
    targetPort: 8080
    name: http
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: acme <span class="conum" data-value="2" />
spec:
  replicas: 1
  selector:
    matchLabels:
      app: acme
  template:
    metadata:
      name: acme
      labels:
        app: acme
    spec:
      containers:
      - name: acme
        image: acme
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
        livenessProbe:
          httpGet:
            path: /live <span class="conum" data-value="3" />
            port: 8081
          initialDelaySeconds: 3 <span class="conum" data-value="4" />
          periodSeconds: 10
          timeoutSeconds: 3
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready <span class="conum" data-value="5" />
            port: 8081
          initialDelaySeconds: 10 <span class="conum" data-value="6" />
          periodSeconds: 30
          timeoutSeconds: 10
---</markup>

<ul class="colist">
<li data-value="1">A service of type <code>NodePort</code> that serves the default routes on port <code>8080</code>.</li>
<li data-value="2">A deployment with one replica of a pod.</li>
<li data-value="3">The HTTP endpoint for the liveness probe.</li>
<li data-value="4">The liveness probe configuration.</li>
<li data-value="5">The HTTP endpoint for the readiness probe.</li>
<li data-value="6">The readiness probe configuration.</li>
</ul>

</div>

</div>


<h2 id="_additional_information">Additional Information</h2>
<div class="section">
<ul class="ulist">
<li>
<p><a target="_blank" href="./apidocs/io.helidon.health.checks/module-summary.html">Health Checks SE API JavaDocs</a>.</p>

</li>
</ul>

</div>

</doc-view>
