Add support Kong monitoring (#12696)
CodePrometheus authored Oct 17, 2024
1 parent 41b00c8 commit bfbda6f
Showing 35 changed files with 1,978 additions and 4 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/skywalking.yaml
@@ -667,6 +667,8 @@ jobs:
config: test/e2e-v2/cases/clickhouse/clickhouse-prometheus-endpoint/e2e.yaml
- name: ActiveMQ
config: test/e2e-v2/cases/activemq/e2e.yaml
- name: Kong
config: test/e2e-v2/cases/kong/e2e.yaml

- name: UI Menu BanyanDB
config: test/e2e-v2/cases/menu/banyandb/e2e.yaml
4 changes: 3 additions & 1 deletion docs/en/changes/changes.md
@@ -10,10 +10,12 @@
* Support query endpoint list with duration parameter(optional).
* Change the endpoint_traffic to updatable for the additional column `last_ping`.
* Add Component ID(5023) for the GoZero framework.

* Support Kong monitoring.

#### UI

* Add support for case-insensitive search in the dashboard list.
* Add content decorations to Table and Card widgets.

#### Documentation
* Update release document to adopt newly added revision-based process.
9 changes: 9 additions & 0 deletions docs/en/concepts-and-designs/service-hierarchy.md
@@ -37,6 +37,7 @@ If you want to customize it according to your own needs, please refer to [Servic
| KAFKA | K8S_SERVICE | [KAFKA On K8S_SERVICE](#kafka-on-k8s_service) |
| PULSAR | K8S_SERVICE | [PULSAR On K8S_SERVICE](#pulsar-on-k8s_service) |
| SO11Y_OAP | K8S_SERVICE | [SO11Y_OAP On K8S_SERVICE](#so11y_oap-on-k8s_service) |
| KONG | K8S_SERVICE | [KONG On K8S_SERVICE](#kong-on-k8s_service) |

- The following sections will describe the **default matching rules** in detail and use the `upper-layer On lower-layer` format.
- The example service names are based on the SkyWalking [Showcase](https://github.com/apache/skywalking-showcase) default deployment.
@@ -220,6 +221,14 @@ If you want to customize it according to your own needs, please refer to [Servic
- SO11Y_OAP.service.name: `demo-oap.skywalking-showcase`
- K8S_SERVICE.service.name: `skywalking-showcase::demo-oap.skywalking-showcase`

#### KONG On K8S_SERVICE
- Rule name: `short-name`
- Groovy script: `{ (u, l) -> u.shortName == l.shortName }`
- Description: KONG.service.shortName == K8S_SERVICE.service.shortName
- Matched Example:
- KONG.service.name: `kong::kong.skywalking-showcase`
- K8S_SERVICE.service.name: `skywalking-showcase::kong.skywalking-showcase`
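
The `short-name` rule above compares only the short names of the upper-layer and lower-layer services. Below is a minimal Java sketch of that comparison. It assumes the short name is whatever follows the `::` group separator in the service name, as the matched example suggests. The class and method names are hypothetical, and this is not the OAP's own implementation.

```java
import java.util.function.BiPredicate;

/** Illustrative only: mirrors the Groovy rule { (u, l) -> u.shortName == l.shortName }. */
final class ShortNameRuleSketch {
    /**
     * Assumption: a service name has the form "group::shortName".
     * If there is no "::" separator, the whole name is treated as the short name.
     */
    static String shortName(String serviceName) {
        int idx = serviceName.indexOf("::");
        return idx < 0 ? serviceName : serviceName.substring(idx + 2);
    }

    /** The upper (KONG) and lower (K8S_SERVICE) services match when their short names are equal. */
    static final BiPredicate<String, String> SHORT_NAME =
        (upper, lower) -> shortName(upper).equals(shortName(lower));

    public static void main(String[] args) {
        String kong = "kong::kong.skywalking-showcase";
        String k8s = "skywalking-showcase::kong.skywalking-showcase";
        System.out.println(SHORT_NAME.test(kong, k8s));
    }
}
```

With the two service names from the matched example, both short names resolve to `kong.skywalking-showcase`, so the rule matches.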

### Build Through Specific Agents
Use agent tech involved(such as eBPF) and deployment tools(such as operator and agent injector) to detect the service hierarchy relations.

77 changes: 77 additions & 0 deletions docs/en/setup/backend/backend-kong-monitoring.md
@@ -0,0 +1,77 @@
# KONG monitoring

## KONG performance from `kong prometheus plugin`
The [kong-prometheus](https://github.com/Kong/kong/tree/master/kong/plugins/prometheus) plugin is a Lua plugin for Kong that collects metrics.
It exposes metrics related to Kong and proxied upstream services in the Prometheus exposition format, which can be scraped by a Prometheus server.
SkyWalking leverages the OpenTelemetry Collector to transfer the metrics to the [OpenTelemetry receiver](opentelemetry-receiver.md)
and into the [Meter System](./../../concepts-and-designs/meter.md).

### Data flow
1. [KONG Prometheus plugin](https://docs.konghq.com/hub/kong-inc/prometheus/) collects metrics data from KONG.
2. OpenTelemetry Collector fetches metrics from [KONG Prometheus plugin](https://docs.konghq.com/hub/kong-inc/prometheus/) via
Prometheus Receiver and pushes metrics to SkyWalking OAP Server via OpenTelemetry gRPC exporter.
3. The SkyWalking OAP Server parses the expression with [MAL](../../concepts-and-designs/mal.md) to filter/calculate/aggregate and store the results.
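
To make the first hop of this data flow concrete, here is a rough Java sketch of reading one sample line in the Prometheus exposition format, which is the form in which the plugin exposes its metrics. The class name and the label values are made up for illustration. The parser is deliberately simplified: it ignores comments, timestamps, and escaped quotes. In the real pipeline the OpenTelemetry Collector does this parsing.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: parses a single "name{label="value",...} value" sample line. */
final class ExpositionLineSketch {
    // Metric name, optional {labels}, then the sample value.
    private static final Pattern SAMPLE =
        Pattern.compile("^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\\{(.*)\\})?\\s+(\\S+)$");
    // One label pair; escaped quotes inside values are not handled in this sketch.
    private static final Pattern LABEL =
        Pattern.compile("([a-zA-Z_][a-zA-Z0-9_]*)=\"([^\"]*)\"");

    final String name;
    final Map<String, String> labels = new LinkedHashMap<>();
    final double value;

    ExpositionLineSketch(String line) {
        Matcher sample = SAMPLE.matcher(line.trim());
        if (!sample.matches()) {
            throw new IllegalArgumentException("not a sample line: " + line);
        }
        name = sample.group(1);
        if (sample.group(2) != null) {
            Matcher label = LABEL.matcher(sample.group(2));
            while (label.find()) {
                labels.put(label.group(1), label.group(2));
            }
        }
        value = Double.parseDouble(sample.group(3));
    }

    public static void main(String[] args) {
        // Hypothetical sample; the route name and count are invented.
        ExpositionLineSketch s = new ExpositionLineSketch(
            "kong_http_requests_total{route=\"demo-route\",code=\"200\"} 42");
        System.out.println(s.name + " " + s.labels + " " + s.value);
    }
}
```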

### Set up
1. Enable the [KONG Prometheus plugin](https://docs.konghq.com/hub/kong-inc/prometheus/). Note that to monitor per_consumer,
status_code_metrics, ai_metrics, latency_metrics, bandwidth_metrics or upstream_health_metrics, **you need to enable them manually as needed**.
They can be enabled in the [konga](https://pantsel.github.io/konga/) dashboard or through the Admin API, for example with the following command:
~~~bash
curl -i -X POST http://{KONG-HOST}:{KONG_ADMIN_PORT}/plugins \
--data name=prometheus \
--data config.per_consumer=true \
--data config.status_code_metrics=true \
--data config.ai_metrics=true \
--data config.latency_metrics=true \
--data config.bandwidth_metrics=true \
--data config.upstream_health_metrics=true
~~~
2. Set up [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/getting-started/#docker).
For details on configuring the Prometheus Receiver in OpenTelemetry Collector, refer to [this example](../../../../test/e2e-v2/cases/kong/otel-collector-config.yaml).
3. Configure the SkyWalking [OpenTelemetry receiver](opentelemetry-receiver.md).

### KONG Monitoring

The [KONG Prometheus plugin](https://docs.konghq.com/hub/kong-inc/prometheus/) provides metrics in multiple dimensions for the KONG server, upstream, route, etc.
Accordingly, SkyWalking observes the status, requests, and latency of the KONG server, which is cataloged as a `LAYER: KONG` `Service` in the OAP.
Each Kong server is cataloged as a `LAYER: KONG` `instance`, while each route rule is recognized as a `LAYER: KONG` `endpoint`.


#### Kong Request Supported Metrics

| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|------------------|-------|-------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|-------------|
| Bandwidth | bytes | meter_kong_service_http_bandwidth<br />meter_kong_instance_http_bandwidth<br />meter_kong_endpoint_http_bandwidth | Total bandwidth (ingress/egress) throughput | Kong |
| HTTP Status | count | meter_kong_service_http_status<br />meter_kong_instance_http_status<br />meter_kong_endpoint_http_status | HTTP status codes per consumer/service/route in Kong | Kong |
| HTTP Request | count | meter_kong_service_http_requests<br />meter_kong_instance_http_requests | Total number of requests | Kong |

#### Kong Database Supported Metrics

| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|------------------|-------|-------------------------------------------------------------------------------------|-------------------------------------------|-------------|
| DB | count | meter_kong_service_datastore_reachable<br />meter_kong_instance_datastore_reachable | Datastore reachable from Kong | Kong |
| DB | bytes | meter_kong_instance_shared_dict_bytes | Allocated slabs in bytes in a shared_dict | Kong |
| DB | bytes | meter_kong_instance_shared_dict_total_bytes | Total capacity in bytes of a shared_dict | Kong |
| DB | bytes | meter_kong_instance_memory_workers_lua_vms_bytes | Allocated bytes in worker Lua VM | Kong |

#### Kong Latencies Supported Metrics

| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|------------------|------|-------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------|-------------|
| Latency | ms | meter_kong_service_kong_latency<br />meter_kong_instance_kong_latency<br />meter_kong_endpoint_kong_latency | Latency added by Kong and enabled plugins for each service/route in Kong | Kong |
| Latency | ms | meter_kong_service_request_latency<br />meter_kong_instance_request_latency<br />meter_kong_endpoint_request_latency | Total latency incurred during requests for each service/route in Kong | Kong |
| Latency | ms | meter_kong_service_upstream_latency<br />meter_kong_instance_upstream_latency<br />meter_kong_endpoint_upstream_latency | Latency added by upstream response for each service/route in Kong | Kong |


#### Kong Nginx Supported Metrics

| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|------------------|-------|---------------------------------------------------------------------------------------------|---------------------------------------|-------------|
| Nginx | count | meter_kong_service_nginx_metric_errors_total | Number of nginx-lua-prometheus errors | Kong |
| Nginx | count | meter_kong_service_nginx_connections_total<br />meter_kong_instance_nginx_connections_total | Number of connections by subsystem | Kong |
| Nginx | count | meter_kong_service_nginx_timers<br />meter_kong_instance_nginx_timers | Number of Nginx timers | Kong |

### Customizations
You can customize your own metrics/expression/dashboard panel.
The metrics definition and expression rules are found in `/config/otel-rules/kong`.
The KONG dashboard panel configurations are found in `/config/ui-initialized-templates/kong`.
61 changes: 61 additions & 0 deletions docs/en/swip/SWIP-8.md
@@ -0,0 +1,61 @@
# Support Kong Monitoring

## Motivation

[**Kong**](https://github.com/Kong/kong) or **Kong API Gateway** is a cloud-native, platform-agnostic, scalable API Gateway
distinguished for its high performance and extensibility via plugins. Now I want to add Kong monitoring via the OpenTelemetry Collector,
which fetches metrics from the HTTP endpoint that Kong exposes for [Prometheus](https://prometheus.io/).

## Architecture Graph

There is no significant architecture-level change.

## Proposed Changes

1. Kong exposes its own [metrics](https://docs.konghq.com/hub/kong-inc/prometheus/) via an HTTP endpoint. The OpenTelemetry Collector fetches metrics from it and pushes them to the SkyWalking OTEL Receiver via the OpenTelemetry exporter.
2. The SkyWalking OAP Server parses the expression with MAL to filter/calculate/aggregate and store the results.
3. These metrics can be displayed via the SkyWalking UI, and the metrics can be customized for display on the UI dashboard.

### Kong Request Supported Metrics

| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|------------------|-------|-------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|-------------|
| Bandwidth | bytes | meter_kong_service_http_bandwidth<br />meter_kong_instance_http_bandwidth<br />meter_kong_endpoint_http_bandwidth | Total bandwidth (ingress/egress) throughput | Kong |
| HTTP Status | count | meter_kong_service_http_status<br />meter_kong_instance_http_status<br />meter_kong_endpoint_http_status | HTTP status codes per consumer/service/route in Kong | Kong |
| HTTP Request | count | meter_kong_service_http_requests<br />meter_kong_instance_http_requests | Total number of requests | Kong |

### Kong Database Supported Metrics

| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|------------------|-------|-------------------------------------------------------------------------------------|-------------------------------------------|-------------|
| DB | count | meter_kong_service_datastore_reachable<br />meter_kong_instance_datastore_reachable | Datastore reachable from Kong | Kong |
| DB | bytes | meter_kong_instance_shared_dict_bytes | Allocated slabs in bytes in a shared_dict | Kong |
| DB | bytes | meter_kong_instance_shared_dict_total_bytes | Total capacity in bytes of a shared_dict | Kong |
| DB | bytes | meter_kong_instance_memory_workers_lua_vms_bytes | Allocated bytes in worker Lua VM | Kong |

### Kong Latencies Supported Metrics

| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|------------------|------|-------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------|-------------|
| Latency | ms | meter_kong_service_kong_latency<br />meter_kong_instance_kong_latency<br />meter_kong_endpoint_kong_latency | Latency added by Kong and enabled plugins for each service/route in Kong | Kong |
| Latency | ms | meter_kong_service_request_latency<br />meter_kong_instance_request_latency<br />meter_kong_endpoint_request_latency | Total latency incurred during requests for each service/route in Kong | Kong |
| Latency | ms | meter_kong_service_upstream_latency<br />meter_kong_instance_upstream_latency<br />meter_kong_endpoint_upstream_latency | Latency added by upstream response for each service/route in Kong | Kong |


### Kong Nginx Supported Metrics

| Monitoring Panel | Unit | Metric Name | Description | Data Source |
|------------------|-------|---------------------------------------------------------------------------------------------|---------------------------------------|-------------|
| Nginx | count | meter_kong_service_nginx_metric_errors_total | Number of nginx-lua-prometheus errors | Kong |
| Nginx | count | meter_kong_service_nginx_connections_total<br />meter_kong_instance_nginx_connections_total | Number of connections by subsystem | Kong |
| Nginx | count | meter_kong_service_nginx_timers<br />meter_kong_instance_nginx_timers | Number of Nginx timers | Kong |

## Imported Dependencies libs and their licenses.

No new dependency.

## Compatibility

No breaking changes.

## General usage docs
2 changes: 2 additions & 0 deletions docs/menu.yml
@@ -110,6 +110,8 @@ catalog:
path: "/en/setup/backend/backend-apisix-monitoring"
- name: "AWS API Gateway"
path: "/en/setup/backend/backend-aws-api-gateway-monitoring"
- name: "Kong Monitoring"
path: "/en/setup/backend/backend-kong-monitoring"
- name: "Database Monitoring"
catalog:
- name: "MySQL/MariaDB Server"
@@ -240,7 +240,12 @@ public enum Layer {
* The self observability of SkyWalking Java Agent,
* which provides the abilities to measure the tracing performance and error statistics of plugins.
*/
SO11Y_JAVA_AGENT(39, true);
SO11Y_JAVA_AGENT(39, true),

/**
* Kong is a cloud-native API Gateway and AI Gateway.
*/
KONG(40, true);

private final int value;
/**
@@ -77,6 +77,7 @@ public class UITemplateInitializer {
Layer.ACTIVEMQ.name(),
Layer.CILIUM_SERVICE.name(),
Layer.SO11Y_JAVA_AGENT.name(),
Layer.KONG.name(),
"custom"
};
private final UITemplateManagementService uiTemplateManagementService;
@@ -399,7 +399,7 @@ receiver-otel:
selector: ${SW_OTEL_RECEIVER:default}
default:
enabledHandlers: ${SW_OTEL_RECEIVER_ENABLED_HANDLERS:"otlp-metrics,otlp-logs"}
enabledOtelMetricsRules: ${SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES:"apisix,nginx/*,k8s/*,istio-controlplane,vm,mysql/*,postgresql/*,oap,aws-eks/*,windows,aws-s3/*,aws-dynamodb/*,aws-gateway/*,redis/*,elasticsearch/*,rabbitmq/*,mongodb/*,kafka/*,pulsar/*,bookkeeper/*,rocketmq/*,clickhouse/*,activemq/*"}
enabledOtelMetricsRules: ${SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES:"apisix,nginx/*,k8s/*,istio-controlplane,vm,mysql/*,postgresql/*,oap,aws-eks/*,windows,aws-s3/*,aws-dynamodb/*,aws-gateway/*,redis/*,elasticsearch/*,rabbitmq/*,mongodb/*,kafka/*,pulsar/*,bookkeeper/*,rocketmq/*,clickhouse/*,activemq/*,kong/*"}

receiver-zipkin:
selector: ${SW_RECEIVER_ZIPKIN:-}
@@ -29,6 +29,7 @@ hierarchy:
GENERAL:
APISIX: lower-short-name-remove-ns
K8S_SERVICE: lower-short-name-remove-ns
KONG: lower-short-name-remove-ns

MYSQL:
K8S_SERVICE: short-name
@@ -62,6 +63,9 @@ hierarchy:

ACTIVEMQ:
K8S_SERVICE: short-name

KONG:
K8S_SERVICE: short-name

VIRTUAL_DATABASE:
MYSQL: lower-short-name-with-fqdn
@@ -110,6 +114,7 @@ layer-levels:
KAFKA: 2
PULSAR: 2
ACTIVEMQ: 2
KONG: 2

MESH_DP: 1
CILIUM_SERVICE: 1
@@ -0,0 +1,40 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
filter: "{ tags -> tags.job_name == 'kong-monitoring' }"
expSuffix: tag({tags -> tags.host_name = 'kong::' + tags.host_name}).endpoint(['host_name'], ['route'], Layer.KONG)
metricPrefix: meter_kong_endpoint
metricsRules:
# counter
# Total bandwidth (ingress/egress) throughput in bytes
- name: http_bandwidth
exp: kong_bandwidth_bytes.sum(['host_name','direction','route']).rate('PT1M')
# HTTP status codes per consumer/service/route in Kong
- name: http_status
exp: kong_http_requests_total.sum(['host_name','route','code']).rate('PT1M')

# histogram
# Latency added by Kong and enabled plugins for each service/route in Kong
- name: kong_latency
exp: kong_kong_latency_ms.tagNotEqual('route','').sum(['host_name','route','le']).histogram().histogram_percentile([50,70,90,99])
# Total latency incurred during requests for each service/route in Kong
- name: request_latency
exp: kong_request_latency_ms.tagNotEqual('route','').sum(['host_name','route','le']).histogram().histogram_percentile([50,70,90,99])
# Latency added by upstream response for each service/route in Kong
- name: upstream_latency
exp: kong_upstream_latency_ms.tagNotEqual('route','').sum(['host_name','route','le']).histogram().histogram_percentile([50,70,90,99])
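
The latency rules above group histogram buckets by the `le` (upper bound) label and then ask for the 50th, 70th, 90th, and 99th percentiles. Below is a rough Java sketch of one common way to estimate a percentile from cumulative buckets: return the smallest bucket upper bound whose cumulative count covers the requested share of samples. The class name and bucket values are hypothetical, and this is not necessarily how MAL's `histogram_percentile` is implemented internally.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: bucket-boundary percentile estimate over cumulative "le" buckets. */
final class HistogramPercentileSketch {
    /**
     * @param cumulative cumulative sample counts keyed by bucket upper bound in ms (the "le" label)
     * @param percentile requested percentile, e.g. 50 or 99
     * @return the smallest upper bound whose cumulative count reaches the percentile threshold
     */
    static double percentile(TreeMap<Double, Long> cumulative, int percentile) {
        if (cumulative.isEmpty()) {
            throw new IllegalArgumentException("no buckets");
        }
        // The last (largest) bucket holds the total number of samples.
        long total = cumulative.lastEntry().getValue();
        double threshold = total * percentile / 100.0;
        for (Map.Entry<Double, Long> bucket : cumulative.entrySet()) {
            if (bucket.getValue() >= threshold) {
                return bucket.getKey();
            }
        }
        return cumulative.lastKey();
    }

    public static void main(String[] args) {
        // Made-up cumulative counts: 50 samples <= 25 ms, 80 <= 50 ms, 95 <= 100 ms, 100 in total.
        TreeMap<Double, Long> buckets = new TreeMap<>();
        buckets.put(25.0, 50L);
        buckets.put(50.0, 80L);
        buckets.put(100.0, 95L);
        buckets.put(Double.POSITIVE_INFINITY, 100L);
        for (int p : new int[] {50, 70, 90, 99}) {
            System.out.println("p" + p + " upper bound: " + percentile(buckets, p));
        }
    }
}
```

Because the estimate is a bucket boundary, its precision depends on how finely the latency buckets are spaced.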