@@ -28,9 +28,8 @@ vpc - defines and installs the VPC and subnets to use with EKS
└─logagent - deploys a logging agent (filebeat) to the EKS cluster
└─certmgr - deploys the open source cert-manager.io helm chart to the EKS cluster
└─prometheus - deploys prometheus server, node exporter, and statsd collector for metrics
- └─grafana - deploys the grafana visualization platform
- └─observability - deploys the OTEL operator and instantiates a simple collector
- └─sirius - deploys the Bank of Sirius application to the EKS cluster
+ └─observability - deploys the OTEL operator and instantiates a simple collector
+ └─sirius - deploys the Bank of Sirius application to the EKS cluster

```

@@ -146,15 +145,40 @@ deployment.
### Prometheus

Prometheus is deployed and configured to enable the collection of metrics for all components that have
- properties `prometheus.io:scrape: true` set in the annotations
- (along with any other connection information). This includes the prometheus `node-exporter`
- daemonset which is deployed in this step as well.
+ a defined service monitor. At installation time, the deployment will instantiate:
+ - Node Exporters
+ - Kubernetes Service Monitors
+ - Grafana preloaded with dashboards and datasources for Kubernetes management
+ - The NGINX Ingress Controller
+ - Statsd receiver
+
+ The former behavior of using the `prometheus.io:scrape: true` annotation to mark the pods whose metrics
+ should be scraped has been deprecated, and these annotations will be removed in the near future.
+
+ Also, the standalone Grafana deployment has been removed from the standard deployment scripts, but it has been
+ kept as a project in the event someone wishes to run it on its own.
+
+ Finally, this namespace also holds service monitors created by other projects; for example, the Bank of Sirius
+ deployment currently creates a service monitor for each of the Postgres exporters it deploys.
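A "service monitor" here is a `ServiceMonitor` custom resource read by the Prometheus Operator. The sketch below is
illustrative only: the resource name, labels, and port are hypothetical and need to line up with the labels on your
Service and with the label selector your Prometheus Operator installation is configured to watch.

```yaml
# Minimal, illustrative ServiceMonitor. Every name and label below is a placeholder;
# align them with your Service and with the operator's serviceMonitorSelector.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    release: prometheus            # assumed label that the operator selects on
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: example-app   # must match the target Service's labels
  endpoints:
    - port: metrics                # named Service port that exposes /metrics
      interval: 30s
```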
+
+ Notes:
+ 1. The NGINX IC needs to be configured to expose prometheus metrics; this is currently done by default.
+ 2. The default address binding of the `kube-proxy` component is `127.0.0.1`, which will cause errors when the
+ canned prometheus scrape configurations are run. The fix is to set this address to `0.0.0.0` (see the sketch
+ after these notes). An example manifest has been provided in [prometheus/extras](./prometheus/extras) that can
+ be applied against your installation with `kubectl apply -f ./filename`. Please only apply this change once you
+ have verified that it will work with your version of Kubernetes.
+ 3. The _grafana_ namespace has been maintained in the configuration file and is used by the version of Grafana
+ deployed by the prometheus operator. This version only accepts a password; you can still specify a username for
+ the admin account, but it will be silently ignored.
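The manifest shipped in [prometheus/extras](./prometheus/extras) remains the authoritative fix for note 2; the
fragment below is only a sketch of the setting it adjusts. The ConfigMap (and data key) that wraps this configuration
differs between Kubernetes distributions, so everything except the `metricsBindAddress` field should be treated as an
assumption.

```yaml
# Illustrative kube-proxy configuration fragment only. The ConfigMap name and data key
# that carry this document vary by distribution; prefer the manifest in ./prometheus/extras.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# Bind the metrics endpoint to all interfaces instead of the default 127.0.0.1 so the
# canned prometheus scrape jobs can reach it (10249 is the default metrics port).
metricsBindAddress: "0.0.0.0:10249"
```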

- This also pulls data from the NGINX KIC, provided the KIC is configured to allow prometheus access (which is enabled by
- default).

### Grafana

+ **NOTE:** This deployment has been deprecated, but the project has been left as an example of how to deploy
+ Grafana in this architecture.
+
Grafana is deployed and configured with a connection to the prometheus datasource installed above. At the time of this
writing, the NGINX Plus KIC dashboard is installed as part of the initial setup. Additional datasources and dashboards
can be added by the user either in the code, or via the standard Grafana tooling.
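For anyone who does run the standalone project, one common way to add a datasource outside of the Pulumi code is
Grafana's file-based provisioning. The snippet below is only a sketch: the file location and the Prometheus service
URL are assumptions and must be adjusted to the service name and namespace your deployment actually creates.

```yaml
# Hypothetical Grafana datasource provisioning file, e.g. provisioning/datasources/prometheus.yaml.
# The url below is an assumption; point it at the prometheus service in your cluster.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-server.prometheus.svc.cluster.local
    isDefault: true
```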
@@ -188,7 +212,10 @@ As part of the Bank of Sirius deployment, we deploy a cluster-wide
[self-signed](https://cert-manager.io/docs/configuration/selfsigned/)
issuer using the cert-manager deployed above. This is then used by the Ingress object created to enable TLS access to
the application. Note that this Issuer can be changed out by the user, for example to use the
- [ACME](https://cert-manager.io/docs/configuration/acme/) issuer.
+ [ACME](https://cert-manager.io/docs/configuration/acme/) issuer. The use of the ACME issuer has been tested and works
+ without issues, provided the FQDN meets the length requirements. As of this writing, the AWS ELB hostname is too long
+ to work with the ACME server. Additional work in this area will be undertaken to provide dynamic DNS record creation
+ as part of this process so legitimate certificates can be issued.
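For readers who want to attempt that swap once a suitably short FQDN is available, a cert-manager ACME `ClusterIssuer`
looks roughly like the sketch below. The issuer name, the e-mail address, and the assumption that the NGINX Ingress
Controller answers the HTTP-01 challenge are placeholders rather than part of this project's configuration.

```yaml
# Hypothetical ACME ClusterIssuer; the name, email, and ingress class are placeholders.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # The staging endpoint avoids Let's Encrypt rate limits while testing.
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - http01:
          ingress:
            class: nginx   # assumes the NGINX IC serves the HTTP-01 challenge
```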

In order to provide visibility into the Postgres databases that are running as part of the application, the Prometheus
Postgres data exporter will be deployed into the same namespace as the application and will be configured to be scraped
@@ -204,4 +231,6 @@ provides better tools for hierarchical configuration files.

In order to help enable simple load testing, a script has been provided that uses the
`kubectl` command to port-forward monitoring and management connections to the local workstation. This command
- is [`test-forward.sh`](./extras/test-forward.sh) and is located in the [`extras`](./extras) directory.
+ is [`test-forward.sh`](./extras/test-forward.sh) and is located in the [`extras`](./extras) directory.
+
+ **NOTE:** This script has been modified to use the new Prometheus Operator based deployment.