Managed Apache Flink metrics

Amazon Managed Apache Flink enables you to author and run code against streaming sources to perform time-series analytics, feed real-time dashboards, and create real-time metrics. Ensure your cloud platform is configured in SolarWinds Observability SaaS to collect this service's data. See Add an AWS cloud account. CloudWatch metrics must also be enabled for this service in the AWS Console for the metric data to be available.

Many of the collected metrics from Managed Apache Flink entities are displayed as widgets in SolarWinds Observability explorers; additional metrics may be collected and available in the Metrics Explorer. You can also create an alert for when an entity's metric value moves out of a specific range. See Entities in SolarWinds Observability SaaS for information about entity types in SolarWinds Observability SaaS.

The following table lists some of the metrics collected for these entities. To see the Managed Apache Flink metrics in the Metrics Explorer, type AWS.KinesisAnalytics in the search box.

Metric	Units	Description
`sw.metrics.healthscore`	Percent (%)	Health score. The health state provides real-time insight into the overall health and performance of your monitored entities. The health state is determined based on anomalies detected for the entity, alerts triggered for the entity's metrics, and the status of the entity. The health state is displayed as one of the following four states and colors: Good, Moderate, Bad, or Unknown. You can determine the impact of the alerts, anomalies, and statuses on the health of an entity type by going to Settings > Health, and selecting a specific entity type. You can also customize the impact. To view the health of Managed Apache Flink entities in the Metrics Explorer, filter the `sw.metrics.healthscore` metric by `entity_types` and select `awsapacheinstance`.
`AWS.KinesisAnalysis.uptime`	milliseconds (ms)	The time that the job has been running without interruption.
`AWS.KinesisAnalysis.lastCheckpointSize`	bytes	The total size of the last checkpoint.
`AWS.KinesisAnalysis.lastCheckpointDuration`	milliseconds (ms)	The time it took to complete the last checkpoint.
`AWS.KinesisAnalysis.cpuUtilization`	Percent (%)	Overall percentage of CPU utilization across task managers.
`AWS.KinesisAnalysis.containerCPUUtilization`	Percent (%)	Overall percentage of CPU utilization across task manager containers in the Flink application cluster.
`AWS.KinesisAnalysis.containerMemoryUtilization`	Percent (%)	Overall percentage of memory utilization across task manager containers in the Flink application cluster.
`AWS.KinesisAnalysis.containerDiskUtilization`	Percent (%)	Overall percentage of disk utilization across task manager containers in the Flink application cluster.
`AWS.KinesisAnalysis.heapMemoryUtilization`	Percent (%)	Overall heap memory utilization across task managers.
`AWS.KinesisAnalysis.downtime`	milliseconds (ms)	For jobs currently in a failing or recovering state, the time elapsed during the current outage.
`AWS.KinesisAnalysis.fullRestarts`	Count	The total number of times this job has fully restarted since it was submitted.
`AWS.KinesisAnalysis.managedMemoryUtilization`	Percent (%)	The percentage of managed memory in use, derived as `managedMemoryUsed`/`managedMemoryTotal`.
`AWS.KinesisAnalysis.numRecordsInPerSecond`	Count per second	The total number of records this application, operator, or task has received per second.
`AWS.KinesisAnalysis.numRecordsOutPerSecond`	Count per second	The total number of records this application, operator, or task has emitted per second.
`AWS.KinesisAnalysis.threadcount`	Count	The total number of live threads used by the application.
`AWS.KinesisAnalysis.backPressuredTimeMsPerSecond`	milliseconds (ms)	The time this task or operator is back pressured per second.
`AWS.KinesisAnalysis.busyTimeMsPerSecond`	milliseconds (ms)	The time this task or operator is busy (neither idle nor back pressured) per second.
`AWS.KinesisAnalysis.currentInputWatermark`	milliseconds (ms)	The last watermark this application/operator/task/thread has received.
`AWS.KinesisAnalysis.currentOutputWatermark`	milliseconds (ms)	The last watermark this application/operator/task/thread has emitted.
`AWS.KinesisAnalysis.idleTimeMsPerSecond`	milliseconds (ms)	The time this task or operator is idle per second.
`AWS.KinesisAnalysis.managedMemoryUsed`	bytes	The amount of managed memory currently used.
`AWS.KinesisAnalysis.managedMemoryTotal`	bytes	The total amount of managed memory.
`AWS.KinesisAnalysis.numberOfFailedCheckpoints`	Count	The number of times checkpointing has failed.
`AWS.KinesisAnalysis.numRecordsIn`	Count	The total number of records this application, operator, or task has received.
`AWS.KinesisAnalysis.numRecordsOut`	Count	The total number of records this application, operator, or task has emitted.
`AWS.KinesisAnalysis.numLateRecordsDropped`	Count	The number of records that were dropped because they arrived late and were beyond the processing window.
`AWS.KinesisAnalysis.oldGenerationGCcount`	Count	The number of times the old generation garbage collection has occurred.
`AWS.KinesisAnalysis.oldGenerationGCTime`	milliseconds (ms)	The total time spent on old generation garbage collection.
`AWS.KinesisAnalysis.millisBehindLatest`	milliseconds (ms)	Indicates how many milliseconds behind the latest data the application is.
`AWS.KinesisAnalysis.bytesRequestedPerFetch`	bytes	The number of bytes requested per fetch operation from the data stream.
`AWS.KinesisAnalysis.currentoffsets`	Count	The current offsets of the data being processed in a Kinesis Data Analytics application.
`AWS.KinesisAnalysis.commitsFailed`	Count	The number of failed commit attempts in the application.
`AWS.KinesisAnalysis.commitsSucceeded`	Count	The number of successful commit operations.
`AWS.KinesisAnalysis.committedoffsets`	Count	The number of offsets that have been successfully committed.
`AWS.KinesisAnalysis.records_lag_max`	Count	The maximum lag, in number of records, between the latest available record and the record being processed.
`AWS.KinesisAnalysis.bytes_consumed_rate`	bytes	The rate at which data is consumed from the Kinesis stream.
`AWS.KinesisAnalysis.zeppelinCpuUtilization`	Percent (%)	The percentage of CPU resources being used by the Zeppelin server.
`AWS.KinesisAnalysis.zeppelinHeapMemoryUtilization`	Percent (%)	The percentage of heap memory utilized by the Zeppelin server.
`AWS.KinesisAnalysis.zeppelinThreadcount`	Count	The number of active threads being used by the Zeppelin server.
`AWS.KinesisAnalysis.zeppelinWaitingJobs`	Count	The number of jobs waiting to be executed in the Zeppelin server.
`AWS.KinesisAnalysis.zeppelinServerUptime`	seconds (s)	The uptime of the Zeppelin server, indicating how long it has been running continuously.
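
Some of the metrics above are related to each other, which can help when reading them together. The following Python sketch is illustrative only and is not part of SolarWinds Observability SaaS or the AWS API. It assumes that `managedMemoryUtilization` is the used/total ratio expressed as a percentage, and that the per-second time metrics (`busyTimeMsPerSecond`, `idleTimeMsPerSecond`, `backPressuredTimeMsPerSecond`) can be read as shares of a 1000 ms interval. The function names and sample values are hypothetical.

```python
def utilization_percent(used_bytes: float, total_bytes: float) -> float:
    """Return used/total as a percentage (assumed form of managedMemoryUtilization).

    Returns 0.0 when the total is not positive, to avoid dividing by zero.
    """
    if total_bytes <= 0:
        return 0.0
    return 100.0 * used_bytes / total_bytes


def time_shares(busy_ms: float, idle_ms: float, back_pressured_ms: float) -> dict:
    """Convert per-second millisecond readings into fractions of one second."""
    return {
        "busy": busy_ms / 1000.0,
        "idle": idle_ms / 1000.0,
        "back_pressured": back_pressured_ms / 1000.0,
    }


if __name__ == "__main__":
    # Hypothetical readings: 256 MiB of managed memory used out of 1 GiB.
    print(utilization_percent(256 * 1024**2, 1024**3))
    # Hypothetical readings for one task over a one-second interval.
    print(time_shares(busy_ms=600, idle_ms=300, back_pressured_ms=100))
```

A task whose back-pressured share stays high over time is likely waiting on a slower downstream operator, which is one reason to view these three metrics side by side.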
