4,735 questions
0
votes
0
answers
24
views
How to set alerts on time correlation between logs in Datadog?
For example, we have multiple logs that share the same structure:
Order Created { ..."trace": { "order_id":123456, ... }}
Order Paid { ..."trace": { "order_id":...
1
vote
0
answers
49
views
How to solve "invalid format of evaluation results for the alert definition" error?
I have this PromQL query:
label_replace(
histogram_quantile(
0.99,
sum(
rate(
http_request_duration_seconds_bucket{
env="dev...
1
vote
0
answers
39
views
Android 14+: How to monitor permissions at runtime?
I am looking for broadcast receivers that fire when one of these permissions changes:
Battery optimization
Coarse & Fine Location
Post Notifications
Read Phone State
Background ...
0
votes
0
answers
49
views
New Relic Cassandra integration shows only a few keyspaces/tables (29), how to monitor all?
I set up the New Relic Cassandra integration with my Cassandra cluster.
Issue:
The integration is only capturing a small subset of keyspaces and tables (around 29 tables).
Most of the keyspaces/...
3
votes
0
answers
91
views
How do I generate a stable document ID for SQL executions when polling Oracle gv$session into Elasticsearch via Logstash?
I’m building a pipeline that polls Oracle’s gv$session joined with gv$sql every 5 seconds to track query executions. Each poll returns multiple rows (one per active session), and I need to aggregate ...
1
vote
1
answer
96
views
Why do Python Prometheus client collectors create a metric object every time the collect method is invoked?
https://github.com/prometheus/client_python/blob/master/prometheus_client/gc_collector.py
import gc
import platform
from typing import Iterable
from .metrics_core import CounterMetricFamily, Metric
...
0
votes
0
answers
34
views
Apache reverse proxy: how to send a notification whenever a 503 Service Unavailable happens
I have a server running on port 8080. I have configured an Apache server as a reverse proxy to forward requests on port 80 to localhost:8080.
I guess that Apache detects whenever any "503 service ...
0
votes
0
answers
49
views
topk 10 function in PromQL query
I was able to use the query below to list the top 10 CPU utilization in Grafana, which works fine. However, in the alerting system, no alert is generated.
topk(10, 100 - (rate(...
1
vote
1
answer
107
views
Apache OMD Thruk 403 Forbidden - after inactivity without configuration changes
In a monitoring environment based on OMD/Thruk (Ubuntu 22.04), we're experiencing a blocking issue:
Accessing `https://<IP>/<site>/thruk` redirects to `/omd` and then immediately returns a ...
0
votes
0
answers
47
views
Trace Elasticsearch API call
TL;DR: I implemented a clumsy way to track a search request through various points in the network. Is there a better way?
I'm using the Elasticsearch SDK to perform searches. The search call traverses ...
0
votes
0
answers
66
views
Is there any way to calculate resource requirements for Prometheus?
Hello everyone, I'm pretty new to the whole monitoring world. I have a requirement to monitor Univention UCS with 10 nodes from different instances so that Grafana can visualize ...
0
votes
0
answers
71
views
PRTG unable to display value
I have a script that checks the last package update time, converts it, and subtracts it from the current date. It works fine on Red Hat-based OSes but not on SUSE.
Here is the script:
#!/...
0
votes
1
answer
83
views
How to profile/monitor a KDB tickerplant to trace causes of a slow tickerplant?
I'm trying to use KDB as a low-latency pub/sub message broker that persists all messages in a queryable format.
However, I'm noticing the latency from when the tickerplant receives a message (i.e. ...
0
votes
0
answers
42
views
Create a report to show total downtime of a GCP project over 6 months
I am trying to report on service availability in GCP. We are expected to provide 90% uptime, and I know that we do. I have an uptime check where I can see the incidents and how long the server was ...
0
votes
1
answer
123
views
How to create an Alert Rule for Scale Up/Down in Azure Application Gateway Autoscale?
I am trying to create an alert rule for my Azure Application Gateway instance that triggers when the instance scales up or scales down in autoscale. My current configuration sets the minimum instance ...
0
votes
1
answer
356
views
OpenTelemetry Metrics Not Exporting to VictoriaMetrics & Service Name Issue in Spring Boot
I am trying to integrate OpenTelemetry with a Spring Boot application and export metrics to VictoriaMetrics. Below is my setup:
Java Classes & Configurations
RestartMetrics.java
public class ...
0
votes
1
answer
74
views
Python logging - already logging to a rotating file; want to send an email when logger.warn() or worse is called
I've seen it mentioned that with Python's built-in logging functionality you can create multiple handlers to output logs of different levels to different locations. I've already set up my logger with a ...
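For reference, the multi-handler setup described here can be sketched as follows. This is a minimal sketch, not the asker's actual configuration: the `build_logger` helper, the log path, the mail host, and the addresses are hypothetical placeholders. It attaches a `logging.handlers.RotatingFileHandler` for the file and a `logging.handlers.SMTPHandler` whose level is set to `WARNING`, so only warnings and worse are emailed.

```python
import logging
import logging.handlers


def build_logger(name, log_path, mailhost, fromaddr, toaddrs):
    """Return a logger that writes every record to a rotating file and
    emails records at WARNING or above. All arguments are placeholders."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let the handlers do the filtering

    fmt = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")

    # delay=True postpones opening the file until the first record is written.
    file_handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=1_000_000, backupCount=3, delay=True
    )
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(fmt)

    # The SMTP connection is only opened when a record is actually emitted.
    mail_handler = logging.handlers.SMTPHandler(
        mailhost=mailhost,
        fromaddr=fromaddr,
        toaddrs=toaddrs,
        subject="Application warning",
    )
    mail_handler.setLevel(logging.WARNING)
    mail_handler.setFormatter(fmt)

    logger.addHandler(file_handler)
    logger.addHandler(mail_handler)
    return logger


# Hypothetical usage (hosts and addresses are made up):
# log = build_logger("app", "app.log", ("smtp.example.com", 25),
#                    "app@example.com", ["ops@example.com"])
# log.info("written to the file only")
# log.warning("written to the file and emailed")
```

Because each handler carries its own level, the file keeps the full history while the mailbox only sees `WARNING`, `ERROR`, and `CRITICAL`. Note that `logger.warning()` is the current spelling; `logger.warn()` is a deprecated alias.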
0
votes
0
answers
143
views
Grafana OnCall Telegram bot integration
So basically I am trying to connect my Grafana OnCall to Telegram for testing. I am using the hobby version with docker-compose (I will attach it too), but I am exposing my engine with ngrok so that Telegram is able ...
-1
votes
1
answer
467
views
What is "Baggage" in Spring Boot tracing?
I read the Spring Boot documentation about observability and saw the section about Baggage:
You can create baggage with the Tracer API:
@Component
class CreatingBaggage {
private final Tracer tracer;
...
1
vote
1
answer
608
views
Azure Container App status running or down
How can I check whether an Azure Container App is running and create an alert if it is down?
I use Azure and ACA with a Node.js backend application.
0
votes
0
answers
45
views
Kafka Connect 3.7.0 Connector Task Startup Metrics
I recently onboarded to Kafka Connect 3.7.0 with an S3 Sink Connector to test a prototype, but I am unable to see task startup attempts/successes metrics be non-zero while my connector is running.
The ...
0
votes
0
answers
317
views
How to set dynamic labels for Opsgenie notification priority?
In Grafana, I have an alert that sends notifications to Opsgenie. Each Opsgenie notification has a priority level, P1-P5, with P1 being the most important. To set this priority level for Opsgenie in ...
2
votes
1
answer
382
views
Monitoring file IO read and write events on Windows
I want to monitor access to some files on Windows.
What I need:
Watching ALL connected drives for READ or WRITE events on files (NOT the create/delete events), doing the filtering later
getting those ...
0
votes
0
answers
107
views
"numpy.core.multiarray failed to import" Error During the Monitoring job at Model Performance Metrics Computation step in Azure ML Workspace
I am encountering an issue while running a monitoring pipeline job in Azure Machine Learning. During the "Model Performance - Compute Metrics" step, I receive the following error:
"...
-1
votes
1
answer
39
views
Hits by parent request under 1
Based on the online help of JavaMelody (http://javamelody.org/demo/monitoring?resource=help/help.html), the number of hits is equal to the number of executions.
I am noticing in my SQL monitoring that I have ...
0
votes
0
answers
57
views
Problems with the sensor of type Power for GPU in LibreHardwareMonitorLib
I am working with a C# service that uses the LibreHardwareMonitorLib library to gather hardware metrics, including GPU power consumption data. However, I have encountered an issue with the Power ...
0
votes
0
answers
69
views
Prometheus target health not coming up
I am trying to add monitoring to my project with Prometheus and Grafana. I checked my localhost:9090/targets but I see that out of 3 services, only one comes up, which happens to be Prometheus. My ...
0
votes
0
answers
44
views
Cardinality in Prometheus
I am not clear on the concept of cardinality in Prometheus (and downstream components like Grafana).
Imagine a case where metrics have two labels, like so:
some_metric{label_a="1", label_b=...
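To make the cardinality question concrete, here is a minimal sketch in Python. In Prometheus, each distinct combination of label values on a metric name is its own time series, so cardinality is the number of distinct label sets. The label values below (`"x"`, `"y"`, `"2"`) and both helper names are hypothetical and only serve the illustration.

```python
def series_count(samples):
    """Count the distinct label sets actually observed for one metric name."""
    return len({tuple(sorted(labels.items())) for labels in samples})


def upper_bound(label_values):
    """Worst case: every value of each label occurs with every value of the others."""
    total = 1
    for values in label_values.values():
        total *= len(set(values))
    return total


# Hypothetical observations for some_metric
observed = [
    {"label_a": "1", "label_b": "x"},
    {"label_a": "1", "label_b": "y"},
    {"label_a": "2", "label_b": "x"},
    {"label_a": "1", "label_b": "x"},  # same label set as the first: same series
]

actual = series_count(observed)
worst_case = upper_bound({"label_a": ["1", "2"], "label_b": ["x", "y"]})
```

Under these assumed values the metric has three series in practice and at most four in the worst case. The product of per-label value counts is only an upper bound; what matters is how many combinations actually occur.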
0
votes
1
answer
55
views
Azure Logic Apps - Queries for telemetry 2.0 show no results
I want to use telemetry 2.0 for Azure Logic Apps, so I updated host.json with the following code (as explained by Microsoft):
{
"version": "2.0",
"...
0
votes
1
answer
236
views
ClickHouse Cluster Monitoring
I have a ClickHouse cluster with 3 shards. I want to monitor this cluster using Prometheus, but the metrics provided by ClickHouse do not include data related to shard activity or availability.
I ...
0
votes
0
answers
118
views
YACE returning last non-null value for S3 replication metric, instead of null/NaN
I'm attempting to scrape S3 replication metrics, and have observed some odd behaviour with one of the metrics, OperationsFailedReplication.
On CloudWatch, this metric presents a data point when there ...
0
votes
1
answer
108
views
Vector.dev how to split configuration into files
I'm trying to split the configuration into files
├── sinks
│   └── sin.yaml
├── sources
│   └── sss.yaml
├── transforms
│   └── ttt.yaml
└── vector.yaml
cat sources/sss.yaml
type: "kafka"
...
0
votes
0
answers
41
views
RabbitMQ HTTP API Endpoint does not display the number of messages
I would like to query the status of my RabbitMQ messages so that I can log them and alert on them via a logging tool.
Unfortunately, this curl does not return the number of messages, as can be read in ...
0
votes
0
answers
114
views
Zabbix agent not getting disk info
I have a Zabbix agent installed on a Windows Server 2012 machine called SRV12 that uses a Zabbix proxy to collect data from every resource on the computer and send it to my Zabbix server that I have ...
0
votes
0
answers
113
views
MQL/PromQL CPU query for Google Cloud Monitoring
I'd like to create a query that would basically give me the CPU metrics for the top 5 used containers, but everything I've tried just doesn't work at all. The CPU metric can be something like "...
0
votes
1
answer
610
views
Why should I use opentelemetry-spring-boot-starter, and which method should be used instead? [closed]
I have a question related to integrating Spring Boot with OpenTelemetry. I am planning to use the dependency opentelemetry-spring-boot-starter. My questions are: When should I use this approach?
Why ...
-1
votes
1
answer
97
views
Apache SkyWalking Java agent trace changes
I have a simplified project reflecting the problem here
The bottom line is this: there is a producer who puts a message in RabbitMQ, there is a consumer who consumes the message from there and ...
0
votes
0
answers
60
views
Migrating grafana-loki data from binary installation to kubernetes installation
I currently have a Loki installation done using the Loki binary download and running as a service on a VM.
I am creating a Kubernetes cluster for the monitoring tools we use today (Loki, Grafana, ...
1
vote
0
answers
768
views
Customizing Slack Notifications for Uptime Kuma
I have set up Uptime Kuma to monitor some of my website URLs. If any URL goes down, it sends a notification to my Slack channel. I have also customized the Slack message format.
How I customized the ...
0
votes
1
answer
80
views
Logs Not Flowing from Function App deployed under ASE to its own Application Insights
We are experiencing an issue where logs are not flowing from our Azure Function App to Application Insights. The Function App has been deployed under an App Service Environment (ASE), and the ...
0
votes
1
answer
54
views
Pynput not logging alphanumerical keys on Mac
I want to detect when I press the "space" bar key using python and pynput.
I tried their documentation and my old code, but nothing works anymore on macOS 15.1.1. Even running the script using ...
0
votes
0
answers
136
views
How to delete latest data in Zabbix
def lambda_handler(event, context):
    token = get_zabbix_token()
    failed = False
    # Define metric functions to iterate over
    metric_functions = {
        # "ELB": ...
0
votes
1
answer
91
views
How to stop monitoring process in Flask/Python?
In my project I have a USB monitor running, but I see it interferes with other functions, so I want to start and stop it when necessary. In my code the start function works, but the stop one does not; I can't ...
0
votes
0
answers
39
views
How to produce a very fast empty HTTPS reply from nginx?
I have an nginx load balancer and, behind it, two application servers with Apache and Tomcat.
I need to create a very fast HTTPS response server to let the client monitor the speed of its connection by measuring ...
0
votes
1
answer
1k
views
Prometheus Thanos Receiver Error: content deadline exceeded
I'm trying to set up a remote-write Thanos receiver with Docker Compose on a VirtualBox VM, but for some reason the receiver API endpoint crashes any time Prometheus sends a POST request:
Error 500 msg:...
0
votes
1
answer
166
views
Why is there no SQL statement in MON$STATEMENTS.MON$SQL_TEXT for all active transactions in Firebird 2.5
I am trying to find the SQL statements for my active transactions and specifically for OAT.
I am running this query:
select MT.MON$TRANSACTION_ID,MT.MON$TIMESTAMP,MS.MON$STATEMENT_ID,MS.MON$SQL_TEXT
...
2
votes
2
answers
84
views
Check if SMTP server is online
How to check if SMTP is online and running?
func (p *sys_ping) smtp(host string) string {
    client, err := smtp.Dial(host)
    if err != nil {
        time.Sleep(time.Duration(p.delay) * time.Second)...
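For comparison, the same kind of check can be sketched in Python rather than the Go of the question. This is a minimal sketch under stated assumptions: the `smtp_is_online` name, the default port 25, and the timeout are hypothetical choices, and "online" is taken to mean that the server accepts the connection, sends its greeting, and answers a `NOOP` command with a 2xx code.

```python
import smtplib


def smtp_is_online(host, port=25, timeout=5.0):
    """Return True if an SMTP server at host:port greets us and answers NOOP."""
    try:
        # The constructor connects and reads the server greeting;
        # leaving the with-block sends QUIT.
        with smtplib.SMTP(host, port, timeout=timeout) as client:
            code, _message = client.noop()
            return 200 <= code < 300
    except (OSError, smtplib.SMTPException):
        # Refused connection, timeout, DNS failure, or a protocol-level error.
        return False
```

A bare TCP connect would only show that the port is open; waiting for the greeting and a `NOOP` reply also confirms that something speaking SMTP is behind it.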
1
vote
0
answers
323
views
OpenTelemetry context propagation in NestJS app
Hi there. I ran into a problem configuring OTEL for NestJS. I want to add additional information to the span attributes and propagate it across the whole trace. I use baggage. My entry point is ...
0
votes
1
answer
205
views
Error when creating a new item prototype (zabbix proxmox)
I'm using Zabbix 7.0.4 and I'm using the template Proxmox VE by HTTP that has predefined item prototypes like these:
I want to create a new item prototype for Proxmox VE by HTTP that calculates the % ...
4
votes
4
answers
225
views
Find the statement currently running in my PL/pgSQL code block
Is there a way to figure out which statement from the block is currently running in Postgres? (Even extra extensions or tracing might be an option)
Below is the quick way to reproduce but in real ...