1,065 questions
0
votes
1
answer
52
views
Get accurate numbers for counter increases
Assume that I have this counter, with the server scraping these metrics every 15 seconds:
vector_component_sent_events_total{component_id=~"sink.*"}
If the counter increases by 30. As shown ...
1
vote
0
answers
49
views
How to solve "invalid format of evaluation results for the alert definition" error?
I have this promQL query:
label_replace(
histogram_quantile(
0.99,
sum(
rate(
http_request_duration_seconds_bucket{
env="dev"...
1
vote
1
answer
53
views
avg_over_time() window alignment in VictoriaMetrics query_range
I have the following VM range query that I'm executing over http from java:
http://localhost:8400/select/0/prometheus/api/v1/query_range?query=avg_over_time(numeric{bom_id = "id1", entry_id =...
0
votes
0
answers
71
views
What is the right gcp cloud monitoring promql query to capture "unspecified" (or all resource types) metrics?
The following promql query returns the logging.googleapis.com/log_entry_count metric in the global namespace:
increase({"__name__"="logging.googleapis.com/log_entry_count","...
0
votes
0
answers
40
views
PromQL query to Count and group in Grafana
I'm using the openstack_exporter and I'm trying to show the number of routers per network node, but I'm having trouble working out how to do this in PromQL. In the openstack_neutron_l3_agent_of_router ...
0
votes
1
answer
188
views
Promql warning: metric might not be a counter doesn't make sense
I have this promql query:
avg by (instance)(rate(fusion:node_cpu_seconds_total:foo{mode="steal"}[5m])) * 100
and I get lots of data, but then I get this at the end:
{
"infos": [
...
0
votes
0
answers
53
views
How to get log entries with the date mentioned as 3 months+
I wrote a Loki query to get the certificate expiry. Currently it returns the date on which the certificate is going to expire, so I use some date functions to convert it into milliseconds.
min by(serviceReplace)(min_over_time(
{job="...
-2
votes
1
answer
135
views
Reusing a regexp match across 2 metrics/labels in prometheus alerts [closed]
For a Prometheus alert I have the following expression:
node_hwmon_temp_celsius{sensor="nvme_nvme0"} >= node_hwmon_temp_crit_celsius{sensor="nvme_nvme0"}
repeated for nvme 0-5. ...
3
votes
1
answer
120
views
Prometheus/Grafana: increase() spikes during Kubernetes deploys with multi-pod counters (dynamic labels)
Context
Java service in Kubernetes, multiple pods.
Metrics exposed via Micrometer + Prometheus.
Grafana dashboards use increase(...) and sum by (...) to count events in a time range.
Counters are ...
1
vote
0
answers
54
views
Getting max number of simultaneously active CPU cores over time
I am trying to write a PromQL query that gets the maximum number of CPU cores that were being used at once in a pod over a given time period. Essentially, I want to see if a given run of an ...
0
votes
0
answers
49
views
topk 10 function in PromQL query
I was able to use the below query to list the top 10 CPU utilization in grafana, which works fine. However, in an alerting system, there is no alert generated.
topk(10, 100 - (rate(...
2
votes
1
answer
141
views
Displaying multiple OpenShift clusters version within grafana
I want to make a pie chart where I display the current versions of available OpenShift clusters. The clusters use remote_write to a central Thanos instance, from where we can use PromQL to query ...
0
votes
0
answers
49
views
How to properly graph a counter?
My code increases the counter app_test_metric_counter_total by 1 every minute,
attaching a label color with 3 possible values (green, yellow, red).
I have these metrics being exposed:
# HELP ...
0
votes
1
answer
473
views
How do I convert this alert policy MQL to PromQL?
I'm trying to convert an MQL query I'm using in a GCP alert policy to PromQL while also aggregating by user labels.
This is the MQL:
fetch gce_instance
| metric 'agent.googleapis.com/cpu/load_5m'
| ...
0
votes
0
answers
52
views
PromQL Percentile Report in Grafana
I want to build a 99th percentile report using PromQL in Grafana.
I use the following query to get the data:
ifInOctets{instance="172.6.5.4", ifAlias="ISP1"}
Now I ...
2
votes
1
answer
56
views
PromQL for CPU pressure of a certain container "class"
We deploy our GitLab runners in Kubernetes, where we offer different classes that differ in CPU/memory requests.
My goal is now to find a PromQL query that indicates a high CPU pressure over a ...
0
votes
0
answers
107
views
PromQL Max Memory over the last 30 days
I've seen a few posts from people searching for a similar answer on both SO and other forums, but have yet to see an answer. I would like to find the max memory by namespace and container in the last 30 days...
0
votes
0
answers
227
views
Grafana Variable - Split by comma and display all unique values
In Grafana v11.4.0, I enable Variable in Query Options using PromQL and I get the following response.
Query Type: Label Values
All
Apple, Orange, Grape
Onion, Apple, Grape, Strawberry
But I wanted to ...
1
vote
0
answers
32
views
Get overall percentile across non contiguous time ranges
In this Grafana heat map representing a histogram, how can I get the combined p95 of the two circled distributions?
I can get the overall p95 for the entire time range shown above using ...
0
votes
0
answers
19
views
Is there a prettier way to calculate the sum of the maximum values per day in 1 month of retrospective?
In a Prometheus dashboard I calculate the sum of the max_over_time values of selection_cnt_success_by_source
for each day, from one day ago to one week ago.
I wrote this query:
(selection_cnt_success_by_source{source="...
0
votes
0
answers
53
views
How to ignore duplicate metrics on prometheus dashboard?
I have some metrics
selection_cnt_success_by_source{container="selections", ims_system_id="5597830", instance="10.220.23.148:5006", job="selections", namespace="...
0
votes
0
answers
30
views
Re-firing Grafana alert returns value -1
My Grafana "success rate alert" returns a value of -1 when re-fired, probably because there is no data when the re-fire happens.
I've configured the alerting policy to be "Keep Last State"...
1
vote
0
answers
56
views
Grafana Prometheus Query to display Unique Value in Table
I created a Python script to send about 1.5k to 1.7k metrics every hour to Prometheus using Pushgateway.
Now I want to display only unique values in Grafana.
When I query for the last 3 months, it ...
1
vote
0
answers
76
views
Which Prometheus metric type should be used for short-lived cronjob metrics?
We use Prometheus PushGateway for collecting metrics from short-lived cronjobs. I would like to collect two metrics:
How many times a particular cronjob finished without errors.
How many times a ...
0
votes
0
answers
94
views
GCP dashboard can't use variables that have dash in values
In GCP dashboard I have a variable defined like this:
cluster_name="anthos-ooo"
When I run this PromQL query
sum(increase(logging_googleapis_com:user_my_metriv{monitored_resource="...
1
vote
0
answers
347
views
Calculating the success rate of nginx requests in Prometheus: sum of rate vs sum of counter
I have the NGINX Ingress Controller installed in a Kubernetes cluster along with Prometheus and Grafana. I was exploring the NGINX Ingress Controller dashboard that comes with the controller, which has some ...
1
vote
0
answers
160
views
Preprocessing Grafana label values before using as tag filters in promql
In my Grafana dashboard, I have a variable set up which results in values in the format name:region:id. So a variable dropdown could be:
au:sydney:0
au:mel:0
nz:wel:1
Now in my question, I want ...
1
vote
1
answer
105
views
Using predict_linear in PromQL with a dynamic range
We have a job that pulls messages off a Kafka topic. The job runs hourly and it's important that the job complete before the next hour arrives.
I'm trying to set up an alert that will tell me that ...
0
votes
1
answer
126
views
How to get only one result instead of three?
I have the following PromQL statement in Grafana:
topk(1,avg by (instance) (100 * (1 - rate(node_cpu_seconds_total{cluster="$cluster",mode="idle", instance=~"processing.+"...
0
votes
0
answers
107
views
PromQL query to get percentage of long-running requests
I am trying to find a way to write a PromQL alert that fires if 20% of the requests to a certain URL take more than 5 seconds.
Besides being super confused about how this works, I cannot find real data ...
-2
votes
1
answer
34
views
Need only the top first element of the prometheus metrics
I need only the top first element of the Prometheus metrics.
I tried topk(1,deploy_time_total{status="SUCCEEDED"}) by (imageName), but it shows not only the displayName=7 record, but also ...
0
votes
0
answers
43
views
Joining a label to a vector with many-to-one matching labels
I've got a vector with two labels with many-to-many match to each other:
metric_1{label_1="aaa",label_2="bba"}
metric_1{label_1="aaa",label_2="bbb"}
metric_1{...
0
votes
0
answers
57
views
Grafana X Axis Number Sort by not working
I use this PromQL query and get the expected result in a bar chart. My goal is to count the number of devices based on value.
count_values("count", client_snr{exported_job="...
1
vote
0
answers
170
views
How to count how many pods were started by namespace each day in PromQL?
I'm trying to answer a question of how many pods were started/scheduled/whatever per namespace per day. I have not found any useful counter-type metric that would be counting that, just related gauges ...
1
vote
0
answers
69
views
Rate and increase extrapolation in prometheus when service is redeploying
I have a counter for successful orders. But when my service is redeploying, the rate drops for some period, while the increase does not change the trend. Please help me figure out why this might be ...
0
votes
0
answers
46
views
How to sum a Prometheus counter with a small amount of data
Using OpenTelemetry I track user actions (button clicks, let's say) that are persisted in a Prometheus counter:
{__name__="user_actions_total", instance="1", user_id="123"}
...
0
votes
0
answers
113
views
MQL/PromQL CPU query for Google Cloud Monitoring
I'd like to create a query that would basically give me the CPU metrics for the top 5 used containers, but everything I've tried just doesn't work at all. The CPU metric can be something like "...
0
votes
0
answers
88
views
Grafana & Prometheus how to group values by label after regex applied?
I've been trying to create a query that displays the average latency of each path on my application, while using Promql on Grafana:
sum by(path) (rate(http_server_latency_milliseconds_sum[$...
0
votes
0
answers
234
views
How to group a prometheus metric by day?
I'm using Prometheus and Grafana, and I'd like to create a graph for the total number of HTTP requests served by day.
Our application exposes the current count, and therefore, I'm using this code to ...
0
votes
0
answers
54
views
Prometheus - how to group 2 metrics by label and filter one with another?
I have 2 metrics:
levels{set_id, instance_id}
levels_expected{set_id}
I need to group both by set_id and count all sets where all instance_id values of levels equal the levels_expected value.
Ex.:
...
0
votes
0
answers
87
views
PromQL query for how long a process ran over time
Using Prometheus and Grafana, I want to show how long a Windows process has been running in a selected time range.
With:
time() - min by(process) (windows_process_start_time{process="foo"...
0
votes
0
answers
48
views
Why does metric subtraction in a Grafana dashboard return No data?
A Prometheus exporter prepares these metrics:
# HELP selection_cnt_success_by_source Selection count success by source
# TYPE selection_cnt_success_by_source gauge
selection_cnt_success_by_source{source="...
0
votes
0
answers
15
views
How to capture a partial consumer group failure in Prometheus using PromQL?
I want to explore the possibilities of capturing any sort of partial consumer group failure, whatever the reason, using PromQL in Prometheus, so that I can use those metrics for monitoring and later add a ...
0
votes
0
answers
84
views
Calculate QoS (Quality of Service) in % in Grafana using PromQL
I have a Node.js application with a counter and a histogram set up like this:
this.express.use((req, res, next) => {
const start = Date.now();
res.on('finish', () => {
...
2
votes
0
answers
279
views
Is it possible to disable a Prometheus Alert without removing it?
I would like to know how to disable a Prometheus alert without deleting it.
I thought about adding "AND false" at the end of the alert query, but I'm not sure if this approach will ...
0
votes
0
answers
160
views
if-else condition in Promql using Grafana Variables
I have a custom variable "Project_Name" in a Grafana dashboard. I want to execute the condition below using a PromQL query in the Stackdriver Metrics data source (GCP):
if ($Project_Name == 'ALL') then :
...
0
votes
1
answer
415
views
PromQL join with identical label value but different label name
I am trying to combine 2 metrics in a promQL query using the Grafana example here:
https://grafana.com/docs/grafana-cloud/monitor-infrastructure/monitor-cloud-provider/aws/cloudwatch-metrics/query-tag-...
1
vote
0
answers
62
views
Does Grafana support the PromQL modifier @?
I'm trying to visualize a Prometheus query in Grafana, but I'm facing an issue. The query:
mymetric{mylabel="value"} @ end()
works perfectly in Prometheus and returns a valid graph. However, ...
0
votes
0
answers
112
views
Calculating API Request Rate Over time by status code and path
I'm using the below expression to calculate the rate of requests by status_code and path:
sum(rate(http_request_duration_seconds_count[10m])) by (status_code, path) > 0
I'm getting abnormally high ...
0
votes
0
answers
46
views
If else condition for PromQL
I am trying to write a promql query to achieve the following
Output total number of times that my_metric was < 1 grouped by namespace and for all other namespaces where there are no series of ...