2

Half a day wasted. FUCK!

I use Grafana Loki and Mimir/Prometheus for telemetry. A few days ago I queried Loki to see if logging was still working. Yesterday I changed the datasource to Mimir, changed the query parameters to get metrics from another env, ran the query, and... the Mimir querier crashed.

Wtf.
Error says it got too much data to chew on.

So I spent 4 hours playing with the querier and gRPC limits, balancing between limit errors and OOMKills [2G RAM].

I got suspicious about oomk. Why would it...

Then I tried to shrink the timeframe to 15min. Still oomk. Down to 5min -- now it worked. But the number of different metrics returned was over 1k.

Then I looked once again at the query. And ofc it was ´{env="prod"}´

Turns out, forgetting that you're querying metrics with a logs query is an expensive and frustrating mistake. Esp. at 3am.

idk why it even returned me anything...
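
For what it's worth, the likely reason it returned anything: as far as I know, a PromQL selector with only label matchers and no metric name is valid, and it matches every series carrying that label. Hence the 1k+ metrics. Below is a rough sketch of a pre-flight check that could catch this. The name `is_bare_selector` is made up for illustration, and it is a regex heuristic, not a real PromQL parser (it will miss selectors whose label values contain braces).

```python
import re

# Matches a query that is ONLY a {label matchers} selector,
# optionally followed by a range like [5m]. No metric name in front.
_BARE_SELECTOR = re.compile(r'^\s*\{[^{}]*\}\s*(\[[^\]]*\])?\s*$')


def is_bare_selector(query: str) -> bool:
    """Return True if the query looks like a Loki-style selector with no metric name.

    Hypothetical helper: a heuristic guard, not a PromQL parser.
    """
    if not _BARE_SELECTOR.match(query):
        return False
    # A __name__ matcher inside the braces still narrows the metric name,
    # so do not flag those.
    return "__name__" not in query


if __name__ == "__main__":
    for q in ('{env="prod"}', 'up{env="prod"}', '{__name__="up", env="prod"}'):
        print(q, "->", "bare selector, think twice" if is_bare_selector(q) else "ok")
```

Running something like this before sending a query to a metrics datasource would have flagged the copy-pasted logs selector instead of letting the querier try to load every prod series.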

Comments
  • 2
    Managing to waste half a day before 8 in the morning, that's not bad.
  • 1
    @ScriptCoded as Duolingo puts it: "I'm making great strides" in that regard :)
  • 2
    3am work is always a lifetime experience
  • 0
    @retoor That's a good point, I should use it more often.

    However, there's still this little matter of attributing the cause of the "badness" of the day to.. the day. Because it might just as well be a skill issue, PEBCAK, someone else getting in your way, etc.
  • 0
    @retoor and it's more fun :)