<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Grafana on Marat Kiniabulatov | Agile Coach, OKR, PMO</title><link>https://maratkee.com/tags/grafana/</link><description>Recent content in Grafana on Marat Kiniabulatov | Agile Coach, OKR, PMO</description><generator>Hugo</generator><language>en-US</language><lastBuildDate>Tue, 19 Sep 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://maratkee.com/tags/grafana/index.xml" rel="self" type="application/rss+xml"/><item><title>5 essential NOC Metrics to reach high uptime and detect potential outages</title><link>https://maratkee.com/posts/2023-09-19-5-essential-noc-metrics-to-reach-high-uptime-and-detect-potential-outages/</link><pubDate>Tue, 19 Sep 2023 00:00:00 +0000</pubDate><guid>https://maratkee.com/posts/2023-09-19-5-essential-noc-metrics-to-reach-high-uptime-and-detect-potential-outages/</guid><description>&lt;p&gt;My latest tenure of 2.5 years has been closely tied to designing and adopting an Incident Management Framework (as part of the Program Management org). The work had two primary objectives:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Reach and maintain 99.99% uptime for our APIs and SDKs.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Ensure engineering is always the firsthand source of information about any potential outage that could result in downtime.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In our early days, we lacked a comprehensive alerting and monitoring system. Establishing the Network Operations Center (NOC) Team was our strategic move to build a robust one and take charge of Incident Management. We not only reached 99.98% uptime but also raised the share of incidents we spotted ahead of our merchants from 60% to 95% and higher.&lt;/p&gt;</description></item></channel></rss>