OpenText Network Operations Management (NOM) — Event & Incident Management
Blog Series: OpenText NOM — Part 3
➡ Part 1 — SNMP Explained
➡ Part 2 — Network Discovery & Monitoring
After discovery and monitoring, the next critical layer is Event & Incident Management.
Monitoring tells you what happened.
Event management tells you why it happened.
Incident management ensures it gets resolved.
This is the core of any enterprise NOC.
📌 What is Event Management?
An event is any detectable occurrence in the network:
Link Down
High CPU
Device unreachable
Interface errors
Not every event is an incident.
🖼️ Event Flow in Monitoring Systems
Event Lifecycle in NOM
1️⃣ Event Generated (polling or trap)
2️⃣ Event Normalized
3️⃣ Correlation Applied
4️⃣ Alarm Created
5️⃣ Operator Notified
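To make the lifecycle above concrete, here is a minimal Python sketch of the five steps. It is an illustration only, not NOM's internal implementation: the `Event` fields, the `SEVERITY_RANK` mapping, and the function names are all assumptions made for this example, and "correlation" is reduced to keeping one event per device and event type.

```python
from dataclasses import dataclass

# Hypothetical severity ranking; real NOM severities and field names may differ.
SEVERITY_RANK = {"critical": 3, "major": 2, "minor": 1}


@dataclass(frozen=True)
class Event:
    source: str    # device that raised the event
    kind: str      # normalized event type, e.g. "link_down"
    severity: str  # "critical", "major" or "minor"
    origin: str    # "trap" or "poll"


def normalize(raw: dict) -> Event:
    """Step 2: map a raw trap/poll record onto one common schema."""
    return Event(
        source=raw["host"].strip().lower(),
        kind=raw["type"].strip().lower().replace(" ", "_"),
        severity=raw.get("severity", "minor").lower(),
        origin=raw.get("origin", "poll"),
    )


def correlate(events: list) -> list:
    """Step 3 (simplified): keep the first event per (source, kind) pair."""
    unique = {}
    for ev in events:
        unique.setdefault((ev.source, ev.kind), ev)
    return list(unique.values())


def create_alarms(events: list, min_severity: str = "major") -> list:
    """Steps 4-5: build an alarm message for events at or above min_severity."""
    floor = SEVERITY_RANK[min_severity]
    return [
        f"ALARM {ev.severity.upper()}: {ev.kind} on {ev.source}"
        for ev in events
        if SEVERITY_RANK.get(ev.severity, 0) >= floor
    ]


# Step 1: the same failure reported by a trap and by polling, plus a minor event.
raw_events = [
    {"host": "CORE-SW1", "type": "Link Down", "severity": "Critical", "origin": "trap"},
    {"host": "core-sw1", "type": "link down", "severity": "critical", "origin": "poll"},
    {"host": "edge-rtr2", "type": "High CPU", "severity": "minor"},
]
alarms = create_alarms(correlate([normalize(r) for r in raw_events]))
print(alarms)  # ['ALARM CRITICAL: link_down on core-sw1']
```

The trap and the poll result collapse into one event after normalization, and the minor CPU event never becomes an alarm, which mirrors the point that not every event deserves operator attention.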
📌 What is Incident Management?
An incident is a service-impacting event requiring action.
Examples:
Core switch failure
Firewall outage
WAN link failure
Incident management includes:
✔ Ticket creation
✔ Assignment
✔ Escalation
✔ SLA tracking
🖼️ Incident Lifecycle
Event vs Incident
| Event | Incident |
|---|---|
| Raw alert | Business impact |
| System generated | Requires action |
| May auto-clear | Needs resolution |
Event Correlation (Root Cause Analysis)
In large networks, a single failure can generate hundreds of alerts.
Example:
Core switch down →
10 Access switches down →
200 servers unreachable →
Applications failing
Without correlation = 211 alarms
With correlation = 1 root cause alarm
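The 211-to-1 reduction above can be reproduced with a small topology-based sketch. It assumes a simple tree in which every node depends on exactly one upstream node; the `parent` map and the device names are hypothetical, and real correlation engines use richer topology and timing information.

```python
from typing import Dict, Optional, Set

# Hypothetical topology: each node maps to the upstream node it depends on.
parent: Dict[str, Optional[str]] = {"core-sw": None}
for a in range(1, 11):                      # 10 access switches
    access = f"access-sw{a}"
    parent[access] = "core-sw"
    for s in range(1, 21):                  # 20 servers per access switch = 200
        parent[f"server-{a}-{s}"] = access


def root_causes(down: Set[str], parent: Dict[str, Optional[str]]) -> Set[str]:
    """Return the down nodes whose upstream node is not itself down.

    Any down node with a down parent is treated as a symptom, not a cause.
    """
    return {node for node in down if parent.get(node) not in down}


# Core switch failure: every node behind it is reported down.
down_nodes = set(parent)
print(len(down_nodes))                      # 211 raw alarms
print(root_causes(down_nodes, parent))      # {'core-sw'}
```

If only `access-sw3` and its servers were down, the same function would return `access-sw3` as the single root cause, because its parent, the core switch, is still up.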
🖼️ Root Cause Correlation
Noise Reduction Techniques
✔ Alarm suppression
✔ Duplicate filtering
✔ Threshold tuning
✔ Maintenance window configuration
Together, these techniques reduce alert fatigue in NOC teams.
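Two of these techniques, maintenance-window suppression and duplicate filtering, are easy to sketch. The function below is an assumption-laden illustration: the tuple layout, the `maintenance` dictionary, and the 5-minute `dedup_window` are invented for the example and do not reflect NOM configuration.

```python
from datetime import datetime, timedelta


def filter_noise(events, maintenance, dedup_window=timedelta(minutes=5)):
    """Drop events in a maintenance window and repeats seen within dedup_window.

    events: list of (timestamp, device, kind) tuples, sorted by timestamp.
    maintenance: dict mapping device -> (window_start, window_end).
    """
    last_seen = {}
    kept = []
    for ts, device, kind in events:
        window = maintenance.get(device)
        if window and window[0] <= ts <= window[1]:
            continue  # suppressed: device is under planned maintenance
        key = (device, kind)
        previous = last_seen.get(key)
        last_seen[key] = ts  # updated even for duplicates, so a steady stream stays quiet
        if previous is not None and ts - previous < dedup_window:
            continue  # duplicate of a recent event
        kept.append((ts, device, kind))
    return kept


t0 = datetime(2024, 1, 1, 10, 0)
events = [
    (t0, "sw1", "link_down"),
    (t0 + timedelta(minutes=1), "sw1", "link_down"),   # duplicate
    (t0 + timedelta(minutes=2), "fw1", "high_cpu"),    # fw1 is in maintenance
    (t0 + timedelta(minutes=10), "sw1", "link_down"),  # outside the dedup window
]
maintenance = {"fw1": (t0, t0 + timedelta(hours=1))}
kept = filter_noise(events, maintenance)
print(len(kept))  # 2
```

Four raw events become two: the repeat within five minutes and the event from the device under maintenance are dropped.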
SLA & Escalation Policies
Enterprise environments define:
Severity levels (Critical, Major, Minor)
Response time targets
Escalation matrix
Example:
Severity 1 → Escalate in 15 minutes
Severity 2 → Escalate in 1 hour
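An escalation rule like this boils down to a timer check. The sketch below assumes the timer starts when the incident is opened and stops once someone acknowledges it; that interpretation, the `ESCALATION_AFTER` table, and the function name are assumptions for illustration, not NOM settings.

```python
from datetime import datetime, timedelta

# Escalation thresholds taken from the example above, keyed by severity number.
ESCALATION_AFTER = {
    1: timedelta(minutes=15),
    2: timedelta(hours=1),
}


def needs_escalation(severity, opened_at, now, acknowledged=False):
    """Return True when an unacknowledged incident has exceeded its escalation time."""
    limit = ESCALATION_AFTER.get(severity)
    if limit is None or acknowledged:
        return False  # no policy for this severity, or someone already owns it
    return now - opened_at >= limit


opened = datetime(2024, 1, 1, 9, 0)
print(needs_escalation(1, opened, opened + timedelta(minutes=20)))  # True
print(needs_escalation(2, opened, opened + timedelta(minutes=20)))  # False
```

A scheduler would typically run this check periodically against all open incidents and notify the next level in the escalation matrix when it returns `True`.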
Integration with ITSM Tools
NOM integrates with:
ServiceNow
Remedy
Jira
Event → Ticket auto-creation → Assignment → Closure.
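The ticket flow above is essentially a small state machine. The sketch below models it in plain Python without calling any ITSM product: the `Ticket` class, its states, and `ticket_from_event` are hypothetical, and a real integration would go through the API of the ITSM tool in use.

```python
from dataclasses import dataclass, field

# Allowed transitions for the simplified flow: new -> assigned -> closed.
ALLOWED = {"new": {"assigned"}, "assigned": {"closed"}, "closed": set()}


@dataclass
class Ticket:
    summary: str
    state: str = "new"
    assignee: str = ""
    history: list = field(default_factory=lambda: ["new"])

    def _move(self, target: str) -> None:
        """Change state only along an allowed transition."""
        if target not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state} to {target}")
        self.state = target
        self.history.append(target)

    def assign(self, assignee: str) -> None:
        self._move("assigned")
        self.assignee = assignee

    def close(self) -> None:
        self._move("closed")


def ticket_from_event(event: dict) -> Ticket:
    """Auto-create a ticket from an event (hypothetical event fields)."""
    return Ticket(summary=f"{event['kind']} on {event['device']}")


ticket = ticket_from_event({"kind": "wan_link_down", "device": "edge-rtr1"})
ticket.assign("noc-l2")
ticket.close()
print(ticket.history)  # ['new', 'assigned', 'closed']
```

Rejecting invalid transitions, such as closing a ticket nobody was assigned to, keeps the audit trail consistent with the flow.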
Real-World Example
Scenario:
Bandwidth spike on WAN link.
Flow:
Threshold exceeded
Event generated
Incident created
Ticket assigned
Root cause identified
Incident resolved
Post-incident review
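The threshold step in this scenario can be sketched as follows. Utilization is derived from two octet-counter samples (for example, interface counters collected by SNMP polling), and an event is raised only after several consecutive breaches so that a brief spike does not open an incident. The 80% threshold, the three-sample rule, and the function names are assumptions chosen for the example; counter wrap is not handled.

```python
def utilization_percent(octets_start, octets_end, interval_s, link_bps):
    """Link utilization in percent from two octet-counter samples."""
    bits = (octets_end - octets_start) * 8
    return 100.0 * bits / (interval_s * link_bps)


def breaches(samples, threshold=80.0, consecutive=3):
    """Return True once `consecutive` samples in a row exceed the threshold."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= consecutive:
            return True
    return False


# 3,000,000,000 octets in 300 s on a 100 Mbit/s link.
print(utilization_percent(0, 3_000_000_000, 300, 100_000_000))  # 80.0

print(breaches([70, 85, 90, 60, 82, 88, 91]))  # True: last three samples all exceed 80
print(breaches([85, 90, 60, 95]))              # False: never three in a row
```

Requiring consecutive breaches is one simple form of threshold tuning: it trades a slightly later alert for far fewer false alarms on bursty links.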
🖼️ Event to Incident Flow
Best Practices
✔ Define severity clearly
✔ Implement correlation rules
✔ Avoid alert storms
✔ Use automation
✔ Track MTTR
Key Metrics
| Metric | Meaning |
|---|---|
| MTTR | Mean Time to Repair |
| MTBF | Mean Time Between Failures |
| Event Volume | Total number of events received in a period |
| False Positive Rate | Share of alerts that needed no action (noise level) |
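These metrics are simple to compute from incident records. The sketch below uses one common set of definitions: MTTR as the average open-to-resolved time, MTBF as operating time divided by the number of failures, and false positive rate as the fraction of alerts that needed no action. The function names and the sample data are made up for illustration.

```python
from datetime import datetime, timedelta


def mttr(incidents):
    """Mean Time to Repair: average of (resolved - opened) over all incidents."""
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


def mtbf(incidents, period_start, period_end):
    """Mean Time Between Failures: uptime in the period divided by failure count."""
    downtime = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return (period_end - period_start - downtime) / len(incidents)


def false_positive_rate(total_alerts, false_alerts):
    """Fraction of alerts that turned out to need no action."""
    return false_alerts / total_alerts if total_alerts else 0.0


start = datetime(2024, 1, 1)
end = start + timedelta(days=30)  # 720-hour reporting period
incidents = [
    (start + timedelta(days=2), start + timedelta(days=2, hours=1)),
    (start + timedelta(days=10), start + timedelta(days=10, hours=2)),
    (start + timedelta(days=20), start + timedelta(days=20, hours=3)),
]
print(mttr(incidents))                 # 2:00:00
print(mtbf(incidents, start, end))     # 9 days, 22:00:00 (238 hours)
print(false_positive_rate(1000, 250))  # 0.25
```

Tracking these per month makes it visible whether correlation rules and threshold tuning are actually reducing noise and repair time.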
🎯 Conclusion
Discovery gives visibility.
Monitoring gives metrics.
Event management gives intelligence.
Incident management ensures resolution.
This completes the operational backbone of enterprise network monitoring.
💼 Professional Support Available
If you run into problems on real-world projects involving enterprise backend development or workflow automation, I offer paid consulting, production debugging, project support, and targeted training.
Technologies covered include Java, Spring Boot, PL/SQL, Azure, CMS, and workflow automation (jBPM, Camunda BPM, RHPAM), DMN/Drools.
📧 Contact: ishikhanirankari@gmail.com | info@realtechnologiesindia.com
🌐 Website: IT Trainings | Digital lectern | Digital rostrum | Digital metal podium