Systems

21 KPIs

% of "dead" servers
% of (assigned) disk space quota used
% of disk space used
% of outage due to changes (planned unavailability)
[Outage due to planned changes] percentage of [total Outage]
% of outage due to incidents (unplanned unavailability)
Availability
Availability (excluding planned downtime)
[Actual uptime] percentage of [Planned uptime]
Average % of CPU utilization
Average % of memory utilization
CPU queue length average
Average (CPU queue length)
Downtime
[1 - (Availability/100)] X 365 X 24 X 60 (minutes of downtime per year, with Availability expressed as a percentage)
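As a worked illustration of the downtime formula above, the following minimal sketch converts an availability percentage into minutes of downtime per year. The function name and the input validation are assumptions made for the example; only the formula itself comes from the KPI definition.

```python
# Minutes in a non-leap year, as used by the formula: 365 X 24 X 60.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Return yearly downtime in minutes for an availability given in percent.

    Hypothetical helper: implements [1 - (Availability/100)] X 365 X 24 X 60.
    """
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # 99.9% availability corresponds to roughly 525.6 minutes per year.
    print(round(downtime_minutes_per_year(99.9), 1))
```

Under this formula, each additional "nine" of availability cuts the yearly downtime by a factor of ten.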
Maximum CPU usage
Maximum memory usage
Mean Time to Repair (MTTR)
Average time between the start of an incident and its resolution
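The MTTR definition above can be sketched as an average over incident durations. This is a minimal example, assuming each incident is recorded as a (start, resolved) pair of timestamps; the function name and the record layout are hypothetical and not part of the KPI definition.

```python
from datetime import datetime, timedelta


def mean_time_to_repair(incidents):
    """Average the time from incident start to incident resolution.

    `incidents` is an assumed layout: a list of (start, resolved) datetime pairs.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    # Sum the individual repair durations, starting from a zero timedelta.
    total = sum((resolved - start for start, resolved in incidents), timedelta())
    return total / len(incidents)


if __name__ == "__main__":
    sample = [
        (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 30)),    # 30 minutes
        (datetime(2024, 1, 5, 14, 0), datetime(2024, 1, 5, 15, 30)),  # 90 minutes
    ]
    print(mean_time_to_repair(sample))  # 1:00:00
```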
Mean Time Between Failures (MTBF)
Number of alerts on exceeding system capacity thresholds
SUM (Alerts/events exceeding system capacity thresholds)
Power draw % by systems
RISC Asset Efficiency %
asset efficiency = (current capacity X average utilization) / reference capacity
Server to System Administration Ratio
[number of servers] divided by [number of system administrators]
Unit costs of IT service(s)
[cost of providing IT service] divided by [number of units e.g. application transactions, storage GB, number of email accounts, etc.]
x86 Asset Efficiency %
asset efficiency = (current capacity X average utilization) / reference capacity
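The asset efficiency formula used for both the RISC and x86 KPIs can be illustrated with a short sketch. The source does not state the units, so this example assumes that average utilization is given as a percentage and that current and reference capacity share the same unit, which makes the result a percentage. The function name and the sample figures are hypothetical.

```python
def asset_efficiency_pct(current_capacity: float,
                         average_utilization_pct: float,
                         reference_capacity: float) -> float:
    """Return asset efficiency as a percentage.

    Implements (current capacity X average utilization) / reference capacity.
    Assumption: utilization is a percentage, capacities use the same unit.
    """
    if reference_capacity <= 0:
        raise ValueError("reference capacity must be positive")
    return (current_capacity * average_utilization_pct) / reference_capacity


if __name__ == "__main__":
    # Illustrative numbers only: 80 capacity units installed, 50% average
    # utilization, measured against a reference capacity of 100 units.
    print(asset_efficiency_pct(80, 50, 100))  # 40.0
```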