Cloud Platform Performance Monitoring



To monitor interworks.cloud Platform performance effectively, and to understand how platform usage affects both actual performance and the end-user experience, collect and retain metrics and key performance indicators (KPIs) from all platform server components.



Basic Server (infrastructure) performance counters


These counters apply to both the database server and the web/application servers (BSS and Storefront servers):

  • CPU usage
  • Memory consumption
  • I/O traffic (Read/Write IOPS)
  • I/O latency
  • Disk space
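
As an illustration of how one of these counters could be sampled, the following is a minimal sketch that checks disk space with the Python standard library. The function names and the 85% alert threshold are assumptions made for this example, not platform recommendations. CPU, memory, and I/O counters are normally collected by a monitoring agent and are not covered here.

```python
import shutil


def disk_usage_percent(path: str) -> float:
    """Return the percentage of space used on the volume that contains `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some virtual file systems report a total size of zero.
        return 0.0
    return 100.0 * usage.used / usage.total


def disk_space_alert(path: str, threshold_percent: float = 85.0) -> bool:
    """Return True when used space reaches the threshold (85% is an assumed default)."""
    return disk_usage_percent(path) >= threshold_percent


if __name__ == "__main__":
    # "." is a placeholder; on a Windows server this would typically be a drive such as "C:\\".
    print(f"Used: {disk_usage_percent('.'):.1f}%  alert: {disk_space_alert('.')}")
```

Running the check on a schedule and storing the results makes it possible to see how quickly disk usage grows over time.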


Application/Web Server performance counters


The following metrics can help identify potential bottlenecks:


| Metric | Description | Type |
| --- | --- | --- |
| Response time | Average web request response time | Application Insights (average) |
| Requests | Total web requests | Application Insights (count) |
| Failed Requests | Total failed web requests | Application Insights (count) |
| Availability | Application availability (downtime) | Application Insights (percentage) |
| Incoming Requests | Request Rate, Request Duration, Request Failure Rate | Application Insights (Live Metrics) |
| Outgoing Requests | Dependency Call Rate, Dependency Call Duration, Dependency Call Failure Rate | Application Insights (Live Metrics) |
| Windows services availability | Uptime of platform-related Windows services | Monitoring agent |
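
The count-based metrics above can be combined into derived indicators. The following is a minimal sketch, assuming the request counts and downtime figures have already been exported from the monitoring tool; the function names are hypothetical and no Application Insights API is called.

```python
def failure_rate_percent(total_requests: int, failed_requests: int) -> float:
    """Share of failed web requests, as a percentage of all requests."""
    if total_requests <= 0:
        # No traffic in the period, so there is nothing to report as failed.
        return 0.0
    return 100.0 * failed_requests / total_requests


def availability_percent(period_seconds: float, downtime_seconds: float) -> float:
    """Availability over a period, derived from the recorded downtime."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    # Clamp downtime to the period so the result stays between 0 and 100.
    downtime = min(max(downtime_seconds, 0.0), period_seconds)
    return 100.0 * (period_seconds - downtime) / period_seconds


if __name__ == "__main__":
    print(failure_rate_percent(total_requests=200, failed_requests=5))
    print(availability_percent(period_seconds=86_400, downtime_seconds=864))
```

Tracking the failure rate alongside the raw counts helps separate a genuine rise in errors from a rise in overall traffic.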


Database Server performance counters


| Metric | Description | Notes |
| --- | --- | --- |
| Processes Blocked | Number of currently blocked processes | 0 or close to 0 at all times |
| Lock Waits / Sec | Number of times per second that SQL Server cannot grant a lock on a resource immediately, so the request has to wait | 0 or close to 0 at all times |
| Access Methods – Page Splits / Sec | Number of times per second that SQL Server had to split a page when inserting or updating data | Below 20% of Batch Requests / Sec |
| Access Methods – Forwarded Records / Sec | Number of records per second fetched through forwarded record pointers; a high value indicates fragmented heaps | Below 10% of Batch Requests / Sec |
| Buffer Cache Hit Ratio | How often SQL Server finds a data page in its buffer cache when a query needs it, without reading from disk | Above 90% |
| Page Life Expectancy | How long pages stay in the buffer cache, in seconds | Above 300 seconds |
| Checkpoint Pages / Sec | Number of pages written to disk per second by checkpoint operations | Depends on the established baseline |
| Batch Requests / Sec | Number of batches SQL Server receives per second | No recommended value. A high number (for example, over 1000 Batches / Sec) denotes a busy system |
| SQL Compilations / Sec | Number of times per second that SQL Server compiles an execution plan | Below 10% of Batch Requests / Sec |
| SQL Re-Compilations / Sec | Number of times per second that a recompile event is triggered | Below 10% of SQL Compilations / Sec |
| Deadlocks / Sec | Number of deadlocks occurring per second | |
| User Connections | Number of users connected to SQL Server | |
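
The ratio-based guidelines in the Notes column can be evaluated automatically once the counters have been sampled. The sketch below is one possible way to do this, not a definitive implementation. The dictionary keys and function names are hypothetical, and the counter values are assumed to be already collected. When counters are read from the `sys.dm_os_performance_counters` view, the per-second ones are exposed as cumulative values, so the helper `per_second_rate` derives a rate from two samples.

```python
from typing import List, Mapping


def per_second_rate(first_value: float, second_value: float, elapsed_seconds: float) -> float:
    """Convert two readings of a cumulative counter into a per-second rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (second_value - first_value) / elapsed_seconds


def _at_or_above_share(value: float, share: float, base: float) -> bool:
    """True when a non-zero value reaches the given share of the base counter."""
    return value > 0 and value >= share * base


def check_sql_counters(c: Mapping[str, float]) -> List[str]:
    """Return the names of counters that fall outside the guideline values."""
    findings: List[str] = []
    batches = c["batch_requests_per_sec"]
    compilations = c["sql_compilations_per_sec"]

    # Guidelines expressed relative to Batch Requests / Sec or SQL Compilations / Sec.
    if _at_or_above_share(c["page_splits_per_sec"], 0.20, batches):
        findings.append("page_splits_per_sec")
    if _at_or_above_share(c["forwarded_records_per_sec"], 0.10, batches):
        findings.append("forwarded_records_per_sec")
    if _at_or_above_share(compilations, 0.10, batches):
        findings.append("sql_compilations_per_sec")
    if _at_or_above_share(c["sql_recompilations_per_sec"], 0.10, compilations):
        findings.append("sql_recompilations_per_sec")

    # Guidelines expressed as absolute values.
    if c["buffer_cache_hit_ratio"] <= 90.0:
        findings.append("buffer_cache_hit_ratio")
    if c["page_life_expectancy"] <= 300:
        findings.append("page_life_expectancy")
    return findings


if __name__ == "__main__":
    sample = {
        "batch_requests_per_sec": per_second_rate(1_000, 31_000, 60),  # 500 per second
        "page_splits_per_sec": 150,
        "forwarded_records_per_sec": 10,
        "sql_compilations_per_sec": 30,
        "sql_recompilations_per_sec": 2,
        "buffer_cache_hit_ratio": 99.0,
        "page_life_expectancy": 1200,
    }
    print(check_sql_counters(sample))
```

Counters without a numeric guideline in the table, such as Checkpoint Pages / Sec, are left out of the check because they have to be compared against an established baseline.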
