Cloud Platform Performance Monitoring

 

 

To monitor Infiterra Platform performance effectively, and to understand how platform usage affects both actual performance and the end-user experience, collect and maintain metrics and key performance indicators (KPIs) from all platform server components.

 


 

Basic Server (infrastructure) performance counters


These counters apply to both the database server and the web/application servers (BSS and Storefront servers):

  • CPU usage

  • Memory consumption 

  • I/O traffic (Read/Write IOPS)

  • I/O latency

  • Disk space
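As an illustration of sampling one of these counters, the following is a minimal sketch of a disk-space check using only the Python standard library. The function names, the sampled path, and the 80% warning threshold are assumptions made for this example and are not part of the platform; CPU, memory, and I/O counters would normally come from a monitoring agent rather than a script like this.

```python
import shutil


def disk_used_percent(total_bytes: int, used_bytes: int) -> float:
    """Return used disk space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def check_disk_space(path: str = ".", warn_at_percent: float = 80.0) -> dict:
    """Sample disk usage for `path` and flag it against a threshold.

    The 80% default is an assumed value for illustration only.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_pct = disk_used_percent(usage.total, usage.used)
    return {
        "path": path,
        "used_percent": round(used_pct, 1),
        "free_bytes": usage.free,
        "warning": used_pct >= warn_at_percent,
    }


if __name__ == "__main__":
    print(check_disk_space("."))
```

In practice the threshold would be set from an established baseline for each server role.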

 

Application/Web Server performance counters


The following metrics can help identify potential bottlenecks:

 

| Metric | Description | Type |
|---|---|---|
| Response time | Average web request response time | Application Insights (average) |
| Requests | Total web requests | Application Insights (count) |
| Failed Requests | Total failed web requests | Application Insights (count) |
| Availability | Application availability (downtime) | Application Insights (percentage) |
| Incoming Requests | Request Rate, Request Duration, Request Failure Rate | Application Insights (Live Metrics) |
| Outgoing Requests | Dependency Call Rate, Dependency Call Duration, Dependency Call Failure Rate | Application Insights (Live Metrics) |
| Windows services availability | Uptime of platform-related Windows services | Monitoring agent |
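To make the relationship between these metrics concrete, here is a small sketch that derives the average response time, request counts, failure rate, and availability percentage from raw samples. It does not call Application Insights; the `RequestSample` type and both function names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RequestSample:
    """One observed web request (hypothetical shape for this sketch)."""
    duration_ms: float
    succeeded: bool


def summarize_requests(samples: Sequence[RequestSample]) -> dict:
    """Aggregate raw request samples into the metrics listed in the table."""
    total = len(samples)
    failed = sum(1 for s in samples if not s.succeeded)
    avg_ms = sum(s.duration_ms for s in samples) / total if total else 0.0
    failure_rate = 100.0 * failed / total if total else 0.0
    return {
        "requests": total,                     # Requests (count)
        "failed_requests": failed,             # Failed Requests (count)
        "avg_response_ms": avg_ms,             # Response time (average)
        "failure_rate_percent": failure_rate,  # Request Failure Rate
    }


def availability_percent(successful_checks: int, total_checks: int) -> float:
    """Availability as the share of health checks that succeeded."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return 100.0 * successful_checks / total_checks


if __name__ == "__main__":
    samples = [
        RequestSample(100.0, True),
        RequestSample(200.0, True),
        RequestSample(300.0, False),
    ]
    print(summarize_requests(samples))
    print(availability_percent(99, 100))
```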


Database Server performance counters


| Metric | Description | Notes |
|---|---|---|
| Processes Blocked | Number of currently blocked processes | 0 or close to 0 at all times |
| Lock Waits / Sec | Number of times per second that SQL Server is unable to obtain a lock on a resource right away | 0 or close to 0 at all times |
| Access Methods – Page Splits / Sec | Number of times per second that SQL Server had to split a page when updating or inserting data | Below 20% of Batch Requests/Sec |
| Access Methods – Forwarded Records / Sec | Number of records per second fetched through forwarded-record pointers; indicates how fragmented the heaps are | Below 10% of Batch Requests/Sec |
| Buffer Cache Hit Ratio | How often SQL Server finds a data page in its buffer cache when a query needs it | Above 90% |
| Page Life Expectancy | How long pages stay in the buffer cache, in seconds | Above 300 seconds |
| Checkpoint Pages / Sec | Number of pages written to disk per second by checkpoint operations | Depends on the established baseline |
| Batch Requests / Sec | Number of batches SQL Server receives per second | There is no recommended value. A high number (e.g., over 1,000 batches/sec) denotes a busy system |
| SQL Compilations / Sec | Number of times per second that SQL Server compiles an execution plan | Below 10% of Batch Requests/Sec |
| SQL Re-Compilations / Sec | Number of times per second that a recompile event is triggered | Below 10% of SQL Compilations/Sec |
| Deadlocks / Sec | Number of deadlocks occurring per second | |
| User Connections | Number of users connected to the SQL Server | |
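The guidance values above can be expressed as simple checks against a set of sampled counters. The sketch below is one possible way to do that; the dictionary keys, the function names, and the `near_zero` tolerance used to interpret "0 or close to 0" are assumptions for this example. It does not query SQL Server; the counter values are expected to be supplied by whatever collector is in use.

```python
def below_share(value: float, base: float, share: float) -> bool:
    """True when value is below `share` of base; an idle pair (0 of 0) passes."""
    if base == 0:
        return value == 0
    return value < share * base


def evaluate_sql_counters(c: dict, near_zero: float = 1.0) -> list:
    """Return the names of counters that fall outside the guidance values.

    `near_zero` is an assumed tolerance for "0 or close to 0".
    """
    batch = c["batch_requests_per_sec"]
    compilations = c["sql_compilations_per_sec"]
    checks = [
        ("processes_blocked", c["processes_blocked"] <= near_zero),
        ("lock_waits_per_sec", c["lock_waits_per_sec"] <= near_zero),
        ("page_splits_per_sec",
         below_share(c["page_splits_per_sec"], batch, 0.20)),
        ("forwarded_records_per_sec",
         below_share(c["forwarded_records_per_sec"], batch, 0.10)),
        ("buffer_cache_hit_ratio", c["buffer_cache_hit_ratio"] > 90.0),
        ("page_life_expectancy", c["page_life_expectancy"] > 300),
        ("sql_compilations_per_sec", below_share(compilations, batch, 0.10)),
        ("sql_recompilations_per_sec",
         below_share(c["sql_recompilations_per_sec"], compilations, 0.10)),
    ]
    return [name for name, ok in checks if not ok]


if __name__ == "__main__":
    sample = {
        "processes_blocked": 0,
        "lock_waits_per_sec": 0.2,
        "page_splits_per_sec": 50,
        "forwarded_records_per_sec": 10,
        "buffer_cache_hit_ratio": 99.5,
        "page_life_expectancy": 120,   # below the 300-second guidance
        "batch_requests_per_sec": 1000,
        "sql_compilations_per_sec": 20,
        "sql_recompilations_per_sec": 5,  # 25% of compilations
    }
    print(evaluate_sql_counters(sample))
```

Counters without a fixed guidance value (Checkpoint Pages / Sec, Batch Requests / Sec, Deadlocks / Sec, User Connections) are left out of the checks and would be compared against an established baseline instead.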