Saturday, January 4, 2025

Wait stats

 A scheduler can be described as a piece of software that coordinates the execution of different processes and administers its available resources. SQL Server has its own scheduling mechanism that is implemented in what we know as SQLOS. The reason behind this is that the Windows scheduler cannot satisfy what a relational database needs for working. 

SQL Server uses a cooperative scheduling mechanism when threads can voluntary yield its quantum of time. This allows SQL Server to optimize CPU utilization, because when a thread is signaled for execution but is not ready to run it can yield its quantum of time in favor of other threads. 

SQL Server has one scheduler per CPU core independently if it is a physical core or hyperthreaded. But the schedulers aren’t bound to a CPU unless you define an affinity mask. This means that a scheduler running on CPU 1, after a context switch can end up running on CPU 2. 

A thread can be in one of three states, as exposed by the STATUS column of the sys.dm_exec_requests DMV: 

  • RUNNING – on the CPU. 

  • SUSPENDED – whenever a thread requires a resource, such as a page that is not in memory, the thread yields the CPU and moves onto an unordered waiter list, with a thread status of SUSPENDED, until the required resource is 

  • RUNNABLE – if a thread is not waiting on any resource, but is not currently on the CPU, for example because the scheduler to which it is assigned currently has another session’s thread running, it will be placed in a first-in-first-out (FIFO) queue called the runnable queue and the thread status will be 

 

We can investigate the current, “real time” wait profile of a SQL Server instance, in terms of the requests and sessions associated with suspended threads, by querying the sys.dm_os_waiting_tasks DMV. 

 

 

 

SELECT  blocking.session_id AS blocking_session_id , blocked.session_id AS blocked_session_id , waitstats.wait_type AS blocking_resource , waitstats.wait_duration_ms , waitstats.resource_description , blocked_cache.text AS blocked_text , blocking_cache.text AS blocking_text 

FROM sys.dm_exec_connections AS blocking INNER JOIN sys.dm_exec_requests blocked 

ON blocking.session_id = blocked.blocking_session_id  

CROSS APPLY sys.dm_exec_sql_text(blocked.sql_handle) blocked_cache   

CROSS APPLY sys.dm_exec_sql_text(blocking.most_recent_sql_handle) 

blocking_cache 

INNER JOIN sys.dm_os_waiting_tasks waitstats 

ON waitstats.session_id = blocked.session_id 

 

We can also perform historical wait statistics analysis, using the data provided in sys.dm_os_wait_stats.  

Identifying high signal waits (CPU pressure) 

If the signal wait time is a significant portion of the total wait time then it means that tasks are waiting a relatively long time to resume execution after the resources that they were waiting for became available. This can indicate either that there are lots of CPU-intensive queries, which may need optimizing, or that the server needs more CPU. The query below will provide a measure of how much of the total wait time is signal wait time. 

SELECT  SUM(signal_wait_time_ms) AS TotalSignalWaitTime , ( SUM(CAST(signal_wait_time_ms AS NUMERIC(20, 2))) / SUM(CAST(wait_time_ms AS NUMERIC(20, 2))) * 100 )  AS PercentageSignalWaitsOfTotalTime 

FROM    sys.dm_os_wait_stats 

 

If signal waits analysis indicates CPU pressure, then the sys.dm_os_schedulers DMV can help verify whether a SQL Server instance is currently CPU-bound. This DMV returns one row for each of the SQL Server schedulers and it lists the total number of tasks that are assigned to each scheduler, as well as the number that are runnable. Other tasks on the scheduler that are in the current_tasks_count but not the runnable_tasks_ count are ones that are either sleeping or waiting for a resource. 

SELECT  scheduler_id , 

current_tasks_count , runnable_tasks_count 

FROM sys.dm_os_schedulers WHERE scheduler_id < 255 

 

 

Common wait types 

 

CXPACKET 

Often indicates nothing more than that certain queries are executing with parallelism; CXPACKET waits in the server are not an immediate sign of problems, although they may be the symptom of another problem, associated with one of the other high value wait types on the instance. 

 

SOS_SCHEDULER_YIELD 

The tasks executing in the system are yielding the scheduler, having exceeded their quantum, and are having to wait in the runnable queue for other tasks to execute. As discussed earlier, this wait type is most often associated with high signal wait rather than waits on a specific resource. If you observe this wait type persistently, investigate for other evidence that may confirm the server is under CPU pressure. 

 

ASYNC_NETWORK_IO 

This wait is often incorrectly attributed to a network bottleneck. In fact, the most common cause of this wait is a client application that is performing row-by-row processing of the data being streamed from SQL Server as a result set (client accepts one row, processes, accepts next row, and so on). Correcting this wait type generally requires changing the client side code so that it reads the result set as fast as possible, and then performs processing. 

 

LCK* 

It means one query is holding locks on an object, and another query is waiting to get locks on that same object. Maybe one query is trying to update all the rows in the Customers table, and another query is trying to read them. 

To fix it, you want enough indexes to make queries fast, but not so many indexes that there’s a huge overhead for deletes/updates/inserts (which slows down those activities, and makes blocking worse.) It’s a delicate balance, but generally we recommend aiming for 5 or less nonclustered indexes, each with 5 or less fields on it (including includes). You can get away with more – but you’d better be really careful with blocking overhead, and if you’re facing LCK* waits, you’ve gone over (or under) the line. 

 

PAGEIOLATCH_* 

SQL Server is waiting to read data pages from storage. These pages weren’t cached in memory, so we have to get ’em from disk. In most cases, this is to satisfy a SELECT statement, but you can also see this for DELETES, UPDATES, and INSERTS (DUIs) as well. 

To fix it, start by list out the queries reading the most data. Look for missing indexes in those query plans, or non-sargable things in the query that you could tune manually. Start here because it’s the easiest/cheapest. 

Next up, consider adding more memory to the SQL Server. Sure, storage might be really slow – but changing out your storage is time-consuming and expensive.  

Finally, if you can’t do any of those, look into improving the speed of your storage. 

 

 

Wait Types 

Reason 

Actions 

Avoid 

CXPACKET 

Parallel processing 

1. Most of the time this is due to not the existence of proper indexes. 

2.Examine the query plan of the query. 

3.Read more details at CXPACKT. 

Do not avoid parallelism. 

ASYNC_NETWORK_IO 

SQL Server is waiting for a client to acknowledge receipt of sent data. 

Re-write the application code 

By looking at the name, you might think that it is something to do with the network. 

BACKUPBUFFER 

BACKUPIO 

BACKUPTHREAD 

During the database backup, these types of wait types will occur. 

Most likely these types of wait types indicate, slowness of in backup disk. 

If it is a large database, you need to leave this until the backup is completed. 

PAGEIOLATCH_SH 

PAGEIOLATCH_EX 

Waiting for a data file page to be read from disk to memory. 

Need to analyze what tables are having the issue. 

Nothing to do with the IO subsystem. 

 

 

No comments:

Post a Comment