(Source: sqlserverplanet.com)
Parallelism is a feature in SQL Server which allows expensive queries to utilize more threads in order to complete quicker. The query optimizer makes the determination of how expensive a query is based upon the Cost Threshold for Parallelism setting set at the SQL Server Instance level. This cost threshold for parallelism is an estimate (roughly defined in seconds) that the query optimizer has determined a statement will take based on an estimated (or cached) execution plan. Generally the queries that qualify for parallelism are high IO queries. In a normal process, a source is read using a single SPID (Server Process ID) and it output using the same SPID. In parallelism, multiple SPIDs are used to read a source (this is known as distributing streams), then an operation may be performed in the streams, then the streams are gathered. Below is an illustration of this:
Parallelism is a feature in SQL Server which allows expensive queries to utilize more threads in order to complete quicker. The query optimizer makes the determination of how expensive a query is based upon the Cost Threshold for Parallelism setting set at the SQL Server Instance level. This cost threshold for parallelism is an estimate (roughly defined in seconds) that the query optimizer has determined a statement will take based on an estimated (or cached) execution plan. Generally the queries that qualify for parallelism are high IO queries. In a normal process, a source is read using a single SPID (Server Process ID) and it output using the same SPID. In parallelism, multiple SPIDs are used to read a source (this is known as distributing streams), then an operation may be performed in the streams, then the streams are gathered. Below is an illustration of this:
Although parallelism may seem like a good idea, often times it
is not. For OLTP systems that facilitate a lot of user requests parallelism is
usually an indication of poorly written queries and/or queries that are in need
of indexes. It can also cause issues on servers that have a lot of processors
that also have disk IO contention. The reason for this is because parallelism
will by default spawn as many SPIDs as their
are processors. In other words, if you have a rogue query joining a hundred
million records that does not qualify for parallelism, then by itself, only one
process will be reading the hundred million records. However if it qualifies
for parallelism and the server has 16 processors, then you will now have 16
threads each trying to get a piece of the hundred million records powered by
their own processor. I’ve seen this bring many servers to their knees.
One indication that parallelism is occurring on your system is
when you run sp_who2 and
you see the same SPID listed more than once. This is indicative of a large query
being broken into multiple streams only to join back into a single result
later. Some streams may finish their operations prior to other streams
completions. When this happens it results in a wait type of CX_Packet.
When this wait type is common, then you are waiting for parallel streams
to finish.
As a summary, you do not want parallelism to be enabled unless
you are sure that:
- You
have more than enough resources
- You
have a well optimized SQL Server
- Your
server is mainly used for ETL purposes
When SQL Server runs on a
computer with more than one processor or CPU, it detects the best degree of
parallelism, that is the number of processors employed to run a single
statement, for each query that has a parallel execution plan. You can use the
max degree of parallelism option to limit the number of processors to use for
parallel plan execution and to prevent run-away queries from impacting SQL
Server performance by using all available CPUs. The default value for
MAXDOP is 0 (zero) and can be set or viewed using (sp_configure).
A value of 0 means that SQL Server will use all processors if a query runs in
parallel.
MAXDOP
should not be 0 and should not be greater than half the number of visible
schedulers.
For example if you have a quad core processor with hyper-threading enable, then you will have 4x2=8 visible schedulers (each scheduler is mapped to an individual processor). Schedulers can be seen by running this query and would be the rows that have a scheduler_id < 255: If you are unsure of the above values then a generic setting for MAXDOP should not be more than 8. So if you run the command below and have more than 8 you should set this value to a maximum of 8.
SELECT *
For example if you have a quad core processor with hyper-threading enable, then you will have 4x2=8 visible schedulers (each scheduler is mapped to an individual processor). Schedulers can be seen by running this query and would be the rows that have a scheduler_id < 255: If you are unsure of the above values then a generic setting for MAXDOP should not be more than 8. So if you run the command below and have more than 8 you should set this value to a maximum of 8.
SELECT *
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255;
No comments:
Post a Comment