i've found that Execution Throttle was relied on so heavily in the past with QLogic HBAs that many folks are unfamiliar with setting a per-LUN service queue depth for QLogic in Windows.
The Dell host connectivity guide i linked above in the thread discusses this on page 67.
Before changing LUN service queue depth, regardless of host HBA make/model, consider the requirements/recommendations for the storage array.
Working in the past with EMC CLARiiON arrays and some of the older Hitachi arrays, there was good reason not to set higher than 32.
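On Windows, the per-LUN queue depth for a QLogic STORPort miniport driver is typically set via a registry DriverParameter. A sketch of a .reg fragment, assuming the ql2300 driver service name (verify the service name for your installed driver/model before applying, and keep the value within your array vendor's recommendation):

```
Windows Registry Editor Version 5.00

; per-LUN service queue depth for the QLogic STORPort miniport.
; the service name (ql2300 here) may differ by adapter model and
; driver version -- check under CurrentControlSet\Services first.
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ql2300\Parameters\Device]
"DriverParameter"="qd=32"
```

A reboot (or driver reload) is needed for the change to take effect.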
Why consider increasing LUN service queue depth from default?
Consider this example: much higher average write service time in the dark blue box than in the light blue box, even though total bytes/sec is lower.
Looking at an individual LUN from perfmon, same comparison between the dark blue and light blue boxes. Except bytes/sec is, as expected, lower all around, and average write service time is higher.
Looking at IOPs, it's pretty much the same story. Neither bytes/sec nor IOPs explains the higher avg write service times in the dark blue box.
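As background on why LUN queue depth matters to these graphs at all: Little's law ties outstanding IO, service time, and IOPs together, so queue depth puts a hard ceiling on throughput at a given service time. A minimal sketch (the numbers are illustrative, not from the capture above):

```python
def max_iops(queue_depth: int, service_time_ms: float) -> float:
    """Little's law: concurrency = throughput * latency, so IOPs for a
    LUN are capped at (outstanding IO slots) / (average service time)."""
    return queue_depth / (service_time_ms / 1000.0)

# At 1 ms average service time, a queue depth of 32 caps a LUN at
# ~32,000 IOPs; doubling queue depth doubles the ceiling -- as long
# as the array can actually sustain it without service times rising.
print(max_iops(32, 1.0))  # 32000.0
print(max_iops(64, 1.0))  # 64000.0
```

This is also why raising queue depth is not free: more outstanding IO against a struggling array just trades queuing in the host for queuing in the array.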
Saturated write cache, high pending writes, and forced write cache flushes can elevate both write and read service times.
Something like a copy-on-write snapshot being maintained at the time of this capture could elevate avg write service times, too.
A COW snapshot would transform first writes into [read from source] + [write to snapshot] + [write to source] unless using a shadow filesystem strategy like NetApp.
COW without shadow filesystem is more painful on all HDDs than on an AFA, but there's still overhead on an AFA.
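The write amplification described above can be sketched as a tiny cost model. This just restates the thread's claim in code: a classic COW first write fans out into three physical operations, while a shadow-filesystem/redirect-on-write layout keeps it at one:

```python
def physical_ops_for_first_write(cow: bool, shadow_fs: bool = False) -> list:
    """Physical operations generated by one logical first write to a
    region tracked by a snapshot.

    cow: a copy-on-write snapshot is active for this region.
    shadow_fs: redirect-on-write style layout (e.g. a NetApp-like
               shadow filesystem) where the new write lands in a new
               location and the old block simply stays put."""
    if not cow or shadow_fs:
        return ["write"]  # one physical write either way
    # classic COW must preserve the old data before overwriting it
    return ["read from source", "write to snapshot", "write to source"]

print(len(physical_ops_for_first_write(cow=True)))                  # 3
print(len(physical_ops_for_first_write(cow=True, shadow_fs=True)))  # 1
```

On all-HDD backends those extra operations mean extra seeks, which is why the pain is worse there than on an AFA, though the amplification itself exists on both.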
och. the stuff i'm looking at today has so many variables and so much variability it's about to do me in.
i'm trying to help optimize work in a workload group with a small footprint of concurrent queries and workers while other workload groups are active.
the gray inside the red-dashed-line boxes is what i'm trying to help out, without doing unnecessary damage to the other workload groups.
might be a long haul.
it's hard to pick out the CPU utilization for just that workload group from 3 am to 6 am.
and the waits it experiences aren't all that easy to eliminate (memory_allocation_ext waits).
there's some good stuff in here about #sqlserver soft-NUMA which applies to auto soft-NUMA as well
~~
Understanding Non-uniform Memory Access
2012 October 4 docs.microsoft.com/en-us/previous…
"Memory nodes are created based on hardware NUMA and therefore not impacted by soft-NUMA."
Thumbs up! soft-NUMA nodes must be fully contained in memory nodes, so memory nodes enforce something on soft-NUMA, but not the other way 'round.
"Soft-NUMA does not provide memory to CPU affinity."
Thumbs up, i think. i guess i would probably have said that memory-to-CPU affinity is based on memory node rather than soft-NUMA node.