there's some good stuff in here about #sqlserver soft-NUMA which applies to autosoftNUMA
~~
Understanding Non-uniform Memory Access
2012 October 4 docs.microsoft.com/en-us/previous…
"Memory nodes are created based on hardware NUMA and therefore not impacted by soft-NUMA."
Thumbs up! soft-NUMA nodes must be fully contained in memory nodes. so memory nodes enforce something on soft-NUMA, but not the other way 'round.
"Soft-NUMA does not provide memory to CPU affinity."
Thumbs up, i think. i guess i would probably have said that memory-to-cpu affinity is based on memory node rather than soft-NUMA node.
"The benefits of soft-NUMA include reducing I/O and lazy writer bottlenecks on computers with many CPUs and no hardware NUMA."
ok, this is getting dicey.
"Configuring four soft-NUMA nodes provides four I/O threads and four lazy writer threads, which could increase performance."
and... that's a problem, bub.
Maybe this was true about lazy writers in #sqlserver 2008? i don't think so. Definitely not true in #sqlserver 2019.
all right then. my super-secret 64 "logical cpu" system with #sqlserver 2019 cu9 installed. with a single memory node. but 4 softNUMA nodes due to autosoftNUMA.
A single lazy writer. because 1 per memory node.
but for some extra fun notice there are 8 transaction log writers, all on softNUMA node 3.
at 128 "logical cpus", 2 lazy writers. because 2 memory nodes. (and 2 processor groups because limited to 64 "logical processors" in a processor group).
autosoftNUMA created 8 softNUMA nodes.
and we're still at 8 transaction log writers, because as of #sqlserver 2019 cu9 that's the most you'll get.
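a minimal python sketch of the counting rules described above, just to make the arithmetic explicit. the function names and constants are hypothetical (nothing here is a #sqlserver API), the memory node and softNUMA node counts are simply the ones observed on these two systems, and the cap of 8 log writers is only what was seen as of 2019 cu9.

```python
import math

# Assumption: Windows limits a processor group to 64 logical processors,
# as noted in the thread.
MAX_LOGICAL_PROCESSORS_PER_GROUP = 64

# Observation from the thread, as of SQL Server 2019 CU9 (not a documented API value).
OBSERVED_MAX_LOG_WRITERS = 8


def processor_groups(logical_cpus: int) -> int:
    """Number of processor groups needed to hold the given logical CPUs."""
    return math.ceil(logical_cpus / MAX_LOGICAL_PROCESSORS_PER_GROUP)


def lazy_writers(memory_nodes: int, soft_numa_nodes: int) -> int:
    """One lazy writer per memory node; soft-NUMA node count is ignored on purpose."""
    return memory_nodes


# (logical CPUs, memory nodes, softNUMA nodes) as observed on the two systems above.
observed_systems = [(64, 1, 4), (128, 2, 8)]

for cpus, mem_nodes, soft_nodes in observed_systems:
    print(
        f"{cpus} logical cpus: "
        f"{processor_groups(cpus)} processor group(s), "
        f"{soft_nodes} softNUMA nodes, "
        f"{lazy_writers(mem_nodes, soft_nodes)} lazy writer(s), "
        f"up to {OBSERVED_MAX_LOG_WRITERS} log writers"
    )
```

the point of passing `soft_numa_nodes` and then ignoring it: changing or disabling autosoftNUMA changes that argument only, and the lazy writer count stays put.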
anyway, i bring this up because a colleague came across the article after we discussed disabling autosoftNUMA on a system.
my colleague wanted to make sure there wouldn't be a lazy writer bottleneck due to disabling autosoftNUMA.
nope, i don't think so.
"The I/O comment directly refers to the I/O completion port and thread that is created on a per logical node. So you can configure soft NUMA to allow advanced TCP/IP bindings and each logical node receives a specific I/O completion port and managing thread..."
"NOTE : The I/O Completion threads in SQL Server 2005 and 2008 are designed to handle connection requests and TDS traffic. They are NOT handling database, data and log file I/O operations."
"Logical nodes do NOT receive additional lazy writer thread but the physical nodes do... The lazy writer thread creations are tied to the SQL OS view of the physical NUMA memory nodes."
"So whatever the hardware presents as physical NUMA nodes will equate to the number of lazy writer threads that are created."
och. the stuff i'm looking at today has so many variables and so much variability it's about to do me in.
i'm trying to help optimize work in a workload group with a small footprint of concurrent queries and workers while other workload groups are active.
the gray inside the red-dashed-line boxes - that's what i'm trying to help out. without doing unnecessary damage to the other workload groups.
might be a long haul.
it's hard to pick out the CPU utilization for just that workload group from 3 am to 6 am.
and the waits it experiences aren't all that easy to eliminate (memory_allocation_ext waits).
i've found that Execution Throttle was relied on so heavily in the past with QLogic HBAs that many folks are unfamiliar with setting a per-LUN service queue depth for QLogic in Windows.
The Dell host connectivity guide i linked above in the thread discusses this on page 67.