15 Dec, 66 tweets, 11 min read
i don't use #Windows Resource Monitor very often.
it doesn't seem to like me.
first the right-hand disk graphs froze and stopped updating, even though the sqlservr numbers kept updating.
then i tried to change graph size/position, and whatever you see in the lower screen cap happened. Image
not a vm-wide memory problem. no paging space used. Image
not a vm-wide cpu problem. 67% is the peak - got batch mode workers on 32 of 48 vcpus and they're keeping things cooking. Image
so the disk graph displays again. but it's a still, point-in-time graph. the sqlservr numbers are updating about once per second; the graph never updates. Image
i wonder if there's a problem with the Windows physicaldisk or logicaldisk WMI. that might show up in event log.
maybe process disk counters in WMI are still ok, but PhysicalDisk or LogicalDisk are borked.
oh. Windows update is doing its thing.
no logicaldisk or physicaldisk numbers came through in perfmon for quite some time, either.
but the disk counters in the process object kept getting updated.
it's over now.
but my disk bytes/s are too low now, not what i expected.
i think the pending updates are bogging down performance. thank goodness i'm free to just install and reboot.
wish there was a way to tie performance impact to a pending update without an exhausting investigation.

i don't have time for that. too many other investigations.
it looks like something in this set of Windows updates really harshed my groove.

both in #sqlserver performance, and observability of performance since at least the disk counters were unavailable for quite a while.
ok, the RDP connection has gone from silently failing to timing out now. it's making progress on the updates.
omigosh. it's worse after the install and reboot. both the process disk numbers and the disk graph are frozen now. Image
hm. clock in the rdp still says 11:47, it hasn't updated in 4 minutes either.
i'm starting to get the hint.
This might be an interaction between Carbon Black and the KB4589208 install, which failed with Error 0x80070002.
my time on this system is a temporary gift, i didn't realize Carbon Black was on here.
ok, 2 DOP 8 BMoR queries. Looking much better now. Image
1.5 billion - 2 billion disk bytes/s is the range for two of these DOP 8 BMoR queries run simultaneously. i've got ~485 of these queries in the stack so this can run continuously and keep churning the bpool. Let's see how 3 and 4 queries concurrently look.
Four DOP 8 queries doesn't give much more bytes/s disk throughput. just more tightly grouped at 1.8 billion bytes/s.
it's a thread scheduling problem. i'm purposely avoiding tables large enough for full scan of pk/ci to be disfavored.
and all of these queries are BMoR pk/ci/heap full scans.
so i have 4 sessions, each running fairly fast dop 8 bmor queries in succession. each dop 8 query decides its thread placement on its own. my cpu utilization is ranging from 48% to 67%. Four DOP 8 BMoR queries gives me 32 BMoR workers churning through their part of full scans.
each of those workers could drive a vcpu to near 100% utilization by themselves. So 32 of 48 vcpus with a BMoR worker doing fullscan; system should stay at ~65-67% cpu utilized *if* each worker is on its own vcpu.
but when some of the parallel workers for session 55 double up on a scheduler with parallel workers for session 52, CPU utilization drops below 65%.
i could micromanage this and get the parallel workers for each DOP 8 query on their own scheduler (for up to six DOP 8 queries on this 48 vcpu VM).
But that's a pain. Can also assume a range of doubling up, and then over-schedule concurrent queries to compensate.
that would mean a "full rack" on this 48 vcpu vm wouldn't be six DOP 8 queries, but 9-12. Maybe 9 DOP 8 queries will be enough to get one BMoR parallel worker on every scheduler.
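the doubling-up effect can be sketched with a toy simulation. the assumption here is mine, not from the thread: each DOP 8 query drops its 8 workers on 8 schedulers chosen uniformly at random. SQL Server's actual placement is load-based, not random, so this likely overstates collisions a bit — it's just to get a feel for the numbers.

```python
import random

SCHEDULERS = 48
WORKERS_PER_QUERY = 8

def distinct_schedulers(n_queries, trials=10_000):
    """Average number of schedulers holding at least one parallel worker.

    Toy assumption: each DOP 8 query places its 8 workers on 8 distinct
    schedulers chosen uniformly at random.
    """
    total = 0
    for _ in range(trials):
        occupied = set()
        for _ in range(n_queries):
            occupied.update(random.sample(range(SCHEDULERS), WORKERS_PER_QUERY))
        total += len(occupied)
    return total / trials

for n in (4, 6, 9, 12):
    print(f"{n} queries -> ~{distinct_schedulers(n):.1f} of {SCHEDULERS} schedulers busy")
```

under that assumption, four concurrent queries keep only ~25 of 48 schedulers busy on average, and even 9 leave several idle — which is why over-scheduling to 9-12 helps but still may not pack every vcpu.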
what the heck!!? i almost missed this entirely.
just a moment ago my 4 BMoR dop 8 queries were up over 6 billion bytes/s.
So the range for 4 DOP 8 BMoR queries is 1.8 billion - 6 billion bytes/s. Image
that's way too big of a range for this tool.
my first pass was really ugly; i was doing something with VARCHAR or NVARCHAR columns in these rowstore tables. problem was the string column might be in off-row storage. Even though readahead gives good throughput for in-row data, off-row storage pages don't readahead.
or, if off-row storage pages *do* readahead, they weren't using readahead in my query :-)
so i was more selective. that first pass, i just grabbed really large tables. This pass, i made sure they weren't so large as to be disfavored. And i only used tables that had no off-row storage pages. but enough rows to ensure BMoR.
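the selection filter, sketched in python. the dict rows stand in for per-table numbers you could pull from sys.dm_db_partition_stats; the table names and the upper bound are hypothetical, though 131,072 rows is the documented minimum for a table to even be considered for batch mode on rowstore.

```python
# hypothetical per-table stats, standing in for sys.dm_db_partition_stats rows
tables = [
    {"name": "t_narrow",  "row_count": 9_000_000,     "lob_used_page_count": 0},
    {"name": "t_strings", "row_count": 50_000_000,    "lob_used_page_count": 120},
    {"name": "t_huge",    "row_count": 2_000_000_000, "lob_used_page_count": 0},
]

MIN_ROWS = 131_072           # enough rows to be eligible for BMoR
MAX_ROWS = 1_000_000_000     # made-up cap: above this, assume the
                             # pk/ci full scan gets disfavored

picked = [t["name"] for t in tables
          if t["lob_used_page_count"] == 0              # no off-row storage pages
          and MIN_ROWS <= t["row_count"] <= MAX_ROWS]   # big, but not too big
print(picked)
```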
7 billion disk bytes/s for four DOP 8 BMoR queries.
And... this isn't super-duper hardware. this is the "almost time to cycle it out of the datacenter" hardware. Image
well, i'm glad i looked into the Windows updates and got them squared away (ish) for now.
this level of activity wasn't even possible on this system just a little while ago.
another clean reboot, all updates installed, no more eligible updates.
fired up logman to csv for perfmon counters bc Resource Monitor CPU and disk visualizations are not synchronized with each other, and that bugs me.
ok, power run of all queries at DOP 8 one after the other. 484 queries. made it through 186 of them since 12:59; about 14 minutes.
my math is bad. it really is 484 queries, but the pace so far has been 183 completed in 15 minutes.
300 of the queries in 23 minutes.
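a quick pace check on those two readings:

```python
# projected power-run duration from the two pace readings above
total_queries = 484
for done, minutes in ((183, 15), (300, 23)):
    rate = done / minutes
    print(f"{done} in {minutes} min = {rate:.1f} q/min,"
          f" projected {total_queries / rate:.0f} min for all {total_queries}")
```

(the run ended up finishing in 35 minutes — a bit quicker than either projection suggests.)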

i need at least one more full power run of the 484 queries before i take a close look, anyway.
because the queries which ran while the bpool was ramping up had different runtime conditions than those while the bpool was at "mature" size.
at 30 minutes, ~400 DOP 8 queries have completed.
35 minutes for the first power run.
time to theorize why these full scans have such varying bytes/s rate in a power run.
i think it comes down to rows/page. After a tipping point, additional rows/page for this query probably brings down the bytes/s rate for the query. Even at DOP 8.
so to get a subset of these rowstore queries with a tighter bytes/s range and an increased average bytes/s, i may have to get rid of tables that exceed a certain number of rows per page in the PK/CI/heap.
i think the power run that had bpool rampup at the beginning will do slightly better than the "all mature bpool" power run.
and that just might come down to contention bringing new pages into the bpool when there is next to no free memory, vs no contention during bpool rampup.
ah, no real conclusions to be made about that right now.
this second power run also finished in 35 minutes. Too little difference to analyze for that purpose. but two runs close enough to each other for me to whittle down the count of queries and bring up avg disk bytes/s :-)
oh. i've maybe made a mistake in assigning a table/query to a session in multi-session tests.
no two sessions will overlap execution of the same query, but i didn't do much to prevent a session arriving at a query for table A while table A is still in cache from a prior query.
this is the kind of thing one might see if the various sessions were stagger-started and maybe separated by 10 items each in the list. but over time drew closer and closer and suddenly were right next to each other.
then the follower would get higher cache hit ratio due to the leader's prior query.
rats.
let's see. if the list is sufficiently long i think it will be enough to always grab the next assignment based on the assignment most recently given.
i was making assumptions about "most recently given" based on placement of the sessions in the task list. instead i actually need to store and look up the most recent assignment time for the 484 table queries in the list.
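a sketch of that fix, with hypothetical names: hand out the least-recently-assigned table from one shared tracker, instead of each session walking its own list position. single-threaded sketch only — the real multi-session version would need this behind a lock, or stored in a table.

```python
import itertools

class Assigner:
    """Hand out the table whose last assignment is oldest (LRU)."""

    def __init__(self, tables):
        # never-assigned tables sort first
        self.last_assigned = {t: 0 for t in tables}
        self._tick = itertools.count(1)   # monotonic stand-in for a timestamp

    def next_table(self):
        table = min(self.last_assigned, key=self.last_assigned.get)
        self.last_assigned[table] = next(self._tick)
        return table

a = Assigner(["table_a", "table_b", "table_c"])
print([a.next_table() for _ in range(5)])
# with no contention this just round-robins, so no session lands on a
# table still warm in cache from the immediately preceding query
```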
there's the two power runs. although the shape is chaotic, the echoes between the two power runs can be seen. Image
these graphs are with a 5 second interval. Image
i'll sort the tables in ascending order of rows/page for the pk/ci/heap, which i think will give me a power run for a single session with bytes/s declining over time.
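the ordering step, sketched with made-up numbers — rows/page is just row count over page count for the pk/ci/heap:

```python
# hypothetical tables, same row count but different row widths
tables = [
    {"name": "wide_rows",   "row_count": 10_000_000, "pages": 2_500_000},
    {"name": "medium_rows", "row_count": 10_000_000, "pages": 250_000},
    {"name": "narrow_rows", "row_count": 10_000_000, "pages": 25_000},
]

for t in tables:
    t["rows_per_page"] = t["row_count"] / t["pages"]

# ascending rows/page: the widest rows (most pages, most bytes per row)
# scan first, so disk bytes/s should start high and decline over the run
run_order = sorted(tables, key=lambda t: t["rows_per_page"])
print([t["name"] for t in run_order])
```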
for the prior two power runs, this was surprising. there's 8 Windows volumes with the main database data files, and two volumes with tempdb data files. a lot less in current disk queue length than i was expecting. Image
314 queries done.
sorting based on ascending rows per page seems to have worked well to give me an avg disk bytes/s that declines over time.
there are still some surprises in there, though.
because it's only 1 session, the ratio of plan compile time to plan execution time is quite important. because disk IO during plan compile won't be very high.
so this run started at 2 billion bytes/s according to perfmon, and now it has settled below 1 billion bytes/s. i imagine it'll stay below 1 billion bytes/s now till it's done.
under 800 million bytes/s now.
400 queries complete, 84 to go.
under 600 million bytes/s.
40 queries to go.
seems like my prediction fits better at the low throughput end than the high throughput end.
my prediction wasn't perfect but that's pretty nice, innit? Image
and the queue length Image
didn't show this yet.
workload was basically 100% readahead. Image
this suggests to me that the readahead target is not sufficient for a DOP 8 BMoR query.
At DOP 4 i expect the queries to take a bit longer, but the waits may be more hidden because readahead might be enough to hide the waits more often.
ack. i forgot to share the graph 🤣 Image
yeah, with the current readahead, scaling DOP 4 BMoR queries might be more productive than scaling DOP 8 BMoR queries when the goal is to max disk throughput.
what i'm really trying to do is stress the buffer pool with the highest rate of traffic i can.
turns out that is largely the same as trying to max disk read throughput 🤣
uh-oh. what if readahead already accounts for DOP of the scan operator?

then i'd expect everything to stretch by 2 for the DOP 4 queries compared to DOP 8. twice as long at half throughput. same number of total waits, same total amount of wait time, accumulated at half speed.
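that stretch-by-2 expectation, spelled out in toy numbers (entirely hypothetical):

```python
# if readahead depth already scales with the scan operator's DOP,
# halving DOP should halve throughput and double elapsed time,
# leaving total bytes read -- and total wait time -- unchanged
dop8 = {"dop": 8, "elapsed_s": 60, "bytes_per_s": 2_000_000_000}
dop4 = {"dop": 4,
        "elapsed_s": dop8["elapsed_s"] * 2,          # twice as long
        "bytes_per_s": dop8["bytes_per_s"] // 2}     # at half throughput

# same total bytes read either way
assert dop8["elapsed_s"] * dop8["bytes_per_s"] == \
       dop4["elapsed_s"] * dop4["bytes_per_s"]
print(dop4)
```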
