Re: bad query plan for join on column with mostly ' ' content

I could say [run 'update index statistics' on each table] and [make sure you have statistics on every referenced column in each referenced table], but I'll skip that and guesstimate that the high volume of blanks (63.25% of the table) has left the A.ref1 column with a good bit of skew in the density stats (total density >> range cell density).

Without any SARGs the optimizer has to make some general guesses about how many matching rows it'll find in each table. With a large skew in the density stats, the optimizer assumes that for each row in B it will find a large number of matches in A (total density * #rows), so the total IO cost of reading A by index comes out more expensive than just doing a table scan.
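To put rough numbers on that, here's a back-of-the-envelope sketch in Python. It's a simplified model, not the optimizer's actual costing: the table size is made up, every non-blank ref1 value is assumed to be unique, and each density is approximated as a sum of squared value fractions. Only the 63.25% figure comes from this thread.

```python
# Simplified model of the two density values. Everything here is hypothetical
# except the 63.25% share of blanks quoted in the thread.

rows_in_a = 1_000_000        # hypothetical size of table A
blank_fraction = 0.6325      # share of rows where ref1 = ' '

# Assumption: every non-blank value of ref1 occurs exactly once.
other_rows = round(rows_in_a * (1 - blank_fraction))

# Total density (modelled): squared fractions of ALL values, so the
# dominant blank value contributes almost everything.
total_density = blank_fraction ** 2 + other_rows * (1 / rows_in_a) ** 2

# Range cell density (modelled): same sum, but leaving out the heavily
# duplicated blank value.
range_cell_density = other_rows * (1 / rows_in_a) ** 2

# Join estimate from the post: density * #rows = matches in A per row of B.
est_rows_with_skew = total_density * rows_in_a
est_rows_without_skew = range_cell_density * rows_in_a

print(f"total density      = {total_density:.6f}")
print(f"range cell density = {range_cell_density:.9f}")
print(f"matches per B row using total density:      {est_rows_with_skew:,.0f}")
print(f"matches per B row using range cell density: {est_rows_without_skew:.2f}")
```

Under these assumptions the total-density estimate is around 400,000 matching rows in A for every row in B, while the range-cell-density estimate is less than one row. A gap that size is enough to make an index lookup look more expensive than a table scan.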

I'd suggest you try running [sp_modifystats 'A','ref1','REMOVE_SKEW_FROM_DENSITY'] to set the total density = range cell density. This should cause the optimizer to think it'll find fewer rows in A for each join from B. 'Course, if you have a lot of queries that join on A.ref1=<space>, then removing the skew could lead the optimizer to run a lot of expensive index lookups against A.ref1. Keep in mind that the skew will come back after any 'update statistics' command that processes column ref1, i.e., you'll need to re-run sp_modifystats/REMOVE_SKEW_FROM_DENSITY after each 'update statistics'.

NOTE: Your old workaround of using stats from a table with no <space> values should lead to range cell density being (roughly) equal to total density ... which is what sp_modifystats/REMOVE_SKEW_FROM_DENSITY does for you. And yes, you could use this in 12.5 to see if you get the same benefit.

As for why the optimizer chose the join order of B->A instead of A->B ... *shrug*. The optimizer should know that B only has enough data for 1 page, so why not join A->B? For 1 row in B the performance should be the same regardless of join order, but for more than 1 row in B (still only 1 data page) the performance of A->B should be better, as you only table scan A once instead of once per row in B.

If you can modify the code, another option you might want to try is enabling the allrows_mix or allrows_dss optimization goal; for these types of joins with highly skewed stats you may find that a merge/hash join is faster (though for largish tables the overhead for worktables may be prohibitive).
