You definitely should think about moving off sp122 sooner rather than later.....there are not a lot of changes between where you are currently and sp135. WRT sampling and stats, think of it this way.

<start simplified explanation>

Normally, without sampling, we scan the table (or the index, if updating stats only on a single index) and record the distinct values along with their frequency of occurrence. When finished, we construct the (default) 20 step values and compute the cell histograms. For example, let's say we have a table of 100 rows with an identity column holding values 1-100. When finished, the stats would have 20 histogram cells, each with a weight of 5/100 (0.05).

With stats sampling at 10%, we start scanning the table and read every 10th row. For each sampled row, we record the values for the index keys in a work table...so we would read rows 10, 20, 30, 40, 50, 60, 70, 80, 90 & 100. In this case we have gotten lucky, as the max value (100) was detected during the sampling....but we might not be able to create 20 steps because we only have 10 values, so we only create 10 histogram cells with a weight of 10/100 (0.10) each.

Now, let's increase to 30% sampling (or say 33% for fun, to make the arithmetic easier). Now the rows we read are 3, 6, 9, 12, 15, 18, ...., 93, 96, 99 - about 33 distinct values, from which we can construct the full 20 steps, and the weight of each histogram cell would be roughly 5/100 (0.05). Because we have more steps, queries with IN() or OR might do better, as the finer granularity is less likely to aggregate above the ~40% point at which a table scan is chosen. For example, where col in (15,25,35,45,55) might result in a tablescan with 10% sampling, because costing each of the 5 SARGs and aggregating them due to the OR logic gives 5 x 0.10 = 0.50, whereas with 30% sampling (5 x 0.05 = 0.25) it might not. However, a query with col=100 will hit the out-of-range histogram (if enabled) with 30% sampling, whereas it hits a cell with 10% sampling. In that case we were lucky that 10% sampling happened to pick up the last row - it would not have if the values were interspersed differently.

<end simplified explanation>

So, I would say that generally 30% sampling gives *better* statistics, but there may be edge cases where it is worse. I think the lower the sampling percentage, the bigger the issue with low(er) cardinality columns - particularly where the number of distinct values in the column multiplied by the sampling fraction is less than the number of histogram steps requested. For really high cardinality columns, such as names, it likely makes minimal difference. In either case, if you have issues, it may be as much the number of histogram cells (the 'using N values' clause on update index statistics) as the sampling %...... One can't say what the exact impact would be without seeing the actual stats that result as well as the query predicates.
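To make the arithmetic concrete, here is a rough Python simulation of the 100-row example above - purely illustrative and hypothetical, not ASE's actual stats-gathering code. It just shows how reading every Nth row caps the number of distinct values available, which in turn caps the histogram steps and inflates the per-step weight that an IN()/OR list aggregates over.

    # Hypothetical sketch of the 100-row example -- not SAP ASE internals.
    def sampled_histogram(rows, sample_every, requested_steps=20):
        sampled = rows[sample_every - 1::sample_every]   # read every Nth row
        distinct = sorted(set(sampled))
        steps = min(requested_steps, len(distinct))      # can't have more steps than distinct values
        return steps, 1.0 / steps                        # per-step weight (weights sum to ~1.0)

    rows = list(range(1, 101))                           # identity column, values 1-100
    for pct, every in ((10, 10), (33, 3)):
        steps, weight = sampled_histogram(rows, every)
        in_estimate = 5 * weight                         # "col in (15,25,35,45,55)" = 5 SARGs OR'd together
        plan = "tablescan likely" if in_estimate > 0.40 else "index likely"
        print(f"{pct}% sampling: {steps} steps, weight {weight:.2f}, "
              f"IN-list estimate {in_estimate:.2f} -> {plan}")

Run as-is, the 10% case prints 10 steps at 0.10 each (IN-list estimate 0.50, over the ~40% tablescan point), while the 33% case prints 20 steps at 0.05 each (estimate 0.25).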
One of the diffs is that in scanning the table, we end up creating a worktable that has to be sorted for each column before the frequency cells/histogram steps can be derived. This is where tempdb (and proc cache) get hit the hardest - and where a lot of the slowness comes from, as the worktables often end up flushed to disk and need PIO to re-read when sorting/aggregating the stats. With hash-based stats, we use an in-memory hash table for the values, so there is no tempdb worktable and no sort, hence a lot less PIO on the tempdb side. That is likely to give you better stats, since it reads the entire table, and I have found it runs 5x+ faster, which should give you the speed you were after with sampling.
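As a loose analogy (again hypothetical Python, not the actual engine code), the difference between the two approaches is sort-then-count versus a single hashed pass:

    # Loose analogy only -- not ASE internals.
    from collections import Counter
    from itertools import groupby

    def freq_by_sort(vals):
        # worktable-style: materialize the values, sort them (the tempdb/PIO-heavy
        # step), then count runs of equal values to get per-value frequencies
        return {v: len(list(g)) for v, g in groupby(sorted(vals))}

    def freq_by_hash(vals):
        # hash-style: one pass over the data, counts kept in an in-memory hash
        # table, no sort and no intermediate worktable
        return dict(Counter(vals))

    column_values = [1, 3, 3, 7, 7, 7, 42]
    assert freq_by_sort(column_values) == freq_by_hash(column_values)

Both produce the same frequencies; the hash path simply skips the sort and the intermediate spill, which is where the speedup comes from.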