Unfortunately, proc cache management is a rather involved topic, and the above points to a rather confused mix of issues.
First, TF757 simply reduces the scaling down of large allocations when the proc cache is full; it does so by replacing a proc in cache. There is a bit of an inverse relationship with ELC, but let's talk about that later. When a proc is first loaded into cache, the entire plan for the proc has to be loaded into cache. Prior to ASE 15, this was largely done with successive 2K memory pages. ASE 15 added support for procs grabbing 2K, 4K, 8K and 16K allocations. However, as you can imagine, if proc cache is nearly full, then memory is likely fragmented, and loading a large plan into proc cache could result in the plan having to grab a bunch of 2K pages all over the place - kinda like in the OS, where the smaller the free memory chunks become, the more CPU it takes to find those holes. TF757 simply allows ASE, instead of scaling the requests down to 2K chunks, to bump another proc plan out of cache to make (hopefully) some contiguous room to load the new proc plan.
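To make the fragmentation point concrete, here is a toy model in Python. It is NOT ASE's allocator - the function name, the greedy strategy and the free-run lists are all made up for illustration - it just shows how the same 16K request turns into many more grabs when only scattered 2K holes are free.

```python
# Toy model only: this is NOT how ASE's proc cache manager is implemented.
# Free space is a list of contiguous free runs, each measured in 2K pages.
# Chunk sizes 8, 4, 2, 1 pages stand in for 16K, 8K, 4K and 2K allocations.

def allocate_with_scaledown(free_runs, pages_needed, chunk_sizes=(8, 4, 2, 1)):
    """Greedily grab the largest chunk that fits, scaling down when it doesn't.

    Returns the list of chunk sizes (in 2K pages) that were grabbed; its
    length is the number of separate allocations the plan load needed.
    """
    runs = list(free_runs)
    grabbed = []
    remaining = pages_needed
    while remaining > 0:
        for size in chunk_sizes:
            if size > remaining:
                continue
            # Find any free run large enough to hold this chunk.
            idx = next((i for i, run in enumerate(runs) if run >= size), None)
            if idx is not None:
                runs[idx] -= size
                grabbed.append(size)
                remaining -= size
                break
        else:
            # Not even a single free 2K page is left.
            raise MemoryError("toy cache has no free pages left")
    return grabbed


# Fragmented cache: sixteen isolated 2K holes.
fragmented = allocate_with_scaledown([1] * 16, pages_needed=8)

# Same amount of free space, but contiguous (e.g. after a plan was bumped out).
contiguous = allocate_with_scaledown([16], pages_needed=8)
```

In the fragmented case the 16K plan needs eight separate 2K grabs; with one contiguous hole it needs a single 16K grab. That difference in the number of trips to the allocator is the cost TF757 is trying to avoid.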
However, I would NOT use TF757 if "cpu was increasing for no apparent reason" - I would only do so if the proc cache manager was showing high spinlock contention AND, after monitoring monProcedureCacheMemoryUsage/monProcedureCacheModuleUsage, monProcedureCache and monCachedProcedure, I could see a pattern indicating that proc cache was full a lot and proc plans were being recompiled or loaded quite frequently - and then likely only as a temporary measure to determine if the proc cache recompilation can be reduced. More on that in a second.
Now, as far as ELC is concerned, you should really get a copy of Stefan Karlsson's ASE 16 performance paper that he has posted on SCN - specifically about page 18 where he starts to discuss proc cache and ELC. In it, he describes the difference between the ELC behavior in ASE 15 and ASE 16 - specifically pointing out that in ASE 16 the ELC size is now configurable and by default double what it was in ASE 15 - as well as some other changes that improve ELC operations. Remember, the whole goal of ELC is to minimize contention on the proc cache manager spinlock by caching free proc cache pages locally. However, a *key* aspect of ELC is that we are discussing *free* proc cache pages - if the proc cache is nearly full.....well.....then there is nothing to put into ELC and therefore the various engines have to free stuff from proc cache to make room....which means contention on the proc cache spinlocks.
So, you can see that TF757 really has no direct bearing on ELC. TF757 really only kicks in when the proc cache is nearly full, which implies that the ELC has been exhausted. If the proc cache is NOT full and ELC is exhausted, the engine simply grabs another chunk of memory pages for the ELC.....
Now then - unfortunately, data can be contorted to mean many different things. For example, in your list of procs, you listed one (f_update_instruction_p) that you (I think - I could be wrong) were pointing out had a large number of plans in cache (195), as if to imply that proc plans were not being removed rapidly enough. Unfortunately, the data you provided - summarized as it was - was insufficient to support such a conclusion. For example, if 195 connections executed that proc concurrently, you would legitimately need 195 plans in cache as ASE 15 doesn't (normally) do plan sharing.
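As a rough illustration of why a plan count by itself says little: without plan sharing, the number of plans you legitimately need is the peak number of overlapping executions. The sketch below computes that peak from (start, end) execution times. The function name and the sample intervals are hypothetical; real timings would have to come from your own monitoring captures.

```python
def peak_concurrency(intervals):
    """Return the maximum number of executions that overlap at any instant.

    intervals: iterable of (start, end) pairs for executions of one proc.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))    # an execution begins: one more plan in use
        events.append((end, -1))     # an execution ends: its plan becomes free
    # At equal timestamps, process ends (-1) before starts (+1) so that a
    # back-to-back execution can reuse the plan that was just released.
    events.sort(key=lambda e: (e[0], e[1]))

    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical sample: three executions, at most two overlap at once.
sample_peak = peak_concurrency([(0, 10), (5, 15), (10, 20)])

# 195 connections all running the proc at the same time need 195 plans.
burst_peak = peak_concurrency([(0, 10)] * 195)
```

If the peak for f_update_instruction_p during the capture window really was around 195, then 195 plans in cache is expected behavior rather than a sign that plans are not being aged out.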
All of this is only one possibility driving your problem. One of the most common issues that can drive proc cache higher while slowing execution speeds is quite simple - merge joins - specifically sort merge joins. There are others - e.g. if there is a lot of recompilation - then proc cache will be under memory pressure....but let's deal with that later (again). What you should really do is plot CPUTime and LogicalReads from monProcessActivity.....you will need two different scales, but if the spikes coincide - then it is more likely you have merge joins going on - especially if monOpenObjectActivity shows entries with IndexID=0 as the top objects when sorted by LogicalReads in descending order.
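If you would rather put a number on "the spikes coincide" than eyeball a plot, something like the following sketch works. It assumes you have already sampled CPUTime and LogicalReads at regular intervals and that the values are cumulative counters (hence the deltas). The sample numbers are invented, and the helper names are mine.

```python
from math import sqrt


def deltas(cumulative):
    """Turn cumulative counter samples into per-interval differences."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient; returns 0.0 if either is flat."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Invented samples, one per monitoring interval (cumulative values).
cpu_time_samples = [100, 110, 120, 220, 230, 330, 340]
logical_read_samples = [1000, 2000, 3000, 53000, 54000, 104000, 105000]

cpu_deltas = deltas(cpu_time_samples)
lio_deltas = deltas(logical_read_samples)
r = pearson(cpu_deltas, lio_deltas)
```

A coefficient close to 1 means the CPU spikes line up with the LIO spikes, which points towards something like merge joins rather than proc cache. Using a correlation here is just one convenient choice; plotting the two delta series on separate scales tells you the same thing.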
Secondly, once you have eliminated LIO as the driver of the problem, I would make sure that you are actually seeing proc cache contention - and that is where monSpinlockActivity comes into play - or, if you are not on ASE 15.7, you will need to run spinmon.
One aspect to consider is that ASE 15 requires a lot more proc cache than ASE 12.5 did - one of the reasons is quite simple - more statistics. In ASE 12.5, while we supported 'update index statistics', few people used it.....and the default optimization techniques couldn't really leverage them (although there were quite a few cases where it could). In ASE 15, we started advocating (errrr....maybe even "demanding") that customers use update index statistics - which means that now, when there is any optimization to be done, a lot more statistics have to be loaded into proc cache for optimization. I am assuming you are aware of this and increased proc cache considerably as part of the 12.5 to 15 upgrade (since you brought this up) - otherwise, that *could* be the source of your problem.
However, now let's talk about proc cache re-compilation. There are a number of instances in which this happens....e.g. we have a #temp table created outside the proc that the proc references (a common pre-15 technique used to try to force ASE to pick up an index on a #temp table). However, ASE 15 also added 'procedure deferred compilation'.....remember, this option allows ASE to recompile a procedure at the first execution of a statement that contains a @var or #temp table. Soooooo.....we load a proc into cache, we create a plan, start executing it.....and voila - first thing we know we are recompiling a statement due to a @var or #temp. This creates a NEW plan - and subsequent executions of the same statement in that plan do not cause a recompilation....but you can easily see how this impacts proc cache....interestingly, if you trace monSysStatement for that execution, you will actually see the PlanID change. Net result, along with the increased number of stats, we need more proc cache for deferred compilation.....
To really see how this could be impacting, consider the following scenario:
1) User connects to ASE and creates a few #temp tables for their session.
2) User executes a stored proc...which populates the #temp tables and returns the results and then truncates the #temp table
Since there are concurrent users, the user grabs a proc plan from cache. However, because of the #temp tables, the plan is recompiled. During execution, we hit a statement with @var and/or #temp....and you guessed it......
Much of the fault here lies with a 12.5 style of coding that was adopted to try to eliminate contention on system tables in tempdb - namely, pre-creating the #temp tables upfront. However, my experience with customers that did this was that it often increased the log semaphore contention on tempdb due to the increase in logged IO in tempdb - and in some cases, the net result was identical - just the blocking shifted from system tables to the log semaphore, which wasn't visible in syslocks/monLocks. The other coding style was the use of subprocs to populate the #temp tables, after which the outer proc continued executing.....this, in my mind, is a legitimate use of subprocs to simplify proc maintenance......but it has the nasty side effect of causing a recompilation....
However, think about it.....if I have 4 tempdb's or 1 tempdb, the amount of recompilation due to #temp should remain the same....as it wouldn't matter if everything was in 1 tempdb or not - the proc plan would have to be recompiled. What we don't know (for example) is what impact the IMDB setup had on decreasing memory available for data cache - and THIS could result in an increase in response time by driving PIO higher.....but by simply stating the symptoms, we don't know if this is what is happening or not.
Certainly, enabling minimal_logging would NOT make any difference to the number of temp tables created, NOR would it have any impact on the number of transactions. Physically impossible. The ONLY thing it does is change whether transactions are logged or not, based on transaction size.....e.g. if the transaction exceeds the size of the session tempdb ULC, it will still be logged to allow it to be rolled back, but if it doesn't exceed the size of the session tempdb ULC, it won't be logged. Sooooooo......
1 - If you are seeing a difference in the number of transactions or number of #temp tables, you are either experiencing a different workload - OR - your monitoring is faulty and you are missing those data elements in the previous captures. I would bet on the former.
2 - That increase/change in workload is likely driving the difference in proc cache, which has its own set of symptoms
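The minimal logging rule described above can be boiled down to a few lines. This is only a toy restatement of the rule as I described it, not ASE code; the function name, the byte sizes and the ULC size are all invented for the example.

```python
def summarize_transactions(txn_sizes, ulc_size, minimal_logging):
    """Return (transaction_count, logged_count) for a list of transaction sizes.

    txn_sizes: amount of log data each tempdb transaction generates (bytes).
    ulc_size: size of the session tempdb ULC (bytes) - hypothetical value.
    minimal_logging: when False, every transaction is logged.
    """
    transaction_count = len(txn_sizes)
    if minimal_logging:
        # Only transactions that overflow the ULC still get logged.
        logged_count = sum(1 for size in txn_sizes if size > ulc_size)
    else:
        logged_count = transaction_count
    return transaction_count, logged_count


# Invented workload: three small transactions and one large one.
workload = [512, 1024, 1500, 8192]
with_minimal = summarize_transactions(workload, ulc_size=2048, minimal_logging=True)
without_minimal = summarize_transactions(workload, ulc_size=2048, minimal_logging=False)
```

Note that the first number is the same in both cases: the option changes how many transactions get logged, never how many transactions (or #temp tables) there are. If that first number differs between captures, the workload differs.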
You may not wish to believe me - but send this post to your "more famous" consultant and ask his opinion....just make sure you mention my name as the source of the post.
Either way, in order to reduce the amount of recompilation and impact on proc cache, you may want to consider rewriting some of the stored procs in such a way that recompilation is not as frequent.