If you have queried v$sgastat on recent Oracle versions (by which I mean 9i and above) you probably have seen allocations for some sort of simulators in Oracle instance. Here’s an example:
SQL> select * from v$sgastat where lower(name) like '%sim%' order by name; POOL NAME BYTES ------------ -------------------------- ---------- shared pool kglsim alloc latch area 1700 shared pool kglsim alloc latches 68 shared pool kglsim count of pinned he 9248 shared pool kglsim free heap list 204 shared pool kglsim free obj list 204 shared pool kglsim hash table 4104 shared pool kglsim hash table bkts 2097152 shared pool kglsim heap 635536 shared pool kglsim latch area 1700 shared pool kglsim latches 68 shared pool kglsim main lru count 87040 shared pool kglsim main lru size 174080 shared pool kglsim object batch 909440 shared pool kglsim pin list arr 816 shared pool kglsim recovery area 2112 shared pool kglsim sga 22092 shared pool kglsim size of pinned mem 18496 shared pool ksim client list 84 shared pool log_simultaneous_copies 480 shared pool sim cache nbufs 640 shared pool sim cache sizes 640 shared pool sim kghx free lists 4 shared pool sim lru segments 640 shared pool sim segment hits 1280 shared pool sim segment num bufs 640 shared pool sim state object 48 shared pool sim trace buf 5140 shared pool sim trace buf context 120 shared pool sim_knlasg 1200 shared pool simulator hash buckets 16512 shared pool simulator latch/bucket st 4608 31 rows selected.
See, a bunch of “kgl sim” and then just “sim” allocations.
… or sometimes you can see latch contention on following latches:
SQL> select name from v$latch where name like '%sim%'; NAME ------------------------------------------------------- ksim membership request latch simulator lru latch simulator hash latch sim partition latch shared pool simulator shared pool sim alloc 6 rows selected.
Again, there seems to be some “simulation” work going on in Oracle instance.
So what are these simulators about?
These simulators help Oracle to determine how much time and effort would have been saved if buffer cache or shared pool size was bigger than current. And this info is summarized and externalized in various advisor views such V$SHARED_POOL_ADVICE, V$MTTR_TARGET_ADVICE, V$DB_CACHE_ADVICE and so on. These simulation engines were introduced in Oracle 9i along the mentioned V$ views.
Also this stuff is used by MMAN for making SGA resizing decisions if SGA_TARGET (10g+) or MEMORY_TARGET (11g+) is enabled.
I decided to write an article about simulators as quite a few people have been asking me about it lately. So, I first googled around to see if someone has already written about it already and found one here. It turned out it was written by me, I just didn’t know about it anymore :)
So, I will write an edited version of my original explanation here. Note that I don’t know the exact algorithm of the simulator, just explaining the fundamentals here.
KGL simulator is the shared pool memory allocation tracking and simulation engine.
KGL=Kernel General Library cache manager, as the name says it deals with library objects such cursors, cached stored object definitions (PL/SQL stored procs, table definitions etc).
KGL simulator is used for estimating the benefit of caching if the cache was larger than currently. The general idea is that when flushing out a library cache object, it’s hash value (and few other bits of info) are still kept in the KGL simulator hash table. This stores a history of objects which were in memory in past but got flushed out. That’s the key part: we keep a history of objects flushed out (this is done by selective sampling though, not at every object flush as it would have too much overhead).
When loading a library cache object into library cache (which means that no existing such object is in there already), Oracle goes and checks the KGL simulator hash table to see whether an object with matching hash value is in there. If a matching object is found, that means that the required object had been in cache in past, but flushed out due space pressure. That’s the second important part: we know when we would have been able to (probably) just reuse an object if the shared pool had been bigger.
Using that information of how many library cache object (re)loads could have been been avoided if cache had been bigger (thanks to KGL simulator history) and also thanks to knowing how much time the (current) object reloads took, Oracle can predict how much response time would have been saved instancewide if shared pool was bigger. So, we don’t need to increase shared pool itself to measure the benefit, we simulate a bigger shared pool by just keeping a history of aged out library cache objects hash values in a special array – and that’s why there are various kglsim and simulator allocations seen in V$SGASTAT.
The simulation results are seen from v$shared_pool_advice and are used by other advisors such ADDM.
Apparently Oracle keeps track also about how much memory is pinned in the shared pool (thus can’t be flushed out) over time. This info is probably used also for estimating shared pool resize requirements as pinned memory chunks are not usable for anything else (as long as they are pinned) in the shared pool. These measurements are also done on sampling basis.
I did a little test with my lotshparses.sql script (which just runs a lot of hard parses inside a PL/SQL loop) and ran latchprofx to report which latches and why was my session using. Note the “sim” instead of usual % in the latchprofx syntax. I am interested in only these latches which have “sim” in their name in this experiment:
SQL> @latchprofx sid,name,func,hmode 146 sim 100000 -- LatchProfX 1.12 by Tanel Poder ( http://www.tanelpoder.com ) SID NAME FUNC HMODE Held Gets Held % ---------- ----------------------------------- ---------------------------------------- ------------ ---------- ---------- ------- 146 shared pool simulator kglsim_unpin_simhp exclusive 514 49 .51 146 shared pool simulator kglsim_upd_newhp exclusive 206 198 .21 146 shared pool simulator kglsim_scan_lru: scan exclusive 95 93 .10 146 shared pool simulator kglsim_chg_simhp_free exclusive 32 32 .03 146 shared pool simulator kglsim_chg_simhp_noob exclusive 32 32 .03 146 shared pool sim alloc kglsim_chk_heaplist: alloc exclusive 4 3 .00 146 shared pool sim alloc kglsim_chk_objlist: alloc exclusive 3 2 .00 146 shared pool simulator kglsim_chk_objlist: extend exclusive 1 1 .00 146 shared pool simulator kglsim_scan_lru: rebalance exclusive 1 1 .00
This simulation engine is pretty cool because its the first time someone has tried to measure the time what could be saved by increasing shared pool size. (Response) Time is the ultimate measure of performance, so this simulator measures the right thing. Do you remember what people used before that? Library cache hit ratio! A lot of people back then (and some still today) have been misleaded by measuring the wrong things like hit ratios.
Anyway, while this is a cool feature – it can cause trouble in big systems. This is because the extra CPU usage for simulation and also the potential concurrency issues like simulator latch contention (like the shared pool simulator latch above, these latches are used even on 11g where most library cache latches have been replaced with KGX mutexes for efficiency and avoiding false contention issues). Also, some people are sensitive about the potential large memory allocations the shared pool simulator can take from shared pool (this is controlled by _kglsim_max_percent parameter which defaults to max 5% of total shared pool size). As these allocations are dynamic, then when your application suddenly starts making lots of hard parses due newly released code, causing memory shortage, then the simulator can make things even worse as it can make relatively big allocations for itself too! And as far as I know the simulator arrays don’t shrink back.
Note that I’m talking about big systems here, in most other cases the simulation shouldn’t cause trouble (on recent db versions anyway).
In some databases I have disabled all of the advisors and simulators with setting statistics_level = basic. All features what you see from V$STATISTICS_LEVEL would be disabled when setting statistics_level to basic:
SQL> select statistics_name,activation_level from v$statistics_level; STATISTICS_NAME ACTIVAT ---------------------------------------- ------- Buffer Cache Advice TYPICAL MTTR Advice TYPICAL Timed Statistics TYPICAL Timed OS Statistics ALL Segment Level Statistics TYPICAL PGA Advice TYPICAL Plan Execution Statistics ALL Shared Pool Advice TYPICAL Modification Monitoring TYPICAL Longops Statistics TYPICAL Bind Data Capture TYPICAL Ultrafast Latch Statistics TYPICAL Threshold-based Alerts TYPICAL Global Cache Statistics TYPICAL Active Session History TYPICAL Undo Advisor, Alerts and Fast Ramp up TYPICAL Streams Pool Advice TYPICAL Time Model Events TYPICAL Plan Execution Sampling TYPICAL Automated Maintenance Tasks TYPICAL SQL Monitoring TYPICAL Adaptive Thresholds Enabled TYPICAL V$IOSTAT_* statistics TYPICAL . 23 rows selected.
Sometimes you don’t want to disable absolutely everything, but only the troublemakers. The good news is that there’s an (undocumented) parameter for disabling every feature separately. As always, make sure that you know what you’re doing before changing these and get blessing from Oracle support in order to not get yourself in trouble.
So, for example the library cache memory simulation (shared pool advice) can be disabled by setting _library_cache_advice = false .
I covered shared pool and library cache above. Oracle has more cache advisors, the underlying principle in them is the same. They save some small metadata/info about flushed out objects in a simulator array and just check that array when someone is allocating memory for new object. If matching object’s metadata still happens to be in the simulator array, then Oracle knows that we could have avoided the time and effort loading (generating) that object back into memory if the corresponding cache had been bigger. Of course the algorithm must be more sophisticated than just checking for object’s existence, there are probably object’s flush timestamp and size stored in the simulator arrays as well.
While the library cache simulator array stores library cache objects hash values for looking up historical memory allocations metadata, then the buffer cache advisor stores just data block addresses (DBAs) of blocks previously in cache. When Oracle spots that a block was previously in cache while intiating physical IO for re-reading it back again, it will measure the time it took to complete the current physical IO and add this info into the simulator array. Again, that’s how Oracle can measure how much response time would have been saved if the block would still have been in the buffer cache.
The buffer cache advisor needs some special attention, as I have had most issues with that (on large systems again), mostly with buffer cache’s simulator lru latch contention in systems with lots of CPUs and physical IOs going on. The problem here is that even though you can have hundreds of thousands of cache buffers chains latches in a system, the number of simulator lru latches remains relatively low by default, causing contention.
Oracle provides a documented parameter for disabling the buffer cache simulation – you need to set db_cache_advice = off (or ready).
So, hopefully this sheds some light into how the advisors work and what the heck the simulators simulate in Oracle. Even though I provided a list of parameters which allow you to control this stuff – don’t change any of them unless you actually have a problem which they should fix (and you are sure that these parameters would fix your problem).
Jonathan Lewis has written a good article about buffer cache advisor details on his website – I recommend to read that one too if you’re interested in this stuff:
NB! I am running one more Advanced Oracle Troubleshooting training in 2018! You can attend the live online training and can download personal video recordings too. The Part 1 starts on 29th January 2018 - sign up here!