Hi,

I replaced LBAcache 09jun2004 with a new 17jun2004 version. This should hopefully fix the (XMS) errors in the new 6/2004 LBAcache family, which is 16-way associative instead of fully associative. Please test.

Changes: better xmscopy.asm error messages / checks, and a fix for a stupid binsel2.asm bug which allowed the 16-way assoc code to search for / allocate cache in up to (surprise, surprise) 16 cache elements (usually 8k each) beyond the end of the cache.

http://www.coli.uni-sb.de/~eric/stuff/soft/
lbacache-17jun2004.zip

Some statistics (Tom's RAREAD on my MVP3 board / K6-2 500 MHz CPU with 100 MHz SDRAM; UMBPCI UMBs are excluded from all memory (L1/L2) caches, alas)...

The change from fully associative mode (every place in the cache can be used for everything) to 16-way associative mode (only 16 element slots can be used for any particular element, and the sector number determines the search range) means FASTER search in the table but a WORSE cache hit percentage: sectors have to be removed from the cache more often, because the cache is less flexible in 16-way associative mode when it comes to allocating new cache slots.

If all accesses are cache MISSES (a linear read of a huge amount of data will usually result in cache misses): speed gain from 1 to 4 MB/s for LBAcache in a UMBPCI UMB, and from 3.5/5.5 to 8.x MB/s otherwise (3.5 with FDXXMS and 5.5 with DREMM386, both with the fully associative cache; 8.x with the 16-way cache and FreeDOS HIMEM / EMM386).

If all accesses are cache HITS (I used the RAREAD "cached" test modes): speed gain for LBAcache in a UMBPCI UMB: 4->16 MB/s. For LBAcache in fast normal RAM: fully assoc + FDXXMS 8.x MB/s, fully assoc + DREMM386 50 MB/s (!?), 16-way assoc + FreeDOS HIMEM / EMM386 50-58 MB/s.

Sorry, but I did not re-test the new cache with FDXXMS / DREMM386, nor did I re-test the old cache with FreeDOS HIMEM / EMM386. You may want to do that yourself...
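To make the fully vs. 16-way associative trade-off described above a bit more concrete, here is a small Python toy model. It is only a sketch and not how LBAcache itself is written: the function name `simulate`, the LRU replacement policy and the "element number modulo number of sets" mapping are all assumptions made for illustration.

```python
from collections import OrderedDict

def simulate(accesses, total_slots, ways):
    """Count cache hits for a toy cache (illustrative model, not LBAcache code).

    total_slots: number of cache elements in the whole cache
    ways:        slots per set; ways == total_slots models "fully associative"
    Assumptions: LRU replacement, set chosen as element number % number of sets.
    """
    num_sets = total_slots // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for element in accesses:
        slot_set = sets[element % num_sets]   # only this set is searched
        if element in slot_set:
            hits += 1
            slot_set.move_to_end(element)     # mark as most recently used
        else:
            if len(slot_set) >= ways:
                slot_set.popitem(last=False)  # evict least recently used
            slot_set[element] = True
    return hits

# 32 distinct elements that all map to the same set when there are 4 sets,
# read twice in a row (a worst case for the set-associative layout).
unlucky = [4 * i for i in range(32)] * 2
# 64 consecutive elements, read twice (spreads evenly over all sets).
friendly = list(range(64)) * 2

if __name__ == "__main__":
    for name, pattern in (("unlucky", unlucky), ("friendly", friendly)):
        print(name,
              "fully:", simulate(pattern, 64, 64),
              "16-way:", simulate(pattern, 64, 16))
```

With a 64-element cache, the "unlucky" pattern gets 32 hits in the fully associative model and 0 hits in the 16-way model, while the "friendly" pattern gets 64 hits in both. Real access patterns sit somewhere in between, which is why measuring the hit ratio with a real-life program matters.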
Some more systematic tests would be a nice idea, I guess.

Note that neither "all HITS" nor "all MISSES" is a realistic measure. You should test with some real-life program and check:

1. the SPEED (use RUNTIME or a stopwatch) and
2. the CACHE HIT RATIO (use LBACACHE STAT and, if needed, also the ZERO and SYNC commands of LBACACHE).

It may sound cool to have 100-300% faster cache miss processing and 0%-300% faster cache hit processing, but if the cache hit percentage dropped too much because of the 16-way-assoc (as opposed to fully-assoc) behaviour, you will get WORSE overall speed. Happy testing! Maybe I could make associativity a command line option?

Note that speed differences will occur especially on slow CPU / RAM with bigger caches (you will have guessed that). On fast systems with small caches, the cache hit percentage drop might outweigh the gains. If this happens, try using a bigger cache to get the cache hit percentage up again.

[Summary of stats: fully assoc 5/2004 -> 16-way assoc 6/2004 cache speeds in MB/s, "?" = not tested:

                 Cache MISS     Cache HIT
  umbpci         1   -> 4       4   -> 12
  fdxxms         3.5 -> ?       8.x -> ?
  dremm386       5.5 -> ?       50  -> ?
  himem/emm386   ?   -> 8.x     ?   -> 50-58
...]

Eric
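A rough way to see why a lower hit percentage can cancel out faster hit / miss processing: if a fraction h of the data is served at the hit speed and the rest at the miss speed, the overall throughput is the time-weighted combination of the two. The sketch below assumes exactly that simple model. The function name `effective_mbps` is made up for illustration, the hit ratios 0.9 and 0.7 are invented, and the speeds are only loosely based on the figures above.

```python
def effective_mbps(hit_ratio, hit_mbps, miss_mbps):
    """Overall MB/s when `hit_ratio` of the data is read at hit_mbps
    and the rest at miss_mbps (simple time-weighted model, an assumption)."""
    seconds_per_mb = hit_ratio / hit_mbps + (1.0 - hit_ratio) / miss_mbps
    return 1.0 / seconds_per_mb

# Hypothetical example values, NOT measurements:
# "old" cache: slower misses (5.5 MB/s) but 90% hits,
# "new" cache: faster misses (8 MB/s) but only 70% hits.
old_speed = effective_mbps(0.9, 50.0, 5.5)
new_speed = effective_mbps(0.7, 50.0, 8.0)

if __name__ == "__main__":
    print("old: %.1f MB/s, new: %.1f MB/s" % (old_speed, new_speed))
```

In this made-up case the cache with the faster miss handling still ends up slower overall, because it misses three times as often. Plugging in your own RUNTIME and LBACACHE STAT results instead of the invented numbers gives a quick sanity check.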