then how about a memory-resident script implemented with fastcgi? it could use zlib to compress the data it keeps in memory and on disk. requested threads get cached with a limited lifetime: when no new requests for a thread come in, it gets purged from memory. when someone posts, the post enters a queue that also has a limited lifetime: if no new posts arrive before it expires, the thread is written to disk, otherwise the new post is appended to the queue. note that i've never used zlib, so i don't know whether it's better to append new posts to the cached thread immediately (and use the same lifetime trick to write it to disk later) or to put them in a queue first. the former might be better if zlib is able to append data instead of compressing everything anew.
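to make the idea a bit more concrete, here's a rough sketch in python (not kareha's perl, and not tested against a real board). it assumes the "append immediately" variant: each cached thread is one growing zlib stream, and a sync flush after every post keeps the stream readable without recompressing. the names `CachedThread`, `ThreadCache`, `read_ttl` and `write_delay` are made up for illustration, the "disk" is just a dict, and the fastcgi request loop that would call `post`, `read` and `tick` is left out entirely.

```python
import zlib


class CachedThread:
    """One thread kept in memory as a single growing zlib stream."""

    def __init__(self, now):
        self.compressor = zlib.compressobj()
        self.chunks = []        # compressed pieces, in posting order
        self.last_read = now    # last time the thread was requested
        self.last_post = None   # time of the newest unwritten post, if any

    def append(self, post, now):
        # Z_SYNC_FLUSH pushes out everything compressed so far without
        # ending the stream, so later posts can simply be appended.
        data = self.compressor.compress(post.encode("utf-8"))
        data += self.compressor.flush(zlib.Z_SYNC_FLUSH)
        self.chunks.append(data)
        self.last_post = now

    def text(self):
        raw = zlib.decompressobj().decompress(b"".join(self.chunks))
        return raw.decode("utf-8")


class ThreadCache:
    """Hypothetical cache: purge idle threads, write quiet ones to disk."""

    def __init__(self, read_ttl=300.0, write_delay=30.0):
        self.read_ttl = read_ttl        # seconds without reads before purging
        self.write_delay = write_delay  # seconds without posts before writing
        self.threads = {}
        self.disk = {}                  # stand-in for real files

    def _get(self, thread_id, now):
        thread = self.threads.get(thread_id)
        if thread is None:
            thread = self.threads[thread_id] = CachedThread(now)
            if thread_id in self.disk:
                # the compressor state was lost with the purge, so the
                # stored thread is recompressed once when it is reloaded
                old = zlib.decompressobj().decompress(self.disk[thread_id])
                thread.append(old.decode("utf-8"), now)
                thread.last_post = None  # nothing new to write yet
        return thread

    def post(self, thread_id, post, now):
        self._get(thread_id, now).append(post, now)

    def read(self, thread_id, now):
        thread = self._get(thread_id, now)
        thread.last_read = now
        return thread.text()

    def tick(self, now):
        """Call periodically: flush quiet threads, purge idle ones."""
        for thread_id, thread in list(self.threads.items()):
            quiet = thread.last_post is not None
            if quiet and now - thread.last_post >= self.write_delay:
                self.disk[thread_id] = b"".join(thread.chunks)
                thread.last_post = None
            idle = now - thread.last_read >= self.read_ttl
            if thread.last_post is None and idle:
                del self.threads[thread_id]
```

so zlib can append within one running process, at the cost of a few extra bytes per flush. the catch in this sketch is that once a thread is purged its compressor state is gone, so the first post or read after a reload pays for one full recompression. whether that beats a separate post queue probably depends on how often threads fall out of the cache.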
this should keep the various kinds of overhead to a minimum and make kareha (and wakaba?) feasible for high-traffic sites.
... but that's only the theory. fastcgi doesn't seem to be supported everywhere, and i'm sure i've overlooked a lot of things. still, i found it an interesting thought.