Making smaller HTML files? (8)

1 Name: Squeeks!ZW8mHclwsg 2004-12-24 07:56 ID:Heaven

I have just gone through and looked at file sizes, and 4-ch.net is now 150MB in size. That might not seem like much, but for a set of discussion boards with, to be honest, not a truckload of data on them, that's rather big. WAHa recently did a big overhaul to make both Kareha and Wakaba run faster by using wakautils.pl to offload primitive tasks. Now for the next optimisation: smaller, cleaner, more compliant XHTML, whilst retaining its functionality and looks.

Things that could be altered to save on filesizes:

  • Removal of any redundant tags (I haven't found any, but this may apply)
  • Using a separate Perl script to offload all the extra management options? (This would also remove a lot of the dependency on JavaScript.)
  • Using CSS to alter HTML code instead of using <div> tags.

I also had a somewhat comedic option: ditching XHTML and using XML+XSLT. Implementing that so that everyone could see it would be rather difficult, let alone making it look half decent, but file sizes would be about as raw as you could get them.

2 Name: !WAHa.06x36 2004-12-24 08:48 ID:csE7AvDW

The <div>s are there exactly because of CSS. The management options also take up very little space in the HTML. Most of it is in the JavaScript, which should only get downloaded once and then be kept cached.

What WOULD help a lot though, is to get mod_deflate or mod_gzip running. This would probably get the transferred sizes cut down to half or a third. HTML is very verbose, and compresses well.
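To get a rough feel for how well this sort of markup compresses, here is a minimal sketch using Python's standard gzip module. The page is a made-up reply block repeated 200 times, not actual Kareha output, so it compresses far better than a real thread with varied text would. Note that this only shrinks what goes over the wire; the files on disk stay the same size.

```python
import gzip

# Hypothetical page: a made-up reply block repeated many times,
# standing in for a thread page. Not real Kareha markup.
reply = (
    '<div class="reply"><h3><span class="replynum">1</span> '
    'Name: <span class="postername">Anonymous</span> 2004-12-24 08:48</h3>'
    '<div class="replytext"><p>Example post text number {n}.</p></div></div>\n'
)
page = "<html><body>\n" + "".join(reply.format(n=n) for n in range(200)) + "</body></html>\n"

raw = page.encode("utf-8")
packed = gzip.compress(raw)   # same DEFLATE algorithm that mod_deflate and mod_gzip apply

ratio = len(packed) / len(raw)
print(f"raw: {len(raw)} bytes, gzipped: {len(packed)} bytes, ratio: {ratio:.2f}")
```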

3 Name: Squeeks!ZW8mHclwsg 2004-12-24 09:52 ID:Heaven

It's more that the boards are a disk filler than a bandwidth chewer. It's not a problem at the moment, but if things are going to get bigger over in my neck of the woods, then some way of reducing disk usage is in order. If several hundred bytes can be removed from each page in the /res/ folders, that's many more megabytes of space for other things.

4 Name: Albright!LC/IWhc3yc 2004-12-24 19:09 ID:cqAFDNVk

Having the messages stored in a database and just plugged into a template when requested, instead of creating and storing static pages, would reduce disk space usage. Of course, Kareha can't do this, so...
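As a rough illustration of that approach, here is a minimal sketch using Python's sqlite3 and string.Template. The table layout, the templates and the function names are all invented for the example and have nothing to do with how Kareha or Wakaba actually store posts. The point is only that the markup lives once in the template, while the database holds nothing but the raw post data.

```python
import sqlite3
from html import escape
from string import Template

# Hypothetical templates and schema, for illustration only.
PAGE = Template("<html><body><h1>$title</h1>\n$posts</body></html>")
POST = Template('<div class="reply"><b>$num</b> Name: $name<p>$body</p></div>\n')

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (thread INTEGER, num INTEGER, name TEXT, body TEXT)")
    return db

def add_post(db, thread, name, body):
    # Number the new post after the ones already in the thread.
    (count,) = db.execute(
        "SELECT COUNT(*) FROM posts WHERE thread = ?", (thread,)
    ).fetchone()
    db.execute("INSERT INTO posts VALUES (?, ?, ?, ?)", (thread, count + 1, name, body))
    db.commit()

def render_thread(db, thread, title):
    # Build the page on request: one query, then plug each row into the template.
    rows = db.execute(
        "SELECT num, name, body FROM posts WHERE thread = ? ORDER BY num", (thread,)
    )
    posts = "".join(
        POST.substitute(num=num, name=escape(name), body=escape(body))
        for num, name, body in rows
    )
    return PAGE.substitute(title=escape(title), posts=posts)

db = make_db()
add_post(db, 1, "Anonymous", "First <post>")
add_post(db, 1, "Anonymous", "Second post")
html_page = render_thread(db, 1, "Making smaller HTML files?")
print(html_page)
```

Every page view runs render_thread, so the disk saving is paid for with a query and a template pass per request.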

5 Name: !WAHa.06x36 2004-12-25 08:13 ID:wAFHOXaN

>>4

That would also increase CPU usage dramatically. The reason static pages are built is to make it possible to handle the load of a high-traffic site. For a smaller personal board this isn't much of a problem, but on a site like iichan where you had many requests per second, you don't want to keep hitting the database.

4chan, back in the day, was close to dying just because of the SQL traffic, and they only had thread pages being built dynamically. Older versions of futaba (like the one futallaby is based on) did this, but they've also changed to all static pages in the newer versions.

6 Name: MAD-Leser!r4YvKJpWUc 2004-12-26 10:18 ID:HPHjRgpU

then, how about a memory-resident script implemented with fastcgi? it could use zlib to store data in memory and on disk. requested threads get cached and have a limited lifetime, so when no new requests for a thread come in, it gets purged from memory. when someone posts, the data enters a queue which also has a limited lifetime: if no new posts arrive before it runs out, the thread is written to disk, otherwise the new post is appended to the queue. note that i've never used zlib, so i don't know if it's better to immediately append new posts to a cached thread (and use the same lifetime approach for writing it to disk later on) or to put them in a queue first. the former might be better if zlib is able to append data instead of compressing everything anew.
this should keep various overhead at a minimum level and make kareha(/wakaba?) feasible for high-traffic sites.

... but that's only the theory. fastcgi doesn't seem to be supported everywhere and i'm sure that i've overlooked a lot of things, but i found this to be an interesting thought anyway.
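A minimal sketch of the in-memory half of this idea, in Python rather than Perl: a cache that holds zlib-compressed threads and drops any that have not been touched within their lifetime. The class and method names are invented for the example, and the write queue and the on-disk side are left out entirely. On the append question: zlib's streaming interface (compressobj in Python) can keep feeding data into an open stream, but this sketch takes the simpler route of decompressing, appending and recompressing the whole thread.

```python
import time
import zlib

class ThreadCache:
    """Keeps zlib-compressed thread text in memory and drops idle entries."""

    def __init__(self, lifetime=300.0, clock=time.monotonic):
        self.lifetime = lifetime   # seconds an entry may sit unused
        self.clock = clock         # injectable so the purge logic can be tested
        self.entries = {}          # thread id -> (compressed bytes, last access time)

    def put(self, thread_id, text):
        self.entries[thread_id] = (zlib.compress(text.encode("utf-8")), self.clock())

    def get(self, thread_id):
        entry = self.entries.get(thread_id)
        if entry is None:
            return None            # a real script would fall back to reading from disk
        packed, _ = entry
        self.entries[thread_id] = (packed, self.clock())  # refresh lifetime on access
        return zlib.decompress(packed).decode("utf-8")

    def append(self, thread_id, post):
        # Simplest approach: decompress, append, recompress the whole thread.
        current = self.get(thread_id) or ""
        self.put(thread_id, current + post)

    def purge(self):
        # Drop every entry that has been idle for longer than the lifetime.
        now = self.clock()
        stale = [tid for tid, (_, seen) in self.entries.items()
                 if now - seen > self.lifetime]
        for tid in stale:
            del self.entries[tid]
        return stale

# Usage with a fake clock so the lifetime can be stepped by hand.
now = [0.0]
cache = ThreadCache(lifetime=60.0, clock=lambda: now[0])
cache.put(1, "first post\n")
cache.append(1, "second post\n")
text = cache.get(1)
now[0] = 61.0          # pretend 61 seconds pass with no requests
purged = cache.purge()
```

In a long-running process, purge would be called periodically, for instance once per request or from a timer.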

7 Name: !WAHa.06x36 2004-12-26 12:54 ID:9lLEjcUr

It isn't a bad idea, no (and mod_perl is even better than fastcgi for this, no doubt), but it is a lot of work. It would also make the script much less compatible with various server setups.

8 Name: Squeeks!ZW8mHclwsg 2004-12-27 10:58 ID:Heaven

>>6

I like the creativity behind this idea. Looks like you put some brains into it.

If this could be implemented and proven to be highly efficient at its task, without being "flaky" about data storage, I would no doubt look at a trial implementation of such an idea if server loads become enough of a problem.
