Hello,
I'm the admin/owner of fatechan.net, and I run /moe/ and /cw/ on iichan. When I first started my boards, I had my own box I had coloed, and everything ran fine. Due to some monetary issues tho, I was forced to cancel my colo and move to shared hosting.
Problem is, my account has now been suspended several times for excessive cpu usage. I'm hosted with cirtexhosting.com, and I run kareha. I keep my thread count around 6 pages (I manually clean out the boards), but according to my host, I'm still eating 14% of cpu time.
Any ideas for hosts that would be more lenient with their cpu usages? Dreamhost is not an option, I'm just waiting for them to collapse from overselling. I use kareha for the custom tripcode catches, so if there's another imageboard soft out there that can do that, I could switch.
Any ideas at all? I'd rather not have to pay more for a semi-dedicated just for the imageboard.
Turn off captcha. Kareha uses next to no CPU power without it.
Captcha is off, but according to my host, here's my load numbers.
fatennet fatechan.net 6.77 7.26 0.0
Top Process %CPU 14.0 /usr/bin/perl kareha.pl
Top Process %CPU 11.9 httpd [www.fatechan.net] [/moe/src/1154017429681.jpg]
Top Process %CPU 10.7 httpd [www.fatechan.net] [/moe/thumb/1158840292216s.jpg]
I cut the page count down to 6 for both. It was at like 30ish, since I didn't like to clean out the board, and I really didn't have a need to.
Top Process %CPU 11.9 httpd [www.fatechan.net] [/moe/src/1154017429681.jpg]
Does that mean that 11.9% of CPU usage is taken up by serving an image? What the hell?
I have no idea what's going on, but the host is threatening to cancel my account, and they've suspended me twice for it.
This is perhaps obvious enough that you have already checked it, but are the pics being leeched? Googling reveals fatechan.net/moe is linked to from all over the web. All it takes is some thoughtless idiot doing image links on some really popular forum and there you go. Though how a thumb could eat up that much CPU is beyond me. Some forums allow people to link to sites for their avatar pics so maybe somebody is doing that?
BTW down to 6 pages now? Gosh, goodbye Arisa and Suzuka, goodbye Saya, goodbye ... Ayu? Uguuuuu!! I'm feeling torn. On the one hand I've been feeling guilty about bumping big threads but now I feel like I should rush in and bump the heck out of them as soon as /moe/ and /cw/ are back. But the latter might not be advisable in this situation. Uguu. Perhaps if you set up a public image archive for your fav pics and then let threads autosage at 111 pictures?
Before I cleaned out the boards, I backed it all up, so I still have all the images. If I can solve the cpu problems, I'll look into bringing back the threads I deleted.
As for leeching, I have entries in my .htaccess to prevent leeching.
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^http://(.+\.)?fatechan\.net/ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule .*\.(jpe?g|gif|bmp|png)$ /hotlinking.gif [L]
The image that leechers get redirected to is a 1x1 transparent gif btw.
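For anyone reading along, here is a rough Python sketch of the decision those rewrite rules appear to make. It is only an illustration of the logic, not how Apache evaluates them internally; the function name `is_hotlink` is made up for this example, and the two patterns are copied from the rules above.

```python
import re

# Patterns copied from the .htaccess rules; [NC] becomes re.IGNORECASE.
OWN_SITE = re.compile(r"^http://(.+\.)?fatechan\.net/", re.IGNORECASE)
IMAGE_URL = re.compile(r".*\.(jpe?g|gif|bmp|png)$")


def is_hotlink(path, referer):
    """Return True when the rules would serve /hotlinking.gif instead."""
    if not IMAGE_URL.match(path):
        return False  # RewriteRule pattern does not match, nothing happens
    if referer == "":
        return False  # second RewriteCond: an empty referer is let through
    # first RewriteCond: any referer outside fatechan.net counts as a leech
    return OWN_SITE.match(referer) is None
```

Note that the first pattern only lists `http://`, and that requests with no referer at all are always served the real image.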
Well, I think I found the issue. I noticed on my mediawiki install, thumbnails weren't working because it couldn't find imagemagick "convert" in the path. I'm wondering if the huge cpu time is from kareha having to do the resizing itself with the perl imagemagick module.
I've put the absolute paths to the convert command in my config.pl, gonna see if that helps.
Yay, it's back to 8 pages. But seriously, you don't mind me adding to those big threads? Your wallet and host can handle the traffic?
Feel free to post and make the big threads even bigger. I have over 2tb/month in bandwidth, and you guys are nowhere near pushing it
Only thing that worried me was cpu usage. If cpu usage stays within bounds, I'll look into remerging the old threads back into the boards. I might look into making an archive board, and setup a script, where if the thread count gets too large, it'll automatically move it to a read-only store. Serving static files uses no resources, just bandwidth, and I have more than enough to burn.
This is getting ridiculous, they're continuing to suspend me for excessive cpu usage. I'm on vanilla kareha, no modifications, and imagemagick is working. Captchas are off.
>>14 perhaps you could recommend another host that is not dreamhost?
age because fatechan still needs a host recommendation
the current plan is to hack kareha so it can host images on a separate server, then run it on a VPS (with minuscule bandwidth allowance) that TSO also has, but cirtex is somewhat unreliable so we'd like to switch anyway
>>16
I am extremely sad because your site is down. Please please bring it back!
did you not read the thread?
also i'm not TSO, just a friend
i love fatechan and am sad to see it go. I'm not a rich fellow by any means seeing that i'm in college but presuming donations are acceptable would money from users fix the problem and how much would presumably be needed?
The CPU thing sounds like bullshit. They could renice your tasks just the same as anything else. I mean, Kareha pretty much caches everything out the ass already, updating things mostly just when the relevant pages have changed or not been updated since last change.
It's not like /cw/ or /moe/ got enough traffic for that to happen or anything. (Could there be a hidden processor time sink thing in Kareha, for boards that've got a bit older? I remember 4ch's DQN used to have a huge post delay before the long tail was archived.)
>>21 He's running wakaba, though. I can't fathom what the problem might be.
No, I'm on Kareha. I'm still taking recommendations for friendlier shared hosting, but I'm prepared to hack kareha to run the script and db itself on a vps, then use shared hosting to host the images on a much higher bandwidth connection.
As for donations, I thank you guys for the offer, and please get in touch with me via my email (tso at fatechan.net) and I'll arrange paypal or something.
>>21
> It's not like /cw/ or /moe/ got enough traffic for that to happen or anything. (Could there be a hidden processor time sink thing in Kareha, for boards that've got a bit older? I remember 4ch's DQN used to have a huge post delay before the long tail was archived.)
fatechan.net actually pulls about 500-600gb/month of traffic, which is surprising. I didn't realize how popular my boards were, but it still baffles me how kareha could be taking up that much cpu time. When I hosted it on my own server, it took up nothing.
PS, it's she ;)
Oh, I forgot to mention. It seems shared hosting is useless, due to very restrictive cpu usages. VPS plans give way too little bandwidth and hd space, and semi-dedicated/dedicated seems to be overkill, and is wayyyy out of my price range.
With donations, semi-dedicated/dedicated might be feasible. I'm still willing to do a major kareha hack, to run the main script on a VPS, and then do the actual image storage/hosting on a shared host with much higher space and bandwidth. Doing that would also allow possible load-balancing and multiple hosts, but that'd take me quite a bit of time and effort to run properly.
And to be perfectly honest, I'm a student too. I really don't have time to sit here and code something major, especially something I've never really done. The semi-dedicated/dedicated route would be a lot easier, and a lot quicker, but would cost a little more (well, it might not, the price of a capable vps and shared hosting might be around the same).
Up to you guys, I'm doing this for you guys.
(The other thing is, there's already kei.iichan's /2/ board. Which is much the same thing as your two boards.)
Based on the forum names it might appear as if there's a lot of overlap but /2/ feels like most other iichan boards, while /moe/ and /cw/ have developed their own distinct dynamics. In any case, it's idlechan, the more boards the merrier.
There's a little backhistory of that. When my boards were added to the iichan list, they were brought on under the condition that /2/ would eventually be dropped, and my boards would replace them.
That's why the main page says it's the replacement for /2/, that was the original intention. I originally started out with just /moe/, but to take up the slack of /2/ I started /cw/ to help divide things out. But I guess Blackmage forgot :T
As far as I can tell, it's impossible to do load balancing on kareha as it is. Uploaded images would have to go to the high bandwidth servers somehow, after posting, and their names would have to be the same. I sadly suggest you run dreamhost, hell gurochan is running there and has not had any problems.
Not impossible at all. DNS round-robin setup. www.fatechan.net would point to the vps, and img.fatechan.net would point to a list of image hosts. All I have to do is write a wrapper script for kareha that is called when images are uploaded to push the image to the hosts.
Hell, I don't even have to do the DNS magic. I can just have the wrapper script do that as well. EX. Kareha wants to link to an image. It thinks the images are in /src/wrapper.php/imagename.whatever. What really happens is with that url, the image name is passed to the wrapper script, the script goes "hay, let me go through my list of hosts, and randomly pick one, and pass that onto the person requesting it."
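The "pick a host at random and pass it on" step could look roughly like this. It is a minimal sketch in Python rather than the PHP wrapper mentioned above; the hostnames and the function name `redirect_target` are placeholders invented for the example, and the real wrapper would still have to send the HTTP redirect itself.

```python
import random

# Hypothetical mirror list; these hostnames are placeholders.
IMAGE_HOSTS = [
    "http://img1.example.net",
    "http://img2.example.net",
]


def redirect_target(image_name, hosts=IMAGE_HOSTS, rng=random):
    """Pick one mirror at random and build the URL to redirect the client to."""
    host = rng.choice(hosts)
    return f"{host}/src/{image_name}"
```

Random choice spreads requests evenly on average without any shared state, which is about as simple as load balancing gets. It does assume every mirror already has every image.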
The hard part for me is not the loadbalancing for the image displaying, it's figuring out how to reliably process and push uploaded images to the hosts, and keep them in sync in realtime. I've already got a basic framework going, I just need to do a little more testing and some security setup and then burn it in.
I plan on GPL'ing my abstraction hacks/scripts, to make it easy for other image boards to deal with scaling/bandwidth issues.
As for dreamhost, I'm not worried about their level of service, I'm terrified I'll give them money, then they'll finally collapse from such drastic overselling. And from what I've experienced, dreamhosts are very serious cpu nazis, and if kareha is giving me fits on a far less stressed host, I can't imagine what dreamhost is going to do to me.
Any other recommendations? I was looking at LunarPages or Servint right now.
> how to reliably process and push uploaded images to the hosts, and keep them in sync in realtime
Maybe you could just set up a cronjob on the image hosts to check for new images in a directory on the VPS every hour or so, and then a fallback for the wrapper script to fetch the image locally if it couldn't be found on the image hosts. It'd consume some extra bandwidth on the VPS, but nowhere near as much as having it all on one host.
Also, why not make your stuff public domain like Wakaba/Kareha themselves? No one's going to make a profit selling imageboard software, and you'll make the copyleft haters happier.
Per lunarpages: http://www.bearcityweb.com/lunarpages/index.html
Didn't realize kareha was public domain, I just assumed gpl/some sort of bsd license.
As for that review, that was made over a year ago, it seems they've shaped up a lot since then, and so far I've read a lot of good things.
> Maybe you could just set up a cronjob on the image hosts to check for new images in a directory on the VPS every hour or so, and then a fallback for the wrapper script to fetch the image locally if it couldn't be found on the image hosts. It'd consume some extra bandwidth on the VPS, but nowhere near as much as having it all on one host.
Was actually just thinking about just letting the script push it itself, with ftp or possibly another catch script on the other site. Then make a cronjob do an actual rsync or whatever to keep all the mirrors in line, just in case one was down when an image was first released.
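A rough sketch of that "push right away, let a cronjob catch up later" idea, in Python. To keep it runnable, plain local directories stand in for the mirrors; a real version would use ftp, rsync or an upload script on the remote side instead. The function names `push_image` and `sync_mirror` are made up for this example.

```python
import shutil
from pathlib import Path


def push_image(image, mirrors):
    """Copy one new image to every mirror; return the mirrors that failed."""
    failed = []
    for mirror in mirrors:
        try:
            shutil.copy2(image, Path(mirror) / Path(image).name)
        except OSError:
            # Mirror is down or unwritable; the later sync pass picks it up.
            failed.append(mirror)
    return failed


def sync_mirror(master, mirror):
    """Cron-style catch-up: copy any file the mirror is still missing."""
    copied = []
    Path(mirror).mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(master).iterdir()):
        dest = Path(mirror) / src.name
        if src.is_file() and not dest.exists():
            shutil.copy2(src, dest)
            copied.append(src.name)
    return copied
```

The push keeps mirrors current in the normal case, and the periodic sync only has to move whatever a mirror missed while it was unreachable.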
I'm trying to force the load of the actual thread/db management and syncing on the vps, and just let the image hosts be dumb hosts that just serve the images. So I want to leave the task of actually doing all the cpu intensive and actual footwork to the vps. Just make it so that anyone with extra space on a host just needs to either give me an ftp account, or drop a script and make a few directories in order to become a mirror.
I'm also trying to code an archiving system, that takes the last page on the active board, makes a static page, including threads, and moves it to an archive. This will reduce strain on the active board as it rebuilds the thread tree after each new post, and static pages need very very little resources to be hosted.
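The bookkeeping for that could be sketched like this in Python: anything past the last allowed page is handed off as whole pages to be written out as static files. The threads-per-page and page-limit values and both function names are assumptions for the example, not kareha settings.

```python
def paginate(threads, per_page):
    """Split a bump-ordered thread list into index pages."""
    return [threads[i:i + per_page] for i in range(0, len(threads), per_page)]


def archive_overflow(threads, per_page=10, max_pages=6):
    """Return (active_threads, archived_pages) for a bump-ordered thread list."""
    pages = paginate(threads, per_page)
    active = [thread for page in pages[:max_pages] for thread in page]
    archived_pages = pages[max_pages:]  # each entry becomes one static page
    return active, archived_pages
```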
>>31
Well, sounds complicated (i.e. impossible if you don't have enough free time) And I'm a bit fuzzy on "pushing the images". This puts same load on both servers CPU wise during transfer, and double CPU on the host receiving the upload.
Dreamhost stopped overselling 3 months ago. They noticed it seems. Their prices are higher and bandwidth less. I still would ask you to contact chiisai on the site i mentioned in my post, he seems to run a ~1tb site without problems on dreamhost on kareha.
I don't know, but I was in love with your site, and I'm willing to help financially or psychologically, or technically as far as I can.
If you can pull off the load balancing scripts, I'm sure guys running imageboards will fall in love with you.
But that load is only once, during the upload cycle. The main bulk of the CPU usage is in the actual thread sorting/generation and display, not the image thumbnailing or uploading. The image upload only occurs once, and doesn't actually take up that much in resources. It's the thread sorting, rearrangement, and cache creation every time a post is made that takes up the most time, and happens every single time you post. That image is only uploaded and processed once, and that's it.
Why not switch to wakaba?
Kareha's CPU use is far higher (I vehemently disagree with >>2, because it just ain't so). Also, wakaba already has a load balancer, so no need to reinvent a wheel.
Just to nail the point home:
From secchan.net, which has several Kareha installs:
Process      CPU seconds   user      machine   count   average
kareha.pl    2958.5100     88.537%   12.327%   17311   0.171
From wakachan.org, which is all wakaba, yet gets twice as many hits:
wakaba.pl 64.8100 33.006% 0.270% 127 0.510
The numbers speak for themselves. I disagree with WAHa's extending Kareha to do what Wakaba does, rather than the other way around. This is why.
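A quick check of those figures in Python, assuming the "average" column is CPU seconds divided by count (that assumption does reproduce both averages shown). The variable names are just for this example.

```python
# Figures copied from the two log lines above.
kareha = {"cpu_seconds": 2958.51, "count": 17311}
wakaba = {"cpu_seconds": 64.81, "count": 127}


def average(row):
    """CPU seconds per invocation, i.e. the 'average' column."""
    return row["cpu_seconds"] / row["count"]


kareha_avg = round(average(kareha), 3)
wakaba_avg = round(average(wakaba), 3)

# How much more total CPU time, and how many more invocations, kareha.pl had.
total_ratio = kareha["cpu_seconds"] / wakaba["cpu_seconds"]
count_ratio = kareha["count"] / wakaba["count"]
```

Per invocation wakaba.pl is actually the more expensive of the two (about 0.51s against 0.17s). The total is roughly 46 times lower because the script was invoked about 136 times less often.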
Actually, I should specify that I meant that in mode_image, it uses next to no CPU. mode_message invokes the script every time you view a thread, which drives up CPU usage considerably.
One could easily enough change mode_message to link directly to the HTML instead of linking to the script, too.
Then find me a way to safely migrate over 30 pages worth of images in two imageboards from kareha to wakaba. I'm in mode_image too.
Hacking the custom tripcode catches wouldn't be difficult for wakaba, but I already have a huge amount of threads in kareha, and I'd like to stay away from a complete wipe. The loadbalancer coding is kinda fun too anyways, trying to remain as completely drop-in and simple to setup as possible.
It's ok. I just watched the first 3 eps of nanoha strikers, and it seems they don't fit your board anymore. 20+ something... :(
Pfft, they just move to /cw/ now =P
>>40
Man, I hope migrating /moe/ and /cw/ to wakaba is successful. I'd hate to see all those threads I haven't yet had a chance to peruse disappear!
>>47 I'm not migrating to wakaba. I'm currently coding a new imageboard script. After some analysis of kareha, there was just too much that needed to be modified in order to make it run more efficiently.
Friend of mine wrote a script to export all the threads from kareha and strip the html from them, giving me a base to build a new script off of. I'm going to keep the style of kareha, drop in, very little configuration, and nothing other than a webserver needed to run (IE flatfiles, no sql db required).
Gives me some practice and I find it a fun project. I'll implement all the features I want, while keeping it as simple to run and maintain as possible.
So the down time wasn't just me... that's good to know... Hope Fatechan /moe/ and /cw/ can get up and running soon...
Since Fatechan's downtime seems to cause quite a bit of confusion, maybe the 403 message should be replaced with a short explanation?
>>50 Thanks for that suggestion, I'll put up a placeholder thing
>>48
So now we'll get to enjoy yet another new imageboard script, instead of the tried and true wakaba. In the meantime, no /cw/ or /moe/.
Oh well. They're your boards, do as you like... I'd like you to go in a different direction though.
>>48
Why not use wakaba or the like?
I'm just curious because I just did the same thing recently (wrote my own script, that is) and I'd be interested to know if your reasoning is anywhere near mine.
I hate mysql, that simple. And I think it's a real waste of resources to run a full db for something of this nature. This isn't like normal bulletin board software. Content on the boards isn't dynamic at all, once it's posted it doesn't change.
The only time the pages need to update, is when a new post is made, or one is removed. And even when that happens, there's at the most, a change to a thread, and to some of the main page indexes.
There's only 6 different kinds of posts.
1) Post is to an existing thread on the first page, and isn't a sage post, only the thread and the first page need to be rebuilt
2) Post is to an existing thread not on the first page and isn't a sage post, only the thread and the pages from the front back to where that thread was originally located need rebuilt
3) Post is to an existing thread and is a sage post, only the thread and the page the thread is on needs updating
4) Making a new thread, new thread needs to be generated, and all page indexes rebuilt
5) Deleting a post, only the thread and the page the thread is on need adjusted
6) Deleting an entire thread, all page indexes from where the thread was on back need to be rebuilt
The majority of posts occur in the first 3, new threads are uncommon, case 5 is usually an admin action, and case 6 is 99% of the time only done by admins.
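The six cases above can be sketched as one small function that says what has to be regenerated. This is only an illustration of the rules as listed, not code from kareha or from the new script; the function name, the action strings and the 1-based page numbering are all assumptions.

```python
def pages_to_rebuild(action, thread_page=None, total_pages=1, sage=False):
    """Return (rebuild_thread, index_pages) for one board action.

    action: "reply", "new_thread", "delete_post" or "delete_thread".
    thread_page: 1-based index page the thread sat on before the action.
    """
    if action == "new_thread":
        # New thread: generate it and rebuild every index page.
        return True, list(range(1, total_pages + 1))
    if action == "reply" and sage:
        # Sage reply: the thread does not move, so only its own page changes.
        return True, [thread_page]
    if action == "reply":
        # Bumping reply: the thread jumps to the front, so every page from
        # the front back to its old position shifts (just page 1 if it was there).
        return True, list(range(1, thread_page + 1))
    if action == "delete_post":
        # Deleted post: the thread and the page it is on.
        return True, [thread_page]
    if action == "delete_thread":
        # Deleted thread: everything from its page to the end moves up.
        return False, list(range(thread_page, total_pages + 1))
    raise ValueError(f"unknown action: {action}")
```

For example, a bumping reply to a thread on page 4 of a 6-page board gives `(True, [1, 2, 3, 4])`, while the same reply with sage gives `(True, [4])`.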
But even then, new content isn't that common. On traditional boards the most common state is static, and static content is very easy to push around. Through my friend's (hihi pc486 :V) profiling of kareha, the majority of cpu time is being used by i/o operations. And the main source of that i/o is writing the caches.
After going through the code, kareha actually wastes a lot of time rebuilding threads when it doesn't have to. Also not storing individual posts and the shortened thread displays, wastes processing time as it has to completely strip the html from the thread, and then rebuild them as well as make the shortened version again. I think with adding a few additional cached items, the shortened threads for main page display, and the individual posts, and to follow those 6 cases and only do what's necessary, I think a flatfile system can compete with a db-based board in most cases.
Only advantages a db would give is ease of programming, less disk space taken up since there's no need to cache as much data, and the flatfile system will become very i/o bound on heavy traffic boards. I'm aiming for ease of installation and maintenance, and to enable the script to be run nearly anywhere perl and some diskspace is available. Along with being built from the ground up with image loadbalancing capabilities, it'll make it very easy to scale the boards as traffic grows.
If I were programming just for myself, I'd probably just go python or ruby and postgres and not worry about ease of installation or operation. But I'm aiming to expand on the initial concept and utility provided by kareha, and make a capable and more powerful successor, while keeping the ease of use and maintenance with a flatfile system.
> python or ruby and postgres and not worry about ease of installation or operation
Sounds... exactly like what I did.
Yeah, if I were coding solely for me, I'd just use postgres and whatever language I felt like. But I'd like to give back to the community and build a viable and worthy replacement for Kareha, rather than like what's been happening here, tell people to install a db and run Wakaba when they run into Kareha's shortcomings.
I think the design of flatfiles has a LOT of potential, and for most imageboards, would be more than enough to run fine. And I like the drop-in ease of the flatfile design, all people have to do is just unpack, change a few settings, and be done. Redesigning the caching system, along with adding native loadbalancing for the images, will go a long way to allow it to run fine on most size boards.
My goal for my script will be ease of installation and use, allowing anyone who knows how to use ftp and edit a few text files to be able to setup and operate the entire script, including the loadbalancing. The loadbalancer servers will just need a webserver and possibly either an ftp account, or even just a dropin script to upload the images to.
It's funny, I was talking to a friend of mine and he said just the opposite, and suggested the design of Myspace as a counterexample. Everything on there is database-driven -- all the text, images, etc. The rationale is that it takes more time to reparse a flat file and insert the dynamic content than to keep the parts in separate fields in a table and pull them out at page-fetch time, and the load balancing could be done transparently because the database server connections are all happening behind the scenes.
It makes sense for a half a second until you consider that the overhead of running a database server in the first place is probably greater than any of the IO/processing time required to cut a file into several pieces, and sites that actually see enough visitors to make it worthwhile to run two servers per page view are atypical. He pointed out that, since the DB is "in memory" it's quicker to pull a record out of it... but then again, the file inode cache would be in memory as well. So yeah, I'm beginning to doubt the usefulness of running a database at all, besides for statistical purposes and (as you mentioned earlier) to simplify the code -- but if you're trying to write portable code that can run on more than just MySQL, your code is probably not even going to be any simpler, either.
... that sort of rambled on for a while without much of a point, didn't it.
By the way, consider what Trevorchan did. I don't believe "ease of installation" should be one of the key features of an imageboard script, if only to prevent the creation of six thousand useless imageboards built by 16-year-old script kiddies.
>>58
I think she's already discouraged that by not using PHP.
>>57 My thoughts exactly. Very few imageboards see the level of traffic, where the overhead of running a database will be less than being i/o bound by a flatfile design. With proper caching, a flatfile can easily keep up, and in some cases, be faster than a db-based setup. The only major drawbacks are more complex programming, and more disk space used to store cached processing steps.
>>57
Plus the web-side component of an application like that can typically be scaled with far less pain than the big ol' xbox-hueg centralized database. It's just more hardware and that's that, whereas scaling a database means more I/O bandwidth and more CPU sockets.
So it makes sense to stick as much of the required processing in the website layer, as long as this doesn't mean doing SQL joins in the application or other database abuse. Caching, too, is usually best done at the web server assuming that coherency is being taken care of.
All of this is obviously immaterial to stuff that just runs on a web hosting account. Hell, I remember the etherchan wiki (RIP) dying because mediawiki, or that particular mediawiki installation anyway, banged the shared database server too hard. ... though I've got a suspicion that the database administrators had calmly dropped some of the indexes that mediawiki required for proper performance, thus provoking unholy quantities of sequential scans.
Also, holy shit are you people polite. Four entirely bumpworthy replies and all of them are saging...
Yeah, with the futaba-style image/post board, we're not worried about searching, so there is absolutely nothing a db can do that a flatfile just can't realistically.
Once I get a little base of code setup, and a good plan, I'm going to put up an svn server for anyone who wants to donate time and coding effort to it. I really want this to be a viable replacement for kareha, and also address any shortcomings of wakaba, so input from other admins and programmers is very welcome.
Just don't add stickies.
>>61 it's not about being polite or rude, it's about avoiding the ID. For some of us anyhow. Sage is Heaven.
Oddly enough, I've come across a couple of shared hosts where flat-file message boards are explicitly disallowed. Their excuse was they took up too much CPU. It's one of the most absurd rules I've ever encountered.
>>62
You do realize that SQL databases don't provide for searching, right? MySQL's non-ACID MyISAM tabletype supports a half-arsed form of fulltext indexing, but that is next to useless without control over things such as stopwords and the like.
As to things that SQL databases can do that flat files cannot, well... if you don't care about (or don't know) the difference between a sequential scan and an index scan, then perhaps it is better that I don't try to sell these points to you. I expect you'll find out on your own exactly why many developers prefer to simply bang out a short SQL query rather than spend a couple of hours writing the equivalent query out for flat files.
There are versatile full-text search plugins for many common SQL databases. Even SQLite has one, and a pretty decent one at that.
I don't see this as necessary, anyway, since the design of Wakaba-ish boards makes them easily indexable by web search engines. One of my boards was imported from an old YaBB setup, and it currently has about 3500 posts. I added a Google site search box to it, and it's proven every bit as useful (if not more useful) than the overly-complex YaBB search interface with none of the code -- well, except the HTML, of course.
So when're the /cw/ and /moe/ boards coming back? I'm jonesing real bad here.
I think I'll just get a temp loadbalancing hack in there for now, then I'll continue work on the real replacement script without having to rush.
Yeah /moe/ and /cw/ always seemed to have only work-safe material unlike 4chan's (http://www.4chan.org) /c/ (http://zip.4chan.org/c/imgboard.html)... especially after MOOT made the change to /b/... making /b/tards post their stuff... there...
Aw man i think the page will never come back to us.
Well i will be looking for other moe work-safe site
Fatechan! I miss you! I cannot find another site as great as you! I need delicious flat chest!
Look, I have a shit ton of space I'm not using on my dreamhost account right now. Granted, that 500gb is grossly oversold, but if you're interested, the email address is in the link.
i've been using kareha on shared hosting and i probably get as much use of it as you did if not more. i havent had any issues. maybe you need to find some competent hosts.
I've tentatively restarted /moe/ and /cw/ on a vps of mine while I still finish my kareha hacks for image load balancing. I just got to make sure I keep my bandwidth limits in check >>;;
Hmm..
So i just started a brand new imageboard..
What should i do with it?? where should i get traffic?
any ideas?? i want to make it as unique as possible..
thanks
>>77
you've got things a little backwards there.
you should ask those questions before you decide whether to make a new imageboard or not.
>>54
flatfile databases are definitely not doing your cpu usage any favors. If you think running mysql or postgresql (postgres > mysql in most cases) is too heavy for a little operation like yours, I'm pretty sure you can set up wakaba (or pyib) to use sqlite, which should be lighter.
Anyway, take a look at vpslink if you haven't already. They offer tons of customizability, and it's the cheapest vps you'll find anywhere.
> 2007-05-08
Did anything come out of this? He still using Kareha?
I hate the spam on these boards.