I was thinking about making my own image board script, but with PHP/MySQL since I don't know any Perl (which is why I won't just modify Wakaba). There are a few features I wanted to add, like being able to create/delete/manage boards from the admin panel, adding BBcode, and having a list of boards generated dynamically and displayed along the top.
Anyway, I'm trying to figure out the comments database for an imageboard. Just trying to find out what is necessary and what isn't and what I want to add.
Timestamp and Lasthit were the first to get me. I have no idea what those are for. The next thing that confused me was the md5. I didn't have a clue what the purpose of that was. Also, the "random" numbers that are generated for an image look like they have something to do with the timestamp, but I'm not exactly sure what.
I'm somewhat of a newb at PHP but I don't think it would be impossible to script this if I take my time. I've done a little bit of PHP in the past and it's fairly straightforward.
There were a few other things that I was wondering about, for example why mailto: was included in the e-mail field of the database, or why the directory that an image is kept is stored in the database. I thought maybe "mailto" and "src/" could be added by the script itself.
One more thing; thumbnail height and thumbnail width. I can see what they're for but I'm not exactly sure why they're necessary.
Oh, and one more thing. Tripcodes and the secret key. How are tripcodes generated and how is it based on the secret key. I'm not looking for actual code, just concepts.
I should warn you from the start that there are many things you need to be aware of when making any kind of forum software that will trip you up. There are some serious security issues involved. Uploading files is a very unsafe operation, to start out with. And if some of the things you asked about aren't immediately obvious to you, you might not be starting out from a good position.
I was somewhat experienced in writing Perl code when I started out this project, but there were still quite a number of things I fucked up and had to learn the hard way.
That said:
In closing, I'd say a more productive use of your time would be to learn Perl. PHP is a horrible language for serious work. See http://4-ch.net/code/kareha.pl/1120533289/9 for examples.
Also, BBcode is an abomination. Please don't use it. If you have to add explicit markup (and don't like the markdown/WakabaMark style), use HTML for the syntax. BBcode is just an obfuscation of HTML anyway. That's not to say that you should just allow arbitrary HTML - best to add a parser that rejects any tags or attributes not explicitly allowed, or people will be cross-site scripting you in no time.
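To make the "reject anything not explicitly allowed" idea concrete, here is a minimal sketch. It is in Python rather than PHP or Perl, purely as illustration, and it is not how Wakaba does it. The tag list in ALLOWED_TAGS is an assumption, and this version drops every attribute outright, which sidesteps things like javascript: URLs at the cost of not supporting links.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; anything not listed here is shown as literal text.
ALLOWED_TAGS = {"b", "i", "em", "strong", "code"}


class WhitelistFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Re-emit the bare tag; all attributes are thrown away.
            self.out.append(f"<{tag}>")
            self.open_tags.append(tag)
        else:
            # Disallowed tag: escape it so it displays instead of executing.
            self.out.append(escape(self.get_starttag_text()))

    def handle_endtag(self, tag):
        # Only close a tag that is actually the innermost open one.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data))

    def result(self):
        # Close anything the poster left open so it can't bleed into the page.
        closing = "".join(f"</{t}>" for t in reversed(self.open_tags))
        return "".join(self.out) + closing


def filter_html(text):
    parser = WhitelistFilter()
    parser.feed(text)
    parser.close()
    return parser.result()
```

The important design choice is that the default is "escape it". A blacklist of known-bad tags will always miss something; a whitelist fails closed.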
Okay that makes sense. Thanks for the quick reply. And I'm totally aware that my script could very well be a security hazard in its earlier stages.
Not being a hardcore programmer and also (admittedly) not being the sharpest apple in the barrel, I don't see why BBcode is so terrible. Even if it is shit, the truth is a lot of people are used to using it. (And yeah, I totally realize if I allowed all HTML tags I'd be screwed!) I don't even know of any forums these days that allow for real HTML; it's generally just disabled.
And why you hate on PHP? I don't know the advantages of PHP over Perl but as far as I can tell it seems like the majority of newer scripts are written in PHP. Why would this be if it sucked so badly?
How would this be different from Shiichan 4?
I don't know if it would be, but I do remember having some gripes with Shiichan. For one, I never really got it working. It seemed broken on some level, or I had misconfigured it, or I needed to change some arbitrary thing to make it work.
PHP scripts that don't work out of the box kill me inside. Granted, the programmer has no obligation to me, but a broken script makes me never want to use it again.
I don't even know where to download that.
You can download Shiichan v.3956 (final beta) here: http://wakaba.c3.cx/shii/shiichan
> For one, I never really got it working.
If you can't get something like Shiichan working, what are you doing trying to write your own board software?
> I don't know the advantages of PHP over Perl but as far as I can tell it seems like the majority of newer scripts are written in PHP.
Quantity != quality. Popularity != excellence.
The majority of code used to be written in BASIC too, back in the early 80s. Did that make BASIC better than, say, C, Lisp, or Forth?
Perl is a far more powerful language, and faster and more secure to boot. The learning curve is far steeper than PHP's though, which results in the current situation.
I don't like BBcode because it is just as difficult to use and ugly as HTML, while not actually being HTML, so instead of learning one thing you have to learn two. It has no reason for existing. This is idiotic.
Also, BBcode is only really used by phpbb and its clones. Everywhere else, people use either some sort of simplified markup (wikis and the like) or a subset of HTML (Of the sites I use, Slashdot, LiveJournal, Flickr, and various blogs come to mind).
As for why I don't like PHP, did you read that post I linked? PHP is a mess, and it encourages sloppy coding (in that example, it makes it extraordinarily difficult to write properly secure code). It started out as a quick hack to add a little bit of dynamic content to webpages, and for that, I think it was useful. Then it suddenly wanted to be a Real Programming Language, but ended up as just a big mess.
>>7
It's easier to troubleshoot my own bugs than troubleshoot someone else's. I followed the directions for installing shiichan and it was broken.
Secondly, this is just a project of mine. Should I not program something just because it won't be perfect the first time around?
>>8
I like BBcode because it streamlines some things that would take several tags to do in HTML and is usable by pretty much everyone. I've seen BBcode used in phpbb, invision, vbulletin, various blog scripts, deviantart, etc. It's become somewhat of a "standard." I know you hate how much that sounds. People expect to be able to use BBcode and I wouldn't be the one to screw with their heads to make a statement about BBcode versus HTML.
The point is that it doesn't streamline stuff anywhere near enough to be really useful. If you want something properly streamlined, I much prefer using something like what WakabaMark already is: http://wakaba.c3.cx/docs/docs.html#WakabaMark
Also, invision and vbulletin were the phpbb clones I was talking about. I have no idea who cloned who, really, but they're all the same, and horrible.
What message board do you suggest instead, then?
Well, needless to say, I am fond of 0ch-style discussion boards.
> It's easier to troubleshoot my own bugs than troubleshoot someone else's.
It's easier to troubleshoot someone else's bugs than to rewrite the entire thing from scratch, unless you're dealing with a trivial program (which, admittedly, Shiichan is close to being).
> Should I not program something just because it won't be perfect the first time around?
I'm afraid you missed my point. If you wish to do so, by all means, have fun. However, I also think that if you had problems getting Shiichan working, you're biting off far more than you can chew by writing your own version.
Needless to say, I'm not a fan of rewrites, unless the software in question is criminally broken.
Start small, start small...
I find Wakabamark to be highly annoying, it kinda streamlines too much and garbles most of the more-than-one-line AA.
Well, I find AA to be highly annoying. I'd advise you to take it away and-
>>13
I never even spent that much time with Shiichan. I basically followed the installation instructions, ran the script, noticed it was drenched in bugs, and threw it away.
Does anybody even actually use Shiichan for their image board?
world4ch uses it, and I think I was using it until I noticed that nobody posted to it.
world4ch is still running, and I think the only applied patch fixes a bug where appending the banned notice to a post reset its date to 1969. Of course, it does use an older version, but even if the newer one is buggier it's probably less work to fix it than restart on your own.
Not quite the same situation, but http://www.joelonsoftware.com/articles/fog0000000069.html
>>18
You make a good point with the link, but 1000 lines of php script doesn't quite compare to however many lines it takes to write commercial software.
Anyway, this is more of a personal project/test than it is "let's revolutionize imgboards!" I find that I learn a lot of things when it comes to languages when I jump into them (and even take on more than I know) as opposed to starting small (although I have written some lame scripts like counters and guestbooks in the past.)
Well, if you do start this project, you will no doubt learn a lot. I know I did. It might not always be fun to learn, though.
I'm making small amounts of progress. I'm coding the script to work with existing wakaba boards, but only because if I ever decide to use it I don't want to have to start from scratch or rearrange the wakaba databases. Plus wakaba databases are totally adequate.
This is just a display of posts with parent = 0 and sorted by lasthit. As you can see, a lot of things are broken.
Unlike wakaba the pages won't be pre-compiled which I'm guessing will save disk space but quite easily increase server load since the database has to be hit up every time someone views a page (although any PHP forum works this way.) The other drawback is one can't link to specific .htm files but instead those nasty URLs with all the form post/get gobbledygook but I'm not sure anyone was typing in imgboard links by hand anyway.
I'd really recommend generating static HTML wherever possible. The last incarnation of 4chan used dynamic pages for replies, and even with APC their system was pushed to the edge. By comparison, the original IIchan never went over ~2% CPU. You'll also note that the current incarnation of 4chan generates static HTML too.
You have to. It's a whole lot faster to spit out a page straight off disk than a worst-case fork, load, compile, execute cycle. Uses less memory than things like APC or some FastCGI variant too. All you sacrifice are a few hundred kilobytes of HTML on disk.
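A rough sketch of what "generate static HTML" means in practice: the expensive rendering happens once, when someone posts, and every page view after that is just the web server handing out a file. This is Python for illustration only; the function names, the one-file-per-thread layout and the trivial template are all assumptions, not how Wakaba lays things out.

```python
import os
import tempfile
from html import escape


def render_thread(posts):
    """Build the full HTML for one thread from a list of comment strings."""
    items = "\n".join(f"<p>{escape(p)}</p>" for p in posts)
    return f"<html><body>\n{items}\n</body></html>\n"


def rebuild_thread_page(out_dir, thread_id, posts):
    """Rewrite the static page for a thread. Call this only when someone posts."""
    path = os.path.join(out_dir, f"{thread_id}.html")
    # Write to a temp file first, then rename over the old page, so a reader
    # never gets served a half-written file.
    fd, tmp = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(render_thread(posts))
    os.replace(tmp, path)
    return path
```

In a real board the post list would come from the database, and the same posting step would also rebuild the affected index pages.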
You're probably right. I'm currently baffled at why some of the largest forums in existence don't use precompiled html.
Maybe I could make either method optional. Dynamic pages do have the one advantage of making any database change appear with a simple reload as opposed to a cache reload.
The guys in #php of Efnet insist that precompiled HTML is dumb because if you want to change something about a page (such as change the word "Subject" in the post area) the script has to go and rebuild several other pages.
I don't know what kind of server load accessing MySQL databases makes, and I don't know what kind of server load modifying several HTML pages makes, so I'm confused.
A massive forum run with, say, vBulletin, must have several database queries run every time a page is loaded. If you wanted to change the way something is displayed, you'd change the associated template file or modify the database or whatever, and from then on the pages would load differently.
Now with pre-generated HTML, if I changed something minor about the way things are supposed to be displayed, and this is a potentially massive forum, and the script has to go and remake thousands of HTML files, how is the server going to handle this?
Would having always-dynamic pages outweigh the load caused by changing hundreds or thousands of HTML files?
Or do precompiled HTML files only work in certain cases, like in the case of this image board where you will realistically only have but so many HTML files to change (whereas a forum is constantly growing)?
> The guys in #php of Efnet insist that precompiled HTML is dumb because if you want to change something about a page (such as change the word "Subject" in the post area) the script has to go and rebuild several other pages.
You might try and consider how often this happens, compared to how often somebody views a page, and then figure out which one you should optimize for. (Hint: you don't change the word "Subject" several times a second.)
> #php of Efnet insist that precompiled HTML is dumb
That has to be one of the most idiotic programming-related things I have ever heard. I advise you to spend your IRC time elsewhere.
Use dynamic content where appropriate. Use static HTML where appropriate. For an anonymous or 2ch-style board, where content is not customized to each viewer, static is the way to go (the opposite applies for things like vBulletin). You'll notice that even wakaba uses a little dynamic content when the captchas are enabled, but that's because there's good reason to.
Or just reread my comment regarding IIchan and 4chan. This is all backed by practice, not just theory.
To elaborate a bit:
> Or do precompiled HTML files only work in certain cases, like in the case of this image board where you will realistically only have but so many HTML files to change (whereas a forum is constantly growing)?
This is a fairly complicated question, because it's related to implementation. For example, static HTML is always cheaper to send, but it makes posting more expensive. How expensive do you want it to be?
If you have several pages of threads, then HTML. If you never clean threads up (ie, keep them forever like vBulletin), then that post could be really expensive indeed, requiring the main linking pages to all be updated (a few hundred? more?).
But this is also implementation-dependent, because you could design the board so that threads several pages back are segmented and never bumped, in which case older HTML is never touched. Or you could make all the older threads dynamic; it minimizes the number of HTML files to be written, and nobody really looks back at older threads anyway (so it's cheap).
There are all kinds of tricks you can pull to make a site go fast. To be honest, vBulletin's implementation is rather braindead. With CSS, some javascript, and more static HTML where the most-oft read pages are, you could make it go a lot faster.
That's a good idea, but I'm a total newb at this.
I am totally surprised I made it this far. The script can display posts, images, thumbnails, and replies. You can skip forward to other pages or whatever. But you can't post and there are a few other things I need to work out like cutting down long posts. There is no sort of "Admin Panel" yet. I'm still trying to get the script to where wakaba is (at least functionally; I bet my code is a trainwreck but I wouldn't know.)
One of the challenges was, when it came to figure out how to code conceptually, I really couldn't refer to wakaba.pl because I don't know perl, and I really couldn't refer to futallaby because I seriously have no idea where the person who coded that is coming from. I had to figure a lot of stuff out by myself and by referencing compiled pages.
However, considering my experience, I've made it a long way and should be able to finish eventually; maybe by the end of August.
Right now I've just gotten to the point where I need to derive a tripcode from said word and secret key. Wonder how many feet over my head that's going to be!
> futallaby because I seriously have no idea where the person who coded that is coming from
*laughs* No worries. Futallaby is Ugly, with a capital U.
If you're planning on making a 2ch-style board, you'll hopefully stick to a standard tripcode mechanism (we all like our tripcodes to work across boards). This may be helpful: http://wakaba.c3.cx/sup/kareha.pl/1099727823
Old-style trip codes can be bruteforced, though, right?
I was thinking of combining the md5 of the user's inputted tripcode and the secret string, maybe half and half, to make secure tripcodes. It's not standard, though, (duh.)
Or I guess I could program it so it would be up to the admin to decide what method to use.
What do you mean by tripcodes working across boards?
> Old-style trip codes can be bruteforced, though, right?
Yeah, they can, but so can everything else. The only real issue is how fast you can test each prospective key, and the amount of processing (and even time) required by most common ciphers is fairly similar. Ie, we're not looking at an order of magnitude difference, unless you're a cryptanalyst and are actively exploiting a known weakness in a cipher. Of course, this is also ignoring the pigeon-hole principle, but I digress...
Wakaba tries to get around that with optional secure tripcodes, but to be honest, I've always found them a bit silly. In concept they're great, but in order for them to work part of the key must be kept secret by the board. As a result, each board has a different secret key, and secure tripcodes are different across boards. There's also the problem of recognition: it's easier to recognize by sight a common eight-character code over something longer. So the end result is... nobody uses them.
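To illustrate why a secret key makes trips board-specific, here is a minimal sketch in Python. It is not Wakaba's actual secure tripcode algorithm; it swaps in a plain HMAC-SHA256 (rather than the md5 mix suggested earlier in the thread) just to show the concept, and the function name and the 10-character length are assumptions.

```python
import base64
import hashlib
import hmac


def secure_trip(password, board_secret, length=10):
    """Derive a tripcode that depends on both the poster's password
    and a secret string known only to this board."""
    digest = hmac.new(
        board_secret.encode("utf-8"),
        password.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    # Base64 keeps the result printable; truncate to something recognizable.
    return base64.b64encode(digest).decode("ascii")[:length]
```

The same password always gives the same trip on one board, but a board with a different secret produces a completely different trip. Nobody can brute-force it offline without the secret, and nobody can carry their identity across boards either.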
> What do you mean by tripcodes working across boards?
Probably what you thought I meant. There are some corner cases where this isn't true, but generally, if I enter a tripcode in one 2ch-style board, the result will be the same as on another board. 4chan and IIchan use different software, but the tripcodes are identical.
> if I enter a tripcode in one 2ch-style board, the result will be the same as on another board. 4chan and IIchan use different software, but the tripcodes are identical.
I had no idea they could work like that. Nifty.
>>31
Why is it so important to have the same tripcode across boards, I mean? (I know I couldn't care.)
>>33
It's not important, it's just convenient. You can basically maintain the same identity on here, on 2channel, on 4chan, on 4-ch and on many, many Japanese boards, even including such exotic places as Bar Giko.
For example, I'm always going to be anon!21anon4H3U, no matter where I post.
ISN'T THIS TOTALLY AWESOME??!?!
yes lolocaust, we know how you feel about anonymity.
So, what determines if a post is too long?
Also I'm stuck at the point where people make posts. I have the text box, and I can have the text they entered get put in the database, but the text lacks formatting. What means do I use to format the text they enter? I guess codes are the easy part; what I don't know is how to process line breaks.
Yeah, so? I'm new.
You'll have to parse the text and perform substitution. How you substitute is entirely up to you.
One way of doing it is to replace all linefeeds with <br />. There are better ways though; for example, take a look at the source of this page.
You'll need to do this for more than just linefeeds though, otherwise you'll end up with people breaking your board by posting unrestricted HTML.
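A minimal sketch of the simple approach, in Python for illustration (the function name is made up). The order matters: escape the HTML special characters first, then turn linefeeds into <br />, otherwise the escaping would mangle your own tags. Normalizing Windows and old Mac line endings up front is an extra assumption so that only one kind of newline has to be handled.

```python
from html import escape


def format_comment(raw):
    """Turn a raw textarea submission into HTML that is safe to display."""
    # Collapse \r\n and bare \r into \n so there is one newline style.
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Neutralize <, >, & and quotes so posted HTML shows up as plain text.
    text = escape(text)
    # Only now insert our own markup.
    return text.replace("\n", "<br />")
```

Do this once when the post is submitted and store the result, or do it at display time from the raw text; just never print the raw text untouched.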
> So, what determines if a post is too long?
The post abbreviation code is highly non-trivial, since I use some fairly complex HTML code in the posts. If you go with the simple way and just replace newlines with <br/>s, you can just count the number of newlines to figure out where to break a post.
Wait, so what character in a comment gets replaced by <br />?
I'm totally clueless. Also, I don't get how file uploading works in PHP, but I guess this is CGI land so perhaps nobody knows.
Cuz it's like: $comment = $_REQUEST['comment'];
But when I print a comment, there are no line breaks even if there were some in the text area.
I tried browsing around for PHP tutorials on basic discussion boards, but none of them talk about formatting posts. And again, futallaby is like a foreign language. The only thing I found was
$com = str_replace("<br />"," ",$com);
And I was like, what? How does a space = a line break?
> Wait, so what character in a comment gets replaced by <br />?
Usually '\n' (0x0A), although Windows does it differently and puts an additional \r in front of it ('\r\n' = 13, 10 decimal). If you load up a file that uses Unix linefeeds (just the 0x0A) in Windows, it'll look like one huge long line.
BTW, you'd better be careful with a simple \n => <br /> substitution. WAHa was not kidding when he said it's not that simple.
For example, if I wanted to abuse your board, I could just post a really long string without linefeeds. E.g.: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...". I could fill up your board with one megabyte of that. So just counting linefeeds is not enough. OTOH, counting characters leads to a mess where words are cut off mid sentence, and can be abused by posting a long string of linefeeds (you'll end up with a huge mostly-blank page).
You'll have to take a hybrid approach to be safe.
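One possible shape for the hybrid approach, sketched in Python. This is not Wakaba's abbreviation code (which has to deal with real HTML); it works on the plain text before formatting, and the two limits are made-up numbers to tune.

```python
MAX_LINES = 15    # assumed limit on displayed lines
MAX_CHARS = 1000  # assumed limit on displayed characters


def abbreviate(comment, max_lines=MAX_LINES, max_chars=MAX_CHARS):
    """Return (shown_text, was_cut) for a plain-text comment."""
    lines = comment.split("\n")
    cut = False
    # First cap the number of lines: stops the wall-of-linefeeds abuse.
    if len(lines) > max_lines:
        lines = lines[:max_lines]
        cut = True
    text = "\n".join(lines)
    # Then cap the characters: stops the "aaaaaaaa..." abuse.
    if len(text) > max_chars:
        text = text[:max_chars]
        # Back up to the last whitespace so a word isn't chopped in half,
        # unless there is no whitespace at all.
        space = max(text.rfind(" "), text.rfind("\n"))
        if space > 0:
            text = text[:space]
        cut = True
    return text, cut
```

When was_cut is true, the board would show the shortened text plus a "click to view the full post" link. Note that the character cut always backs up to the previous whitespace, so it may drop one whole word more than strictly needed.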
> "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa..."
Sometimes I lose all hope for humanity...
>the ghost of thread 1120952093
lol wut?
The ghost of past threads!
Charles Dickens co-wrote Kareha!
This board stinks. And you said this is better than the "old-style" boards, geeze.
Quick, look at the log files! Maybe we'll get a referring forum to trollenlighten!
Hahah, it's some poor schmuck who stumbled in from some Mac downloads site. You can tell he knows what he's talking about because he stumbled onto the support board and posted in the first thread he could find.
What's going on here.
Yeah, I just recently continued work on this imageboard project of mine. It took a while to get my head back in the code, but now it's going rather smoothly. I'm worried, though, that the code is actually a trainwreck, or is going to cause a server load that I could severely reduce if I knew what I was doing a little better.
I put in some silly things like templates for every aspect of the board (posts, replies, etc.) and made the code easier to read because my intention is to have some better-qualified people fix it up once it works.
Right now I'm wondering about the JavaScript involved when you click on a post number and have it appear in the comments box.
I hope that wasn't from me linking it in http://forums.bungie.org/story/. (look, an independently invented minimalist board script!)
This is pretty simple; it's something like this:
function quote(b,a) {
  var v=eval("document."+a+".mesg");
  v.value+=">>"+b+"\n";
}
...
<a href='javascript:quote(123,"post1131276165");'>123</a>
The eval() is just to get the post box you're adding text to. I believe JavaScript is capable of doing more advanced stuff than just that (for instance, creating >>a-b links if you click more than one post number), and I've been meaning to look into it, but haven't done anything about it yet.
That's one way to do it, but it's pretty primitive. You can do better, and insert the >> link at the current cursor position, at least in Firefox and IE (using different code for each, of course).
Just look at the relevant function in http://wakaba.c3.cx/sup/kareha.js.
Oh okay. Yeah, wakaba's was much more suited for an imageboard since every thread doesn't have its own reply box. Thanks for that. You're okay with me using it, right? Not renaming it or anything.
All the Wakaba and Kareha code is public domain, you can use it for anything.
Win.
I will use Wakaba and Kareha to take over the planet.
I'm stuck at the part where I have to format comments.
PHP has a function/method (whatever you call them in PHP) that formats comments called nlbr() but all that does is insert <br /> at line breaks.
Of course wakaba's comment formatting is more advanced and uses <p> tags, which is more efficient, partly because one has the option of using CSS with <p> tags, unlike <br /> tags, which wouldn't allow me to, say, change the background.
Considering the differences between PHP and wakaba I obviously can't ask for a code snippet, but general theory is always helpful.
Well, the WakabaMark engine I use for that does all kinds of things, including formatting nested lists, and it's not very easy to follow. The doc wiki has an implementation of it in Javascript that is probably more easily ported to PHP, but that is still a significant bit of work.
The general idea I'm using is that if I see two lines of text following each other, I put in a <br/>, and if I see a block of text followed by a number of empty lines, I surround the block with <p></p> tags. If you just want to implement that, it's pretty much up to you to figure out how to do it best, though.
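The general idea above can be sketched roughly like this. It is Python for illustration and covers only the <p>/<br /> part, none of the list or quote handling that the real WakabaMark engine does; the function name is hypothetical.

```python
import re
from html import escape


def format_paragraphs(raw):
    """Blocks separated by empty lines become <p>...</p>;
    consecutive lines inside a block are joined with <br />."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n").strip("\n")
    # One or more empty (or whitespace-only) lines separate two blocks.
    blocks = re.split(r"\n\s*\n", text)
    out = []
    for block in blocks:
        lines = [escape(line) for line in block.split("\n") if line.strip()]
        if lines:
            out.append("<p>" + "<br />".join(lines) + "</p>")
    return "".join(out)
```

For example, the input "a\nb\n\n\nc" comes out as two paragraphs, the first containing a line break. Any run of empty lines collapses into a single paragraph boundary, which also takes care of people posting fifty blank lines.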
> PHP has a function/method (whatever you call them in PHP) that formats comments called nlbr() but all that does is insert <br /> at line breaks.
nl2br() also changes whether it inserts "<br>" or "<br />" depending on the version of PHP used (and not depending on anything else), so like many other things in PHP it's pretty much useless.
Why not make a cache? Almost every CMS has a cache, and you can select the cache time, of course.
man give this guy the Prize for seeing the Future he hit the nail ON the HEAD
and it was not yesterday that he put this post up
the seer
I am >>65's fin funnel.
I am terribly sorry that my pilot has created such a thread as this one.
He really is a good person, but he's been under a lot of stress lately.
Lately, he has been screaming "THIS SENSATION, IS IT CHAR!?" every time he sees something red,
And mumbling "please lead me" every time he sees a young girl.
Just the other day, he ordered me to "shoot down the biggest dumbass in this region."
I wanted to reply, "that would be you,"
But knowing that >>65 was even more tired than I was of fighting, I couldn't say anything.
This prolonged war is to blame for everything.
It's not >>65's fault.
I posted today because I just wanted to ask for everybody's understanding in this matter.
I am terribly sorry that a mere fin funnel has made such excuses.
Please, continue to treat >>65 well.
I double clicked on the screen saver (LotsaWater 1.3) to install it, but I got this instead.
Wrong thread, here's the right one: http://wakaba.c3.cx/sup/kareha.pl/1118971388/
Anyway, I've gotten this once or twice myself, and I don't quite know what would cause it, but I think quitting System Preferences and doubleclicking the saver again fixed it. Try that.
I'm wondering why my thread keeps getting bumped by totally unrelated things.
Hi, you may remember me from such failed projects as PHP imageboard. The project is still running strong, but it's no longer PHP. Bet you can't guess what I'm coding it in, now. (Hint: you wish it were perl but it's not even.)
Random question: why does lasthit have to have the same value for original posts and all replies? Wouldn't just the original post cut it?
The reason is sort of hackish - it's so that I can just select all posts and sort by lasthit and thread number to get the post pre-sorted by thread and in the right order.
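Here is a small sketch of what that buys you, using Python and SQLite for illustration. The table and the num column are assumptions; parent and lasthit are the columns mentioned earlier in the thread, with parent = 0 marking the first post of a thread. Because every reply carries the same lasthit as its thread, a single ORDER BY returns whole threads in bump order with their replies already grouped underneath.

```python
import sqlite3


def posts_in_display_order(conn):
    """Return post numbers sorted by thread bump order, then by thread,
    then by post number within the thread."""
    rows = conn.execute(
        "SELECT num FROM comments "
        "ORDER BY lasthit DESC, "
        "CASE parent WHEN 0 THEN num ELSE parent END ASC, "
        "num ASC"
    )
    return [row[0] for row in rows]


# Tiny demo: thread 1 (posts 1, 3, 5) was bumped more recently than
# thread 2 (posts 2, 4).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (num INTEGER PRIMARY KEY, parent INTEGER, lasthit INTEGER)"
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(1, 0, 300), (2, 0, 200), (3, 1, 300), (4, 2, 200), (5, 1, 300)],
)
```

If only the first post stored lasthit, you would need a join or a second query per thread to get the replies in the right place. The cost of the hack is that a new reply has to update lasthit on every row of the thread.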
Don't know what you mean.
Oh yeah, another question. Why do you choose to use the TEXT datatype over varchar, int, etc?
Oh and one more, do tripcodes have a finite length?
I like how my tripcodes don't make sense. I basically went off of what you said in an older thread; crypt() while using the 2nd/3rd characters as a salt.
That gets me the desired tripcode... Kind of. I actually end up with 3 extra characters at the beginning of the tripcode that shouldn't be there (ideally), I have to shave them off.
I tested my results against results generated by http://hotaru.freelinuxhost.com/trip.htm, and yeah, my results are identical.
I guess it's not a huge deal; just wondering if there's a more efficient method (despite the fact that the function is currently only 4 lines of code, anyway).
By the way, I'm working with Ruby.
That's essentially how it's done, yes. There are details, such as how you handle short tripcodes (fewer than three characters), empty tripcodes, and how you handle characters outside the 7-bit ASCII range, and special characters like '. 0ch and Futaba don't even agree on all of those, so it's not really possible to get it perfect, though. Try reading through some of the tripcode threads to learn more about these issues.
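As an illustration of the "details" part, here is a sketch of just the salt step, in Python. The padding with "H.." for short inputs and the character mapping below are one convention I believe is common, but treat both as assumptions, since (as noted) boards don't all agree. The function name is made up, and the actual crypt() call is left to whatever your language provides.

```python
# Hypothetical mapping: punctuation that is awkward in a salt is swapped
# for letters. Both strings are 13 characters long.
SALT_FIX = str.maketrans(":;<=>?@[\\]^_`", "ABCDEFGabcdef")


def tripcode_salt(password):
    """Derive the two-character crypt() salt from a tripcode password."""
    # Take the 2nd and 3rd characters; the padding covers short or empty input.
    salt = (password + "H..")[1:3]
    # Anything outside the '.'..'z' range becomes '.'.
    salt = "".join(c if "." <= c <= "z" else "." for c in salt)
    return salt.translate(SALT_FIX)
```

With a salt like this, the trip would be the last 10 characters of crypt(password, salt), which matches the "shave off the extra characters at the beginning" observation above.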
Oh, okay. (Also, shit, I wish there was a way to get it perfect.)
You had said earlier that you realized the flaw with tripcodes is that the human brain typically cannot remember the random string of characters that is a tripcode (IMO this is true only to an extent; while I may not remember my tripcode string character-for-character, I can still easily recognize it without even reading it) and therefore tripcodes can be imitated when not brute-forced.
Did you ever come up with a solution to this?
>>83 It's a non-issue right now - if you have the ability to come close enough to the correct trip, the same method will locate the correct tripcode.
Bump.
What is the wizardry involved in filling an empty password field in the postarea? I think it's cool and highly effective considering 99% of us don't care to type a password in but still want to be able to delete images just in case.
ps my app is about 50% done but lately I'm making a lot of progress.
A snippet of Javascript that does the following:
The server-side script, on posting, grabs the password and sends it back as a cookie.
The actual details can be found in wakaba.js.
Also, you can obviously do all of this server-side instead if you're using dynamic pages, but as usual I recommend against dynamic pages.
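For the server-side half, a minimal sketch in Python using the standard http.cookies module. The cookie name "password", the 14-day lifetime and the function names are assumptions. Making up a random password when no cookie exists yet is likewise my guess at how the empty field gets filled, not something confirmed in this thread.

```python
import secrets
from http.cookies import SimpleCookie


def password_cookie_header(password, days=14):
    """Build the Set-Cookie header that sends the password back after posting."""
    cookie = SimpleCookie()
    cookie["password"] = password
    cookie["password"]["path"] = "/"
    cookie["password"]["max-age"] = days * 24 * 60 * 60
    return cookie.output(header="Set-Cookie:")


def password_from_cookie(cookie_header):
    """Pick the value to pre-fill the password field with."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "password" in cookie:
        return cookie["password"].value
    # No cookie yet: invent a password so the poster can still delete later.
    return secrets.token_hex(4)
```

With static pages the pre-filling has to happen in JavaScript, since the same HTML file is served to everyone; the server only gets involved at posting time, when it sets the cookie.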
Marked for deletion (old.)
I tried to find a reference for this string's constant "S_OLD" in the Wakaba code and didn't find it. No longer in use? Or am I missing something? But most of all, how does a thread get marked for deletion? Since there are many deletion methods I imagine trying to write the logic for it will be very tedious.
Has this thread gone on for too long?
Never implemented, for the exact reason you describe. No use wasting slow, slow SQL cycles on that.
K. I figured.
In my opinion, TRIM_METHOD should be 1 by default.
red letter day coming soon
I'm trying to customize wakaba by viewing the wakaba.html file and then making changes to the .pl file.
Why is the formatting of the HTML a mess - there are no line feeds?
The .pl file creates the index file - what part of the Perl script can I edit to get it to add in CRLFs?
Who needs line feeds? Read the HTML in a DOM viewer, not a text editor.
Why are you trying to read index.html? There should be no reason to do that.