PyIB Thoughts (7)

1 Name: Anonymous : 2010-03-16 02:24 ID:aHrhaBXE [Del]

I just tried PyIB, and I saw it was pretty fast but still lacking features. I saw the code and it looks decent except for the lack of commenting.

I'd just like to know what are your experiences with it (and if you have any hopes up for it).

I'm especially interested in the security.

Thanks.

2 Name: Anonymous : 2010-03-17 17:31 ID:Heaven [Del]

All right, I'm up for a bit of a code review... let's see what a casual look through the source code reveals.

database.py
  • This comment worries me slightly:
    # Although SQLAlchemy is optional, it is highly recommended
    This suggests that all the SQL code has to be duplicated in order to handle both cases, and the Settings._.USING_SQLALCHEMY is a gigantic red-flag, because that suggests that any duplication probably isn't being done with a subclass, it's a bunch of hacked-together conditionals. But maybe this is a low level thing that's patched up in some other file. Moving on.
fcgi.py
  • Nothing interesting here, this is 3rd party code. Might be worthwhile to get the original version and diff the two, to check for any "curious" changes, but I'm not going to do that here.
formatting.py
  • For starters, the tripcode algorithm uses try: / except:, with no exception class, and a pass on the except clause. Stylistically this is horrible, not to mention, why would there be an error with string encoding? And why is it ok to just ignore it? A comment answering these questions clearly would be nice; avoiding such constructs would be better.
  • That aside, I see that the post_ variables in nameBlock() are taken as-is and dumped into html-formatted strings. Hopefully they are being escaped elsewhere when that function is called.
  • clickableURLs is a damn mess of a function.
  • matchCrossThreadRefLinks() has this gem:
    parentid = FetchOne("SELECT `parentid` FROM `posts` WHERE `id` = '%s' LIMIT 1" % postid)["parentid"]
    Jesus H. Christ! Is that raw SQL that doesn't escape its parameter? What the hell is this, PHPython? That function had better be receiving a properly sanitized SQL string, that's for fucking sure... ok, it looks like its parameter is coming from a regex match against [0-9]+, so it "should" be fine, but the question of what that function receives simply shouldn't ever need to be asked. It's playing with fire.
  • onlyAllowedHTML() is amazing. Never mind that it allows only nonsemantic <b>/<i><u>/<strike>, it's using regexes and not bothering to check for proper nesting.
  • markdown() is apparently not bothering to sanitize the html output. The markdown spec allows for raw html, so it's way too easy to write a <script> tag that does all sorts of bad things. Keep that in mind, though; who knows, maybe the function that calls that is doing the leg work.
framework.py
  • More raw SQL-strings. Some of them are at least run through _mysql.escape_string, but what if we're not using MySQL? (Whatever happened to SQLAlchemy, by the way? All of this looks to be hard-coded for MySQL. Frickin' ugh.)
  • I see usage of the pickle module .... stored inside the database! What the hell?! This might be cute but to me, it's just got Bad Idea written all over it. I never use pickle unless I'm making a one-off hack, it's insecure; and speaking from experience, it can cause subtle bugs should you ever decide to change the data structure. There's already a perfectly good SQL database at hand; use it!
  • IP addresses are stored as text? Not really a security issue, but it's kind of WTF-ish.
  • More WTF code: int(time.mktime(t.timetuple())) - hey, why not run it through ctime() and strftime(), too!
    There are several other bits like that, like the str(round(..., 2)) a few lines down from that.
  • The md5 module (line 144) is hella deprecated; that function needs a rewrite.
img.py
  • Would be nice to use os.path.join / posixpath.join as appropriate here. No problem as is though.
  • I see os.system() with interpolated parameters. This really really should be rewritten to use subprocess.call(); a space or something in the board's pathname will cause all hell to break loose here.
  • more raw SQL here... hardly worth mentioning anymore
manage.py
  • Passwords are md5-encoded in the database (good) with no salt (bad). So if someone manages to get a dump of the DB it would be fairly easy to crack them. (This is a minor issue; if someone gets the dump of your DB then you have bigger problems anyway)
  • What the hell? The manager password is stored plaintext in the login cookie?! Combine that with an XSS exploit that posts the cookie to some remote site, and you're suddenly broadcasting your full login details. Nice, considering that we might have one with markdown (formatting.py, above).
  • This is an unnecessarily huge function, that's almost entirely encapsulated in the else-clause of a giant if statement. Badly in need of refactoring.
  • Not going to delve into potential security issues after login is validated, if that stuff's fucked up, the board admin's the one who broke it anyway :)
markdown.py
  • More 3rd party code, although apparently this is not the markdown version I'm familiar with.
post.py
  • This is doing one hell of a lot of SQL queries for just rebuilding some HTML files. Chances are pretty good that the majority of the table will need to be read anyway, might as well pull it all and save a lot of round trips between the app and the DB.
  • This uses threads and job queues. What the fuck? Is this really necessary?
pyib.py
  • Question: Why do so many parts of this file call a function in another file, and raise an exception on failed condition? Why not just raise the exception in the other file?
  • Ok, so markdown is apparently a board related setting, and it always runs that nonsensical html-stripping code even when markdown is enabled. Fucking insane. Guess that clears up the issue of embedding script tags (but it's still badly coded, there's no getting around that fact).
  • Another question: how large is the message field in the database? The quote-links could potentially expand the post far beyond the 8000 characters that it's initially trimmed to (line 148); if there's a database-enforced limit there is definite potential for tags to be truncated -- which opens up some XSS possibilities.
  • lol psyco. I bet that saves a whole millisecond of processing time.
settings.py
  • Way to put some of your own settings in there for distribution, Trevor.
template.py
  • Nothing interesting here
tenjin.py
  • Another 3rd party module. Tenjin looks disgusting, btw. Figures a PHP programmer would pick such an abomination.

3 Name: Anonymous : 2010-03-17 17:47 ID:Heaven [Del]

So it looks like SQLAlchemy isn't actually used anywhere, despite that setting. In all, there don't appear to be any "real" exploits, at least not ones that were immediately apparent. Just be careful with the admin password, although I imagine losing it wouldn't be too much of a loss anyway - what would people do, delete posts? Big deal there.

The most significant issue I see is that it's written by someone who obviously is a PHP programmer. It's a decent start but it needs cleaning up. I think switching to SQLAlchemy (and not in conjunction with the mess that it has now) and getting rid of the FetchOne() junk would be a very good first step toward making it a nice clean codebase. Splitting everything into proper MVC instead of generating HTML inside the post function and other haphazard and sloppy coding would also be highly beneficial. A lot of the points I listed are little stylistic issues; having worked with Trevorchan's source code, this is an enormous improvement. I don't think I could find ten consecutive lines in Trevorchan that didn't have some major problem.

4 Name: Anonymous : 2010-03-18 01:39 ID:Heaven [Del]

>>2
>>3
Thank you so much and sorry for taking your time, your comments are extremely helpful. The raw SQL issue was highly noticeable and really ugly, and I also noticed the admin password cookie yesterday; you also helped me to find a lot of issues I didn't notice at first sight.

I'll take your advice and try to clean up what I can. Thank you.

5 Name: Anonymous : 2010-05-03 16:20 ID:3aG5rbl5 [Del]

>>2

>>> Figures a PHP programmer would pick such an abomination.

Wow, you are such a dick. You go WAY beyond construction criticism.

6 Name: Anonymous : 2010-05-05 21:36 ID:Heaven [Del]

>>5
I go that extra mile, just for you.

7 Name: Coach Factory Outlet : 2012-08-14 03:45 ID:cWWI2Oq/ [Del]

Name: Link:
Leave these fields empty (spam trap):
More options...
Verification: