[Java] Defeating CAPTCHA Images

Disclaimer: Depending upon the country you currently reside in, programmatically defeating CAPTCHA images may technically be illegal. Whether or not there is any merit behind such a law I leave as a matter for you to work out with your representatives or equivalent lawmaking body. But suffice to say, the information in this post is intended for educational and informative purposes only, and should not be used in any other context. It should also be noted that the CAPTCHA images that are used in this example are quite old, and were cracked by others long ago.

I’ve always been mildly amused at the continually growing use of CAPTCHA images, or more accurately, at their ever-increasing complexity. It seems that the only truly effective CAPTCHAs are the ones that even human beings can barely decipher. But more interesting to me is the fact that these distorted snippets of letters and numbers have become a sort of de facto Turing test. If you can determine what the characters are, then you are human; otherwise you are not. For whatever reason, these images have become a symbolic line in the sand separating man from machine, and by exploring ways to cross this line we may move ever so slightly closer towards the creation of true artificial intelligence.

So let’s examine a very basic CAPTCHA image, one that was used in a popular online-forum distribution before it was cracked long ago:

PHPBB2 CAPTCHA

This CAPTCHA works on the principle of contrast. Human beings can discern distinct regions in an otherwise noisy image so long as each distinct region meets some minimum contrast level above/below that of the background noise. This kind of image can be difficult to decipher computationally, because pulling out coherent regions from amongst the background noise requires contextual understanding of large portions of the image at once, which is generally a difficult thing to accomplish programmatically. That isn’t to say it can’t be done, however.

A human being looking at this image is able to recognize that there is some threshold created by the background noise that has been introduced, above which an element is part of the encoded data, and below which an element is simply part of the background noise and should be discarded. Once that is done, discerning the text becomes a simple matter of discarding everything below the noise threshold, and keeping everything above. So let’s see if we can code it. First, we need a way to determine the noise threshold:

        //init, determine the average color intensity of the image
        int average = 0;
        for (int row = 0; row < image.getHeight(); row++) {
                for (int column = 0; column < image.getWidth(); column++) {
                        int color = image.getRGB(column, row) & 0x000000FF;  //only need the last 8 bits
                        average += color;
                }
        }
        average /= image.getWidth() * image.getHeight();

This bit of code determines the average color intensity of the entire image (216 / 255 in this case). Because this CAPTCHA is in grayscale it only needs to look at a single component of the pixel color, but colorized CAPTCHA images could be processed in a similar fashion by computing the intensity using the full RGB value. In any case, now we have a basic threshold that we can use for determining which parts of the image contain valuable data, and which parts contain only noise. We can do that like so:

        //first pass, mark all pixels as WHITE or BLACK
        for (int row = 0; row < image.getHeight(); row++) {
                for (int column = 0; column < image.getWidth(); column++) {
                        int color = image.getRGB(column, row) & 0x000000FF;  //only need the last 8 bits
                        if (color <= average * .70 ) {
                                image.setRGB(column, row, BLACK);
                                darkRegion = true;
                        }
                        else if (color < .85 * average && darkRegion && row < image.getHeight() - 1 
                                && (image.getRGB(column, row + 1) & 0x000000FF) < .85 * average) {
                                image.setRGB(column, row, BLACK);
                        }
                        else if (color < .85 * average && ! darkRegion && row < image.getHeight() - 1 && column > 0 
                                && column < image.getWidth() - 1 
                                &&  (((image.getRGB(column, row + 1) & 0x000000FF) < color) 
                                        || ((image.getRGB(column + 1, row) & 0x000000FF) < color) 
                                        || ((image.getRGB(column - 1, row) & 0x000000FF) < color))) {
                                image.setRGB(column, row, BLACK);
                                darkRegion = true;
                        }
                        else {
                                image.setRGB(column, row, WHITE);
                                darkRegion = false;
                        }
                }
        }

Note that this code assumes that darker pixels are part of the data and lighter pixels are part of the background noise, because that is how the input CAPTCHA is set up. A smarter approach would be to look at the number of pixels falling above the noise threshold and the number of pixels falling below, and then keep whichever group is smaller. For a CAPTCHA like this one to be effective, there must be more noise than data, so it follows that the data that you’re looking for will always be in the smaller group of pixels.
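The two refinements mentioned so far (computing intensity from the full RGB value, and keeping whichever group of pixels is smaller) could be sketched roughly as follows. This is only an illustration and is not part of the original source: the class and method names are made up, and the BT.601 luma weights are one common choice rather than anything this CAPTCHA requires.

```java
/** Hypothetical helpers; none of these names appear in the original source. */
final class ThresholdHelpers {

    /** Intensity (0-255) of a packed RGB pixel, using the common BT.601 luma weights. */
    static int luminance(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    }

    /** Mean intensity of an array of packed RGB pixels. */
    static int average(int[] pixels) {
        if (pixels.length == 0) {
            return 0;
        }
        long sum = 0;  // long so that very large images cannot overflow the sum
        for (int p : pixels) {
            sum += luminance(p);
        }
        return (int) (sum / pixels.length);
    }

    /** True if the pixels darker than the threshold form the smaller group, i.e. the data is dark. */
    static boolean dataIsDark(int[] pixels, int threshold) {
        int darker = 0;
        for (int p : pixels) {
            if (luminance(p) < threshold) {
                darker++;
            }
        }
        return darker <= pixels.length - darker;
    }
}
```

The pixel array can be pulled out of a BufferedImage in one call with image.getRGB(0, 0, width, height, null, 0, width). If dataIsDark() came back false, the comparisons in the first pass would need to be flipped.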

In any case, what the above code does is traverse the image, and turn any pixels that appear to be noise white, and any pixels that appear to be data black. Note that it includes some rudimentary region-detection code, owing to the fact that we expect our data pixels to be tightly clustered together in distinct regions. So when the code encounters a pixel that it considers to be part of the data, it also lowers the selection criteria for the next pixel because there is a strong possibility that the next pixel will also be data. This helps prevent false-negatives from erroneously dropping out valuable pieces of data. Let’s take a peek at what our CAPTCHA image looks like at this point:

PHPBB2 CAPTCHA, after first pass

It’s not perfect, but it is definitely improved. We have successfully removed all of the background noise from the image, but unfortunately we have also removed some pieces of the actual data. The data that is left is all in the right place, however, so perhaps we can amplify and/or reconstruct it:

                //second pass, eliminate horizontal gaps
                for (int row = 0; row < image.getHeight(); row++) {
                        for (int column = 0; column < image.getWidth(); column++) {
                                int color = image.getRGB(column, row) & 0x000000FF;  //only need the last 8 bits
                                if (color == 255) {
                                        consecutiveWhite++;
                                }
                                else {
                                        if (consecutiveWhite < 3 && column > consecutiveWhite) {  
                                                for (int col = column - consecutiveWhite; col < column; col++) {
                                                        image.setRGB(col, row, BLACK);
                                                }
                                        }
                                        consecutiveWhite = 0;
                                }
                        }
                }
                consecutiveWhite = 0;
                
                //third pass, eliminate vertical gaps
                for (int column = 0; column < image.getWidth(); column++) {
                        for (int row = 0; row < image.getHeight(); row++) {
                                int color = image.getRGB(column, row) & 0x000000FF;  //only need the last 8 bits
                                if (color == 255) {
                                        consecutiveWhite++;
                                }
                                else {
                                        if (consecutiveWhite < 2 && row > consecutiveWhite) {
                                                for (int r = row - consecutiveWhite; r < row; r++) {
                                                        image.setRGB(column, r, BLACK);
                                                }
                                        }
                                        consecutiveWhite = 0;
                                }
                        }
                }

This code fills in any small vertical and horizontal runs of white pixels with black pixels, the rationale being that any small group of white pixels that is surrounded on either end by black pixels is virtually guaranteed to be part of the data that was erroneously discarded. Again we can take a peek at our result:

PHPBB2 CAPTCHA, after third pass
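As an aside, the gap-filling idea from the second and third passes can be isolated into a small function that works on a single row (or column) at a time. The sketch below is my own restatement and not the code used to produce the image above: it models a row as a boolean array (true meaning black), and the names fillShortGaps and maxGap are invented for the example.

```java
/** Hypothetical single-row version of the gap-filling passes. */
final class GapFill {

    /** Fills runs of at most maxGap white cells that have a black cell on both sides. */
    static boolean[] fillShortGaps(boolean[] row, int maxGap) {
        boolean[] out = row.clone();
        int runStart = -1;          // index where the current white run began, -1 if none
        boolean blackSeen = false;  // has a black cell appeared earlier on this row?
        for (int i = 0; i < row.length; i++) {
            if (!row[i]) {
                if (runStart < 0) {
                    runStart = i;
                }
            } else {
                // a white run just ended; fill it only if it is short and enclosed
                if (runStart >= 0 && blackSeen && i - runStart <= maxGap) {
                    for (int j = runStart; j < i; j++) {
                        out[j] = true;
                    }
                }
                runStart = -1;
                blackSeen = true;
            }
        }
        return out;
    }
}
```

A maxGap of 2 corresponds to the horizontal pass (consecutiveWhite < 3) and a maxGap of 1 to the vertical pass (consecutiveWhite < 2). White runs that touch either end of the row are left alone, since they are not enclosed by data.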

Getting better, but we’re not quite there yet. Our characters are much more distinct, but there is still some missing data. A fair bit of the missing data is now contained in small regions of white pixels that are actually encapsulated within our characters. Filling them in is a relatively simple matter:

                //fourth pass, attempt to fill regions
                for (int row = 0; row < image.getHeight(); row++) {
                        for (int column = 0; column < image.getWidth(); column++) {
                                if (image.getRGB(column, row) == WHITE) {
                                        int height = countVerticalWhite(image, column, row);
                                        int width = countHorizontalWhite(image, column, row);
                                        int area = width * height;
                                        if ((area <= 12) || (width == 1) || (height == 1)){
                                                image.setRGB(column, row, BLACK);
                                        }
                                }
                        }
                }
                
                //fifth pass repeats the fourth
                for (int row = 0; row < image.getHeight(); row++) {
                        for (int column = 0; column < image.getWidth(); column++) {
                                if (image.getRGB(column, row) == WHITE) {
                                        int height = countVerticalWhite(image, column, row);
                                        int width = countHorizontalWhite(image, column, row);
                                        int area = width * height;
                                        if ((area <= 12) || (width == 1) || (height == 1)){
                                                image.setRGB(column, row, BLACK);
                                        }
                                }
                        }
                }

Here we check, for each white pixel, how many adjacent white pixels exist both vertically and horizontally. This gives us a rough estimate of the size of the current region of white pixels. If the size is too small, then the code assumes that the white pixel is actually supposed to be part of the data, and turns it black. Note that the algorithm is methodical in its approach, in that when it detects a small region of white pixels, it toggles only the initial pixel that it tested in that region. This toggling will reduce the region-size reported for any adjacent white pixels, increasing the likelihood that they will be toggled as well on the next iteration, which is why two passes of the same algorithm are applied. And yes, I know having the same code repeated twice is poor coding style, but for illustrative purposes it gets the job done. Anyways, we now have:

PHPBB2 CAPTCHA, after fifth pass
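The countVerticalWhite() and countHorizontalWhite() helpers are among the utility functions left out of the listings. One plausible reading, and it is only an assumption on my part, is that each returns the length of the unbroken run of white pixels passing through the given pixel, or zero if that pixel is not white. That interpretation also fits the later character-finding code, which calls countVerticalWhite() from row 0 to find the first black pixel in a column.

```java
import java.awt.image.BufferedImage;

/** Assumed versions of the helpers; the originals are not shown in the post. */
final class WhiteRuns {

    static boolean isWhite(BufferedImage img, int x, int y) {
        return (img.getRGB(x, y) & 0x00FFFFFF) == 0x00FFFFFF;  // ignore the alpha byte
    }

    /** Length of the vertical white run containing (column, row); 0 if that pixel is not white. */
    static int countVerticalWhite(BufferedImage img, int column, int row) {
        if (!isWhite(img, column, row)) {
            return 0;
        }
        int top = row;
        int bottom = row;
        while (top > 0 && isWhite(img, column, top - 1)) {
            top--;
        }
        while (bottom < img.getHeight() - 1 && isWhite(img, column, bottom + 1)) {
            bottom++;
        }
        return bottom - top + 1;
    }

    /** Length of the horizontal white run containing (column, row); 0 if that pixel is not white. */
    static int countHorizontalWhite(BufferedImage img, int column, int row) {
        if (!isWhite(img, column, row)) {
            return 0;
        }
        int left = column;
        int right = column;
        while (left > 0 && isWhite(img, left - 1, row)) {
            left--;
        }
        while (right < img.getWidth() - 1 && isWhite(img, right + 1, row)) {
            right++;
        }
        return right - left + 1;
    }
}
```

With these definitions, width * height in the fourth pass is the area of the bounding cross around a pixel rather than a true region size, which is consistent with the post describing it as a rough estimate.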

Many of the gaps are now filled in, and the text is starting to look fairly legible. There are now, however, a few spurious black pixels that have cropped up along the edges of the characters. We could go back and refine the previous step, but instead let’s just prune out these outliers:

                //sixth pass, clear any false-positive
                for (int row = 0; row < image.getHeight(); row++) {
                        for (int column = 0; column < image.getWidth(); column++) {
                                if (image.getRGB(column, row) != WHITE) {
                                        if (countBlackNeighbors(image, column, row) < 3) {
                                                image.setRGB(column, row, WHITE);
                                        }
                                }
                        }
                }

This pruning step removes any black pixels that are bordered by fewer than 3 black pixels. This is a fairly strict threshold, and will have the effect of smoothing/rounding out corners (i.e. some legitimate data will be discarded), but it will also clear out any spurious black pixels that exist in the image. Now our image looks like so:

PHPBB2 CAPTCHA, after sixth pass
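The countBlackNeighbors() helper used by the sixth pass is another of the omitted utility functions. A minimal sketch might look like the following, assuming that "neighbors" means the eight surrounding pixels and that positions off the edge of the image count as white. Both of those are guesses on my part; the original may well differ.

```java
import java.awt.image.BufferedImage;

/** Assumed version of the helper; the original is not shown in the post. */
final class Neighbors {

    /** Mirrors the sixth pass, which treats anything that is not pure white as black. */
    static boolean isBlack(BufferedImage img, int x, int y) {
        return (img.getRGB(x, y) & 0x00FFFFFF) != 0x00FFFFFF;
    }

    /** Counts black pixels among the (up to) eight neighbors of (column, row). */
    static int countBlackNeighbors(BufferedImage img, int column, int row) {
        int count = 0;
        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                if (dx == 0 && dy == 0) {
                    continue;  // skip the pixel itself
                }
                int x = column + dx;
                int y = row + dy;
                if (x < 0 || y < 0 || x >= img.getWidth() || y >= img.getHeight()) {
                    continue;  // off-image positions are treated as white
                }
                if (isBlack(img, x, y)) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Under this reading, a black pixel sitting in the middle of a one-pixel-wide stroke has only two black neighbors and would be pruned, which matches the observation that the threshold is fairly strict.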

The letters have taken on a softer, more rounded quality. They also happen to look vaguely reminiscent of what you might get if you were to scan a text document using an older scanner. Which is worth mentioning because we will eventually be feeding our cleaned-up CAPTCHA image to an optical-character-recognition program that is designed to process just this sort of data. First, however, our characters are all misaligned. We’ve come this far, so we might as well fix the alignment issue while we’re at it:

                //now find the characters
                List<CharacterBox> characters = new ArrayList<CharacterBox>();
                int totalCharWidth = 10;
                int maxCharHeight = 0;
                for (int column = 0; column < image.getWidth(); column++) {
                        int highestBlack = countVerticalWhite(image, column, 0);
                        if (highestBlack < image.getHeight()) {
                                totalCharWidth += 5; //5 px spacing in between chars
                                CharacterBox box = new CharacterBox();
                                box.setX(column);
                                while (column < image.getWidth() && countVerticalWhite(image, column, 0) < image.getHeight()) {
                                        int currentBlack = countVerticalWhite(image, column, 0);
                                        if (currentBlack < highestBlack) {
                                                highestBlack = currentBlack;
                                        }
                                        column++;
                                }
                                box.setWidth(column - box.getX());
                                box.setY(highestBlack - 5);
                                box.setHeight(image.getHeight() - highestBlack + 5); //can trim this later
                                if (box.getHeight() > maxCharHeight) {
                                        maxCharHeight = box.getHeight();
                                }
                                totalCharWidth += box.getWidth();
                                characters.add(box);
                        }
                }

Here we simply compute a bounding box for each distinct region of black pixels (i.e. each character), plus some additional padding so that our output image will draw nicely. Speaking of output image, we can now create it by positioning our characters in correct alignment with each other in a new image, like so:

                //output a new image with aligned characters
                BufferedImage dst = new BufferedImage (totalCharWidth, maxCharHeight,
                                                           BufferedImage.TYPE_INT_BGR);
                for (int column = 0; column < dst.getWidth(); column++) {
                        for (int row = 0; row < dst.getHeight(); row++) {
                                dst.setRGB(column, row, WHITE);
                        }
                }
                int xPos = 5;
                int yPos = 0;
                for (CharacterBox box : characters) {
                        for (int oldY = box.getY(); oldY < box.getY() + box.getHeight(); oldY++) {
                                for (int oldX = box.getX(); oldX < box.getX() + box.getWidth(); oldX++) {
                                        dst.setRGB(xPos + (oldX - box.getX()), yPos + (oldY - box.getY()), image.getRGB(oldX, oldY));
                                }
                        }
                        xPos += box.getWidth() + 5;
                }
                ImageIO.write(dst, "png", new File(OUTPUT));
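The CharacterBox type used in the last two listings is not shown either. Judging purely from how it is called, it is probably little more than a holder for four integers, along these lines. The clampTop() helper is my own addition, included because the expression highestBlack - 5 would go negative for a character that starts within 5 pixels of the top of the image.

```java
/** Guess at the omitted CharacterBox class, inferred from the getters and setters used above. */
final class CharacterBox {
    private int x;
    private int y;
    private int width;
    private int height;

    int getX() { return x; }
    void setX(int x) { this.x = x; }

    int getY() { return y; }
    void setY(int y) { this.y = y; }

    int getWidth() { return width; }
    void setWidth(int width) { this.width = width; }

    int getHeight() { return height; }
    void setHeight(int height) { this.height = height; }

    /** Hypothetical helper: top edge of the box with padding, never above row 0. */
    static int clampTop(int highestBlack, int padding) {
        return Math.max(0, highestBlack - padding);
    }
}
```

If the top edge is clamped like this, the height should be derived from it as image.getHeight() - box.getY(), so that the copy loop never reads past the bottom of the source image.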

Now we have the following:

PHPBB2 CAPTCHA, fully processed

The characters are nicely aligned and uniformly spaced. We now have something that is suitable for sending into a character-recognition program. For this example we use tesseract, a free and open-source OCR program that provides a good level of accuracy. We can send our output to tesseract like so:

                Process tesseractProc = Runtime.getRuntime().exec(TESSERACT_BIN + " " + OUTPUT + " " + TESSERACT_OUTPUT);
                tesseractProc.waitFor();

This invokes tesseract on our output image, and it writes its results to a text file located at ‘TESSERACT_OUTPUT’ (tesseract treats this argument as a base name and appends ‘.txt’ to it). In this case, the text file contains the following:

IKEECL

…which is 100% correct.
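For completeness, here is one way the invocation and the reading of the result could be wrapped up together. This is a sketch and not the code from the downloadable project: the class and method names are invented, it passes the arguments as a list via ProcessBuilder so that paths containing spaces survive intact, and it relies on tesseract writing its text to the output base name plus ‘.txt’.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical wrapper around the tesseract command line. */
final class TesseractRunner {

    /** Argument list equivalent to: tesseractBin imagePath outputBase */
    static List<String> command(String tesseractBin, String imagePath, String outputBase) {
        return Arrays.asList(tesseractBin, imagePath, outputBase);
    }

    /** Reads outputBase + ".txt" and strips the surrounding whitespace. */
    static String readResult(String outputBase) throws IOException {
        Path txt = Paths.get(outputBase + ".txt");
        return new String(Files.readAllBytes(txt), StandardCharsets.UTF_8).trim();
    }

    /** Runs tesseract, waits for it to finish, and returns the recognized text. */
    static String recognize(String tesseractBin, String imagePath, String outputBase)
            throws IOException, InterruptedException {
        Process proc = new ProcessBuilder(command(tesseractBin, imagePath, outputBase))
                .inheritIO()  // let tesseract's own messages reach the console
                .start();
        int exit = proc.waitFor();
        if (exit != 0) {
            throw new IOException("tesseract exited with status " + exit);
        }
        return readResult(outputBase);
    }
}
```

Checking the exit status is worth the extra few lines, since otherwise a missing or misconfigured tesseract binary only shows up later as a missing output file.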

Using a handful of very simple image filtering loops based around a brief examination of how a human being would approach the image, and some existing OCR software, the CAPTCHA has been defeated. Of course, this only works for this one specific style of CAPTCHA, but the basic approach of reducing noise, amplifying data, and isolating characters should be broadly applicable to a wide range of different CAPTCHA styles. The challenge lies not in breaking the CAPTCHA, but in devising an algorithm that can attempt to break any number of different CAPTCHA styles dynamically and with a success rate comparable to that of a human being. It needs a way to determine, from the CAPTCHA image itself, what kind of noise exists and how it should best be removed. That is the real challenge, and it’s beyond the scope of this article.

Note that for the sake of preserving some sense of brevity I’ve left out the implementation of some minor utility functions and variable declarations and the like. In general, you can assume that a function (or variable) does what its name implies. If, however, you would like a complete copy of the source-code used, you can download it using this link (zipped Eclipse project).

Note that in order to get it to run you will also need to install tesseract on your system, and edit the values at the start of the Java code to point at your local tesseract installation.

Posted in coding, java

Code Formatting Plugin Updated

I’ve just updated the code formatting/syntax highlighting plugin used by this site to the very cool Syntax Highlighter Evolved. This plugin is a marked improvement over the previous plugin, providing such niceties as automatic non-copyable line numbers, horizontal scrolling, built-in highlighting rules for just about any common language, and so on.

No other news to report at the moment, I just wanted to publicly show my support for this excellent WordPress plugin.

Update: Much as I still love this plugin, it did introduce a pretty severe layout bug in Internet Explorer (any post with a code block in it would be stretched beyond the width specified in the theme layout, causing the posts to overlap with the right-nav section). I fixed this by adding the following to my IE stylesheet:

.entry-content {
	width: 100%;
}

And while we’re on the subject, the image lightbox plugin also did not work correctly in Internet Explorer by default (the background overlay was solid black instead of translucent). I had to add the following CSS to get it working (again in the IE-only stylesheet):

#stimuli_overlay {
    display: none;
}

Note that this completely removes the background overlay. I could probably get the alpha set correctly without too much more fuss, but I think this works well enough for now.

To my fellow web-developers, I can only say the following: I know it’s fun to dump on IE for its lack of standards compliance, poor performance, and other failings, but at the end of the day IE is still the most used browser in the world, so it behooves us to always test our work in Internet Explorer when authoring online content or when authoring any code that will be used to generate online content.

Posted in banter, configuration, software

[Objective-C + Cocoa] The Poor-man’s Multithreading

I’ve been doing a fair bit of coding in Objective-C lately, more specifically using Apple’s Cocoa API. There’s a lot to like about Objective-C as a language. Predating Java by roughly a decade, here is a reflective language that more closely adheres to that fundamental object-oriented paradigm of message-passing between objects than many more-modern languages do. This is even more impressive considering that the language is built on top of C, a 38-year-old language with no support for objects, reflection, most other concepts introduced by the object-oriented paradigm, or even a convenient syntax for declaring and using data structures.

Conceptually, one can reason out that Objective-C accomplishes many of its neat tricks by routing every message send through a central dispatch routine (objc_msgSend in Apple’s runtime) and a few clever lookup tables placed in strategic locations. For instance, if the member functions for an object are referenced in a function table that is keyed by function name (or more appropriately, selector), and the dispatch routine always consults this table when performing a method invocation, then adding new functions to an object can be accomplished, even at runtime, by simply adding some new table entries. It’s a pretty elegant design, really, although that’s a subject for another day, perhaps.

The subject for today is a somewhat simple, somewhat less elegant method that is an integral part of the Cocoa API:

- (void)performSelector:(SEL)sel withObject:(id)arg 
		afterDelay:(NSTimeInterval)secs

What this method does is schedule a given method invocation on the current run loop, to be executed after some amount of delay, specified in seconds. As a brief aside, “run-loops” are an outdated, non-intuitive, and over-complicated concept, and they should go away. Viewed abstractly, a run loop is simply an execution context. It executes a stream of commands serially relative to other commands in the run loop, and in parallel relative to commands being executed in other run loops. In essence, a run loop is simply a thread. It provides an execution context that is asynchronous from other execution contexts in the runtime environment, just like a thread does. And there is little reason to complicate the matter by inventing a new concept to use to model something that every modern Computer Science student is exposed to in their first year of college.

But anyways, the performSelector: method (and its numerous variants) allows a developer to schedule a particular method invocation to occur at some later point in time on the current thread (I shall not be using the term “run loop” any longer). Other instructions on the thread continue to execute as normal between the time performSelector: is called and the time when the target method is invoked (i.e. using this method does not block the current thread). It’s almost, but not quite, like scheduling the execution of a function closure, and can be used to create a kind of poor-man’s multi-threading.

As noted, the method invocation is actually scheduled to execute on the current thread, so there is no true concurrency occurring here; merely the illusion of concurrency. But in many cases, the illusion of concurrency is all that’s really required. To draw a parallel to other languages, the performSelector: method is comparable to the setTimeout() function that is present in both JavaScript and ActionScript. Both of these languages use a single-threaded runtime environment, but can provide the illusion of concurrency through use of the setTimeout() call. Conceptually, the execution engine that these languages use is simply processing a queue of function closures. While there is a closure that is scheduled to execute, the engine dequeues and executes it, otherwise it sleeps until there is a new closure scheduled in the queue. And so it is in a Cocoa thread, except that instead of a queue of function closures, you have a queue of selectors (or “messages”, or “method invocations”, as you prefer).
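To make the "queue of closures" model concrete, here is a toy version of such an execution engine, written in Java to match the earlier post. It is only an illustration of the concept and not how Cocoa or any JavaScript engine is actually implemented: the class name TinyLoop and its methods are invented, and it uses a virtual clock instead of really sleeping so that the behavior is easy to follow.

```java
import java.util.PriorityQueue;

/** Toy single-threaded task loop with a virtual clock; all names are illustrative only. */
final class TinyLoop {

    private static final class Task implements Comparable<Task> {
        final long dueAt;
        final long seq;
        final Runnable body;

        Task(long dueAt, long seq, Runnable body) {
            this.dueAt = dueAt;
            this.seq = seq;
            this.body = body;
        }

        @Override
        public int compareTo(Task other) {
            int byTime = Long.compare(dueAt, other.dueAt);
            return byTime != 0 ? byTime : Long.compare(seq, other.seq);  // FIFO among ties
        }
    }

    private final PriorityQueue<Task> queue = new PriorityQueue<>();
    private long now = 0;      // virtual milliseconds
    private long nextSeq = 0;

    /** Rough analogue of performSelector:withObject:afterDelay: or setTimeout(). */
    void postDelayed(Runnable body, long delayMs) {
        queue.add(new Task(now + delayMs, nextSeq++, body));
    }

    long now() {
        return now;
    }

    /** Runs queued tasks one at a time, in due order, until none are left. */
    void runUntilIdle() {
        while (!queue.isEmpty()) {
            Task next = queue.poll();
            now = Math.max(now, next.dueAt);  // stand-in for sleeping until the task is due
            next.body.run();                  // a task may call postDelayed() to reschedule itself
        }
    }
}
```

Nothing here ever runs in parallel: each task runs to completion before the next one starts, and a task that wants to "keep running" simply posts itself again with a new delay. A task that takes too long delays everything queued behind it.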

There are a number of uses for this not-quite-concurrency, such as monitoring the status of an object or operation and then posting a notification to some delegate when that status changes. For instance, here is some simple code to detect the end of a UIImageView startAnimating sequence:

@implementation AnimatingView 
- (void) checkStatus {
	if([self isAnimating]) {
		//animation is still running, check later
		[self performSelector:@selector(checkStatus)
			withObject:nil afterDelay: 0.1];
		return;  //be sure to do this here
	}

	//animation is done, notify and release our delegate
	[animDelegate notifyAnimationComplete];
	[animDelegate release];
	animDelegate = nil;  //don't leave a dangling pointer behind
}

- (void) runAnimationAndNotify: (id) objToNotify {
	//[set up your animation stuff here]
	
	//retain the object that we want to notify
	animDelegate = [objToNotify retain];
	[self startAnimating];  //start the animation
		
	//schedule a status notification check
	[self performSelector:@selector(checkStatus)
		withObject:nil afterDelay: 0.1];
}
@end

In many other languages, the above code can only be implemented by spawning a new thread to monitor the status of the animation (or by blocking/sleeping the current thread while the animation is running). You can do it that way in Objective-C as well, but the above method has a couple of advantages over using a dedicated thread:

  • No thread creation overhead. Since the current thread is used, no additional resources need to be allocated on spawning a thread just to handle the notification dispatch.
  • No need to worry about concurrency/synchronization issues. Because the notification is sent from the same thread that started the animation, there is no need to worry about the notification arriving when the calling object is in the middle of doing something else (unless your program has other threads that are making use of that same object).

Of course, there are some caveats associated with this approach, as well:

  • If the current thread blocks or sleeps, the notification will be delayed. Obviously if you are doing things on your thread that would cause it to block or sleep for a long duration of time, you should not use this approach.
  • If too much other work is being done on the current thread, then the timing of the notification will be unreliable. That’s just the nature of the beast, just like how ActionScript and JavaScript timers lose their accuracy as CPU load approaches 100%. If your workload is heavy enough to saturate a CPU core, you are better off splitting it up into multiple full-fledged threads.

Overall, however, this is a useful technique for those situations where you want concurrent behavior without having to pay the resource and synchronization cost of true concurrency. It’s not appropriate in every situation, but if you have a simple task that you’re thinking about offloading onto a dedicated thread, this approach may be a quicker and simpler alternative.

Finally, a coding post!

Posted in coding, concurrency, objective-c

Working FLAC and Vorbis Support in Windows Media Player

Hot on the heels of the “SMOOTH Processor” debacle, I’m faced with the task of reapplying one of my favorite Windows 7 configuration tweaks. Namely, the addition of FLAC (and also Vorbis, Speex, and so on) support to Windows Media Player. Now I know there are a variety of free media players that have this built-in, and for the longest time I would actually use Winamp3 (along with the quite awesome queue-sidecar plugin) to handle all of my media playback needs. But as time wore on and operating-systems evolved I didn’t have the patience to continue coaxing my Winamp3 install into working with the next latest and greatest OS version, nor the tolerance for its sporadic but annoying random crashes, and when Windows 7 rolled out I was presented with a compelling reason to ditch Winamp3 (and every other alternative) in favor of Windows Media Player: Homegroups.

Homegroups allow for much simplified sharing of multimedia and other content between computers on a network. There is very little explicit configuration needed, and when an application integrates with the Homegroup properly there is really no noticeable difference between a local resource and one that’s being streamed from some other computer on the Homegroup. This makes for a very cool experience when it all works, and in Windows Media Player it works seamlessly. My media library can be distributed across several systems, and yet from each one I can access the entire volume as if it is all local to that particular machine. And it just works, flawlessly, with no onerous setup or manual cajoling needed on my part. This is what good software is supposed to do: merge invisibly into the background so that the user can accomplish a complex task as if by magic, and currently only Windows Media Player does it.

I’m sure that situation will change in the near future, but at the moment its seamless Homegroup integration makes Windows Media Player the only media player that I am interested in using. And that means that some of its other shortcomings, such as a complete lack of support for a variety of free and open codecs and file formats, need to be dealt with. Now there are a number of tutorials on this subject already, but many of them give inaccurate or incomplete information. In the interest of having a complete set of instructions that actually work, if you want to enable FLAC/Vorbis/etc. support in Windows Media Player, do the following:

  1. Install the DirectShow filters/codecs.
  2. Install the Tag Support plugin.
  3. OPTIONAL: If, like me, you included your FLAC folders/files in your library before installing the above packages, then you also need to rebuild your Windows Media Player library. Make sure you close Windows Media Player and shut down its network sharing service either by stopping it under ‘Control Panel -> Administrative Tools -> Services’ or by killing the ‘wmpnetwk.exe’ process in the Task Manager before attempting to delete or rename its configuration folder.

And that’s really all there is to it. You may or may not end up with the ability to seek within and/or see the duration of FLAC files. I’ve seen it happen both ways on occasion; it seems like it either works or not depending upon the mood that Windows Media Player happens to be in when these plugins are installed. But the important bit is that all the previously unsupported files will now show up correctly in your library, and also (of course) that they are now playable.

And in other news, the rebuilding continues, slowly.

Posted in configuration, software

Disaster!

So I just had an HDD die on me. Do I have a backup that I can restore from? Of course not; though before you chastise me too severely perhaps I should explain my system configuration. I run a custom-built PC with two RAID arrays for storage. The first array is a RAID-0 array of two Western Digital Raptor (10,000 RPM) disks and is intended to be used only for things like OS and program installation. The second array is a RAID-5 array using 4 Seagate (7200 RPM) disks, and is intended for storing pretty much everything else. Now I’ve had disks in the RAID-5 array fail multiple times in the past, and it was no big deal because I could just RMA the disk and pop in a replacement drive and the array would rebuild itself and all would be well.

The problem is that this time it wasn’t one of the RAID-5 disks that failed, it was one of the RAID-0 disks. And although I had always recognized that volume as being vulnerable to spontaneous destruction in the event of a drive failure, the gravity of the thing when it actually happens far outweighs the mere recognition of it as a possibility. Yes, I treated the contents of this volume as expendable. But no, I wasn’t perfect about it in practice. There were some non-expendable things that I allowed to linger on the RAID-0 volume, such as my Eclipse workspace and inside of it all 64 of my Project Euler solutions, and now that they’re suddenly gone it stings just a bit. And even ignoring that, I’d rather not have to install and configure all of my software all over again. Particularly since I was also in the middle of playing through Fallout: New Vegas on hardcore mode, and my save-games got wiped out with the program install.

So I did some investigation of my own into the failed drive, hoping that I might find some way to bring it back to life, at least for long enough to get my data off of it. Standard home remedies such as tapping, dropping, and shaking the drive were all tried, along with the fabled freezer-trick, to no avail. Eventually I decided to go out and get a Torx set so that I could remove the drive’s controller board, and I located the problem:

Smooth Processor:  Destroyed

Looks like the “SMOOTH” processor isn’t feeling too smooth anymore. On the bright side, this does open up one possible avenue for resurrecting the drive. If I can track down another model with an identical firmware version, it should be possible to swap their controller boards. So if anyone has (or knows where to find) an old Western Digital Raptor, working or non (so long as the controller board is intact), that you’d be willing to part with, please let me know. The exact details of my failed drive are:

Model: WD1500ADFD-00NLR1
Manufacture date: 05 June, 2006
DCM: HBCA2AB
Firmware revision: V7353 (can only be found by removing the controller board)

I know this is a long shot, but who knows. The Internet is a very large place, after all. In the meantime, perhaps I’ll seek some estimates from professional data-recovery companies. I’m sure their rates will be laughably unreasonable.

Sigh. Time to rebuild.

Posted in banter

Math Geeks Only

I wanted to take a brief moment to call some attention to Project Euler. Though I stumbled across it just recently, Project Euler has been around for nearly a decade now, and I find myself wishing that I had discovered it earlier. Like back when I was still in college and had loads of time to waste on arbitrary diversions.

But in any case, Project Euler is a set of mathematics-themed programming problems that anyone can attempt to solve. There are just over 300 problems available at the moment, and the difficulty increases progressively as you move from problem to problem. Solve enough of them, and you can be immortalized on the Project Euler leader-board (100 solutions are required to earn the first permanent ranking), if you care about that kind of thing.

Since finding Project Euler I’ve been compulsively solving problems almost non-stop. They start out quite easy, though some of the later ones do require a thoughtful approach to solve effectively, particularly when you consider their “one-minute rule” in conjunction with the fact that there have been 5 to 6 iterations of Moore’s Law between now and when Project Euler was founded. One minute of computation time back then should equate to roughly just 1-2 seconds of computation time on a modern CPU (or perhaps more like 8-12 seconds for a single-threaded solution, given that the past few iterations of Moore’s Law have focused more on increasing thread-level parallelism than on performance-per-clock in single-threaded workloads), which makes the “one-minute rule” quite a bit more challenging.
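
The back-of-the-envelope estimate is easy enough to reproduce. What follows is only a sketch, and it assumes that each iteration of Moore’s Law is a clean doubling of usable performance, which is a generous simplification. The class and method names are made up for illustration.

```java
/** Rough Moore's Law arithmetic. Assumes one "iteration" is an exact doubling of performance. */
final class MooreEstimate {
    private MooreEstimate() {}

    /** Time needed today for work that once took baselineSeconds, after the given number of doublings. */
    static double scaledSeconds(double baselineSeconds, int doublings) {
        return baselineSeconds / Math.pow(2, doublings);
    }

    public static void main(String[] args) {
        // The "one-minute rule" budget, scaled by 5 and then 6 doublings.
        for (int doublings = 5; doublings <= 6; doublings++) {
            System.out.printf("%d doublings: %.2f s%n", doublings, scaledSeconds(60.0, doublings));
        }
    }
}
```

Plugging in a smaller number of doublings gives a feel for the single-threaded case, where per-core gains have lagged well behind the headline figure.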

If you feel up to the task and don’t mind the possibility of introducing a massive time-sink into your life, I encourage you to give Project Euler a go. If you’re a coder with a head for numbers who likes to be challenged, I think you’ll find yourself enjoying it.

If you do take a shot at it, my advice is to keep your solutions as modular as possible (don’t just stuff all your code into main()). Many of the later problems build incrementally upon concepts introduced in previous problems, so if you always take the time to construct modular and well-defined functions (particularly in the areas of primality testing, factorization, and permutation computation) that can be reused in multiple solutions, you will be ahead of the curve.
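
To give a rough idea of what I mean, here is the sort of small helper class that ends up being reused across many solutions. Treat it as a minimal sketch rather than a tuned implementation: the EulerUtils name is hypothetical, and plain trial division is assumed to be fast enough, which holds for moderately sized inputs but not for the larger problems.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reusable helpers of the kind many Project Euler solutions share. */
final class EulerUtils {
    private EulerUtils() {}

    /** Trial-division primality test; adequate for moderately sized n. */
    static boolean isPrime(long n) {
        if (n < 2) return false;
        if (n < 4) return true;                       // 2 and 3 are prime
        if (n % 2 == 0 || n % 3 == 0) return false;
        for (long i = 5; i * i <= n; i += 6) {        // remaining candidates have the form 6k - 1 or 6k + 1
            if (n % i == 0 || n % (i + 2) == 0) return false;
        }
        return true;
    }

    /** Prime factors of n in ascending order, repeated according to multiplicity. */
    static List<Long> primeFactors(long n) {
        List<Long> factors = new ArrayList<>();
        for (long p = 2; p * p <= n; p++) {
            while (n % p == 0) {                      // strip out every copy of p before moving on
                factors.add(p);
                n /= p;
            }
        }
        if (n > 1) factors.add(n);                    // whatever remains is itself prime
        return factors;
    }

    public static void main(String[] args) {
        System.out.println(isPrime(97));              // true
        System.out.println(primeFactors(13195));      // [5, 7, 13, 29]
    }
}
```

With something like this in place, a new solution can call isPrime() or primeFactors() directly instead of re-deriving them inside main() every time.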

Posted in banter

How to Write Good Code

Xkcd probably ties for second with Penny Arcade in my list of favorite webcomics (the #1 spot goes to Questionable Content, despite my unending annoyance at the fact that the author has yet to have Marten hook up with Faye). For those not familiar, Xkcd tends to focus on popular science, algorithms, coding, and other similarly geeky topics. Not long ago the following gem was published:

Xkcd-844

Now I’ve spent enough time in the tech industry to recognize that on many occasions this comic is as accurate as it is humorous. When working on any project of substance it is far more often the case that the requirements are a moving target, a living and breathing entity being pushed and pulled and cajoled and mutated by any number of external influencers, from the UX department who have just completed Yet Another Usability Study™ and concluded that The Entire UI Needs To Be Redone™ to the lead architect who wants to restructure the internals to use his new design that will Solve All Problems Forever™ to the engineers working deep in the code who have just discovered that The Feature You Want Is Not Feasible™.

The pragmatic coder is left with two basic options, often punctuated with an Unreasonable Deadline™: either code fast and try to meet the requirements before they change, or do things correctly and hope the requirements don’t change too much before you get there. More experienced coders know how to code fast better and/or how to code well faster, but the basic options remain the same, and often you will still be faced with scenarios where you’ve coded yourself into a corner, or had the requirements change before you got there.

So what to do in these situations? As much as I appreciate the tongue-in-cheek humor of the original comic, I believe there is a simple solution to the problem that it poses. Namely, refactoring. Refactoring fits in like so:

Xkcd #844-a

Refactoring is the art of transforming What I Have™ into What I Need™ and getting from I Finally Made It Work™ to I Can Make It Work Anywhere™ without having to throw away and rewrite 90% of your code, and the ability to refactor effectively is often the only thing truly separating good engineers from poor ones.

This is a subject that should be covered in any serious computer-science curriculum, yet unfortunately the first experience many engineers will have with refactoring will be when their first employer asks them to turn That Really Old Codebase That Nobody Understands™ into The Next Big Thing™. At this point one of two things will happen. Either they will refactor That Really Old Codebase That Nobody Understands™ into Something I Can Work With™, or they will decide that We Have to Rebuild It From Scratch™. Either approach can take a long time, but one is guaranteed to be far more disruptive and risk-prone than the other.

Embrace refactoring. It won’t save your life, lead you to salvation, or guarantee you eternal paradise, but it just might save your project, or even your company, from failure.

Posted in banter, coding, process, refactoring

Apache + PHP = Headache

Fresh out of installing and configuring Apache and PHP so that I could get this WordPress blog up and running, I feel there’s a topic worthy of some brief mention here.  Namely, the proper (by which I mean not in the “pedantically correct” sense but more in the “do this and it will actually work” sense) way to get Apache and PHP talking to each other after installing them both for the first time.  And to be as clear as possible here, this information pertains specifically to Apache version 2.2.17 and PHP version 5.3.5 (thread-safe).  Mileage may vary with different software versions.

Now back in the day, I would use a nifty little SourceForge project called phptriad to handle all of the little installation and configuration nonsensicalities for me.  Phptriad was quick, easy, and generally worked straight away.  There’s just one major problem with it: the project hasn’t seen any updates since September 9th of the year 2000.  Using it nowadays consigns one to hopelessly outdated versions of Apache and PHP (and also MySQL), versions so outdated that they do not meet the minimum requirements for running WordPress.

So I decided that this time around I would do things the correct way, and install the latest version of Apache, the latest version of PHP, and then configure them to talk with each other.  “How hard can it be”, I recall myself saying.  The PHP installer even had an Apache configuration step that seemed like it would handle the entire process automatically.  But sadly that was not the case; the automatic configurator did nothing useful, and while the configuration process was not hard it was exceedingly poorly documented and circuitous, turning what should have been a 2-minute task into a nearly 2-hour endeavor.

Now, do a quick Google search for Apache and PHP configuration issues and you will find countless posts suggesting adding minor variations of the following directives to Apache’s httpd.conf file:

AddHandler application/x-httpd-php .php .phtml
AddType application/x-httpd-php .php

Seems like a simple enough fix, but none of the myriad variations of these two directives did the trick. It makes sense when you think about it, considering that all these directives do is associate .php files with a MIME type. Left wide open is the question of how Apache is supposed to know what that MIME type actually means, how it should be handled, and where to find the handler. Yes, the PHP module (php5_module) is being loaded as well, but the module isn’t magic; Apache still needs to know when and how it should use it. So essentially what’s missing is a mapping to tell Apache “for MIME type ‘x’, use program ‘y’”. After far too much searching, I found a page that also mentioned adding the following directive:

Action application/x-httpd-php "/path/to/your/php.exe"

And finally there was the missing piece. The AddType (or if you prefer, AddHandler) directive associates .php files with a particular MIME type, and the Action directive maps that MIME type to a concrete resource that Apache can use to process items of that type. Apache now knows both what a .php file is, and how to process it, and like magic the server behaves as it should. All told, far too much digging was needed to uncover that one critical line.
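
For the code-minded, the two-step lookup as I understand it can be pictured as a pair of maps. This is purely a toy model of the behavior described above, not a reflection of how Apache is implemented internally, and the HandlerLookup class and its methods are hypothetical names invented for the illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of the extension -> MIME type -> handler lookup. Not Apache's actual implementation. */
final class HandlerLookup {
    private final Map<String, String> typeByExtension = new HashMap<>(); // what "AddType" fills in
    private final Map<String, String> actionByType = new HashMap<>();    // what "Action" fills in

    void addType(String mimeType, String extension) {
        typeByExtension.put(extension, mimeType);
    }

    void action(String mimeType, String handler) {
        actionByType.put(mimeType, handler);
    }

    /** Resolves a file name to a handler; empty unless BOTH mappings are present. */
    Optional<String> resolve(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) return Optional.empty();
        String mimeType = typeByExtension.get(fileName.substring(dot));
        // Optional.map yields an empty Optional when the handler lookup returns null.
        return Optional.ofNullable(mimeType).map(actionByType::get);
    }

    public static void main(String[] args) {
        HandlerLookup conf = new HandlerLookup();
        conf.addType("application/x-httpd-php", ".php");
        System.out.println(conf.resolve("index.php")); // Optional.empty (type known, no handler yet)
        conf.action("application/x-httpd-php", "/path/to/your/php.exe");
        System.out.println(conf.resolve("index.php")); // Optional[/path/to/your/php.exe]
    }
}
```

The first lookup comes back empty for the same reason my server did nothing useful: the extension maps to a type, but the type maps to nothing.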

For reference, the complete PHP configuration section in my httpd.conf file looks like so:

PHPIniDir "~/apache/php"
LoadModule php5_module "~/apache/php/php5apache2_2.dll"
AddType application/x-httpd-php .php
Action application/x-httpd-php "~/apache/php/php.exe"

I suspect that the actual MIME type assigned to .php files matters very little here, so long as the type is consistent between the ‘AddType’ directive and the ‘Action’ directive.

Posted in Apache, configuration, software

Introduction

I suppose that’s how these things start, right, with an introduction?  Well my name is Adam Roth, and if you’re reading this then you have found your way to my blog.  While the primary focus will be on coding, software, and other technical issues, expect to find occasional digressions onto other tangential or even wholly unrelated topics as well.

To give you some details about myself, I am a United States citizen by birth, and a permanent resident of Australia by choice.  I hold a Master’s degree in Computer Science, and have worked with a diverse array of technology companies ranging in size from the 5-person startup all the way up to 1000+-employee technology firms.  I intend this blog primarily as a place to explore (and perhaps more importantly, keep track of) interesting and challenging coding problems and their possible solutions.

Those who know me casually might describe me as a misanthrope, but they largely miss the point.  I hold no ill views towards humanity as a whole (except in cases where it is deserved, such as instances of mass stupidity, hysteria, or generalized group-think that result in tangible acts of inhumanity and/or neglect towards others).  I’m quite fond of it, in fact, and interested in doing what I can in my own small but unique way to help humanity progress further along (or with less hyperbole, to help those others that I am capable of helping).

What I don’t like, and what those aforementioned casual acquaintances so often mistake as a generalized disregard for humanity, is being around other people.  I just find it draining more often than not.  Always have, and probably always will.  Of course, there is the occasional rare exception to this rule, such as my wife and the small handful of people that I count as my true friends.

But I ramble on.  Suffice to say that this is my blog, and I hope that the content on it will prove to be of use to you.

Posted in banter