Nowadays most blog and forum software integrates some form of captcha to protect against automated OCR bots flooding blog comments or forum posts with spam. These automated bots (which I will not cover in detail) learn by studying different captchas over time. Here is an example of an OCR bot trying to learn from the captcha through brute force. The forum software logs “access denied” attempts:
On Linux there is an OCR program called gocr, which can be trained to recognize captcha text (or any text in image format). Here is an example:
gocr has many options, including filtering noise out of the image and either using the default character database or adding your own for it to learn from.
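As a rough illustration of how those options fit together, here is a minimal Python sketch that assembles a gocr command line and runs it. The helper names (`build_gocr_command`, `run_gocr`) and the file name `captcha.pnm` are made up for this example. The flags are the ones I recall from the gocr man page (`-i` for the input image, `-d` for the dust size used in noise removal, `-m 2` with `-p` to use a character database directory, `-C` to restrict the recognized characters), so check them against the version installed on your system before relying on them.

```python
import shutil
import subprocess


def build_gocr_command(image_path, dust_size=None, db_path=None, charset=None):
    """Assemble a gocr command line as a list of arguments (hypothetical helper)."""
    cmd = ["gocr"]
    if dust_size is not None:
        # -d: treat clusters smaller than this many pixels as noise (assumed flag)
        cmd += ["-d", str(dust_size)]
    if db_path is not None:
        # -m 2: use a character database, -p: where that database lives (assumed flags)
        cmd += ["-m", "2", "-p", db_path]
    if charset is not None:
        # -C: only recognize characters from this list, e.g. "0-9A-Z" (assumed flag)
        cmd += ["-C", charset]
    cmd += ["-i", image_path]
    return cmd


def run_gocr(image_path, **options):
    """Run gocr on an image and return the recognized text."""
    if shutil.which("gocr") is None:
        raise RuntimeError("gocr is not installed or not on PATH")
    result = subprocess.run(
        build_gocr_command(image_path, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Print the command only; call run_gocr(...) instead once gocr is installed.
    print(" ".join(build_gocr_command("captcha.pnm", dust_size=10, charset="0-9A-Z")))
```

Building the argument list separately from running it makes it easy to inspect the exact command before pointing it at a real image.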
Check out the gocr man page for the full list of options: http://www.penguin-soft.com/penguin/man/1/gocr.html