Philip Newton (pne) wrote,

FamilySearch Indexing, or Human OCR

The other day I read in the Ensign about the project "FamilySearch Indexing".

Basically, the LDS Church / Genealogical Society of Utah (not sure which exactly) has a couple of million microfilms containing millions of names from "110 countries and principalities" -- but it's hard to find anything in those data, so they want to digitise those records and create electronic indexes which will be available online.

To that end, they created a program which downloads one page at a time (as an image) and lets a person enter the data from the form into fields and submit those data; this program was made available at first to select stakes but is now available to the general public -- whether they are members of the LDS Church or not.

So, if you'd like to give it a try, have a look at the FamilySearch Indexing site. Possible reasons are: to give you something to do while you're bored, to help out with family history/genealogy, to keep your fingers nimble, or to get a small glimpse into what life was like around 1900 (most of the data available for processing now are from 1900 census forms).

I've tried it a little so far, and it can be quite fun and even a bit addictive: you end up doing "just another line" or "just another batch".

A batch, the smallest work unit, is usually fifty names, and takes "usually around 30 minutes" to index. You get a week to work on a given batch, and you don't need an Internet connection to work on the names (only to submit your results and pick up a new batch or three). You can also work on a batch in bits; it'll let you pick up where you left off.

The software is written in Java and is available via Java Web Start (just click on a link and it will likely "just work" if you have Java installed) as well as through a small installer. It is officially supported on Windows, but people use it on Macs and Linux boxen, too. It even works through proxies, though you may have to do some fiddling with the Java Web Start configuration to get it to work with a web proxy.

Give it a try :)
