Optical Character Recognition in C# Using the MODI Library on 64-bit Windows
Microsoft has shipped an OCR solution with Office since Office 2003: the Microsoft Office Document Imaging (MODI) library. It is installed by default with Office 2003, but with Office 2007 you have to add it by customizing the setup. Run the Office 2007 installer, select Add or Remove Features, click Continue, and change the status of the Imaging component to “Run from My Computer”. Then make sure it installed correctly.
If you have Office 2003 or Office 2007 installed with this component, OCR is available for you to use. The only dependency it adds to your software is Office itself. If your client can guarantee that the machines your software will run on have Office 2007 installed, you can provide this functionality easily and at no extra cost.
The MODI library is a 32-bit COM component, so it cannot be loaded by a 64-bit process. On a 64-bit OS (either XP or Vista) you therefore have to build your application as a 32-bit process.
In Visual Studio 2008, right-click the project, select Properties, go to the Build section, and change the value of “Platform target” to x86.
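For reference, the Platform target setting is stored in the project's .csproj file, so it can also be checked or set there. The fragment below is a minimal sketch. In a real project the property normally sits inside an existing PropertyGroup that carries a Condition for a specific configuration (for example Debug or Release), so it may need to be set once per configuration.

```xml
<!-- Sketch of the relevant .csproj setting; merge it into the project's existing PropertyGroup(s). -->
<PropertyGroup>
  <!-- Force a 32-bit process so the 32-bit MODI COM component can be loaded on a 64-bit OS. -->
  <PlatformTarget>x86</PlatformTarget>
</PropertyGroup>
```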
Adding a Reference to the MODI Library.
The COM object you need to add as a reference is Microsoft Office Document Imaging 12.0 Type Library for Office 2007, or Microsoft Office Document Imaging 11.0 Type Library for Office 2003.
After adding the reference, you can use the OCR library as shown in the following code.
Using the MODI Library.
string strText = "";

// Instantiate the MODI.Document object.
MODI.Document md = new MODI.Document();

// The Create method grabs the picture from
// disk and prepares it for OCR.
// A verbatim string (@"...") keeps the backslash in the path literal.
md.Create(@"c:\abc.bmp");

// Do the OCR.
md.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

// Get the first (and only) image.
MODI.Image image = (MODI.Image)md.Images[0];

// Get the layout.
MODI.Layout layout = image.Layout;

// Loop through the words.
for (int j = 0; j < layout.Words.Count; j++)
{
    // Get this word.
    MODI.Word word = (MODI.Word)layout.Words[j];

    // Add a blank space to separate words.
    if (strText.Length > 0)
    {
        strText += " ";
    }

    // Add the word.
    strText += word.Text;
}

// Close the MODI.Document object without saving changes.
md.Close(false);
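The same steps can be wrapped in a reusable method. The following is only a sketch: the class name ModiOcr and the method name ReadText are hypothetical, and it assumes the project already references the MODI type library and is built with the x86 platform target. It uses only the MODI calls shown above, adds an early check that the process is 32-bit, and closes the document in a finally block so that it is released even if OCR fails.

```csharp
using System;
using System.Text;

// Hypothetical helper class; the names are illustrative and not part of MODI.
static class ModiOcr
{
    public static string ReadText(string imagePath)
    {
        // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process.
        // MODI is a 32-bit COM component, so fail early with a clear message.
        if (IntPtr.Size != 4)
        {
            throw new PlatformNotSupportedException(
                "MODI requires a 32-bit process; set Platform target to x86.");
        }

        MODI.Document md = new MODI.Document();
        try
        {
            // Load the image from disk and run OCR on it.
            md.Create(imagePath);
            md.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            // This sketch reads only the first image of the document.
            MODI.Image image = (MODI.Image)md.Images[0];
            MODI.Layout layout = image.Layout;

            // StringBuilder avoids repeated string reallocation in the loop.
            StringBuilder text = new StringBuilder();
            for (int j = 0; j < layout.Words.Count; j++)
            {
                MODI.Word word = (MODI.Word)layout.Words[j];
                if (text.Length > 0)
                {
                    text.Append(' ');
                }
                text.Append(word.Text);
            }
            return text.ToString();
        }
        finally
        {
            // Close the document without saving changes.
            md.Close(false);
        }
    }
}
```

A call such as string text = ModiOcr.ReadText(@"c:\abc.bmp"); would then return the recognized words separated by single spaces.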
That’s all.