Cross-Platform Scanning Library

For a long time, it has annoyed me that you need special software to create a multi-page PDF document from a flatbed scanner. Because "scanner" is an ambiguous word (which REALLY makes searching difficult!), let me clarify: I mean, for example, the process by which an HP F4280 printer/scanner device optically "scans" a paper document and creates a digital representation of it, usually ending up as a .jpg or .tif file.

The problem is that I know of no program that is both cross-platform and capable of creating multi-page scans from a flatbed scanner. To do this, the software has to store each scanned page, one at a time, and ask the user whether they want to scan another page. You cannot rely on the driver to handle multi-page scanning, because drivers only do that for sheet-fed scanners (those with an automatic document feeder) that will "suck in" a stack of paper without pause and combine the data before handing it off to the driver framework.
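The scan-store-ask loop described above could be sketched roughly as follows. This is only an illustration: `MultiPageScan`, `collectPages`, `scanOnePage`, and `wantsAnotherPage` are hypothetical names, with the two parameters standing in for a real driver call and a real GUI prompt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Hypothetical helper: collects flatbed pages one at a time. */
final class MultiPageScan {
    private MultiPageScan() {}

    /**
     * Scans a page, stores it, then asks whether to continue.
     * scanOnePage is a placeholder for the real driver call;
     * wantsAnotherPage is a placeholder for the "scan another page?" prompt.
     */
    static List<byte[]> collectPages(Supplier<byte[]> scanOnePage,
                                     BooleanSupplier wantsAnotherPage) {
        List<byte[]> pages = new ArrayList<>();
        do {
            // Store each page before asking, so nothing is lost if the user stops.
            pages.add(scanOnePage.get());
        } while (wantsAnotherPage.getAsBoolean());
        return pages;
    }
}
```

The loop always scans at least one page; the collected byte arrays would then be handed to whatever PDF library produces the final document.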

I am therefore starting on the road to creating such a program. At the low level, I want to support the TWAIN and SANE interfaces, on Windows XP or later, Mac OS X, and popular Linux 2.6 distros of approx. 2008 or later vintage.

Here is how I envision it will work. First, I need a library that provides a uniform API regardless of whether it is using TWAIN or SANE on the backend. The API must be able to provide a byte array of the bits, compressed or otherwise, of the scanned data. A few properties that are shared between TWAIN and SANE should be exposed: for example, the DPI, paper size, and color/grayscale/lineart. Then it should be possible to either specify explicitly, hint, or query the resulting image format of the byte array that is returned from a successful scan.
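To make the shape of that API concrete, here is a minimal Java sketch. Every type and method name in it (`ScannerBackend`, `ScanSettings`, `ScanResult`, `ScanException`, `ColorMode`) is made up for illustration; none of them comes from TWAIN, SANE, or any existing library, and a real design would likely need more properties.

```java
/** Colour modes mentioned above as common to both backends. */
enum ColorMode { COLOR, GRAYSCALE, LINEART }

/** Hypothetical unchecked exception for a failed scan. */
class ScanException extends RuntimeException {
    ScanException(String message) { super(message); }
}

/** Hypothetical settings shared by both backends: DPI, paper size, colour mode. */
final class ScanSettings {
    final int dpi;
    final double widthMm;
    final double heightMm;
    final ColorMode mode;

    ScanSettings(int dpi, double widthMm, double heightMm, ColorMode mode) {
        if (dpi <= 0 || widthMm <= 0 || heightMm <= 0) {
            throw new IllegalArgumentException("dpi and paper size must be positive");
        }
        if (mode == null) {
            throw new IllegalArgumentException("mode must not be null");
        }
        this.dpi = dpi;
        this.widthMm = widthMm;
        this.heightMm = heightMm;
        this.mode = mode;
    }

    /** Convenience factory for A4 paper (210 mm x 297 mm). */
    static ScanSettings a4(int dpi, ColorMode mode) {
        return new ScanSettings(dpi, 210.0, 297.0, mode);
    }
}

/** The scanned bytes plus the image format they are encoded in. */
final class ScanResult {
    final byte[] data;
    final String format; // e.g. "image/jpeg" or "image/tiff"

    ScanResult(byte[] data, String format) {
        this.data = data;
        this.format = format;
    }
}

/** Uniform front end; one implementation would wrap TWAIN, another SANE. */
interface ScannerBackend {
    /** Human-readable backend name, e.g. "TWAIN" or "SANE". */
    String name();

    /** Performs one scan; throws ScanException if the scan fails. */
    ScanResult scan(ScanSettings settings);
}
```

Here the caller queries the resulting image format through `ScanResult.format`; specifying or hinting the format up front would need an extra field on `ScanSettings`.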

I'm not particularly concerned with the programming environment: I am familiar with all the popular imperative/OO languages. I'd prefer something like Java or C#, but C or C++ would work fine too.

Once I have such a library in hand, I can construct a GUI that just calls the routines as needed, and use one of several available PDF generation libraries to create the output. I don't insist on "compile once, run anywhere", but I do insist on "write once, compile anywhere" at the least. Of course, because TWAIN and SANE are different APIs that are typically used on different platforms, at some point there will have to be some #ifdefs or another method of distinguishing between platforms to determine which API to use.
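In a managed language such as Java, the platform switch could happen at run time instead of through #ifdefs. The sketch below is one hedged possibility: `ScanApi` and `BackendSelector` are hypothetical names, and mapping Windows and Mac OS X to TWAIN and everything else to SANE is an assumption made for illustration, not a statement about what each platform actually supports.

```java
import java.util.Locale;

/** The two low-level scanning APIs discussed above. */
enum ScanApi { TWAIN, SANE }

/** Hypothetical run-time replacement for a compile-time #ifdef. */
final class BackendSelector {
    private BackendSelector() {}

    /**
     * Picks an API from an OS name string.
     * Assumed mapping: Windows and Mac OS X use TWAIN, everything else uses SANE.
     */
    static ScanApi forOsName(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        // startsWith avoids false matches such as "darwin" containing "win".
        if (os.startsWith("windows") || os.startsWith("mac")) {
            return ScanApi.TWAIN;
        }
        return ScanApi.SANE;
    }

    /** Uses the JVM's standard "os.name" system property. */
    static ScanApi forCurrentPlatform() {
        return forOsName(System.getProperty("os.name", ""));
    }
}
```

The rest of the program would call `BackendSelector.forCurrentPlatform()` once at startup and load the matching backend, so the platform check stays in one place.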

I intend for my program and any libraries I directly link against to qualify as Free Software (per the FSF), but the only requirement I have of the programming environment is that it is available equivalently on Windows XP or later, Mac OS X, and Linux 2.6.

I've been googling quite a lot to find such a library, but I can't even find a proprietary one, let alone a free software one. If anyone has found such a gem, please provide me a link -- otherwise, any advice on getting started with my application would be appreciated. If necessary, I will develop the above-mentioned library myself, and release it as a separate project under the GNU LGPL. For the application's sake, I'd prefer to write it in Qt4/C++, .NET with GTK#, or Java/Swing, for maximum cross-platform compatibility.

asked by allquixotic, 21 January 2011 at 17:56