Silverlight 8-Ball

Are you tired of battling piles of papers at home? From work, to your kid's school, to old bills and receipts, it can be too much to keep up with! In this article, learn how to scan, crop, and set metadata for your documents.
Difficulty: Intermediate
Time Required: 1-3 hours
Cost: $50 and up (depending on hardware choice)
Software: Visual Basic or Visual C# Express Editions, DSOFile: Developer Support OLE File Property Reader 2.1 Sample (KB 224351)
Hardware: Any WIA-compliant document scanner
Download:
In my last article, I worked with GPS. I decided to try another device, so this time I'm working with an image scanner. For a long time, I've been wanting to try to go (more) paperless around the house. There are too many piles of papers, and no way to really find anything later when I need it. Scanning is the way to go, but it has its problems: feeding documents in is time-consuming, home scanners are rarely full-duplex (two-sided), and the tools aren't so great.
The simple things that I wanted were: easy scanning, metadata, and auto-cropping of the images. More than that, I wanted standard image formats with standard metadata. Too many document scanning solutions use proprietary ways to get around limitations, such as using a database instead of files, or using sidecar files for metadata. This sample will create plain ol' image files with metadata. Use any application (such as Windows Desktop Search or Picasa) to manage and search for documents.
I've included source code for Visual Basic and C#. Both versions are functionally identical. You'll need to download the appropriate version of Visual Studio 2005 Express Edition to open the source code, and you will need to download the DSOFile MSDN sample referenced in the article header. The dsofile.dll must be registered before the project will start up. Presumably the application will work on any operating system that DSOFile supports (2000/XP/Vista).
User interfaces are always challenging. You want to capture all of your functionality, yet make everything accessible without being overwhelming or confusing. One design goal of mine is to always create windows that will resize well. This interface consists of two splitters. A vertical splitter separates the commands and options on the left from the properties and image on the right. A horizontal splitter then separates the properties from the image region.
Image 1: The user interface
The Scan New Document button initiates the scan, using standard Windows scanning dialogs. Once the image is transferred to the application, it appears in the Source tab. You can crop away any borders automatically by clicking Crop. It will look for the color in Crop Color, based on the specified Threshold. You can click on the image to choose the crop color, or use the value that it auto-selects from the bottom of the image. Use the Properties region to enter metadata. The From field becomes the Author field in metadata, and Type becomes Subject. The rest are direct mappings. Finally, select the image format (codec), destination folder, and compression level and click Save. Note that not all formats can hold metadata (BMP, for example, cannot).
Working with scanners with .NET isn't as smooth as it could be, but the COM-interop works well enough. It would be nice to be able to use Image.FromScanner, but it's not an option!
The first step is to create a reference to the Microsoft Windows Image Acquisition 1.01 Type Library. This creates wrappers in the WiaLib namespace. Then, you need to create a WiaClass instance. With that object, you can enumerate the scanners using the Devices property, or create an instance of a particular scanner. If you call the Create method without a scanner DeviceInfoClass object and there is more than one scanner attached, the standard "Select Device" dialog is shown.
Image 2: The Select Device dialog
Once this returns, you have an ItemClass object. You might think that this would allow you to easily determine the selected scanner, but in fact, none of the properties are definitive. The best option if you wanted to retain the choice of scanner would be to present your own "Select Device" dialog from the DeviceInfoClass collection, then remember the selection.
Now you can invoke the GetItemsFromUI method. This isn't required for unattended scanning, but if you want to present the scanner dialog (to choose Color, Greyscale, etc.), this is a good choice. From there, enumerate the scans (plural if multiple pages were scanned) and call Transfer on each one to save it to a file. I originally used Image.FromFile to load from that file, but it turns out that this keeps the file locked. The simple solution was to create a FileStream, call Image.FromStream, then close the stream. This allowed me to delete the temporary file afterwards. I should note that, for some strange reason, I had to wrap the Image.FromStream call in a new Bitmap. This should not be necessary, but if I don't, I get an unexplained OutOfMemoryException later when I convert it to black-and-white. I hate kludges! (The likely cause is that GDI+ expects the stream to stay open for the lifetime of an image created by Image.FromStream; copying into a new Bitmap removes that dependency.)
Visual Basic
Public Function ScanDocument() As List(Of Bitmap)
    Dim docs As New List(Of Bitmap)()
    Dim currFilename As String

    ' Create a scanner instance (the user can select if more than one)
    Dim scanner As ItemClass = DirectCast(wiaManager.Create(missing), ItemClass)

    ' Show the standard scanning dialog (this is not a required step...)
    Dim scans As CollectionClass = TryCast( _
        scanner.GetItemsFromUI(WiaFlag.SingleImage, WiaIntent.ImageTypeText), _
        CollectionClass)

    ' If the user clicks Cancel, collection is Nothing
    If scans IsNot Nothing AndAlso scans.Count > 0 Then
        ' Transfer any scanned pictures to disk
        Dim scan As ItemClass
        For Each wiaObj As Object In scans
            scan = DirectCast( _
                Marshal.CreateWrapperOfType(wiaObj, GetType(ItemClass)), _
                ItemClass)

            ' create temporary file for image
            currFilename = Path.GetTempFileName()

            ' transfer picture to our temporary file
            scan.Transfer(currFilename, False)

            ' Create a Bitmap from the loaded file (Image/Bitmap.FromFile locks the file...)
            Using fs As New FileStream( _
                currFilename, FileMode.Open, FileAccess.Read)
                docs.Add(New Bitmap(Bitmap.FromStream(fs)))
                fs.Close()
            End Using

            ' Don't leave junk behind!
            File.Delete(currFilename)
        Next
    End If

    Return docs
End Function
Visual C#
public List<Bitmap> ScanDocument()
{
    List<Bitmap> docs = new List<Bitmap>();
    string currFilename;

    // Create a scanner instance (the user can select if more than one)
    ItemClass scanner = (ItemClass)wiaManager.Create(ref missing);

    // Show the standard scanning dialog (this is not a required step...)
    CollectionClass scans = scanner.GetItemsFromUI(
        WiaFlag.SingleImage, WiaIntent.ImageTypeText) as CollectionClass;

    // If the user clicks Cancel, collection is NULL
    if (scans != null && scans.Count > 0)
    {
        // Transfer any scanned pictures to disk
        ItemClass scan;
        foreach (object wiaObj in scans)
        {
            scan = (ItemClass)Marshal.CreateWrapperOfType(wiaObj, typeof(ItemClass));

            // create temporary file for image
            currFilename = Path.GetTempFileName();

            // transfer picture to our temporary file
            scan.Transfer(currFilename, false);

            // Create a Bitmap from the loaded file (Image/Bitmap.FromFile locks the file...)
            using (FileStream fs = new FileStream(currFilename, FileMode.Open, FileAccess.Read))
            {
                docs.Add(new Bitmap(Bitmap.FromStream(fs)));
                fs.Close();
            }

            // Don't leave junk behind!
            File.Delete(currFilename);
        }
    }

    return docs;
}
Once the new image is scanned, it's displayed in the Source tab in the user interface.
Image 3: Displaying a scanned image
The standard scanning dialog lets you crop an image using the handles, but only in preview mode (you need to scan it a second time after selecting the crop region). I own two scanners. One is a standard flatbed, but the other is a compact travel scanner that pulls the sheet through (the Ambir TravelScan 600). The TravelScan doesn't work so well in that mode, since a second scan will always be slightly different due to the feed mechanism. I'd love to know why I can't just crop my final image in the scanner dialog, but, oh well!
My solution was to add cropping to the application after it's scanned. It takes longer to do a low-quality preview along with a full-quality scan on a full sheet of paper than just doing one full-quality scan. Since my primary design goal was to be able to scan and catalog sheets of paper (not business cards, photos, etc), this made sense.
Once the image is scanned, the application automatically grabs a pixel from the bottom row, assuming that this is an unimportant edge. You can choose a different edge color by clicking in the picture. The LockBits method of the Bitmap object then gives access to the raw bits, which are cycled through row by row, from top to bottom. Each pixel is compared to the crop color based on the supplied threshold. Pixels that differ from the crop color by more than the threshold are compared to the known top/left/right/bottom border locations, and those borders are moved as necessary. If a crop doesn't come out right, you can just change the threshold value and try again.
Think of it as starting on the four edges with straight-edges. Drag those straight-edges toward the middle until you encounter a color that's more than the defined threshold different from the specified "border color." Note that this won't work so well for you if your scanner adds a little band around the edges of the images. One of my scanners is a little bit extra bright around the edges. If the border is overall black, that throws it off. I found that just dragging the crop region slightly in the scanning dialog helped. If it's not a full sheet document, you won't lose anything. It's a fairly simple and brute force way of cropping. A real imaging application would have some super-slick way to do it in a millisecond or two with more accuracy I'm sure!
Visual Basic
For y As Integer = 0 To img.Image.Height - 1 'loop through pixels on Y axis until end of image height
    For x As Integer = 0 To img.Image.Width - 1 'loop through pixels on X axis until end of image width
        ' Scans won't have perfect background... (only need one channel if b&w)
        If Math.Abs(img.RawBits(bufferLoc + 0) - cropColor.B) > threshold Then
            'Determine if pixel is further left than the value we already have
            If leftEdge = -1 OrElse x < leftEdge Then
                leftEdge = x
            End If

            'Determine if pixel is further to the top than the value we already have
            If topEdge = -1 Then
                topEdge = y
            End If

            'Determine if pixel is further right than the value we already have
            If (rightEdge = -1) OrElse x > rightEdge Then
                rightEdge = x
            End If

            'Determine if pixel is further to the bottom than the value we already have
            If (bottomEdge = -1) OrElse y > bottomEdge Then
                bottomEdge = y
            End If
        End If

        ' LEARNED: Could be 32-bit, 24-bit, 16-bit, etc.
        bufferLoc += img.PixelWidth
    Next

    ' LEARNED: Extra byte(s) per line for 4-byte boundary...
    bufferLoc += img.RowPadding
Next
Visual C#
//loop through pixels on Y axis until end of image height
for (int y = 0; y < img.Image.Height; y++)
{
    //loop through pixels on X axis until end of image width
    for (int x = 0; x < img.Image.Width; x++)
    {
        // Scans won't have perfect background... (only need one channel if b&w)
        if (Math.Abs(img.RawBits[bufferLoc + 0] - cropColor.B) > threshold)
        {
            //Determine if pixel is further left than the value we already have
            if (leftEdge == -1 || x < leftEdge)
                leftEdge = x;

            //Determine if pixel is further to the top than the value we already have
            if (topEdge == -1)
                topEdge = y;

            //Determine if pixel is further right than the value we already have
            if ((rightEdge == -1) || x > rightEdge)
                rightEdge = x;

            //Determine if pixel is further to the bottom than the value we already have
            if ((bottomEdge == -1) || y > bottomEdge)
            {
                bottomEdge = y;
            }
        }

        // LEARNED: Could be 32-bit, 24-bit, 16-bit, etc.
        bufferLoc += img.PixelWidth;
    }

    // LEARNED: Extra byte(s) per line for 4-byte boundary...
    bufferLoc += img.RowPadding;
}
Image 4: The same image, auto-cropped (threshold=82)
I learned a few things in all of this. Bitmaps can be stored in many ways. In GDI+, pixels can be anywhere from 1 bit up to 64 bits wide, with or without an additional alpha component (for transparency). This makes it impossible to simply copy the bitmap to a byte array and treat each byte as a pixel. To make it worse, in some pixel formats, the RGB order may not be the same, and some index the colors rather than using RGB at all. Phew!
Even once you know the pixel width, you can't assume that [pixel width * width] is equal to the size of a row. Each row is padded so that it ends on a 4-byte boundary. In other words, the number of bytes in a row will always be divisible by four. This per-row byte count is called the "stride."
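The padding rule can be made concrete with a little arithmetic. The following is a minimal sketch, not the article's code: the class and method names are hypothetical, and it assumes the common GDI+ layout in which each row is rounded up to a multiple of four bytes.

```csharp
using System;

// Hypothetical helper for illustration; not part of the article's download.
static class StrideSketch
{
    // Bytes per row when rows are padded to a 4-byte boundary.
    public static int ComputeStride(int widthInPixels, int bitsPerPixel)
    {
        // Round the row's bit count up to a multiple of 32, then convert to bytes.
        return ((widthInPixels * bitsPerPixel + 31) / 32) * 4;
    }

    // Unused bytes at the end of each row.
    public static int ComputeRowPadding(int widthInPixels, int bitsPerPixel)
    {
        int usedBytes = (widthInPixels * bitsPerPixel + 7) / 8;
        return ComputeStride(widthInPixels, bitsPerPixel) - usedBytes;
    }

    static void Main()
    {
        Console.WriteLine(ComputeStride(101, 24));     // 304
        Console.WriteLine(ComputeRowPadding(101, 24)); // 1
        Console.WriteLine(ComputeRowPadding(101, 32)); // 0
    }
}
```

A 101-pixel row at 24 bits per pixel uses 303 bytes, so one padding byte brings it to 304. At 32 bits per pixel every row is already a multiple of four bytes, which is one reason a fixed 32-bit format keeps the loop simple.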
You can obtain all of the numbers when you grab the bitmap's bits, but it still complicates things when you cycle through the pixels. My solution (*ahem* cheat) was to convert the image to a fixed 32-bit RGB structure (no Alpha component). That way I always knew the pixel width and row padding.
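As a rough illustration of that approach, the sketch below converts a bitmap to 32-bit RGB and copies its raw bytes into an array. The class and method names are hypothetical, and using Bitmap.Clone for the conversion is an assumption; the downloadable sample may convert the image differently.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration; not the article's actual class.
static class RawBitsSketch
{
    // Returns the pixel bytes of the image in 32-bit RGB form.
    // Each pixel occupies four bytes in memory order B, G, R, unused.
    public static byte[] CopyBits(Bitmap source, out int stride)
    {
        Rectangle area = new Rectangle(0, 0, source.Width, source.Height);

        // Clone into a fixed pixel format so pixel width is always 4 bytes.
        using (Bitmap fixedFormat = source.Clone(area, PixelFormat.Format32bppRgb))
        {
            BitmapData data = fixedFormat.LockBits(
                area, ImageLockMode.ReadOnly, PixelFormat.Format32bppRgb);
            try
            {
                // Assumes a top-down bitmap (positive stride), as Clone produces.
                stride = data.Stride;
                byte[] bits = new byte[data.Stride * data.Height];
                Marshal.Copy(data.Scan0, bits, 0, bits.Length);
                return bits;
            }
            finally
            {
                // Always unlock, even if the copy throws.
                fixedFormat.UnlockBits(data);
            }
        }
    }
}
```

With four bytes per pixel, the stride equals the width times four and the row padding is zero, so a buffer offset can simply advance by four for each pixel.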
I did two more things to simplify my routine. Before grabbing the bits, I converted the image to black-and-white and shrunk to 1/4th its size. This gave me significantly fewer bytes to compare, and being black-and-white, I didn't need to compare all three channels for detecting the border. In order to perform the black-and-white conversion, I used a ColorMatrix object to transform the color values. I used code from this blog entry to figure out the values to use.
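For readers who want to see the shape of such a conversion, here is a minimal sketch. The matrix values are the widely used luminance weights (0.299, 0.587, 0.114), which are an assumption: the article does not state which values it took from the referenced blog entry. The class name, method name, and the divisor parameter are also hypothetical.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;

// Hypothetical helper for illustration; not the article's actual code.
static class GreyscaleSketch
{
    // Draws a shrunken grey copy of the source image in one pass.
    public static Bitmap ToShrunkenGreyscale(Image source, int divisor)
    {
        int width = Math.Max(1, source.Width / divisor);
        int height = Math.Max(1, source.Height / divisor);
        Bitmap result = new Bitmap(width, height, PixelFormat.Format32bppRgb);

        // Each of the first three rows spreads one input channel (R, G, B)
        // equally across all three output channels, weighted by luminance.
        ColorMatrix matrix = new ColorMatrix(new float[][]
        {
            new float[] { 0.299f, 0.299f, 0.299f, 0, 0 },
            new float[] { 0.587f, 0.587f, 0.587f, 0, 0 },
            new float[] { 0.114f, 0.114f, 0.114f, 0, 0 },
            new float[] { 0, 0, 0, 1, 0 },
            new float[] { 0, 0, 0, 0, 1 }
        });

        using (Graphics g = Graphics.FromImage(result))
        using (ImageAttributes attributes = new ImageAttributes())
        {
            attributes.SetColorMatrix(matrix);
            g.DrawImage(source, new Rectangle(0, 0, width, height),
                0, 0, source.Width, source.Height, GraphicsUnit.Pixel, attributes);
        }

        return result;
    }
}
```

Because the red, green, and blue outputs are identical after this transform, the crop loop only needs to compare a single channel against the crop color.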
Now you might be wondering why I didn't go with a simple 8-bit greyscale image format. The bottom line is I'm probably not smart enough! The 8-bit formats were indexed (you'd expect a simple intensity value for each byte...), and if you try to grab the Graphics object for drawing a cropped image, you get a GDI+ exception every time. Maybe with more effort I could have figured it out (and maybe I still will!), but it works well enough for now. My speed tests from cropping the original versus the shrunken black-and-white are pretty conclusive so I stuck with it.
With the image scanned and (optionally) cropped, the final step is to key in the metadata and save the file. I've been searching for a way to work with metadata from .NET for too long. The Image class has methods (Get/SetPropertyItem) for dealing with metadata properties, but they are mostly suited for reading the fields, and if you use them on an existing file you end up re-encoding the bitmap (recompressing...).
The best solution turned out to be the Microsoft DSO OLE Document Properties Reader 2.1 (often referred to as simply DSOFile). This wraps the IPropertyStorage COM interface, which is typically used for reading properties in OLE-based Office documents, and also adds the ability to read from the new XML-based Office formats and many other files. Using this sample object, you can access the Title, Subject, Author, Category, Keywords, and Comments fields that you see in the Summary tab of all files in Windows. This does vary a bit depending on file type, but it's very easy to use.
Once a scan has taken place, the user can choose to crop it. When Save is clicked, the method saves the file (choosing the cropped version, if present). This happens by setting the compression level and codec, and calling Save. The drop-down box for choosing the codec is actually populated from the System.Drawing.Imaging.ImageCodecInfo collection.
Once the file is saved, the metadata can be set. It's possible to use the built-in Image methods to do this prior to saving, but the DSOFile code is just so convenient for working with metadata. Data need not be converted to byte arrays, and fields are set using standard properties, not numeric IDs.
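The save step might look roughly like the following sketch. The class and method names are hypothetical, and mapping the "compression level" to the Encoder.Quality parameter is an assumption; that parameter is mainly meaningful for JPEG, and other codecs may ignore it or expect different parameters.

```csharp
using System.Drawing;
using System.Drawing.Imaging;

// Hypothetical helper for illustration; not the article's actual code.
static class SaveSketch
{
    // Looks up an installed encoder by MIME type, e.g. "image/jpeg".
    // Returns null if no matching encoder is installed.
    public static ImageCodecInfo FindEncoder(string mimeType)
    {
        foreach (ImageCodecInfo codec in ImageCodecInfo.GetImageEncoders())
        {
            if (codec.MimeType == mimeType)
                return codec;
        }
        return null;
    }

    // Saves the image with the given codec and a quality level from 0 to 100.
    public static void SaveWithQuality(Image image, string path,
        ImageCodecInfo codec, long quality)
    {
        using (EncoderParameters parameters = new EncoderParameters(1))
        {
            parameters.Param[0] = new EncoderParameter(Encoder.Quality, quality);
            image.Save(path, codec, parameters);
        }
    }
}
```

The same GetImageEncoders array could be bound to a drop-down box so that the user picks the codec, after which a call such as SaveWithQuality(image, path, selectedCodec, 75L) writes the file.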
Visual Basic
' Grab file info if it exists
Dim fi As New FileInfo(filename)
If Not fi.Exists Then
    Return False
End If

' Update the file creation date based on the supplied date
If creationDate.HasValue Then
    fi.CreationTime = creationDate.Value
End If

' Create and open the properties object
oleDocument = New OleDocumentPropertiesClass()
oleDocument.Open(fi.FullName, False, dsoFileOpenOptions.dsoOptionDefault)

If Not oleDocument.IsReadOnly Then
    ' Keywords should be semicolon-separated, not carriage returns or commas
    Dim keywordsSeparated As String = keywords.Replace(Chr(10), ";"c).Replace(","c, ";"c)

    oleDocument.SummaryProperties.Author = docSource
    oleDocument.SummaryProperties.Comments = comments
    oleDocument.SummaryProperties.Keywords = keywordsSeparated
    oleDocument.SummaryProperties.Subject = docType
    oleDocument.SummaryProperties.Title = title

    If oleDocument.IsDirty Then
        oleDocument.Save()
    End If

    Return True
End If
Visual C#
// Grab file info if it exists
FileInfo fi = new FileInfo(filename);
if (!fi.Exists)
    return false;

// Update the file creation date based on the supplied date
if (creationDate.HasValue)
    fi.CreationTime = creationDate.Value;

// Create and open the properties object
oleDocument = new OleDocumentPropertiesClass();
oleDocument.Open(fi.FullName, false, dsoFileOpenOptions.dsoOptionDefault);

if (!oleDocument.IsReadOnly)
{
    // Keywords should be semicolon-separated, not carriage returns or commas
    string keywordsSeparated = keywords.Replace('\n', ';').Replace(',', ';');

    oleDocument.SummaryProperties.Author = docSource;
    oleDocument.SummaryProperties.Comments = comments;
    oleDocument.SummaryProperties.Keywords = keywordsSeparated;
    oleDocument.SummaryProperties.Subject = docType;
    oleDocument.SummaryProperties.Title = title;

    if (oleDocument.IsDirty)
        oleDocument.Save();

    return true;
}
Both reading and writing properties are easy, though keep in mind that only JPEG and TIFF support properties well (no warnings or errors will occur with other containers/formats).
Image 5: Showing the saved metadata
The Received date picker doesn't actually update metadata. It works using the FileInfo object, updating the CreationTime property which directly updates the file. This is only done if the date field's checkbox is checked.
There are so many things to add to this start! Faster cropping would be a start... I also had wanted to add deskewing (straightening), but I quickly realized that I wouldn't be able to do that feature justice. With enough time maybe!
The UI itself could use some better design, and multi-part documents would be good (for TIFF or GIF formats). I had also planned on integrating Windows contacts with the From ComboBox, but I didn't get to it. Image rotation can be useful and isn't terribly difficult.
Once you get into the document imaging world, there are so many possibilities and expectations. I'm afraid that this just scratches the surface, but I'm pleased with it as a starting point. If anyone's interested in taking it further, let me know!
In this article, we've learned more about scanning, image manipulation, and working with file metadata. The application is usable, but definitely not commercial-grade! For any comments, questions, or suggestions, contact me through my blog. Happy scanning!
Arian Kulp is an independent software developer and writer working in the Midwest. He has been coding since the fifth grade on various platforms, and also enjoys photography, nature, and spending time with his family. Arian can be reached through his web site at http://www.ariankulp.com.
Make this a Windows Home Server Add-In that is always available and automatically puts the scanned docs in the right place in a shared folder...
Nice idea. As a future feature maybe someone can add support to then send these documents over to a Sharepoint Document Library (where metadata information can easily be searched on and the innards of scanned documents can be OCRed into Word documents or PDFs).
@Eric: Did you try to get it to work under Win2k?
this is a cool program. Is that anyway I could run this program on Windows 2000?
Hi,
This is an example of just what I need, however, what must I do to get it to run on VISTA, I get an error that it can't find wiaLib?
@Robert: Email me, use the contact page at the top of the blog.
OK, I have spent 2 days trying to get either of these examples to work in Vista, with the new 2008 Express versions of C# and VB. Is there anyone out there who can tell me what has to happen to make it work?
Thank You
The summary for the ScanDocument function says "Currently this is limited to single-image scans, but it is easy to modify." For the life of me, I can't figure out how to get this to batch scan multiple documents using a document feeder. Any ideas?
Hi - good work!
I came across your work when (web) searching for hints on using Windows Desktop Search to read the OLE Storage property pages (which are the "Properties" attached to Word and other documents).
[ I haven't found a solution - which is why I'm posing the question to you ]
I have used the DSOFile before (made sure of getting the latest version) with both Win32 and .NET, and for non-OLE Document files.
I'm just surprised that Windows Desktop Search 4 (and probably 3.x) don't index the property sheets that some users fill in quite diligently. I suspect it is because Vista uses a different system for its file metadata (I'm not sure of this, though).
Hi, first of all nice job.
But, i am programming something very similar. I use WIA for scanning pictures from scanner, and
WIA.Item lastItem = scannerDevice.ExecuteCommand( CommandID.wiaCommandTakePicture );
throws a "not implemented" exception. Can somebody help me? I tried
scannerDevice.Items[this.m_ScannerIndex + 1].Transfer( imageFormat )
and the first picture is no problem, but a second call to this code returns the same picture as before. What did I do wrong?
I have found a way to make this work for automatic document feeders. Here is the code:
Public Function ScanDocument()
Dim hasMorePages As Boolean
Dim docs As New List(Of Bitmap)()
'Dim docs As Bitmap
Dim currFilename As String
' Create a scanner instance (the user can select if more than one)
Dim scanner As ItemClass = DirectCast(wiaManager.Create(missing), ItemClass)
' Show the standard scanning dialog (this is not a required step...)
'Dim scans As CollectionClass = TryCast( _
'scanner.GetItemsFromUI(WiaFlag.SingleImage, WiaIntent.ImageTypeText), _
'CollectionClass)
Dim scans As CollectionClass = TryCast(scanner.GetItemsFromUI(WiaFlag.SingleImage, WiaIntent.ImageTypeColor), CollectionClass)
' If the user clicks Cancel, collection is NULL
If scans IsNot Nothing AndAlso scans.Count > 0 Then
' Transfer any scanned pictures to disk
Dim scan As ItemClass
For Each wiaObj As Object In scans
scan = DirectCast( _
Marshal.CreateWrapperOfType(wiaObj, GetType(ItemClass)), _
ItemClass)
' create temporary file for image
currFilename = Path.GetTempFileName()
' transfer picture to our temporary file
hasMorePages = True
Do While hasMorePages
scan.Transfer(currFilename, False) 'must be executed for each page!
'check whether the feeder is ready for another page!
hasMorePages = Convert.ToUInt32(scanner.GetPropById(WiaItemPropertyId.ScannerDeviceDocumentHandlingStatus))
'MessageBox.Show(hasMorePages)
' Create a Bitmap from the loaded file (Image.FromFile locks the file...)
Using fs As New FileStream( _
currFilename, FileMode.Open, FileAccess.Read)
' KLUDGE: Must wrap the FromStream Image with a new Bitmap.
' Otherwise get OutOfMemoryException later when using ColorMatrix on it.
Dim myimage As New Bitmap(Image.FromStream(fs))
Dim MyThumbNail As Image
MyThumbNail = myimage.GetThumbnailImage(632, 825, AddressOf ThumbNailAbort, Nothing)
docs.Add(MyThumbNail)
fs.Close()
End Using
' Don't leave junk behind!
File.Delete(currFilename)
Loop
Next
End If
Return docs
End Function
The point of all this is to call the transfer method from the scanner for each page, verifying each time if a page is in the feeder.
Emmanuel, Arian
I like the idea of what you're doing folks. I had been thinking along the lines of creating a scanning app but I suppose 'stuff' got in the way
Does your code create a single output for the batch of scanned images or does it create mulltiples, Emmanuel?
Please... do not reply in French or I'll have to spend a couple of months translating it!
what about double sided scan? any clue?
@JaK depends how the scanner works.
Coding4Fun attempts to show how to do some of the hard stuff but we can't cover every edge case such as this.
I'm trying to do it with WIA 2.0 (WiaLib doesn't appear), so when I try to call the ShowSelectDevice method, an InvalidCastException error appears.
will leave the code here.
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
Device dDevice = SelectDevice();
}
protected Device SelectDevice()
{
Device devScanner = null;
CommonDialogClass cDialogScanner = new CommonDialogClass();
devScanner = cDialogScanner.ShowSelectDevice(WiaDeviceType.ScannerDeviceType, false, false);
return devScanner;
}
and an InvalidCastException appears...
I'm working with VS2008 in C# on Vista
my scanner HP ScanJet 3400C
digitalports gmail com
@Alfredo ShowSelectDevice is returning a different typed object then. It isn't returning a "Device" object
I am trying to save multiple scans (pages) in one TIFF file. I thought changing
CollectionClass scans = scanner.GetItemsFromUI(WiaFlag.SingleImage, WiaIntent.ImageTypeText) as CollectionClass;
into
CollectionClass scans = scanner.GetItemsFromUI(WiaFlag.UseCommonUI, WiaIntent.MinimizeSize) as CollectionClass;
will do it, but it didn't.
I have one HP Scanjet G3010, on WinXP SP3, MS Visual Studio 2008. Using HP Solution Center (proprietary software) I am able to scan multiple pages in one TIFF file. I would really like to do it from your application.
Your help/reply will be much appreciated. Thank you.
After some research, I will go with this method https://msdn.microsoft.com/en-us/library/ms630819(VS.85).aspx#FilterSharedSample015
Anyway, if anybody has a better ideea, please let me know.
Regards.
Sorry, I don't know how to edit the previous comment. I hope is obvious that I asked for a C# better ideea for my project. Thank you.
@Cristian without relearning how this article works again, if your MSDN solution works, go with it.
Hello, I want to make the capture whitout the line:
CollectionClass scans = scanner.GetItemsFromUI(WiaFlag.SingleImage, WiaIntent.ImageTypeColor) as CollectionClass;
I want to supply the parameters myself so that it will be automatic,
but I don't know how.
Any idea??
@cerinzano Sorry, this is a bit older and I never went back to update it. A more recent example of WIA scanning in .NET that makes it easier to scan without the UI can be found here: 10rem.net/.../scanning-images-in-wpf-via-wia