
OCR search indexing in SkyDrive and Windows 8.1


The Discussion

  • dagkon

    Hi

    How about PDF files? Can we search in them when they are stored in SkyDrive?

    /dag

  • pneborg

    I was able to successfully upload a picture containing text through the Windows 8.1 camera roll. I copied the image to the camera roll folder after enabling camera roll sync in Windows 8.1. The OCR worked and generated extracted text. However, I cannot get any search results using words from the extracted text. I tried the search within SkyDrive from a browser as well as Windows 8.1 Search Files (as shown in the Channel9 demo from September on SkyDrive's OCR). Can anyone shed light on why OCR extracted text is not part of the search index within SkyDrive? What do I need to do to get this working as shown in the Channel9 demo? I waited 24 hours and still get no search results for any words from the extracted text.

    Thank you

  • Tim

    Hi Patrick, did you ever get an answer to your question? I am also running into the same issue. Thanks!

  • pneborg

    Hi Tim, while preparing my PluralSight course on SkyDrive Collaboration, Communication & Cloud Storage, which has a section on SkyDrive's OCR feature, I was able to figure out what is occurring. Please check out my course to see a demonstration and more details of SkyDrive's OCR. Here is a summary that I hope will help clear things up for you:

    1. You cannot search for a photo's OCR extracted text from within SkyDrive's online search (Microsoft, please consider adding this to SkyDrive's online search). The search that is currently available is exclusive to Windows 8.1 File Explorer search and is integrated with the Windows 8.1 indexing options.
    2. So you need to make sure your indexing options (Win+C, then find Indexing Options) are configured to index your SkyDrive folder.
    3. OCR is initiated for a photo dropped in your Windows 8.1 camera roll folder or added to your phone's camera roll. SkyDrive currently does not OCR photos uploaded to other SkyDrive folders (Microsoft, please consider adding this in a future release).
    4. Issue and workaround: There seems to be a timing bug. After you drop a photo into your camera roll (or take a photo on a Windows Phone), it is uploaded to SkyDrive and then sent off to the Microsoft OCR service asynchronously. The extracted text appears some time later. However, before the extracted text is added to the photo, the photo is synced to your local Windows 8.1 SkyDrive folder (also named camera roll) without the extracted text, so the Windows 8.1 index has no extracted text to index for that photo. Even after the OCR text is added online, it apparently is not synced to your Windows 8.1 SkyDrive camera roll folder. But here is my workaround: if you go to your photo online in SkyDrive and use the edit Extract Text function or add a caption to the photo, you trigger a fresh download sync. This time the photo has its extracted text, so it is added to the Windows 8.1 search index (it might take a minute for the sync and the index to catch up). From that point on you can search for your photo's OCR extracted text from Windows 8.1 File Explorer search. (Microsoft, please consider fixing this timing issue.)
    5. With respect to searching, there are a few quirks even after applying the workaround above. If you search your local SkyDrive folder cache, it works. If you search c:\users\..., which contains your default local SkyDrive folder cache, it works. Oddly, if you search "This PC", it does not return reliable results, which is another area for potential improvement. I get sporadic results from Win+F search as well.
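
The ordering problem in item 4 can be illustrated with a toy model. This is only a sketch of the sequence of events as described above, not SkyDrive's or Windows' actual implementation: `Photo`, `LocalIndex`, `simulate`, and the file name `receipt.jpg` are all hypothetical names made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    """Hypothetical stand-in for the cloud copy of a photo."""
    name: str
    extracted_text: str = ""  # empty until the simulated OCR step completes


@dataclass
class LocalIndex:
    """Hypothetical stand-in for the local search index over the synced folder."""
    entries: dict = field(default_factory=dict)

    def sync(self, photo: Photo) -> None:
        # Record whatever text the cloud copy carries at the moment of the sync.
        self.entries[photo.name] = photo.extracted_text

    def search(self, word: str) -> list:
        # Case-insensitive substring match over the indexed text.
        return [name for name, text in self.entries.items()
                if word.lower() in text.lower()]


def simulate():
    cloud = Photo("receipt.jpg")
    index = LocalIndex()

    index.sync(cloud)                     # photo syncs down before OCR finishes
    cloud.extracted_text = "Total 42.00"  # OCR text arrives later; no re-sync happens
    before = index.search("total")        # local index still has no text

    index.sync(cloud)                     # workaround: an online edit triggers a re-sync
    after = index.search("total")         # now the text is indexed locally
    return before, after
```

In this model, `simulate()` returns an empty result list for the first search and a list containing the photo for the second, which mirrors why editing the extracted text or adding a caption online makes the photo searchable locally.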

  • Tim

    Thank you so much Patrick. Your suggestions worked like a charm. I hope Microsoft decides to implement some of your suggestions.
