Solutions for managing knowledge as content & print
Key info for users and decision makers for Xerox DocuShare and ABBYY recognition products.
Crawlers Aren't So Creepy
ABBYY's new iFilter for SharePoint comes free with Recognition Server and can do a thorough job of converting image-only PDFs and other image formats into searchable PDFs. You have four options for how the results are applied to your files. The default is to sweep through and find unsearchable image files, OCR them, and feed the recognized text back into SharePoint's metadata without altering the original image file, so that you can conduct full-text searches and find the document(s) you seek. The downside of this approach is that when you find and retrieve the document, a PDF in this scenario, you can't search within it on your PC because it's still just an image file. You also can't copy text from it. If that meets your requirements, then that's a good option for you. The second option is to convert the file to a searchable PDF and add it back to your library's folder as a new and separate file. It can be a little confusing to tell which file is searchable and which isn't. The third option is to check the searchable PDF back in as a new version. The two versions will look the same, but one will have a text layer. You'll want to make sure your storage volume can handle the additional space requirements. The last option is to overwrite the original file. That's a good space-saving method compared with adding another version. One of these options will work for you.
Recognition Server can also perform either a Fast or a Thorough analysis of your PDFs. Fast looks only at the first page to see if there is a text layer and determines whether the text quality is accurate enough to skip the document. If the text is determined to be of poor quality (doesn't closely meet ABBYY's OCR quality), then the document is processed and the result is returned to SharePoint. The Thorough method checks the entire document for OCR accuracy. If the document was emitted by a PDF driver, the accuracy should be perfect, and Recognition Server will determine that it doesn't need to be altered and won't return any results. If the document is not searchable or not very accurate, Recognition Server will process it and return the results in your chosen manner. The Thorough method consumes page counts in your license for documents that don't need to be processed just as it does for those that do, so you may want to set up a filter for documents you can identify as coming from PDF emitters rather than scanners.
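The difference between the two modes can be sketched in a few lines of Python. This is only an illustration of the decision described above, not ABBYY's implementation: the Page class, the text_quality score, the 0.9 threshold, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical stand-in for one PDF page."""
    has_text_layer: bool
    text_quality: float  # assumed score from 0.0 (poor) to 1.0 (perfect)


def page_is_acceptable(page: Page, threshold: float) -> bool:
    # A page passes only if it has a text layer of sufficient quality.
    return page.has_text_layer and page.text_quality >= threshold


def needs_ocr(pages: List[Page], mode: str, threshold: float = 0.9) -> bool:
    """Decide whether a document should be sent for OCR."""
    if not pages:
        return False  # nothing to process
    if mode == "fast":
        # Fast: judge the whole document by its first page only.
        return not page_is_acceptable(pages[0], threshold)
    if mode == "thorough":
        # Thorough: every page must pass.
        return not all(page_is_acceptable(p, threshold) for p in pages)
    raise ValueError(f"unknown mode: {mode!r}")


# A document whose first page is clean but whose second page is image-only.
mixed = [Page(True, 1.0), Page(False, 0.0)]
print(needs_ocr(mixed, "fast"))      # False
print(needs_ocr(mixed, "thorough"))  # True
```

The example shows the trade-off: Fast skips a document that Thorough would catch, while Thorough has to examine every page to reach its answer.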
Our DocuShare OCR Crawler uses the new version method to return results. We check to see if your document is searchable within DocuShare before retrieving it for conversion. This relies on having indexing on for PDFs (the default setting) and having abstracts turned on for PDFs (also the default setting). Our crawler determines whether your document is searchable by checking for any content in the abstract. The abstract is the snippet of text that reveals the content of the content-indexed document when you find it in your search results. This preserves your page count license because the document doesn't have to be retrieved to see whether it requires OCR processing. Once the document is converted to a searchable PDF, we check it back in as a new version and tag that specific version so that it will be skipped when looking for new documents to process. If you were to print the document, sign it, and upload it again as a new document or new version, the DocuShare OCR Crawler would reprocess it on its next round.
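As a rough sketch, the candidate check described above comes down to two tests per version. The DocumentVersion class and its field names below are hypothetical and are not the DocuShare API; they only model the abstract and the tag.

```python
from dataclasses import dataclass


@dataclass
class DocumentVersion:
    """Hypothetical model of one version of a document."""
    abstract: str                      # snippet produced by content indexing
    tagged_as_converted: bool = False  # tag set after OCR check-in


def is_conversion_candidate(version: DocumentVersion) -> bool:
    # A version the crawler already converted and tagged is always skipped.
    if version.tagged_as_converted:
        return False
    # An empty abstract means indexing found no text, so OCR is needed.
    return not version.abstract.strip()


scanned = DocumentVersion(abstract="")
converted = DocumentVersion(abstract="Invoice 2015 ...", tagged_as_converted=True)
signed_rescan = DocumentVersion(abstract="")  # printed, signed, uploaded again

print(is_conversion_candidate(scanned))        # True
print(is_conversion_candidate(converted))      # False
print(is_conversion_candidate(signed_rescan))  # True
```

Because both tests use information already held by the content management system, no document has to be downloaded just to find out that it can be skipped.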
Crawlers run on schedules, so you'd typically tell them to look for new conversion candidates every few hours. You can schedule them more frequently if they prevent overlapping executions that would bog down your content management system, as the DocuShare OCR Crawler does.
For more information about the ABBYY Recognition Server iFilter for SharePoint
For more information about our DocuShare OCR Crawler for DocuShare
Accounts Payable Automation with FlexiCapture for Invoices
ABBYY FlexiCapture for Invoices is a ready-to-run accounts payable automation solution that delivers all essential functionality for establishing fast, cost-effective, and transparent invoice processing, from document arrival to posting. Here's a short video (5 mins 25 secs) that should serve as a good introduction.
Here's another ABBYY data capture cartoon under two minutes in length that's more entertaining than informative.
Hey! My DocuShare Windows Client stopped working
Browser support for VB scripts is going away or may already be gone for you. A couple of our customers have recently noted that they're having trouble using the check-in forms implemented in the DocuShare Windows Client. This is because their more complex set of forms relies on VB scripts and the latest versions of Internet Explorer no longer support VB scripts. What to do?
This is easy to fix. Right-click on your DocuShare Client in the Windows tool tray and select DocuShare Client Properties. On the General tab, in about the middle of the window, click on "Use simple check-in form". Click OK. Try again. You'll find a simplified set of forms to use with roughly the same functionality, but they don't depend on the browser's support for VB scripts.
Here's what the simple forms interface looks like.
Remember that Windows 7 is the last OS for which Xerox will support DocuShare Windows Client. Consider using DocuShare Drive instead, as it also supports Windows 8.X. We're told that it will soon support Windows 10; Xerox is just waiting for Microsoft to change the way it certifies applications.
Would you like help upgrading?
What does DocuShare 7 look like? You can see it at the DocuShare Users Group site. If you don't already have an account, you can set one up for yourself. If you like that self-registration feature, we can install one for you.
Would you like an overview and brief demonstration before deciding to upgrade? If so, I'll arrange a webinar for you to attend.
ABBYY Customer Success Story
RWS, one of the world's leading patent translation and search companies, processes tens of millions of documents with ABBYY Recognition Server. You can see their ABBYY success story here.
Currently discounted items
We work with a syndicated reseller of ABBYY retail products that gives us access to discounts it works out with ABBYY USA, which lets us offer several of ABBYY's retail products at steep discounts. The sales generally last one to three weeks, so if you're interested in FineReader for Windows or Mac, PDF Transformer+, Screenshot Reader, or various other bundled deals, keep your good eye on our On Sale page.
Scanner of the Month: Epson WorkForce DS-510 for only $318.00
Wouldn't you like a good scanner to file your tax statements and supporting documentation on your PC or home server? The scanners we sell are a cut above what you'll find at Office Depot. Unless you like flipping paper over and rescanning the backside and merging pages of the two files together, you'll want a scanner like this that scans both sides at once and with greater image clarity. It can scan up to 50 sheets at a time at 26 pages per minute for one-sided print or 52 pages per minute for two-sided copy. It comes with Epson® Scan, Document Capture Pro, ABBYY® FineReader® OCR, NewSoft™ Presto!® BizCard OCR (Windows on CD, Mac via web download), and EMC Captiva ISIS (Windows only, via web download). It's important that your scanner is TWAIN compliant, as just about every application works with TWAIN scanners. LED scanners don't require a warm-up period. One year limited warranty included for the U.S. and Canada.
Here's our special offer on the Epson WorkForce DS-510. Create an account and log in to see our prices, which are below recommended pricing.
If you're looking for a new scanner, be sure to shop with us for really good prices and the best side-by-side comparison page anywhere. You'll find this and our unique application enablers for DocuShare and ABBYY products on our CriteriaFirstWare site.