Solutions for managing knowledge in content & print

Key info for users and decision makers for Xerox DocuShare and ABBYY recognition products.

DocuShare 7 is available now!

Remember the Eagles' album "Hell Freezes Over"?  How long have we been waiting for DocuShare 7?  DocuShare 6 appeared in 2007 and has been followed by versions 6.5 and 6.6 with significant enhancements.  So, what took so long?  Is DocuShare 7 a significant rewrite of the base code?  No, so your customizations may continue to work with little or no change.  If you have customized your DocuShare site, download a 30-day trial and test your changes in the DocuShare 7 environment.  DocuShare 7 has a new responsive interface that adapts to PCs, tablets, and phones, which may reduce or eliminate the need for other apps to present DocuShare's content on smaller displays.  Mobile access is still available.  What else is new?  Here's a brief list drawn from the 60+ new features and enhancements ...

  • Drag & drop multi-file upload to web, multi-file download to zip files.
  • Configurable user interface (UI) elements (easier than modifying VDFs).
  • Document routing improvements.
  • Create and run workflow routing slips on the fly.
  • New Content Rule actions & management interface.
  • Lifecycle Management improvements.
  • New reporting capabilities - Type Reports and Permission Reports.
  • Allow third party apps to integrate with DocuShare through a parameterized search URL.
  • Expanded object type support.
  • Homepage improvements and options to easily change banner images.
  • Newer, faster, higher capacity version of the Autonomy IDOL search engine.

This easily qualifies as a major release.  The browser-based drag & drop interface may eliminate your need for the DocuShare Windows client software options.  I know this doesn't answer all your questions.  For more information, look at the following links, and call or write to us with your specific questions or concerns.

Brochure: http://www.criteriafirst.com/library/DocuShare_7_brochure.pdf

Other resources: http://docushare.xerox.com/resource/resource_pb.html

Download DocuShare 7: http://docushare.xerox.com/products/ds_products_trial_eng.html

 

Streamline Your Invoice Processing - replay at your convenience

Streamline Your Invoice Processing – A Complete Solution from Xerox and ABBYY
We sat in on the presentation last week and liked what we saw.  The demonstration was smooth and logical.  If you missed it, you can view it online as a recorded video and a slideshow.

Webinar Recording (includes demonstrations)

http://go.pardot.com/e/71072/play-id-ggk3gt/y1lyn/73362715

Slides on SlideShare

http://go.pardot.com/e/71072/secret-3hZMM45E0SLXET/y1lyq/73362715

If you find this relates to your interests, give us a call.  Criteria First is a Value-Added Reseller and integrator for both DocuShare and ABBYY FlexiCapture.  We're in a unique position to save you money when implementing this type of solution.


Contributors vs Consumers

Who are they and what are they doing in DocuShare?  There are two ways to tell.

1.  Have your DocuShare administrator go into the Admin Home to generate a report of users showing when each was added, what level of user they are, such as DocuShare (contributor) / Read-Only (secure consumer) / CPX (power user), how many documents they own (contributed), the overall size of their contributions, and the last time they logged in.  This is called the Repository Statistics by User report.  There's another such report for groups, called Group Statistics, just under Repository Use in the Content Management section, but it doesn't reveal activity.  For reports by group ...

2.  Use our DocuShare Audit Assistant to show what users have done in the last year by listing all their activities, including what they've contributed, which collections they've looked in, what documents they've looked at, and which they've checked out to update and return as a new version.  It can reveal much more than this, but those are the basics.

Now, what to do with this?  Save the reports and post them where others can see them.  It's like a productivity report card, and you can post it in a collection shown on the DocuShare homepage for everyone to see.  You could also use this in your performance reviews.  If your people aren't using documents to contribute knowledge to your organization, they ought to at least be reviewing many of them.

 

Innovative Idea of the Month

Would you be interested in a DocuShare classifier?  This would essentially be a program that crawls the collections you specify, looks at your documents, determines what type of document each one is, extracts specific information to apply as searchable metadata, and makes the documents easier to categorize and find using DocuShare's advanced search interface or the new Quick Search interface.  This would be useful for DocuShare sites where people bulk upload folders full of documents at the end of the day or the completion of a project, but not in a manner that makes it easy to find the key documents you need.

If you're interested, click on the suggestion box to send us a note with your thoughts on this.


 

Why image-based capture is best

... vs. OCRing everything and trying to parse out the data.  You'll end up verifying all the words and numbers you don't care about, if you bother to make any corrections at all.  You'll likely want to bypass verifying that data-mass, and thereby miss incorrectly recognized data and never see it again.  The PDFs generated won't have a consistent internal structure you can traverse to find your data.  Leveraging x/y coordinates on the page makes capture more predictable and reliable.  Before your document becomes a PDF, ABBYY stores it in a proprietary internal format that the data capture tool uses to map the contents of your pages and identify alphanumeric content, graphical content (lines, checkboxes, bubbles, etc.), images, barcodes, and anchors, so that it can present to you only the content that relates to the data you're trying to capture.  Some of this data makes sure your page is straight, correctly sized, and upright for optimal recognition.  It also separates the data you determine is important from the rest of the form or document.  If you want to capture seven fields of data from your document, this ensures they're correctly recognized or easy to focus on to correct.
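
To illustrate the x/y coordinate idea, here is a minimal Python sketch.  It is not ABBYY's internal format or API; the Word and Zone classes, the coordinates, and the sample invoice words are all hypothetical.  It only shows why fixed zones on a page let you pull out the few fields you care about and ignore everything else.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A recognized word and the position of its top-left corner (hypothetical units)."""
    text: str
    x: float
    y: float


@dataclass
class Zone:
    """A named rectangle on the page where one field is expected (hypothetical)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, word: Word) -> bool:
        return self.left <= word.x <= self.right and self.top <= word.y <= self.bottom


def capture_fields(words, zones):
    """Return {zone name: text of the words that fall inside that zone}."""
    fields = {}
    for zone in zones:
        inside = [w for w in words if zone.contains(w)]
        # Read in natural order: top to bottom, then left to right.
        inside.sort(key=lambda w: (w.y, w.x))
        fields[zone.name] = " ".join(w.text for w in inside)
    return fields


# Made-up words from one invoice page; most of them are of no interest.
words = [
    Word("ACME", 40, 30), Word("Supply", 85, 30),
    Word("Invoice", 400, 30), Word("10482", 460, 30),
    Word("Terms", 40, 700), Word("Net", 90, 700), Word("30", 115, 700),
    Word("Total", 400, 650), Word("$1,250.00", 460, 650),
]
zones = [
    Zone("invoice_number", 450, 20, 560, 45),
    Zone("total", 450, 640, 560, 665),
]

fields = capture_fields(words, zones)
print(fields)
```

Only the two values inside the zones come back for verification; the company name, labels, and payment terms never reach the operator.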

Rules can be applied to determine whether the data matches the type you're hoping to find, such as the social security number pattern nnn-nn-nnnn.  Rules can be built-in or added by the administrator to cover names, addresses, monetary values, account numbers, order numbers, date, time, etc.  Because of the internal proprietary format of the document, the application can easily isolate and capture such data.  If you are parsing PDFs for content, you may find your parser dealing with the many different ways in which other emitters and engines mark up the data.  Which OCR products others use to generate PDFs from scanned documents may be far beyond your control, and the internals of the PDFs from various emitters will differ, too.  Emitters may be from Adobe, Nuance, ABBYY, Microsoft, DocuCom, and many others.  This places a huge burden on the parser to interpret all these input sources.
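
A pattern rule like nnn-nn-nnnn can be pictured as a regular expression check.  The Python sketch below is only an illustration, not how FlexiCapture defines its rules; the RULES table, the field names, and the sample values are assumptions made up for this example.  The patterns check the shape of the text only, not whether the value is real.

```python
import re

# Hypothetical rule table: field name -> pattern the captured text must match in full.
RULES = {
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "amount": re.compile(r"\$?(\d{1,3}(,\d{3})*|\d+)\.\d{2}"),
    "date": re.compile(r"\d{1,2}/\d{1,2}/\d{4}"),
}


def failed_fields(captured):
    """Return the names of captured fields whose text does not match its rule."""
    failed = []
    for name, value in captured.items():
        rule = RULES.get(name)
        if rule is not None and not rule.fullmatch(value.strip()):
            failed.append(name)
    return failed


# Made-up capture results; the SSN ends in a letter O misread for a zero.
captured = {
    "ssn": "123-45-678O",
    "amount": "$1,250.00",
    "date": "3/14/2015",
}

print(failed_fields(captured))
```

A field that fails its rule is the one to put in front of a person for correction, which is far less work than proofreading the whole page.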

Also, once the entire document is OCR'd for parsing, you may find you're better off not storing the document in a searchable format.  Why?  Bloat.  There's no benefit to forcing redundant form verbiage or extraneous invoice blather down the throat of your document management system's index database.  It increases maintenance and storage concerns while causing distracting search results.  It's better to strip off the text layer and save the image-based, non-searchable PDF file with the extracted metadata that you'll be searching on to find the document.  Index bloating means that you'll find product codes, driver's license numbers, street addresses, postal codes, and other junk when searching for a PO#.  The objective is to search for something specific and find it, not to see how many results you can scare up.
Recommended reading: Insights into ABBYY FlexiCapture
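
To make the index-bloat point concrete, here is a toy Python sketch.  It is not how DocuShare or its search engine builds an index; the build_index function, file names, and document text are invented for illustration.  It compares indexing every OCR'd word against indexing only the extracted PO number.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


# Full OCR text: every word on the page gets indexed (made-up invoices).
full_text = {
    "inv-001.pdf": "invoice po 48213 ship to 48213 elm street total 310.00",
    "inv-002.pdf": "invoice po 77120 item code 48213 total 95.50",
    "inv-003.pdf": "invoice po 90415 postal code 48213 total 12.00",
}

# Extracted metadata only: just the PO number captured for each document.
po_metadata = {
    "inv-001.pdf": "48213",
    "inv-002.pdf": "77120",
    "inv-003.pdf": "90415",
}

full_index = build_index(full_text)
meta_index = build_index(po_metadata)

# Searching for PO 48213 in each index.
print(sorted(full_index["48213"]))
print(sorted(meta_index["48213"]))
```

The full-text index returns all three invoices, because the same digits also appear as a street number, an item code, and a postal code.  The metadata index returns only the invoice whose PO# actually matches, and it holds far fewer entries.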

 

Recognition Server on SSDs

With the increased use of solid state drives (SSDs) in servers, we should soon find OCR and ICR recognition speeds increasing significantly.  If you are considering using SSDs in servers, give consideration to using them in conjunction with products like Recognition Server and FlexiCapture.  While we haven't benchmarked a server using solely SSDs at this time, we have added processing stations to our demo server to test throughput, and one of the processing stations consisted of a gaming PC with an SSD.  What we observed is that the CPU rarely maxed out, even momentarily, as is often the case with regular drives.  We believe that the IO of the drive was fast enough to prevent the CPU from topping off.  Because the stations are load balanced, it is next to impossible to compare the overall results of the one with the SSD to the other or the main server.  It just appeared to finish sooner.  If you've never seen a demonstration of the storage speeds achievable by connecting multiple drives together in a RAID configuration, have a look at this demonstration from 2011 ...

What happens when you RAID 24 SSD Hard Disks - YouTube
https://www.youtube.com/watch?v=eULFf6F5Ri8

PC Magazine uses ABBYY FineReader in benchmarking CPUs due to how well it balances the load on multiple cores and comes close to saturating the capacity of the CPU overall.  We're waiting for a customer to ask us for a high-performance server for running ABBYY Recognition Server.  We'd like to burn that in for them and do some throughput timing.  It doesn't have to be a large server if your output goes to another server or volume.

 

Currently discounted items

We work with a syndicated reseller of ABBYY products that gives us access to discounts they work out with ABBYY USA, which allows us to offer several of ABBYY's retail products at steep discounts.  The sales generally last one to three weeks, so if you're interested in FineReader for Windows or Mac, PDF Transformer+, Screenshot Reader, or various bundled deals, keep an eye on our On Sale page.

 

If you're looking for a new scanner, be sure to shop with us for really good prices and the best side-by-side comparison page anywhere.  You'll find this and our unique application enablers for DocuShare and ABBYY products on our CriteriaFirstWare site.
