#Best Intelligent Document Processing tool in the market

random widget
#

I have seen some of the potential of ABBYY Vantage and UiPath Document Understanding, but I know many other vendors offer competing products, especially ones adding pretrained models and robust AI features. I just wanted to hear from the community: what is the best IDP tool on the market for enterprises (5000+ employees) in terms of user experience, accuracy of results, integration with other apps, pricing, complexity, and so on?

potent loom
#

Pretty sure the UiPath one is just Azure AI Document Intelligence, maybe with some custom training... there's also Google Cloud Vision, Tesseract v4, ABBYY etc., but in my experience you just have to test them yourself for accuracy

#

they perform differently based on document type/content/layout so it's not really one size fits all
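One way to do that kind of side-by-side test is to score each engine per document type rather than overall. Below is a rough sketch, not any vendor's method: the engine names, field names, and sample values are all made up for illustration, and "accuracy" here is just exact match on each expected field after trimming and case-folding.

```python
from collections import defaultdict


def field_accuracy(predicted: dict, expected: dict) -> float:
    """Fraction of expected fields whose predicted value matches
    (exact match after trimming whitespace and case-folding)."""
    if not expected:
        return 0.0
    hits = sum(
        1
        for key, value in expected.items()
        if predicted.get(key, "").strip().casefold() == value.strip().casefold()
    )
    return hits / len(expected)


def accuracy_by_doc_type(samples):
    """samples: iterable of (engine, doc_type, predicted, expected).
    Returns {(engine, doc_type): mean field accuracy}."""
    scores = defaultdict(list)
    for engine, doc_type, predicted, expected in samples:
        scores[(engine, doc_type)].append(field_accuracy(predicted, expected))
    return {key: sum(vals) / len(vals) for key, vals in scores.items()}


# Hypothetical extraction results; in practice these would come from
# running each tool on your own labelled documents.
samples = [
    ("engine_a", "invoice",
     {"total": "100.00", "date": "2023-01-05"},
     {"total": "100.00", "date": "2023-01-05"}),
    ("engine_a", "contract",
     {"start_date": "2023-02-01"},
     {"start_date": "2023-03-01", "party": "Acme"}),
    ("engine_b", "invoice",
     {"total": "100.00", "date": "05/01/2023"},
     {"total": "100.00", "date": "2023-01-05"}),
    ("engine_b", "contract",
     {"start_date": "2023-03-01", "party": "ACME"},
     {"start_date": "2023-03-01", "party": "Acme"}),
]

results = accuracy_by_doc_type(samples)
for (engine, doc_type), score in sorted(results.items()):
    print(f"{engine:10s} {doc_type:10s} {score:.2f}")
```

With this toy data, neither engine wins on both document types, which is the "not one size fits all" point. Exact match is a strict choice; a real comparison would probably normalize dates and amounts first.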

storm oasis
#

I've seen some interesting things going on recently where you can create your own instance of OpenAI's LLM on Azure and use that without fearing your data will be used to train the publicly available ChatGPT and leak somewhere. See: #automation-anywhere-help message

dusk reef
#

Best hands down is ABBYY (but it's pricey).

A new kid on the block with some really cool features, like invoice data extraction without training, is Nanonets.

zinc olive
potent loom
#

Realistically it's a GUI over OpenCV... nothing that can't be done in a few lines of Python regardless. Waste of money.

zinc olive
potent loom
#

Also interested in Re:Infer (the company UiPath acquired for NLP / communications mining) and whether UiPath is just going to acquire instead of build from now on, which is probably more sensible.

zinc olive
# potent loom Thanks for the details! You seem to know a lot - is that info from a tech talk s...

It was mentioned here: https://docs.uipath.com/document-understanding/automation-cloud/latest/user-guide/release-notes-cloud-december-2022#13-december-2022
A lot of the development is not so much about building the models; in a sense anyone can do that, especially with all the open source stuff that is available. A lot more is about usability: getting less technical users to use the technology. You can see this with DU, which is now much more user-friendly than it was 2 years ago, or with something like forms-ai. For Re:infer it was the same: they have a great interface/UI, statistics, and an easy training workflow around the NLP model, which users/partners can use themselves after some training. It would take UiPath a long time to build something like that. So that is where the value is; whether that is enough is of course for everyone to decide for themselves.
Current developments are centered a lot around LLMs: integrating with all kinds of LLMs, but also enhancing the platform, like with the just-announced Autopilot. For DU you now have an activity package where you can just formulate some questions and it will extract the values from a document such as a contract. For example: "What is the start date of the contract?" and the output will be a variable startdate. It's also not only OpenAI; it uses smaller proprietary models, specialized models, and integrations with OpenAI, AWS SageMaker, etc.
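To make the "questions in, named variables out" idea concrete, here is a minimal sketch of the shape of that workflow. This is not the UiPath activity or its API: `QUESTIONS`, `naive_answer`, and `extract` are hypothetical names, and a naive regex stands in where the real product would call a model.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical mapping: output variable name -> natural-language question.
QUESTIONS: Dict[str, str] = {
    "startdate": "What is the start date of the contract?",
}


def naive_answer(question: str, document: str) -> Optional[str]:
    """Stand-in for a model call (assumption for illustration only).
    Handles just the start-date question, using a regex for ISO dates."""
    if "start date" in question.lower():
        match = re.search(
            r"(?:start(?:s|ing)?|commenc\w+)\s+on\s+(\d{4}-\d{2}-\d{2})",
            document,
            re.IGNORECASE,
        )
        return match.group(1) if match else None
    return None


def extract(
    document: str,
    questions: Dict[str, str],
    answer: Callable[[str, str], Optional[str]] = naive_answer,
) -> Dict[str, Optional[str]]:
    """Ask every question against the document and return one value per variable."""
    return {name: answer(question, document) for name, question in questions.items()}


contract_text = "This agreement commences on 2024-01-15 and runs for 12 months."
values = extract(contract_text, QUESTIONS)
print(values["startdate"])
```

Passing the answering function in as a parameter keeps the question-to-variable mapping separate from whichever model does the answering, so a real LLM-backed call could be swapped in without touching the rest.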

potent loom
#

Wow, thanks for the detail and perspective. I'm not an expert on the ML stuff but I have dabbled in some basics. I'm actually really interested in just how much fine-tuning actually gets done on top of the transformers, because enterprise software always tries to oversell its capabilities. My personal opinion is still that vendors are rushing to satisfy industry hype/buzzwords rather than adequately solving any actual use cases. Some other companies are extremely transparent about their ML metrics (e.g. compute power, accuracy, white papers on their methods, etc.) but I'm not sure about RPA vendors lol.

zinc olive
#

The challenge is training data and labelling. That can sometimes take a lot of work. Sales and marketing people sometimes oversell it, as if it's plug and play, but for specialized models, like invoices, that is often not the case.

#

So you want a friendly user interface for that, instead of more expensive technical people having to deal with it.

random widget
random widget
potent loom
#

I have been researching this myself... there are some established datasets that all the models get benchmarked on, but the resources are very patchy.
https://paperswithcode.com/dataset/funsd
is one example of a benchmark. I'll post here if I find anything more comprehensive.

Form Understanding in Noisy Scanned Documents (FUNSD) comprises 199 real, fully annotated, scanned forms. The documents are noisy and vary widely in appearance, making form understanding (FoUn) a challenging task. The proposed dataset can be used for various tasks, including text detection, optical character recognition, spatial layout analysis,...

#

Eventually planning to make a video about these solutions and their value-add...

storm oasis