I have seen some of the potential of ABBYY Vantage and UiPath Document Understanding, yet I know there are many other vendors offering products that can compete with these, especially by adding pretrained models and robust AI features... just wanted to hear from the community what would be the best IDP tool out there in the market for enterprises (5000+ employees) as it relates to user experience, accuracy of results, integration with other apps, pricing, complexity and others.
#Best Intelligent Document Processing tool in the market
Pretty sure the UiPath one is just Azure AI Document Intelligence, maybe with some custom training...there's also Google Cloud Vision, Tesseract v4, ABBYY etc but my experience is you just have to test them yourself for accuracy
they perform differently based on document type/content/layout so it's not really one size fits all
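A minimal sketch of how that kind of side-by-side test could be scored, assuming you have hand-labelled ground truth for a few documents and each tool's extracted fields as plain dicts. The tool names, document IDs, field names and values are all made up for illustration; nothing here calls a real vendor API.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivial formatting differences don't count as errors."""
    return " ".join(str(value).lower().split())


def field_accuracy(truth, predicted):
    """Return (correct, total) over the ground-truth fields of one document."""
    correct = sum(
        1
        for field, expected in truth.items()
        if field in predicted and normalize(predicted[field]) == normalize(expected)
    )
    return correct, len(truth)


def accuracy_by_doc_type(samples, predictions):
    """Aggregate field-level accuracy per document type for one tool.

    samples: list of (doc_id, doc_type, truth_dict)
    predictions: dict mapping doc_id -> predicted_dict
    """
    totals = defaultdict(lambda: [0, 0])
    for doc_id, doc_type, truth in samples:
        correct, total = field_accuracy(truth, predictions.get(doc_id, {}))
        totals[doc_type][0] += correct
        totals[doc_type][1] += total
    return {doc_type: c / n for doc_type, (c, n) in totals.items() if n}


# Hypothetical ground truth and tool outputs, for illustration only.
samples = [
    ("inv-1", "invoice", {"total": "1,250.00", "date": "2023-10-01"}),
    ("po-1", "purchase_order", {"po_number": "PO-7781", "total": "90.00"}),
]
tool_outputs = {
    "tool_a": {
        "inv-1": {"total": "1,250.00", "date": "2023-10-01"},
        "po-1": {"po_number": "PO-7781", "total": "9O.00"},  # letter O misread for zero
    },
    "tool_b": {
        "inv-1": {"total": "1,250.00", "date": "2023-10-07"},  # wrong date
        "po-1": {"po_number": "po-7781", "total": "90.00"},
    },
}

results = {tool: accuracy_by_doc_type(samples, preds) for tool, preds in tool_outputs.items()}
for tool, scores in results.items():
    print(tool, scores)
```

Breaking the score down by document type is the point: in this toy data each tool wins on a different type, which a single overall number would hide.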
I've seen some interesting things going on recently where you can create your own instance of OpenAI's LLM on Azure and use that without fearing your data will be used to train the publicly available ChatGPT and leak somewhere. See: #automation-anywhere-help message
Best hands down is ABBYY (but it's pricey)
A new kid on the block with some really cool features, like invoice data extraction without training, is Nanonets
It's a completely different product with its own AI models and training pipelines/workflows.
Realistically it's a GUI over OpenCV...nothing that can't be done in a few lines of python regardless. Waste of money.
No, also not correct. The models are built and trained in-house by UiPath, based on the LayoutLM architecture, including techniques like a frozen backbone.
Thanks for the details! You seem to know a lot - is that info from a tech talk somewhere? It's something I might look into further. My point still kinda stands though, if it's open source PyTorch + LayoutLM, which are both free and have free pre-trained models (https://huggingface.co/models?sort=trending&search=layoutlmv3)... then how does one explain the value proposition of the product?
Also interested in Re:Infer (the company UiPath acquired for NLP / communications mining) and whether UiPath is just going to acquire instead of build from now on, which is probably more sensible.
It was mentioned here: https://docs.uipath.com/document-understanding/automation-cloud/latest/user-guide/release-notes-cloud-december-2022#13-december-2022
A lot of the development isn't so much about making the models; in a sense anyone can do that, especially with all the open source stuff that is available. It's much more about usability: getting not-so-technical users to work with the technology. You can see this for example with DU, which is now much more user friendly than it was 2 years ago, or something like forms-ai. For Re:infer it was actually the same: they have a great interface/UI, statistics and an easy training workflow around the NLP model that users/partners can use themselves after some training. It would cost UiPath a long time to build something like that. So that is where the value is; whether that is enough is of course for anyone to decide.
Current developments are centered a lot around LLMs: integrating with all kinds of LLMs, but also enhancing the platform, like with the just-announced Autopilot. For DU you now have an activity package where you can just formulate some questions and it will extract the values from a document like a contract. So for example: "What is the start date of the contract?" and the output will be a variable startdate. It's also not only OpenAI; it's using smaller proprietary models, specialized models, and integrations with OpenAI, AWS SageMaker etc.
Wow thanks for the detail and perspective. I'm not an expert on the ML stuff but I have dabbled in some basics. I'm actually really interested in just how much fine-tuning actually gets done on top of the transformers, because enterprise software vendors always try to oversell their capabilities. My personal opinion is still that vendors are rushing to satisfy industry hype/buzzwords rather than adequately solving any actual use cases. Some other companies are extremely transparent about their ML metrics (eg compute power, accuracy, white papers on their methods etc) but I'm not sure about RPA vendors lol.
The challenge is training data and labelling. That can sometimes take a lot of work. Sales and marketing people sometimes oversell it, as if it's plug and play. But for specialized models, like invoices, that is often not the case.
So you want a friendly user interface for that, instead of more expensive technical people having to deal with it.
I do not have access to the link posted 😕
Has anyone tried Document AI from Google? 🧐
The Document AI solutions suite includes pre-trained models for document processing, Workbench for custom models and Warehouse to search and store.
I have been researching this myself...there's some established datasets that all the models get benchmarked on but the resources are very patchy.
https://paperswithcode.com/dataset/funsd
is one example of a benchmark. I'll post here if I find anything more comprehensive.
Form Understanding in Noisy Scanned Documents (FUNSD) comprises 199 real, fully annotated, scanned forms. The documents are noisy and vary widely in appearance, making form understanding (FoUn) a challenging task. The proposed dataset can be used for various tasks, including text detection, optical character recognition, spatial layout analysis,...
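For anyone comparing numbers on a benchmark like this: results are usually reported as precision/recall/F1 over labelled spans. A minimal sketch of that metric, assuming strict exact-match scoring on (label, start, end) tuples. The matching rule is an assumption (papers differ on the details) and the spans below are made up, not taken from FUNSD.

```python
def entity_f1(gold, predicted):
    """Precision, recall and F1 where a predicted span counts only if
    label, start and end all match a gold span exactly."""
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical spans: (label, start_token, end_token)
gold = [("question", 0, 2), ("answer", 3, 5), ("header", 6, 7)]
predicted = [("question", 0, 2), ("answer", 3, 4), ("header", 6, 7), ("answer", 8, 9)]

precision, recall, f1 = entity_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under the strict rule, the answer span that is off by one token counts as both a miss and a false positive, which is one reason scores from different papers or vendors aren't always directly comparable.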
Eventually planning to make a video about these solutions and their value-add...
That's because you don't have the Automation Anywhere role. I suggest adding that role yourself if you want to follow my link, see screenshot.