The team from Loughborough University and Xceptor – which describes itself as a ‘no-code data automation platform’ – has created a deep learning model for natural language processing (NLP) that can analyse the content and structure of invoices, tax forms and other digital documents and sort the information into categories.
The improved system will streamline processes such as setting up bank accounts, approving mortgages, responding to customer queries and processing insurance claims by speeding up fraud checking and extracting details from identity documents.
Lead developer Dr Chao Zhang, of Loughborough’s Department of Computer Science, said the technology was faster and cheaper than current systems that perform the same task, and would benefit similar tasks in the banking, financial services and insurance sectors.
He said: “Compared with the traditional rule-based or pattern matching approaches, the developed NLP can identify terms, learn language structures, extract contextual correlation and classify texts into semantic groups and clauses, such as invoice numbers, payee addresses, counterparty names as well as distinguishing due date with invoice date.”
The AI model, which is built on state-of-the-art deep learning technology, was trained to handle complex freeform content and to extract information robustly by linking it with its context, rather than relying on pre-defined templates.
Graph modelling was introduced into the learning process to improve the model’s performance on complex documents, which may include tables and blocks of text with spatial alignment information. Such documents are more difficult to process than plain text in paragraphs.
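To give a rough sense of what graph modelling of a document layout can look like, the Python sketch below links text blocks that sit in the same row or the same column of a page. It is an illustration only and not the Loughborough and Xceptor model: the `Block` class, the `build_layout_graph` function, the coordinates and the sample invoice values are all hypothetical, and a real system would feed a graph like this into a learned model rather than stop here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Block:
    """A text block and its bounding box (hypothetical page units)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def overlap(a0, a1, b0, b1):
    """Length of the overlap between intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def build_layout_graph(blocks):
    """Link blocks that share a row or a column.

    Returns a dict mapping each block index to a set of
    (neighbour_index, relation) pairs.
    """
    graph = {i: set() for i in range(len(blocks))}
    for (i, a), (j, b) in combinations(enumerate(blocks), 2):
        if overlap(a.y0, a.y1, b.y0, b.y1) > 0:
            relation = "same_row"      # vertical extents overlap
        elif overlap(a.x0, a.x1, b.x0, b.x1) > 0:
            relation = "same_column"   # horizontal extents overlap
        else:
            continue                   # no spatial alignment, no edge
        graph[i].add((j, relation))
        graph[j].add((i, relation))
    return graph


# Made-up invoice fragment: two labels, each with a value to its right.
blocks = [
    Block("Invoice date", 10, 10, 90, 20),
    Block("01/02/2023", 100, 10, 170, 20),
    Block("Due date", 10, 30, 90, 40),
    Block("01/03/2023", 100, 30, 170, 40),
]
graph = build_layout_graph(blocks)
```

In this toy layout each label is connected to the value beside it by a `same_row` edge, which is the kind of spatial cue that could help tell a due date from an invoice date when both appear as plain dates on the page.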
Image: The new Xceptor system extracts customised key information from invoices
This research was conducted as part of an 18-month KTP (Knowledge Transfer Partnership) project, jointly funded by Xceptor and Innovate UK.
The academic lead Professor Baihua Li, from Loughborough’s School of Science, said: “Extracting required information from a large number of documents is currently a very time-consuming manual process. Developing AI solutions to learn contextual meaning and correlation presented in complexly structured documents is extremely challenging.
“We are pleased that Loughborough University’s specialists in NLP and machine learning are working with Xceptor on this game changing innovation, and can successfully integrate the AI automation function into the company’s smart document analysis platform for improved speed and accuracy.”
Dr Rob Lowe, Chief Architect at Xceptor, said: “The power of this AI-based technology is that it can adapt to work with a wide range of documents.
“For the complex and rapidly changing environments that are inherent in Banking, Financial Services and Insurance, this technology makes it simpler and faster to automate processes and keep them working efficiently over time.
“Ultimately our customers become more responsive and agile, and their experts can concentrate on higher-value tasks.”
Former Loughborough academic Professor Eran Edirisinghe, now at Keele University, added: “We hope such collaborative projects will help the industry improve their competitiveness and develop new services through the better use of knowledge, technology and skills that transfer from research.”
Researchers at Loughborough University and Xceptor have developed an AI-based solution that can automatically analyse and extract large amounts of information from computer documents.