Fake documents could spur an AI arms race

Inscribe is developing technology that can help insurers and lenders detect forged and altered documents.

[Photo: Free-Photos/Pixabay]

If you’ve ever taken out a loan, you’ve probably been asked for copies of documents that show your income and savings, like a recent pay stub, a W-2 form, or a bank statement. And if you’ve ever had to file an insurance claim after an accident or fire, you may have been asked to submit receipts or invoices verifying the expenses you’re asking to have reimbursed.


The trouble for insurers and lenders is that it’s not always easy to know whether the forms people submit are genuine. There’s always a risk that unscrupulous customers will use Photoshop or other editing tools to inflate their salaries or the cost of replacement items.

A Bay Area startup called Inscribe, which recently participated in the Y Combinator accelerator program, is using digital forensics and machine learning techniques to help companies figure out when documents have been forged or altered. The company is currently testing its technology with tech-savvy lenders, and as of September, those early clients should be able to upload documents on their own to have the system highlight areas of potential fraud concern.

“A user uploads a document, and we reveal all the information they need to make a decision,” says CEO and cofounder Ronan Burke.

Techniques include looking for places where parts of an image appear to be duplicated, as when a number has been copied from one place to another on a form, and looking for inconsistencies in coloring and fonts that could indicate tampering. More technical aspects of images, like varying levels of JPEG compression across a file or metadata indicating that certain tools were used to edit the file, can also be used.
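To make the duplicated-region idea concrete, here is a minimal sketch of one textbook approach: slide a small window over a grayscale image and report any patch that appears in more than one place. It is an illustration only, not Inscribe’s method. The function name, the block size, and the representation of the image as a list of rows of integers are all assumptions, and it matches exact copies only, so it would miss a patch that had been recompressed or resized.

```python
from collections import defaultdict


def find_duplicate_blocks(pixels, block=4, min_variation=1):
    """Return groups of (row, col) positions whose block x block patches are identical.

    `pixels` is a hypothetical grayscale image given as a list of rows of ints.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    seen = defaultdict(list)

    for y in range(height - block + 1):
        for x in range(width - block + 1):
            # Extract the patch as a hashable tuple of rows.
            patch = tuple(tuple(pixels[y + dy][x:x + block]) for dy in range(block))
            flat = [value for row in patch for value in row]
            # Skip uniform patches (blank paper), which would match everywhere.
            if max(flat) - min(flat) < min_variation:
                continue
            seen[patch].append((y, x))

    # Keep only patches that occur at more than one location.
    return [coords for coords in seen.values() if len(coords) > 1]
```

Each returned group lists the places where the same patch of pixels occurs, which is the kind of area a reviewer would want highlighted. A practical detector would compare patches approximately rather than exactly.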

An analysis of a manipulated document by Inscribe’s software. “Each line of text should have the same color. However, in this example, the years and balances have been tampered with and are detected using font discrepancies as shown by the change of color on the same line,” says Ronan Burke.

Future versions of the product will also be able to spot inconsistencies between documents, such as when the amount listed on a pay stub doesn’t square with a W-2 or offer letter, Burke says. They might also be able to sync up with various databases to verify whether figures or other data on documents are accurate.

“If somebody forges their bank statement, they also need that to correlate with their tax form and their employer offer letter,” he says.
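A cross-document check of this kind could be as simple as annualizing the pay on a stub and comparing it with the wages on a W-2. The sketch below is a hypothetical illustration, not Inscribe’s planned feature: the function names, the 10% tolerance, and the assumption that pay is the same every period are all invented for the example.

```python
def annualized_pay(gross_per_period, periods_per_year):
    """Estimate yearly gross pay from a single pay stub."""
    return gross_per_period * periods_per_year


def incomes_consistent(stub_gross, periods_per_year, w2_wages, tolerance=0.10):
    """Return True if the stub's annualized pay is within `tolerance` of the W-2 wages.

    The tolerance is an assumed value; raises, bonuses, or a partial year
    of employment would all call for something more forgiving.
    """
    if w2_wages <= 0:
        return False
    estimate = annualized_pay(stub_gross, periods_per_year)
    return abs(estimate - w2_wages) / w2_wages <= tolerance
```

A biweekly stub showing $2,500 lines up with a $65,000 W-2, while one showing $4,000 does not, so the second pair would be worth a closer look.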


Burke says his twin brother, Conor Burke, who is Inscribe’s cofounder and CTO, did university work on training machine learning systems to work with highly compressed images. This led to Inscribe’s technology, which can use discrepancies in compression to detect areas where parts of one image have been inserted into another.

The company also hopes to offer an API to customers in the future so that they can automatically upload sets of documents and receive a confidence score indicating how likely the documents are to be genuine.

While Inscribe is focusing on document forgeries, it’s just one of a number of projects looking at curbing fake images and videos, as experts fear increasingly sophisticated techniques could forge anything from political speeches to evidence of crimes or misconduct. The Defense Advanced Research Projects Agency is funding research to spot fake imagery that could be used for propaganda. This comes as so-called deep fake technology, which uses neural networks to generate realistic-looking fake videos, is rapidly advancing.

It’s possible that automatic document forgery detection could spur its own arms race, at least with more sophisticated fraudsters, experts say.

“It’s kind of a cat and mouse game, when the mouse is smart,” says Nasir Memon, a professor of computer science and engineering at the New York University Tandon School of Engineering, who has studied image manipulation. And there’s also always a risk of false positives, which can be more or less of an issue, depending on the application, says Memon.

But Burke says Inscribe experts will likely study documented forgery techniques to make sure its software can defeat them before they can be widely used. The company already creates its own sample of forged documents to train and test its software, he says.


Using multiple tests to determine if a document or another image is genuine can be a good way to root out even skilled adversaries, since it only takes one oversight to make fraud detectable, says Memon. “All the bad guy has to do,” he says, “is make one mistake, and we have the opportunity to catch the bad guy.”
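Memon’s point about layering tests can be sketched in a few lines: run every check independently and flag the document if any one of them fires. The two checks here, and the dictionary fields they read, are hypothetical stand-ins rather than any real product’s tests.

```python
def edited_with_image_tool(doc):
    """Flag metadata that names a common image editor (assumed field: 'software')."""
    software = doc.get("software", "").lower()
    return any(tool in software for tool in ("photoshop", "gimp"))


def mixed_fonts_on_a_line(doc):
    """Flag any line of text set in more than one font (assumed field: 'line_fonts')."""
    return any(len(set(fonts)) > 1 for fonts in doc.get("line_fonts", []))


# Hypothetical registry of independent tests, keyed by a short label.
CHECKS = {
    "editing_software": edited_with_image_tool,
    "font_mismatch": mixed_fonts_on_a_line,
}


def run_checks(doc, checks=CHECKS):
    """Return the labels of every check that flags the document."""
    return [name for name, check in checks.items() if check(doc)]


def looks_suspicious(doc, checks=CHECKS):
    """A single failed check is enough to send the document for review."""
    return bool(run_checks(doc, checks))
```

Because each test is independent, a forger has to get past all of them, while the reviewer needs only one to fire. The trade-off is the one Memon raises earlier: every added test is also another chance for a false positive.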

About the author

Steven Melendez is an independent journalist living in New Orleans.