As chatbots powered by generative artificial intelligence multiply, so does interest in strategies, methods, and tools to detect them. Yet there is currently no readily available solution for identifying them quickly, even though a few words and a single click are all it takes to generate strikingly realistic images in seconds with AI models such as Midjourney, DALL-E, Craiyon, or Stable Diffusion.
After a first part devoted to recognizing AI-generated texts, this second part offers an overview of the tactics, techniques, and procedures that can be used to identify visual deepfakes.
The essayist Raphaël Doan recently observed that “images don’t prove anything, and perhaps that’s a good thing,” referring to the proliferation of AI-generated images. While we cannot predict the future, it is clear that the saying “I only believe what I see” no longer carries the same weight unless we have verified, cross-checked, and contextualized what we are looking at.
Initially, the faces of fabricated individuals that AIs have been generating for fake LinkedIn profiles since 2019 were relatively easy to spot. Key giveaways included unusual accessories, headwear, and backgrounds, untidy hair, ill-fitting collars, and color inconsistencies.
Real photos, by contrast, could be recognized by the texture of the clothing and the overall plausibility of accessories such as earrings and headgear, of the background, and especially of any other people present. This is because these AIs were trained solely to reproduce faces, one at a time, with no regard for context or surrounding elements.
In short, to tell a real face from a deepfake, one focused on the periphery of the face, looking either for small details an AI could not have generated or, conversely, for small imperfections that could only have come from one. Websites such as Which Face is Real, which displays a real photo next to a deepfake for comparison, made the exercise easier.
That approach, however, belongs to the past, before generative AIs were trained to produce not just faces but images of every kind, using far more powerful algorithms and models. Newer systems such as Stable Diffusion are trained on billions of photographs, drawing on datasets like LAION (Large-scale Artificial Intelligence Open Network), whose LAION-5B release contains around 5.85 billion image–text pairs.
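Before scrutinizing pixels, it is sometimes worth inspecting the file itself: certain generation tools leave their settings inside the image metadata. The widely used Stable Diffusion web interface, for example, writes the full prompt into a PNG text chunk keyed "parameters". The sketch below, using only the Python standard library, extracts such chunks; the function name is an illustrative assumption, and a clean result proves nothing, since metadata is trivially stripped.

```python
import struct

def png_text_chunks(path):
    """Extract tEXt chunks from a PNG file.

    Some AI image generators embed their settings in the file: the
    popular Stable Diffusion web UI writes the generation prompt into
    a tEXt chunk named "parameters". Finding such a chunk is strong
    evidence the image was generated; finding nothing proves nothing,
    because metadata is easily removed.
    """
    chunks = {}
    with open(path, "rb") as f:
        # Every PNG starts with the same 8-byte signature.
        if f.read(8) != b"\x89PNG\r\n\x1a\n":
            raise ValueError("not a PNG file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            # Each chunk: 4-byte big-endian length, 4-byte type,
            # data, then a 4-byte CRC we skip.
            length, ctype = struct.unpack(">I4s", header)
            data = f.read(length)
            f.read(4)  # skip CRC
            if ctype == b"tEXt":
                key, _, value = data.partition(b"\x00")
                chunks[key.decode("latin-1")] = value.decode("latin-1")
            if ctype == b"IEND":
                break
    return chunks
```

Running it on an image saved by such a tool would surface the prompt and sampler settings directly; tools like `exiftool` perform the same check more thoroughly across formats.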
This photo generated by Midjourney is quite eerie. At first glance it looks like a “normal” photo, but on closer inspection a sense of unease sets in, as if one were watching a nightmare. The unsettling nature of the image becomes apparent once you zoom in and examine the details.