ChatGPT? Stable Diffusion? Generative AI jargon, explained
While ChatGPT and text-to-image tools are among the buzziest developments in tech right now, comprehending what they are and how they work can be an exercise in frustration.
The field of AI is a rabbit hole of technical and mathematical jargon, and simple explanations of even the most fundamental concepts are in short supply. As a result, tools like ChatGPT and Stable Diffusion can feel like mystical black boxes, and it’s easy to lose track of the differences between them and the companies involved.
To help make sense of it all, here’s a plain-English glossary of notable AI terms, products, and companies, along with links to where you can learn more.
Basic AI terms
AI: Short for artificial intelligence, this broadly refers to the idea of computers that can learn and make decisions in a human-like way.
Machine learning: A subfield of artificial intelligence, this is the practice of teaching computers to recognize patterns through data and algorithms. It differs from traditional programming in that the computer doesn’t need to be explicitly coded to address every potential scenario.
Neural network: A type of machine learning model that mimics the neurons in the human brain, using a network of nodes to process data through algorithms. This allows the computer to make connections between lots of different data points and learn which ones are the most important when responding to a query.
Deep learning: Describes a neural network whose data passes through several layers of processing—some of which are hidden from the programmer—before arriving at a response. AI tools such as ChatGPT and Stable Diffusion are examples of applications that use deep learning techniques.
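For readers who like to see the idea in code, here is a minimal sketch of data passing through several layers of nodes. Everything in it is an assumption made for illustration: the layer sizes, the random weights, and the helper names (dense_layer, random_layer, forward) are hypothetical, and a real model would learn its weights from training data rather than pick them at random.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of every input, then squashes it."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: untrained (random) weights for a layer of n_out nodes."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def forward(inputs, layers):
    """Pass the data through each layer in turn; the middle ones are the 'hidden' layers."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


rng = random.Random(0)
# 3 inputs -> 4 hidden nodes -> 4 hidden nodes -> 1 output
layers = [random_layer(3, 4, rng), random_layer(4, 4, rng), random_layer(4, 1, rng)]
output = forward([0.5, -1.2, 3.0], layers)
print(output)
```

The output here is meaningless because nothing has been trained. The point is only the shape of the computation: numbers go in, get mixed and transformed layer by layer, and a response comes out the other end.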
GPT and conversational AI
GPT: Short for “Generative Pre-Trained Transformer,” this is an AI model that uses deep learning to generate human-like text, created by OpenAI. The name itself requires some unpacking:
Language modeling: A technique for determining the order of words in a sentence, based on the probability that those words will make sense.
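To make the probability idea concrete, here is a toy sketch that counts which word follows which in a tiny made-up corpus and turns those counts into probabilities. This is a simple word-pair counting approach chosen for illustration, not how GPT works internally (GPT uses a neural network trained on far more text), and the corpus and the next_word_probs helper are invented for this example.

```python
from collections import Counter, defaultdict

# A tiny, made-up training text, split into words.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each word follows each other word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1


def next_word_probs(word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}


probs = next_word_probs("the")
best_guess = max(probs, key=probs.get)
print(probs, best_guess)
```

In this corpus, "the" is followed by "cat" twice and by "mat" and "fish" once each, so the model rates "cat" as the most likely next word.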
ChatGPT: A conversational chatbot created by OpenAI, using a language model that emphasizes back-and-forth dialog. As of now, you can try it for free.
GPT-3: The third-generation language model created by OpenAI. It forms the basis for a slew of AI writing tools that have launched over the past two years, using OpenAI’s API. (ChatGPT uses an improved version, called GPT-3.5, while GPT-4 is in development.)
OpenAI: The AI research company behind GPT-3, ChatGPT and DALL-E. It began as a non-profit group, but now operates a “capped-profit” company that employs most of its staff. Notably, Elon Musk was a cofounder, but resigned from OpenAI’s board in 2018.
DALL-E, Stable Diffusion, and AI art
Diffusion model: A method for creating images from text prompts. It works by adding random noise to a set of training images, then learning how to remove noise to construct the desired image.
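The noise-adding and noise-removing steps can be sketched in a few lines. This follows one common formulation of diffusion, in which an image is blended with random Gaussian noise; it is not the actual code of any product named here. The five-number "image", the keep parameter, and the function names are assumptions for illustration, and the sketch cheats by reusing the exact noise that was added, whereas a real model has to learn to predict that noise with a neural network.

```python
import math
import random


def add_noise(pixels, keep, rng):
    """Blend an image with random noise.

    keep=1.0 leaves the image untouched; keep close to 0 leaves almost pure noise.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(keep) * p + math.sqrt(1.0 - keep) * n
        for p, n in zip(pixels, noise)
    ]
    return noisy, noise


def remove_noise(noisy, predicted_noise, keep):
    """Undo the blend, given a guess at the noise that was added."""
    return [
        (x - math.sqrt(1.0 - keep) * n) / math.sqrt(keep)
        for x, n in zip(noisy, predicted_noise)
    ]


rng = random.Random(0)
image = [0.0, 0.25, 0.5, 0.75, 1.0]  # stand-in for real pixel values
noisy, noise = add_noise(image, keep=0.3, rng=rng)

# A trained model would predict `noise` from `noisy`; here we pass in the true noise.
restored = remove_noise(noisy, noise, keep=0.3)
print(noisy)
print(restored)
```

Because the true noise is supplied, the restored values match the original image up to rounding error. The hard part, and the thing training is for, is making a good guess at that noise when all the model can see is the noisy picture and a text prompt.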
Several companies are now using the diffusion model to offer text-to-image tools, most notably:
Dreambooth: A deep learning model, developed by Google, that can fine-tune images created through diffusion. Its most notable use case is the ability to generate new pictures of specific people based on existing photos—for better or worse. Although Google itself has not released Dreambooth for public use, an implementation of it has been released as an open source project.
Lensa: An image editing app for iOS and Android from Prisma Labs that first launched in 2018. It has gone viral in recent weeks thanks to a new “Magic Avatar” feature, whose effects are similar to those of Stable Diffusion and Dreambooth. It’s been criticized for creating overly sexualized images, particularly of women, along with accidental nudes.