AI Terms Glossary

This glossary of AI terms was drafted by ChatGPT (GPT-4), with prompts, edits, and more recent terms added by me. I asked Claude 2 to check and correct the definitions.

I’ve divided them into beginner and advanced terms, so if you are well-versed in the topic, skip down to the advanced section. Are there any terms you find helpful that are missing? Let me know!

Image prompt written by ChatGPT (GPT-4, Sept 25 version): “Create an image of a digital library. Visualize a sleek, futuristic tablet or digital screen floating against a soft gradient background. On the screen, display a grid of glowing, holographic icons representing AI concepts. Include icons such as a brain (for AI), a gear (for algorithms), a speech bubble (for NLP), a book (for datasets), and a magnifying glass (for analysis). The overall feel should be modern, with a touch of sci-fi, emphasizing the digital and innovative nature of AI.” Image created by Ideogram.ai on Sept 27, 2023.

Beginner Terms

Advanced Data Analysis: A mode integrated into ChatGPT Plus (GPT-4) that lets users upload files and produce data analysis and visualizations. This feature was previously a plugin called Code Interpreter.
Algorithm: In machine learning, a set of rules or instructions given to an AI, neural network, or other machine to help it learn on its own.
Architecture: The structure of a machine learning model, including the number and arrangement of layers and nodes.
Artificial Intelligence (AI): The simulation of human intelligence processes by machines, especially computer systems.
Bard (obsolete): An AI chatbot based on Google’s PaLM 2 LLM. Replaced by Gemini.
Bias: When a machine learning model produces results that are systematically prejudiced due to inherent flaws in the training data or the model design.
Bing Chat (obsolete): Microsoft’s free chatbot that used OpenAI’s GPT-3.5 and GPT-4 models. Replaced by Copilot.
ChatGPT: An AI chatbot developed by OpenAI that uses machine learning to write human-like text based on prompts. The models available depend on whether you use the free or paid tier (see GPT-3.5 and GPT-4o below).
Claude: An LLM-based AI assistant created by Anthropic.
Code Interpreter (obsolete): The previous name of a plug-in available to paid ChatGPT users, since renamed Advanced Data Analysis (see above).
Copilot: Microsoft’s AI service, including a web AI chatbot, an operating system chatbot, and integrated LLM capability inside Microsoft products.
Dataset Shift: When the data the model is working with changes or drifts over time, leading to a decrease in the model’s performance.
Deep Learning: A subset of Machine Learning that uses multi-layered neural networks, loosely modeled on the human brain, to process data for use in decision making.
Expert systems: Traditional AI systems that work based on rule-based knowledge and logic.
Explainability: Methods for understanding and articulating the reasons behind model behavior and predictions.
Few-Shot Prompting: Showing the model two or more examples of the task you want done directly in your prompt (a sample prompt appears after this list).
Fine-Tuning: The process of training a pre-existing model on a new, often smaller, dataset to improve its performance on specific tasks.
Foundation Models: Models like GPT-4 that are trained on a broad data corpus and can be fine-tuned for specific tasks.
Gemini: A series of foundation LLMs from Google; also the name of Google’s chatbot and its paid service (Gemini Advanced).
Generative AI: A type of AI that can create new content, ranging from text to images, music, or even video.
GPT-3.5: OpenAI’s LLM that was used in the free version of ChatGPT.
GPT-4o, GPT-4o mini: The OpenAI language models available in ChatGPT.
Hallucination: A term used in AI to describe when the model generates incorrect or imaginary content not based on evidence.
Hidden Layers: The layers in a neural network between the input and output layers that perform computations and transformations on the input data.
Inference: The process where a machine learning model makes predictions or generates outputs based on new data.
Input Layer: The first layer of a neural network that receives the initial data the network will learn from.
Knowledge Graph: A network of real-world entities (like people, places, or concepts) and their interrelations, used by AI to provide context-based answers.
Large Language Models (LLMs): Language models that have been trained on vast amounts of text data and can generate human-like text based on the input they’re given. They can answer questions, write essays, summarize texts, translate languages, and even generate poetry.
Llama: A series of open-source LLMs from Meta.
Machine Learning (ML): A subset of AI that involves using algorithms to parse data, learn from it, and then make a determination or prediction about something.
Mixture of Experts: An AI model that uses multiple specialized sub-models, each an “expert” in a specific area, and dynamically selects which ones to use for a given task, resulting in more efficient and specialized processing of complex problems.
Natural Language Generation (NLG): The use of artificial intelligence programming to produce written or spoken narrative from a dataset.
Natural Language Processing (NLP): The branch of AI concerned with enabling computers to understand, interpret, and generate human language.
Neural Network: A series of algorithms, loosely inspired by the human brain, that attempts to recognize underlying relationships in a set of data.
Nodes: The points of connection and computation in a neural network, similar to neurons in a human brain.
Output Layer: The final layer in a neural network that produces the results of the computation.
PaLM 2: An LLM from Google; PaLM stands for Pathways Language Model.
Parameters: The parts of a machine learning model that are learned from the training data, such as the weights and biases in a neural network.
Plug-ins: Programs that ChatGPT Plus users can add to ChatGPT to add functionality or access to third-party services.
Pretraining: The initial phase of training a machine learning model, usually done on a large, general dataset before being fine-tuned for a specific task.
Prompt: The initial input given to an AI model, to which it responds by generating output.
Prompt Engineering: The practice of designing prompts effectively to get better and more useful outputs from AI models.
Reasoning Engine: An artificial intelligence component that simulates the human ability to reason and make decisions.
Reinforcement Learning: A type of Machine Learning where an agent learns how to behave in an environment by performing actions and seeing the results.
Reinforcement Learning from Human Feedback (RLHF): A method used to fine-tune foundation models (like GPT-4) in which humans evaluate the model’s outputs.
Response: The output generated by an AI model in response to a prompt.
Retrieval Augmented Generation (RAG): A technique that retrieves relevant fragments of existing content and combines them with the user prompt to produce a more informed and accurate response (a simplified retrieval sketch appears after this list).
Strong AI: This kind of AI can understand, learn, adapt, and implement knowledge from one domain into another, much like a human.
Supervised Learning: A type of Machine Learning where the AI is trained using labeled data, i.e., data paired with the correct answer or outcome.
Symbolic AI: The traditional kind of AI, based on explicit symbolic representations of problems, logic, and search.
Temperature: A parameter in language models that controls the randomness of the output. Higher temperatures result in more diverse outputs, while lower values make the output more predictable (a small sampling sketch appears after this list).
Theory of Mind: The ability to understand and attribute mental states to oneself and others, an attribute currently lacking in AI models.
The Pile: A diverse, 825GB set of English-language text for training large language models (LLMs). It consists of a collection of many smaller datasets, including books, websites, and other texts, providing a broad base of knowledge for models trained on it.
Token: A single unit of input to a language model. A token can be as short as a single character or as long as a whole word.
Tokenization: The process of breaking down text into smaller pieces (tokens), such as words or parts of words, that can be processed by a language model (a short tokenization example appears after this list).
Transformers: A type of model architecture used in machine learning. They handle variable-sized input using the mechanism of attention, selectively focusing on the most relevant parts of the input data.
Unsupervised Learning: A type of Machine Learning where AI learns from unlabeled data and finds patterns and relationships therein.
Weak AI: Also known as Narrow AI, this kind of AI is designed to perform a narrow task, like voice recognition, and lacks general intelligence.
Weights: Values in a neural network that transform input data within the network’s hidden layers.
Zero-Shot Learning: The ability of a machine learning model to perform tasks or solve problems it has not been explicitly trained on.
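
Few-shot prompting is easiest to see with a concrete prompt. Below is a minimal, made-up example: two labeled reviews establish the pattern, and the model is expected to fill in the label for the third. The reviews and labels are invented for illustration.

```python
# A hypothetical few-shot prompt: two worked examples, then the real input.
# The model is expected to continue the pattern for the final item.
few_shot_prompt = """Classify the sentiment of each review as Positive or Negative.

Review: "The battery lasts all day and the screen is gorgeous."
Sentiment: Positive

Review: "It stopped working after a week and support never replied."
Sentiment: Negative

Review: "Setup was painless and it just works."
Sentiment:"""

print(few_shot_prompt)
```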
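
For temperature, here is a rough sketch of how the setting rescales a toy next-token distribution before sampling. The logits are invented numbers; real models work over vocabularies of tens of thousands of tokens, but the effect is the same: low temperatures concentrate probability on the top token, high temperatures flatten the distribution.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities, dividing by the temperature first."""
    scaled = np.array(logits) / temperature
    exp = np.exp(scaled - np.max(scaled))  # subtract max for numerical stability
    return exp / exp.sum()

# Toy logits for four candidate next tokens (illustrative values only).
logits = [2.0, 1.0, 0.5, -1.0]

for t in (0.2, 1.0, 2.0):
    print(f"temperature={t}: {np.round(softmax_with_temperature(logits, t), 3)}")
```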
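
For tokens and tokenization, this small sketch assumes the open-source tiktoken package is installed (pip install tiktoken); the exact token boundaries and IDs depend on which tokenizer a given model uses.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")    # tokenizer used by several OpenAI models
text = "Tokenization breaks text into smaller pieces."

token_ids = enc.encode(text)                  # list of integer token IDs
pieces = [enc.decode([t]) for t in token_ids] # decode each ID back to its text piece

print(token_ids)
print(pieces)
```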
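
For Retrieval Augmented Generation, here is a deliberately over-simplified sketch: a made-up three-item document list and a word-overlap score stand in for the embedding model and vector database a real RAG system would use.

```python
from collections import Counter

# A made-up, in-memory "knowledge base"; a real system would use an
# embedding model and a vector database instead of word overlap.
documents = [
    "The Pile is an 825GB English text dataset for training LLMs.",
    "Temperature controls how random a language model's output is.",
    "RLHF fine-tunes a model using human feedback on its outputs.",
]

def overlap_score(a, b):
    """Crude relevance score: number of shared lowercase words."""
    return sum((Counter(a.lower().split()) & Counter(b.lower().split())).values())

def build_rag_prompt(question, k=1):
    """Retrieve the k most relevant documents and prepend them to the prompt."""
    ranked = sorted(documents, key=lambda d: overlap_score(question, d), reverse=True)
    context = "\n".join(ranked[:k])
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"

print(build_rag_prompt("What does temperature do in a language model?"))
```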

Advanced Terms

Attention: A technique used in neural networks allowing focus on the specific parts of the input most relevant to the desired output (a minimal implementation sketch appears after this list).
Backpropagation: An algorithm used during the training of neural networks, which adjusts the weights of the neurons to improve the accuracy of predictions.
Bias-Variance Tradeoff: A fundamental problem in machine learning concerning the balance between error from overly simple assumptions (bias, which leads to underfitting) and error from excessive sensitivity to the training data (variance, which leads to overfitting).
BookCorpus: A dataset consisting of 11,038 books in 16 different genres. The dataset, used often in language model training, provides diverse long-form text data.
Classification: A type of machine learning model that predicts discrete categories (classes), used for making decisions or predictions such as spam vs. not spam.
Common Crawl: An open repository of web crawl data that can be accessed and analyzed by anyone. The dataset includes raw web page data, metadata, and text. It’s frequently used for training language models due to its size and diversity.
Convolutional Neural Network (CNN): A class of deep learning neural networks, most commonly applied to analyzing visual imagery.
Generative Adversarial Network (GAN): A class of machine learning frameworks in which two neural networks contest with each other in a game. The generative network produces candidate data (such as images) while the discriminative network evaluates whether they look real or generated.
Gradient Descent: An optimization algorithm used to minimize some function by iteratively moving in the direction of steepest descent. It is commonly used to train neural networks by minimizing their loss (a toy example appears after this list).
Hyperparameters: The parameters of the learning algorithm itself, which influence the speed and quality of the learning process. They are set before training starts.
K-Means Clustering: A type of unsupervised machine learning algorithm used to group data into different clusters based on their similarities (a short example appears after this list).
K-Nearest Neighbors (K-NN): A simple, flexible machine learning algorithm that uses a group of data points in close proximity (neighbors) to predict the value or class of a given data point.
Naive Bayes: A group of simple, fast, and efficient classification algorithms based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features.
Overfitting and Underfitting: Overfitting occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. Underfitting occurs when a model is too simple and unable to capture the structure in the data.
Pruning: Removing redundant or less important parts of a neural network to increase efficiency without losing accuracy.
Recurrent Neural Network (RNN): A type of artificial neural network designed to recognize patterns in sequences of data, such as text, genomes, handwriting, or the spoken word.
Regression: A statistical method used in machine learning and data analysis that attempts to predict a continuous outcome variable (Y) based on the value of one or multiple predictor variables (X).
Sentiment Analysis: The process of computationally determining whether a piece of writing is positive, negative, or neutral.
Support Vector Machine (SVM): A supervised machine learning model that uses classification algorithms, most commonly for two-group classification problems.
Wikipedia Dump: A dataset that consists of a downloadable version of all the text in Wikipedia. Despite being narrower in scope than web crawl datasets or The Pile, it’s widely used in natural language processing and provides a useful base of factual knowledge.
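
For attention, here is a minimal NumPy sketch of scaled dot-product attention (the flavor used in Transformers). The matrices are random toy data; real models add learned projections, masking, and multiple heads.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute attention weights from queries and keys, then mix the values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # similarity between queries and keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each query's scores
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))   # 2 query positions, dimension 4 (toy sizes)
K = rng.normal(size=(3, 4))   # 3 key/value positions
V = rng.normal(size=(3, 4))

output, weights = scaled_dot_product_attention(Q, K, V)
print(weights.round(2))  # each row sums to 1: how much each query attends to each position
```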
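
For gradient descent, this toy example minimizes a one-variable function whose gradient is known in closed form; training a neural network applies the same idea to millions of parameters, with the gradients supplied by backpropagation.

```python
# Minimize f(w) = (w - 3)^2 with plain gradient descent.
# Gradient: f'(w) = 2 * (w - 3); the minimum is at w = 3.
w = 0.0
learning_rate = 0.1

for step in range(50):
    gradient = 2 * (w - 3)
    w -= learning_rate * gradient   # move against the gradient (steepest descent)

print(round(w, 4))  # close to 3.0
```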
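
For K-Means clustering, this short sketch assumes scikit-learn is installed (pip install scikit-learn) and uses six made-up 2-D points that form two obvious groups.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two obvious blobs of toy 2-D points.
X = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
              [8.0, 8.1], [7.9, 8.3], [8.2, 7.9]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(kmeans.labels_)           # cluster assignment for each point
print(kmeans.cluster_centers_)  # the two learned cluster centers
```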