Glossary

From Our Knowledge Base

Welcome to the Augmetrics® AI Glossary, where clarity meets capability.

We’ve defined the most important AI terms in plain language—organized for quick access and real-world understanding.

Whether you’re new to AI or deep in deployment, this glossary is your go-to reference for cutting through the noise.

To make it even easier to navigate, we’ve grouped entries into key categories—Compute & AI Processing, Inference & Deployment, AI Fundamentals, Sustainability, and Comparisons & Tradeoffs—so you can find exactly what you need, faster.

Compute & AI Processing Definitions

This section covers the nuts and bolts of how AI models are trained, optimized, and deployed—from GPUs and TPUs to key algorithms like gradient descent and backpropagation.

If it powers the AI engine, you’ll find it here.

What is AI Inference?
AI Inference is the phase where a trained model generates predictions or outputs from new input data. It’s often the most compute-intensive and cost-sensitive part of deployment.

What is AI Training?
AI Training is the process where a model learns from data by adjusting internal weights to improve prediction accuracy through repeated optimization.

What is a GPU compared to a TPU?
A GPU (Graphics Processing Unit) is optimized for parallel processing and widely used in AI training. A TPU (Tensor Processing Unit) is a custom chip by Google designed specifically to accelerate machine learning.

What is Backpropagation?
Backpropagation is an algorithm that adjusts neural network weights by calculating and distributing errors backward through the model after each prediction.
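As a minimal illustration (not part of the glossary definition), here’s the chain rule at work in a two-weight “network.” All numbers and variable names are made up for the example—real frameworks compute these gradients automatically:

```python
# One forward/backward pass through a tiny two-weight model:
# h = w1 * x, y = w2 * h, loss = (y - target) ** 2.
x, target = 2.0, 10.0
w1, w2 = 1.0, 3.0

# Forward pass: compute the prediction and its error.
h = w1 * x                  # hidden value
y = w2 * h                  # prediction
loss = (y - target) ** 2    # squared error

# Backward pass: distribute the error backward via the chain rule.
dL_dy = 2 * (y - target)    # how the loss changes with the prediction
dL_dw2 = dL_dy * h          # gradient for the outer weight
dL_dh = dL_dy * w2          # error sent backward through w2
dL_dw1 = dL_dh * x          # gradient for the inner weight
```

Each weight now has a gradient telling the optimizer which direction to adjust it.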

What is Gradient Descent?
Gradient Descent is an optimization technique used to minimize a model’s error by iteratively updating weights based on the slope of the loss function.
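As a sketch of the idea (the loss function, starting weight, and learning rate below are invented for illustration), gradient descent on a one-weight model looks like this:

```python
# Minimize the toy loss L(w) = (w - 3) ** 2 by stepping against its slope.
def gradient_descent(lr=0.1, steps=100):
    w = 0.0                   # arbitrary starting weight
    for _ in range(steps):
        grad = 2 * (w - 3)    # dL/dw, the slope of the loss at w
        w -= lr * grad        # step in the direction that lowers the loss
    return w

w = gradient_descent()        # converges toward the minimum at w = 3
```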

What is the Cost of Delivery in AI?
The Cost of Delivery refers to the energy, infrastructure, and compute resources required to serve AI outputs in real-time, including inference compute, bandwidth, and storage.


Inference & Deployment

From embeddings to explainability and real-time prediction costs, this category focuses on what happens when the model goes live.

If it impacts speed, trust, or performance at scale, you’ll find it here.

What is Embedding in AI?
An Embedding is a numerical representation of text, image, or other data that preserves semantic meaning—used to power search, recommendations, and similarity comparisons.
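To make the “similarity comparison” part concrete, here’s a hedged sketch using cosine similarity on toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and the values below are invented):

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors: closer to 1.0 = more alike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy embeddings: "cat" and "kitten" point in similar directions; "car" doesn't.
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
car = [0.1, 0.2, 0.9]
```

This is the comparison that powers semantic search and recommendations: nearby vectors mean related content.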

What is Explainability in AI?
Explainability refers to understanding how and why an AI made a decision, increasing transparency and trust—especially in regulated or high-stakes settings.

What is Bias in AI?
Bias in AI refers to unfair patterns in model outputs caused by skewed or incomplete training data.

What is Attribution?
Attribution tracks the influence of specific inputs, data sources, or model components on a given AI output—used in auditing, compliance, and model debugging.


AI Fundamentals Definitions

Here’s where you’ll find the core concepts behind AI systems, from neural networks to language models and the math that drives them.

It’s the essential vocabulary for anyone working with—or learning about—modern AI.

What is a Neural Network?
A Neural Network is a layered model inspired by the human brain, built from artificial neurons that process data and learn patterns.

What is a Large Language Model (LLM)?
An LLM is an AI system trained on vast text corpora to understand, predict, and generate human-like language for tasks like chat, summarization, and Q&A.

What is an Activation Function?
An Activation Function controls whether a neuron in a neural network should ‘fire’ based on its input, enabling non-linear decision-making.
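Two common examples, sketched in plain Python for illustration:

```python
import math

def relu(x):
    # ReLU: passes positive inputs through unchanged; zeroes out negatives.
    return max(0.0, x)

def sigmoid(x):
    # Sigmoid: squashes any input into the range (0, 1).
    return 1 / (1 + math.exp(-x))
```

Without a non-linear activation like these, stacked layers would collapse into a single linear transformation.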

What is an Epoch in AI Training?
An Epoch is one complete pass through the entire training dataset. Multiple epochs are often used to improve model accuracy.

What is a Loss Function?
A Loss Function measures the gap between predicted and actual values, guiding model optimization during training.
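One widely used example is mean squared error; here’s a minimal sketch (the sample predictions are invented):

```python
def mse(predicted, actual):
    """Mean squared error: the average of the squared gaps."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

loss = mse([1.0, 2.0], [1.0, 4.0])   # one perfect guess, one off by 2
```

Training drives this number down: a smaller loss means predictions sit closer to the actual values.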

What is a Token in NLP?
A Token is the smallest unit of text processed by a language model—usually a word, subword, or character depending on the tokenizer used.
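A deliberately naive sketch of word-level tokenization (production models use trained subword tokenizers, which split text quite differently):

```python
def whitespace_tokenize(text):
    # Naive tokenizer: lowercase, then split on whitespace.
    # Real LLM tokenizers break words into learned subword pieces.
    return text.lower().split()

tokens = whitespace_tokenize("Tokens are the units models read")
```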


Sustainability Definitions

AI has a compute problem—and this section explains how smarter training, model compression, and transfer learning can help solve it.

Explore the strategies that improve performance while reducing waste, cost, and energy consumption.

What is Pre-Processing?
Pre-Processing is the stage where raw data is cleaned and structured before training. It includes normalization, tokenization, and removing duplicates.
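As a toy illustration of two of those steps—deduplication and normalization—on a list of numbers (the data and function name are invented for the example):

```python
def preprocess(values):
    """Remove duplicates, then min-max normalize into the range [0, 1]."""
    unique = sorted(set(values))          # deduplicate
    lo, hi = unique[0], unique[-1]
    return [(v - lo) / (hi - lo) for v in unique]   # normalize

clean = preprocess([10, 20, 20, 30])      # duplicate 20 removed, scaled to [0, 1]
```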


What is Model Compression?
Model Compression reduces the size of an AI model to improve efficiency, speed up inference, and enable deployment on smaller devices.
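One common compression technique is quantization—storing weights as small integers instead of full-precision floats. Here’s a simplified 8-bit sketch (real quantization schemes are more sophisticated; this just shows the round-trip idea):

```python
def quantize(weights, levels=256):
    """Map float weights onto integer buckets (256 levels = 8 bits)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (levels - 1)
    return [round((w - lo) / scale) for w in weights], lo, scale

def dequantize(codes, lo, scale):
    # Recover approximate float weights from the integer codes.
    return [lo + c * scale for c in codes]

codes, lo, scale = quantize([-1.0, -0.25, 0.5, 1.0])
approx = dequantize(codes, lo, scale)     # close to the originals, 4x smaller
```

The codes fit in one byte each instead of four, at the cost of a small rounding error per weight.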

What is Fine-Tuning?
Fine-Tuning adjusts a pre-trained model using new, domain-specific data to improve accuracy for a specific task without retraining from scratch.

What is Transfer Learning?
Transfer Learning reuses knowledge from a model trained on one task and applies it to a new task—saving time, compute, and data.


Comparisons & Tradeoffs

Understanding AI means understanding its choices—this section breaks down the differences between model types, training styles, and task-specific strategies.

Think of it as your side-by-side guide to what works, when, and why.

What is Overfitting vs. Underfitting?
Overfitting happens when a model learns training data too precisely and fails to generalize. Underfitting occurs when a model is too simplistic and misses important patterns.
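A toy contrast (data and model names invented): a model that memorizes training pairs, one that’s too simple, and one that captures the actual rule—only the last generalizes to an unseen input.

```python
train = [(1, 2), (2, 4), (3, 6)]          # underlying rule: y = 2x
lookup = {x: y for x, y in train}

def overfit_model(x):
    # Memorizes the training pairs exactly; has no answer for new inputs.
    return lookup.get(x, 0)

def underfit_model(x):
    # Always predicts the training mean; too simple to track the pattern.
    return sum(y for _, y in train) / len(train)

def good_model(x):
    # Captures the underlying rule, so it generalizes.
    return 2 * x

# On the unseen input x = 4 (true y = 8), only good_model gets it right.
```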

What is Supervised Learning vs. Unsupervised Learning?
Supervised Learning uses labeled data to train a model, while Unsupervised Learning finds patterns in unlabeled data, often for clustering or anomaly detection.

What is a CNN compared to an RNN?
CNNs are designed for spatial data like images; RNNs process sequential data like text or time series.

What is Named Entity Recognition (NER) vs. Sentiment Analysis?
NER identifies and categorizes named entities (people, places, organizations) in text; Sentiment Analysis determines its emotional tone (positive, negative, neutral).

What is Speech-to-Text vs. Text-to-Speech?
Speech-to-Text converts spoken audio into written words. Text-to-Speech does the reverse, turning text into synthesized voice.

What is a Pre-Trained Model vs. a Custom Model?
A Pre-Trained Model is trained on general data and can be adapted for new use cases. A Custom Model is built and trained from scratch for a specific task.


We want to hear from you.

We know that Augmetrics® is not a universal solution to the sustainability problems we face, but we also know it is a start, and one that took over 10 years to develop.