Search Results
12 results for “RL”
A/B Testing (ML)
A/B testing in machine learning is a controlled experiment method that compares two or more model variants in production to determine which delivers superior performance on real-world business metrics.
AI Alignment
AI alignment is the field of research dedicated to ensuring that artificial intelligence systems pursue goals, values, and behaviours that are consistent with human intentions.
Chroma
An open-source vector database designed for embedding-based applications, optimised for developer ergonomics and, following a 2025 rewrite of its core in Rust, increasingly for large-scale serverless retrieval.
Computer Vision
Computer vision is the field of artificial intelligence that enables machines to interpret and act upon visual information from the world — including images, video, and depth data.
Generative AI
Generative AI refers to artificial intelligence systems capable of producing new content — text, images, audio, video, or code — by learning the underlying distribution of training data.
Inference (Machine Learning)
Inference is the phase in which a trained machine learning model is used to generate predictions or outputs from new input data, distinct from the earlier training phase.
Monte Carlo Methods
A broad class of computational algorithms that use repeated random sampling to obtain numerical results, widely used in machine learning for Bayesian inference, reinforcement learning, and uncertainty estimation.
Reinforcement Learning
A machine learning paradigm in which an agent learns to make sequential decisions by interacting with an environment and optimising for cumulative reward through trial and error.
Reinforcement Learning from Human Feedback
A machine learning technique that trains a reward model from human preference data and uses it to align large language models with human values, safety requirements, and intended behaviour through reinforcement learning.
Scale AI
An American data labelling, evaluation, and AI infrastructure company that supplies training data and evaluation services to leading AI laboratories, autonomous vehicle developers, and government agencies.
Tensor Processing Unit
A tensor processing unit (TPU) is a custom application-specific integrated circuit developed by Google for accelerating machine learning workloads, particularly neural network training and inference.
Weights and Biases
Weights and Biases (W&B) is a machine learning developer platform for experiment tracking, model versioning, dataset management, and collaborative model evaluation, used by over 200,000 ML practitioners worldwide.