AIWiki
Search Results

12 results for RL

Infrastructure

A/B Testing (ML)

A/B testing in machine learning is a controlled experiment method that compares two or more model variants in production to determine which delivers superior performance on real-world business metrics.

6 min read · Updated June 2026
Foundations

AI Alignment

AI alignment is the field of research dedicated to ensuring that artificial intelligence systems pursue goals, values, and behaviours that are consistent with human intentions.

5 min read · Updated May 2026
Companies & Tools

Chroma

An open-source vector database designed for embedding-based applications, optimised for developer ergonomics and increasingly for large-scale serverless retrieval through a 2025 Rust-core rewrite.

4 min read · Updated May 2026
Applications

Computer Vision

Computer vision is the field of artificial intelligence that enables machines to interpret and act upon visual information from the world, including images, video, and depth data.

3 min read · Updated May 2026
Applications

Generative AI

Generative AI refers to artificial intelligence systems capable of producing new content (text, images, audio, video, or code) by learning the underlying distribution of training data.

4 min read · Updated May 2026
Infrastructure

Inference (Machine Learning)

Inference is the phase in which a trained machine learning model is used to generate predictions or outputs from new input data, distinct from the earlier training phase.

5 min read · Updated May 2026
Foundations

Monte Carlo Methods

A broad class of computational algorithms that use repeated random sampling to obtain numerical results, widely used in machine learning for Bayesian inference, reinforcement learning, and uncertainty estimation.

5 min read · Updated May 2026
Foundations

Reinforcement Learning

A machine learning paradigm in which an agent learns to make sequential decisions by interacting with an environment and optimising for cumulative reward through trial and error.

7 min read · Updated June 2026
Foundations

Reinforcement Learning from Human Feedback

A machine learning technique that trains a reward model from human preference data and uses it to align large language models with human values, safety requirements, and intended behaviour through reinforcement learning.

7 min read · Updated May 2026
Companies & Tools

Scale AI

An American data labelling, evaluation, and AI infrastructure company that supplies training data and evaluation services to leading AI laboratories, autonomous vehicle developers, and government agencies.

5 min read · Updated June 2026
Infrastructure

Tensor Processing Unit

A tensor processing unit (TPU) is a custom application-specific integrated circuit developed by Google for accelerating machine learning workloads, particularly neural network training and inference.

4 min read · Updated May 2026
Companies & Tools

Weights and Biases

Weights and Biases (W&B) is a machine learning developer platform for experiment tracking, model versioning, dataset management, and collaborative model evaluation used by over 200,000 ML practitioners worldwide.

5 min read · Updated May 2026