AIWiki
Malaysia

Search Results

7 results for alignment

Foundations

AI Alignment

AI alignment is the field of research dedicated to ensuring that artificial intelligence systems pursue goals, values, and behaviours that are consistent with human intentions.

5 min readUpdated May 2026
Applications

AI Red Teaming

A structured adversarial evaluation practice in which testers attempt to elicit harmful, unsafe, or policy-violating behaviour from AI systems in order to surface risks before deployment.

5 min readUpdated May 2026
Applications

AI Safety

AI safety is a field of research and practice concerned with the development of artificial intelligence systems that behave reliably, avoid harmful outputs, and remain aligned with human values, especially as systems become more capable.

6 min readUpdated May 2026
Companies & Tools

Anthropic

Anthropic is an American AI safety company and large language model developer founded in 2021 by former OpenAI researchers, best known for developing the Claude family of AI assistants and the Constitutional AI alignment technique.

7 min readUpdated May 2026
Foundations

Constitutional AI

Constitutional AI is an alignment method developed by Anthropic that trains language models to follow a set of written ethical principles by using the model itself to critique and revise its own outputs, reducing dependence on human feedback for harmlessness.

6 min readUpdated May 2026
Models

DALL-E

DALL-E is a series of text-to-image generative AI models developed by OpenAI that create photorealistic and artistic images from natural language prompts using diffusion and language-vision alignment techniques.

6 min readUpdated May 2026
Foundations

Reinforcement Learning from Human Feedback

A machine learning technique that trains a reward model from human preference data and uses it to align large language models with human values, safety requirements, and intended behaviour through reinforcement learning.

7 min readUpdated May 2026