OpenVINO

OpenVINO is an open-source toolkit developed by Intel for optimising and deploying deep learning inference across Intel hardware, including CPUs, GPUs, and Neural Processing Units, as well as FPGAs in earlier releases, with broad support for major AI frameworks and model formats.

6 min read | Last updated June 2026 | Infrastructure

OpenVINO (Open Visual Inference and Neural network Optimisation) is an open-source toolkit created by Intel for accelerating and optimising deep learning inference on Intel hardware. Released publicly in 2018, it provides a unified API and set of tools for converting models trained in popular deep learning frameworks into a hardware-optimised format, then deploying them at high throughput and low latency on Intel processors, including CPUs, integrated and discrete GPUs, and Neural Processing Units (NPUs). Earlier releases also targeted Field-Programmable Gate Arrays (FPGAs).

OpenVINO's design philosophy separates model training from model deployment. Practitioners train models using PyTorch, TensorFlow, or other frameworks, then use OpenVINO to convert, optimise, and serve those models in production, potentially on hardware very different from the GPU cluster used for training. This separation is particularly valuable in edge AI scenarios where inference must run on Intel-based industrial PCs, embedded systems, or client-side hardware rather than on cloud servers.

Architecture

Inference Engine

The OpenVINO Inference Engine, called the OpenVINO Runtime in current releases, is the component that executes optimised models on target hardware. It exposes a device-agnostic API that abstracts hardware differences: the same application code runs on a CPU, GPU, or NPU simply by passing a different device string ("CPU", "GPU", "NPU") when the model is compiled. For each hardware target, the runtime selects an efficient execution path by applying device-specific kernel optimisations, memory allocation strategies, and throughput tuning.
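
As a rough illustration of the device-string idea, the sketch below picks a device from whatever the runtime reports and compiles a model for it. The preference order and the helper names pick_device and compile_for_best_device are assumptions made for this example; only openvino.Core, its available_devices property, and compile_model come from the OpenVINO Python API.

```python
from typing import Sequence

# Hypothetical preference order for this sketch: NPU first, then GPU, then CPU.
PREFERRED_DEVICES = ("NPU", "GPU", "CPU")


def pick_device(available: Sequence[str],
                preferred: Sequence[str] = PREFERRED_DEVICES) -> str:
    """Return the first preferred device that the runtime reports as available."""
    for name in preferred:
        # Reported device names can carry an index suffix such as "GPU.0".
        if any(dev == name or dev.startswith(name + ".") for dev in available):
            return name
    raise RuntimeError(f"none of {list(preferred)} found in {list(available)}")


def compile_for_best_device(model_path: str):
    """Compile a model file on the best available device (requires OpenVINO)."""
    import openvino as ov  # lazy import so pick_device works without OpenVINO

    core = ov.Core()
    device = pick_device(core.available_devices)
    # The device string is the only hardware-specific part of the call.
    return core.compile_model(model_path, device)
```

Because the device is just a string argument, moving the same application from a CPU-only industrial PC to a machine with an NPU should not require changes to the rest of the inference code.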

Heterogeneous execution allows different parts of a single model to run on different devices. For example, operations that the NPU does not support can fall back to the CPU while the bulk of the computation runs on the NPU, which improves hardware utilisation and avoids a model being blocked by a single unsupported layer.
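
The fallback behaviour can be pictured with a toy model in plain Python. This is not how the runtime partitions a graph internally (it works on subgraphs and queries the device plugins); the support table and the function assign_ops are hypothetical and only illustrate priority-ordered fallback. The "HETERO:" prefix with a comma-separated priority list is the device-string form OpenVINO uses for heterogeneous execution.

```python
from typing import Dict, List, Set


def hetero_device_string(priority: List[str]) -> str:
    """Build a heterogeneous device string, highest-priority device first."""
    return "HETERO:" + ",".join(priority)


def assign_ops(ops: List[str],
               supported: Dict[str, Set[str]],
               priority: List[str]) -> Dict[str, str]:
    """Toy fallback: each op goes to the first device in priority order that supports it."""
    placement = {}
    for op in ops:
        for device in priority:
            if op in supported.get(device, set()):
                placement[op] = device
                break
        else:
            raise ValueError(f"no device in {priority} supports {op}")
    return placement


# Hypothetical support table, for illustration only.
supported = {
    "NPU": {"Convolution", "MatMul", "Relu"},
    "CPU": {"Convolution", "MatMul", "Relu", "NonMaxSuppression"},
}
priority = ["NPU", "CPU"]
ops = ["Convolution", "Relu", "NonMaxSuppression"]

print(hetero_device_string(priority))
print(assign_ops(ops, supported, priority))
```

In this toy run the convolution and activation stay on the NPU, and only the operation the NPU lacks is placed on the CPU.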

Model Optimisation Tools

OpenVINO provides model compression and optimisation tools beyond simple format conversion:

Post-Training Quantisation (PTQ): Converts FP32 weights to INT8 with minimal accuracy loss using a small calibration dataset, reducing memory usage and increasing throughput on hardware with integer arithmetic acceleration
Quantisation-Aware Training (QAT): Integration with PyTorch and TensorFlow training pipelines to simulate quantisation during training for higher accuracy at INT8 precision
Filter Pruning: Removes redundant convolutional filters, reducing the computational cost of inference
Weight Compression: 4-bit and 8-bit weight compression for large language models, enabling LLM inference on Intel CPUs and client-side hardware
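
To make the FP32-to-INT8 step concrete, here is a minimal sketch of symmetric per-tensor quantisation in plain Python. It is a simplification for illustration, not the NNCF implementation, which also handles activations, per-channel scales, and accuracy control. All function names are made up for this example.

```python
from typing import List


def calibrate_scale(calibration_values: List[float]) -> float:
    """Symmetric per-tensor scale: map the largest observed magnitude to 127."""
    max_abs = max(abs(v) for v in calibration_values)
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantise(values: List[float], scale: float) -> List[int]:
    """Round each value to the nearest INT8 step and clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantise(q_values: List[int], scale: float) -> List[float]:
    """Map INT8 codes back to approximate floating-point values."""
    return [q * scale for q in q_values]


weights = [0.02, -1.27, 0.5, 0.0, 1.0]
scale = calibrate_scale(weights)       # the "calibration" step
q_weights = quantise(weights, scale)   # stored as 1 byte each instead of 4
restored = dequantise(q_weights, scale)

# The rounding error per value is bounded by half a quantisation step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q_weights, max_error)
```

Storing one byte per weight instead of four is where the roughly fourfold memory reduction comes from; the calibration data determines the scale and therefore how much rounding error each value incurs.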

Supported Model Types

OpenVINO's 2025 releases have substantially expanded support for generative AI models. The toolkit now supports a large catalogue of LLMs including Llama, Qwen, Mistral, and Phi families, as well as diffusion models, vision-language models, and speech recognition models. The openvino_genai library provides high-level pipelines for LLM text generation, image generation, speech recognition, and visual question answering with OpenVINO-optimised execution.
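
A text-generation call through openvino_genai might look like the sketch below. It assumes the openvino-genai package is installed and that model_dir points to an LLM already exported to OpenVINO format; the wrapper function generate_reply and its defaults are hypothetical.

```python
def generate_reply(model_dir: str, prompt: str,
                   device: str = "CPU", max_new_tokens: int = 100) -> str:
    """Run one text-generation call against an LLM exported to OpenVINO format."""
    import openvino_genai as ov_genai  # lazy import: needs `pip install openvino-genai`

    # The pipeline loads the model and tokenizer from the directory
    # and compiles them for the requested device.
    pipe = ov_genai.LLMPipeline(model_dir, device)
    return str(pipe.generate(prompt, max_new_tokens=max_new_tokens))
```

A call such as generate_reply("./my-model-ov", "What is edge AI?"), where the directory name is a placeholder, would return the generated text; switching the device argument to "GPU" or "NPU" targets other hardware where the model is supported.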

Conversion Workflow

A typical OpenVINO deployment follows this sequence: the practitioner trains or downloads a model in PyTorch or another framework, converts it to OpenVINO's Intermediate Representation (IR) format using the Model Conversion API (previously the Model Optimizer), optionally applies quantisation or other optimisations using the Neural Network Compression Framework (NNCF), and then loads and runs the model using the OpenVINO Runtime Python or C++ API.

The resulting IR format consists of an .xml file describing the model topology and a .bin file containing the binary weights.
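
The sequence above can be sketched in Python as follows. This is a minimal outline under stated assumptions, not a complete deployment script: the helper names ir_files, export_to_ir, and load_ir are invented for this example, the PyTorch model and calibration data are supplied by the caller, and the openvino and nncf packages must be installed for the conversion functions to run.

```python
from pathlib import Path


def ir_files(xml_path: str) -> tuple:
    """Return the (.xml, .bin) pair that together make up one IR model."""
    xml = Path(xml_path)
    if xml.suffix != ".xml":
        raise ValueError("IR topology file must end in .xml")
    return xml, xml.with_suffix(".bin")


def export_to_ir(torch_model, example_input, xml_path: str, calibration_items=None):
    """Convert a PyTorch model to IR, optionally quantising it to INT8 first."""
    import openvino as ov  # lazy import: needs `pip install openvino`

    ov_model = ov.convert_model(torch_model, example_input=example_input)
    if calibration_items is not None:
        import nncf  # lazy import: needs `pip install nncf`

        # calibration_items: an iterable of model inputs used to collect statistics.
        ov_model = nncf.quantize(ov_model, nncf.Dataset(calibration_items))
    ov.save_model(ov_model, xml_path)  # writes the .xml and the matching .bin
    return ir_files(xml_path)


def load_ir(xml_path: str, device: str = "CPU"):
    """Read an IR model from disk and compile it for the given device."""
    import openvino as ov

    core = ov.Core()
    return core.compile_model(core.read_model(xml_path), device)
```

Keeping export and loading in separate functions mirrors the training/deployment split: the export step can run on a development machine, while only load_ir needs to run on the target device.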

Target Applications

OpenVINO was originally developed with computer vision inference in mind: accelerating object detection, face recognition, pose estimation, and video analytics on Intel hardware at the edge. It has since expanded to a wider range of domains and deployment settings:

Industrial machine vision systems using Intel Core or Xeon processors
Smart city camera analytics on Intel OpenVINO-certified hardware
Healthcare imaging on hospital workstations and diagnostic equipment
In-vehicle AI using Intel processors in automotive platforms
On-premises LLM inference on Intel Xeon servers and Intel Arc GPUs
Client-side AI on PCs with Intel Core Ultra NPUs (AI PC market segment)

Recent Developments

Intel's 2025 releases have focused on generative AI acceleration. OpenVINO 2025.0 through 2025.4 delivered expanded NPU support for Intel Core Ultra platforms, improved LLM performance, new model coverage including Qwen3 and recent Llama variants, and integration with agentic AI frameworks. Intel has positioned OpenVINO as the primary inference stack for AI PCs — a market segment defined by the presence of a dedicated NPU for on-device AI acceleration.

Malaysian Context — OpenVINO and Intel in Malaysia's Industrial and Technology Sectors

Intel has a significant industrial footprint in Malaysia, with major assembly, test, and packaging operations in Penang that date back to 1972. Intel Penang is one of the company's largest manufacturing sites globally and employs thousands of engineers and technicians. This industrial presence creates a natural pathway for OpenVINO adoption in Malaysian manufacturing contexts, where Intel-based industrial PCs and edge computing platforms are commonly deployed.

In the context of Industry 4.0 adoption promoted under the MyDigital Blueprint, Malaysian manufacturers — particularly in the electronics, semiconductor, and automotive supply chain sectors in Penang and the Klang Valley — are deploying machine vision systems for defect detection, process monitoring, and quality assurance. These systems frequently run on Intel-based edge hardware, making OpenVINO a relevant inference platform for computer vision models deployed in factory settings.

Penang's established electronics manufacturing ecosystem, which includes Tier 1 suppliers to Apple, Google, and automotive OEMs, has driven interest in OpenVINO among companies implementing AI-based quality inspection. The Intel AI Developer Program, accessible to Malaysian developers and engineers, provides access to OpenVINO documentation, optimised model repositories, and developer hardware through the Intel DevCloud platform.

Malaysian universities with strong engineering programmes — Universiti Sains Malaysia (USM) in Penang and Universiti Teknologi Malaysia (UTM) — have incorporated Intel's AI tools, including OpenVINO, into their embedded systems and computer vision curricula as part of industry-aligned programmes. MDEC's Malaysia Digital initiative recognises AI infrastructure development as a qualifying activity, supporting companies deploying OpenVINO-based edge AI solutions for local and export markets.

The growth of Intel AI PC hardware in Malaysia's consumer and enterprise laptop market has also brought NPU-based local inference within reach of everyday computing platforms, enabling on-device AI applications that previously required cloud connectivity — a development relevant to Malaysia's privacy-conscious regulatory environment under the PDPA.

Tags: Intel, inference optimisation, edge AI, model deployment, deep learning

Developer	Intel Corporation
First released	2018
Licence	Apache 2.0
Supported hardware	Intel CPU, GPU, NPU (VPU and FPGA in earlier releases)
Key frameworks supported	PyTorch, TensorFlow, ONNX, PaddlePaddle
Related	ONNX, TensorFlow Lite, CoreML, Edge AI, model compression