TensorFlow Lite

TensorFlow Lite is an open-source deep learning framework from Google for running optimised machine learning models on mobile phones, microcontrollers, and other edge devices.

5 min read | Last updated June 2026 | Infrastructure

TensorFlow Lite is the on-device runtime and toolchain in the TensorFlow ecosystem, designed to execute machine learning models efficiently on resource-constrained hardware such as Android and iOS smartphones, single-board computers, microcontrollers, and other embedded systems. Google first released it in 2017 as a successor to TensorFlow Mobile. In 2024 it was rebranded as LiteRT to reflect that it now accepts models from PyTorch, JAX, and Keras in addition to TensorFlow.

Purpose and design goals

TensorFlow Lite is built for three operational requirements that distinguish edge inference from cloud serving: low latency, small binary footprint, and offline operation. The runtime is a fraction of the size of full TensorFlow and is compiled into a compact library that ships inside the mobile app. By executing models locally, applications avoid the round-trip cost of cloud inference, continue to function without connectivity, and keep raw user data such as photos, voice, and biometrics on the device.

Workflow

The standard workflow has three stages. First, a model is trained in TensorFlow, Keras, JAX, or (via the AI Edge Torch converter) PyTorch. Second, the trained model is converted to the .tflite flatbuffer format, optionally applying optimisations such as post-training integer quantisation, weight pruning, or operator fusion. Third, the .tflite file is bundled with the application and loaded by the LiteRT interpreter, which dispatches operations to the most appropriate backend.

The interpreter supports a delegate mechanism that offloads computation to hardware accelerators: the GPU delegate uses OpenGL, OpenCL, or Metal; the NNAPI delegate routes through Android's Neural Networks API; the Core ML delegate targets Apple Neural Engine on iOS; and the Hexagon delegate uses Qualcomm DSPs. Custom delegates exist for Edge TPU, MediaTek APUs, and other vendor silicon.
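
The delegate mechanism can be illustrated with a toy model. This is a minimal sketch and not the LiteRT API: the `Delegate` class, the `assign_backends` function, and the operator sets are hypothetical, chosen only to show the general idea that operators a delegate supports are offloaded to the accelerator while everything else stays on the CPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delegate:
    """Hypothetical stand-in for a hardware delegate."""
    name: str
    supported_ops: frozenset


def assign_backends(graph_ops, delegates):
    """Pair each op with the first delegate that supports it, else 'CPU'."""
    plan = []
    for op in graph_ops:
        backend = next(
            (d.name for d in delegates if op in d.supported_ops), "CPU"
        )
        plan.append((op, backend))
    return plan


# Illustrative operator set only; real delegates publish their own support lists.
gpu = Delegate("GPU", frozenset({"CONV_2D", "DEPTHWISE_CONV_2D", "RELU"}))

plan = assign_backends(["CONV_2D", "RELU", "MY_CUSTOM_OP"], [gpu])
for op, backend in plan:
    print(f"{op:>14} -> {backend}")
```

In this sketch the two convolution-related operators are assigned to the GPU and the unsupported `MY_CUSTOM_OP` falls back to the CPU. Each switch between backends has a cost in practice, so a model whose operators are all supported by one delegate tends to benefit most.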

Optimisation techniques

| Technique | Effect |
|---|---|
| Post-training quantisation | Convert float32 weights to int8 or float16; 2-4x size reduction, modest accuracy loss |
| Quantisation-aware training | Train with simulated quantisation to preserve accuracy |
| Pruning | Zero out small-magnitude weights; combine with sparse kernels for speedup |
| Operator fusion | Merge adjacent ops (e.g., conv + bias + ReLU) at conversion time |
| Selective build | Strip unused operators from the binary |
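
The arithmetic behind int8 post-training quantisation can be sketched in a few lines of plain Python. This is an illustration of the common affine scheme (a real value is approximated as `scale * (q - zero_point)`), not the converter's actual implementation; the function names and the sample weights are assumptions made for the example.

```python
def quant_params(values, qmin=-128, qmax=127):
    """Derive an affine scale and zero point covering the value range."""
    lo = min(min(values), 0.0)  # the range must contain 0.0 so that
    hi = max(max(values), 0.0)  # zero is exactly representable
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to an int8 code, clamping to the representable range."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Recover the approximate float value from an int8 code."""
    return scale * (q - zero_point)


weights = [-0.62, -0.1, 0.0, 0.27, 0.91]  # made-up sample weights
scale, zero_point = quant_params(weights)
quantized = [quantize(w, scale, zero_point) for w in weights]
restored = [dequantize(q, scale, zero_point) for q in quantized]
max_error = max(abs(w - r) for w, r in zip(weights, restored))

float32_bytes = 4 * len(weights)  # 4 bytes per float32 weight
int8_bytes = 1 * len(weights)     # 1 byte per int8 weight
print(quantized, max_error, float32_bytes / int8_bytes)
```

Each weight shrinks from four bytes to one, which is where the upper end of the 2-4x size reduction comes from (float16 gives the 2x end). The rounding error per weight is bounded by half the scale, which is why the accuracy loss is usually modest.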

TensorFlow Lite for Microcontrollers

A subset of the runtime, TensorFlow Lite for Microcontrollers (TFLite Micro), targets devices with as little as a few kilobytes of RAM. It dispenses with dynamic memory allocation, supports a curated operator set, and underpins the TinyML movement, where keyword-spotting, gesture recognition, and anomaly detection run on Arm Cortex-M and ESP32 class chips drawing milliwatts of power.
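
Avoiding dynamic memory allocation typically means carving every tensor out of one fixed buffer that the application sizes up front. The sketch below models that idea as a simple bump allocator in Python; the real runtime is written in C++, and the `TensorArena` class, its method names, and the 16-byte alignment are assumptions for illustration only.

```python
class TensorArena:
    """Toy fixed-size arena: memory is reserved once and never grows."""

    def __init__(self, size_bytes):
        self._buf = bytearray(size_bytes)  # the only allocation ever made
        self._offset = 0

    @property
    def used(self):
        """Bytes consumed so far, including alignment padding."""
        return self._offset

    def allocate(self, nbytes, align=16):
        """Hand out an aligned slice of the arena, or fail if it is full."""
        start = (self._offset + align - 1) // align * align
        end = start + nbytes
        if end > len(self._buf):
            raise MemoryError(
                f"arena exhausted: need {end} bytes, have {len(self._buf)}"
            )
        self._offset = end
        return memoryview(self._buf)[start:end]


arena = TensorArena(1024)
input_buf = arena.allocate(100)   # occupies bytes 0..99
output_buf = arena.allocate(200)  # starts at the next 16-byte boundary, 112
print(arena.used)
```

Because the arena size is fixed, running out of space surfaces as an explicit error during setup rather than as a failed allocation in the middle of inference, which suits devices with only kilobytes of RAM.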

Use cases

Common deployments include real-time computer vision in mobile camera apps, on-device speech recognition and keyword spotting, gesture and pose detection in fitness apps, OCR and document scanning, on-device translation, predictive text and smart reply, and physical-world inference in wearables, drones, and industrial sensors.

Malaysian Context — Mobile, Telco, and Smart Manufacturing

Malaysia has a high smartphone penetration rate (above 95 percent of households according to MCMC) and a strong consumer appetite for mobile applications, which has made on-device machine learning a practical channel for AI features. Grab uses on-device computer vision in its driver app for documentation capture and selfie verification across South-East Asia, including Malaysia. Touch 'n Go eWallet, Boost, Shopee, Lazada, and BigPay rely on on-device models for fraud signals, document scanning, and biometric flows. Local super apps and digital banks favour on-device inference for biometric eKYC because it keeps face data off the network and reduces latency on variable mobile connections.

The semiconductor sector concentrated in Penang and Kulim assembles and tests many of the AI accelerators that TensorFlow Lite ultimately targets. Companies including Intel, AMD, Infineon, MediaTek Malaysia, Broadcom, and Vitrox conduct chip-level test development that touches mobile and edge silicon. Vitrox and Pentamaster have publicly discussed using TensorFlow-based vision models on the factory floor.

The TinyML community in Malaysia is small but growing, anchored by university chapters at Universiti Sains Malaysia (USM), Universiti Teknologi Malaysia (UTM), Universiti Tunku Abdul Rahman (UTAR), and Multimedia University (MMU), and by MDEC-supported maker initiatives. Application areas include rice paddy monitoring, palm oil pest detection on edge gateways supported by FELDA and the Malaysian Palm Oil Board, energy monitoring in commercial buildings, and water-quality sensing for state water utilities. The HRD Corp levy scheme reimburses participating employers for Edge AI and TinyML training, which has supported workforce development around embedded ML deployment.

References

Google. LiteRT (formerly TensorFlow Lite) Documentation.
David, R. et al. (2021). TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems. MLSys.
Google Developers Blog. (2024). AI Edge Torch: High Performance Inference of PyTorch Models on Mobile Devices.
Jacob, B. et al. (2018). Quantization and training of neural networks for efficient integer-arithmetic-only inference. CVPR.
Lee, J. et al. (2019). On-device neural net inference with mobile GPUs. arXiv:1907.01989.

Tags: tensorflow, edge-ai, on-device-inference, mobile-ml

Type	On-device inference framework
Developed by	Google
Released	2017 (now branded LiteRT)
Languages	C++, Java, Kotlin, Swift, Objective-C, Python
Target hardware	Android, iOS, microcontrollers, Linux SBCs
Related	TensorFlow, ONNX Runtime, Core ML, edge AI