What is AIWiki Malaysia?

AIWiki Malaysia is a free, open AI knowledge base covering artificial intelligence concepts, tools, models, and use cases — written specifically for Malaysian professionals and students. It is maintained by AITG Sdn Bhd, an AI company based in Penang.

Precision and Recall

Precision and recall are two complementary metrics used to evaluate classification models, measuring respectively the correctness of positive predictions and the completeness with which actual positives are identified.

4 min read · Last updated June 2026 · Foundations

Precision and recall are paired evaluation metrics for classification and information-retrieval systems. Precision answers the question of how many of the items the model labelled positive are actually positive, while recall answers how many of the truly positive items the model managed to find. Because the two capture different kinds of error, they are almost always reported together, and improving one often comes at the expense of the other.

Definitions and the confusion matrix

Both metrics are computed from the four cells of a confusion matrix: true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). Precision is defined as TP / (TP + FP), the fraction of positive predictions that are correct. Recall, also called sensitivity or the true positive rate, is TP / (TP + FN), the fraction of actual positives that were retrieved.
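As a minimal sketch of the definitions above, the following Python counts the four confusion-matrix cells from binary labels and derives both metrics. The function names and the toy labels are illustrative assumptions, and returning 0.0 when a denominator is zero is only one possible convention.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def precision(tp, fp):
    # Assumed convention: 0.0 when the model made no positive predictions.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Assumed convention: 0.0 when there are no actual positives.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy data (hypothetical): 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(tp, fp, tn, fn)        # 2 1 5 2
print(precision(tp, fp))     # 2 of 3 positive predictions are correct
print(recall(tp, fn))        # 2 of 4 actual positives are found
```

Here two of the three positive predictions are correct (precision of about 0.67), while only two of the four actual positives are found (recall of 0.5).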

A model that predicts positive very rarely, only when extremely confident, will tend to have high precision but low recall, missing many genuine cases. A model that predicts positive liberally will have high recall but low precision, raising many false alarms. The right balance depends entirely on the costs attached to each type of error.
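One way to make the cost argument concrete is to price each error type and compare totals. The sketch below is illustrative only: the error counts for the two classifiers and the cost ratios are hypothetical numbers, not measurements from any real system.

```python
def total_cost(fp, fn, cost_fp, cost_fn):
    """Total error cost given counts and a per-error cost for each type."""
    return fp * cost_fp + fn * cost_fn


# Hypothetical error counts for two classifiers on the same data.
cautious = {"fp": 2, "fn": 30}   # predicts positive rarely: few false alarms, many misses
liberal = {"fp": 40, "fn": 3}    # predicts positive often: many false alarms, few misses

# Screening-like setting (assumed): a miss costs 50 times a false alarm.
screening_costs = {
    "cautious": total_cost(**cautious, cost_fp=1, cost_fn=50),
    "liberal": total_cost(**liberal, cost_fp=1, cost_fn=50),
}

# Spam-like setting (assumed): a false alarm costs 50 times a miss.
spam_costs = {
    "cautious": total_cost(**cautious, cost_fp=50, cost_fn=1),
    "liberal": total_cost(**liberal, cost_fp=50, cost_fn=1),
}

print(screening_costs)  # {'cautious': 1502, 'liberal': 190}
print(spam_costs)       # {'cautious': 130, 'liberal': 2003}
```

The same two classifiers swap ranking when the cost ratio flips, which is why neither high precision nor high recall is preferable in the abstract.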

Why accuracy is not enough

Plain accuracy, the proportion of all predictions that are correct, can be deeply misleading when classes are imbalanced. If only one transaction in ten thousand is fraudulent, a model that labels every transaction as legitimate achieves 99.99 percent accuracy while detecting no fraud at all. Precision and recall expose this failure immediately, because recall on the fraud class would be zero. For this reason, imbalanced problems in fraud detection, medical screening and anomaly detection rely on precision and recall rather than accuracy.
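The arithmetic behind this example can be checked in a few lines. The sketch assumes exactly 10,000 transactions with one fraudulent case and a model that labels everything as legitimate; the variable names are illustrative.

```python
n_total = 10_000
n_fraud = 1

# The "always legitimate" model never predicts the positive (fraud) class.
tp = 0                      # no fraud is ever flagged
fp = 0                      # so there are no false alarms either
fn = n_fraud                # the single fraudulent transaction is missed
tn = n_total - n_fraud      # every legitimate transaction is labelled correctly

accuracy = (tp + tn) / n_total
fraud_recall = tp / (tp + fn)
# Precision is undefined here (0 / 0) because no positive prediction is made.

print(f"accuracy     = {accuracy:.2%}")      # accuracy     = 99.99%
print(f"fraud recall = {fraud_recall:.0%}")  # fraud recall = 0%
```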

Combining the two

To summarise both numbers in a single figure, practitioners use the F1 score, the harmonic mean of precision and recall, given by F1 = 2 * (precision * recall) / (precision + recall). The harmonic mean penalises large disparities, so a high F1 requires both metrics to be reasonably high. Where one concern dominates, the more general F-beta score treats recall as beta times as important as precision: a beta above 1 favours recall, and a beta below 1 favours precision.
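A short sketch of the combined score, using the standard F-beta formula (1 + beta^2) * P * R / (beta^2 * P + R), which reduces to F1 when beta is 1. The function name and the sample precision and recall values are illustrative assumptions.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta = 1 gives F1, the harmonic mean of precision and recall."""
    b2 = beta * beta
    denom = b2 * precision + recall
    # Assumed convention: 0.0 when both precision and recall are zero.
    return (1 + b2) * precision * recall / denom if denom else 0.0


# The harmonic mean penalises disparity between the two metrics.
balanced = f_beta(0.80, 0.80)          # about 0.80
lopsided = f_beta(0.95, 0.10)          # about 0.18
arithmetic_mean = (0.95 + 0.10) / 2    # 0.525, which hides the weak recall

# With recall higher than precision, a larger beta rewards the model more.
f2 = f_beta(0.5, 1.0, beta=2.0)        # about 0.83
f1 = f_beta(0.5, 1.0, beta=1.0)        # about 0.67
f_half = f_beta(0.5, 1.0, beta=0.5)    # about 0.56

print(round(balanced, 3), round(lopsided, 3), round(arithmetic_mean, 3))
print(round(f2, 3), round(f1, 3), round(f_half, 3))
```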

Because most classifiers output a continuous score rather than a hard label, the decision threshold can be tuned to trade precision against recall. Sweeping the threshold produces a precision-recall curve, and the area under it provides a threshold-independent measure of quality that is particularly informative for imbalanced data.
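The threshold trade-off can be sketched by scoring a handful of examples and recomputing both metrics at several cut-offs. The scores, labels and thresholds below are made-up illustrative values, and treating precision as 1.0 when nothing is predicted positive is an assumed convention. The sketch produces points on the curve only and does not compute the area under it.

```python
def pr_at_threshold(y_true, scores, threshold):
    """Precision and recall when every score >= threshold is labelled positive."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Assumed convention: precision is 1.0 when no positive prediction is made.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical model scores, sorted from most to least confident.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]

# Each threshold yields one point on the precision-recall curve.
curve = [(t, *pr_at_threshold(y_true, scores, t)) for t in (0.9, 0.5, 0.15)]
for threshold, p, r in curve:
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

Lowering the threshold from 0.9 to 0.15 raises recall from 0.25 to 1.00 while precision falls from 1.00 to about 0.57.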

| Scenario | Priority | Rationale |
| --- | --- | --- |
| Medical screening | Recall | Missing a true case is costly |
| Spam filtering | Precision | Blocking a real email is costly |
| Search ranking | Both, via F1 | Relevance and completeness matter |
| Fraud detection | Balanced, tuned | Errors carry asymmetric cost |

Use in information retrieval

The concepts originate in information retrieval, where precision measures the relevance of returned documents and recall measures coverage of all relevant documents. Modern search and recommendation systems, including those built on vector databases and semantic search, continue to report precision at k and recall at k to describe ranking quality.
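For ranked results, the two metrics are usually computed over the top k items only. The sketch below assumes one common convention, dividing precision at k by k itself; the document identifiers and relevance set are hypothetical.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are relevant (denominator is k)."""
    hits = sum(1 for doc in ranked_ids[:k] if doc in relevant_ids)
    return hits / k


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0  # assumed convention when nothing is relevant
    hits = sum(1 for doc in ranked_ids[:k] if doc in relevant_ids)
    return hits / len(relevant_ids)


# Hypothetical ranking returned by a search system, best match first.
ranked = ["d3", "d7", "d1", "d9", "d4"]
# Hypothetical ground truth: d5 is relevant but was never retrieved.
relevant = {"d1", "d3", "d5"}

print(precision_at_k(ranked, relevant, 3))  # 2 of the top 3 are relevant
print(recall_at_k(ranked, relevant, 3))     # 2 of the 3 relevant documents are found
print(precision_at_k(ranked, relevant, 5))  # 0.4
```

Extending k from 3 to 5 adds no new relevant documents in this example, so recall stays at two thirds while precision drops to 0.4.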

Malaysian Context — Evaluating High-Stakes Models

Precision and recall are central to how Malaysian institutions assess AI systems where errors carry real consequences. In the financial sector, banks such as Maybank and CIMB deploy fraud-detection and anti-money-laundering models whose tuning is shaped by the differing costs of false positives, which inconvenience customers, and false negatives, which let illicit activity through. Bank Negara Malaysia and the Securities Commission expect such models to be validated with appropriate metrics rather than headline accuracy.

In healthcare, hospitals and analytics providers working with the Ministry of Health weigh recall heavily in diagnostic screening tools, since a missed condition is more dangerous than a false alarm that prompts a confirmatory test. Initiatives under the MyDIGITAL blueprint that apply AI to public services adopt similar evaluation discipline.

Telecommunications operators including Maxis, CelcomDigi and TM use precision and recall to judge churn-prediction and network-anomaly models, balancing the cost of unnecessary retention offers against missed at-risk customers. Training programmes supported by HRD Corp and MDEC increasingly teach these metrics as standard practice, reflecting their importance for any organisation deploying classification systems in regulated Malaysian industries.

Tags: model evaluation, classification metrics, confusion matrix, F1 score

Field	Machine learning, information retrieval
Precision	TP / (TP + FP)
Recall	TP / (TP + FN)
Combined as	F1 score (harmonic mean)
Derived from	Confusion matrix
Related	Accuracy, ROC curve