DataOps

DataOps is an engineering methodology that applies agile, DevOps, and lean manufacturing principles to data pipelines, aiming for rapid, reliable, and repeatable delivery of analytics and machine learning data.

4 min read · Last updated June 2026 · Infrastructure

DataOps (a portmanteau of "data" and "operations") is a methodology and set of practices for delivering data and analytics at production quality with the speed and reliability that modern businesses expect. It applies the principles of DevOps, agile software development, and lean manufacturing to the full data lifecycle: ingestion, transformation, quality testing, deployment, and monitoring of pipelines that feed dashboards, machine learning models, and operational systems.

Origin and definition

The term was coined by analyst Lenny Liebmann in 2014 and popularised by the DataKitchen team and others who observed that traditional data warehousing projects suffered from the same coordination and quality problems that DevOps had addressed in software engineering. The DataOps Manifesto, published in 2017, codified 18 principles that emphasise continuous delivery of analytic insights, treating analytics as code, and building quality measurement into pipelines.

Core practices

A mature DataOps practice typically combines several disciplines:

| Practice | Purpose |
|---|---|
| Pipeline as code | Define ingestion and transformation in version-controlled SQL or Python |
| Orchestration | Schedule and monitor DAGs using Airflow, Dagster, Prefect, or Argo |
| Data quality testing | Assert schemas, freshness, row counts, and business rules (Great Expectations, dbt tests, Soda) |
| Environment promotion | Promote changes through development, staging, and production environments, each with isolated data |
| Observability | Track lineage, detect anomalies in data volumes, and enforce freshness SLAs |
| Catalogue and contracts | Document datasets, owners, and producer-consumer contracts |
| Incident response | Treat broken pipelines as production incidents with postmortems |
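To make the data quality testing row concrete, the following is a minimal sketch in plain Python of the four kinds of assertion the table names: schema, freshness, row count, and a business rule. It does not use Great Expectations, dbt, or Soda. The function name `check_batch`, the column names, and the thresholds are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def check_batch(rows, now, required_columns, max_age, min_rows):
    """Return a list of failure messages; an empty list means the batch passed.

    rows is a list of dicts. The column names used here ("updated_at",
    "amount") are hypothetical and stand in for a real dataset's fields.
    """
    failures = []

    # Row count: an unexpectedly small batch often signals an upstream failure.
    if len(rows) < min_rows:
        failures.append(f"row count {len(rows)} is below the minimum of {min_rows}")

    # Schema: every row must carry every required column.
    for i, row in enumerate(rows):
        missing = required_columns - set(row)
        if missing:
            failures.append(f"row {i} is missing columns {sorted(missing)}")

    # Freshness: the newest record must be recent enough.
    timestamps = [r["updated_at"] for r in rows if "updated_at" in r]
    if timestamps and now - max(timestamps) > max_age:
        failures.append("newest record is older than the freshness limit")

    # Business rule (illustrative only): amounts must not be negative.
    for i, row in enumerate(rows):
        if row.get("amount", 0) < 0:
            failures.append(f"row {i} has a negative amount")

    return failures


# Example: a fixed "now" keeps the check deterministic.
now = datetime(2026, 6, 1, 12, 0, tzinfo=timezone.utc)
batch = [
    {"id": 1, "amount": 10.0, "updated_at": now - timedelta(hours=1)},
    {"id": 2, "amount": 5.5, "updated_at": now - timedelta(hours=2)},
]
failures = check_batch(batch, now, {"id", "amount", "updated_at"},
                       max_age=timedelta(hours=24), min_rows=2)
```

In a pipeline, a non-empty `failures` list would typically stop the run before the data reaches dashboards or models, which is the behaviour the dedicated testing tools provide with richer reporting.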

The supporting tool stack usually includes a cloud data warehouse or lakehouse (Snowflake, BigQuery, Databricks, Redshift, ClickHouse), a transformation layer (dbt, SQLMesh, Spark), an orchestration engine, a catalogue (Atlan, DataHub, OpenMetadata, Unity Catalog), and an observability platform (Monte Carlo, Bigeye, Acceldata, Sifflet).
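An orchestration engine models a pipeline as a directed acyclic graph (DAG) and runs each task only after its upstream tasks have finished. The sketch below shows that idea using only the Python standard library (`graphlib`, available from Python 3.9). It is not the API of Airflow, Dagster, Prefect, or Argo; `run_pipeline` and the four task names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run each task once, after all of the tasks it depends on.

    tasks:        mapping of task name -> zero-argument callable
    dependencies: mapping of task name -> set of upstream task names
    Returns the order in which the tasks ran.
    """
    order = []
    # static_order() yields upstream tasks first and raises
    # graphlib.CycleError if the dependencies do not form a DAG.
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()  # an exception here stops all downstream tasks
        order.append(name)
    return order


# Hypothetical four-step pipeline; each task just records that it ran.
log = []
tasks = {
    "ingest": lambda: log.append("ingest"),
    "transform": lambda: log.append("transform"),
    "test": lambda: log.append("test"),
    "publish": lambda: log.append("publish"),
}
dependencies = {
    "ingest": set(),
    "transform": {"ingest"},
    "test": {"transform"},
    "publish": {"test"},
}
order = run_pipeline(tasks, dependencies)
```

Real orchestrators add scheduling, retries, parallel execution, and monitoring on top of this ordering guarantee.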

Relationship to DevOps and MLOps

DataOps shares with DevOps a commitment to automation, continuous integration, version control, and shared accountability between teams. It differs in the centrality of data as the primary artefact: a deployment can succeed while the underlying data silently degrades, so DataOps adds explicit data testing and observability as first-class concerns. MLOps extends DataOps further to cover model training, evaluation, deployment, and drift monitoring; in practice the disciplines overlap heavily and many organisations run them as a single platform team.
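The point that a deployment can succeed while the data silently degrades can be illustrated with a simple volume check. The sketch below flags a daily row count that lies far from the recent average, using a z-score. This is one common heuristic, assumed here for illustration, and not necessarily how any particular observability platform works. The function name `is_volume_anomaly` and the default threshold of 3.0 are assumptions.

```python
from statistics import mean, stdev


def is_volume_anomaly(history, today, threshold=3.0):
    """Return True if today's row count is more than `threshold`
    sample standard deviations away from the historical mean."""
    if len(history) < 2:
        return False  # not enough history to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # History is perfectly flat, so any change at all is unusual.
        return today != mu
    return abs(today - mu) / sigma > threshold


# Hypothetical daily row counts for one table over the past five days.
history = [1000, 1020, 980, 1010, 990]
normal_day = is_volume_anomaly(history, 1005)
broken_day = is_volume_anomaly(history, 400)
```

A check like this runs after the pipeline reports success, which is why it catches failures that deployment status alone would miss.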

Benefits and adoption challenges

Organisations that adopt DataOps typically report shorter time-to-insight, fewer broken dashboards, more reproducible analytics, and clearer ownership of data assets. Industry surveys regularly cite multi-fold productivity gains for teams that automate testing and deployment compared with those relying on manual processes.

Adoption challenges include legacy ETL systems that resist version control, organisational silos between data engineers and analysts, the cost of refactoring pipelines, and the difficulty of agreeing data contracts between producing and consuming teams.
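Part of what makes data contracts hard to agree is deciding which schema changes count as breaking. As a sketch under simple assumptions, the code below treats a contract as a set of named, typed columns and reports a change as breaking when a contracted column is removed or changes type, while allowing new columns. The `Contract` class, the `breaking_changes` function, and the example dataset are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    dataset: str
    owner: str
    columns: dict  # column name -> type name, e.g. {"order_id": "int"}


def breaking_changes(contract, proposed_columns):
    """List the ways a proposed schema would break consumers of the contract.

    Adding columns is treated as safe; removing a contracted column or
    changing its type is treated as breaking.
    """
    problems = []
    for name, type_name in contract.columns.items():
        if name not in proposed_columns:
            problems.append(f"column '{name}' was removed")
        elif proposed_columns[name] != type_name:
            problems.append(
                f"column '{name}' changed type from {type_name} "
                f"to {proposed_columns[name]}"
            )
    return problems


# Hypothetical contract between a producing and a consuming team.
contract = Contract(
    dataset="orders",
    owner="sales-data-team",
    columns={"order_id": "int", "amount": "float"},
)
safe = breaking_changes(
    contract, {"order_id": "int", "amount": "float", "currency": "str"}
)
unsafe = breaking_changes(contract, {"order_id": "str"})
```

Running such a check in continuous integration lets the producing team see the impact of a schema change before consumers do.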

Malaysian Context — Government and Enterprise Data Modernisation

DataOps adoption in Malaysia has been driven primarily by the financial services, telecommunications, government, and e-commerce sectors. Bank Negara Malaysia (BNM) and the Securities Commission Malaysia (SC) require licensed institutions to demonstrate data lineage, quality controls, and timely regulatory reporting, which has pushed banks toward pipeline-as-code and automated testing. Maybank, CIMB, RHB, Public Bank, and Hong Leong Bank have all invested in modern data platforms built around cloud warehouses, dbt-style transformation, and observability tooling.

Telecommunications players (Maxis, CelcomDigi, U Mobile, TM) operate some of the country's largest data estates, processing network telemetry, billing, and customer interaction data, and rely heavily on orchestration and data quality automation. The merged CelcomDigi data engineering organisation has been a public reference for DataOps modernisation in the region.

In the public sector, the MyDigital Blueprint and the Public Sector Big Data Analytics (Data Raya Sektor Awam, DRSA) programme led by MAMPU emphasise interoperable data sharing across ministries, which has surfaced classical DataOps requirements such as data catalogues, standardised schemas, and quality contracts. The Department of Statistics Malaysia (DOSM) publishes open datasets through the OpenDOSM portal, which depends on reliable pipelines to keep official statistics trustworthy.

MDEC subsidises data engineering and DataOps training through HRD Corp claimable programmes, the eUsahawan programme, and Premier Digital Tech Institution (PDTI) partner universities. Local consultancies and cloud partners (Fusionex, Silverlake Axis, Securemetric, Innov8tif, Naluri, Avanade Malaysia) deliver implementation services. Penang and Cyberjaya host a growing community of data engineers serving regional clients across ASEAN, and DataOps roles are among the fastest-growing job categories on Malaysian tech recruitment platforms.

References

Liebmann, L. (2014). DataOps: Why Big Data Infrastructure Matters. IBM Big Data and Analytics Hub.
DataKitchen. (2017). The DataOps Manifesto.
Bergh, C., Benghiat, G., and Strod, E. (2019). The DataOps Cookbook. DataKitchen.
Atlan. (2025). DataOps: Essential Guide and Principles.
MAMPU. Public Sector Big Data Analytics (DRSA) Strategic Plan.

Tags: data-engineering, mlops, devops, data-pipelines

Type	Data engineering methodology
Coined	Lenny Liebmann, 2014
Builds on	DevOps, agile, lean manufacturing
Goal	Reliable, repeatable analytics delivery
Core practices	Pipeline orchestration, data testing, observability
Related	MLOps, DevOps, data engineering