AIWiki
Malaysia

Falcon LLM

A family of open-weight large language models developed by the Technology Innovation Institute (TII) in Abu Dhabi, released under permissive licences and widely used in enterprise and research applications.

Last updated: June 2026

Falcon LLM is a family of open-weight large language models developed by the Technology Innovation Institute (TII), the applied research arm of the Advanced Technology Research Council in Abu Dhabi, United Arab Emirates. First released in 2023, Falcon was the first major foundation-model release from outside the United States, China, and Europe to top the Hugging Face Open LLM Leaderboard, and it played a significant role in establishing the practice of releasing strong open-weight models under permissive commercial licences. The Falcon series has since expanded to include base, instruction-tuned, and small-form-factor variants suitable for on-device and edge deployments.

Model family

The first Falcon release in mid-2023 introduced Falcon 7B and Falcon 40B, two autoregressive decoder-only transformer models trained on RefinedWeb, a large filtered corpus derived from CommonCrawl, supplemented by curated text. Falcon 40B briefly held the top position on the Hugging Face Open LLM Leaderboard, surpassing contemporaneous open models from Meta and academic groups. In September 2023, TII released Falcon 180B, a 180-billion-parameter model trained on roughly 3.5 trillion tokens that achieved performance comparable to PaLM 2 Large and approached the capabilities of contemporary closed models such as GPT-3.5 on standard benchmarks.

Subsequent generations have focused on efficiency. Falcon 2 introduced multimodal capabilities and competitive 11-billion-parameter models, while the Falcon 3 series, released in late 2024, emphasised smaller, more efficient models in the 1-billion to 10-billion parameter range optimised for edge inference and resource-constrained deployment. The lineage is sometimes grouped under the broader Falcon Perception programme, which extends the family to multimodal and on-device perception tasks.

Architecture

Falcon models follow the standard decoder-only transformer architecture with several efficiency-oriented modifications. Multi-query attention shares a single set of key and value projections across all attention heads, which shrinks the key-value cache and reduces memory bandwidth requirements during inference. Rotary positional embeddings encode token position inside the attention computation, and FlashAttention reduces the memory traffic of that computation, keeping training on longer sequences efficient. The training pipeline relied heavily on data quality engineering: RefinedWeb demonstrated that aggressively filtered and deduplicated web data could match or outperform curated corpora such as The Pile, a finding that influenced the broader open-weight community.
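The memory saving from multi-query attention can be illustrated with simple arithmetic. The sketch below estimates the size of the key-value cache for standard multi-head attention and for multi-query attention. The function name `kv_cache_bytes` and the configuration values (32 layers, 64 heads, head dimension 64, 2,048-token context, 2 bytes per value) are assumptions chosen for illustration, not an official Falcon configuration.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Estimate the key-value cache size in bytes for one decoding session.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Illustrative configuration (assumed values, not an official Falcon config).
N_LAYERS = 32
N_HEADS = 64
HEAD_DIM = 64
SEQ_LEN = 2048

# Multi-head attention: every head caches its own keys and values.
mha_bytes = kv_cache_bytes(N_LAYERS, N_HEADS, HEAD_DIM, SEQ_LEN)

# Multi-query attention: all heads share one key head and one value head.
mqa_bytes = kv_cache_bytes(N_LAYERS, 1, HEAD_DIM, SEQ_LEN)

if __name__ == "__main__":
    print(f"multi-head cache:  {mha_bytes / 2**20:.0f} MiB")
    print(f"multi-query cache: {mqa_bytes / 2**20:.0f} MiB")
    print(f"reduction factor:  {mha_bytes // mqa_bytes}x")
```

Under these assumptions the cache shrinks by a factor equal to the number of attention heads. Less data has to be read from memory at each decoding step, which is the bandwidth saving described above.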

Licensing and ecosystem

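A defining feature of Falcon is its permissive licensing. Falcon 7B and Falcon 40B are released under the Apache 2.0 licence, which permits commercial use without revenue caps. Falcon 180B and later generations are distributed under TII's own licences, which build on Apache 2.0 but add an acceptable use policy and, in the case of Falcon 180B, restrictions on offering the model as a hosted service, so users should check the licence attached to each model. Even so, these terms contrast with the more restrictive community licences attached to some peer open-weight families and have made Falcon attractive to enterprise users requiring legal clarity. Falcon weights are distributed primarily through Hugging Face and have been integrated into popular inference stacks including vLLM, Text Generation Inference, llama.cpp, and Ollama. Cloud providers including AWS and Microsoft Azure have made Falcon variants available through their managed model catalogues.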

Applications

Falcon models are used across general-purpose chatbots, code assistants, retrieval-augmented generation pipelines, document summarisation, and domain-specific fine-tunes. Their permissive licence has made them particularly common in regulated industries — finance, healthcare, and government — where on-premises deployment and clear commercial terms are required. The smaller Falcon 3 models have been deployed in edge and embedded scenarios, including industrial inspection, robotics, and devices with limited compute, where 180-billion-parameter models would be impractical.
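A rough calculation shows why the largest model is impractical at the edge. The sketch below estimates the memory needed just to hold model weights at different numeric precisions. The helper `weight_memory_gb`, the nominal parameter counts, and the bytes-per-parameter figures are illustrative assumptions, and the estimate ignores activations, the key-value cache, and runtime overhead.

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Approximate memory for model weights alone, in gigabytes (10**9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Nominal parameter counts taken from the model names (assumed round figures).
MODELS = {
    "1B model": 1_000_000_000,
    "10B model": 10_000_000_000,
    "180B model": 180_000_000_000,
}

# Bytes per parameter at common precisions (quantisation overhead ignored).
PRECISIONS = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}

if __name__ == "__main__":
    for name, n_params in MODELS.items():
        sizes = ", ".join(
            f"{label}: {weight_memory_gb(n_params, nbytes):.1f} GB"
            for label, nbytes in PRECISIONS.items()
        )
        print(f"{name:>10} -> {sizes}")
```

By this estimate a 180-billion-parameter model needs about 360 GB for 16-bit weights alone, while a 1-billion-parameter model quantised to 4 bits needs about 0.5 GB, which is within reach of embedded hardware.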

Comparison to peer models

| Model family | Origin | Largest size | Licence | Notable feature |
|---|---|---|---|---|
| Falcon | TII (UAE) | 180B | Apache 2.0 (7B, 40B); TII licences (180B and later) | First major non-US/CN/EU release; permissive licence |
| Llama | Meta AI (USA) | 405B+ | Llama Community Licence | Largest open ecosystem |
| Mistral | Mistral AI (France) | 8x22B (Mixtral) | Apache 2.0 / commercial | Mixture-of-experts variants |
| Qwen | Alibaba (China) | 72B+ | Apache 2.0 (some variants) | Strong multilingual coverage |

Significance

Falcon's release was strategically important for several reasons. It established the UAE as a major contributor to open foundation models, demonstrated that high-quality web data engineering could rival curated corpora, and helped normalise the release of frontier-scale models under permissive licences. It also catalysed investment in AI infrastructure and talent in the Gulf and inspired peer initiatives across the Middle East and South-East Asia.

References

  1. Technology Innovation Institute. (2023). Falcon 180B: World's Most Powerful Open LLM. tii.ae.
  2. Penedo, G., et al. (2023). The RefinedWeb Dataset for Falcon LLM. arXiv.
  3. Almazrouei, E., et al. (2023). Falcon Series of Open Language Models. TII technical report.
  4. Hugging Face. (2024). Open LLM Leaderboard. huggingface.co.