Research into Language Difficulty

CEFR.AI calibrates language difficulty by combining text complexity with task demand.

[Diagram: CEFR.AI triangulation model — an equilateral triangle with the CEFR.AI Calibration Engine at its center and three evidence anchors at its corners: professional materials, CEFR.AI Test scores, and linguistic research.]

How We Calibrate

CEFR.AI Placement Scores

Real learner outcomes from CEFR.AI's proprietary assessment.

Take the Test

Open Methodology

Scoring logic, assumptions, and calibration evidence are open for review so researchers can inspect, reproduce, and improve the method over time.

Official Tools

Analyse and Level Test are the first official CEFR.AI applications. They provide practical tools grounded in the same calibration framework and research methodology.

Coming Soon

An open CEFR.AI API and third-party app directory are in development so external teams can build language products powered by CEFR.AI scoring and calibration.

Frameworks We Trust

The CEFR's broad levels (A1–C2) map onto the Global Scale of English (GSE, 10–90), so coarse levels can be calibrated with finer precision.
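One way to picture this mapping is as a lookup from a numeric GSE score to a CEFR band. The sketch below is purely illustrative: the band boundaries shown are approximations, not CEFR.AI's official cut-offs, and the published GSE alignment should be consulted for authoritative values.

```python
def gse_to_cefr(score: float) -> str:
    """Map a GSE score (10-90) to a broad CEFR band.

    Band boundaries here are illustrative approximations only,
    not CEFR.AI's official mapping.
    """
    if not 10 <= score <= 90:
        raise ValueError("GSE scores run from 10 to 90")
    # Each entry: (exclusive upper bound, band label).
    bands = [
        (22, "Pre-A1"),
        (30, "A1"),
        (43, "A2"),
        (59, "B1"),
        (76, "B2"),
        (85, "C1"),
    ]
    for upper, band in bands:
        if score < upper:
            return band
    return "C2"

print(gse_to_cefr(47))  # falls in the B1 range under these illustrative cut-offs
```

The point of the finer scale is visible in the code: a single CEFR band like B1 spans many GSE points, so two "B1" learners can sit a dozen points apart.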

Latest Research

View all posts

2026-03-17

Framework Note

Zone of Proximal Development in Language Learning

The Zone of Proximal Development (ZPD) is one of the most useful ideas in language education, but it is often described too vaguely to guide real decisions. In practical terms, ZPD is the space between what a learner can do independently and what they can do with support. For CEFR.AI,...

Read post

2026-03-17

Method Note

Current Model: Score Engine v1

This note documents the current production scoring model as implemented in the score engine API (`meta.version = legacy-gse-v1`). The goal is methodological transparency: what v1 does well, what it does not do, and what evidence currently supports it. v1 estimates text difficulty from text-only inputs. It combines: - Flesch Reading...

Read post
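The v1 excerpt above names Flesch Reading Ease as one of the engine's inputs. As a quick illustration of what that metric computes, here is the standard Flesch (1948) formula; this is a generic sketch of the public formula, not CEFR.AI's score engine.

```python
def flesch_reading_ease(total_words: int,
                        total_sentences: int,
                        total_syllables: int) -> float:
    """Standard Flesch Reading Ease formula (Flesch, 1948).

    Higher scores mean easier text: roughly 90+ reads as very easy,
    while scores below about 30 read as very difficult.
    """
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

# Example: a 100-word passage in 8 sentences with 130 syllables.
score = flesch_reading_ease(100, 8, 130)
print(round(score, 1))
```

Both terms penalize length: longer sentences and longer words push the score down, which is also why the metric says nothing about vocabulary frequency or learner-specific difficulty, the gap the research note below examines.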

2024-12-26

Research Note

Why Flesch-Kincaid Falls Short for CEFR Classification

Can native-speaker readability metrics really predict CEFR levels? Many online ESL/EFL text analysis tools have defaulted to the Flesch-Kincaid readability index, simply because no specialized algorithms exist for language learners. Using 59 graded texts, we systematically test whether this widely adopted solution actually works. While our investigation does reveal a...

Read post

2023-07-24

Framework Note

The Power of Language Frameworks

Language frameworks are not optional in language learning; they are the minimum structure required for reliable decisions. If we want to match learners with texts and tasks that are challenging but manageable, we need shared standards for what “difficulty” means. This note is a simple introduction to language frameworks: why...

Read post