Measuring AI Literacy
August 17, 2025
Measuring AI literacy can be challenging. In this post, I summarize four recent articles on the topic, describe how each developed and validated an instrument for measuring AI literacy, and connect them to foundational learning theories.
Big picture
What’s being measured: AI literacy spans factual/procedural knowledge (e.g., “What is AI?”, “How does it work?”) and evaluative/ethical understanding; some instruments also add self-efficacy and self-regulation components.
Psychometrics: Recent instruments show good structural validity and internal consistency, but many lack thorough content validity, cross-cultural invariance, and responsiveness testing; performance-based measures are scarce.
Why this matters for learning theory:
The knowledge components map cleanly to Information Processing (schema building) and Bloom’s taxonomy (from remembering/understanding to evaluating/ethics).
Self-efficacy, emotion regulation, and planning reflect Bandura’s social cognitive theory and self-regulated learning, shaping persistence and transfer beyond instruction.
Short vs. long tests trade precision for efficiency—an applied cognitive-load decision when assessments must fit real class/research time.
1) What do university students know about AI? (Hornberger et al., 2023)
What it is: A 31-item performance-based test aligned to Long & Magerko’s five areas (What is AI?; What can AI do?; How does AI work?; How should AI be used?; How do people perceive AI?). Table 1 lays out the mapping.
Psychometrics: IRT analysis supported unidimensionality with strong model fit; the 3-PL model fit best (cutoffs and model comparisons are reported in the paper). The standard 3-PL item response function is sketched below.
Learning-theory link:
The single latent trait aligns with information-processing views—a coherent schema of AI knowledge; item-level modeling (IRT) supports mastery and adaptive diagnostics.
Reported relations to self-efficacy/interest/attitudes connect to motivation and SRL, predicting engagement and future learning behaviors.
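For reference, here is the standard three-parameter logistic (3-PL) item response function that the paper's IRT analysis relies on; the symbols are generic, not values reported in the article:

\[
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}
\]

where \(\theta\) is the test-taker's ability, \(a_i\) the item's discrimination, \(b_i\) its difficulty, and \(c_i\) its lower asymptote (pseudo-guessing). Items with well-spread \(b_i\) values are what make the test informative across the ability range.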
2) Development and validation of a short AI literacy test (AILIT-S) (Hornberger et al., 2025)
What it is: A 10-item short form derived from the long test, keeping coverage across the same five competency areas (see Table 2) and targeting <5-minute administration.
Psychometrics: Maintains unidimensionality with good fit; reliability is lower than the long form's, as expected for a 10-item test; congruent validity with the long test is high (r = .91; Fig. 5). A Spearman-Brown illustration of the reliability trade-off follows this section.
Use guidance: Recommended mainly for group-level estimates (screening, pre/post in courses) given the reliability trade-off.
Learning-theory link:
Prioritizes cognitive-load management (short, time-efficient) while preserving core schema coverage—useful for retrieval practice and repeated measures in instructional cycles.
The short-form’s item selection by difficulty/discrimination is consistent with mastery learning and ZPD ideas—items span ability levels to inform instruction.
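As a rough illustration of why reliability drops when a test is shortened from 31 to 10 items, the Spearman-Brown prophecy formula gives the expected reliability \(\rho^{*}\) of the shortened form:

\[
\rho^{*} = \frac{k\rho}{1 + (k-1)\rho}, \qquad k = \tfrac{10}{31} \approx 0.32
\]

With a purely hypothetical long-form reliability of \(\rho = .85\) (not a value from the paper), this yields \(\rho^{*} \approx .65\), which is consistent with the authors' advice to use the short form mainly for group-level estimates rather than individual diagnosis.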
3) MAILS – Meta AI Literacy Scale (Carolus et al., 2023)
What it is: A self-report instrument with facets for Use & Apply, Understand, Detect, and Ethics, plus Create AI as a related but separately modeled construct, and two psychological competencies: AI self-efficacy (learning/problem-solving) and AI self-competency (persuasion literacy, emotion regulation); validated with CFA/SEM.
Key findings: Evidence supports a higher-order AI literacy factor, with Create AI modeled separately; the self-efficacy/competency clusters are distinct yet correlated. A hypothetical CFA specification of this structure is sketched below.
Learning-theory link:
Explicit grounding in Bloom’s taxonomy for competency structuring.
Strong integration of Theory of Planned Behavior and Bandura’s self-efficacy, highlighting how perceived control, emotion regulation, and learning strategies drive long-term adoption—core to self-regulated learning.
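To make the higher-order structure concrete, here is a minimal, hypothetical CFA specification in Python with the semopy package. The item names (ua1, un1, ...) and the three-items-per-facet layout are placeholders rather than the actual MAILS items, and the lavaan-style syntax is an assumption about how one might re-specify the model, not the authors' code:

```python
import pandas as pd
from semopy import Model, calc_stats

# Hypothetical higher-order model mirroring the reported MAILS structure:
# four first-order facets load on a general AI-literacy factor, while
# Create AI is kept as a separate factor that is allowed to correlate with it.
MODEL_DESC = """
UseApply   =~ ua1 + ua2 + ua3
Understand =~ un1 + un2 + un3
Detect     =~ de1 + de2 + de3
Ethics     =~ et1 + et2 + et3
CreateAI   =~ cr1 + cr2 + cr3

AILiteracy =~ UseApply + Understand + Detect + Ethics
AILiteracy ~~ CreateAI
"""

def fit_mails_like_model(responses: pd.DataFrame) -> pd.DataFrame:
    """Fit the hypothetical model to item-level responses; return fit indices."""
    model = Model(MODEL_DESC)
    model.fit(responses)      # maximum-likelihood estimation by default
    return calc_stats(model)  # CFI, TLI, RMSEA, and friends
```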
4) A systematic review of AI literacy scales (Lintner, 2024; npj Science of Learning)
Scope & method: 22 studies, 16 scales; assessed with COSMIN and GRADE principles; inclusion and evaluation criteria detailed.
Findings:
Overall structural validity/internal consistency are good; content validity, reliability (test–retest), responsiveness, cross-cultural validity, and measurement error are under-reported.
Only 3 performance-based vs 13 self-report scales; calls for more performance-based tools and cross-validation between formats.
For higher ed, the AI Literacy Test (performance-based) and ChatGPT Literacy show the strongest current evidence; Intelligent TPACK is presently the main option for assessing teachers' readiness.
Learning-theory link:
The review’s critique (few performance tasks; limited invariance) matters because authentic performance better taps procedural knowledge and transfer (constructivist/experiential learning) than self-beliefs alone—important when instruction aims to change long-term behavior and competence schemas.
Practical takeaways for instruction & assessment
Choose format by purpose:
Performance-based (e.g., long test; AILIT-S for quick checks) for learning gains and schema-level understanding; lean on IRT-based forms for pre/post with item difficulty spread.
Self-report (MAILS) to monitor self-efficacy, emotion regulation, and attitudes that mediate usage and transfer; pair with a knowledge test to align motivation + mastery.
Design for cognitive load: Prefer short, reliable screeners (AILIT-S) for frequent formative checkpoints; reserve the long test for high-stakes diagnostics.
Map to Bloom + SRL: Build modules that progress from remember/understand → apply/evaluate/ethics, while teaching planning, monitoring, and emotion regulation to elevate self-efficacy (predictor of sustained AI use).
Mind validity gaps: If you work across contexts, plan for measurement invariance checks (e.g., DIF by major/level) or use instruments with emerging evidence in your population.
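To make the invariance point actionable, here is a minimal sketch of a logistic-regression DIF check (in the spirit of Swaminathan & Rogers) for one dichotomously scored item in Python; the column names (total_score, major_group) and the two-group split are hypothetical, and a real analysis would loop over all items and correct for multiple comparisons:

```python
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

def dif_check(df: pd.DataFrame, item: str, total: str = "total_score",
              group: str = "major_group") -> dict:
    """Likelihood-ratio DIF test for one 0/1-scored item.

    Compares a model using only the total score (matching criterion) against
    one that adds group membership and its interaction with the total score;
    a significant result flags uniform and/or non-uniform DIF.
    """
    base = smf.logit(f"{item} ~ {total}", data=df).fit(disp=0)
    full = smf.logit(f"{item} ~ {total} + C({group}) + {total}:C({group})",
                     data=df).fit(disp=0)
    lr = 2 * (full.llf - base.llf)           # likelihood-ratio statistic
    df_diff = full.df_model - base.df_model  # number of added parameters
    return {"LR": lr, "df": df_diff, "p": stats.chi2.sf(lr, df_diff)}
```

Running dif_check(df, "item_07") on pre/post data, for example, would flag items that function differently across majors even after matching on overall ability.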
References
Carolus, A., Koch, M. J., Straka, S., Latoschik, M. E., & Wienrich, C. (2023). MAILS – meta AI literacy scale: Development and testing of an AI literacy questionnaire based on well-founded competency models and psychological change- and meta-competencies. Computers in Human Behavior: Artificial Humans, 1(2), 100014. https://doi.org/10.1016/j.chbah.2023.100014
Hornberger, M., Bewersdorff, A., & Nerdel, C. (2023). What do university students know about artificial intelligence? Development and validation of an AI literacy test. Computers and Education: Artificial Intelligence, 5, 100165. https://doi.org/10.1016/j.caeai.2023.100165
Hornberger, M., Bewersdorff, A., Schiff, D. S., & Nerdel, C. (2025). Development and validation of a short AI literacy test (AILIT-S) for university students. Computers in Human Behavior: Artificial Humans, 5, 100176. https://doi.org/10.1016/j.chbah.2025.100176
Lintner, T. (2024). A systematic review of AI literacy scales. npj Science of Learning, 9(1), 50. https://doi.org/10.1038/s41539-024-00264-4

