Digital Library
-
Aryadoust, V., Zakaria, A., & Jia, Y. (2024). Investigating the affordances of OpenAI’s large language model in developing listening assessments. Computers and Education: Artificial Intelligence, 6, 100204. https://doi.org/10.1016/j.caeai.2024.100204
Tags: listening, task and item generation
-
Attali, Y., Runge, A., LaFlair, G. T., Yancey, K., Goodwin, S., Park, Y., & von Davier, A. A. (2022). The interactive reading task: Transformer-based automatic item generation. Frontiers in Artificial Intelligence, 5, 903077. https://doi.org/10.3389/frai.2022.903077
Tags: reading, task and item generation
-
Belzak, W. C. M., Naismith, B., & Burstein, J. (2023). Ensuring Fairness of Human- and AI-Generated Test Items. In N. Wang, G. Rebolledo-Mendez, V. Dimitrova, N. Matsuda, & O. C. Santos (Eds.), Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky. AIED 2023. Communications in Computer and Information Science, 1831. Springer, Cham. https://doi.org/10.1007/978-3-031-36336-8_108
Tags: fairness, bias, differential item functioning, task and item generation
-
Bezirhan, U., & von Davier, M. (2023). Automated reading passage generation with OpenAI’s large language model. Computers and Education: Artificial Intelligence, 5, 100161. https://doi.org/10.1016/j.caeai.2023.100161
Tags: text generation, reading
-
Bolender, B., Foster, C., & Vispoel, S. (2023). The Criticality of Implementing Principled Design When Using AI Technologies in Test Development. Language Assessment Quarterly, 20(4-5), 512-519. https://doi.org/10.1080/15434303.2023.2288266
Tags: task and item generation, test development
-
Choi, I., & Zu, J. (2022). The impact of using synthetically generated listening stimuli on test-taker performance: A case study with multiple-choice, single-selection items. ETS Research Report Series, 2022(1), 1–14. https://doi.org/10.1002/ets2.12347
Tags: text generation, listening
-
Felice, M., Taslimipoor, S., & Buttery, P. (2022). Constructing Open Cloze Tests Using Generation and Discrimination Capabilities of Transformers. In Findings of the Association for Computational Linguistics. Association for Computational Linguistics. https://arxiv.org/pdf/2204.07237.pdf
Tags: AIG
-
Ferrara, S., & Qunbar, S. (2022). Validity arguments for AI‐based automated scores: Essay scoring as an illustration. Journal of Educational Measurement, 59(3), 288–313. https://doi.org/10.1111/jedm.12333
Tags: writing, scoring, validity
-
Hannah, L., Kim, H., & Jang, E. (2022). Investigating the Effects of Task Type and Linguistic Background on Accuracy in Automated Speech Recognition Systems: Implications for Use in Language Assessment of Young Learners. Language Assessment Quarterly, 19(3), 289-313. https://doi.org/10.1080/15434303.2022.2038172
Tags: scoring, speaking, young learners
-
Isaacs, T., Hu, R., Trenkic, D., & Varga, J. (2023). Examining the predictive validity of the Duolingo English Test: Evidence from a major UK university. Language Testing, 40(3), 748-770. https://doi.org/10.1177/02655322231158550
Tags: criterion related evidence, predictive validity
-
Jin, Y., & Fan, J. (2023). Test-Taker Engagement in AI Technology-Mediated Language Assessment. Language Assessment Quarterly, 20(4-5), 488-500. https://doi.org/10.1080/15434303.2023.2291731
Tags: test taker engagement, validation, test development
-
Khademi, A. (2023). Can ChatGPT and Bard generate aligned assessment items? A reliability analysis against human performance. Journal of Applied Learning and Teaching, 6(1), 75-80. https://journals.sfu.ca/jalt/index.php/jalt/article/view/783
Tags: writing, task and item generation
-
LaFlair, G., Runge, A., Attali, Y., Park, Y., Church, J., & Goodwin, S. (2023). Interactive listening–The Duolingo English Test (Duolingo Research Report DRR-23-01; pp. 1–17). Duolingo. https://go.duolingo.com/interactive-listening-whitepaper
Tags: task design, listening
-
Mizumoto, A., & Eguchi, M. (2023). Exploring the potential of using an AI language model for automated essay scoring. Research Methods in Applied Linguistics, 2(2), 100050. https://doi.org/10.1016/j.rmal.2023.100050
Tags: scoring, writing
-
O’Grady, S. (2023). An AI Generated Test of Pragmatic Competence and Connected Speech. Language Teaching Research Quarterly, 37, 188-203. https://doi.org/10.32038/ltrq.2023.37.10
Tags: listening, task and item generation
-
O’Sullivan, B. (2023). Reflections on the Application and Validation of Technology in Language Testing. Language Assessment Quarterly, 20(4-5), 501-511. https://doi.org/10.1080/15434303.2023.2291486
Tags: validity, validation
-
O’Sullivan, B., Breakspear, T., & Bayliss, W. (2023). Validating an AI-driven scoring system: The Model Card approach. In K. Sadeghi & D. Douglas (Eds.), Fundamental considerations in technology mediated language assessment (pp. 115-134). Routledge.
Tags: scoring
-
Park, Y., Cardwell, R., Goodwin, S., Naismith, B., LaFlair, G., Lo, K., & Yancey, K. (2023). Assessing speaking on the Duolingo English Test (Duolingo Research Report DRR-23-03; pp. 1-15). Duolingo. https://duolingo-testcenter.s3.amazonaws.com/media/resources/speaking-whitepaper.pdf
Tags: task design, speaking
-
Sayin, A., & Gierl, M. (2024). Using OpenAI GPT to generate reading comprehension items. Educational Measurement: Issues and Practice, Early view. https://doi.org/10.1111/emip.12590
Tags: item generation, reading
-
Shi, H., & Aryadoust, V. (2024). A systematic review of AI-based automated written feedback research. ReCALL, 1–23. https://www.cambridge.org/core/journals/recall/article/systematic-review-of-aibased-automated-written-feedback-research/28A670C4C7F2F1F30C7EA36EC489F867
Tags: feedback
-
Shin, I., & Gierl, M. (2022). Generating reading comprehension items using automated processes. International Journal of Testing, 22(3-4), 289-311. https://doi.org/10.1080/15305058.2022.2070755
Tags: item generation, reading
-
Van Moere, A., & Downey, R. (2016). Technology and artificial intelligence in language assessment. In Handbook of second language assessment (pp. 341-358).
Tags: scoring
-
Voss, E., Cushing, S., Ockey, G., & Yan, X. (2023). The Use of Assistive Technologies Including Generative AI by Test Takers in Language Assessment: A Debate of Theory and Practice. Language Assessment Quarterly, 20(4-5), 520-532. https://doi.org/10.1080/15434303.2023.2288256
Tags: construct definition, scoring and rubric design, validity, fairness, equity, bias, copyright
-
Xi, X. (2023). Advancing Language Assessment with AI and ML–Leaning into AI is Inevitable, but Can Theory Keep Up? Language Assessment Quarterly, 20(4-5), 357-376. https://doi.org/10.1080/15434303.2023.2291488
Tags: validity
-
Xi, X. (2022). Validity and the automated scoring of performance tests. In G. Fulcher & L. Harding (Eds.), The Routledge Handbook of Language Testing (pp. 513-529). Routledge.
Tags: scoring, validity
-
Yunjiu, L., Wei, W., & Zheng, Y. (2022). Artificial intelligence-generated and human expert-designed vocabulary tests: A comparative study. SAGE Open, 12(1). https://doi.org/10.1177/21582440221082130
Tags: vocabulary, task and item generation
-
Zhao, R., Zhuang, Y., Zou, D., et al. (2023). AI-assisted automated scoring of picture-cued writing tasks for language assessment. Education and Information Technologies, 28, 7031–7063. https://doi.org/10.1007/s10639-022-11473-y
Tags: writing, scoring
