XR-CareerAssist: An Immersive Platform for Personalised Career Guidance Leveraging Extended Reality and Multimodal AI

A.J. McCracken; E. Papatheou; I. Karachalios; N.D. Tantaroudas; V. Pastrikakis

arxiv: 2604.06901 · v1 · submitted 2026-04-08 · 💻 cs.CE · cs.AI · cs.CV · cs.CY · cs.ET

Pith reviewed 2026-05-10 17:41 UTC · model grok-4.3

classification 💻 cs.CE · cs.AI · cs.CV · cs.CY · cs.ET

keywords extended reality · career guidance · multimodal AI · immersive platforms · personalised advice · virtual reality · AI integration · user evaluation

The pith

The fusion of extended reality with five AI modules produces an immersive platform for personalised career guidance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces XR-CareerAssist as a platform that merges extended reality with several artificial intelligence components to offer immersive career guidance. Users interact through speech with a three-dimensional avatar, view career paths as animated diagrams drawn from a large database of professional histories, and receive support in multiple languages. This setup addresses the limitations of text-only systems by making career exploration more interactive and narrative-focused, potentially increasing user engagement and accessibility. Early testing with a small group indicated strong performance in speech handling and overall satisfaction, pointing to the value of combining these technologies in one environment.

Core claim

XR-CareerAssist demonstrates the integration of Extended Reality with five distinct AI modules (automatic speech recognition, neural machine translation, a conversational training assistant, a vision-language model, and text-to-speech synthesis) within a single immersive application. Built using Unity for the Meta Quest 3 headset and supported by AWS backend services, the platform renders career trajectories as dynamic Sankey diagrams sourced from over 100,000 anonymised professional profiles. A pilot evaluation involving 23 participants yielded high speech recognition accuracy and favourable user ratings for responsiveness and satisfaction, leading the authors to conclude that this multimodal interaction experience distinguishes the platform from existing career guidance platforms.

What carries the argument

The XR-CareerAssist platform, which unifies five AI modules (speech recognition, translation, conversational dialogue, vision-language processing, and speech synthesis) inside an XR environment to enable voice-driven interaction with a 3D avatar and dynamic career data visualizations.
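
The voice loop implied by this description can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the `VoicePipeline` class, its four callables, the `demo_pipeline` stubs, and the choice to pivot every non-English turn through English are all assumptions made here. The vision-language model is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains the speech-side modules; every callable is a hypothetical stand-in."""
    transcribe: Callable[[bytes, str], str]    # ASR: audio + language code -> text
    translate: Callable[[str, str, str], str]  # NMT: text, source, target -> text
    answer: Callable[[str], str]               # assistant, assumed to work in English
    synthesise: Callable[[str, str], bytes]    # TTS: text + language code -> audio

    def handle_turn(self, audio: bytes, user_lang: str) -> bytes:
        """Process one spoken turn and return the avatar's spoken reply."""
        text = self.transcribe(audio, user_lang)
        if user_lang != "en":
            # Assumption: non-English input is translated to English first.
            text = self.translate(text, user_lang, "en")
        reply = self.answer(text)
        if user_lang != "en":
            reply = self.translate(reply, "en", user_lang)
        return self.synthesise(reply, user_lang)


def demo_pipeline() -> VoicePipeline:
    """Stub modules that only tag their input, so the data flow is visible."""
    return VoicePipeline(
        transcribe=lambda audio, lang: audio.decode("utf-8"),
        translate=lambda text, src, dst: f"[{src}->{dst}] {text}",
        answer=lambda question: f"You asked: {question}",
        synthesise=lambda text, lang: text.encode("utf-8"),
    )
```

Keeping each module behind a plain callable means a stub can stand in for a cloud service during testing, which is one plausible way to exercise the chain without a headset.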

If this is right

Career guidance incorporates real profile data through dynamic Sankey diagrams of professional trajectories.
Multilingual voice interaction supports users across English, Greek, French, and Italian without text input.
Deployment on consumer headsets with cloud backend makes the system scalable for wider access.
Pilot feedback directly informs refinements to motion comfort, audio, and text display for better usability.
The single-environment integration of multiple AI tools creates a template for other guidance applications.
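
One way profile histories could be reduced to Sankey links is sketched below. The paper does not describe its aggregation step, so the `sankey_links` function and the input format (each history as an ordered list of role titles) are assumptions for illustration. The simplification here counts direct role-to-role moves and ignores how long each role was held.

```python
from collections import Counter
from typing import Iterable, Sequence


def sankey_links(histories: Iterable[Sequence[str]]) -> list[tuple[str, str, int]]:
    """Count role-to-role transitions across career histories.

    Returns (source, target, weight) triples, heaviest first, which is the
    link format most Sankey plotting libraries expect.
    """
    counts: Counter = Counter()
    for roles in histories:
        # Pair each role with its successor in the same history.
        for source, target in zip(roles, roles[1:]):
            if source != target:  # ignore repeated titles, not real moves
                counts[(source, target)] += 1
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(source, target, weight) for (source, target), weight in ordered]
```

Called on three invented histories such as `["Analyst", "Senior Analyst", "Manager"]`, `["Analyst", "Senior Analyst"]`, and `["Analyst", "Manager"]`, the function gives the Analyst to Senior Analyst link a weight of 2 and the other two links a weight of 1.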

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same combination of XR immersion and modular AI could extend to narrative-driven advice in education planning or financial decisions.
Expanding the underlying profile repository with more recent or diverse data would likely strengthen the personalisation of career paths.
Direct comparisons in future tests could clarify whether the headset format or the specific AI features drive reported satisfaction levels.

Load-bearing premise

The assumption that results from a small uncontrolled pilot study with 23 participants demonstrate greater engagement and effectiveness than conventional platforms without any baseline comparison.

What would settle it

A larger randomised controlled trial that directly compares XR-CareerAssist against standard text-based career platforms on measures such as user retention, clarity of career decisions, or follow-up actions would settle the claim; no advantage for the immersive system would challenge its superiority.
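
A rough sense of how large such a trial would need to be can be had from a standard two-proportion sample-size calculation. The sketch below assumes a binary outcome (for example, whether a participant reports a clear career decision), a two-sided test, and equal arm sizes. The function name `sample_size_per_arm` and any rates passed to it are hypothetical, since the paper reports no baseline figures.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control: float, p_treatment: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants needed per arm to detect a difference in two proportions.

    Uses the normal-approximation formula for a two-sided test with equal
    allocation; both input rates are assumed, not taken from the paper.
    """
    if not (0 < p_control < 1 and 0 < p_treatment < 1):
        raise ValueError("proportions must lie strictly between 0 and 1")
    if p_control == p_treatment:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    p_bar = (p_control + p_treatment) / 2
    pooled = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    unpooled = z_beta * math.sqrt(
        p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    )
    return math.ceil((pooled + unpooled) ** 2 / (p_control - p_treatment) ** 2)
```

Under these assumptions, even a large hypothetical gap such as 60% versus 80% calls for several dozen participants in each arm, well above the 23 in the pilot.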

Figures

Figures reproduced from arXiv: 2604.06901 by A.J. McCracken, E. Papatheou, I. Karachalios, N.D. Tantaroudas, V. Pastrikakis.

**Figure 2.** Envisioned scenario of XR-CareerAssist showing the complete user journey and AI integration [PITH_FULL_IMAGE:figures/full_fig_p006_2.png]

**Figure 3.** XR ASR pipeline: English voice input to text transcription and NMT translation to Italian. The [PITH_FULL_IMAGE:figures/full_fig_p008_3.png]

**Figure 4.** VL model extended through fine-tuning to respond to Sankey diagram queries. Two examples [PITH_FULL_IMAGE:figures/full_fig_p009_4.png]

**Figure 5.** Dialogue system training manual for XR-CareerAssist, providing structured guidance for interact [PITH_FULL_IMAGE:figures/full_fig_p009_5.png]

**Figure 6.** Local testing of PIPER TTS with XR-CareerAssist backend, demonstrating the initial evaluation [PITH_FULL_IMAGE:figures/full_fig_p010_6.png]

**Figure 7.** Sample Sankey diagram generated from CVCOSMOS data for a user profile input with 25 years [PITH_FULL_IMAGE:figures/full_fig_p011_7.png]

**Figure 8.** Career map concept note for location shift target showing detailed metrics for career progression [PITH_FULL_IMAGE:figures/full_fig_p012_8.png]

**Figure 9.** Job role evolution career map for specified user input showing 10-year progression. The left column [PITH_FULL_IMAGE:figures/full_fig_p012_9.png]

**Figure 10.** Industry shift evolution career map for specified user input over 10 years, showing how profes [PITH_FULL_IMAGE:figures/full_fig_p013_10.png]

**Figure 11.** Questionnaire flow within the XR environment showing two sequential input screens: role selection [PITH_FULL_IMAGE:figures/full_fig_p013_11.png]

**Figure 12.** XR-CareerAssist user interface within the XR environment as experienced on the Meta Quest [PITH_FULL_IMAGE:figures/full_fig_p014_12.png]

**Figure 13.** Load testing results for 10,000 concurrent users. The top graph shows sustained throughput of [PITH_FULL_IMAGE:figures/full_fig_p015_13.png]

**Figure 14.** Pilot demonstration setup at the University of Exeter showing the equipment arrangement and [PITH_FULL_IMAGE:figures/full_fig_p016_14.png]

read the original abstract

Conventional career guidance platforms rely on static, text-driven interfaces that struggle to engage users or deliver personalised, evidence-based insights. Although Computer-Assisted Career Guidance Systems have evolved since the 1960s, they remain limited in interactivity and pay little attention to the narrative dimensions of career development. We introduce XR-CareerAssist, a platform that unifies Extended Reality (XR) with several Artificial Intelligence (AI) modules to deliver immersive, multilingual career guidance. The system integrates Automatic Speech Recognition for voice-driven interaction, Neural Machine Translation across English, Greek, French, and Italian, a Langchain-based conversational Training Assistant for personalised dialogue, a BLIP-based Vision-Language model for career visualisations, and AWS Polly Text-to-Speech delivered through an interactive 3D avatar. Career trajectories are rendered as dynamic Sankey diagrams derived from a repository of more than 100,000 anonymised professional profiles. The application was built in Unity for Meta Quest 3, with backend services hosted on AWS. A pilot evaluation at the University of Exeter with 23 participants returned 95.6% speech recognition accuracy, 78.3% overall user satisfaction, and 91.3% favourable ratings for system responsiveness, with feedback informing subsequent improvements to motion comfort, audio clarity, and text legibility. XR-CareerAssist demonstrates how the fusion of XR and AI can produce more engaging, accessible, and effective career development tools, with the integration of five AI modules within a single immersive environment yielding a multimodal interaction experience that distinguishes it from existing career guidance platforms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

XR-CareerAssist is a working Unity prototype that bundles standard AI tools into an XR career advisor with a small pilot, but the evaluation gives only weak support for claims of superior engagement or effectiveness.

The paper delivers a concrete system rather than a new algorithm or theory. The authors wired Automatic Speech Recognition, neural machine translation across four languages, a Langchain conversational agent, BLIP vision-language processing, and AWS Polly TTS into a single Unity app running on Meta Quest 3, with Sankey diagrams drawn from over 100,000 real career profiles. That specific assembly for immersive guidance is new enough to be worth describing.

Referee Report

1 major / 1 minor

Summary. The paper introduces XR-CareerAssist, a Unity-based Meta Quest 3 platform that integrates five AI modules—Automatic Speech Recognition, Neural Machine Translation (English/Greek/French/Italian), a Langchain conversational Training Assistant, a BLIP vision-language model for career visualisations, and AWS Polly TTS via a 3D avatar—within an immersive XR environment. Career trajectories are rendered as dynamic Sankey diagrams derived from a repository of more than 100,000 anonymised professional profiles. The system is described in detail with AWS backend services, followed by a pilot evaluation at the University of Exeter involving 23 participants that reports 95.6% speech recognition accuracy, 78.3% overall user satisfaction, and 91.3% favourable responsiveness ratings, leading to the claim that this XR-AI fusion produces more engaging, accessible, and effective career development tools that distinguish it from existing platforms.

Significance. If the central claims hold, the work provides a concrete demonstration of technical feasibility in fusing multiple AI modalities (speech, translation, dialogue, vision, and synthesis) into a single XR interface for career guidance, extending beyond static text-based systems by incorporating narrative and visual elements via Sankey diagrams. The clear reporting of pilot metrics offers initial evidence of usability and technical performance that could inform future multimodal XR applications in education and professional development.

major comments (1)

[Abstract] The assertion that the platform produces 'more engaging, accessible, and effective career development tools' and yields 'a multimodal interaction experience that distinguishes it from existing career guidance platforms' is not supported by the pilot evaluation, which provides only absolute metrics (95.6% speech accuracy, 78.3% satisfaction, 91.3% responsiveness) without any control arm, baseline comparison to conventional text-driven platforms, pre/post measures, or objective engagement indicators such as session duration or insight retention.

minor comments (1)

[Evaluation] The pilot evaluation section would benefit from explicit details on participant demographics, recruitment procedures, exact questionnaire items, and any statistical analysis performed on the reported percentages.
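
The kind of statistical analysis requested here can be illustrated with a confidence interval on a reported percentage. The sketch below assumes the satisfaction and responsiveness figures are simple proportions of the 23 participants (78.3% is consistent with 18 of 23, and 91.3% with 21 of 23). The paper does not state these counts, so they are an inference, and `wilson_interval` is a name chosen here.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Assumed counts inferred from the reported percentages, not stated in the paper.
satisfaction_low, satisfaction_high = wilson_interval(18, 23)
responsiveness_low, responsiveness_high = wilson_interval(21, 23)
```

Under that assumption the 95% interval for overall satisfaction runs from roughly 58% to 90%, which shows how little a sample of 23 constrains the underlying rate.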

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback and the recommendation for major revision. We agree that the abstract's comparative claims exceed what the pilot data can support and will revise the manuscript to align language with the evidence presented.

Referee: [Abstract] The assertion that the platform produces 'more engaging, accessible, and effective career development tools' and yields 'a multimodal interaction experience that distinguishes it from existing career guidance platforms' is not supported by the pilot evaluation, which provides only absolute metrics (95.6% speech accuracy, 78.3% satisfaction, 91.3% responsiveness) without any control arm, baseline comparison to conventional text-driven platforms, pre/post measures, or objective engagement indicators such as session duration or insight retention.

Authors: We concur with the referee that the pilot provides only absolute performance and satisfaction metrics from 23 participants without a control condition, baseline comparison to text-based systems, or objective engagement measures. The study was conceived as an initial feasibility demonstration of the technical integration rather than a comparative efficacy trial. We will therefore revise the abstract to remove the unsubstantiated comparative assertions. The revised wording will describe the platform's multimodal XR-AI architecture and report the pilot outcomes as preliminary evidence of technical viability and user acceptance, while explicitly noting the absence of comparative data. We will also expand the discussion section to acknowledge this limitation and outline plans for future controlled studies that include baseline comparisons and objective metrics such as session duration and insight retention.

Revision: yes

Circularity Check

0 steps flagged

No circularity: paper describes system construction and uncontrolled pilot evaluation with no derivations, equations, or fitted predictions

full rationale

The manuscript presents the design and implementation of XR-CareerAssist (Unity/Meta Quest 3 with five integrated AI modules: ASR, NMT, Langchain assistant, BLIP vision-language, and TTS avatar) plus a small pilot (n=23) reporting satisfaction and accuracy percentages. No mathematical derivation chain, parameter fitting, or predictive claims exist that could reduce to inputs by construction. Self-citations, if present, are not load-bearing for any uniqueness theorem or ansatz. The comparative claims of superiority rest on unblinded feedback rather than circular logic.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a descriptive paper on a software platform and user study. No mathematical models, derivations, or theoretical constructs are present that would require free parameters, axioms, or invented entities.

