14 May 2026
The Centre for Media and Communication Research (CMCR) at the School of Communication, Hong Kong Baptist University (HKBU), hosted a research talk titled “Beyond the Black Box: Human-Centered Explainability for Online Safety” on 14 May 2026. The event featured Professor Roy Ka-Wei Lee, Associate Professor and Associate Head (Research) at the Information Systems Technology and Design Pillar, Singapore University of Technology and Design (SUTD). Held in CVA504, the seminar brought together faculty members and research postgraduate students to discuss the role of explainable AI in online safety and content moderation.
In his talk, Professor Lee examined how artificial intelligence is increasingly involved in high-stakes decisions about online speech, including hate speech, harassment, misinformation, and other forms of harmful content. He noted that while automated moderation systems have become essential for large-scale platforms, accuracy alone is insufficient. Human moderators, policy teams, and affected users also need explanations that are understandable, actionable, and contestable.
Professor Lee began by outlining the practical workflow of online content moderation, from automated detection and platform-level enforcement actions to user appeals and human review. He emphasized that human judgment remains critical, particularly in cases involving ambiguity, cultural context, implicit meaning, or borderline content. However, relying solely on human moderators does not scale, and it can impose substantial psychological burdens on those who must repeatedly review harmful material.
Against this backdrop, Professor Lee introduced his research on using AI-generated explanations to support content moderation. Drawing on studies of text-based hate speech detection, he discussed how large language models can generate explanations for why certain content may be considered harmful. His findings suggest that such explanations can be fluent and informative, but they may also influence human judgments in unintended ways. Depending on how prompts are designed, AI-generated explanations may steer users or moderators toward particular interpretations, which highlights both the promise and the risk of explainable AI in real-world moderation settings.
The talk further addressed the growing complexity of online content across text, images, memes, and videos. Professor Lee explained that harmful meanings often emerge not from a single textual or visual element, but from the interaction between language, imagery, cultural references, and platform context. This is especially challenging in multilingual and multicultural environments, where moderators or models may not always understand the relevant social background, humor, or implicit references.
Professor Lee also presented ongoing work on multimodal and video-based content moderation. His research explores how AI systems can process textual, visual, and audio information, identify potentially harmful segments, and provide contextual explanations for moderation decisions. He argued that future online safety systems should move beyond black-box classification and instead provide evidence-based, human-centered explanations that can support more accountable decision-making.
Following the presentation, attendees engaged in a thoughtful discussion with Professor Lee on the importance of context in interpreting online content. Questions addressed whether models and annotators should consider broader conversational context, how cultural knowledge affects judgments of offensiveness, and how humor, sarcasm, and local references complicate moderation decisions. Professor Lee noted that while providing fuller context may improve interpretation, it also increases the cost and complexity of annotation and system design.
The seminar offered a valuable platform for interdisciplinary exchange across communication, computational social science, and AI research. It highlighted the urgent need to design online safety technologies that are not only technically accurate, but also transparent, culturally sensitive, and responsive to human needs.