UCLA HCI Research

OOPrompt: Reifying Intents into Structured Artifacts for Modular and Iterative Prompting

Tengyou Xu, UCLA HCI Research
Detao Ma, UCLA
Xiang 'Anthony' Chen, UCLA HCI Research

The rise of large language models (LLMs) has produced a class of prompt-based interactive systems in which users primarily express their input in natural language. However, composing a prompt as a linear text string becomes unwieldy when capturing users’ multifaceted intents. We present Object-Oriented Prompting (OOPrompt), an emergent interaction paradigm that enables users to create, edit, iterate, and reuse prompts as structured, manipulable artifacts, unifying and generalizing several existing point systems. We first outlined a design space from existing work and built an early prototype, which we deployed as a probe in a formative study with 20 participants. Their feedback informed an expanded OOPrompt design space. We then developed the full OOPrompt prototype and conducted a validation study to further understand OOPrompt’s added value and trade-offs. We expect the OOPrompt design space to provide theoretical and empirical guidance to the design and engineering of prompt-based, LLM-enabled interactive systems.

Tengyou Xu, Detao Ma, and Xiang 'Anthony' Chen. 2026. OOPrompt: Reifying Intents into Structured Artifacts for Modular and Iterative Prompting. Proc. ACM Hum.-Comput. Interact. 10, 4, Article EICS009 (June 2026), 30 pages. https://doi.org/10.1145/3816761

@article{10.1145/3816761,
author = {Xu, Tengyou and Ma, Detao and Chen, Xiang 'Anthony'},
title = {OOPrompt: Reifying Intents into Structured Artifacts for Modular and Iterative Prompting},
year = {2026},
issue_date = {June 2026},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {10},
number = {4},
url = {https://doi.org/10.1145/3816761},
doi = {10.1145/3816761},
journal = {Proc. ACM Hum.-Comput. Interact.},
month = jun,
articleno = {EICS009},
numpages = {30},
keywords = {Large language models; human-AI interaction; structured prompting; design space}
}

Behavioral Indicators of Overreliance During Interaction with Conversational Language Models

Chang Liu, Tsinghua University
Qinyi Zhou, Hong Kong University of Science and Technology
Xinjie Shen, Georgia Institute of Technology
Xingyu Bruce Liu, UCLA HCI Research
Tongshuang Wu, Carnegie Mellon University HCII
Xiang 'Anthony' Chen, UCLA HCI Research

LLMs are now embedded in a wide range of everyday scenarios. However, their inherent hallucinations risk hiding misinformation in fluent responses, raising concerns about overreliance on AI. Detecting overreliance is challenging, as it often arises in complex, dynamic contexts and cannot be easily captured by post-hoc task outcomes. In this work, we aim to investigate how users’ behavioral patterns correlate with overreliance. We collected interaction logs from 77 participants working with an LLM injected with plausible misinformation across three real-world tasks, and we assessed overreliance by whether participants detected and corrected these errors. By semantically encoding and clustering segments of user interactions, we identified five behavioral patterns linked to overreliance: users with low overreliance show careful task comprehension and fine-grained navigation; users with high overreliance show frequent copy-paste, skipping initial comprehension, repeated LLM references, coarse locating, and accepting misinformation despite hesitation. We discuss design implications for mitigation.

Chang Liu, Qinyi Zhou, Xinjie Shen, Xingyu Bruce Liu, Tongshuang Wu, and Xiang 'Anthony' Chen. 2026. Behavioral Indicators of Overreliance During Interaction with Conversational Language Models. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI '26). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3772318.3790332

@inproceedings{liu2026behavioral,
title={Behavioral indicators of overreliance during interaction with conversational language models},
author={Liu, Chang and Zhou, Qinyi and Shen, Xinjie and Liu, Xingyu Bruce and Wu, Tongshuang and Chen, Xiang 'Anthony'},
booktitle={Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems},
pages={1--23},
year={2026}
}

RAGPPI: RAG Benchmark for Protein-Protein Interactions in Drug Discovery

Youngseung Jeon, UCLA HCI Research
Ziwen Li, UCLA HCI Research
Thomas Li, Palo Alto High School
JiaSyuan Chang, UCLA HCI Research
Morteza Ziyadi, Amazon AGI
Xiang 'Anthony' Chen, UCLA HCI Research

Retrieving the biological impacts of protein-protein interactions (PPIs) is essential for target identification (Target ID) in drug development. Given the vast number of proteins involved, this process remains time-consuming and challenging. Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) frameworks have supported Target ID; however, no benchmark currently exists for identifying the biological impacts of PPIs. To bridge this gap, we introduce the RAG Benchmark for PPIs (RAGPPI), a factual question-answer benchmark of 4,420 question-answer pairs that focus on the potential biological impacts of PPIs. Through interviews with experts, we identified criteria for a benchmark dataset, such as a type of QA and source. We built a gold-standard dataset (500 QA pairs) through expert-driven data annotation. We developed an ensemble auto-evaluation LLM that incorporates expert labeling characteristics, average fact–abstract similarity (F1), and low-similarity fact counts (F2), enabling the construction of a silver-standard dataset (3,720 QA pairs).

Youngseung Jeon, Ziwen Li, Thomas Li, JiaSyuan Chang, Morteza Ziyadi, and Xiang 'Anthony' Chen. RAGPPI: RAG Benchmark for Protein-Protein Interactions in Drug Discovery. arXiv preprint arXiv:2505.23823 (2025).

@article{jeon2025ragppi,
title={RAGPPI: RAG Benchmark for Protein-Protein Interactions in Drug Discovery},
author={Jeon, Youngseung and Li, Ziwen and Li, Thomas and Chang, JiaSyuan and Ziyadi, Morteza and Chen, Xiang 'Anthony'},
journal={arXiv preprint arXiv:2505.23823},
year={2025}
}

The GenUI Study: Exploring the Design of Generative UI Tools to Support UX Practitioners and Beyond

Xiang 'Anthony' Chen, UCLA HCI Research
Tiffany Knearem, Google
Yang Li, Google DeepMind

AI can now generate high-fidelity UI mock-up screens from a high-level textual description, promising to support UX practitioners’ work. However, it remains unclear how UX practitioners would adopt such Generative UI (GenUI) models in a way that is integral and beneficial to their work. To answer this question, we conducted a formative study with 37 UX-related professionals that consisted of four roles: UX designers, UX researchers, software engineers, and product managers. Using a state-of-the-art GenUI tool, each participant went through a week-long, individual mini-project exercise with role-specific tasks, keeping a daily journal of their usage and experiences with GenUI, followed by a semi-structured interview. We report findings on participants’ workflow using the GenUI tool, how GenUI can support all and each specific roles, and existing gaps between GenUI and users’ needs and expectations, which lead to design implications to inform future work on GenUI development.

Xiang 'Anthony' Chen, Tiffany Knearem, and Yang Li. 2025. The GenUI Study: Exploring the Design of Generative UI Tools to Support UX Practitioners and Beyond. In Proceedings of the 2025 ACM Designing Interactive Systems Conference (DIS '25). Association for Computing Machinery, New York, NY, USA, 1179–1196. https://doi.org/10.1145/3715336.3735780

@inproceedings{10.1145/3715336.3735780,
author = {Chen, Xiang 'Anthony' and Knearem, Tiffany and Li, Yang},
title = {The GenUI Study: Exploring the Design of Generative UI Tools to Support UX Practitioners and Beyond},
year = {2025},
isbn = {9798400714856},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3715336.3735780},
doi = {10.1145/3715336.3735780},
abstract = {AI can now generate high-fidelity UI mock-up screens from a high-level textual description, promising to support UX practitioners’ work. However, it remains unclear how UX practitioners would adopt such Generative UI (GenUI) models in a way that is integral and beneficial to their work. To answer this question, we conducted a formative study with 37 UX-related professionals that consisted of four roles: UX designers, UX researchers, software engineers, and product managers. Using a state-of-the-art GenUI tool, each participant went through a week-long, individual mini-project exercise with role-specific tasks, keeping a daily journal of their usage and experiences with GenUI, followed by a semi-structured interview. We report findings on participants’ workflow using the GenUI tool, how GenUI can support all and each specific roles, and existing gaps between GenUI and users’ needs and expectations, which lead to design implications to inform future work on GenUI development.},
booktitle = {Proceedings of the 2025 ACM Designing Interactive Systems Conference},
pages = {1179--1196},
numpages = {18},
keywords = {GenUI, Generative AI, User Experience Design, Diary Study},
location = {
},
series = {DIS '25}
}

Proactive Conversational Agents with Inner Thoughts

Xingyu Bruce Liu, UCLA HCI Research
Shitao Fang, The University of Tokyo
Weiyan Shi, Northeastern University
Chien-Sheng Wu, Salesforce AI
Takeo Igarashi, The University of Tokyo
Xiang 'Anthony' Chen, UCLA HCI Research

One of the long-standing aspirations in conversational AI is to allow them to autonomously take initiatives in conversations, i.e., being proactive. This is especially challenging for multi-party conversations. Prior NLP research focused mainly on predicting the next speaker from contexts like preceding conversations. In this paper, we demonstrate the limitations of such methods and rethink what it means for AI to be proactive in multi-party, human-AI conversations. We propose that just like humans, rather than merely reacting to turn-taking cues, a proactive AI formulates its own inner thoughts during a conversation, and seeks the right moment to contribute. Through a formative study with 24 participants and inspiration from linguistics and cognitive psychology, we introduce the Inner Thoughts framework. Our framework equips AI with a continuous, covert train of thoughts in parallel to the overt communication process, which enables it to proactively engage by modeling its intrinsic motivation to express these thoughts. We instantiated this framework into two real-time systems: an AI playground web app and a chatbot. Through a technical evaluation and user studies with human participants, our framework significantly surpasses existing baselines on aspects like anthropomorphism, coherence, intelligence, and turn-taking appropriateness.

Xingyu Bruce Liu, Shitao Fang, Weiyan Shi, Chien-Sheng Wu, Takeo Igarashi, and Xiang 'Anthony' Chen. 2025. Proactive Conversational Agents with Inner Thoughts. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI '25). Association for Computing Machinery, New York, NY, USA, Article 184, 1–19. https://doi.org/10.1145/3706598.3713760

@inproceedings{10.1145/3706598.3713760,
author = {Liu, Xingyu Bruce and Fang, Shitao and Shi, Weiyan and Wu, Chien-Sheng and Igarashi, Takeo and Chen, Xiang 'Anthony'},
title = {Proactive Conversational Agents with Inner Thoughts},
year = {2025},
isbn = {9798400713941},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3706598.3713760},
doi = {10.1145/3706598.3713760},
booktitle = {Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems},
articleno = {184},
numpages = {19},
keywords = {Conversational Agent, Multi-Agent, Multi-Party Conversation, Inner Thoughts, Mixed-initiative Interaction, Proactive AI},
location = {
},
series = {CHI '25}
}

GraPPI: A Retrieve-Divide-Solve GraphRAG Framework for Large-scale Protein-protein Interaction Exploration

Ziwen Li, UCLA HCI Research
Xiang 'Anthony' Chen, UCLA HCI Research
Youngseung Jeon, UCLA HCI Research

Drug discovery (DD) has tremendously contributed to maintaining and improving public health. Hypothesizing that inhibiting protein misfolding can slow disease progression, researchers focus on target identification (Target ID) to find protein structures for drug binding. While Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) frameworks have accelerated drug discovery, integrating models into cohesive workflows remains challenging. We conducted a user study with drug discovery researchers to identify the applicability of LLMs and RAGs in Target ID. We identified two main findings: 1) an LLM should provide multiple Protein-Protein Interactions (PPIs) based on an initial protein and protein candidates that have a therapeutic impact; 2) the model must provide the PPI and relevant explanations for better understanding. Based on these observations, we identified three limitations in previous approaches for Target ID: 1) semantic ambiguity, 2) lack of explainability, and 3) short retrieval units. To address these issues, we propose GraPPI, a large-scale knowledge graph (KG)-based retrieve-divide-solve agent pipeline RAG framework to support large-scale PPI signaling pathway exploration in understanding therapeutic impacts by decomposing the analysis of entire PPI pathways into sub-tasks focused on the analysis of PPI edges.

Ziwen Li, Xiang Chen, and Youngseung Jeon. 2025. GraPPI: A Retrieve-Divide-Solve GraphRAG Framework for Large-scale Protein-protein Interaction Exploration. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 3635–3648, Albuquerque, New Mexico. Association for Computational Linguistics.

@article{li2025grappi,
title={GraPPI: A Retrieve-Divide-Solve GraphRAG Framework for Large-scale Protein-protein Interaction Exploration},
author={Li, Ziwen and Chen, Xiang 'Anthony' and Jeon, Youngseung},
journal={arXiv preprint arXiv:2501.16382},
year={2025}
}

Empowering Medical Data Labeling for Non-Experts with DANNY: Enhancing Accuracy and Mitigating Over-Reliance on AI

Youngseung Jeon, UCLA HCI Research
Christopher Hwang, UCLA HCI Research
Xiang ‘Anthony’ Chen, UCLA HCI Research

Economic constraints on recruiting experts hinder efforts to build qualified datasets for utilizing AI in professional domains (e.g., medical diagnosis), which could provide societal benefits. To solve this issue, previous studies introduced crowdsourcing and AI to enable non-experts to perform expert-level data labeling. Yet, they encountered three challenges: 1) the limited applicability of crowdsourcing in less specialized domains (e.g., identifying animal species); 2) the chicken-and-egg problem, a paradox where high-performance AI is required to build a dataset to train such AI; and 3) over-reliance on AI, where non-experts, lacking expertise, may incorrectly label data when guided by sub-optimal AI. To address this, we introduce DANNY (Data ANnotation for Non-experts made easY), an AI-based tool designed to help non-experts label an arthritis dataset, aiming to increase labeling accuracy and mitigate over-reliance on AI. By externalizing a cognitive forcing intervention to foster critical thinking, DANNY provides two visualizations: 1) the Criteria phase, where non-experts define criteria across four arthritis features, and 2) the Correction phase, where they refine these criteria by comparing them to AI suggestions. In a study with 28 participants, DANNY users achieved higher accuracy and more appropriate reliance on AI than control groups. A follow-up study with 12 participants demonstrates how DANNY can be used to improve AI with an ensemble method. Our findings contribute new insights into using AI to support non-experts in labeling domain-specific data when expert resources are limited.

Youngseung Jeon, Christopher Hwang, and Xiang 'Anthony' Chen. 2025. Empowering Medical Data Labeling for Non-Experts with DANNY: Enhancing Accuracy and Mitigating Over-Reliance on AI. In Proceedings of the 30th International Conference on Intelligent User Interfaces (IUI '25). Association for Computing Machinery, New York, NY, USA, 624–640. https://doi.org/10.1145/3708359.3712161

@inproceedings{jeon2025empowering,
title={Empowering Medical Data Labeling for Non-Experts with DANNY: Enhancing Accuracy and Mitigating Over-Reliance on AI},
author={Jeon, Youngseung and Hwang, Christopher and Chen, Xiang 'Anthony'},
booktitle={Proceedings of the 30th International Conference on Intelligent User Interfaces},
pages={624--640},
year={2025}
}

Majority Voting of Doctors Improves Appropriateness of AI Reliance in Pathology

Hongyan Gu, UCLA HCI Research
Chunxu Yang, UCLA HCI Research
Shino Magaki, UCLA David Geffen School of Medicine
Neda Zarrin-Khameh, Baylor College of Medicine
Nelli S. Lakis, University of Kansas Medical Center
Inma Cobos, Stanford School of Medicine
Negar Khanlou, UCLA David Geffen School of Medicine
Xinhai R. Zhang, University of Texas Health Science Center at Houston
Jasmeet Assi, University of Kansas Medical Center
Joshua T. Byers, University of California, San Francisco
Ameer Hamza, University of Kansas Medical Center
Karam Han, University of Wisconsin-Madison
Anders Meyer, University of Kansas Medical Center
Hilda Mirbaha, UCLA David Geffen School of Medicine
Carrie A. Mohila, Baylor College of Medicine
Todd M. Stevens, University of Kansas Medical Center
Sara L. Stone, Hospital of the University of Pennsylvania
Wenzhong Yan, UCLA ECE
Mohammad Haeri, University of Kansas Medical Center
Xiang ‘Anthony’ Chen, UCLA HCI Research

As Artificial Intelligence (AI) makes advancements in medical decision-making, there is a growing need to ensure doctors develop appropriate reliance on AI to avoid adverse outcomes. However, existing methods for enabling appropriate AI reliance might encounter challenges when applied in the medical domain. In this regard, this work employs and validates an alternative approach – majority voting – to facilitate appropriate reliance on AI in medical decision-making. This is achieved by a multi-institutional user study involving 32 medical professionals with various backgrounds, focusing on the pathology task of visually detecting a pattern, mitoses, in tumor images. Here, the majority voting process was conducted by synthesizing decisions under AI assistance from a group of pathology doctors (pathologists). Two metrics were used to evaluate the appropriateness of AI reliance: Relative AI Reliance (RAIR) and Relative Self-Reliance (RSR). Results showed that even with groups of three pathologists, majority-voted decisions significantly increased both RAIR and RSR – by approximately 9% and 31%, respectively – compared to decisions made by one pathologist collaborating with AI. This increased appropriateness resulted in better precision and recall in the detection of mitoses. While our study is centered on pathology, we believe these insights can be extended to general high-stakes decision-making processes involving similar visual tasks.

Hongyan Gu, Chunxu Yang, Shino Magaki, Neda Zarrin-Khameh, Nelli S. Lakis, Inma Cobos, Negar Khanlou, Xinhai R. Zhang, Jasmeet Assi, Joshua T. Byers, Ameer Hamza, Karam Han, Anders Meyer, Hilda Mirbaha, Carrie A. Mohila, Todd M. Stevens, Sara L. Stone, Wenzhong Yan, Mohammad Haeri, Xiang ‘Anthony’ Chen, Majority voting of doctors improves appropriateness of AI reliance in pathology, International Journal of Human-Computer Studies, Volume 190, 2024, 103315, ISSN 1071-5819, https://doi.org/10.1016/j.ijhcs.2024.103315.

@article{GU2024103315,
title = {Majority voting of doctors improves appropriateness of AI reliance in pathology},
journal = {International Journal of Human-Computer Studies},
volume = {190},
pages = {103315},
year = {2024},
issn = {1071-5819},
doi = {https://doi.org/10.1016/j.ijhcs.2024.103315},
url = {https://www.sciencedirect.com/science/article/pii/S1071581924000995},
author = {Hongyan Gu and Chunxu Yang and Shino Magaki and Neda Zarrin-Khameh and Nelli S. Lakis and Inma Cobos and Negar Khanlou and Xinhai R. Zhang and Jasmeet Assi and Joshua T. Byers and Ameer Hamza and Karam Han and Anders Meyer and Hilda Mirbaha and Carrie A. Mohila and Todd M. Stevens and Sara L. Stone and Wenzhong Yan and Mohammad Haeri and Xiang ‘Anthony’ Chen},
keywords = {Appropriate reliance, Artificial intelligence, Majority voting, Pathology},
abstract = {As Artificial Intelligence (AI) makes advancements in medical decision-making, there is a growing need to ensure doctors develop appropriate reliance on AI to avoid adverse outcomes. However, existing methods for enabling appropriate AI reliance might encounter challenges when applied in the medical domain. In this regard, this work employs and validates an alternative approach – majority voting – to facilitate appropriate reliance on AI in medical decision-making. This is achieved by a multi-institutional user study involving 32 medical professionals with various backgrounds, focusing on the pathology task of visually detecting a pattern, mitoses, in tumor images. Here, the majority voting process was conducted by synthesizing decisions under AI assistance from a group of pathology doctors (pathologists). Two metrics were used to evaluate the appropriateness of AI reliance: Relative AI Reliance (RAIR) and Relative Self-Reliance (RSR). Results showed that even with groups of three pathologists, majority-voted decisions significantly increased both RAIR and RSR – by approximately 9% and 31%, respectively – compared to decisions made by one pathologist collaborating with AI. This increased appropriateness resulted in better precision and recall in the detection of mitoses. While our study is centered on pathology, we believe these insights can be extended to general high-stakes decision-making processes involving similar visual tasks.}
}

Human I/O: Towards a Unified Approach to Detecting Situational Impairments

Xingyu Bruce Liu, UCLA HCI Research
Jiahao Nick Li, UCLA HCI Research
David Kim, Google
Xiang 'Anthony' Chen, UCLA HCI Research
Ruofei Du, Google

Situationally Induced Impairments and Disabilities (SIIDs) can significantly hinder user experience in contexts such as poor lighting, noise, and multi-tasking. While prior research has introduced algorithms and systems to address these impairments, they predominantly cater to specific tasks or environments and fail to accommodate the diverse and dynamic nature of SIIDs. We introduce Human I/O, a unified approach to detecting a wide range of SIIDs by gauging the availability of human input/output channels. Leveraging egocentric vision, multimodal sensing and reasoning with large language models, Human I/O achieves a 0.22 mean absolute error and an 82% accuracy in availability prediction across 60 in-the-wild egocentric video recordings in 32 different scenarios. Furthermore, while the core focus of our work is on the detection of SIIDs rather than the creation of adaptive user interfaces, we showcase the efficacy of our prototype via a user study with 10 participants. Findings suggest that Human I/O significantly reduces effort and improves user experience in the presence of SIIDs, paving the way for more adaptive and accessible interactive systems in the future.

Xingyu Bruce Liu, Jiahao Nick Li, David Kim, Xiang 'Anthony' Chen, and Ruofei Du. 2024. Human I/O: Towards a Unified Approach to Detecting Situational Impairments. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI '24). Association for Computing Machinery, New York, NY, USA, Article 965, 1–18. https://doi.org/10.1145/3613904.3642065

@inproceedings{10.1145/3613904.3642065,
author = {Liu, Xingyu Bruce and Li, Jiahao Nick and Kim, David and Chen, Xiang 'Anthony' and Du, Ruofei},
title = {Human I/O: Towards a Unified Approach to Detecting Situational Impairments},
year = {2024},
isbn = {9798400703300},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3613904.3642065},
doi = {10.1145/3613904.3642065},
booktitle = {Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems},
articleno = {965},
numpages = {18},
keywords = {augmented reality, context awareness, large language models, multimodal sensing, situational impairments},
location = {Honolulu, HI, USA},
series = {CHI '24}
}

Enhancing Mitosis Quantification and Detection in Meningiomas with Computational Digital Pathology

Hongyan Gu, UCLA HCI Research
Chunxu Yang, UCLA HCI Research
Issa Al-Kharouf, University of Kansas Medical Center
Shino Magaki, UCLA David Geffen School of Medicine
Nelli Lakis, University of Kansas Medical Center
Christopher Kazu Williams, UCLA David Geffen School of Medicine
Sallam Mohammad Alrosan, University of Kansas Medical Center
Ellie K. Onstott, University of Kansas Medical Center
Wenzhong Yan, UCLA ECE
Negar Khanlou, UCLA David Geffen School of Medicine
Inma Cobos, Stanford School of Medicine
Xinhai R. Zhang, University of Texas Health Science Center at Houston
Neda Zarrin-Khameh, Baylor College of Medicine
Harry V. Vinters, UCLA David Geffen School of Medicine
Xiang 'Anthony' Chen, UCLA HCI Research
Mohammad Haeri, University of Kansas Medical Center

Mitosis is a critical criterion for meningioma grading. However, pathologists’ assessment of mitoses is subject to significant inter-observer variation due to challenges in locating mitosis hotspots and accurately detecting mitotic figures. To address this issue, we leverage digital pathology and propose a computational strategy to enhance pathologists’ mitosis assessment. The strategy has two components: (1) A depth-first search algorithm that quantifies the mathematically maximum mitotic count in 10 consecutive high-power fields, which can enhance the preciseness, especially in cases with borderline mitotic count. (2) Implementing a collaborative sphere to group a set of pathologists to detect mitoses under each high-power field, which can mitigate subjective random errors in mitosis detection originating from individual detection errors. Using the depth-first search algorithm (1), we analyzed 19 meningioma slides and discovered that the proposed algorithm upgraded two borderline cases verified at consensus conferences. This improvement is attributed to the algorithm’s ability to quantify the mitotic count more comprehensively compared to other conventional methods of counting mitoses. In implementing a collaborative sphere (2), we evaluated the correctness of mitosis detection from grouped pathologists and/or pathology residents, where each member of the group annotated a set of 48 high-power field images for mitotic figures independently. We report that groups with sizes of three can achieve an average precision of 0.897 and sensitivity of 0.699 in mitosis detection, which is higher than an average pathologist in this study (precision: 0.750, sensitivity: 0.667). The proposed computational strategy can be integrated with artificial intelligence workflow, which envisions the future of achieving a rapid and robust mitosis assessment by interactive assisting algorithms that can ultimately benefit patient management.

Gu H, Yang C, Al-Kharouf I, Magaki S, Lakis N, Williams CK, Alrosan SM, Onstott EK, Yan W, Khanlou N, Cobos I, Zhang XR, Zarrin-Khameh N, Vinters HV, Chen XA, Haeri M. Enhancing mitosis quantification and detection in meningiomas with computational digital pathology. Acta Neuropathol Commun. 2024 Jan 11;12(1):7. doi: 10.1186/s40478-023-01707-6. PMID: 38212848; PMCID: PMC10782692.

@article{gu2024enhancing,
title={Enhancing mitosis quantification and detection in meningiomas with computational digital pathology},
author={Gu, Hongyan and Yang, Chunxu and Al-Kharouf, Issa and Magaki, Shino and Lakis, Nelli and Williams, Christopher Kazu and Alrosan, Sallam Mohammad and Onstott, Ellie Kate and Yan, Wenzhong and Khanlou, Negar and others},
journal={Acta Neuropathologica Communications},
volume={12},
number={1},
pages={7},
year={2024},
publisher={Springer}
}

From Text to Pixels: Enhancing User Understanding through Text-to-Image Model Explanations

Noyan Evirgen (UCLA HCI Research)
Ruolin Wang (UCLA HCI Research)
Xiang 'Anthony' Chen (UCLA HCI Research)

Recent progress in Text-to-Image (T2I) models promises transformative applications in art, design, education, medicine, and entertainment. These models, exemplified by Dall-e, Imagen, and Stable Diffusion, have the potential to revolutionize various industries. However, a primary concern is their operation as a 'black-box' for many users. Without understanding the underlying mechanics, users are unable to harness the full potential of these models. This study focuses on bridging this gap by developing and evaluating explanation techniques for T2I models, targeting inexperienced end users. While prior works have delved into Explainable AI (XAI) methods for classification or regression tasks, T2I generation poses distinct challenges. Through formative studies with experts, we identified unique explanation goals and subsequently designed tailored explanation strategies. We then empirically evaluated these methods with a cohort of 473 participants from Amazon Mechanical Turk (AMT) across three tasks. Our results highlight users' ability to learn new keywords through explanations, a preference for example-based explanations, and challenges in comprehending explanations that significantly shift the image's theme. Moreover, findings suggest users benefit from a limited set of concurrent explanations. Our main contributions include a curated dataset for evaluating T2I explainability techniques, insights from a comprehensive AMT user study, and observations critical for future T2I model explainability research.

Noyan Evirgen, Ruolin Wang, and Xiang 'Anthony' Chen. 2024. From Text to Pixels: Enhancing User Understanding through Text-to-Image Model Explanations. In Proceedings of the 29th International Conference on Intelligent User Interfaces (IUI '24). Association for Computing Machinery, New York, NY, USA, 74–87. https://doi.org/10.1145/3640543.3645173

@inproceedings{10.1145/3640543.3645173,
author = {Evirgen, Noyan and Wang, Ruolin and Chen, Xiang 'Anthony'},
title = {From Text to Pixels: Enhancing User Understanding through Text-to-Image Model Explanations},
year = {2024},
isbn = {9798400705083},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3640543.3645173},
doi = {10.1145/3640543.3645173},
abstract = {Recent progress in Text-to-Image (T2I) models promises transformative applications in art, design, education, medicine, and entertainment. These models, exemplified by Dall-e, Imagen, and Stable Diffusion, have the potential to revolutionize various industries. However, a primary concern is their operation as a ‘black-box’ for many users. Without understanding the underlying mechanics, users are unable to harness the full potential of these models. This study focuses on bridging this gap by developing and evaluating explanation techniques for T2I models, targeting inexperienced end users. While prior works have delved into Explainable AI (XAI) methods for classification or regression tasks, T2I generation poses distinct challenges. Through formative studies with experts, we identified unique explanation goals and subsequently designed tailored explanation strategies. We then empirically evaluated these methods with a cohort of 473 participants from Amazon Mechanical Turk (AMT) across three tasks. Our results highlight users’ ability to learn new keywords through explanations, a preference for example-based explanations, and challenges in comprehending explanations that significantly shift the image’s theme. Moreover, findings suggest users benefit from a limited set of concurrent explanations. Our main contributions include a curated dataset for evaluating T2I explainability techniques, insights from a comprehensive AMT user study, and observations critical for future T2I model explainability research.},
booktitle = {Proceedings of the 29th International Conference on Intelligent User Interfaces},
pages = {74--87},
numpages = {14},
keywords = {Explainability Methods, Text-to-Image, User-Study, XAI},
location = {Greenville, SC, USA},
series = {IUI '24}
}


HCI Papers Cite HCI Papers, Increasingly So

Xiang 'Anthony' Chen (UCLA HCI Research)

To measure how HCI papers are cited across disciplinary boundaries, we collected a citation dataset of CHI, UIST, and CSCW papers published between 2010 and 2020. Our analysis indicates that HCI papers have been more and more likely to be cited by HCI papers rather than by non-HCI papers.

Xiang ‘Anthony’ Chen. HCI Papers Cite HCI Papers, Increasingly So. In Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (CHI EA '24).

@misc{chen2024hci,
   title={HCI Papers Cite HCI Papers, Increasingly So},
   author={Xiang Anthony Chen},
   year={2024},
   eprint={2303.07539},
   archivePrefix={arXiv},
   primaryClass={cs.HC}
}


Marvista: Exploring the Design of a Human-AI Collaborative News Reading Tool

Xiang 'Anthony' Chen (UCLA HCI Research)
Chien-Sheng Wu, Lidiya Murakhovs'ka, Philippe Laban, Tong Niu, Wenhao Liu, Caiming Xiong (Salesforce Research)

We explore the design of Marvista—a human-AI collaborative tool that employs a suite of natural language processing models to provide end-to-end support for reading online news articles. Before reading an article, Marvista helps a user plan what to read by filtering text based on how much time one can spend and what questions one is interested to find out from the article. During reading, Marvista helps the user reflect on their understanding of each paragraph with AI-generated questions. After reading, Marvista generates an explainable human-AI summary that combines both AI’s processing of the text, the user’s reading behavior, and user-generated data in the reading process. In contrast to prior work that offered (content-independent) interaction techniques or devices for reading, Marvista takes a human-AI collaborative approach that contributes text-specific guidance (content-aware) to support the entire reading process.

Xiang ‘Anthony’ Chen, Chien-Sheng Wu, Lidiya Murakhovs’ka, Philippe Laban, Tong Niu, Wenhao Liu, and Caiming Xiong. 2023. Marvista: Exploring the Design of a Human-AI Collaborative News Reading Tool. ACM Trans. Comput.-Hum. Interact. Just Accepted (July 2023). https://doi.org/10.1145/3609331

@article{10.1145/3609331,
author = {Chen, Xiang ‘Anthony’ and Wu, Chien-Sheng and Murakhovs’ka, Lidiya and Laban, Philippe and Niu, Tong and Liu, Wenhao and Xiong, Caiming},
title = {Marvista: Exploring the Design of a Human-AI Collaborative News Reading Tool},
year = {2023},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {1073-0516},
url = {https://doi.org/10.1145/3609331},
doi = {10.1145/3609331},
abstract = {We explore the design of Marvista—a human-AI collaborative tool that employs a suite of natural language processing models to provide end-to-end support for reading online news articles. Before reading an article, Marvista helps a user plan what to read by filtering text based on how much time one can spend and what questions one is interested to find out from the article. During reading, Marvista helps the user reflect on their understanding of each paragraph with AI-generated questions. After reading, Marvista generates an explainable human-AI summary that combines both AI’s processing of the text, the user’s reading behavior, and user-generated data in the reading process. In contrast to prior work that offered (content-independent) interaction techniques or devices for reading, Marvista takes a human-AI collaborative approach that contributes text-specific guidance (content-aware) to support the entire reading process.},
note = {Just Accepted},
journal = {ACM Trans. Comput.-Hum. Interact.},
month = jul,
keywords = {Tools, Human-AI Collaboration, Reading}
}


Visual Captions: Augmenting Verbal Communication with On-the-fly Visuals

Xingyu "Bruce" Liu (UCLA HCI Research)
Vladimir Kirilyuk, Xiuxiu Yuan, Alex Olwal, Peggy Chi (Google)
Xiang 'Anthony' Chen (UCLA HCI Research)
Ruofei Du (Google)

Video conferencing solutions like Zoom, Google Meet, and Microsoft Teams are becoming increasingly popular for facilitating conversations, and recent advancements such as live captioning help people better understand each other. We believe that the addition of visuals based on the context of conversations could further improve comprehension of complex or unfamiliar concepts. To explore the potential of such capabilities, we conducted a formative study through remote interviews (N=10) and crowdsourced a dataset of over 1500 sentence-visual pairs across a wide range of contexts. These insights informed Visual Captions, a real-time system that integrates with a video conferencing platform to enrich verbal communication. Visual Captions leverages a fine-tuned large language model to proactively suggest relevant visuals in open-vocabulary conversations. We present findings from a lab study (N=26) and an in-the-wild case study (N=10), demonstrating how Visual Captions can help improve communication through visual augmentation in various scenarios.

Xingyu 'Bruce' Liu, Vladimir Kirilyuk, Xiuxiu Yuan, Alex Olwal, Peggy Chi, Xiang 'Anthony' Chen, and Ruofei Du. 2023. Visual Captions: Augmenting Verbal Communication with On-the-fly Visuals. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23). Association for Computing Machinery, New York, NY, USA, Article 108, 1-20. https://doi.org/10.1145/3544548.3581566

@inproceedings{10.1145/3544548.3581566,
author = {Liu, Xingyu "Bruce" and Kirilyuk, Vladimir and Yuan, Xiuxiu and Olwal, Alex and Chi, Peggy and Chen, Xiang 'Anthony' and Du, Ruofei},
title = {Visual Captions: Augmenting Verbal Communication with On-the-Fly Visuals},
year = {2023},
isbn = {9781450394215},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3544548.3581566},
doi = {10.1145/3544548.3581566},
abstract = {Video conferencing solutions like Zoom, Google Meet, and Microsoft Teams are becoming increasingly popular for facilitating conversations, and recent advancements such as live captioning help people better understand each other. We believe that the addition of visuals based on the context of conversations could further improve comprehension of complex or unfamiliar concepts. To explore the potential of such capabilities, we conducted a formative study through remote interviews (N=10) and crowdsourced a dataset of over 1500 sentence-visual pairs across a wide range of contexts. These insights informed Visual Captions, a real-time system that integrates with a video conferencing platform to enrich verbal communication. Visual Captions leverages a fine-tuned large language model to proactively suggest relevant visuals in open-vocabulary conversations. We present findings from a lab study (N=26) and an in-the-wild case study (N=10), demonstrating how Visual Captions can help improve communication through visual augmentation in various scenarios.},
booktitle = {Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems},
articleno = {108},
numpages = {20},
keywords = {large language models, augmented reality, video-mediated communication, augmented communication, AI agent, text-to-visual, online meeting, dataset, collaborative work},
location = {Hamburg, Germany},
series = {CHI '23}
}


Augmenting Pathologists with NaviPath: Design and Evaluation of a Human-AI Collaborative Navigation System

Hongyan Gu, Chunxu Yang (UCLA HCI Research)
Mohammad Haeri (University of Kansas Medical Center)
Jing Wang (Beijing Tongren Hospital, Capital Medical University)
Shirley Tang, Wenzhong Yan (UCLA HCI Research)
Shujin He (Beijing Tongren Hospital, Capital Medical University)
Christopher Kazu Williams, Shino Magaki (UCLA David Geffen School of Medicine)
Xiang 'Anthony' Chen (UCLA HCI Research)

Artificial Intelligence (AI) brings advancements to support pathologists in navigating high-resolution tumor images to search for pathology patterns of interest. However, existing AI-assisted tools have not realized this promised potential due to a lack of insight into pathology and HCI considerations for pathologists' navigation workflows in practice. We first conducted a formative study with six medical professionals in pathology to capture their navigation strategies. By incorporating our observations along with the pathologists' domain knowledge, we designed NaviPath — a human-AI collaborative navigation system. An evaluation study with 15 medical professionals in pathology indicated that: (i) compared to the manual navigation, participants saw more than twice the number of pathological patterns in unit time with NaviPath, and (ii) participants achieved higher precision and recall against the AI and the manual navigation on average. Further qualitative analysis revealed that navigation was more consistent with NaviPath, which can improve the overall examination quality.

Hongyan Gu, Chunxu Yang, Mohammad Haeri, Jing Wang, Shirley Tang, Wenzhong Yan, Shujin He, Christopher Kazu Williams, Shino Magaki, and Xiang 'Anthony' Chen. 2023. Augmenting Pathologists with NaviPath: Design and Evaluation of a Human-AI Collaborative Navigation System. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23). Association for Computing Machinery, New York, NY, USA, Article 349, 1-19. https://doi.org/10.1145/3544548.3580694

@inproceedings{10.1145/3544548.3580694,
author = {Gu, Hongyan and Yang, Chunxu and Haeri, Mohammad and Wang, Jing and Tang, Shirley and Yan, Wenzhong and He, Shujin and Williams, Christopher Kazu and Magaki, Shino and Chen, Xiang 'Anthony'},
title = {Augmenting Pathologists with NaviPath: Design and Evaluation of a Human-AI Collaborative Navigation System},
year = {2023},
isbn = {9781450394215},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3544548.3580694},
doi = {10.1145/3544548.3580694},
abstract = {Artificial Intelligence (AI) brings advancements to support pathologists in navigating high-resolution tumor images to search for pathology patterns of interest. However, existing AI-assisted tools have not realized this promised potential due to a lack of insight into pathology and HCI considerations for pathologists' navigation workflows in practice. We first conducted a formative study with six medical professionals in pathology to capture their navigation strategies. By incorporating our observations along with the pathologists' domain knowledge, we designed NaviPath — a human-AI collaborative navigation system. An evaluation study with 15 medical professionals in pathology indicated that: (i) compared to the manual navigation, participants saw more than twice the number of pathological patterns in unit time with NaviPath, and (ii) participants achieved higher precision and recall against the AI and the manual navigation on average. Further qualitative analysis revealed that navigation was more consistent with NaviPath, which can improve the overall examination quality.},
booktitle = {Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems},
articleno = {349},
numpages = {19},
keywords = {medical AI, Human-AI collaboration, navigation, digital pathology},
location = {Hamburg, Germany},
series = {CHI '23}
}


GANravel: User-Driven Direction Disentanglement in Generative Adversarial Networks

Noyan Evirgen (UCLA HCI Research)
Xiang 'Anthony' Chen (UCLA HCI Research)

Generative adversarial networks (GANs) have many application areas including image editing, domain translation, missing data imputation, and support for creative work. However, GANs are considered 'black boxes'. Specifically, the end-users have little control over how to improve editing directions through disentanglement. Prior work focused on new GAN architectures to disentangle editing directions. Alternatively, we propose GANravel, a user-driven direction disentanglement tool that complements the existing GAN architectures and allows users to improve editing directions iteratively. In two user studies with 16 participants each, GANravel users were able to disentangle directions and outperformed the state-of-the-art direction discovery baselines in disentanglement performance. In the second user study, GANravel was used in a creative task of creating dog memes and was able to create high-quality edited images and GIFs.

Noyan Evirgen and Xiang 'Anthony' Chen. 2023. GANravel: User-Driven Direction Disentanglement in Generative Adversarial Networks. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23). Association for Computing Machinery, New York, NY, USA, Article 19, 1-15. https://doi.org/10.1145/3544548.3581226

@inproceedings{10.1145/3544548.3581226,
author = {Evirgen, Noyan and Chen, Xiang 'Anthony'},
title = {GANravel: User-Driven Direction Disentanglement in Generative Adversarial Networks},
year = {2023},
isbn = {9781450394215},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3544548.3581226},
doi = {10.1145/3544548.3581226},
abstract = {Generative adversarial networks (GANs) have many application areas including image editing, domain translation, missing data imputation, and support for creative work. However, GANs are considered 'black boxes'. Specifically, the end-users have little control over how to improve editing directions through disentanglement. Prior work focused on new GAN architectures to disentangle editing directions. Alternatively, we propose GANravel—a user-driven direction disentanglement tool that complements the existing GAN architectures and allows users to improve editing directions iteratively. In two user studies with 16 participants each, GANravel users were able to disentangle directions and outperformed the state-of-the-art direction discovery baselines in disentanglement performance. In the second user study, GANravel was used in a creative task of creating dog memes and was able to create high-quality edited images and GIFs.},
booktitle = {Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems},
articleno = {19},
numpages = {15},
keywords = {Disentanglement, Generative Adversarial Networks, Interactive Systems, Explainable-AI},
location = {Hamburg, Germany},
series = {CHI '23}
}


Improving Workflow Integration with xPath: Design and Evaluation of a Human-AI Diagnosis System in Pathology

Hongyan Gu, Yuan Liang, Yifan Xu, Chunxu Yang, Wenzhong Yan (UCLA HCI Research)
Christopher Kazu Williams, Shino Magaki, Negar Khanlou, Harry Vinters, Zesheng Chen (UCLA David Geffen School of Medicine)
Shuo Ni (UCLA and USC)
Xinhai Robert Zhang (University of Texas Health Science Center at Houston)
Yang Li (Google Research)
Mohammad Haeri (University of Kansas Medical Center)
Xiang 'Anthony' Chen (UCLA HCI Research)

Recent developments in AI have provided assisting tools to support pathologists' diagnoses. However, it remains challenging to incorporate such tools into pathologists' practice; one main concern is AI's insufficient workflow integration with medical decisions. We observed pathologists' examination and discovered that the main hindering factor to integrate AI is its incompatibility with pathologists' workflow. To bridge the gap between pathologists and AI, we developed a human-AI collaborative diagnosis tool — xPath — that shares a similar examination process to that of pathologists, which can improve AI's integration into their routine examination. The viability of xPath is confirmed by a technical evaluation and work sessions with twelve medical professionals in pathology. This work identifies and addresses the challenge of incorporating AI models into pathology, which can offer first-hand knowledge about how HCI researchers can work with medical professionals side-by-side to bring technological advances to medical tasks towards practical applications.

Hongyan Gu, Yuan Liang, Yifan Xu, Christopher Kazu Williams, Shino Magaki, Negar Khanlou, Harry Vinters, Zesheng Chen, Shuo Ni, Chunxu Yang, Wenzhong Yan, Xinhai Robert Zhang, Yang Li, Mohammad Haeri, and Xiang 'Anthony' Chen. 2022. Improving Workflow Integration with xPath: Design and Evaluation of a Human-AI Diagnosis System in Pathology. ACM Trans. Comput.-Hum. Interact. Just Accepted (December 2022). https://doi.org/10.1145/3577011

@article{10.1145/3577011,
author = {Gu, Hongyan and Liang, Yuan and Xu, Yifan and Williams, Christopher Kazu and Magaki, Shino and Khanlou, Negar and Vinters, Harry and Chen, Zesheng and Ni, Shuo and Yang, Chunxu and Yan, Wenzhong and Zhang, Xinhai Robert and Li, Yang and Haeri, Mohammad and Chen, Xiang 'Anthony'},
title = {Improving Workflow Integration with xPath: Design and Evaluation of a Human-AI Diagnosis System in Pathology},
year = {2022},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {1073-0516},
url = {https://doi.org/10.1145/3577011},
doi = {10.1145/3577011},
abstract = {Recent developments in AI have provided assisting tools to support pathologists' diagnoses. However, it remains challenging to incorporate such tools into pathologists' practice; one main concern is AI's insufficient workflow integration with medical decisions. We observed pathologists' examination and discovered that the main hindering factor to integrate AI is its incompatibility with pathologists' workflow. To bridge the gap between pathologists and AI, we developed a human-AI collaborative diagnosis tool — xPath— that shares a similar examination process to that of pathologists, which can improve AI's integration into their routine examination. The viability of xPath is confirmed by a technical evaluation and work sessions with twelve medical professionals in pathology. This work identifies and addresses the challenge of incorporating AI models into pathology, which can offer first-hand knowledge about how HCI researchers can work with medical professionals side-by-side to bring technological advances to medical tasks towards practical applications.},
note = {Just Accepted},
journal = {ACM Trans. Comput.-Hum. Interact.},
month = dec,
keywords = {Human-AI collaboration; digital pathology; medical AI; meningioma}
}


CrossA11y: Identifying Video Accessibility Issues via Cross-modal Grounding

Xingyu "Bruce" Liu (UCLA HCI Research)
Ruolin Wang (UCLA HCI Research)
Dingzeyu Li (Adobe Research)
Xiang 'Anthony' Chen (UCLA HCI Research)
Amy Pavel (UT Austin)

Authors make their videos visually accessible by adding audio descriptions (AD), and auditorily accessible by adding closed captions (CC). However, creating AD and CC is challenging and tedious, especially for non-professional describers and captioners, due to the difficulty of identifying accessibility problems in videos. A video author will have to watch the video through and manually check for inaccessible information frame-by-frame, for both visual and auditory modalities. In this paper, we present CrossA11y, a system that helps authors efficiently detect and address visual and auditory accessibility issues in videos. Using cross-modal grounding analysis, CrossA11y automatically measures accessibility of visual and audio segments in a video by checking for modality asymmetries. CrossA11y then displays these segments and surfaces visual and audio accessibility issues in a unified interface, making it intuitive to locate, review, script AD/CC in-place, and preview the described and captioned video immediately. We demonstrate the effectiveness of CrossA11y through a lab study with 11 participants, comparing to existing baseline.

Xingyu "Bruce" Liu, Ruolin Wang, Dingzeyu Li, Xiang Anthony Chen, and Amy Pavel. 2022. CrossA11y: Identifying Video Accessibility Issues via Cross-modal Grounding. In Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology (UIST '22). Association for Computing Machinery, New York, NY, USA, Article 43, 1-14. https://doi.org/10.1145/3526113.3545703

@inproceedings{10.1145/3526113.3545703,
author = {Liu, Xingyu "Bruce" and Wang, Ruolin and Li, Dingzeyu and Chen, Xiang Anthony and Pavel, Amy},
title = {CrossA11y: Identifying Video Accessibility Issues via Cross-Modal Grounding},
year = {2022},
isbn = {9781450393201},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3526113.3545703},
doi = {10.1145/3526113.3545703},
abstract = {Authors make their videos visually accessible by adding audio descriptions (AD), and auditorily accessible by adding closed captions (CC). However, creating AD and CC is challenging and tedious, especially for non-professional describers and captioners, due to the difficulty of identifying accessibility problems in videos. A video author will have to watch the video through and manually check for inaccessible information frame-by-frame, for both visual and auditory modalities. In this paper, we present CrossA11y, a system that helps authors efficiently detect and address visual and auditory accessibility issues in videos. Using cross-modal grounding analysis, CrossA11y automatically measures accessibility of visual and audio segments in a video by checking for modality asymmetries. CrossA11y then displays these segments and surfaces visual and audio accessibility issues in a unified interface, making it intuitive to locate, review, script AD/CC in-place, and preview the described and captioned video immediately. We demonstrate the effectiveness of CrossA11y through a lab study with 11 participants, comparing to existing baseline.},
booktitle = {Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology},
articleno = {43},
numpages = {14},
keywords = {video, accessibility, closed caption, audio description},
location = {Bend, OR, USA},
series = {UIST '22}
}


GANzilla: User-Driven Direction Discovery in Generative Adversarial Networks

Noyan Evirgen (UCLA HCI Research)
Xiang 'Anthony' Chen (UCLA HCI Research)

Generative Adversarial Network (GAN) is widely adopted in numerous application areas, such as data preprocessing, image editing, and creativity support. However, GAN's 'black box' nature prevents non-expert users from controlling what data a model generates, spawning a plethora of prior work that focused on algorithm-driven approaches to extract editing directions to control GAN. Complementarily, we propose GANzilla, a user-driven tool that empowers a user with the classic scatter/gather technique to iteratively discover directions to meet their editing goals. In a study with 12 participants, GANzilla users were able to discover directions that (i) edited images to match provided examples (closed-ended tasks) and that (ii) met a high-level goal, e.g., making the face happier, while showing diversity across individuals (open-ended tasks).

Noyan Evirgen and Xiang 'Anthony' Chen. 2022. GANzilla: User-Driven Direction Discovery in Generative Adversarial Networks. In Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology (UIST '22). Association for Computing Machinery, New York, NY, USA, Article 75, 1-10. https://doi.org/10.1145/3526113.3545638

@inproceedings{10.1145/3526113.3545638,
author = {Evirgen, Noyan and Chen, Xiang 'Anthony'},
title = {GANzilla: User-Driven Direction Discovery in Generative Adversarial Networks},
year = {2022},
isbn = {9781450393201},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3526113.3545638},
doi = {10.1145/3526113.3545638},
abstract = {Generative Adversarial Network (GAN) is widely adopted in numerous application areas, such as data preprocessing, image editing, and creativity support. However, GAN's 'black box' nature prevents non-expert users from controlling what data a model generates, spawning a plethora of prior work that focused on algorithm-driven approaches to extract editing directions to control GAN. Complementarily, we propose GANzilla, a user-driven tool that empowers a user with the classic scatter/gather technique to iteratively discover directions to meet their editing goals. In a study with 12 participants, GANzilla users were able to discover directions that (i) edited images to match provided examples (closed-ended tasks) and that (ii) met a high-level goal, e.g., making the face happier, while showing diversity across individuals (open-ended tasks).},
booktitle = {Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology},
articleno = {75},
numpages = {10},
keywords = {Direction Discovery, Interactive Systems, Explainable-AI, Generative Adversarial Networks},
location = {Bend, OR, USA},
series = {UIST '22}
}


EmoGlass: an End-to-End AI-Enabled Wearable Platform for Enhancing Self-Awareness of Emotional Health

Zihan Yan (UCLA HCI Research, Zhejiang University)
Yufei Wu (Zhejiang University)
Yang Zhang (UCLA HCI Research)
Xiang 'Anthony' Chen (UCLA HCI Research)

Often, emotional disorders are overlooked due to their lack of awareness, resulting in potential mental issues. Recent advances in sensing and inference technology provide a viable path to wearable facial-expression-based emotion recognition. However, most prior work has explored only laboratory settings and few platforms are geared towards end-users in everyday lives or provide personalized emotional suggestions to promote self-regulation. We present EmoGlass, an end-to-end wearable platform that consists of emotion detection glasses and an accompanying mobile application. Our single-camera-mounted glasses can detect seven facial expressions based on partial face images. We conducted a three-day out-of-lab study (N=15) to evaluate the performance of EmoGlass. We iterated on the design of the EmoGlass application for effective self-monitoring and awareness of users' daily emotional states. We report quantitative and qualitative findings, based on which we discuss design recommendations for future work on sensing and enhancing awareness of emotional health.

Zihan Yan, Yufei Wu, Yang Zhang, Xiang 'Anthony' Chen. EmoGlass: an End-to-End AI-Enabled Wearable Platform for Enhancing Self-Awareness of Emotional Health. In CHI Conference on Human Factors in Computing Systems, pp. 1-19. 2022. https://doi.org/10.1145/3491102.3501925

@inproceedings{10.1145/3491102.3501925,
author = {Yan, Zihan and Wu, Yufei and Zhang, Yang and Chen, Xiang 'Anthony'},
title = {EmoGlass: An End-to-End AI-Enabled Wearable Platform for Enhancing Self-Awareness of Emotional Health},
year = {2022},
isbn = {9781450391573},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3491102.3501925},
doi = {10.1145/3491102.3501925},
booktitle = {CHI Conference on Human Factors in Computing Systems},
articleno = {13},
numpages = {19},
keywords = {Facial expression detection, Wearable, Emotion sensing, Mobile health, Mental health},
location = {New Orleans, LA, USA},
series = {CHI '22}
}


Roman: Making Everyday Objects Robotically Manipulable with 3D-Printable Add-on Mechanisms

Jiahao Li (UCLA HCI Research)
Alexis A Samoylov (UCLA HCI Research)
Jeeeun Kim (HCIED Lab, Texas A&M University)
Xiang 'Anthony' Chen (UCLA HCI Research)

One important vision of robotics is to provide physical assistance by manipulating different everyday objects, e.g., hand tools, kitchen utensils. However, many objects designed for dexterous hand-control are not easily manipulable by a single robotic arm with a generic parallel gripper. Complementary to existing research on developing grippers and control algorithms, we present Roman, a suite of hardware design and software tool support for robotic engineers to create 3D printable mechanisms attached to everyday handheld objects, making them easier to be manipulated by conventional robotic arms. The Roman hardware comes with a versatile magnetic gripper that can snap on/off handheld objects and drive add-on mechanisms to perform tasks. Roman also provides software support to register and author control programs. To validate our approach, we designed and fabricated Roman mechanisms for 14 everyday objects/tasks presented within a design space and conducted expert interviews with robotic engineers indicating that Roman serves as a practical alternative for enabling robotic manipulation of everyday objects.

Jiahao Li, Alexis Samoylov, Jeeeun Kim, and Xiang 'Anthony' Chen. Roman: Making Everyday Objects Robotically Manipulable with 3D-Printable Add-on Mechanisms. In CHI Conference on Human Factors in Computing Systems, pp. 1-17. 2022. https://doi.org/10.1145/3491102.3501818

@inproceedings{10.1145/3491102.3501818,
author = {Li, Jiahao and Samoylov, Alexis and Kim, Jeeeun and Chen, Xiang 'Anthony'},
title = {Roman: Making Everyday Objects Robotically Manipulable with 3D-Printable Add-on Mechanisms},
year = {2022},
isbn = {9781450391573},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3491102.3501818},
doi = {10.1145/3491102.3501818},
booktitle = {CHI Conference on Human Factors in Computing Systems},
articleno = {272},
numpages = {17},
keywords = {mechanism design, handheld objects augmentation, Robotic grasping and manipulation},
location = {New Orleans, LA, USA},
series = {CHI '22}
}


Lessons Learned from Designing an AI-Enabled Diagnosis Tool for Pathologists

Hongyan Gu (UCLA HCI Research)
Jingbin Huang (UCLA HCI Research)
Lauren Hung (CMU HCII)
Xiang 'Anthony' Chen (UCLA HCI Research)

Despite the promises of data-driven artificial intelligence (AI), little is known about how we can bridge the gulf between traditional physician-driven diagnosis and a plausible future of medicine automated by AI. Specifically, how can we involve AI usefully in physicians' diagnosis workflow given that most AI is still nascent and error-prone (e.g., in digital pathology)? To explore this question, we first propose a series of collaborative techniques to engage human pathologists with AI given AI's capabilities and limitations, based on which we prototype Impetus — a tool where an AI takes various degrees of initiatives to provide various forms of assistance to a pathologist in detecting tumors from histological slides. We summarize observations and lessons learned from a study with eight pathologists and discuss recommendations for future work on human-centered medical AI systems.

Hongyan Gu, Jingbin Huang, Lauren Hung, and Xiang 'Anthony' Chen. 2021. Lessons Learned from Designing an AI-Enabled Diagnosis Tool for Pathologists. Proc. ACM Hum.-Comput. Interact. 5, CSCW1, Article 10 (April 2021), 25 pages. https://doi.org/10.1145/3449084

@article{10.1145/3449084,
title={Lessons Learned from Designing an AI-Enabled Diagnosis Tool for Pathologists},
author={Gu, Hongyan and Huang, Jingbin and Hung, Lauren and Chen, Xiang Anthony},
journal={Proceedings of the ACM on Human-Computer Interaction},
volume={5},
number={CSCW1},
pages={1--25},
year={2021},
issue_date = {April 2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3449084},
doi = {10.1145/3449084},
articleno = {10},
numpages = {25}
}


What Makes Videos Accessible to Blind and Visually Impaired People?

Xingyu Liu (UCLA HCI Research)
Patrick Carrington (Carnegie Mellon University)
Xiang 'Anthony' Chen (UCLA HCI Research)
Amy Pavel (Carnegie Mellon University)

User-generated videos are an increasingly important source of information online, yet most online videos are inaccessible to blind and visually impaired (BVI) people. To find videos that are accessible, or understandable without additional description of the visual content, BVI people in our formative studies reported that they used a time-consuming trial-and-error approach: clicking on a video, watching a portion, leaving the video, and repeating the process. BVI people also reported video accessibility heuristics that characterize accessible and inaccessible videos. We instantiate 7 of the identified heuristics (2 audio-related, 2 video-related, and 3 audio-visual) as automated metrics to assess video accessibility. We collected a dataset of accessibility ratings of videos by BVI people and found that our automatic video accessibility metrics correlated with the accessibility ratings (Adjusted R^2 = 0.642). We augmented a video search interface with our video accessibility metrics and predictions. BVI people using our augmented video search interface selected an accessible video more efficiently than when using the original search interface. By integrating video accessibility metrics, video hosting platforms could help people surface accessible videos and encourage content creators to author more accessible products, improving video accessibility for all.

Xingyu Liu, Patrick Carrington, Xiang 'Anthony' Chen, and Amy Pavel. 2021. What Makes Videos Accessible to Blind and Visually Impaired People? In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (CHI '21). Association for Computing Machinery, New York, NY, USA, Article 272, 1-14. DOI:https://doi.org/10.1145/3411764.3445233

@inproceedings{10.1145/3411764.3445233,
author = {Liu, Xingyu and Carrington, Patrick and Chen, Xiang 'Anthony' and Pavel, Amy},
title = {What Makes Videos Accessible to Blind and Visually Impaired People?},
year = {2021},
isbn = {9781450380966},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3411764.3445233},
doi = {10.1145/3411764.3445233},
abstract = {User-generated videos are an increasingly important source of information online, yet most online videos are inaccessible to blind and visually impaired (BVI) people. To find videos that are accessible, or understandable without additional description of the visual content, BVI people in our formative studies reported that they used a time-consuming trial-and-error approach: clicking on a video, watching a portion, leaving the video, and repeating the process. BVI people also reported video accessibility heuristics that characterize accessible and inaccessible videos. We instantiate 7 of the identified heuristics (2 audio-related, 2 video-related, and 3 audio-visual) as automated metrics to assess video accessibility. We collected a dataset of accessibility ratings of videos by BVI people and found that our automatic video accessibility metrics correlated with the accessibility ratings (Adjusted R$^2$ = 0.642). We augmented a video search interface with our video accessibility metrics and predictions. BVI people using our augmented video search interface selected an accessible video more efficiently than when using the original search interface. By integrating video accessibility metrics, video hosting platforms could help people surface accessible videos and encourage content creators to author more accessible products, improving video accessibility for all.},
booktitle = {Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems},
articleno = {272},
numpages = {14},
keywords = {accessibility, visual impairments, online videos, blind},
location = {Yokohama, Japan},
series = {CHI '21}
}

Revamp: Enhancing Accessible Information Seeking Experience of Online Shopping for Blind or Low Vision Users

Ruolin Wang (UCLA HCI Research)
Zixuan Chen (UCLA HCI Research)
Mingrui 'Ray' Zhang (The Information School, University of Washington)
Zhaoheng Li (Department of Computer Science and Technology, Tsinghua University)
Zhixiu Liu (Computer Science Department, Stanford University)
Zihan Dang (UCLA HCI Research)
Chun Yu (Department of Computer Science and Technology, Tsinghua University)
Xiang 'Anthony' Chen (UCLA HCI Research)

Online shopping has become a valuable modern convenience, but blind or low vision (BLV) users still face significant challenges using it, because of: 1) inadequate image descriptions and 2) the inability to filter large amounts of information using screen readers. To address those challenges, we propose Revamp, a system that leverages customer reviews for interactive information retrieval. Revamp is a browser integration that supports review-based question-answering interactions on a reconstructed product page. From our interview, we identified four main aspects (color, logo, shape, and size) that are vital for BLV users to understand the visual appearance of a product. Based on the findings, we formulated syntactic rules to extract review snippets, which were used to generate image descriptions and responses to users' queries. Evaluations with eight BLV users showed that Revamp 1) provided useful descriptive information for understanding product appearance and 2) helped the participants locate key information efficiently.

Ruolin Wang, Zixuan Chen, Mingrui Ray Zhang, Zhaoheng Li, Zhixiu Liu, Zihan Dang, Chun Yu, and Xiang 'Anthony' Chen. 2021. Revamp: Enhancing Accessible Information Seeking Experience of Online Shopping for Blind or Low Vision Users. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1-14. CHI '21. New York, NY, USA: ACM, 2021. https://doi.org/10.1145/3411764.3445547.

@inproceedings{Wang:2021:Revamp,
author = {Wang, Ruolin and Chen, Zixuan and Zhang, Mingrui Ray and Li, Zhaoheng and Liu, Zhixiu and Dang, Zihan and Yu, Chun and Chen, Xiang 'Anthony'},
title = {Revamp: Enhancing Accessible Information Seeking Experience of Online Shopping for Blind or Low Vision Users},
year = {2021},
isbn = {978-1-4503-8096-6},
publisher = {ACM},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3411764.3445547},
doi = {10.1145/3411764.3445547},
booktitle = {Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems},
pages = {1--14},
numpages = {14},
keywords = {Online shopping; Information Retrieval; Accessibility; Blind or Low Vision Users; Reviews; Image Description; Question-answering},
location = {Yokohama, Japan},
series = {CHI '21}
}

XAlgo: a Design Probe of Explaining Algorithms' Internal States via Question-Answering

Juan Rebanal (UCLA HCI Research)
Jordan Combitsis (UCLA HCI Research)
Yuqi Tang (UCLA HCI Research)
Xiang 'Anthony' Chen (UCLA HCI Research)

Algorithms often appear as 'black boxes' to non-expert users. While prior work focuses on explainable representations and expert-oriented exploration, we propose and study an interactive approach using question answering to explain deterministic algorithms to non-expert users who need to understand the algorithms' internal states (e.g., students learning algorithms, operators monitoring robots, admins troubleshooting network routing). We construct XAlgo---a formal model that first classifies the type of question based on a taxonomy and generates an answer based on a set of rules that extract information from representations of an algorithm's internal states, e.g., the pseudocode. A design probe in an algorithm learning scenario with 18 participants (9 for a Wizard-of-Oz XAlgo and 9 as a control group) reports findings and design implications based on what kinds of questions people ask, how well XAlgo responds, and what remain as challenges to bridge users' gulf of understanding algorithms.

Juan Rebanal, Jordan Combitsis, Yuqi Tang, Xiang 'Anthony' Chen. XAlgo: a Design Probe of Explaining Algorithms' Internal States via Question-Answering. In Proceedings of the 26th International Conference on Intelligent User Interfaces (IUI '21), April 14--17, 2021, College Station, TX, USA. https://doi.org/10.1145/3397481.3450676.

@inproceedings{10.1145/3397481.3450676,
author = {Rebanal, Juan and Combitsis, Jordan and Tang, Yuqi and Chen, Xiang 'Anthony'},
title = {XAlgo: a Design Probe of Explaining Algorithms' Internal States via Question-Answering},
year = {2021},
isbn = {978-1-4503-8017-1},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3397481.3450676},
doi = {10.1145/3397481.3450676},
booktitle = {Proceedings of the 26th International Conference on Intelligent User Interfaces},
series = {IUI '21}
}

OralViewer: 3D Demonstration of Dental Surgeries for Patient Education with Oral Cavity Reconstruction from a 2D Panoramic X-ray

Yuan Liang (UCLA HCI Research)
Liang Qiu (UCLA)
Tiancheng Lu (University of Pittsburgh)
Zhujun Fang, Dezhan Tu, Jiawei Yang, Yiting Shao, Kun Wang (UCLA)
Xiang 'Anthony' Chen (UCLA HCI Research)
Lei He (UCLA)

Patient's understanding on forthcoming dental surgeries is required by patient-centered care and helps reduce anxiety. Due to the complexity of dental surgeries and the patient-dentist expertise gap, conventional techniques of patient education are usually not effective for explaining surgical steps. In this paper, we present OralViewer—the first interactive application that enables dentist's demonstration of dental surgeries in 3D to promote patients' understanding. OralViewer takes a single 2D panoramic dental X-ray to reconstruct patient-specific 3D teeth structures, which are then assembled with registered gum and jaw bone models for complete oral cavity modeling. During the demonstration, OralViewer enables dentists to show surgery steps with virtual dental instruments that can animate effects on a 3D model in real-time. A technical evaluation shows that our deep learning model achieves a mean Intersection over Union (IoU) of 0.771 for 3D teeth reconstruction. A patient study with 12 participants shows OralViewer can improve patients' understanding of surgeries. A preliminary expert study with 3 board-certified dentists further verifies the clinical validity of our system.

Yuan Liang, Liang Qiu, Tiancheng Lu, Zhujun Fang, Dezhan Tu, Jiawei Yang, Yiting Shao, Kun Wang, Xiang 'Anthony' Chen, and Lei He. 2021. OralViewer: 3D Demonstration of Dental Surgeries for Patient Education with Oral Cavity Reconstruction from a 2D Panoramic X-ray. In 26th International Conference on Intelligent User Interfaces (IUI '21). Association for Computing Machinery, New York, NY, USA, 553-563. DOI:https://doi.org/10.1145/3397481.3450695

@inproceedings{10.1145/3397481.3450695,
author = {Liang, Yuan and Qiu, Liang and Lu, Tiancheng and Fang, Zhujun and Tu, Dezhan and Yang, Jiawei and Shao, Yiting and Wang, Kun and Chen, Xiang 'Anthony' and He, Lei},
title = {OralViewer: 3D Demonstration of Dental Surgeries for Patient Education with Oral Cavity Reconstruction from a 2D Panoramic X-Ray},
year = {2021},
isbn = {9781450380171},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3397481.3450695},
doi = {10.1145/3397481.3450695},
abstract = { Patient's understanding on forthcoming dental surgeries is required by patient-centered care and helps reduce anxiety. Due to the complexity of dental surgeries and the patient-dentist expertise gap, conventional techniques of patient education are usually not effective for explaining surgical steps. In this paper, we present OralViewer—the first interactive application that enables dentist's demonstration of dental surgeries in 3D to promote patients' understanding. OralViewer takes a single 2D panoramic dental X-ray to reconstruct patient-specific 3D teeth structures, which are then assembled with registered gum and jaw bone models for complete oral cavity modeling. During the demonstration, OralViewer enables dentists to show surgery steps with virtual dental instruments that can animate effects on a 3D model in real-time. A technical evaluation shows that our deep learning model achieves a mean Intersection over Union (IoU) of 0.771 for 3D teeth reconstruction. A patient study with 12 participants shows OralViewer can improve patients' understanding of surgeries. A preliminary expert study with 3 board-certified dentists further verifies the clinical validity of our system.},
booktitle = {26th International Conference on Intelligent User Interfaces},
pages = {553--563},
numpages = {11},
keywords = {deep learning, 3D visualization, patient education},
location = {College Station, TX, USA},
series = {IUI '21}
}

DualVib: Simulating Haptic Sensation of Dynamic Mass by Combining Pseudo-Force and Texture Feedback

Yudai Tanaka (University of Chicago)
Arata Horie (The University of Tokyo)
Xiang 'Anthony' Chen (UCLA HCI Research)

We present DualVib, a compact handheld device that simulates the haptic sensation of manipulating dynamic mass; mass that causes haptic feedback as the user's hand moves (e.g., shaking a jar and feeling coins rattling inside). Unlike other devices that require actual displacement of weight, DualVib dispenses with heavy and bulky mechanical structures and, instead, uses four vibration actuators. DualVib simulates a dynamic mass by simultaneously delivering two types of haptic feedback to the user's hand: (1) pseudo-force feedback created by asymmetric vibrations that render the kinesthetic force arising from the moving mass; and (2) texture feedback through acoustic vibrations that render the object's surface vibrations correlated with mass material properties. By means of our user study, we found out that DualVib allowed users to more effectively distinguish dynamic masses when compared to using either pseudo-force or texture feedback alone. We also report qualitative feedback from users who experienced five virtual reality applications with our device.

Yudai Tanaka, Arata Horie, Xiang 'Anthony' Chen. 2020. DualVib: Simulating Haptic Sensation of Dynamic Mass by Combining Pseudo-Force and Texture Feedback. In 26th ACM Symposium on Virtual Reality Software and Technology, 1-10. VRST '20. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3385956.3418964.

@inproceedings{10.1145/3385956.3418964,
   author = {Tanaka, Yudai and Horie, Arata and Chen, Xiang 'Anthony'},
   title = {DualVib: Simulating Haptic Sensation of Dynamic Mass by Combining Pseudo-Force and Texture Feedback},
   year = {2020},
   isbn = {9781450376198},
   publisher = {Association for Computing Machinery},
   address = {New York, NY, USA},
   url = {https://doi.org/10.1145/3385956.3418964},
   doi = {10.1145/3385956.3418964},
   booktitle = {26th ACM Symposium on Virtual Reality Software and Technology},
   pages = {1--10},
   numpages = {10},
   keywords = {haptics, virtual reality, mass perception, vibration},
   location = {Virtual Event, Canada},
   series = {VRST '20}
}

Geno: A Developer Tool for Authoring Multimodal Interaction on Existing Web Applications

Ritam Jyoti Sarmah (UCLA HCI Research)
Yunpeng Ding (UCLA HCI Research)
Di Wang (UCSD Computer Science)
Cheuk Yin Phipson Lee (UCLA HCI Research)
Toby Jia-Jun Li (CMU HCII)
Xiang 'Anthony' Chen (UCLA HCI Research)

Supporting voice commands in applications presents significant benefits to users. However, adding such support to existing GUI-based web apps is effort-consuming with a high learning barrier, as shown in our formative study, due to the lack of unified support for creating multimodal interfaces. We present Geno---a developer tool for adding the voice input modality to existing web apps without requiring significant NLP expertise. Geno provides a high-level workflow for developers to specify functionalities to be supported by voice (intents), create language models for detecting intents and the relevant information (parameters) from user utterances, and fulfill the intents by either programmatically invoking the corresponding functions or replaying GUI actions on the web app. Geno further supports multimodal references to GUI context in voice commands (e.g. 'move this [event] to next week' while pointing at an event with the cursor). In a study, developers with little NLP expertise were able to add multimodal voice command support for two existing web apps using Geno.

Sarmah, R.J., Ding, Y., Wang, D., Lee, C.Y.P., Li, T.J.J. and Chen, X.A., 2020, October. Geno: A Developer Tool for Authoring Multimodal Interaction on Existing Web Applications. In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (pp. 1169-1181).

@inproceedings{sarmah2020geno,
title={Geno: A Developer Tool for Authoring Multimodal Interaction on Existing Web Applications},
author={Sarmah, Ritam Jyoti and Ding, Yunpeng and Wang, Di and Lee, Cheuk Yin Phipson and Li, Toby Jia-Jun and Chen, Xiang 'Anthony'},
booktitle={Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology},
pages={1169--1181},
year={2020}
}

Romeo: A Design Tool for Embedding Transformable Parts in 3D Models to Robotically Augment Default Functionalities

Jiahao Li (UCLA HCI Research)
Meilin Cui (UCLA HCI Research)
Jeeeun Kim (Texas A&M University)
Xiang 'Anthony' Chen (UCLA HCI Research)

Reconfiguring shapes of objects enables transforming existing passive objects with robotic functionalities, e.g., a transformable coffee cup holder can be attached to a chair's armrest, a piggy bank can reach out an arm to 'steal' coins. Despite the advance in end-user 3D design and fabrication, it remains challenging for non-experts to create such 'transformables' using existing tools due to the requirement of specific engineering knowledge such as mechanisms and robotic design. We present Romeo -- a design tool for creating transformables to robotically augment objects' default functionalities. Romeo allows users to transform an object into a robotic arm by expressing at a high level what type of task is expected. Users can select which part of the object to be transformed, specify motion points in space for the transformed part to follow and the corresponding action to be taken. Romeo then automatically generates a robotic arm embedded in the transformable part ready for fabrication. A design session validated this tool where participants used Romeo to accomplish controlled design tasks and to open-endedly create coin-stealing piggy banks by transforming 3D objects of their own choice.

Li, J., Cui, M., Kim, J. and Chen, X.A., 2020, October. Romeo: A Design Tool for Embedding Transformable Parts in 3D Models to Robotically Augment Default Functionalities. In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (pp. 897-911).

@inproceedings{li2020romeo,
title={Romeo: A Design Tool for Embedding Transformable Parts in 3D Models to Robotically Augment Default Functionalities},
author={Li, Jiahao and Cui, Meilin and Kim, Jeeeun and Chen, Xiang 'Anthony'},
booktitle={Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology},
pages={897--911},
year={2020}
}

CheXplain: Enabling Physicians to Explore and Understand Data-Driven, AI-Enabled Medical Imaging Analysis

Yao Xie (UCLA HCI Research)
Melody Chen (UCLA HCI Research)
David Kao (UCLA HCI Research)
Ge Gao (University of Maryland, College Park)
Xiang 'Anthony' Chen (UCLA HCI Research)

The recent development of data-driven AI promises to automate medical diagnosis; however, most AI functions as 'black boxes' to physicians with limited computational knowledge. Using medical imaging as a point of departure, we conducted three research activities to formulate the design of CheXplain---a system that enables physicians to explore and understand AI-enabled chest X-ray analysis: (i) a paired survey between referring physicians and radiologists reveals whether, when, and what kinds of explanations are needed; (ii) a low-fidelity prototype co-designed with three physicians formulates eight key features; and (iii) a high-fidelity prototype evaluated by another six medical professionals provides detailed summative insights on how each feature enables the exploration and understanding of AI. We summarize by discussing recommendations for future work to design and implement explainable medical AI systems that encompass four recurring themes: motivation, constraint, explanation, and justification.

Yao Xie, Melody Chen, David Kao, Ge Gao, and Xiang 'Anthony' Chen. 2020. CheXplain: Enabling Physicians to Explore and Understand Data-Driven, AI-Enabled Medical Imaging Analysis. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI '20). Association for Computing Machinery, New York, NY, USA, 1-13. DOI:https://doi.org/10.1145/3313831.3376807

@inproceedings{10.1145/3313831.3376807,
   author = {Xie, Yao and Chen, Melody and Kao, David and Gao, Ge and Chen, Xiang 'Anthony'},
   title = {CheXplain: Enabling Physicians to Explore and Understand Data-Driven, AI-Enabled Medical Imaging Analysis},
   year = {2020},
   isbn = {9781450367080},
   publisher = {Association for Computing Machinery},
   address = {New York, NY, USA},
   url = {https://doi.org/10.1145/3313831.3376807},
   doi = {10.1145/3313831.3376807},
   booktitle = {Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems},
   pages = {1--13},
   numpages = {13},
   keywords = {explainable artificial intelligence, physician-centered design, system design},
   location = {Honolulu, HI, USA},
   series = {CHI '20}
}

OralCam: Enabling Self-Examination and Awareness of Oral Health Using a Smartphone Camera

Yuan Liang (UCLA HCI Research & Design Automation Lab)
Hsuan Wei Fan (Tsinghua University)
Zhujun Fang (UC Davis)
Leiying Miao, Wen Li, Xuan Zhang, Weibin Sun (Nanjing Stomatological Hospital, Medical School)
Kun Wang, Lei He (UCLA Design Automation Lab)
Xiang 'Anthony' Chen (UCLA HCI Research)

Due to a lack of resources or awareness, oral diseases are often left unexamined and untreated, affecting a large population worldwide. With the advent of low-cost, sensor-equipped smartphones, mobile apps offer a promising possibility. However, to the best of our knowledge, no mobile health (mHealth) solutions can directly support users to self-examine their oral health condition. We present OralCam, the first interactive app that enables end-users' self-examination of five common oral conditions (diseases or early disease signals) by taking smartphone photos of one's oral cavity. OralCam allows a user to annotate additional information to augment the input image, and presents the output hierarchically, probabilistically and with visual explanations to help laymen users understand examination results. We describe a deep learning backend trained on a data set that consists of 3,182 oral photos. We report a technical evaluation, a week-long in-the-wild user study and an expert interview to validate OralCam.

Yuan Liang, Hsuan Wei Fan, Zhujun Fang, Leiying Miao, Wen Li, Xuan Zhang, Weibin Sun, Kun Wang, Lei He, and Xiang 'Anthony' Chen. 2020. OralCam: Enabling Self-Examination and Awareness of Oral Health Using a Smartphone Camera. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI '20). Association for Computing Machinery, New York, NY, USA, 1-13. DOI:https://doi.org/10.1145/3313831.3376238

@inproceedings{10.1145/3313831.3376238,
   author = {Liang, Yuan and Fan, Hsuan Wei and Fang, Zhujun and Miao, Leiying and Li, Wen and Zhang, Xuan and Sun, Weibin and Wang, Kun and He, Lei and Chen, Xiang 'Anthony'},
   title = {OralCam: Enabling Self-Examination and Awareness of Oral Health Using a Smartphone Camera},
   year = {2020},
   isbn = {9781450367080},
   publisher = {Association for Computing Machinery},
   address = {New York, NY, USA},
   url = {https://doi.org/10.1145/3313831.3376238},
   doi = {10.1145/3313831.3376238},
   booktitle = {Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems},
   pages = {1--13},
   numpages = {13},
   keywords = {mobile health, oral health, artificial intelligence, deep learning},
   location = {Honolulu, HI, USA},
   series = {CHI '20}
}

Robiot: A Design Tool for Actuating Everyday Objects with Automatically Generated 3D Printable Mechanisms

Jiahao Li (UCLA HCI Research)
Jeeeun Kim (Texas A&M University)
Xiang 'Anthony' Chen (UCLA HCI Research)

Users can now easily communicate information with an Internet of Things; in contrast, there remains a lack of support to automate tasks that involve legacy static objects, e.g. adjusting a desk lamp's angle for optimal brightness, turning on/off a manual faucet when washing dishes, sliding a window to maintain a preferred indoor temperature. Automating these simple physical tasks has the potential to improve people's quality of life, which is particularly important for people with a disability or in situational impairment.

We present Robiot -- a design tool for generating mechanisms that can be attached to, motorize, and actuate legacy static objects to perform simple physical tasks. Users only need to take a short video manipulating an object to demonstrate an intended physical behavior. Robiot then extracts requisite parameters and automatically generates 3D models of the enabling actuation mechanisms by performing a scene and motion analysis of the 2D video in alignment with the object's 3D model. In an hour-long design session, six participants used Robiot to actuate seven everyday objects, imbuing them with the robotic capability to automate various physical tasks.

Li, J., Kim, J., & Chen, X. “Anthony.” (2019). Robiot: A Design Tool for Actuating Everyday Objects with Automatically Generated 3D Printable Mechanisms. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology (pp. 673-685). New York, NY, USA: ACM. https://doi.org/10.1145/3332165.3347894

@inproceedings{Li:2019:RDT:3332165.3347894,
author = {Li, Jiahao and Kim, Jeeeun and Chen, Xiang 'Anthony'},
title = {Robiot: A Design Tool for Actuating Everyday Objects with Automatically Generated 3D Printable Mechanisms},
booktitle = {Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology},
series = {UIST '19},
year = {2019},
isbn = {978-1-4503-6816-2},
location = {New Orleans, LA, USA},
pages = {673--685},
numpages = {13},
url = {http://doi.acm.org/10.1145/3332165.3347894},
doi = {10.1145/3332165.3347894},
acmid = {3347894},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {actuation, design tool, everyday objects, generative design},
}

Minuet: Multimodal Interaction with an Internet of Things

Runchang Kang & Anhong Guo (Carnegie Mellon University)
Gierad Laput (Apple)
Yang Li (Google)
Xiang 'Anthony' Chen (UCLA HCI Research)

A large number of Internet-of-Things (IoT) devices will soon populate our physical environments. Yet, IoT devices' reliance on mobile applications and voice-only assistants as the primary interface limits their scalability and expressiveness. Building off of the classic 'Put-That-There' system, we contribute an exploration of the design space of voice + gesture interaction with spatially-distributed IoT devices. Our design space decomposes users' IoT commands into two components---selection and interaction. We articulate how the permutations of voice and freehand gesture for these two components can complementarily afford interaction possibilities that go beyond current approaches. We instantiate this design space as a proof-of-concept sensing platform and demonstrate a series of novel IoT interaction scenarios, such as making 'dumb' objects smart, commanding robotic appliances, and resolving ambiguous pointing at cluttered devices.

Kang, R., Guo, A., Laput, G., Li, Y., & Chen, X. “Anthony.” (2019). Minuet: Multimodal Interaction with an Internet of Things. In Symposium on Spatial User Interaction (pp. 2:1--2:10). New York, NY, USA: ACM. https://doi.org/10.1145/3357251.3357581

@inproceedings{Kang:2019:MMI:3357251.3357581,
address = {New York, NY, USA},
author = {Kang, Runchang and Guo, Anhong and Laput, Gierad and Li, Yang and Chen, Xiang 'Anthony'},
booktitle = {Symposium on Spatial User Interaction},
doi = {10.1145/3357251.3357581},
isbn = {978-1-4503-6975-6},
keywords = {gesture, multimodal interaction, voice, Internet-of-Things},
pages = {2:1--2:10},
publisher = {ACM},
series = {SUI '19},
title = {{Minuet: Multimodal Interaction with an Internet of Things}},
url = {http://doi.acm.org/10.1145/3357251.3357581},
year = {2019}
}

Forte: User-Driven Generative Design

Xiang 'Anthony' Chen (UCLA HCI Research)
Ye Tao (Zhejiang University)
Guanyun Wang (Carnegie Mellon University)
Runchang Kang (Carnegie Mellon University)
Tovi Grossman (University of Toronto)
Stelian Coros (ETH Zurich)
Scott Hudson (Carnegie Mellon University)

Low-cost fabrication machines (e.g., 3D printers) offer the promise of creating custom-designed objects by a range of users. To maximize performance, generative design methods such as topology optimization can automatically optimize properties of a design based on high-level specifications. Though promising, such methods require people to map their design ideas--often unintuitively--to a small number of mathematical input parameters, and the relationship between those parameters and a generated design is often unclear, making it difficult to iterate a design. We present Forte, a sketch-based, real-time interactive tool for people to directly express and iterate on their designs via 2D topology optimization. Users can ask the system to add structures, provide a variation with better performance, or optimize internal material layouts. Users can globally control how much to 'deviate' from the initial sketch, or perform local suggestive editing, which interactively prompts the system to update based on the new information. Design sessions with 10 participants demonstrate that Forte empowers designers to create and explore a range of optimized designs with custom forms and styles.

Chen, X., Tao, Y., Wang, G., Kang, R., Grossman, T., Coros, S., & Hudson, S. E. (2018). Forte: User-Driven Generative Design. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (p. 496).

@inproceedings{chen2018forte,
title={Forte: User-Driven Generative Design},
author={Chen, Xiang 'Anthony' and Tao, Ye and Wang, Guanyun and Kang, Runchang and Grossman, Tovi and Coros, Stelian and Hudson, Scott E},
booktitle={Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems},
pages={496},
year={2018},
organization={ACM}
}

Medley: A Library of Embeddables to Explore Rich Material Properties for 3D Printed Objects

Xiang 'Anthony' Chen (UCLA HCI Research)
Stelian Coros (ETH Zurich)
Scott Hudson (Carnegie Mellon University)

In our everyday life, we interact with and benefit from objects with a wide range of material properties. In contrast, personal fabrication machines (e.g., desktop 3D printers) currently only support a much smaller set of materials. Our goal is to close the gap between current limitations and the future of multi-material printing by enabling people to explore the reuse of material from everyday objects into their custom designs. To achieve this, we develop a library of embeddables--everyday objects that can be cut, worked and embedded into 3D printable designs. We describe a design space that characterizes the geometric and material properties of embeddables. We then develop Medley---a design tool whereby users can import a 3D model, search for embeddables with desired material properties, and interactively edit and integrate their geometry to fit into the original design. Medley also supports the final fabrication and embedding process, including instructions for carving or cutting the objects, and generating optimal paths for inserting embeddables. To validate the expressiveness of our library, we showcase numerous examples augmented by embeddables that go beyond the objects' original printed materials.

Chen, X., Coros, S., & Hudson, S. E. (2018). Medley: A Library of Embeddables to Explore Rich Material Properties for 3D Printed Objects. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (p. 162).

@inproceedings{chen2018medley,
title={Medley: A Library of Embeddables to Explore Rich Material Properties for 3D Printed Objects},
author={Chen, Xiang 'Anthony' and Coros, Stelian and Hudson, Scott E},
booktitle={Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems},
pages={162},
year={2018},
organization={ACM}
}

Improv: An Input Framework for Improvising Cross-Device Interaction by Demonstration

Xiang 'Anthony' Chen (UCLA HCI Research)
Yang Li (Google)

As computing devices become increasingly ubiquitous, it is now possible to combine the unique capabilities of different devices or Internet of Things to accomplish a task. However, there is currently a high technical barrier for creating cross-device interaction. This is especially challenging for end users who have limited technical expertise—end users would greatly benefit from custom cross-device interaction that best suits their needs. In this article, we present Improv, a cross-device input framework that allows a user to easily leverage the capability of additional devices to create new input methods for an existing, unmodified application, e.g., creating custom gestures on a smartphone to control a desktop presentation application. Instead of requiring developers to anticipate and program these cross-device behaviors in advance, Improv enables end users to improvise them on the fly by simple demonstration, for their particular needs and devices at hand. We showcase a range of scenarios where Improv is used to create a diverse set of useful cross-device input. Our study with 14 participants indicated that on average it took a participant 10 seconds to create a cross-device input technique. In addition, Improv achieved 93.7% accuracy in interpreting user demonstration of a target UI behavior by looking at the raw input events from a single example.

Chen, X., & Li, Y. (2017). Improv: An Input Framework for Improvising Cross-Device Interaction by Demonstration. ACM Transactions on Computer-Human Interaction (TOCHI), 24(2), 15.

@article{chen2017improv,
title={Improv: An Input Framework for Improvising Cross-Device Interaction by Demonstration},
author={Chen, Xiang 'Anthony' and Li, Yang},
journal={ACM Transactions on Computer-Human Interaction (TOCHI)},
volume={24},
number={2},
pages={15},
year={2017},
publisher={ACM}
}

Reprise: A Design Tool for Specifying, Generating, and Customizing 3D Printable Adaptations on Everyday Objects

Xiang 'Anthony' Chen (UCLA HCI Research)
Jeeeun Kim (Texas A&M University)
Jennifer Mankoff (University of Washington)
Tovi Grossman (University of Toronto)
Stelian Coros (ETH Zurich)
Scott Hudson (Carnegie Mellon University)

In this paper, we describe Reprise--a design tool for specifying, generating, customizing and fitting adaptations onto existing household objects. Reprise allows users to express at a high level what type of action is applied to an object. Based on this high level specification, Reprise automatically generates adaptations. Users can use simple sliders to customize the adaptations to better suit their particular needs and preferences, such as increasing the tightness for gripping, enhancing torque for rotation, or making a larger base for stability. Finally, Reprise provides a toolkit of fastening methods and support structures for fitting the adaptations onto existing objects.

Chen, X., Kim, J., Mankoff, J., Grossman, T., Coros, S., & Hudson, S. E. (2016). Reprise: A Design Tool for Specifying, Generating, and Customizing 3D Printable Adaptations on Everyday Objects. In Proceedings of the 29th Annual ACM Symposium on User Interface Software and Technology.

@inproceedings{chen2016reprise,
title={Reprise: A Design Tool for Specifying, Generating, and Customizing 3D Printable Adaptations on Everyday Objects},
author={Chen, Xiang 'Anthony' and Kim, Jeeeun and Mankoff, Jennifer and Grossman, Tovi and Coros, Stelian and Hudson, Scott E},
booktitle={Proceedings of the 29th Annual ACM Symposium on User Interface Software and Technology},
year={2016},
organization={ACM}
}

Bootstrapping User-Defined Body Tapping Recognition with Offline-Learned Probabilistic Representation

Xiang 'Anthony' Chen (UCLA HCI Research)
Yang Li (Google)

To address the increasing functionality (or information) overload of smartphones, prior research has explored a variety of methods to extend the input vocabulary of mobile devices. In particular, body tapping has been previously proposed as a technique that allows the user to quickly access a target functionality by simply tapping at a specific location of the body with a smartphone. Though compelling, prior work often fell short in enabling users' unconstrained tapping locations or behaviors. To address this problem, we developed a novel recognition method that combines both offline (before the system sees any user-defined gestures) and online learning to reliably recognize arbitrary, user-defined body tapping gestures, only using a smartphone's built-in sensors. Our experiment indicates that our method significantly outperforms baseline approaches in several usage conditions. In particular, provided only with a single sample per location, our accuracy is 30.8% higher than an SVM baseline and 24.8% higher than a template matching method. Based on these findings, we discuss how our approach can be generalized to other user-defined gesture problems.

Chen, X., & Li, Y. (2016). Bootstrapping User-Defined Body Tapping Recognition with Offline-Learned Probabilistic Representation. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (pp. 359-364).

@inproceedings{chen2016bootstrapping,
title={Bootstrapping User-Defined Body Tapping Recognition with Offline-Learned Probabilistic Representation},
author={Chen, Xiang 'Anthony' and Li, Yang},
booktitle={Proceedings of the 29th Annual Symposium on User Interface Software and Technology},
pages={359--364},
year={2016},
organization={ACM}
}

Encore: 3D Printed Augmentation of Everyday Objects with Printed-Over, Affixed and Interlocked Attachments

Xiang 'Anthony' Chen (UCLA HCI Research)
Stelian Coros (ETH Zurich)
Jennifer Mankoff (University of Washington)
Scott Hudson (Carnegie Mellon University)

One powerful aspect of 3D printing is its ability to extend, repair, or more generally modify everyday objects. However, nearly all existing work implicitly assumes that whole objects are to be printed from scratch. This paper presents a framework for 3D printing to augment existing objects that covers a wide range of attachment options. We illustrate the framework through three exemplar attachment techniques (print-over, print-to-affix and print-through), implemented in Encore, a design tool that supports a set of analysis metrics relating to viability, durability and usability that are visualized for the user to explore design options and tradeoffs. Encore also generates 3D models for production, addressing issues such as support jigs and contact geometry between the attached part and the original object. Our validation helps to illustrate the strengths and weaknesses of each technique. For example, we characterize how surface curvature and roughness affect print-over's strength compared to the conventional print-in-one-piece.

Chen, X., Coros, S., Mankoff, J., & Hudson, S. E. (2015). Encore: 3D printed augmentation of everyday objects with printed-over, affixed and interlocked attachments. In Proceedings of the 28th Annual ACM Symposium on User Interface Software and Technology (pp. 73-82).

@inproceedings{chen2015encore,
title={Encore: 3D printed augmentation of everyday objects with printed-over, affixed and interlocked attachments},
author={Chen, Xiang 'Anthony' and Coros, Stelian and Mankoff, Jennifer and Hudson, Scott E},
booktitle={Proceedings of the 28th Annual ACM Symposium on User Interface Software and Technology},
pages={73--82},
year={2015},
organization={ACM}
}

Duet: Exploring Joint Interactions on a Smart Phone and a Smart Watch

Xiang 'Anthony' Chen (UCLA HCI Research)
Tovi Grossman (University of Toronto)
Daniel Wigdor (University of Toronto)
George Fitzmaurice (Autodesk Research)

The emergence of smart devices (e.g., smart watches and smart eyewear) is redefining mobile interaction from the solo performance of a smart phone, to a symphony of multiple devices. In this paper, we present Duet - an interactive system that explores a design space of interactions between a smart phone and a smart watch. Based on the devices' spatial configurations, Duet coordinates their motion and touch input, and extends their visual and tactile output to one another. This transforms the watch into an active element that enhances a wide range of phone-based interactive tasks, and enables a new class of multi-device gestures and sensing techniques. A technical evaluation shows the accuracy of these gestures and sensing techniques, and a subjective study on Duet provides insights, observations, and guidance for future work.

Chen, X., Grossman, T., Wigdor, D. J., & Fitzmaurice, G. (2014). Duet: exploring joint interactions on a smart phone and a smart watch. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 159-168).

@inproceedings{chen2014duet,
title={Duet: exploring joint interactions on a smart phone and a smart watch},
author={Chen, Xiang 'Anthony' and Grossman, Tovi and Wigdor, Daniel J and Fitzmaurice, George},
booktitle={Proceedings of the SIGCHI Conference on Human Factors in Computing Systems},
pages={159--168},
year={2014},
organization={ACM}
}

Air+Touch: Interweaving Touch & In-Air Gestures

Xiang 'Anthony' Chen (UCLA HCI Research)
Julia Schwarz (Microsoft Research)
Chris Harrison (Carnegie Mellon University)
Jennifer Mankoff (University of Washington)
Scott Hudson (Carnegie Mellon University)

We present Air+Touch, a new class of interactions that interweave touch events with in-air gestures, offering a unified input modality with expressiveness greater than each input modality alone. We demonstrate how air and touch are highly complementary: touch is used to designate targets and segment in-air gestures, while in-air gestures add expressivity to touch events. For example, a user can draw a circle in the air and tap to trigger a context menu, do a finger 'high jump' between two touches to select a region of text, or drag and in-air 'pigtail' to copy text to the clipboard. Through an observational study, we devised a basic taxonomy of Air+Touch interactions, based on whether the in-air component occurs before, between or after touches. To illustrate the potential of our approach, we built four applications that showcase seven exemplar Air+Touch interactions we created.

Chen, X., Schwarz, J., Harrison, C., Mankoff, J., & Hudson, S. E. (2014). Air + touch: interweaving touch & in-air gestures. In Proceedings of the 27th annual ACM symposium on User interface software and technology (pp. 519-525).

@inproceedings{chen2014air+,
title={Air+Touch: interweaving touch and in-air gestures},
author={Chen, Xiang 'Anthony' and Schwarz, Julia and Harrison, Chris and Mankoff, Jennifer and Hudson, Scott E},
booktitle={Proceedings of the 27th annual ACM symposium on User interface software and technology},
pages={519--525},
year={2014},
organization={ACM}
}

Swipeboard: A Text Entry Technique for Ultra-Small Interfaces That Supports Novice to Expert Transitions

Xiang 'Anthony' Chen (UCLA HCI Research)
Tovi Grossman (University of Toronto)
George Fitzmaurice (Autodesk Research)

Ultra-small smart devices, such as smart watches, have become increasingly popular in recent years. Most of these devices rely on touch as the primary input modality, which makes tasks such as text entry increasingly difficult as the devices continue to shrink. In the sole pursuit of entry speed, the ultimate solution is a shorthand technique (e.g., Morse code) that sequences tokens of input (e.g., key, tap, swipe) into unique representations of each character. However, learning such techniques is hard, as it often resorts to rote memory. Our technique, Swipeboard, leverages our spatial memory of a QWERTY keyboard to learn, and eventually master, a shorthand, eyes-free text entry method designed for ultra-small interfaces. Characters are entered with two swipes; the first swipe specifies the region where the character is located, and the second swipe specifies the character within that region. Our study showed that, with less than two hours' training and tested on a reduced word set, Swipeboard users achieved 19.58 words per minute (WPM), 15% faster than an existing baseline technique.

Chen, X., Grossman, T., & Fitzmaurice, G. (2014). Swipeboard: a text entry technique for ultra-small interfaces that supports novice to expert transitions. In Proceedings of the 27th annual ACM symposium on User interface software and technology (pp. 615-620).

@inproceedings{chen2014swipeboard,
title={Swipeboard: a text entry technique for ultra-small interfaces that supports novice to expert transitions},
author={Chen, Xiang 'Anthony' and Grossman, Tovi and Fitzmaurice, George},
booktitle={Proceedings of the 27th annual ACM symposium on User interface software and technology},
pages={615--620},
year={2014},
organization={ACM}
}

Around-Body Interaction: Sensing & Interaction Techniques for Proprioception-Enhanced Input with Mobile Devices

Xiang 'Anthony' Chen (UCLA HCI Research)
Julia Schwarz (Microsoft Research)
Chris Harrison (Carnegie Mellon University)
Jennifer Mankoff (University of Washington)
Scott Hudson (Carnegie Mellon University)

The space around the body provides a large interaction volume that can allow for big interactions on small mobile devices. However, interaction techniques making use of this opportunity are underexplored, primarily focusing on distributing information in the space around the body. We demonstrate three types of around-body interaction including canvas, modal and context-aware interactions in six demonstration applications. We also present a sensing solution using standard smartphone hardware: a phone's front camera, accelerometer and inertial measurement units. Our solution allows a person to interact with a mobile device by holding and positioning it between a normal field of view and its vicinity around the body. By leveraging a user's proprioceptive sense, Around-Body Interaction opens a new input channel that enhances conventional interaction on a mobile device without requiring additional hardware.

Chen, X., Schwarz, J., Harrison, C., Mankoff, J., & Hudson, S. (2014). Around-body interaction: sensing & interaction techniques for proprioception-enhanced input with mobile devices. In Proceedings of the 16th international conference on Human-computer interaction with mobile devices & services (pp. 287-290).

@inproceedings{chen2014around,
title={Around-body interaction: sensing and interaction techniques for proprioception-enhanced input with mobile devices},
author={Chen, Xiang 'Anthony' and Schwarz, Julia and Harrison, Chris and Mankoff, Jennifer and Hudson, Scott},
booktitle={Proceedings of the 16th international conference on Human-computer interaction with mobile devices and services},
pages={287--290},
year={2014},
organization={ACM}
}

Extending a Mobile Device's Interaction Space through Body-Centric Interaction

Xiang 'Anthony' Chen (UCLA HCI Research)
Nicolai Marquardt (University College London)
Anthony Tang (University of Toronto)
Sebastian Boring (Aalborg University)
Saul Greenberg (University of Calgary)

Modern mobile devices rely on the screen as a primary input modality. Yet the small screen real-estate limits interaction possibilities, motivating researchers to explore alternate input techniques. Within this arena, our goal is to develop Body-Centric Interaction with Mobile Devices: a class of input techniques that allow a person to position and orient her mobile device to navigate and manipulate digital content anchored in the space on and around the body. To achieve this goal, we explore such interaction in a bottom-up path of prototypes and implementations. From our experiences, as well as by examining related work, we discuss and present three recurring themes that characterize how these interactions can be realized. We illustrate how these themes can inform the design of Body-Centric Interactions by applying them to the design of a novel mobile browser application. Overall, we contribute a class of mobile input techniques where interactions are extended beyond the small screen, and are instead driven by a person's movement of the device on and around the body.

Chen, X., Marquardt, N., Tang, A., Boring, S., & Greenberg, S. (2012). Extending a mobile device's interaction space through body-centric interaction. In Proceedings of the 14th international conference on Human-computer interaction with mobile devices and services (pp. 151-160).

@inproceedings{chen2012extending,
title={Extending a mobile device's interaction space through body-centric interaction},
author={Chen, Xiang 'Anthony' and Marquardt, Nicolai and Tang, Anthony and Boring, Sebastian and Greenberg, Saul},
booktitle={Proceedings of the 14th international conference on Human-computer interaction with mobile devices and services},
pages={151--160},
year={2012},
organization={ACM}
}

Spalendar: Visualizing a Group's Calendar Events over a Geographic Space on a Public Display

Xiang 'Anthony' Chen (UCLA HCI Research)
Sebastian Boring (Aalborg University)
Sheelagh Carpendale (Simon Fraser University)
Anthony Tang (University of Toronto)
Saul Greenberg (University of Calgary)

Portable paper calendars (i.e., day planners and organizers) have greatly influenced the design of group electronic calendars. Both use time units (hours/days/weeks/etc.) to organize visuals, with useful information (e.g., event types, locations, attendees) usually presented as text fields, perhaps abbreviated or even hidden, within those time units. The problem is that, for a group, this visual sorting of individual events into time buckets conveys only limited information about the social network of people. For example, people's whereabouts cannot be read 'at a glance' but require examining the text. Our goal is to explore an alternate visualization that can reflect and illustrate group members' calendar events. Our main idea is to display the group's calendar events as spatiotemporal activities occurring over a geographic space animated over time, all presented on a highly interactive public display. In particular, our SPALENDAR (SPAtial CALENDAR) design animates people's past, present and forthcoming movements between event locations as well as their static locations. Details of people's events, their movements and their locations are progressively revealed and controlled by the viewer's proximity to the display, their identity, and their gestural interactions with it, all of which are tracked by the public display.

Chen, X., Boring, S., Carpendale, S., Tang, A., & Greenberg, S. (2012). Spalendar: visualizing a group's calendar events over a geographic space on a public display. In AVI '12 Proceedings of the International Working Conference on Advanced Visual Interfaces (pp. 689-696).

@inproceedings{chen2012spalendar,
title={Spalendar: visualizing a group's calendar events over a geographic space on a public display},
author={Chen, Xiang 'Anthony' and Boring, Sebastian and Carpendale, Sheelagh and Tang, Anthony and Greenberg, Saul},
booktitle={Proceedings of the International Working Conference on Advanced Visual Interfaces},
pages={689--696},
year={2012},
organization={ACM}
}

Projects