Talk Session, Wednesday, May 20, 9:00 am – 10:15 am

XR I: Performance

Talk 1, 9:00 am

Optimizing Visual Comfort in Augmented Reality: Effects of Monocular and Binocular Displays and Focal Distances

Daniel Spiegel1; 1Meta Reality Labs

Augmented reality (AR) see-through displays, both monocular and binocular, pose distinct challenges to the visual system. Monocular displays can cause binocular rivalry, while binocular displays may lead to vergence-accommodation and occlusion conflicts. Understanding how these perceptual issues affect subjective visual comfort is key to designing comfortable AR experiences. We assessed instantaneous visual comfort in simulated monocular and binocular AR. The apparatus was a custom-made varifocal AR headset enabling dynamic adjustment of the display focal distance. AR content was a virtual recipe rendered over an image of a kitchen counter presented on a high-brightness DynaScan display at 1.2 m, achieving an additive contrast ratio of approximately 4:1. Twenty-five participants (mean age 29.5 ± 5.8 years) with normal vision completed three tasks: (1) subjective comfort ranking while looking at the recipe across focal distances (0.25 D, 0.5 D, 1.0 D, 1.5 D, 2.0 D, and 2.5 D) and display modes (binocular at 1.0 m, right-eye monocular, left-eye monocular); (2) direct comparison of comfort between binocular and monocular modes at the same focal distances; and (3) comfort ranking while viewing the background through the AR content (same conditions as in the first task). The data were analyzed using a repeated measures ANOVA with factors of display mode and focal distance. The binocular display provided consistently greater visual comfort (by approximately 30%) across the range of focal distances (main effect of display mode, p < 0.001 in all tasks). Display focal distance affected comfort only in the first task (p < 0.001), with comfort peaking at 1.0 D and 1.5 D. We found no measurable effect of eye (left vs. right) or sensory eye dominance. Binocular AR displays offer greater overall visual comfort; however, to ensure an optimal user experience, it remains essential to address and mitigate binocular artifacts in binocular AR displays.
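As a minimal sketch of the kind of repeated-measures analysis described above, the following assumes long-format comfort data with one row per participant, display mode, and focal distance; the column names and file path are illustrative, not the authors' code.

```python
# Minimal sketch: two-way repeated-measures ANOVA on comfort rankings
# (column names and CSV path are illustrative, not the authors' code).
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Long-format data: one row per participant x display mode x focal distance.
# AnovaRM requires a fully balanced design (every participant in every cell).
df = pd.read_csv("comfort_rankings.csv")  # columns: participant, display_mode, focal_distance_D, comfort

result = AnovaRM(
    data=df,
    depvar="comfort",
    subject="participant",
    within=["display_mode", "focal_distance_D"],
).fit()

print(result.anova_table)  # F and p values for display mode, focal distance, and their interaction
```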

Talk 2, 9:15 am

Designing Accessible Multimodal Cues to Increase Eye-Tracking Accuracy in Virtual Reality

Batuhan Erkat1, Dinesh Venugopal1, Arathy Kartha1; 1SUNY College of Optometry

Eye tracking is widely used in virtual reality (VR) to enable gaze-based interaction and to support research on visual attention and user behavior. However, standard calibration protocols create a systematic barrier for people with low and ultra-low vision (ULV): they typically require precise fixation on small or low-contrast visual targets that may be difficult or impossible to perceive reliably. This barrier limits participation in VR studies and, in turn, risks producing interfaces and assistive systems optimized primarily for typical vision. To address this challenge, we developed an accessibility-oriented calibration system that uses multimodal stimuli to support target acquisition and to improve calibration quality. In our system, we present large, high-contrast visual targets arranged on a 3×3 grid spanning the VR display. Each target is paired with a spatial audio cue that announces the target position and is rendered from the target’s location, enabling auditory localization to guide gaze. We sample gaze data for 10 seconds at each target to obtain sufficient observations for stable estimation. We then map the observed gaze samples to known target locations and apply a post-hoc calibration model to estimate and correct systematic offsets; this correction is subsequently applied to downstream VR tasks. Our approach enables inclusive data collection for user interface research and helps reduce low-accuracy eye-tracking data in populations with vision loss. Specifically, the integration of accessible visual targets with spatial audio guidance provides a generalizable approach for eye-tracking applications that require robust calibration across diverse visual abilities, advancing inclusive VR interaction and assistive technology development.
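A minimal sketch of the post-hoc offset correction described above, assuming mean gaze positions per target and a simple affine correction fit by least squares; the grid coordinates, variable names, and model form are illustrative rather than the authors' implementation.

```python
# Minimal sketch: fit an affine correction that maps observed gaze points to
# the known 3x3 grid of target positions (illustrative, not the authors' code).
import numpy as np

targets = np.array([[x, y] for y in (-10, 0, 10) for x in (-10, 0, 10)], float)  # known positions, degrees
gaze = targets + np.random.normal(0, 1.5, targets.shape) + np.array([2.0, -1.0])  # simulated offset gaze means

# Solve gaze @ A + b ~= targets with ordinary least squares (homogeneous coordinates).
X = np.hstack([gaze, np.ones((len(gaze), 1))])       # [x, y, 1] per target
coef, *_ = np.linalg.lstsq(X, targets, rcond=None)   # 3x2 matrix: linear map plus translation

def correct(raw_xy):
    """Apply the fitted correction to downstream gaze samples."""
    return np.hstack([raw_xy, np.ones((len(raw_xy), 1))]) @ coef

print(np.abs(correct(gaze) - targets).mean())  # mean residual error in degrees
```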

Talk 3, 9:30 am

How distortion components shape swim perception in optical wearables

Yannick Sauer1,2, Franziska Kreis1, David-Elias Künstle1, Felix A. Wichmann1, Siegfried Wahl1,2; 1University of Tübingen, 2Carl Zeiss Vision International GmbH

Optical wearable devices should be optimized for natural and comfortable vision, but inherent optical aberrations pose significant challenges. Aberrations often cannot be eliminated, only redistributed or reshaped. For example, geometric distortions, which displace, magnify, or warp the retinal image, are inherent to glasses, particularly progressive addition lenses. During head movements, these distortions cause unnatural or unstable visual perception, often referred to as the swim effect. However, it remains unclear how the shape of distortions should be optimized to minimize this effect. An objective metric relating distortions to perceived swim intensity would facilitate improved optical design. We present a framework for relating the perceived intensity of distorted motion to optical parameters by measuring a psychophysical scaling function. We developed a virtual reality simulator that applies distortions during natural self-motion in a 3D environment. Using ordinal embedding with triplet comparisons, we assessed how different distortion components influence the perceived swim effect. We found that perceived intensity scales linearly with the magnification component, but that radial distortions dominate the perceptual response when present. Additionally, we identified systematic individual differences between participants. Model simulations of distorted optic flow helped identify which visual features are altered by distortions and can explain these individual differences. Combining the psychophysical results with the model simulations allowed us to define a perceptual distortion metric. This metric can support the optical design process in developing more comfortable optical wearables by accounting for how distortions affect dynamic visual perception.
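As a rough illustration of the distortion components discussed above, the sketch below applies a magnification term and a radial term to visual-field coordinates using a standard polynomial distortion model; the parameterization and the numeric values are illustrative, not the authors' simulator.

```python
# Minimal sketch: apply magnification and radial distortion components to
# visual-field coordinates (illustrative parameterization, not the authors' simulator).
import numpy as np

def distort(points_deg, magnification=1.0, radial_k=0.0):
    """Map undistorted gaze-centered points (deg) to distorted positions.

    magnification scales the whole image; radial_k adds pincushion (k > 0)
    or barrel (k < 0) distortion that grows with eccentricity.
    """
    r2 = np.sum(points_deg ** 2, axis=1, keepdims=True)
    return magnification * points_deg * (1.0 + radial_k * r2)

grid = np.stack(np.meshgrid(np.linspace(-20, 20, 5), np.linspace(-20, 20, 5)), -1).reshape(-1, 2)
displacement = np.linalg.norm(distort(grid, 1.02, 1e-4) - grid, axis=1)
print(displacement.max())  # largest point displacement in degrees, one simple ingredient for a swim metric
```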

Talk 4, 9:45 am

Psychophysical and Naturalistic Evaluations of Latency in Head-Mounted Displays

Phillip Guan1, Eric Penner1, Josephine D'Angelo1, Clinton Smith1, Neethan Siva2, Nathan Matsuda1; 1Reality Labs Research, Meta, 2Reality Labs, Meta

One challenge in applying vision science methods is relating how psychophysical thresholds extend to naturalistic and ecologically valid experiences. Here we explore how psychophysical measurements of motion-to-photon (m2p) latency, defined as the time delay between when an event occurs in the world and when that change is shown in a head-mounted display, relate to user performance and subjective evaluations of latency. We accomplish this by comparing psychophysical measures of latency detection in a perspective-correct, zero-latency testbed with subjective evaluations of latency while catching a ball in a nearly perspective-correct video-passthrough headset with two milliseconds of m2p latency. In the psychophysical study (n = 30), we find average detection thresholds of 2.8 ms and 33.4 ms while participants make VOR head and eye movements, for AR and VR scenes respectively. The same 30 individuals wore the 2 ms m2p latency headset and attempted to catch 10 balls with 2, 14, 23, and 29 milliseconds of latency in a blinded, counterbalanced study design. Participants provided subjective ratings of their experience after each condition and ranked the conditions from best to worst. General linear mixed modeling shows that participants subjectively preferred 2 and 14 ms of latency over 23 and 29 ms, and were most likely to successfully catch a ball with 2 ms of latency. We examined the correlation of psychophysical detection thresholds with subjective evaluations and ranking accuracy to determine whether individual differences at the psychophysical level could meaningfully predict differences in the ball-catching task, and did not find a meaningful relationship between them. We conclude by exploring how measuring suprathreshold difference judgments instead of detection thresholds might yield more meaningful correlations between naturalistic and psychophysical tasks, and how a better understanding of m2p latency can lead to insights into human visual perception.
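A minimal sketch of how a detection threshold like those reported above could be estimated, assuming a cumulative-Gaussian psychometric function fit to proportion-detected data; the data values and the 50% threshold criterion are illustrative, not the authors' analysis.

```python
# Minimal sketch: estimate a latency detection threshold by fitting a
# cumulative-Gaussian psychometric function to proportion-detected data
# (data values and the 50% criterion are illustrative).
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

latency_ms = np.array([1, 2, 4, 8, 16, 32, 64], float)            # added m2p latency per condition
p_detected = np.array([0.05, 0.10, 0.20, 0.55, 0.80, 0.95, 1.0])  # proportion "latency detected"

def psychometric(x, mu, sigma):
    return norm.cdf(x, loc=mu, scale=sigma)

(mu, sigma), _ = curve_fit(psychometric, latency_ms, p_detected, p0=(10.0, 5.0))
print(f"detection threshold (50% point): {mu:.1f} ms")
```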

Talk 5, 10:00 am

The visibility of saccadic color break-up artifacts in AR displays

Xiuyun Wu1, T. Scott Murdison1; 1Meta

Field-sequential color displays are favored by augmented reality (AR) manufacturers for their high brightness and resolution in power-efficient and compact designs. However, color-sequential displays suffer from color-breakup (CBU) artifacts during saccades, because the different color fields are projected onto different retinal locations and chromatic signals are not suppressed during saccades (Burr and Morrone, 1994). The detection and discrimination thresholds of saccadic CBU in see-through AR displays have not been comprehensively measured. Here, we measure these thresholds to inform AR architectural decisions with a focus on the in situ response of the human visual system. Fifty-six participants judged CBU visibility using a see-through AR prototype with controllable frame rate (up to 1400 Hz, or 4200 Hz field rate), brightness (up to 600 cd/m2), and duty cycle (minimum pulse width 36 microseconds). Participants performed a two-interval forced-choice task, making saccades between two targets 20 dva apart, with a white text AR stimulus presented halfway between them. Parameters in each trial were selected by AEPsych, a multidimensional adaptive sampling method (Owen et al., 2021). In the detection task, participants compared a no-CBU (simultaneous RGB) stimulus to the corresponding color-sequential stimulus. In the discrimination task, participants compared two color-sequential stimuli. In both tasks, participants indicated the interval with worse perceived CBU. Overall, we found that CBU was most detectable and discriminable under low frame rate, low duty cycle, and high additive contrast. Sensitivity to CBU was least affected by changes in duty cycle, compared with frame rate and additive contrast. Notably, there were considerable individual differences. This dataset provides detection and discrimination thresholds of saccadic CBU across a wide combined range of frame rates, duty cycles, and additive contrasts. These data comprehensively inform architectural requirements for color-sequential displays while providing fundamental insights into perisaccadic perception in AR.
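As a rough worked example of why higher frame rates reduce CBU visibility, the sketch below approximates the retinal separation between successive color fields as saccade velocity times the field period; the saccade velocity value is illustrative.

```python
# Rough worked example: retinal separation between successive color fields
# during a saccade, approximated as saccade velocity x field period
# (the saccade velocity value is illustrative).
saccade_velocity_dps = 400.0            # deg/s, a plausible peak for a ~20 dva saccade
for frame_rate_hz in (120, 480, 1400):  # RGB-sequential: field rate = 3 x frame rate
    field_rate_hz = 3 * frame_rate_hz
    separation_deg = saccade_velocity_dps / field_rate_hz
    print(f"{frame_rate_hz:>5} Hz frames -> {field_rate_hz} Hz fields: "
          f"~{separation_deg:.2f} deg between color fields")
```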

Acknowledgements: This research is funded by Meta

Thank You to Our AVS 2026 Sponsors

Apple
Vision: Science to Applications
Centre for Vision Research