I have been taking pictures and making short video clips to demonstrate different aspects of the displays and optics of the Meta Quest Pro (MQP) for over a month. The process was a bit like peeling an onion: detecting one issue caused me to run experiments and take more pictures and videos to see what was happening.
At the risk of spoiling the punchline, the Meta Quest Pro is unbelievably bad for this day and age in terms of the human visual factors for the business and work applications that Meta claims, both verbally and in writing, that the MQP serves. In this series, I will try to quantify the numerous faults with pictures and measurements, applying my 40+ years of experience in computer graphics, displays, and human visual interfaces.
I wonder who they think they are fooling when they call a product designed for video games a “business” or “enterprise” product. We saw this with Magic Leap, which one day declared the Magic Leap 1 an enterprise product (see: Magic Leap Ill Suited to Pivot to Enterprise – More Like Spinning a Narrative than Pivoting) and later did the same with the Magic Leap 2 (see: Magic Leap 2 for Enterprise, Really? Plus Another $500M). It is like somebody in marketing said, “At that price, we can’t sell it as a game system, so it must be a business product.”
A few VR enthusiast YouTubers who marvel at having multiple “big screen” monitors aside, most reviewers have pointed out the ridiculousness of using the MQP for business applications. Future articles in this series will show how poorly these very low-resolution monitors, with their big chunky pixels, vergence accommodation conflict, and unacceptable flicker, behave for business applications.
I’m off to Las Vegas for CES 2023 tomorrow and booked almost solid for the first three days. It is not quite back to pre-pandemic levels, but much busier than I expected. So it will likely be a week or so before the next article in this series, and I will be blending in articles covering what I see at CES.
The MQP is bad in so many ways for business applications it was hard to know where to start. As this blog has focused on Augmented Reality, this first in a series of articles covers AR/MR passthrough issues, and in particular, the claim below from Meta Connect 2022: Meta Quest Pro, More Social VR and a Look Into the Future:
I’m not sure what Meta considers “seamless,” but I don’t think MQP’s passthrough AR comes close to the dictionary definition of “having no awkward transitions, interruptions, or indications of disparity.”
Many reviews have already subjectively commented on the poor quality of the Meta Quest Pro’s AR passthrough, and this article will try to quantify some of the issues.
I should also add that the human body and visual system are more adept at looking down at a phone than at looking up at low-resolution, badly behaving VR screens.
All photographs and videos were shot through the Quest Pro’s optics with a Canon R5 8192 x 5464-pixel camera and an ~16mm lens, giving a FOV of roughly 106° horizontal by ~81° vertical. This nets to about 80 camera pixels per degree, or ~4 camera pixels per Meta Quest pixel (and thus above Nyquist). The resolving power of the Canon R5 with the 16mm lens is close to excellent human vision. In most cases, full-resolution but cropped images are used for discussion, with full-resolution whole images also provided. You will need to click on even the cropped images to see them at full resolution.
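For those who want to check the sampling math, here is a quick sketch of the numbers above. The MQP’s ~20 pixels per degree is my rough estimate from its per-eye panel resolution and optics, not a published spec:

```python
# Sanity-check the capture setup: camera pixels per degree vs. the MQP.
CAMERA_H_PIXELS = 8192        # Canon R5 horizontal resolution
CAMERA_H_FOV_DEG = 106        # approximate horizontal FOV with the ~16mm lens

camera_ppd = CAMERA_H_PIXELS / CAMERA_H_FOV_DEG   # ~77 camera pixels/degree

# Hypothetical MQP figure: roughly 20 display pixels per degree.
MQP_PPD_ESTIMATE = 20

oversampling = camera_ppd / MQP_PPD_ESTIMATE      # ~4 camera px per MQP px
print(f"camera: ~{camera_ppd:.0f} pixels/degree, "
      f"~{oversampling:.1f} camera pixels per MQP pixel")
assert oversampling >= 2  # above Nyquist, so the camera won't alias the display
```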
When capturing “through-the-lens,” I always captured the left eye as it was easier to fit the camera. The left eye’s video capture was then used in any comparisons.
You will see the (distracting) diagonal screen door effect in all pictures taken through the MQP, which is also visible to the naked eye. As will be discussed later, the displays in the MQP are rotated, and thus the screen door effect is rotated.
While the camera can produce 8K video, the videos were shot in 4K (3840 x 2160), downscaled by the camera from the higher-resolution sensor. I found negligible differences between the 8K and 4K videos, so I shot in 4K at both 30fps and 60fps, depending on the situation. The MP4 video files are included directly (rather than re-compressed for YouTube). The video clips are generally short and can be played by clicking on them. I recommend enabling looping in your player (with Google Chrome, right-clicking on the video enables looping).
You will see flicker and rolling brightness bands in the videos. The much-hyped arrayed dimming is not used in the current applications. Instead, the current applications released by Meta organize the backlight LEDs in rows to provide rolling illumination with a very short on-duty-cycle. This causes banding in still pictures and rolling in videos (part of the rolling is also caused by the camera’s rolling shutter). For still pictures, I shot at slow shutter speeds (typically 1/30th of a second or slower) to average out the banding effect.
Still, the MQP has a LOT of flicker. For some strange reason, the Meta Workplace application drops the refresh rate to about 70Hz, which is far too slow for a wide-FOV device (peripheral vision is sensitive to flicker); combined with the low on-duty-cycle, there is a lot of low-frequency flicker, which has been well known (since at least the 1980s) to be bad for humans. Most apps use 90Hz, which is still a bit slow for a wide FOV but better. In some rooms, depending on the lighting, I notice a beat frequency between the room lighting and the headset. I plan a whole article on the MQP’s flicker issue.
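As a rough illustration of the beat-frequency issue, assuming North American lighting that flickers at twice the 60Hz mains (actual lamp flicker depends on the fixture and driver):

```python
# Beat between mains-driven room lighting and the headset refresh rate.
# Assumes lamps flickering at 2x the 60 Hz mains (North America).
MAINS_FLICKER_HZ = 120

for refresh_hz in (70, 90):
    beat_hz = abs(MAINS_FLICKER_HZ - refresh_hz)
    print(f"{refresh_hz} Hz refresh -> {beat_hz} Hz beat with room lighting")
```

Both differences land well inside the low-frequency range where flicker is most visible.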
Below is a still capture of the camera feed to both displays in the MQP from the Meta Quest Developer Hub (MQDH) application. This image includes a mix of VR content and AR passthrough video. The images are rotated because the physical displays in the MQP are counter-rotated by ~21 degrees. According to Brad of SadlyItsBradley, the rotation is likely done for two reasons: A) to give a bigger vertical FOV for the same size display when inscribing a squarish (cut-corner) display into a somewhat circular FOV, and B) to clear the triangular nose area, as the displays sit closer to the user’s face with the pancake optics.
In addition to the rotation, the image is barrel distorted (bows outward on all four sides). The barrel distortion is used to pre-correct the significant pincushion distortion by the MQP’s pancake optics (many VR optics have similar distortion corrections). This does mean that pixels in the periphery are optically stretched and are thus of lower resolution.
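To illustrate why the pre-correction costs peripheral resolution, here is a toy first-order radial distortion model. The MQP’s actual correction is unknown and certainly higher order, so the coefficient below is purely hypothetical:

```python
# Toy first-order radial distortion model: r' = r * (1 + k * r^2),
# with r normalized so r = 1 at the edge of the field of view.
# k > 0 stretches the edges outward (pincushion); k < 0 squeezes them (barrel).
# The real MQP correction is unknown and surely higher order.

def radial_distort(r: float, k: float) -> float:
    return r * (1 + k * r * r)

K_PINCUSHION = 0.2  # purely hypothetical optics coefficient

# Pre-correcting with -k approximately cancels the optics near the center...
center = radial_distort(radial_distort(0.2, -K_PINCUSHION), K_PINCUSHION)
print(f"center round-trip error: {abs(center - 0.2):.4f}")  # tiny

# ...but the barrel pre-correction squeezes an edge ray from 1.0 to 0.8,
# packing that content into ~20% fewer display pixels before the optics
# stretch it back out, which is where peripheral resolution is lost.
edge_pre = radial_distort(1.0, -K_PINCUSHION)
print(f"edge ray pre-corrected from 1.0 to {edge_pre:.2f}")
```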
Since the cameras used for AR passthrough are not rotated, each video frame requires digital rotation. My first thought was that the rotation itself would hurt resolution. But as it is just one of many transformations on top of the lens correction, and provided all the transformations are done in a single pass (I don’t know, and it may vary by application), it may not significantly further degrade the image.
Below are the MQDH video capture for the left eye, a through-the-lens picture via the optics, and the camera’s view without the headset. The camera and the headset were on separate tripods so pictures could be taken from the same spot with and without the headset. Looking at the video capture image, it is apparent that significant pre-correction for the optics’ distortion is being done. The MQDH video capture image has been rotated and scaled to roughly match the other two images, but no correction for the barrel distortion has been applied. As seen by the white wall to the left of the bookcase in the capture and through-the-lens images, the outer pixels are stretched by more than 2x by the optics’ distortion (and squeezed down correspondingly by the pre-compensation).
Below are the full-resolution cropped images from the center of the images in the same order as before.
Below are the corresponding full-size images (on the order of 45 Megapixels each – click on each one to open the full-size image). The view through the lens has had the black on the sides cropped. The direct camera view has had black bars added where the vertical FOV was less than the other two images.
Shown on the right is a classic Snellen eye chart taken via the MQP optics (left) and a direct camera picture (right). The chart’s size was scaled to match the shorter distance from the headset used for a standard vision test.
Typically in the USA, not being able to read the “E” on the top line qualifies as “legally blind.” The MQP appears to fall between 20/200 (“legally blind”) and 10/200 (“visually impaired”). Color is another issue, ranging from desaturated/lost to overly saturated (see the wood behind the Snellen chart above) with color-shifted tones. The orange-toned, oversaturated wood behind the chart as captured by the camera is close to how it looks to the eye.
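As a cross-check on the Snellen result, visual acuity can be roughly converted to display pixels per degree. The 60-pixels-per-degree figure for 20/20 vision is the usual rule of thumb (1 arc-minute letter strokes, ~2 samples per stroke), not a measurement of the MQP:

```python
# Rough conversion from display pixels/degree to a Snellen "20/x" fraction.
# Rule of thumb: a 20/20 letter has 1 arc-minute strokes, and resolving a
# stroke takes about 2 samples, i.e. ~60 pixels per degree for 20/20 vision.
PPD_FOR_20_20 = 60

def snellen_denominator(ppd: float) -> float:
    """Approximate Snellen denominator ('20/x') for a given pixels/degree."""
    return 20 * PPD_FOR_20_20 / ppd

for ppd in (60, 20, 6):
    print(f"{ppd:2d} ppd -> ~20/{snellen_denominator(ppd):.0f}")
# ~6 ppd lands right at 20/200, matching the eye-chart result above.
```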
For another test, I used a large 1962 Disneyland map combining color with various levels of printed detail. Below are a picture taken through the MQP optics (left) and a direct camera picture (center) from about the same angle, scaled down slightly to match the MQP image (resulting in ~70 pixels/degree). The direct camera image was also scaled down to roughly match the MQP’s level of detail and then scaled back up to the same size (right) as the other two images. Also note that the unscaled image taken directly with the camera and the image through the MQP are very close to the same size.
You can click on each image above to see the full-camera resolution images. To make things easier, I have included the center crops of each image below. I added a through-the-camera center crop from Quest 2 on the far right.
This test gives about the same result as the Snellen Eye Chart. When wearing the Meta Quest Pro, the best-case center vision is about 1/10th that of good human vision.
Working on the resolution from the “bottom up,” the MQP has a 1280 by 1024-pixel B&W tracking camera per eye, whose output is resampled in a process that crops, rotates, and barrel-distorts to generate the displayed image. The resampling process should be expected to result in roughly 500 x 500-pixel images, or about 6 pixels per degree over ~85 degrees per eye. (The MQP’s quoted 106-degree horizontal FOV “spec” is for both eyes combined.)
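The bottom-up arithmetic can be made explicit; the ~500-pixel effective width is my estimate of what survives the crop, rotation, and distortion correction, not a measured value:

```python
# Bottom-up estimate of the MQP's passthrough resolution per eye.
TRACKING_CAM = (1280, 1024)   # B&W tracking/SLAM camera, per eye
EFFECTIVE_PIXELS = 500        # my estimate after crop, rotate, and distort
PER_EYE_FOV_DEG = 85          # per-eye FOV (the 106-degree spec is binocular)

ppd = EFFECTIVE_PIXELS / PER_EYE_FOV_DEG
print(f"~{ppd:.1f} passthrough pixels/degree")   # about 6 ppd
```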
Below is a 20-second 4K video shot through the left-eye optics using the Canon R5 with the 16mm lens, which captures almost the entire horizontal FOV and about 75% of the vertical FOV of the MQP. (You will need to right-click and open it in a new window (or download it) to see it at full resolution, and also right-click on the video to select looping.) The digital counter on the PC updates about 10 times a second to give rough synchronization between the through-the-lens and video captures (shown later), while the pendulum with Mickey waving the small eye chart gives continuous analog timing.
The dynamic range (dark to light) of the MQP passthrough can best be described as pitiful. Using tracking cameras sensitive to IR means anything bright or light-emitting blows out to white. Anything not well-lit comes out very, very noisy. The video below illustrates the problems with dynamic range, plus the extreme distortion and color problems of the AR passthrough. While much worse with close objects, strange distortions can also occur with things farther away. The MQP takes bad AR passthrough to new lows.
The next short video clip shows looking at a paper document (the front page of the original “sprite” patent) and other problems, such as blown-out displays and lights. The image on the computer is available under KGOnTech’s test patterns.
The video below shows a timer on the PC and a moving Mickey on a pendulum waving a miniature Snellen chart. The PC’s timer updates about 6.6 times a second, and the waving Mickey provides continuous analog movement. Combining the big clock for the approximate timestamp and the waving Mickey let me synchronize the through-the-lens camera and Meta’s video capture to the same frame. The waving Mickey also gives a moving subject. The PC had to be set to its near-minimum brightness to keep it from blowing out the MQP’s cameras so I could read the timer.
There are a lot of adjectives I could use to describe the passthrough image quality of the MQP, but “seamless,” as Meta describes it, is not one of them. The first thing to note is the distortion, particularly around the Mickey pendulum and the waving eye chart. It can also be seen that the color is very approximate and lags behind the motion.
While the “video capture” colors look reasonably accurate, the colors captured through the lens are oversaturated (where there is color). This oversaturated look in the passthrough mode is what the eye sees as well. When using VR, the color looks reasonably accurate (if misplaced), so it is somewhat perplexing how poor the colors look in passthrough mode (and it may be fixable).
The zoomed-in still frames below compare the through-the-lens to the direct video capture. The video capture was rotated and scaled for this comparison (in later captures, it will not be rotated). Note how the missing gaps of color match and the amount of distortion in the images.
While the frame rate of the display was about 90Hz and the capture rate of the Meta Developer’s app was also about 90Hz, the rates differ slightly, resulting in a tear/roll somewhere in about every third captured frame (see right).
Beyond this, the tear/frame-roll frames seen in the video capture are not seen in the videos taken through the optics. The frame captured above avoids these torn frames; the cropped-in frame (right) shows one of the tears/rolls. This suggests the video capture is not synchronous with the video source going to the displays. The capture also runs at a higher frame rate than the underlying content changes, as there are repeated frames in the video capture.
Initially, I had assumed the MQDH capture was what was sent to the display, but this is not the case. The captured video was 90Hz (bottom row below), but the captured frames change at roughly 30Hz, with two repeated frames and with one of the frames having a roll bar. But in the 60Hz video captured through the optics (top row below), each frame is unique (and thus different from the capture), though the whole subject only moves at 30Hz. As the camera ran at ~60Hz and the MQDH capture at ~90Hz, there are about two camera frames per three capture frames. The image below shows crops with red lines at the 30Hz divisions.
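The two-camera-frames-per-three-capture-frames relationship falls out of the common base rate of the two frame rates:

```python
from math import gcd

CAMERA_FPS = 60      # through-the-lens video
CAPTURE_FPS = 90     # MQDH video capture

base = gcd(CAMERA_FPS, CAPTURE_FPS)   # common cycle of the two rates
print(f"common cycle: {base} Hz")     # 30 Hz
print(f"{CAMERA_FPS // base} camera frames per "
      f"{CAPTURE_FPS // base} capture frames")
```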
If you look carefully at the through-the-lens frame sequence, you should notice that in each pair of frames separated by the red lines, the subject/Mickey does NOT move. Rather, the color moves, and the moving content distorts differently. The first frame in each pair exactly matches one of the frames in the MQDH capture, but the second frame is unique. I don’t have a 90Hz camera readily available, but I suspect the image changes at 90Hz when the display runs at 90Hz, and the 60Hz camera is catching two out of every three in-between frames.
I have seen reviewers comment that the video feed they capture that combines both eyes looks better than what they see with their own eyes. Unfortunately, due to the buggy software and unhelpful error messages in the MQDH application, I could not capture higher-quality video via MQDH (only the lower resolution 30fps headset recording and casting).
It appears that what is captured is different and better than what is shown. I suspect that the processing for what is sent to the eye displays is prioritized for motion-to-photon latency over absolute quality. They want to get new information to the eyes as fast as possible, whereas video capture may prioritize image quality over latency.
I also ran a small experiment where I set off a camera flash in a dimly lit room and measured the delay with light sensors, using one sensor to trigger an oscilloscope’s capture and a second sensor to capture the display response. It typically took a little over four 90Hz frame times (~40 to 50ms) for the flash to show up in the image. In the scope capture below, note that the response (in yellow) is three somewhat stair-stepped, low-on-duty-cycle pulses. The MQP’s illumination LEDs only illuminate for a very short period, and various rows of LEDs fire in a rolling sequence, hence the stair-step pattern. In future articles, I plan to go into more detail on the MQP’s LED array illumination. Consistent with the video captures, the flash is displayed for three frames at 90Hz, or effectively 30Hz.
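The measured delay is consistent with a roughly four-frame pipeline at 90Hz:

```python
# Convert the flash-to-photon measurement into frame times at 90 Hz.
REFRESH_HZ = 90
frame_ms = 1000 / REFRESH_HZ          # ~11.1 ms per frame

for frames in (4, 4.5):
    print(f"{frames} frame times = {frames * frame_ms:.1f} ms")
# 4 to 4.5 frame times spans the measured ~44 to ~50 ms delay.
```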
The Method in the Madness of Mixing IR Tracking with a Color Camera
My first reaction was that the MQP’s passthrough was originally meant to give game players a better way than the Meta Quest 2 to see their surroundings when defining boundaries. But then the powers that be at Meta said they needed it to be a “Mixed Reality” device. At the same time, the implementation suggests it was “an experiment that escaped the lab.” Perhaps the answer is that the MQP’s passthrough is a bit of both, with the researchers wanting to show and experiment on a larger scale with technology and the company’s desire to market the product as more Mixed Reality than just VR.
According to SadlyInReality (SadlyItsBradley’s written articles), the Quest Pro’s AR passthrough uses two low-resolution IR tracking/SLAM cameras (nominally 1280×1024 resolution) to provide the brightness information for each eye’s display, combined with a single 16-megapixel color camera to colorize each image. 1280×1024 hasn’t been considered “high-resolution,” as Meta claims, for about 30 years. The tracking cameras put an upper bound on the resolution; as will be seen, the actual resolution is much worse.
The engineers at Meta had to know that combining a single color camera with two tracking-optimized cameras would hurt the image quality severely. As I often say, “When smart people do something that appears dumb, it is because they were trying to avoid something they felt was worse.” In this case, they were willing to sacrifice image quality to try to make the position of things in the real world agree with where virtual objects appear. To some degree, they have accomplished this goal. But the image quality and level of distortion, particularly of “close things,” which includes the user’s hands, is so bad that it seems like a pyrrhic victory.
SadlyItsBradley reports that the Meta Quest 3 will have two high-resolution color cameras, one per eye, rather than sharing a single color camera like the MQP. This should give better image quality and reduce some problems. However, there will likely still be distortion effects as they try to map real-world camera images based on tracking and depth-sensing camera information in real time. Also, according to Brad, the AR passthrough process is very processor-intensive, and the power draw goes up dramatically. It makes one wonder how much processing it will take to get decent image quality.
In June 2022, I made a 28-minute video with Brad of the SadlyItsBradley YouTube channel discussing the general pros and (mostly) cons of passthrough AR (VR with cameras). The slide below outlines many of the points explained in the video. Rather than rehashing these points, I will refer you to that video.
A major point missing above, but discussed in the video, is that Passthrough AR can often be unsafe due to the various ways it can obscure real-world hazards, from bumping into things to being hit by equipment and vehicles. Brad told me that in the VR community, it is known as “VR to the ER.”
The passthrough mode of the MQP only mitigates some of the VR visual safety issues, and the following are some of the remaining problems:
Meta’s lawyers know about the “VR to the ER” issues, as the device tells you to be indoors in a single room free of obstacles and has you draw an electronic boundary. But realistically, if you draw a boundary small enough to protect you, you will be constantly annoyed by visual boundary warnings. Using the device without constant annoyance requires either completely emptying your room and padding everything or taking physical risks. And still, you will be caught out when you quickly reach out for something in the virtual world and hit something in the real world. I can see the corporate safety people now drafting up the requirements 😁.
And these are just some of the obvious safety issues. There are also human-factor safety issues that will likely cause problems with prolonged “exposure” to the product, from vergence accommodation conflict (which the Magic Leap 1 tried to solve and the Magic Leap 2 dropped) to flicker (which nobody seems to care about; more in a future article).
The passthrough mode of the MQP, while a big improvement over the Meta Quest 2, is not even close to usable as an AR/MR device. I have my issues with Lynx’s AR passthrough, but it is nowhere near the train wreck of the MQP. The MQP’s passthrough mode is barely OK for finding large objects in the real world and setting up boundaries for VR applications.
It is so bad at AR passthrough that it makes me wonder why Meta claimed it to be a mixed-reality device. Some have speculated that it was to claim an MR device ahead of an expected Apple MR headset.
Even if Meta greatly improves the image quality of the passthrough modes, it’s doubtful that it will come anywhere close to being safe to use without the typical restrictions of VR (indoors in a safe room with few/no obstacles). Meta has spent more than $30 Billion in this field, and the MQP is what they can do for a $1,500 “pro” device.
There are just so many things the Meta Quest Pro gets wrong from the point of view of human visual factors. They even get the simple stuff, like flicker, wrong, not to mention harder-to-solve problems like vergence accommodation conflict (VAC).
If anything, the MQP reinforces the many cons of passthrough AR that I made in the video with SadlyItsBradley.