Apple Vision Pro (Part 5B) – More on Monitor Replacement is Ridiculous.

Introduction – Now Three Parts 5A-C

I want to address feedback in the comments and on LinkedIn from Part 5A about whether Apple claimed the Apple Vision Pro (AVP) was supposed to be a monitor replacement for office/text applications. Another theory/comment from more than one person is that Apple is hiding the good “spatial computing” concepts so they will have a jump on their competitors. I don’t know whether Apple might be hiding “the good stuff,” but it would seem better for Apple to establish the credibility of the concept. Apple is, after all, a dominant high-tech company and could stomp any competitor.

After studying the MQP’s images in more detail, I found it was too simplistic to use the average pixels per degree (ppd), given by dividing the resolution by the FOV, for the MQP (and likely the AVP).

As per last time, since I don’t have an AVP, I’m using the Meta Quest Pro (MQP) and extrapolating the results to the AVP’s resolution. I will show a “shootout” comparing the text quality of the MQP to existing computer monitors. I will then wrap up with miscellaneous comments and my conclusions.

I have also included some discussion of Gaze-Contingent Ocular Parallax (GCOP), based on work by the Stanford Computational Imaging Lab (SCIL) that a reader of this blog asked about. These videos and papers suggest that some amount of depth perception is conveyed to a person by the movement of each eye, in addition to vergence (binocular disparity) and accommodation (focus distance).

To make room for the above, I’m pushing the set of VR versus Physical Monitor “Shootout” pictures and some overall conclusions out to Part 5C.

Yes, Apple Claimed the AVP is a Monitor Replacement and Good for High-Resolution Text

Apple Vision Pro Concept

In Apple Vision Pro (Part 5A) – Why Monitor Replacement is Ridiculous, I tried to lay a lot of groundwork for why the Apple Vision Pro (AVP), and VR headsets in general, will not be a good replacement for a monitor. I thought it was obvious, but apparently not, based on some feedback I got.

So, to be specific, below are direct quotes from Apple’s WWDC 2023 presentation (YouTube transcript) with timestamps, with my bold emphasis added and my in-line comments about resolution:

1:22:33 Vision Pro is a new kind of computer that augments reality by seamlessly blending the real world with the digital world.

1:31:42 Use the virtual keyboard or Dictation to type. With Vision Pro, you have the room to do it all. Vision Pro also works seamlessly with familiar Bluetooth accessories, like Magic Trackpad and Magic Keyboard, which are great when you’re writing a long email or working on a spreadsheet in Numbers.

“Seamless” makes many lists of the most overused high-tech marketing words. Marketeers seem to love it because it is imprecise, suggests the product works well, and is unfalsifiable (how do you measure “seamless”?). “Seamlessly” was used eight times in the WWDC23 presentation to describe the AVP, and Meta used it twice to describe the Meta Quest Pro (MQP) at Meta Connect 2022. As noted in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, Meta also used “seamless” to describe the MQP’s MR passthrough.

Apple claims the AVP is good for text-intensive work such as “writing a long email or working on a spreadsheet in Numbers.”

1:32:10 Place your Mac screen wherever you want and expand it–giving you an enormous, private, and portable 4K display. Vision Pro is engineered to let you use your Mac seamlessly within your ideal workspace. So you can dial in the White Sands Environment, and use other apps in Vision Pro side by side with your Mac. This powerful Environment and capabilities makes Apple Vision Pro perfect for the office, or for when you’re working remote.

Besides the fact that it is not 4K wide, it is stretching those pixels over about 80 degrees so that there are only about 40 pixels per degree (ppd), much lower than is typical with a TV or movie theater. There are also the issues discussed in Part 5A: if you are going to make the display stationary in 3-D, the virtual monitor must be inscribed in the viewable area of the physical display with some margin for head movement, and the content must be resampled, causing a loss of resolution. Movies are typically in a wide format, whereas the AVP’s FOV is closer to square. As discussed in Apple Vision Pro (Part 3) – Why It May Be Lousy for Watching Movies On a Plane, there is also the issue that the AVP’s horizontal FOV is about 80°, whereas movies are designed to be viewed at about 45 degrees.

Here, Apple claims that its capabilities make “Apple Vision Pro perfect for the office, or for when you’re working remote.”

1:48:06 And of course, technological breakthroughs in displays. Your eyes see the world with incredible resolution and color fidelity. To give your eyes what they need, we had to invent a display system with a huge number of pixels, but in a small form factor. A display where the pixels would disappear, creating a smooth, continuous image.

The AVP’s expected average of 40ppd is well below the angular resolution “where the pixels would disappear.” It is below Apple’s “retinal resolution.” If the AVP has a radial distortion profile similar to the MQP (discussed in the next section), then the center of the image will have about 60ppd, or almost “retinal.” But most of the image will have jaggies that a typical eye can see, particularly when they move/ripple, causing scintillation (discussed in Part 5A).

1:48:56 We designed a custom three-element lens with incredible sharpness and clarity. The result is a display that’s everywhere you look, delivering jaw-dropping experiences that are simply not possible with any other device. It enables video to be rendered at true 4K resolution, with wide color and high dynamic range, all at massive scale. And fine text looks super sharp from any angle. This is critical for browsing the web, reading messages, and writing emails.

WWDC 2023 video at 1:56:08 with Excel shown

As stated above, the video will not be a “true 4K resolution.” Here also is the claim that “fine text looks super sharp from any angle,” which is impossible with text resampled onto a 40ppd display.
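To illustrate why resampling costs sharpness, here is a minimal sketch of a one-pixel-wide text stroke drawn onto a row of display pixels using simple area coverage. The function name `render_stroke` and the choice of area coverage are my assumptions for illustration; this is not how Apple or Meta actually render text.

```python
def render_stroke(position, width=1.0, n_pixels=8):
    """Area-coverage rendering of a vertical stroke onto a 1-D row of pixels.

    position: left edge of the stroke in display-pixel units (may be fractional).
    width:    stroke width in display-pixel units.
    Returns the fraction of each pixel covered by the stroke (0.0 to 1.0).
    Hypothetical helper for illustration only.
    """
    left, right = position, position + width
    row = []
    for i in range(n_pixels):
        # Overlap between the stroke [left, right) and pixel [i, i + 1)
        overlap = min(right, i + 1) - max(left, i)
        row.append(max(0.0, overlap))
    return row


aligned = render_stroke(3.0)  # stroke lands exactly on a pixel
offset = render_stroke(3.5)   # stroke lands halfway between two pixels

print("aligned:", aligned)
print("offset: ", offset)
```

When the stroke is aligned, one pixel is fully on. When it lands half a pixel over, the same stroke is spread across two pixels at half intensity, so it is both wider and lower in contrast. As the head moves, the stroke alternates between these cases.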

1:56:08 Microsoft apps like Excel, Word, and Teams make full use of the expansive canvas and sharp text rendering of Vision Pro.

Here again is the claim that there will be “sharp text” in text-intensive applications like Excel and Word.

I’m not sure how much clearer it can be that Apple was claiming that the AVP would be a reasonable monitor replacement, used even when a laptop display is present. Also, they were very clear that the AVP would be good for heavily text-based applications.

Meta Quest Pro (likely AVP) Pincushion Distortion and Its Effect on Pixels Per Degree (ppd)

While I was aware, as discussed in Meta Quest Pro (Part 1) – Unbelievably Bad AR Passthrough, that the MQP, like almost all VR optics, had significant pincushion distortion, I didn’t quantify the amount of distortion and its effect on the angular resolution, aka ppd. Below is the video capture from the MQP developers app (left) and the resultant image as seen through the optics (middle).

Particularly note above how small the white wall to the left of the left bookcase is relative to its size after the optics; it looks more than 3X wider.

For a good (but old) video explaining how VR headsets map source pixels into the optics (among other concepts), I recommend watching How Barrel Distortion Works on the Oculus Rift. The image on the right shows how equal size rings in the display are mapped into ever-increasing width rings after the optics with a severe pincushion distortion.

Mapping Pixels Per Degree (ppd)

I started with a 405MP camera picture through the MQP optics (right – scaled down 3x linearly), where I could see most of the FOV and zoom in to see individual pixels. I then picked a series of regions in the image to evaluate. Since the pixels in the display device are of uniform size, any change in their apparent size/spacing must be due to the optics.

The RF16f2.8 camera lens has a known optical barrel distortion that was digitally corrected by the camera, so the camera pixels are roughly linear. The camera and lens combination has a horizontal FOV of 98 degrees and 24,576 pixels or ~250.8ppd.

The MQP display processing pre-compensates for the optics plus adds a cylindrical curvature effect to the virtual monitors. These corrections change the shape of objects in the image but not the physical pixels.

The cropped sections below demonstrate the process. For each region, 8 by 8 pixels were marked with a grid. The horizontal and vertical width of the 8 pixels was counted in terms of the camera pixels. The MQP display is rotated by about 20 degrees to clear the nose of the user, so the rectangular grids are rotated. In addition to the optical distortion in size, chromatic aberrations (color separation) and focus worsen with increasing radius.
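The measurement process described above can be sketched roughly as follows. The camera figures (24,576 pixels across 98 degrees) come from this article; the function name `local_ppd` and the endpoint coordinates in the example are hypothetical and are not my actual measurements. The sketch also assumes, as I did above, that the corrected camera image has a roughly constant number of camera pixels per degree.

```python
import math

# Camera figures from the article: 24,576 pixels across a 98-degree horizontal FOV
CAMERA_H_PIXELS = 24_576
CAMERA_H_FOV_DEG = 98.0
CAMERA_PPD = CAMERA_H_PIXELS / CAMERA_H_FOV_DEG  # ~250.8 camera pixels per degree


def local_ppd(p0, p1, n_display_pixels=8, camera_ppd=CAMERA_PPD):
    """Estimate the headset's local pixels per degree at one region.

    p0, p1: (x, y) camera-pixel coordinates of the two ends of a run of
            n_display_pixels headset pixels, measured along one grid axis.
            Using both x and y handles the ~20 degree rotation of the display.
    """
    span_camera_px = math.dist(p0, p1)       # length of the run in camera pixels
    span_deg = span_camera_px / camera_ppd   # same length as an angle
    return n_display_pixels / span_deg


# Hypothetical example: 8 headset pixels along a grid axis tilted about 20 degrees
example = local_ppd((1000, 500), (1064, 523))
print(f"camera: {CAMERA_PPD:.1f} ppd, headset region: {example:.1f} ppd")
```

Repeating this for regions at different distances from the optical center gives the ppd-versus-radius data.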

The image below shows the ppd at a few selected radii. Unlike the Oculus Rift video that showed equal rings, the stepping between these rings below is unequal. The radii are given in terms of angular distance from the optical center.

The plots below show the ppd versus radius for the MQP (left); interestingly, the relationship turns out to be close to linear. The right-hand plot assumes the AVP has a similar distortion profile and FOV but three times the pixels, as reported. It should be noted that ppd is not the only factor affecting resolution; other factors include focus, chromatic aberrations, and contrast, which worsen with increasing radius.

The display on the MQP is 1920×1800 pixels, and the FOV is about 90° per eye diagonally across a roughly circular image, which works out to about 22 to 22.5 ppd. The optical center has about 1/3rd higher ppd with the pincushion distortion optics. For the MQP Horizon Desktop application shown, the center monitor is mostly within the 25° circle, where the ppd is at or above average.
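To connect the MQP average above to the AVP extrapolation, here is a small sketch of the scaling. It assumes, as the right-hand plot does, the same FOV and distortion profile and three times the total pixels, so the linear ppd scales by the square root of three. The function name `extrapolate_ppd` is just for illustration.

```python
import math


def extrapolate_ppd(mqp_ppd, pixel_count_ratio=3.0):
    """Scale a linear ppd figure for a display with more total pixels.

    Assumes the same FOV and distortion profile, so total pixels scale with
    the square of linear ppd and linear ppd scales with sqrt(pixel ratio).
    """
    return mqp_ppd * math.sqrt(pixel_count_ratio)


print(f"linear scale factor: {math.sqrt(3.0):.2f}")
for mqp_avg in (22.0, 22.5):
    print(f"MQP average {mqp_avg} ppd -> extrapolated {extrapolate_ppd(mqp_avg):.1f} ppd")
```

The extrapolated averages land in the high 30s, in line with the roughly 40ppd average used for the AVP elsewhere in this series.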

Gaze-Contingent Ocular Parallax

While a bit orthogonal to the discussion of ppd and resolution, Gaze-Contingent Ocular Parallax (GCOP) is another issue that may cause problems. A reader and VR user, who claims to have noticed GCOP, brought to my attention the Stanford Computational Imaging Lab’s (SCIL) work on GCOP. SCIL has put out multiple videos and articles, including Eye Tracking Revisited by Gordon Wetzstein and Gaze-Contingent Ocular Parallax Rendering for Virtual Reality (associated paper link). I’m a big fan of Wetzstein’s general presentations; per his usual standard, his video explains the concept and related issues well.

The basic concept is that because the center of projection (which determines where the image lands on the retina) and the center of rotation of the eye are different, the human visual system can detect some amount of 3-D depth in each eye. A parallax and occlusion difference occurs when the eye moves (stills from some video sequences below). Since the eyes constantly move (saccades) and fixate, depth can be detected.
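The geometry behind this can be sketched with a simple 2-D model. This is my own simplified illustration, not SCIL’s rendering method: the eye’s center of rotation sits at the origin, the center of projection sits a small distance in front of it along the gaze direction, and two objects start out lined up straight ahead at different depths. The 7.5 mm default offset and the function names are assumptions for illustration only.

```python
import math


def direction_from_projection_center(depth_m, eye_rotation_deg, offset_m=0.0075):
    """Angle (degrees) to a point straight ahead at depth_m, as seen from the
    eye's center of projection after the eye rotates by eye_rotation_deg.

    offset_m is the assumed distance from the center of rotation to the
    center of projection (illustrative value, not from the article).
    """
    theta = math.radians(eye_rotation_deg)
    # The center of projection moves on a small circle around the rotation center
    cx = offset_m * math.sin(theta)
    cz = offset_m * math.cos(theta)
    return math.degrees(math.atan2(0.0 - cx, depth_m - cz))


def ocular_parallax_deg(near_m, far_m, eye_rotation_deg, offset_m=0.0075):
    """Angular separation that appears between two initially aligned points."""
    near_dir = direction_from_projection_center(near_m, eye_rotation_deg, offset_m)
    far_dir = direction_from_projection_center(far_m, eye_rotation_deg, offset_m)
    return abs(near_dir - far_dir)


for rotation in (0, 5, 15):
    shift = ocular_parallax_deg(near_m=0.25, far_m=2.0, eye_rotation_deg=rotation)
    print(f"eye rotation {rotation:2d} deg -> relative shift {shift:.3f} deg")
```

In this model, the relative shift is zero when the eye does not rotate or when both objects are at the same depth, and it grows with both the rotation and the depth difference.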

GCOP may not be as big a factor as vergence and accommodation. I put it in the category of one of the many things that can cause people to perceive that they are not looking at the real world and may cause problems.

Conclusion

The marketing spin (I think I have heard this before) on VR optics is that they have “fixed foveated optics” in that there is a higher resolution in the center of the display. There is some truth that severe pincushion optical distortion improves the pixel density in the center, but it makes a mess of the rest of the display.

While the MQP’s optics have a bigger sweet spot, and the optical quality falls off less rapidly than with the Quest 2’s Fresnel optics, they are still very poor by camera standards (on the right is the optical diagram for the 9-element RF16f2.8 lens, a very simple camera lens, which was used to take the main picture). VR optics must compromise due to space, cost, and, perhaps most importantly, supporting a very wide FOV.

With a monitor, there is only air between the eye and the display device with no loss of image quality, and there is no need to resample the monitor’s image when the user’s head moves, as there is with a VR virtual monitor.

As the MQP, other pancake optics, and most, if not all, other VR optics have major pincushion distortion, I fully expect the AVP will also. Regardless of the ppd, however, the MQP virtual monitor’s far left and right sides become difficult to read due to other optical problems. The image quality can be no better than its weakest link. If the AVP has 3X the pixels and roughly 1.75x the linear ppd, the optics must be much better than the MQP’s to deliver the same small readable text that a physical monitor can deliver.

Karl Guttag

8 Comments

  1. After the Vision Pro keynote, in a developer talk at WWDC, Apple mentioned that they rewrote the entire render stack, including the way text is rendered.

    Please do not extrapolate from the text rendering of the MQP, as Meta has the tech to do foveated rendering, but decided to not ship it because it reduced FPS.

    • Thanks,
      I am fond of saying, “nobody will volunteer information, but everyone will correct you.” It is interesting that they are reworking the whole render stack for text. That is the best way to do it in terms of text quality (what I called “Case 1” in https://kguttag.com/2023/08/05/apple-vision-pro-part-5a-why-monitor-replacement-is-ridiculous/). What Meta does in Horizon Desktop is terrible (both in terms of quality and lag). Still, in the end, Apple has to render the illusion of a stationary virtual monitor onto the pixel grid of the headset display, and for typical-sized readable text, that means dots and strokes are going to land between physical pixels and some form of resampling/antialiasing will have to be done. They will have to make the classic trade-offs between softness and wriggling as they try to fit features that end up roughly the size of pixels but don’t land on pixel boundaries. Unlike with a “retinal display,” the resolution of the display in the AVP is too low to simply ignore the problem.

      I understand that what the MQP is doing is terrible and tried to convey that. The reason for studying how the MQP is doing things is to be prepared to see how the AVP is different and whether it solves the identified problems. I believe in the “Socratic method,” where you keep probing and asking questions. Then people like you, my readers, and my followers on LinkedIn fill in information. By giving information and asking questions on my blog, we all get better feedback and information.

      I know that the MQP is not doing foveated rendering. John Carmack said what you wrote just before he left Meta, in the same talk I cited in https://kguttag.com/2023/06/13/apple-vision-pro-part-1-what-apple-got-right-compared-to-the-meta-quest-pro/. But what I was talking about is that the optics have a “foveation effect” of making the center higher resolution.

  2. Great article. I must admit I don’t use any VR for heavy text reading, but even the Pico 4 I have has become my go-to for light TV and YouTube watching because it’s really comfortable to have a screen that is easily moved around rather than needing to lock my gaze on a given spot. I can read comparatively large YouTube comments in the headset easily. While this is no doubt not monitor resolution or clarity, it has other compensations, and I imagine the much better Micro-OLED displays in the AVP will also (not that I am even slightly tempted to buy Apple anything). I am curious what you think of the teased Immersed Visor (and Immersed generally), as they are very much gunning at monitor replacement entirely with 4K per eye Micro-OLED. Is this just a scam to encourage investment?

  3. I applaud the depth of your analyses but you and your readers should consider that you are dealing with many unknown technical aspects and capabilities of the AVP. Many of your calculations are based on incomplete information and details where you place your best guess or what other HMDs currently provide. That is fine as long as you adequately convey that uncertainty to your reader throughout the piece, which I find that you do not in your writings. This is particularly problematic when you don’t have converging evidence such as personal interactions and experiences with the AVP hardware. Your analyses and some of the assumptions made to do your analyses are also brought into question when the first hand reviews of text and video legibility are uniformly positive from technology reporters that have received a demonstration of the AVP.

    • Sorry, but most “first hand tech reviewers” don’t know what they are talking about with respect to near eye displays. They also have a lot of ego and/or their access to Apple tied up in “Apple is never wrong.”

      As I often say, “Demos are a magic show” where they show you what they want you to see. If their text resolution is crappy, then they will make it 1.5x or 2x bigger. You can make a small amount of text bigger so it LOOKS better, but then big text slows down your reading and lowers the content density. Then we have all the other factors, like scaling when they want the monitor to appear stationary.

      I can’t tell you how many times someone has told me that something looks great only to find out that they were fooled by a slick demo. Get me an analysis from someone who objectively knows what they are talking about, and then maybe I will change my mind. I heard all this about Magic Leap, Meta Quest, Hololens, and on and on. Objectively, the numbers for the AVP don’t add up either.

  4. “The image quality can be no better than its weakest link.”

    I so much wish the XR world would pay attention to this quote, which will ultimately represent a pivot in VR display approaches.

    It is amazing how so many people believe or spread the concept that emissive displays with infinite contrast are the solution when the fact is that pancake lenses, despite strengths over previous magnifiers, still blur images and reduce contrast. Blurry, low-contrast images are a very difficult thing to accept by people looking at desktop monitors of even average performance (let alone gamers). The only known solution to increase contrast and clarity is to eliminate the light responsible for it – but that is likely a paradigm shift from the brute-force directions most headset manufacturers are attempting.
