Apple Vision Pro (Part 3) – Why It May Be Lousy for Watching Movies On a Plane

Introduction

Part 1 and Part 2 of this series on the Apple Vision Pro (AVP) primarily covered the hardware. Over the next several articles, I plan to discuss the applications Apple (and others) suggest for the AVP. I will try to show the human-factors issues and provide data where possible.

I started working in head-mounted displays in 1998, and we bought a Sony Glasstron to study. Sony’s 1998 Glasstron had an 800×600 (SVGA) display, about the same as most laptop computers that year, and higher resolution than almost everyone’s television in the U.S. (HDTVs first went on sale in 1998). The 1998 Glasstron even had a (sort of) transparent LCD and LCD shutters to support see-through operation.

In the past 25 years, many companies have introduced headsets with increasingly better displays. According to some reports, the installed base of VR headsets will be ~25 million units in 2023. Yet I have never seen anyone on an airplane or a train wear a head-mounted display. I first wrote about this issue in 2012 in an article on the then-new Google Glass with what I called “The Airplane Test.”

I can’t say I was surprised to see Apple showing a watching-movies-on-an-airplane app for the AVP, as I have seen the same concept again and again over the last 25 years. It makes me wonder how well Apple verified the concepts they showed. As Snazzy Labs explained, none of the applications Apple showed were new; all had been tried and failed before, and it is not clear they failed for lack of better hardware.

Since the technology for watching videos on a headset has been available for decades, there must be reasons why almost no one uses a headset to watch movies on a plane (Brad Lynch of SadlyItsBradley says he does). I also realize that some VR fans will watch movies on their headsets, but this, as with VR itself, does not mean it will support mass-market use.

As will be shown, the angular resolution (pixels per degree) of the AVP, while not horrible, is not particularly good for watching movies. But then, resolution is not what has stopped people from using VR on airplanes; it has been other human factors. So the question becomes, “Has the AVP solved the human factors problems that prevent people from using headsets to watch movies on airplanes?”

Some Relevant Human Factors for Movie Watching

In 2019, in FOV Obsession, I discussed an excellent presentation at Photonics West’s AR/VR/MR conference by Thad Starner of the Georgia Institute of Technology, a long-time AR advocate and user.

First, the eye only has high resolution in the fovea, which covers only ~2°. The eye moves in a series of rapid jumps, known as saccades, separated by fixations. What a person “sees” results from the human vision system piecing together a series of “snapshots” taken at each fixation. The saccadic movement is a function of the activity and the person’s attention. Also, vision is partially, but not completely, blanked while the eye is moving (see: We thought our eyes turned off when moving quickly, but that’s wrong, and Intrasaccadic motion streaks jump-start gaze correction).

Starner shows the results from a 2017 Thesis by Haynes, which included a study on FOV and eye discomfort. Haynes’ thesis states (page 8 of 303 pages and 275 megabytes – click here to download it):

Thus, eye physiology provides some basic parameters for potential HWD design. A display can be no more than 55° horizontally from the normal line of sight based on oculomotor mechanical limits. However, the effective oculomotor range places a de facto limit at 45°. Further, COMR and saccadic accuracy suggest visually comfortable display locations may be no more than [plus or minus] 10-20° from the primary position of gaze.

The encyclopedic Optical Architectures for Augmented-, Virtual-, and Mixed-Reality Headsets by Bernard Kress writes about a “fixed foveated region” of about 40-50° (right). But in reality, the eyes can’t comfortably scan a 40-50° region with high resolution for more than a few minutes without becoming tired.

The bottom line is that the human eye will want to stay within about 20° of the center when watching a movie. Generally, if a user wants to see something more than about 30° from the center of their vision, they will turn their head rather than use just their eyes. This is also true when watching a movie or using a large computer monitor for office-type work.

The Optimum Movie Watching FOV is about 30-40 Degrees

It may shock many VR game players who want 120+ degree FOVs, but SMPTE, which sets the recommendations for movie theaters, says the optimal viewing angle for HDTV is only 30°. THX specifies 40 degrees (per Wikipedia and many other sources). These same optimum seating-location angles apply to normal movie theaters as well.

From the front row of a “normal” movie theater, the screen subtends about 60°, and the front rows are usually the last place in a theater where people want to sit. Most people don’t want to sit in the front rows of a theater because of the “head pong” (as Thad Starner calls it) required to watch a movie that is ~60° wide.
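To put the angles above in concrete terms, here is a quick geometry sketch (my own arithmetic; the 10 m screen width is an assumed example, not a number from SMPTE or THX) converting a viewing angle into a seating distance:

    import math

    def distance_for_angle(screen_width_m: float, angle_deg: float) -> float:
        # Distance at which a flat screen of the given width subtends angle_deg.
        return (screen_width_m / 2) / math.tan(math.radians(angle_deg) / 2)

    # Assuming a nominal 10 m wide multiplex screen (illustrative only):
    for angle in (30, 40, 60):
        print(f"{angle} deg -> sit about {distance_for_angle(10.0, angle):.1f} m back")
    # ~18.7 m for 30 deg (SMPTE), ~13.7 m for 40 deg (THX), ~8.7 m for 60 deg (front row)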

While 30°-40° may seem small, it comes back to human factors and a feedback loop in which content is generated to work well with typical theater setups. A person in the theater will naturally see only what is happening in the center ~30° of the screen most of the time, except for some head-turning fast action.

The image content generated outside of ~30° helps give an immersive feel but costs money to create and will not be seen in any detail 99.999% of the time. If you take content generated assuming a nominal 30° to 40° viewing angle and enlarge it to fill 90°, it will cause eye and head discomfort for the user to watch it.

AVP’s Pixels Per Degree Are Below “Retinal Resolution”

Another factor is “angular resolution.” The bands in the chart on the right show how far back you must sit from a TV of a given size and resolution before you can’t see the pixels. The metric the chart uses for extra resolution being “beneficial” is 60 ppd or more. Also shown on the chart with the dotted white lines are the SMPTE 30° and THX 40° recommendations.
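As a sketch of the kind of relationship such charts encode (my own arithmetic, not the chart’s data), the pixels per degree delivered by a 16:9 TV follow directly from its diagonal, horizontal resolution, and viewing distance:

    import math

    def tv_ppd(diagonal_in: float, horizontal_px: int, distance_in: float) -> float:
        width_in = diagonal_in * 16 / math.hypot(16, 9)    # 16:9 panel width
        fov_deg = 2 * math.degrees(math.atan(width_in / (2 * distance_in)))
        return horizontal_px / fov_deg

    # Example: a 65-inch 4K TV viewed from about 4 feet lands right around the
    # 60 ppd "beneficial" threshold (~63 ppd); sit farther back and ppd only rises.
    print(f"{tv_ppd(65, 3840, 48):.0f} ppd")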

Apple has not given the exact resolution but stated 23 million pixels (for both eyes combined). Assuming a square display, this computes to about 3,400 pixels in each direction. The images in the video look to be about a 7:6 aspect ratio, which would work out to roughly ~3,680 by ~3,150. Also, the optics cut off some of the display’s pixels for each eye, yet companies often count all the display’s pixels.
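A minimal sketch of that arithmetic (the 23-million-pixel total is Apple’s figure; the square and 7:6 aspect-ratio splits are my assumptions) lands close to the numbers quoted above:

    total_px = 23e6                      # Apple's stated total for both displays
    per_eye = total_px / 2               # ~11.5 million pixels per eye

    square_side = per_eye ** 0.5         # square assumption: ~3,391 px on a side
    h_7to6 = (per_eye * 6 / 7) ** 0.5    # 7:6 aspect-ratio guess: ~3,140 px tall
    w_7to6 = h_7to6 * 7 / 6              # ~3,663 px wide
    print(round(square_side), round(w_7to6), round(h_7to6))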

Apple didn’t specify the field of view (FOV). One big point of confusion is that VR headset FOVs are typically quoted as the binocular FOV combining both eyes. The FOV also varies with eye relief from person to person (how deeply people’s eyes are set, their foreheads, and other physical features all differ). Reports are that the FOV is “similar” to the Meta Quest Pro, which has a binocular FOV of about 106 degrees; the single-eye FOV is about 90°.

Combining the information from various sources, the net result is about 35 to 42 pixels per degree (ppd). Good human 20/20 vision is said to be ~60 ppd. Steve Jobs called the iPhone 4’s ~300 pixels per inch at reading distance (which works out to ~60 ppd) “retinal resolution.” For the record, people with very good eyesight can see 80 ppd.
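The ppd range quoted above follows from the estimated pixel count and FOV (both are estimates, not Apple specifications):

    horizontal_px = 3680          # estimated horizontal pixels per eye
    print(horizontal_px / 90)     # ~41 ppd over the ~90-degree per-eye FOV
    print(horizontal_px / 106)    # ~35 ppd if credited to the ~106-degree binocular FOV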

Some people wearing the AVP commented that they could make out some screen-door effect, consistent with about 35-40 ppd. The key point is that the AVP is below 60 ppd, so jagged-line effects will be noticeable.

Using the THX 40° horizontal FOV standard and assuming the AVP is about 90° horizontally (per eye; ~106° for both eyes), ~3,680 pixels horizontally, and almost no pixels get cropped, this leaves 3,680 × (40/90) = ~1,635 pixels horizontally. Using the SMPTE 30° gives about 3,680 × (30/90) = ~1,227 pixels wide.
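Spelling that arithmetic out (same assumptions as before: ~3,680 pixels across a ~90° per-eye FOV, with minimal cropping by the optics):

    pixels = 3680
    fov_deg = 90
    for name, angle in (("THX", 40), ("SMPTE", 30)):
        movie_px = pixels * angle / fov_deg
        print(f"{name} {angle} deg -> ~{movie_px:.0f} px across the movie (vs. 1920 for full HD)")
    # ~1,636 px at 40 deg and ~1,227 px at 30 deg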

If the AVP is used for watching movies and the movie content is shown “optimally,” the image will be lower than full HD (1920×1080) resolution, and since there are only ~40 ppd, jaggies will be visible.

While the AVP has “more pixels than a 4K TV,” as claimed, they can’t deliver those pixels to an optimally displayed movie’s 40° or 30° horizontal FOV. Using the full FOV would, in effect, put you visually closer than the front row of a movie theater, not where most people would want to watch a movie.

Still, resolution and jaggies alone are not that bad; they would not, and have not, stopped people from using a VR headset for movies.

Vestibulo–Ocular Reflex (VOR) – Stabilizing the View with Head Movement – Simple Head Tracking Fails

The vestibulo-ocular reflex (VOR) stabilizes a person’s gaze during head movement. The inner ear detects the rotation, and the eyes rotate to counter the movement so the gaze stays fixed on what the person is looking at. In this way, a person can, for example, read a document even if their head is moving. People with a VOR deficiency have problems reading.

Human vision will automatically suppress the VOR when it is counterproductive. For example, VOR is suppressed when one is tracking a moving object with a combination of head and eye movement, where counter-rotating the eyes would work against the tracking. The key point is that the display system must account for the combined head and eye movement when generating the image, or it risks causing a vestibular (motion sickness) problem where the inner ear does not agree with the eyes.
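As a conceptual illustration (my own sketch, not Apple’s rendering pipeline), keeping a world-locked virtual screen stable means re-deriving its position in head coordinates from the latest head pose every frame; if that pose is stale or wrong, the screen moves against the eyes’ VOR counter-rotation, and the mismatch is immediately noticeable:

    import numpy as np

    def yaw_matrix(deg: float) -> np.ndarray:
        # Rotation about the vertical (y) axis; yaw only, for brevity.
        r = np.radians(deg)
        return np.array([[ np.cos(r), 0.0, np.sin(r)],
                         [ 0.0,       1.0, 0.0      ],
                         [-np.sin(r), 0.0, np.cos(r)]])

    def screen_in_head_frame(head_yaw_deg: float,
                             screen_center_world=np.array([0.0, 0.0, -2.0])):
        # Transform a world-locked screen center (2 m straight ahead at yaw 0)
        # into head coordinates. Rendering with the *current* yaw keeps the screen
        # fixed in the world; rendering with a stale yaw makes it swim.
        return yaw_matrix(head_yaw_deg).T @ screen_center_world

    # Head yaws 5 degrees: the screen must be drawn ~5 degrees off-center in the
    # display (while the eyes' VOR counter-rotates to keep gazing at it).
    print(screen_in_head_frame(5.0))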

Quoting from the WWDC 2023 video at ~1:51:18:

Running in parallel is a brand-new chip called R1. This specialized chip was designed specifically for the challenging task of real-time sensor processing. It processes input from 12 cameras, five sensors, and six microphones.

In other head-worn systems, latency between sensors and displays can contribute to motion discomfort. R1 virtually eliminates lag, streaming new images to the displays within 12 milliseconds. That’s eight times faster than the blink of an eye!

Apple did not say if the “12 cameras” include the eye-tracking cameras, as they only showed the cameras on the front, but they likely are included. Complicating matters further is the saccadic movement of the eye: eye tracking can know where the eye is aimed, but not what is seen. The AVP is known to have superior eye tracking for selecting things from a menu. But we don’t know if the eye tracking, coupled with the head tracking, deals with VOR, and if so, whether it is accurate and fast enough to avoid causing VOR-related problems for the user.
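Here is a back-of-envelope sketch of why latency and tracking accuracy matter (the 12 ms figure is Apple’s; the head-rotation rate and the ~40 ppd are my assumptions, and any pose prediction Apple does would shrink the error):

    latency_s = 0.012        # Apple's quoted sensor-to-display latency
    head_rate_dps = 50.0     # assumed casual head turn, in degrees per second
    ppd = 40.0               # estimated AVP pixels per degree (from earlier)

    drift_deg = head_rate_dps * latency_s    # ~0.6 degrees of uncorrected error
    drift_px = drift_deg * ppd               # ~24 pixels of apparent screen motion
    print(f"~{drift_deg:.2f} deg, or ~{drift_px:.0f} px, of 'swim' if the pose is not predicted ahead")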

Movies on the AVP (and VR) – Choose Your Compromises

Now consider the options below for displaying a virtual screen on a headset. Apple has shown the screen locked in 3-D space. For their demos, they appear to have gone with a very large (angularly) virtual screen for the demo impact. But, as outlined below, making a very large virtual screen is not the best thing to do for more normal movie and video watching. No matter which option is chosen, jaggies and “zipper/ripple” antialiasing artifacts will be visible at times due to the angular resolution (ppd) of the AVP.

  1. Simplistic Option: Scale the image to full screen for the maximum size and have the screen move with the headset (not locked in the virtual 3-D space). This option is typically chosen for headsets with smaller FOVs, but it is a poor choice for headsets with large FOVs.
    • It is like sitting in a movie theater’s front row (or worse).
    • The screen moves unnaturally, following every head motion.
  2. Lock the Virtual Screen but nearly fill the FOV: This is what I will call “Head-Lock for Demos Only Mode.” If the virtual screen nearly fills the FOV, then even a small head movement will cause the screen to be cut off, which will, in turn, trigger the person’s peripheral vision and cause some distraction. To avoid the distraction, the user must limit head and eye movement; perhaps doable in a short demo, but not a comfortable way to watch a movie.
  3. Locking the screen in 3-D space at SMPTE 30° to THX 40°: With a ~40° virtual screen, there is room for the head to turn and tilt without cutting off the screen or forcing the user to keep their head rigidly held in one location.
    • This will test the ability of the system to track head motion without causing motion sickness. There will always be some motion-to-photon lag and some measurement errors. There is also the VOR issue discussed earlier and whether it is solvable.
    • Some additional loss in resolution and potential for motion/temporal artifacts as the flat or 3-D movie is resampled into the virtual space.
    • Add motion blur to deal with head and eye movement (unlikely as it would be really complex).
    • The AVP re-shows a 24 fps movie four times at 96 Hz – does each repeated frame get corrected at 96 Hz, and what visual artifacts result from doing so? (See the sketch after this list.)
    • What does it do for 30 fps and 60 fps video?
    • The screen will still be unnaturally cut off if the user’s head turns too far. It does not “degrade gracefully” as a real-world screen does when you turn away from it.
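Here is a conceptual sketch of the frame-cadence question raised in option 3 (my illustration of the general technique, not Apple’s compositor): the movie pixels repeat, but a world-locked screen should be re-projected with the latest head pose at every display refresh.

    MOVIE_FPS = 24
    DISPLAY_HZ = 96
    REPEATS = DISPLAY_HZ // MOVIE_FPS   # 4 display refreshes per movie frame
    # Note: 30 fps does not divide 96 Hz evenly (96/30 = 3.2), so it would need an
    # uneven repeat cadence (e.g., 3-3-3-4) or a different display rate.

    def composite(movie_frames, head_pose_at, reproject):
        # movie_frames: decoded frames; head_pose_at(i): pose sampled at refresh i;
        # reproject(frame, pose): places the frame on the world-locked virtual screen.
        refresh = 0
        for frame in movie_frames:
            for _ in range(REPEATS):
                # The same movie pixels repeat four times, but their placement in
                # the display is recomputed every refresh so the virtual screen
                # stays locked in space.
                yield reproject(frame, head_pose_at(refresh))
                refresh += 1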

Apple showed (above) images that might fill about 70 to 90 degrees of the FOV in its short Avatar demos (case 2 above). This will “work” in a demo as something new and different, but as discussed in #2 above, it is not what you would want to do for a long movie.

And You Are on a Plane and Wearing A Heavy Headset Pressed Against Your Face with a Cord to Snag

On top of all the other issues, the headset’s processing and sensors must address vestibular-related motion sickness problems caused by being in a moving vehicle while displaying an image.

You then have the ergonomic issues of wearing a somewhat heavy, warm headset sealed against your face with no air circulation for hours while on a plane. Then you have the snag hazard of the cord, which will catch on just about everything.

There will be flight attendants or others tapping you to get your attention. Certainly, you don’t want the see-through mode to come on each time somebody walks by you in the aisle.

A more basic practical problem is that a headset takes up more room/volume than a smartphone, tablet, or even a moderately sized laptop, due to its shape and the need to protect the glass front.

Conclusions

It is important to note that humans sense what behaves as “real” versus virtual. The AVP still cuts off much of a person’s peripheral vision. Issues like VOR, vergence-accommodation conflict (VAC, discussed in Part 2), and the way focus behaves are well-known problems with VR, but many more subtle issues can cause humans to sense that something is just not right.

In visual human factors, I like to bring up the 90/90 rule, which states, “it takes 90% of the effort to get 90% of the way there, and then the other 90% of the effort to solve the last 10%.” Sometimes this rule has to be applied recursively, where multiples of the “90%” effort are required. Apple could do a vastly better job of head and eye tracking with faster response times, and yet people would still prefer to watch movies and videos on a direct-view display.

Certainly, nobody will be the wiser in a short flashy demo. The question is whether it will work for most people watching long movies on an airplane. If it does, it will break a 25+ year losing streak for this application.

Karl Guttag

Comments

  1. As long as it’s above 200 grams, and people look at you like you’re a weirdo nerd because it’s huge, it won’t be used on public transport. End of story.

    As an early VR adopter, I gave up on it until the tech becomes more mature. I will buy a new VR device and watch movies with it when 60 ppd is achieved, with RGB OLED, 150 degrees FOV minimum, clear lenses, and a low-friction setup.
    Until then, I’ll let other people play beta tester, get neck aches, get their hair pulled out by the strap, and get their faces all red and sweaty thanks to the pressure.

  2. Hi, thank you as always for a great article. I agree with your points on the PPD issues.
    BTW, what do you think about paper vs. display to read something?
    I feel paper is still better for reading compared to any high-resolution display…
    Any academic paper to prove why?

  3. Hi, AFAIK the AVP has a situation that’s peculiar at best with regard to movie watching. On one hand, the optics dictate a low nit number, so SDR, which also makes PPD less of a problem, or “enough.” On the other, a high FOV dictates the ability to benefit from HDR, and the reasoning for this is the following excerpt:
    “with the relatively small size of TVs, combined with the standard viewing distance – 3m or so – the whole TV screen is within the high-acuity, central angle of view of the human eye (5° to 15°), meaning the human visual system cannot respond independently to different areas of brightness – being stuck within a state of full adaptation, so the viewer is only able to use the static dynamic range of the human eye.

    To actually gain benefit from the concept of HDR the actual viewing angle the display would need to occupy would be in the order of 45°, which with an average large TV of 55″ would means sitting just 65″ from the screen.
    (See also the section on ‘Resolution’.)” from: https://www.lightillusion.com/what_is_hdr.html

  4. As someone who routinely sits near the front of a cinema, I question the 40 degree FOV preference. I like to look around with a wider field of view. When I convince others to join me closer to the screen, they invariably say “this is better.” I’m not saying front row – but perhaps the 3rd or 4th row, depending on the theatre. I would prefer closer, but front-projection cinemas force all theatre viewers to be below the projector’s FOV. I wonder how many studies were funded by movie studios who don’t want us sitting close and in the middle of the screen.

    Either way, IMAX is another fundamental and contrary data point. Everyone in the theatre has more than a 40 degree FOV – by design. The IMAX short-throw projector allows everyone to sit higher than in most cinemas while getting closer to the screen.

  5. It’s implicit in your description and mentions of VOR but I was surprised it was left implicit. To spell it out a bit more:

    In VR, “motion to photon” lags in the millisecond range can lead to motion sickness. For fast response, HMD image stabilization mostly feeds off of Inertial Measurement Units that report at hundreds or thousands of Hz. The headset’s IMU can only measure motion in space; it realistically has no way of telling the difference between the rocking, bumping, and turning of a plane (or car, or bus, or train…) versus the user’s head motions relative to the space they perceive as stationary (or not, at any given moment).

    Locking the virtual movie screen or monitor exactly in front of the user’s head at all times has been possible for thirty years or so; it feels bad and nobody chooses to do it. Any attempt to lock the screen in space in a non-stationary environment will result in the large virtual object failing to move when it “should,” or even moving the wrong way with a lag. This is a recipe for rapid and extreme motion sickness, even in less susceptible users.

  6. Interesting read, thank you!

    I think there is a typo in the penultimate paragraph: “it takes 90% of the effort to get 90% of the way there”

  7. The problem of reference in a moving vehicle could be solved with an external tracker (a controller or dedicated device), like on simulator motion rigs. But Apple didn’t seem to provide anything like that.

    And yeah, the huge virtual movie screen is a gimmick, as the content is not designed for it.

    There are some 180°/360° videos, but I didn’t see many high quality ones, except in physical immersive theaters (usually documentaries). Maybe Apple and Disney will fund this.
    Though I’m not a big movie watcher in general, I liked some productions shown in the VR section of film festivals and made available in “Museum of Other Realities.” Framing and camera work in general are very different from traditional movies; it’s an interesting area.

  8. I am going to bet you whatever you want that even at $3500 before 2024 is over you will have personally seen a passenger in a plane wearing Vision Pro.

    • Sure, lots of people are going to try with the AVP, including myself. The question is whether, regardless of the cost (as cost is temporary and will come down), most people will prefer it over a tablet, laptop, or cell phone.

  9. Look, you are an engineer; you see flaws and errors where others see magic. I go to the movies and see how every visual effect is done, why every corner of a scene is lit in a particular way – because I have spent the last 35 years in the entertainment industry.

    Yet when I take a step back, I just see the story and enjoy it (most of the time).

    I use my monster Apple headphones every day. If the AVP is anything like them in terms of weight or wearability, I see zero issues for most non-critical folk.

    Most people will love the device because of its integration into a large ecosystem, which none other than Apple is capable of doing.
    Will it be the best in class? Will it be the best visual experience? Most likely not, or else there would be no room for future products.

    Yet I bet you the overall experience and usability will beat the competition by miles.

  10. Out of all the public forms of transport, airplanes have the least social stigma attached (for a variety of reasons that I will leave to sociologists). As a result, it’s overused in marketing; every product from XREAL to Apple is supposed to be used on planes. Meanwhile, all I see on planes is people napping or watching movies on the in-flight entertainment systems.

  11. It’s a great, purely theoretical explanation of why it might not work, but I watched 3D movies on long flights in the AVP, and it was the best in-flight entertainment I ever had, and the best movie experience in general.
