304 North Cardinal St.
Dorchester Center, MA 02124
304 North Cardinal St.
Dorchester Center, MA 02124
The startup Varjo recently announced and did a large number of interviews with the technical press about their Foveated Display (FD) Technology. I’m going to break this article into multiple parts, as currently planned, the first part will discuss the concept and the need for and part 2 will discuss how well I think it will work.
Varjo’s basic concept is relatively simple (see figure at left – click on it to pop it out). Varjo optically combines a OLED microdisplay with small pixels to give high angular resolution over a small area (what they call the “foveated display“), with a larger OLED display to give low angular resolution over a large area (what they call the “context display“). By eye tracking (not done in the current prototype), the foveated display is optically moved to be in the center of the person’s vision by tilting the beam splitter. Varjo says they have thought of and are patenting other ways of optically combining and moving the foveated image other than a beam splitter.
The beam splitter is likely just a partially silvered mirror. It could be 50/50 or some other ratio to match the brightness of the large and microdisplay OLED. This type of combining is very old and well understood. They likely will blend/fade-in the image in the rectangular boarder where the two display images meet.
The figure above is based on a sketch by Urho Konttori, CEO of Varjo in a video interview with Robert Scoble combined with pictures of the prototype in Ubergismo (see below), plus answers to some questions I posed to Varjo. It is roughly drawn to scale based on the available information. The only thing I am not sure about is the “microdisplay lens” which was shown but not described in the Scoble interview. This lens(es) may or may not be necessary based on the distance of the microdisplay from the beam combiner and could be used to help make the microdisplay pixels appear smaller or larger. If the optical path though the beam combiner to large OLED (in the prototype from an Oculus headset) would equal the path from to the microdisplay via reflecting off the combiner, then the microdisplay lens would not be necessary. Based on my scale drawing and looking at the prototype photographs it would be close to not needing the lens.
Varjo is likely using either an eMagin OLED microdisplay with a 9.3 micron pixel pitch or a Sony OLED microdisplay with a 8.7 micron pixel pitch. The Oculus headset OLED has ~55.7 micron pixel pitch. It does not look from the configuration like the microdisplay image will be magnified or shrunk significantly relative to the larger OLED. Making this assumption, the microdisplay image is about 55.7/9 = ~6.2 time smaller linearly or effectively ~38 times the pixels per unit area. This ~38 times the area means effectively 38 times the pixels over the large OLED alone.
The good thing about this configuration is that it is very simple and straightforward and is a classically simple way to combine two image, at least that is the way it looks. But the devil is often in the details, particularly in what the prototype is not doing.
The Varjo “prototype” (picture at left from is from Ubergismo) is more of a concept demonstrator in that it does not demonstrate moving the high resolution image with eye tracking. The current unit is based on a modified Oculus headset (obvious from the picture, see red oval I added to the picture). They are using the two Oculus larger OLED displays the context (wide FOV) image and have added an OLED microdisplay per eye for the foveated display. In this prototype, they have a static beam splitter to combine the two images. In the prototype, the location of the high resolution part of the image is fixed/static and requires that the user look straight ahead to get the foveated effect. While eye tracking is well understood, it is not clear how successfully they can make the high resolution inset image track the eye and whether the a human will notice the boundary (I will save the rest of this discussion for part 2).
Near eye display resolution is improving at a very slow rate and is unlikely to dramatically improve. People quoting “Moore’s Law” applying to display devices are simply either dishonest or don’t understand the problems. Microdisplays (on I.C.s) are already being limited by the physics of diffraction as their pixels (or color sub-pixels) get withing 5 times the wavelengths of visible light. The cost of making microdisplays bigger to support more pixels drives the cost up dramatically and this not rapidly improving; thus high resolution microdisplays are still and will remain very expensive.
Direct view display technologies while they have become very good at making large high resolution display, they can’t be make small enough for lightweight head-mounted displays with high angular resolution. As I discussed the Gap in Pixel Sizes (and for reference, I have included the chart from that article) which I published before I heard of Varjo, microdisplays enable high angular resolution but small FOV while adapted direct view display support low angular resolution with a wide FOV. I was already planning on explaining why Foveated Displays are the only way in the foreseeable future to support high angular resolution with a wide FOV: So from my perspective, Varjo’s announcement was timely.
It is well known that the human eye’s resolution falls off considerably from the high resolution fovea/center vision to the peripheral vision (see the typical graph at right). I should caution, that this is for a still image and that the human visual system is not this simple; in particular it has sensitivity to motion that this graph can’t capture.
It has been well proven by many research groups that if you can track the eye and provide variable resolution the eye cannot tell the difference from a high resolution display (a search for “Foveated” will turn up many references and videos). The primary use today is with Foveated Rendering to greatly reduce the computational requirements of VR environment.
Varjo is trying to exploit the same foveated effect to gives effectively very high resolution from two (per eye) much lower resolution displays. In theory, it could work but will in in practice? In fact, the idea of a “Foveated Display” is not new. Magic Leap discussed it in their patents with a fiber scanning display. Personally, the idea seems to come up a lot in “casual discussions” on the limits of display resolution. The key question becomes: Is Varjo’s approach going to be practical and will it work well?
The main lens (nearest the eye) is designed to bring the large OLED in focus like most of today’s VR headsets. And the first obvious issues is that the lens in a typical VR headset is designed resolve pixels that are more than 6 times smaller. Typical VR headsets lenses are, well . . ., cheap crap with horrible image quality. To some degree, they are deliberately blurring/bad to try and hide the screen door effect of the highly magnified large display. But the Varjo headset would need vastly better, and much more expensive, and likely larger and heavier optics for the foveated display; for example instead of using a simple cheap plastic lens, they may need a multiple element (multiple lenses) and perhaps made of glass.
The next issue is that of the tilting combiner and the way it moves the image. For simple up down movement of the foveated display’s image will follow a simple path up/down path, but if the 45 degree angle mirror tilts side to side the center of the image will follow an elliptical path and rotate making it more difficult to align with the context image.
I would also be very concerned about the focus of the image as the mirror tilts through of the range as the path lengths from the microdisplay to the main optics changes both to the center (which might be fixable by complex movement of the beam splitter) and the corners (which may be much more difficult to solve).
Then there is the general issue of will the user be able to detect the blend point between the foveated and context displays. They have to map the rotated foveated image match the context display which will loose (per Nyquist re-sampling) about 1/2 the resolution of the foveated image. While they will likely try cross-fade between the foveated and context display, I am concerned (to be addressed in more detail in part 2) that the visible/human detectable particularly when things move (the eye is very sensitive to movement).
In my opinion, solving the resolution with wide field of view is a more important/fundamentally necessary problem to solve that VAC at the moment. It is not that VAC is not a real issue, but if you don’t have resolution with wide FOV, then VAC is not really necessary?
At the same time, this points out how far away headsets that “solve all the world’s problems” are from production. If you believe that high resolution with a wide field of view that also address VAC, you may be in for a many decades wait.
So the problem with display resolution/FOV growth is real and in theory a foveated display could address this issue. But has Varjo solved it? At this point, I am not convinced, and I will try and work though some numbers and more detail reasoning in part 2.