I’m long overdue to report on some things I saw at AWE in late May 2018. Having written two articles (article one and article two) back in 2017, I was curious to see the Varjo demonstration. In recent news, Varjo announced that they raised $31 million in a Series B round. Varjo was nice enough to invite me to their suite to see some demos and let me take whatever pictures I wanted.
I wanted to see A) how it looked with my own eyes, and B) if they had solved the eye tracking and foveated display image movement I was skeptical about back in 2017 (see my 2017 article two). The answer to A) is that when you look straight ahead it does look very good, but you can see the boundaries if you look around. The answer to B) was “not so much.”
While it was disappointing for me not to see any demonstration of eye tracking and display movement from Varjo, I did enjoy their pass-through (camera) AR demo that was stunning in some ways. Varjo said they had just gotten the AR demo up and running right before the show. While it was a bit crude, it was still effective in demonstrating the concept.
Foveated display technology is based on the fact that human vision has high resolution only at the center of the retina (the fovea). The concept is to put a high-resolution image at the center of wherever the eye is looking and fill in everywhere else with a low-resolution image to impart a feeling of fuller immersion. Varjo combines a large flat-panel display of the type used in VR headsets with a small OLED microdisplay for the "foveated" (center) image, using a beam-splitting mirror to combine the two.
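To illustrate why this architecture is attractive, here is a rough back-of-the-envelope sketch. The numbers (a 100-degree FOV, a 20-degree foveated insert, 1 and 4.4 arcminutes per pixel) are illustrative assumptions of mine, not Varjo's published specs:

```python
def pixels_uniform(fov_deg, arcmin_per_pixel):
    """Pixel count for a square display covering fov_deg at one uniform resolution."""
    pixels_per_side = fov_deg * 60 / arcmin_per_pixel  # 60 arcminutes per degree
    return pixels_per_side ** 2

def pixels_foveated(fov_deg, fovea_deg, fine_arcmin, coarse_arcmin):
    """Pixel count with a high-res central insert plus a low-res periphery.
    (Counts the whole peripheral panel; ignores the small overlapped center.)"""
    return (pixels_uniform(fovea_deg, fine_arcmin)
            + pixels_uniform(fov_deg, coarse_arcmin))

# Assumed numbers: 100-degree FOV at 1 arcmin/pixel everywhere vs.
# a 20-degree foveated insert at 1 arcmin/pixel with a 4.4 arcmin/pixel periphery.
uniform = pixels_uniform(100, 1.0)              # 36 million pixels
foveated = pixels_foveated(100, 20, 1.0, 4.4)   # roughly 3.3 million pixels
print(f"{uniform:,.0f} vs {foveated:,.0f}")
```

Under these assumptions, the two-display approach needs roughly a tenth the pixels of a uniformly high-resolution panel, which is the whole appeal of the concept.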
In theory, the beam splitter is supposed to move around causing the image from the OLED microdisplay to track the eye. While conceptually easy, this is very hard to solve in practice and is the part that Varjo chose not to demonstrate. When I asked at the show why they were not demonstrating it, I was given a bit of a waffling comment to the effect of their lab prototype being noisy and that people were so thrilled with the static (fixed mirror and eye tracking) version that they decided to concentrate on it.
The demo started with their static foveated display which consists of a modified VR headset with a static beam combiner and microdisplay. It does “work” if you look straight ahead, and the image quality is consistent with the pictures Varjo released in 2017 that were taken through the actual optics.
Varjo demonstrated a series of images, and as I have often written on this blog, a simple black and white test pattern is often the toughest thing for a color display to get right. Shown below (click on the image for a much larger view) is such a test pattern. The camera does tend to exaggerate the transition because when you are looking straight ahead, this region is meant to be seen by parts of the eye with much lower resolution.
One issue I did notice in the central region is some chromatic aberration (click on the photo to see it). A fundamental issue for a foveated display is that both the peripheral (low-resolution) and foveated (high-resolution) images must go through the same main lens (see diagram above). Varjo appears to be using a lens that may be just good enough for the peripheral image but is too low in quality for the foveated image, hence the visible chromatic aberration.
Ron Padzensky, who reviewed and helped edit this article, commented that while this demo impressed him, the transition from high to low resolution was abrupt enough that he noticed it. This is an issue I wrote about in my Part 2 article on Varjo. Additionally, for the same reason a foveated display is supposed to work at all, the lack of tracking means the eye can turn and bring the fovea to the transition, making it noticeable.
The next photo shows a more complicated scene, and none of the problems can be seen. The complexity of the image helps hide any noticeable flaws that would show up on say a flat white background. At the same time, it demonstrates the high resolution of the foveated region (outlined in red dots).
Varjo’s pass-through AR demo I found particularly interesting. The headset had two cameras on the front, and the room was rigged for motion tracking. I believe they are still using an optically modified Oculus Rift headset (see picture on the left) for this demo rig.
When you put the rig on, you see a 3-D rendered motorcycle in the room that you can walk up to and around, with the motorcycle convincingly behaving like it is in the room. Unlike with optical AR (e.g., Magic Leap and HoloLens), the virtual objects look solid and not ghostly. It is hard to tell the difference between what is real and what is virtual.
When you look straight ahead, you see what appears to be an extremely high-resolution image, which adds to the sense of the motorcycle being there. Some features of the motorcycle's computer model, however, look like they could use more polygons, as some curves turn into visible line segments (for example, the back of the motorcycle's mirror).
On the crop of the picture through the optics (left) and on the larger picture below, I have outlined with a dotted red line the rough boundary of the foveated image (click on the image to see larger versions). You can see the transition with the camera but you cannot if you are looking straight ahead.
While I didn’t notice a significant lag from head movement to the image changing, I was not rigorously checking for this issue during the demo. I also didn’t check for how the real-world looked in the foveated display region, but it would clearly be limited by the camera’s resolution.
I want to make clear that pass-through AR has its advantages and disadvantages compared to optical AR. Some of the hardest challenges with optical AR are almost trivial with pass-through AR.
Pass-through AR has major advantages regarding opacity and hard-edge occlusion. It is also much easier to balance the brightness between the virtual and real worlds. The optics are generally much simpler as well.
There are also serious downsides to pass-through AR compared to optical AR. No display can match the human visual system. There will always be some lag between input and output. The real world does not vary in focus properly, which causes vergence-accommodation conflict. Pass-through AR headsets are going to be bulkier, and they entirely block the user's direct view of the real world, thus isolating the user.
According to Varjo at AWE 2018, they have delayed perfecting the eye tracking and display movement to concentrate on building static foveated displays. This is in spite of what Varjo has said previously and what is on their website, “By tracking your eyes in real-time, Bionic Display™ delivers a flawless and completely accurate image that far surpasses anything on the market today.”
Certainly, the "static" (non-eye-tracking) foveated display makes a good first impression, with the user perceiving a very high-resolution image. But it only works if the user stares straight ahead with little eye movement. If there were detail across the FOV, such as text, the user would notice that only the center is sharp.
Unfortunately, accurately tracking the eye and getting everything to line up and work optically, including keeping everything in focus, is by far the hardest part of the foveated display problem. There are a number of difficult problems to be solved regarding eye tracking, software/algorithms, moving the foveated part of the image, and the overall optics.
There are good reasons to believe foveated rendering will be used to reduce the computational load of future VR headsets. Companies no less than Nvidia and Microsoft have published studies using large flat panels (see my first article on foveated displays). These studies used conventional flat panels and simply varied the rendering computation based on eye tracking.
Small flat-panel displays have continued to shrink their pixels, which in turn can improve angular resolution. While the first-generation VR headsets (à la Oculus Rift) were at a very chunky ~4.4 arcminutes per pixel, the Oculus Quest is at ~3 arcminutes per pixel. Though most designers' goal is to achieve about 1 arcminute per pixel, somewhere around 1.5 arcminutes per pixel is often considered "good enough" for most practical uses, particularly game playing. The inevitable "crank turning" of flat-panel displays is expected to keep resolution improving and closing in on these goals.
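The arcminutes-per-pixel figures above come from a simple relationship: the horizontal FOV in arcminutes divided by the number of horizontal pixels spanning it. A minimal sketch, where the ~80-degree FOV and 1080-pixel-wide panel are illustrative assumptions of mine (real headset lenses distort, so pixels are not truly spread evenly) rather than exact headset specs:

```python
def arcmin_per_pixel(fov_deg, horizontal_pixels):
    """Approximate angular resolution, assuming pixels are spread
    evenly across the horizontal field of view."""
    return fov_deg * 60 / horizontal_pixels  # 60 arcminutes per degree

# Assumed ~80-degree horizontal FOV over 1080 horizontal pixels per eye
# lands near the ~4.4 arcmin/pixel quoted for first-generation headsets.
print(round(arcmin_per_pixel(80, 1080), 1))  # 4.4
```

Run the other way, the same formula shows why 1 arcminute per pixel is so demanding: an 80-degree FOV would need 4,800 pixels across per eye.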
The question becomes whether flat panels will reach "good enough" angular resolution on their own before a "physical" foveated display, with an optically moving, eye-tracked high-resolution region, becomes practical.
I would like to thank Ron Padzensky for reviewing and making corrections to this article.