I am working on a pipeline of articles about AR systems and technologies, including the many things I saw at CES 2020 in January and Photonics West (PW) in February, plus some related topics. Between CES and PW, I have been going in circles trying to figure out what to write about first. Having gotten to see the Hololens 2 (HL2) with my own eyes two times, I decided to start with my observations on it.
I was able to try the HL2 the first time for 20 minutes, and while I could view any content, I was asked not to take any pictures. I was able to confirm a few things, but it will take more time, taking pictures, and making some measurements to more completely evaluate the HL2. The second time I tried the HL2 was at a public demonstration at Photonics West, where the content was tightly controlled and the session lasted about 10 minutes. This second time at least let me see a different unit, which had similar image quality to the first unit.
Most of the people getting units to date have a vested interest in not reporting bad things about Hololens. Many are hoping to develop for Hololens, and they don’t want to risk hurting their relationship with Microsoft.
Additionally, I want to go point by point through some information/misinformation that was tweeted by Alex Kipman, Microsoft Technical Fellow of AI and Mixed Reality. Kipman was tweeting about the photos through the optics of the Hololens 2 that were the subject of this blog’s article, “Hololens 2, Not a Pretty Picture.” Kipman’s tweets were posted on the same day as my article, December 18th, 2019.
Compared to the worst television being sold today, the image quality of the HL2 is terrible just about any way you could measure it. Color uniformity and saturation are poor, and small text is hard to read. As I will discuss, there is also an issue with flicker.
There are applications for a mixed reality headset in enterprise/business settings, as demonstrated by the likes of Toyota working with the HL2. Generally, these applications either require hands-free use or involve visualization where the user must move around a room. The ability to use SLAM to lock visual content to the real world is of interest to many companies, particularly for industrial use. I'm a bit more dubious of the use in sales and marketing of products in showrooms to the general public, where it will likely just be an expensive gimmick once the novelty wears off. In industrial applications, my concern is with safety, in particular, how it may obscure the view and perhaps cause eye strain issues.
The first time I used the HL2 was unplanned, but I was able to navigate to the test patterns on this blog at www.kguttag.com/test. I had generated some patterns at 1440p, which is the native resolution of the Hololens 2. Using test patterns with which I was familiar helped me identify issues with uniformity, color, color saturation, resolution, and flicker.
The human visual system does a lot of processing to produce the image that we think we see, including a form of automatic white balancing. Thus we have different color temperatures with things like a cool-white and a warm-white, and both will look “white” if that is all we see. But we will notice if we put up a solid white area and the colors vary.
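As an aside for the technically curious, a camera-side analog of this adaptation is the classic "gray-world" white balance algorithm: assume the scene averages out to gray and scale each color channel to match. This is a standard textbook technique, not anything specific to the HL2; the sketch below is purely illustrative.

```python
# Illustrative sketch of "gray-world" automatic white balance:
# assume the scene averages to neutral gray and scale each channel so it does.

def gray_world_balance(pixels):
    """pixels: list of (r, g, b) values. Returns rebalanced pixels."""
    n = len(pixels)
    # Average of each channel across the image.
    avg = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(avg) / 3
    # Per-channel gain that pulls the average toward neutral gray.
    gains = [gray / a if a else 1.0 for a in avg]
    return [tuple(min(255.0, p[c] * gains[c]) for c in range(3)) for p in pixels]

# A warm-white image (too much red, too little blue) gets pulled toward neutral,
# just as the eye adapts so that warm-white and cool-white both look "white."
warm = [(200, 180, 160)] * 4
balanced = gray_world_balance(warm)
print(balanced[0])
```

This also illustrates why the HL2's non-uniformity is so visible on a solid white field: adaptation can shift the overall white point, but it cannot hide color that varies from one part of the image to another.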
In terms of the physical comfort of one's head and neck, the HL2 is a big improvement. Much of this comes from balancing the weight and the ability to flip up the display. The hand tracking and gesture control are vastly improved over the HL1, and this leads to much less hand and arm strain when making gestures.
Still, for all the talk about ergonomics, the fact that the HL2 uses a 120Hz interlaced (60Hz full-frame) refresh is ergonomically poor and should be disqualifying IMO. Based on both my calculation (see my article on Hololens 2 Interlacing) and my observation, the refresh rate falls far below the ISO-9241-3 standard's recommendation for flicker ergonomics set in 1992 (see the Appendix: Some History of Flicker in Computer Monitors). Humans vary widely in their ability to perceive flicker and in their adverse reactions to it. Some of the interlaced refresh's problems are clearly perceptible, like lines occasionally disappearing (to be discussed later). Others are less obviously perceptible but can cause eye strain, soreness, and nausea with use.
In the longer session with the HL2, I felt mild to moderate pain in my eyes. It didn't keep getting worse, but it didn't go away until after I took off the Hololens 2. It felt almost like my eyes were swelling. I don't remember this type of pain in my eyes with the Hololens 1, Magic Leap, or other headsets.
My one-off experience (I didn't notice it the second time, but my exposure was shorter) and the fact that I have not heard of others reporting this problem mean it is not definitive, but it is something I will be checking out more in the future. My best guess is that it could be an adverse reaction to the flicker.
HL2 uses diffractive waveguides, and every diffractive waveguide has problems with color uniformity. As expected, even the best HL2 will have problems with, say, large areas of mostly white (such as a typical web page). So the question becomes, what constitutes a "good" versus a "bad" unit? The answer is a function of the content being shown, a person's tolerance for poor image quality, and perhaps their desperation to work with a Hololens 2.
Using a large white test pattern, the variation in color across the image was very noticeable. Both units I tried had the significant color uniformity problems I had expected, having seen many diffractive waveguide-based headsets. With the first unit I tried, I did take the time to go through the "eye calibration" that, according to Alex Kipman's tweets, should improve image quality. While it could be considered usable, the color uniformity was clearly not very good.
The two HL2 units I have used are much better than the pictures that have been posted. Still, both units were significantly worse than diffractive waveguides from WaveOptics, for example, or even the Hololens 1. By all reports, there is a wide distribution with the HL2 in terms of color uniformity. I would note that if you just stick a cell phone up to a waveguide, you can exaggerate the problems of a diffractive waveguide. But remember, the pictures are being taken after a person sees problems with their own eyes, so it is not just a problem with the way the pictures are taken. It helps to have a small "mirrorless" interchangeable-lens camera where you can control the focal length, f-number, shutter speed, and ISO (I use an Olympus E-M10 Mark III). It is not a perfect analog for the human eye, but it is much closer than a cell phone camera.
In my online test patterns, I included a picture of a Christmas Elf, which has nice skin tones plus some very saturated color in the background. I have been using this picture for about 10 years, and I know what it is supposed to look like.
It was evident to my eye that the colors lacked saturation. The Elf sub-image is repeated four times in the 1440p test pattern, and nowhere did it look well saturated. It seemed least saturated in the center of the screen, which is where the laser scanning process is moving the fastest.
Prior to ever seeing the Hololens 2, I had received multiple reports of "flickering lines," and indeed, I saw lines flickering/disappearing occasionally. If you have high-resolution, high-contrast detail on the screen, such as text or the lines in my test patterns, it tends to randomly appear and disappear. I also noticed that if you stare at what should be a solid area, occasionally every other line will disappear. You can also see the individual scan lines if you concentrate on an area.
I believe that the main issue is a temporal one due to interlacing (see my article on Hololens 2 Interlacing) at too low a refresh rate. Another cause could be an aliasing problem from the combination of small head movements (and the resultant changes in the image), the laser scanning process, and the conversion between rectangular pixels and raster scan lines. The Microsoft Hololens engineers probably know all the issues and their causes, but they are not telling 😊.
Conceptually, the human vision system takes a complex series of snapshots with the eye constantly moving (movements known as saccades) and effectively blanks vision between movements. With interlaced video sources and the saccadic movement of the eye, there will be occasions where the human vision system puts the interlaced fields together wrong or even misses a field altogether, resulting in missing lines.
The other likely source of the flickering is that with laser beam scanning (LBS), scan lines are not the same as rows of pixels. Pixels in the image have to be scaled and remapped onto where the laser beam is scanning. When doing the remapping, there is also the classic resampling problem of either making the image softer/blurrier or accepting some level of temporal aliasing (moving jaggies).
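The resampling trade-off above can be sketched in a few lines. This is not Microsoft's actual remapping algorithm (which is not public), just the textbook dilemma in one dimension: nearest-neighbor resampling keeps edges crisp but lets a thin line jump position (or drop out) as the mapping shifts slightly, while linear interpolation is stable but spreads a one-pixel line across its neighbors, softening it.

```python
# Illustrative sketch (not Microsoft's algorithm): remapping a row of image
# pixels onto a different number of scan positions, two classic ways.

def resample_nearest(row, n_out):
    """Nearest-neighbor: crisp, but detail can jump or vanish ("moving jaggies")
    when the pixel-to-scan-line mapping shifts slightly from frame to frame."""
    n_in = len(row)
    return [row[min(int(i * n_in / n_out), n_in - 1)] for i in range(n_out)]

def resample_linear(row, n_out):
    """Linear interpolation: a stable mapping, but fine detail gets blurred."""
    n_in = len(row)
    out = []
    for i in range(n_out):
        x = i * (n_in - 1) / (n_out - 1)   # source position for output sample i
        lo = int(x)
        hi = min(lo + 1, n_in - 1)
        frac = x - lo
        out.append(row[lo] * (1 - frac) + row[hi] * frac)
    return out

# A one-pixel-wide white line on black: nearest keeps it at full brightness,
# linear spreads its energy into neighboring samples (softer/dimmer edges).
row = [0, 0, 255, 0, 0]
print(resample_nearest(row, 9))
print(resample_linear(row, 9))
```

With high-contrast single-pixel detail like text strokes or my test-pattern lines, neither choice is free, which is consistent with lines appearing and disappearing as the head moves slightly.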
My first test for any headset is typing a WiFi password and then typing in a web address. The WiFi was already connected, but I still needed to type in a web address. On the older HL1, typing was quite literally painful and very time-consuming with the "aim the dot and then use a pinch gesture" method. The Hololens 2 has a floating keyboard where you push the keys with your finger. The way the keyboard works is even rather pleasing. While you are not going to want to type much this way, it does work well enough for things like web addresses and short messages. It still reduces you to hunting and pecking one key at a time.
There is a bit of a user interface dilemma with MR/XR with respect to being hands-free. Hololens (1 and 2) has so far focused on gestures you can see, which, while freeing up the hands, requires the user to look at their hands while interacting with the display. Many people have noted that they wish they had a controller like the Magic Leap One's, where you can keep your arms down and have some tactile feedback, but that then occupies a hand. Alternatives like data gloves and wrist muscle/nerve sensors may work in some applications but not others. I would think many applications will end up relying on a cell phone, data tablet, or similar device for any significant text input.
On the same day I published "Hololens 2, Not a Pretty Picture," Alex Kipman made a five-part set of tweets in response to the images I cited in the blog post (see left). I would like to go through each part and respond.
Kipman Part 1: “Friends, we have a binocular system that forms an image at the back of your eyes, not in front of it. Eye tracking is fully in the loop to correct comfort which also includes color.”
At best, this is a half-truth. First, we should note that people who have done the "calibration" have still been complaining about color uniformity problems they see with their own eyes. Many people, including myself, who have seen the HL2 and other waveguide displays have judged the HL2 to be very poor in terms of color uniformity. So if the HL2 is correcting for it, either Microsoft is not doing a very good job of correction or the displays are so bad that the correction is ineffective.
It also does make a difference in what both eyes are seeing as the human visual system will tend to average, but with one eye dominating. In my limited observations, I saw a similar run-out in color in both eyes and didn’t notice a big difference between using one or both eyes (something I will want to look more at in the future).
The half-true part is that some level of color uniformity correction is possible with waveguides. WaveOptics, in their presentation at the Photonics West AR/VR/XR conference, showed a before and after with correction (see below).
Kipman Part 2: “Eye relief (the distance from lens to your pupil) changes the image quality. Further out you are, worse the image quality becomes in terms of MTF as well as color uniformity.”
Eye relief is going to vary from person to person based on the shape of the head, whether they are wearing glasses, and other factors, including perceived comfort. One of the big advantages of the HL2 over any other AR/MR headset is the amount of eye relief. It looks like Kipman is saying that the HL2's eye relief comes at the expense of color uniformity.
Kipman Part 3: “Taking monocle [sic] pictures from a phone (or other camera) is completely outside of our spec and not how the product is experienced.”
While it is true that a camera can exaggerate the effects, Kipman's "outside of our spec" is non-responsive to the issue. I would like to know the color uniformity spec and how they measure and correct for it. How are HL2 units tested for quality before shipment? Based on reports, the testing is pretty loose, to say the least.
It is true that a phone can be a poor model for the eye, particularly in the hands of "amateur photographers." Phones have much smaller sensors than the eye's retina and much smaller numerical apertures, and the phone camera is often put in the wrong place. The phone also works differently than the human visual system. All these facts can lead to exaggerating bad effects. But still, if done properly, the right camera with the proper setup can give a reasonable representation of what the eye sees.
The size of the aperture and the location of the camera will have an effect. Personally, I like using a 4/3rds (Olympus) camera as it seems to better match the eye's parameters. Cell phone cameras/lenses/apertures are too small, and full-size DSLRs are too big. One also wants full control over shutter speed, aperture, and ISO (gain) to get a representative picture.
While I recognize and agree that a camera works quite differently than human vision, you can still get a picture that fairly represents what the eye sees (as, once again, WaveOptics has shown). I think saying that you can't is a marketing waffle to cover up problems.
Kipman Part 4: "When you look at it with both eyes, at the right eye relief (somewhere between 12-30 mm from your eyes) with eye tracking turned on, you experience something very different."
Another half-truth. Once again, people are seeing problems with their own eyes even after having been “calibrated.”
Kipman Part 5: "if you are having issues experiencing our product, first our apologies, second please get a hold of us (firstname.lastname@example.org is your friend) and let’s engage on how we can solve your issues. Team is fully leaned in and listening."
There are numerous complaints online that Microsoft is unresponsive to problems. The field support representatives don't know what to tell people. Also, the people who have been getting the HL2 to date are very "select," and most have vested reasons not to speak out against the HL2 (including the risk of being cut off from support altogether), and yet complaints are still filtering through.
Alex Kipman is acting more like a marketing person than a technical expert. As I pointed out in my blog post, Hololens 2 Video with Microvision "Easter Egg" Plus Some Hololens and Magic Leap Rumors, I'm sure he is very intelligent in some areas, but his understanding of displays seems superficial. He has also been disingenuous in saying that Microsoft invented the laser beam scanning engine when there is plenty of evidence it was developed by Microvision.
“Those who cannot remember the past are condemned to repeat it.” (Santayana, 1905). I know some think my criticism of the HL2's flicker seems a bit harsh, but I have some personal history developing graphics circuits in the days when there were only CRT displays, with their flicker issues. There also must be many people at Microsoft working on the HL2 who know about the flicker issue.
With a scanning-type display like a CRT or Laser Beam Scanning (LBS), the screen refresh rate is how often any given spot on the screen is re-illuminated (its inverse is the flicker period). It should not be confused with the frame rate, which is how often the image content changes. Modern LCD and OLED flat-panel displays usually change from one image to the next without a period of blanking, and so they shouldn't flicker. Unfortunately, some LCD backlights and some OLEDs have flicker problems due to PWM dimming at too low a rate.
Interlaced refresh is where each scan of the image draws every other line, so any given line is refreshed only on every other scan. It is a trick that was used in the days of CRT televisions to reduce flicker. It helps primarily when the TV is viewed from far away (so the lines blur together) and when the content keeps changing. But as was found when CRTs were used as computer monitors, flicker becomes more of an issue as you get up close.
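The mechanics are simple enough to sketch. Using the HL2 numbers discussed earlier (120Hz field rate, 60Hz full-frame refresh), each pass draws only the even or only the odd lines, so while fields arrive at 120Hz, any individual line is redrawn at only 60Hz:

```python
# Sketch of interlaced refresh timing, using the HL2 numbers from the article:
# fields arrive at 120 Hz, but each field draws only every other line.

FIELD_RATE_HZ = 120
LINE_REFRESH_HZ = FIELD_RATE_HZ / 2   # any single line is redrawn at 60 Hz

def field_lines(num_lines, field):
    """Lines drawn in one pass: field 0 = even lines, field 1 = odd lines."""
    return list(range(field, num_lines, 2))

# With 8 lines, two consecutive fields together cover the full frame,
# but each individual line waits a full frame time between refreshes.
even = field_lines(8, 0)   # [0, 2, 4, 6]
odd = field_lines(8, 1)    # [1, 3, 5, 7]
print(even, odd, LINE_REFRESH_HZ)
```

This is why a one-pixel-tall line (or one row of small text) effectively flickers at the 60Hz frame rate even though the marketing number is 120Hz.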
There is a very wide range in humans' ability to notice and/or feel ill effects from flicker. Thus, for the same display, one person might have no problems with the flicker while another person has a very adverse reaction to it.
I was the lead architect of the TMS34010 (1986) and TMS34020 (1988) graphics processors, as well as the first VRAM (1984). The VRAM, a precursor to both the SDRAM and GDRAM, was specifically created to support the refreshing of (then) higher resolution CRT computer monitors.
In 1987, IBM introduced the 8514/A graphics card, which supported an ~87Hz interlaced (~43.5Hz full-frame refresh) display along with an IBM custom monitor with longer-persistence phosphors. Up until the introduction of the 8514/A, most people felt that 60Hz progressive refresh rates were necessary to avoid flicker. It turned out that many people had problems with the flicker of the IBM 8514/A's interlaced refresh even with the longer-persistence phosphor monitors. For example, it was reported in the April 10, 1990, issue of PC Magazine on graphics accelerators.
It so happens that in the same PC Magazine issue, it was reported on page 175 (right) that “In this roundup, all the fastest adapters use TMS34010 coprocessors.”
The fact that so many people were having trouble with the flicker from the 8514/A and its 87Hz interlace led to studies of the flicker issues with CRT computer monitors. These studies resulted in the 1992 ISO-9241-3 recommendations for computer monitors. It was found that even 60Hz progressive scanning was not fast enough and that the perception of flicker also varied with screen brightness (among other factors). The ISO committee put out a recommendation based on a formula, which simplifies down to about an 85Hz refresh for most practical uses. See the graph below, based on the ISO-9241-3 standard, from the article The Human Visual System Display Interfaces Part 2 on the website What-When-How.
And the ISO-9241-3 studies assumed CRTs with phosphors having some persistence (say, on the order of 1 to 2 milliseconds). With the HL2 and LBS, there is zero persistence and thus more of a tendency to flicker and more of a chance for the eye to see all or nothing in the case of disappearing lines.
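The brightness dependence the ISO studies found can be sketched with the classic Ferry-Porter law, which says the critical flicker fusion frequency (CFF) rises with the log of luminance. To be clear, this is not the ISO-9241-3 formula itself, and the coefficients below are illustrative textbook-style values, not measured data; the point is only the trend: the brighter the display, the faster it must refresh to avoid visible flicker.

```python
import math

# Hedged illustration of the Ferry-Porter law: CFF = a * log10(L) + b.
# The coefficients a and b here are illustrative, NOT the ISO-9241-3 formula.

def ferry_porter_cff(luminance_cd_m2, a=12.5, b=36.0):
    """Approximate critical flicker fusion frequency in Hz.
    Brighter displays (higher luminance) need faster refresh rates."""
    return a * math.log10(luminance_cd_m2) + b

for lum in (10, 100, 1000):   # dim monitor -> typical monitor -> bright AR display
    print(lum, round(ferry_porter_cff(lum), 1))
```

This trend cuts against the HL2 twice over: AR displays must be bright enough to compete with ambient light, and brighter displays sit higher on the flicker-sensitivity curve, which is one reason a 60Hz effective line refresh is so problematic.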