Introduction – Following Up on Color Control
As I noted in Hololens Display Evaluation (Part 3: Color Uniformity), one of the first things I noticed when I first tried a Hololens 2 (HL2) was that the colors were washed out (see below). The color was much worse than on the Hololens 1. So what’s wrong with the HL2’s intensity control? After all, the HL2 is starting with lasers that have highly saturated colors, so why are the colors so washed out?
The severe color uniformity problems of the HL2 tend to mask the poor color and intensity control of the laser beam scanning (LBS) engine itself. This article is going to separate the LBS’s poor intensity control from the HL2 waveguides’ color uniformity problems.
As discussed in February 2019 in Hololens 2 is Likely Using Laser Beam Scanning Display: Bad Combined with Worse, there are many unsolvable problems with LBS. Hololens 2 Display Evaluation (Part 1: LBS Visual Sausage Being Made) showed the resolution problems and temporal artifacts (including flicker and rippling) associated with the HL2’s LBS. After showing the photographic evidence, this article will explain some of the causes of the poor intensity and color control of the HL2’s LBS.
Intensity and brightness are commonly used interchangeably. For this article, intensity will be used to describe the luminance (measured in nits = cd/m²) [corrected 8/3]. Brightness will be used when referring to the overall display’s light output as controlled by the HL2’s brightness buttons.
Human Perception of Brightness “Gamma” Decoding
Human brightness perception is very non-linear. A much smaller absolute change in intensity is needed in dark areas than in bright areas to produce the same perceived change in brightness. All common image and video formats take advantage of this fact when they encode an image to use fewer bits for the same perceived image quality. For example, with the sRGB colorspace encoding (the most widely used), the sum of the first 30 steps (0 to 30) in intensity is about the same as just the last step from 254 to 255 using 8-bit pixel values.
To correctly display 8-bit (2^8 = 256) encoded pixel color values requires about 12 bits (4,096 levels) of linear intensity control. The non-linear encoding and decoding of pixel values are commonly referred to as “gamma” even though the curves most standards use are not simple gamma curves. For more discussion on gamma decoding, see the Appendix at the end.
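As a rough check on the numbers above, here is a minimal Python sketch. It assumes the standard sRGB decoding function (the one discussed in the Appendix); the function and variable names are my own and are only for illustration.

```python
import math

def srgb_to_linear(code: int) -> float:
    """Decode an 8-bit sRGB code value (0-255) to linear intensity (0.0-1.0)."""
    v = code / 255.0
    if v <= 0.04045:
        return v / 12.92          # linear segment near black
    return ((v + 0.055) / 1.055) ** 2.4

# Linear intensity covered by code values 0 through 30.
first_30 = srgb_to_linear(30) - srgb_to_linear(0)
# Linear intensity covered by the single last step.
last_step = srgb_to_linear(255) - srgb_to_linear(254)

# Smallest linear step anywhere in the 8-bit range (it falls in the segment near black).
smallest_step = min(srgb_to_linear(c + 1) - srgb_to_linear(c) for c in range(255))
# Bits of linear control needed to resolve that smallest step.
bits_needed = math.log2(1.0 / smallest_step)

print(f"codes 0..30 span {first_30:.4%} of full intensity")
print(f"code 254->255 spans {last_step:.4%} of full intensity")
print(f"smallest step is 1/{1.0 / smallest_step:.0f} of full scale, about {bits_needed:.1f} bits")
```

Both spans come out on the order of 1% of full intensity, and the smallest step works out to a little under 12 bits of linear resolution, which is where the "about 12 bits" figure comes from.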
Simple Test Patterns Help Isolate Issues
The simplest test patterns often work best for evaluating a display device. In this case, I am going to use simple “gray ramps” (including a ramp in green-only) that step through all 256 levels (with 8-bits/color), starting at 0 (black) in the center and incrementing to 255 (white/brightest green). Gray ramps usually begin on the left or right of the screen, but with the HL2’s waveguides, the colors on the left and right sides are so messed up that they would obscure the transition at the dark end. So gray ramps were created that start from the center. The test pattern has some additional shaded squares with their 8-bit pixel values.
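For anyone wanting to try something similar, a center-out ramp is easy to generate. The sketch below is not the tool used to make the test patterns in this article; it is a simplified illustration (no labeled squares or text), and the function names and the 1920-pixel width are my own assumptions. It needs a width of at least 2 pixels.

```python
def center_out_ramp_row(width: int, levels: int = 256) -> list[int]:
    """Return one row of pixel values: 0 at the center, rising to levels-1 at both edges."""
    center = (width - 1) / 2.0
    row = []
    for x in range(width):
        # Distance from the center as a fraction (0.0 at center, 1.0 at either edge).
        frac = abs(x - center) / center
        row.append(min(levels - 1, int(frac * levels)))
    return row

def write_ppm(path: str, row: list[int], height: int, green_only: bool = False) -> None:
    """Write the ramp as a binary PPM image, repeating the same row on every line."""
    line = bytearray()
    for v in row:
        line += bytes((0, v, 0) if green_only else (v, v, v))
    with open(path, "wb") as f:
        f.write(f"P6 {len(row)} {height} 255\n".encode("ascii"))
        f.write(bytes(line) * height)

row = center_out_ramp_row(1920)
# Example usage (hypothetical file names):
# write_ppm("gray_ramp.ppm", row, height=200)
# write_ppm("green_ramp.ppm", row, height=200, green_only=True)
```

At 1920 pixels wide, each of the 256 levels ends up a few pixels wide on either side of the center.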
Below are the test pattern (left) and the HL2’s image (right). Click on each image to see larger versions (the Hololens 2 picture has been scaled down to be 1920 pixels wide from ~4K originally).
Yes, the coloration looks this bad to the eye. The HL2 generates a “frosty rainbow of color.” The severe color uniformity problems with the HL2 obscure most of the gray ramp. Notice through all the colorations that the darker area in the center of the screen is too thin. The HL2’s LBS display is too rapidly transitioning from dark to light.
Green (Only) Ramp More Clearly Shows Some Issues
A green-only ramp test pattern reduces confusion from the waveguide’s coloration. The photo (left) shows the image generated by the HL2 (click on image for high-resolution version).
The main thing to note above is that rather than smoothly moving from black through dark green to bright green, the HL2 very rapidly jumps from black to green, leaving a moderately thin dark area in the middle. The too rapid transition from dark to light has the effect of compressing the colors at the white/bright end and desaturating colors, as seen in the Elf Pictures (repeated below).
Next, we are going to look closer at the green ramp from the HL2 above. The four figures below show the picture of the green ramp compared to the test pattern along with a larger photo and test pattern cropped from the center.
The green-only ramp shows that there is a series of somewhat diagonal bands of brightness and color indicated by the blue dot annotations in the top picture. Most of the time, my eye is drawn to the trapezoidal one more toward the center, but this green image shows more clearly that there is a series of diagonal bands rather than a single region.
Observation: Fat Text and Lines
Something else I regularly see is that the HL2 draws everything too fat. The widening effect can be seen in both the large red text and the smaller white number above. The HL2 blurs everything, thus covering a higher percentage of the image than it should with its “wobulation” as discussed in Part 1 of this series.
Red and Blue Colors in the “Green Ramp”
There are visible reddish splotches in a green-only ramp. No matter how you filter or diffract green light, it won’t turn red. The red comes from a combination of two things: some red (and blue) is mixed into the green to get the right color point, and the HL2’s extremely non-uniform waveguide blocks green light in the area of the red patches.
The compensating colors, in this case, red and blue, should be at a fraction of the values of the primary color (green in this case) they are compensating. They should also be the same linear percentage as the primary color changes intensity to have consistent colors. As the pictures below show, the HL2 is nowhere close to keeping the compensation consistent.
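To illustrate what "the same linear percentage" means, here is a small Python sketch. The mixing fractions are hypothetical numbers I picked for illustration; they are not the HL2's actual values, and the function names are my own. The point is only that the red-to-green and blue-to-green ratios, measured in linear light, should not change as the green level changes.

```python
def srgb_to_linear(code: int) -> float:
    """Decode an 8-bit sRGB code value (0-255) to linear intensity (0.0-1.0)."""
    v = code / 255.0
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4

# Hypothetical mixing fractions, chosen only for illustration.
RED_FRACTION = 0.05
BLUE_FRACTION = 0.02

def target_intensities_for_green(code: int) -> tuple[float, float, float]:
    """Return linear (red, green, blue) intensities for a 'pure' green input code."""
    g = srgb_to_linear(code)
    # The compensating colors scale with the green, so their ratio stays fixed.
    return (RED_FRACTION * g, g, BLUE_FRACTION * g)

for code in (1, 8, 64, 255):
    r, g, b = target_intensities_for_green(code)
    print(code, f"red/green = {r / g:.3f}", f"blue/green = {b / g:.3f}")
```

A display that held these ratios constant would show the same hue of green from the darkest step to the brightest.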
This blog discussed the need to desaturate/mix laser colors to get the expected colors way back in 2011 in Direct Diode Green Lasers (Part 2, Chromaticity). Roughly speaking, a laser’s green (and red and blue) is “too green,” and if used as is for, say, green grass, it would make the grass look like it is glowing. BTW, this is also an issue when using LEDs, but not as severe. A percentage of red and blue are mixed in even when calling for a “pure” green. A similar mixing takes place for red and blue.
The compensation should be mixing a very small amount of red and blue with the green to correct the color. But because of the large step from zero and the rapidly increasing red light, it is overcompensating the green at the dark end of the ramp.
Below is a gallery of 4 images showing the test pattern along with the HL2’s display filtered (in Photoshop) to show only individual green, red, and blue components. Remember that other than the squares and text, everything in the ramp of intensity is supposed to be green-only. As can be seen, there is a lot of red and some blue in the “green” ramps.
Closeup Look at the Transition from Black
A test pattern with wide steps covering just levels 0 to 8 (right) was created to investigate how the HL2 transitions from black. The HL2 was backed up from the screen, making the virtual image smaller, and the 0 to 8 ramp was positioned in what appeared to be the most uniform part of the waveguide for taking the picture. The exposure of the image was adjusted such that the upper “255” white square was about 244 using Photoshop post-processing.
There is a massive step from black/zero to the 1. And then the increases in intensity continue to be too large for the rest of the steps from 1 to 8. In the source test pattern, there are no noticeable intensity steps from 0 to 8. Even the step in the original source test pattern from 8 back to 0 is barely noticeable whereas it is a large step in the HL2’s image.
One more thing to notice is that there is a diagonal ramp that should be a single color (see green arrow above the 3 in the HL2 picture — you will need to look at the larger version to see it clearly). These ramps are visible.
HL2 Eleven Brightness Levels – ~538 nits in the Center
Because of the horrible uniformity of the HL2 and the laser scanning process’s flicker, it was not possible to get an accurate nit reading with a simple light meter. It was possible with a camera using a long exposure to make some very rough estimates of the intensity at the center of the display. Microsoft claims the HL2 has 500 nits, which appears to be about right for the center of the display based on some long exposure camera readings.
The HL2 has 11 brightness levels, with up and down buttons to adjust the level. Using a camera, my very rough estimates of the brightness at the center of the display for the 11 levels are 16, 38, 59, 91, 113, 140, 221, 291, 360, 414, and 538 nits. Most of the pictures used in this series were taken with the brightness three levels below the maximum (very approximately 291 nits). This brightness level seemed reasonably comfortable and appeared to be a little brighter than a 200 nit computer monitor (which was also used to “calibrate” the camera exposure readings).
The HL2 with 538 nits is noticeably brighter than the Hololens 1, which this blog reported as being 320 nits (cd/m²) in Magic Leap, HoloLens, and Lumus Resolution “Shootout” (ML1 review part 3). The HL2 also blocks about 60% of the real-world light (as measured by a meter). With the HL2’s 60% light blocking, about 500 nits should be enough for most indoor uses even in brightly lit areas. But the HL2 is about an order of magnitude too low in luminance for use outdoors in bright daylight.
Close Up Look at Green Ramp
For the images below, the Green-Only Test-Pattern was put up in a large virtual window, and then the HL2 was moved close enough such that about the top 1/3rd of the test pattern filled the whole display. Virtually zooming in has the effect of making each step in pixel value in the ramp many pixels wide.
Once again, all the effects pointed out below can be seen by a human. These pictures were taken because I was seeing these strange effects and decided to document them. The simplicity of the patterns allowed the eye to detect that something was wrong.
The picture below of the HL2’s display is annotated in Blue and Yellow and with rectangular (left) and oval (right) magnified inserts. This particular picture was taken with the HL2 brightness set 3 levels below its maximum. Pixel values of 0 through 12 are labeled on the right side of the gray ramp.
Even though the image calls for a pure green ramp, there is some red mixed into the green, as discussed in the section “Red and Blue Colors in the ‘Green Ramp’” above. In the next picture, the red and blue have been filtered out to show only the effects of the green lasers.
Perhaps the most interesting thing to see in this image is the “stairsteps” inside the smaller yellow dashed rectangle. The test pattern image in each vertical stripe calls for the same value, yet the HL2 is putting out two different values.
The stairsteps move and change as a function of moving the HL2. Note that the stairsteps result in a wider darker area at the top. As will be seen in the red filtered picture, the red ramp stairsteps such that the center is darker (in red) at the bottom of the image while the green brightness is trying to be constant. Maybe there is a reason for this, but it makes no sense to me.
You might be able to make out the dashed lines in green, but it might take looking at the larger image (click on the image below to see the larger image) to see them. It turns out these dashed lines are in red, green, and blue, but are more evident in red.
The next image has been filtered to show only the red laser’s effect. Interestingly, there is a relatively high percentage of red at the dark end of the green-only ramp, to the point where the dark colors look yellow (green+red) and even reddish.
The red filtered image more clearly shows the red “dashed lines,” particularly in the center of the oval insert. Also, in the oval insert, you should note the diagonal brightness change within a single color value. The red has more prominent stairsteps on the left (highlighted by the larger dotted yellow rectangle) and right.
Another issue shown in the red filtered image is diagonal steps within a color value. At the top of the display, the diagonals are fairly sharp (look under the 6, 7, and 8) and they are fuzzier in the middle of the image.
Below is the same image filtered to show only the blue component. In this channel, the blue levels have been increased by about 3.5 stops to make them more visible.
Interestingly, the blue laser is slightly on in step 0 but totally off on step 1. The zero/black level is at a subthreshold level that emits light, whereas whatever is computing step 1 sets it to a lower current level that completely turns off the light output. Having step 1 go to “blacker than black” causes an inconsistent turn-on for step 2.
Setting HL2 at its Brightest and Darkest Settings
After shooting the Max-3 picture, the camera and HL2 were kept in the same position to shoot pictures with the HL2 at full/max brightness and at its minimum setting. In the photos below, the camera exposure and some post-processing in Photoshop were used to adjust the overall brightness to be about the same as the Max-3 picture.
At the Max setting (below left), the image overall behaves similarly to the setting 3 levels down. There are the same artifacts of dashed lines, diagonal steps, and the stairsteps as the Max-3 case, but they are in slightly different places.
At the lowest brightness setting (below right), the HL2 simply gives up on the darker colors. The first thing is that level 1 disappears entirely. The effect of the red turning on too much changes the color of the ramp from green to yellow and red.
In the triangular area at the top from a level of 0 to 3 where the image is red, the green laser is visibly off (see green-only picture below left). Increasing the exposure by three stops revealed that the HL2 was actually emitting some “subthreshold” green light for steps 0 and 1 but then goes visibly off for steps 2 to 5 in the triangle at the top (see bottom right), just as we saw with the blue laser in the Max-3 image.
Approximating the HL2’s Input to Output Response
The first image below is a crop from the HL2’s gray ramp picture (at Max-3 brightness) toward the beginning of this article. The dotted blue line shows a somewhat trapezoidal area that is brighter and more uniform. This picture has been scaled to be the same size as the test pattern.
The second image shows the original gray ramp test pattern with an inset of the original Elf image from a different test pattern.
In the third image, the original test pattern and Elf images above have had their responses modified (in Photoshop), roughly matching the HL2’s photograph. The inset curve on the left shows the compensation curve (in red) that was used, and the picture of the HL2’s Elf image is copied on the right for comparison.
The fourth image shows the same gray ramp from the Hololens 1 (LCOS based), which, while having color uniformity issues due to the waveguide, has a reasonable looking gray ramp.
Issues with Controlling Laser Intensity
Shown below are current versus light output (milliwatts for lasers and lumens for the LEDs) for a green laser diode (left) and a green LED (right) from OSRAM. While the exact values will vary from device to device, the general shape of these curves is typical for lasers and LEDs.
As shown above, lasers have response characteristics that make controlling them inherently much more complicated than LEDs. Lasers have a threshold current below which they won’t lase/output light. But then once the threshold is reached, the intensity increases sharply with incremental current input. For LBS, the lasers are driven just below the threshold for “black,” or else they will take too long to turn on when necessary.
A small relative change in current can cause the lasers to go from outputting no light to significant amounts of light. On top of this, the threshold changes with the temperature of the laser. The heat of the laser is going to be affected by its self-heating caused by image content.
The threshold and sharp curve mean there is little difference in current between the laser being off and much brighter than it should be for the darker end of the gray ramp. Because the pulse widths are already very narrow due to the time of a pixel in the scanning process, there is not much opportunity for time-based control such as pulse width modulation to control perceived intensity.
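To put rough numbers on how little margin there is, here is a toy Python model of an idealized laser: no light below threshold, then a straight line above it. All of the currents and the slope are hypothetical values I made up for illustration; they are not taken from the OSRAM curves or from the HL2.

```python
def laser_output_mw(current_ma: float, threshold_ma: float, slope_mw_per_ma: float) -> float:
    """Idealized laser: no light below threshold, then linear above it."""
    return max(0.0, (current_ma - threshold_ma) * slope_mw_per_ma)

# Hypothetical operating point, for illustration only.
THRESHOLD_MA = 50.0     # threshold the driver assumes
FULL_SCALE_MA = 150.0   # current that gives full brightness
SLOPE = 1.0             # mW of light per mA above threshold

full_scale = laser_output_mw(FULL_SCALE_MA, THRESHOLD_MA, SLOPE)

# A dark pixel at about 0.1% of full intensity.
dark_target = 0.001 * full_scale
dark_current = THRESHOLD_MA + dark_target / SLOPE

# The drive current stays the same, but the real threshold is 1% off in either direction.
too_bright = laser_output_mw(dark_current, THRESHOLD_MA * 0.99, SLOPE)
missing = laser_output_mw(dark_current, THRESHOLD_MA * 1.01, SLOPE)

print(f"dark level needs only {dark_current - THRESHOLD_MA:.2f} mA above threshold")
print(f"threshold 1% lower: output is {too_bright / dark_target:.0f}x the target")
print(f"threshold 1% higher: output is {missing:.2f} mW (pixel missing)")
```

In this toy model, a 1% error in the threshold turns a dark pixel into one several times too bright, or turns it off entirely. Real lasers have a softer knee than this, but the sensitivity problem is the same in kind.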
Part of the intensity range must be reserved for compensating for the continually changing beam velocity as it speeds up in the center of the display and slows down on the outsides. On top of everything, there is the overall brightness control.
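The size of the scanning-speed compensation can be sketched in a few lines of Python. This assumes the horizontal sweep is sinusoidal (as with a resonant mirror), which is my assumption for illustration; the function names are also my own.

```python
import math

def relative_beam_speed(x: float) -> float:
    """Beam speed at horizontal position x (-1..1), relative to the speed at center.

    Assumes a sinusoidal sweep x = sin(phase), so the speed is proportional to
    cos(phase) = sqrt(1 - x^2).
    """
    return math.sqrt(max(0.0, 1.0 - x * x))

def dimming_factor(x: float) -> float:
    """Laser power scale needed for the same perceived brightness at position x."""
    # The beam dwells longer where it moves slower, so power must drop in proportion.
    return relative_beam_speed(x)

for x in (0.0, 0.5, 0.8, 0.9):
    print(f"x = {x:+.1f}: laser power scaled to {dimming_factor(x):.2f} of center")
```

Under that assumption, if the image uses the sweep out to 90% of its full width, the lasers at the edges must be run at well under half of their center power. That comes straight out of the range available for pixel modulation.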
Summarizing Laser Modulation Issues
1. Modulation requirement due to the human visual response
   - Should have more than 4,000 levels (12 bits) to support typical 8-bit “gamma” encoded pixel values
2. The brightness range on the HL2 is about 538/16 = 33 (or about 5 bits)
   - Comes at the expense of image control, as seen dramatically at the lower brightness levels
3. Color correction of oversaturated primary colors
4. Lasers have to compensate for the scanning speed of the beam
   - At the left and right sides, the beam is moving at low speed and thus the lasers must be dimmed
5. Laser thresholds result in both missing pixels and pixels being too bright
   - Changes in content will cause temperature changes affecting thresholds
6. Sharp nonlinear laser response varying with temperature
   - Influenced by a multitude of variables including image content
   - Small changes in thresholds cause a significant difference in brightness
7. The short duration of pixel times leaves little room for pulse width modulation
Some of the issues above could be mitigated somewhat by adding some form of secondary modulation to the laser in the form of an electronically controllable neutral density filter. This would certainly help with the brightness control, but it is easier said than done with lasers and would likely hurt the efficiency.
The HL2 only covers a dynamic range of about 33X, from approximately 16 nits to 538 nits. But 16 nits is too bright for nighttime use and 538 nits is an order of magnitude too dim for bright daylight use. While the HL2 struggles with a 33:1 dynamic range of brightness, DLP has even been able to achieve a 5,000:1 range, from 15,000 nits down to 2.8 nits, in an automotive HUD application using LEDs.
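The arithmetic behind those two ranges is simple enough to show in a few lines of Python. The HL2 numbers are my rough camera-based estimates from earlier in this article; the helper name is just for illustration.

```python
import math

def dynamic_range(max_nits: float, min_nits: float) -> tuple[float, float]:
    """Return (ratio, bits), where bits = log2(ratio), i.e. photographic stops."""
    ratio = max_nits / min_nits
    return ratio, math.log2(ratio)

# Rough camera-based estimates for the HL2's 11 brightness levels.
hl2_levels = [16, 38, 59, 91, 113, 140, 221, 291, 360, 414, 538]

hl2_ratio, hl2_bits = dynamic_range(hl2_levels[-1], hl2_levels[0])
dlp_ratio, dlp_bits = dynamic_range(15000, 2.8)   # automotive HUD figures quoted above

print(f"HL2: {hl2_ratio:.1f}:1, about {hl2_bits:.1f} bits")
print(f"DLP HUD: {dlp_ratio:.0f}:1, about {dlp_bits:.1f} bits")
```

The HL2's brightness adjustment spans roughly 5 bits, while the DLP HUD example spans more than 12.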
LCOS and DLP perform light modulation by the display device and then control the LED illuminator for the overall brightness, effectively separating issues 1 and 2 on the list above. And they simply don’t have issues 4 through 7 and issue 3 is less of an issue with LEDs. The liquid crystal response curves have a natural “gamma” to them that makes it easier for LCOS to modulate at low light levels. In the case of the DLP with LEDs, it also uses the LED control as part of the intensity modulation.
MicroLEDs will end up putting the burden of modulation and brightness control on the device itself, so they will have to address issues 1 to 3 above. They have the advantage of not having the laser’s lasing threshold in their response curve. But I would suspect that they would need some form of “secondary modulation” such as an electronic neutral density filter to support a daytime to nighttime brightness range.
Simply put, the HL2’s intensity control is horrible by today’s standards (or frankly, 1960s color TV standards). The HL2 fails spectacularly such as with a solid green ramp turning red.
The intensity response of the HL2 is extremely poor. The gray ramp response at the low end is worse than any other display I can remember that was not broken. It steps by too much through the darker colors, which tends to make the images look washed out. The one good thing about the gray response is that it is monotonic (it never seems to make a backward step).
The fundamental problem is that the whole burden of controlling the intensity for pixel modulation, color correction, and overall brightness is put onto the laser, which has a highly non-linear transfer function. In the case of the HL2, it fails miserably.
It is a whole different subject as to whether the HL2 can be used for some indoor enterprise applications where there may be no need for decent color control or a wide range of brightness. Even if the LBS display engine could generate decent color control, it would be hard to notice due to the issues with the waveguides themselves. But one would have to be delusional to think someone would want to look at photographs or watch a movie.
Appendix: Short Background On Display Output and Gamma
Human vision is non-linear and sees intensity (and color) in terms of relative difference. It takes a much larger linear difference in intensity between two bright colors than between two dark colors for the difference to look the same. While the basic concept of “gamma” encoding goes back to the days of analog TV, today gamma works as a sort of intensity compression to better utilize a fixed number of digital bits.
The intensity levels near zero need to change very slowly, or there will be perceptible steps in a smoothly shaded image. If a display’s intensity increases too rapidly through the lower values, it causes the image to look too bright and washed out, with little perceived change at the bright end.
Cambridge in Colour has an excellent short article titled “UNDERSTANDING GAMMA CORRECTION” that explains how and why just about all display devices use gamma curves. The image on the right shows their example of what would happen with linear versus gamma encoding.
Simple Gamma is given by the formula $\text{Out} = \text{In}^{\gamma}$. The most common Gamma used is 2.2, but different formats use different variations. The simple gamma curve is too flat near zero compared to human vision, and it wastes levels to get a perceptible difference. Therefore various encoding formats, including sRGB and HDTV, modify the curve using a linear section at the beginning of the curve. For example, the formula for sRGB is given by:
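$$
\text{Out} =
\begin{cases}
\dfrac{\text{In}}{12.92}, & 0 \le \text{In} \le 0.04045 \\[1ex]
\left(\dfrac{\text{In} + 0.055}{1.055}\right)^{2.4}, & 0.04045 < \text{In} \le 1
\end{cases}
$$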
At the low end, sRGB starts with a slope of 1/12.92 relative to evenly stepped (linear) values. The graphs below show how the sRGB format differs from a simple 2.2 Gamma, which would start with even lower values. Over the whole range, it matches 2.2 Gamma well. But as the right-hand graph shows, at the low end, sRGB’s linear starting segment (with a 1/12.92 slope) makes a significant difference.
Note that even at value 32, where the linear/unity encoding would be 32/255 = ~12.5%, the sRGB output is less than 1.5% of full white.
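For readers who want to see the numbers rather than the graphs, the short Python sketch below compares the three decodings at a few code values. It uses the standard sRGB decoding and a plain 2.2 power curve; the function names are mine.

```python
def srgb_to_linear(code: int) -> float:
    """Decode an 8-bit sRGB code value (0-255) to linear intensity (0.0-1.0)."""
    v = code / 255.0
    if v <= 0.04045:
        return v / 12.92          # linear starting segment
    return ((v + 0.055) / 1.055) ** 2.4

def gamma22_to_linear(code: int) -> float:
    """Decode an 8-bit code value with a simple 2.2 gamma curve."""
    return (code / 255.0) ** 2.2

def unity_to_linear(code: int) -> float:
    """Linear/unity encoding: output is simply proportional to the code value."""
    return code / 255.0

print("code   unity    gamma2.2   sRGB")
for code in (1, 8, 32, 80, 255):
    print(f"{code:4d}  {unity_to_linear(code):7.3%}  "
          f"{gamma22_to_linear(code):8.4%}  {srgb_to_linear(code):8.4%}")
```

At the very first step, the sRGB output is many times larger than the simple 2.2 gamma output, while by the middle of the range the two curves are close to each other.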
Subtle changes in dark areas are important to the perceived quality of photographs and movies. A 2015 study of movie content showed that the average display luminance (ADL) of a large selection of movies was only about 8% of full white (about 80 out of 255 levels on an sRGB or Gamma 2.2 curve). Not that I think anyone would want to watch a movie on an HL2 🤣.