That magical night of perfect seeing and perfect placement of that long-sought astronomical object has arrived and you do not have to work tomorrow. You head off to the telescope for some imaging. The night could not have gone better and you now have several gigabytes of AVI files captured with your webcam ready for processing. RegiStax munches the AVI files to produce what initially appear to be fabulous images. Wavelet sharpening, RL deconvolution, and some high-pass filtering go well. Just one more step before you amaze the world with your perfect images: curves/levels adjustments in Photoshop. “Yikes! Where did all of these rings and bars come from when I try to bring out some detail from the darker parts of my images?”

Greetings from the land of digitization errors!

“But, I could see all the detail through the eyepiece before I fired up the camera and I was watching the screen before the capture. The image was stunning. There was no twinkling from poor seeing and no real noise was visible at all. It was perfect data!”
Yes, too perfect.

Image stacking, like the sort RegiStax does, is used to improve the signal-to-noise ratio. It can also sort through the images to find and stack only the best ones, overcoming distortions due to poor seeing (atmospheric turbulence), but that is a subject for another essay. Stacking multiple images of the same object is a great way to reduce noise in astrophotography, and it can also increase the dynamic range of the final image. Before we go on, we need to (mis)define a few terms.

Image Stacking is a technique of improving an image through the combination of multiple images, often hundreds. There are many ways to combine the images and different reasons for doing things a particular way. For now, just think of image stacking as averaging many images of the same thing.

Bit Depth is the number of bits your camera produces in each color channel (typically 8 to 16, with 8 bits being common for webcams). The term also describes how many bits are used when stacking images, that is, the bit depth of the accumulation buffer (typically 16 or 32).

Signal to Noise Ratio (SNR) is exactly what it sounds like: the ratio of the signal level to the noise level. It is a measure of the quality of the image, or more specifically the quality of the parts of the image that contain that signal. In photon-limited situations, bright areas frequently have a higher SNR than the dark areas.

Dynamic Range can be defined a couple of different ways. One way is just 2 raised to the power of the bit depth. So, an image with a bit depth of 8 has a dynamic range of 2**8 or 256. This can be somewhat misleading though. More typically, the dynamic range is defined as the ratio of the brightest signal to the noise level or minimum usable signal level. In images with little or no noise these two definitions result in the same value.

Minimum Usable Signal is often defined as the signal level where the SNR is equal to some (arbitrary) value (often the value of 1 is used).

Digitization Noise (or Error) is the inability of a digital system to represent changes in signal that are smaller than one count (one least significant bit). There is no such thing as 1.5 counts. Digital signals from imaging systems are almost always integers. Any signal in between the integer steps gets truncated to the next lower integer. The resultant image has obvious discrete stair-steps in brightness. Sometimes this is called Posterization.

Digitization Level is the integer step in image brightness equivalent to a change of one count in the digitized image.
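Two of the definitions above, digitization as truncation and stacking as averaging, can be sketched in a few lines of Python. This is only an illustration: the function names `digitize` and `stack_mean` are made up for this example, and a frame is represented as a plain list of pixel values.

```python
import math

def digitize(signal):
    """Truncate a continuous signal level to an integer count."""
    return math.floor(signal)

def stack_mean(frames):
    """Average equally sized frames (lists of pixel values) pixel by pixel."""
    return [sum(pixels) / len(frames) for pixels in zip(*frames)]

# A smooth ramp from 2.0 to 4.0 counts collapses into three flat steps.
ramp = [2.0 + 0.25 * i for i in range(9)]
steps = [digitize(level) for level in ramp]
print(steps)  # [2, 2, 2, 2, 3, 3, 3, 3, 4]

# Averaging four frames of the same three pixels gives fractional counts.
frames = [[2, 3, 3], [3, 3, 4], [2, 4, 4], [3, 4, 3]]
print(stack_mean(frames))  # [2.5, 3.5, 3.5]
```

The second result is the point of the whole exercise: each frame holds only integers, but the average can land between them.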

Digitization errors are frequently overlooked when considering how to set up an imaging system. They should not be, though, since there are ways to overcome them. I’m not talking about trading that 8-bit webcam in for a multi-thousand-dollar 16-bit imaging system either. One way I’m not going to talk about is extending the dynamic range to larger/brighter signal levels with the High Dynamic Range (HDR) technique. HDR combines multiple images taken with different exposures, using the short exposures to add detail in areas where the longer exposures are saturated. Instead I’d like to focus on pulling detail out of the dark areas of our images with a set of properly taken single-exposure images.

I’ve yakked a lot already and have not even mentioned the image above. The image is a set of panels that I’d like to use to demonstrate how stacking works and, more importantly, what properties your images need to have in order to maximize the benefits of image stacking. So, let me describe the panels.

The upper left image is what I’ll call “the truth”, or what the object we’re trying to image really looks like. Ideally it would be a noiseless image with infinite bit depth. In this example it is just an 8-bit deep region of the Moon near the terminator. Because “the truth” has finite bit depth and limited dynamic range, I need to create some data from a less than perfect camera. To do that I’ve decided my camera is going to be a 4-bit monochrome camera. Images from it will have 16 (2**4) levels from 0 to 15. You can think of this as just the portion of an image from an 8-bit camera that falls entirely within the lowest 4 bits.

Moving to the right along the top row (row 1), I have added different amounts of noise to “the truth” and digitized it to a bit depth of 4. The noise was normally distributed (Gaussian noise) with standard deviations, for columns 2 through 5 respectively, of 0.0 counts (noiseless other than digitization errors), 0.2 counts (noise rarely exceeding 1 count), 0.5 counts (something I’ll come to call the optimal noise level), and 8 counts (an excessive amount of noise used only to demonstrate the power of stacking [on the dark side ;-) ]).

The image in row 1 column 2 might be thought by many to be the perfect image because it has no additional noise. Comparing it with “the truth”, though, shows the effects of digitization errors. The vertical banding, with little to none of the subtle variation seen in “the truth”, is a result of those real variations falling in between the digitization levels. The image in row 1 column 3 has a little bit of noise and the banding from digitization errors is still visible. Column 4 has enough noise that the image looks like someone printed out “the truth” on a printer that dithers patches with black and white dots (sort of like the way photos were printed in newspapers). Squint or get far enough away and it looks very much like “the truth”. The rightmost column, column 5, has so much noise that it is almost impossible to tell it contains any information from “the truth”.

The images in the rows below the top row are the result of stacking several images with the same level of noise as the image at the top of their column. The level of noise is the same, but a different, random pattern of noise is applied to each image before digitization (to 4 bits) and stacking. Row 2 is the result of stacking 4 images, row 3 is a stack of 16, row 4 is a stack of 256, and the bottom row, row 5, is a stack of 4096 images. Because there is no added noise, all of the images in column 2 are identical. Stacking does nothing to alter, let alone improve, the images when there is no noise. Stacking requires noise in order to work! In the progression of images in column 3, there is some improvement in the noise level, but even stacking 4096 images cannot remove the vertical banding from digitization errors. The edges of the bands have been softened somewhat and begin to reveal some real detail in their vicinity. The images in column 4 also improve in SNR as more images are stacked, and they show no signs of digitization errors. The vertical banding is gone and the bottommost image (row 5 column 4) is almost identical to “the (8-bit) truth”. If you are not convinced stacking does anything useful, column 5 has something to prove. After stacking 4096 images with this excessive level of noise, almost all of the detail from “the truth” is revealed; however, the contrast (and with it the dynamic range) is greatly reduced.

“What is going on here?” Well if you have a set of essentially identical images because they have virtually no noise, stacking them is like averaging one image with a copy of itself. It does not matter how many times you average something with itself. You can only get the original image. Jumping to the 4th column with a noise level of 0.5 counts, this amount of noise is the minimum amount of noise required to eliminate digitization errors via stacking. More noise will also work, but you’ll have to stack more images to get the same SNR as you get with a stack of fewer images with a noise level of 0.5 counts. In this sense, a noise level of 0.5 counts is optimal.
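The behavior of columns 2 through 4 can be reproduced with a small simulation. This is a rough sketch and not the code used to make the panels. It assumes a one-dimensional strip of pixels instead of a Moon image, Gaussian noise, and truncation to integer counts as described earlier. The names `digitize`, `stack`, and `worst_error` are invented for this example. Because truncation sits half a count below the true level on average, the sketch adds that half count back before comparing against the truth.

```python
import math
import random

def digitize(value, bit_depth=4):
    """Truncate to an integer count and clip to the camera's range."""
    top = 2 ** bit_depth - 1
    return min(max(math.floor(value), 0), top)

def stack(truth, sigma, n_frames, rng):
    """Average n_frames noisy, digitized copies of a 1-D strip of pixels."""
    totals = [0] * len(truth)
    for _ in range(n_frames):
        for i, level in enumerate(truth):
            totals[i] += digitize(level + rng.gauss(0.0, sigma))
    return [t / n_frames for t in totals]

def worst_error(truth, stacked):
    """Largest difference from the truth, in counts."""
    # Add back the half count that truncation removes on average.
    return max(abs(s + 0.5 - t) for s, t in zip(stacked, truth))

truth = [4.0 + i / 8 for i in range(17)]   # smooth ramp, 4.0 to 6.0 counts
rng = random.Random(1)                     # fixed seed for repeatable output
errors = {sigma: worst_error(truth, stack(truth, sigma, 4096, rng))
          for sigma in (0.0, 0.2, 0.5)}
for sigma, err in errors.items():
    print(f"noise {sigma} counts: worst error {err:.3f} counts")
```

With no noise the worst error stays at half a count no matter how many frames are stacked. With 0.2 counts of noise it shrinks but does not go away, and with 0.5 counts it drops to a small fraction of a count.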

“OK, so why didn’t it work in column 5 with a noise level of 8 counts?” I cheated, but for the good reason of making an important point! “The truth” represents not only reality, but also the proper camera settings for taking images. The image I selected for “the truth” was adjusted so that my 4-bit camera would produce data that was properly exposed. The brightest pixels would be exposed to 95% of full scale. The darkest area was also offset from zero by 2 counts in our 4-bit camera. Looking at your images, and preferably the histogram of pixel values, is essential to avoid other problems and artifacts in the stacking process. 95% of full scale with a minimum offset of at least 2 counts is nearly ideal for a noise level of 0.5 counts. Column 5 is an example of what happens when these values are set improperly for the level of noise present. Before digitization, “the truth” plus the noise in each image resulted in a range of pixel values extending well beyond the 0-15 range our 4-bit camera outputs. Our camera cannot reproduce anything outside the 0-15 range. As a result, anything less than 0 becomes 0 and anything greater than 15 becomes 15. When these clipped images are averaged, the statistics are skewed. The black pixels with positive noise have no blacker pixels (black plus negative noise) to average against. You are only averaging positive noise, and so the black pixels from “the truth” appear gray in the stack. Similarly, the brightest pixels can’t get any brighter, so the stacking process averages only their negative noise and the result is less than white. So, the final image has lower contrast and a suboptimal dynamic range even though it has a high SNR.
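The loss of contrast from clipping can be seen with just two pixels, one dark and one bright. The sketch below is an illustration under stated assumptions: Gaussian noise, truncation to integer counts, a 4-bit range of 0 to 15, and the exposure described above (a 2-count offset and 95% of full scale, which is 14.25 counts). The function names are invented for this example, and the half count removed by truncation is added back so the numbers can be read directly against the true levels.

```python
import math
import random

def digitize(value, bit_depth=4):
    """Truncate to an integer count, then clip to the camera's 0..15 range."""
    top = 2 ** bit_depth - 1
    return min(max(math.floor(value), 0), top)

def stacked_level(level, sigma, n_frames, rng):
    """Average n_frames noisy readings of one pixel, in counts."""
    total = sum(digitize(level + rng.gauss(0.0, sigma)) for _ in range(n_frames))
    # Add back the half count that truncation removes on average.
    return total / n_frames + 0.5

rng = random.Random(2)                 # fixed seed for repeatable output
dark, bright = 2.0, 14.25              # 2-count offset and 95% of full scale
results = {}
for sigma in (0.5, 8.0):
    results[sigma] = (stacked_level(dark, sigma, 4096, rng),
                      stacked_level(bright, sigma, 4096, rng))
    lo, hi = results[sigma]
    print(f"noise {sigma}: dark {lo:.2f}, bright {hi:.2f}, span {hi - lo:.2f}")
```

At 0.5 counts of noise the stacked levels come back close to 2 and 14.25. At 8 counts the dark pixel is pulled up and the bright pixel is pulled down, so the span between them shrinks even though each value is averaged from 4096 readings.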

In column 3 with a noise level of 0.2 counts, the data does not have enough noise to eliminate the digitization errors. There is enough noise to help near the transition from one count level to the next, but the amount of noise does not allow this statistical process to reach all the way to the middle of the gap between digitization levels. You can get to 1.2, for example, but not to 1.5. The limited noise does average down quickly though.
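One way to put numbers on this is to compute what an infinitely large stack would converge to, rather than simulating one. The sketch below assumes Gaussian noise and truncation to integer counts, and uses the fact that the average of an integer output from 0 to 15 equals the sum of the probabilities that the output is at least 1, at least 2, and so on. The helper names are made up for this example. It asks how far the stacked value moves when the scene brightens from 1.4 to 1.6 counts, which is a 0.2-count change in the middle of a digitization step.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_count(level, sigma, bit_depth=4):
    """Mean digitized value of level + Gaussian noise (sigma must be > 0)."""
    top = 2 ** bit_depth - 1
    # The output is at least k whenever level + noise >= k, for k = 1..top.
    return sum(1.0 - normal_cdf((k - level) / sigma) for k in range(1, top + 1))

rises = {sigma: expected_count(1.6, sigma) - expected_count(1.4, sigma)
         for sigma in (0.2, 0.5)}
for sigma, rise in rises.items():
    print(f"noise {sigma}: scene rises 0.2 counts, stack rises {rise:.3f} counts")
```

With 0.2 counts of noise the stack responds with only a small fraction of the real change, so detail in the middle of a step stays flattened. With 0.5 counts of noise nearly all of the 0.2-count change comes through.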

In theory, the image in row 4 column 4 should be equivalent to “the truth”. Stacking 256 images should give you a factor of 16 (square root of 256) improvement in dynamic range, or 4 more bits. Because the information content of “the truth” in my example data is limited to 8 bits, there should be no further improvement by stacking additional images. And in fact, the bottom two images in column 4 are identical. Since stacking has added new levels of digitization between those present in the 4-bit version and the result has decreased the noise to a comparable level, the dynamic range, by either definition, has increased as well. The increase in dynamic range is dependent upon proper exposure of the images relative to the amount of noise present, however, and is not guaranteed.
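The arithmetic behind "256 images gives 4 more bits" is short enough to write down. This sketch assumes the ideal case described above: random noise that differs from frame to frame and exposure that keeps the data inside the camera's range. The function names are invented for this example.

```python
import math

def bits_gained(n_frames):
    """Extra bits from averaging: noise drops by sqrt(N), so log2(sqrt(N))."""
    return 0.5 * math.log2(n_frames)

def frames_needed(extra_bits):
    """Stack size for a given gain: each extra bit costs 4x the frames."""
    return 4 ** extra_bits

for n in (4, 16, 256, 4096):
    print(f"{n:5d} frames: noise / {math.sqrt(n):.0f}, "
          f"about {bits_gained(n):.0f} extra bits")
```

The stack sizes used for rows 2 through 5 work out to 1, 2, 4, and 6 extra bits. Since the 4-bit camera is looking at an 8-bit truth, only the first 4 extra bits have anything to recover.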

“So, you are telling me that I need to set up my camera so that I always have some visible noise?” No, I’m not. There are objects/scenes that do not have a high dynamic range requirement. In particular, you are not always interested in accessing detail in the deep shadows or darkest portions of your image. In those cases, you probably want to minimize your noise and concentrate on the ability of the stacking process to reduce the effects of seeing (twinkling).

Bottom Line: You need some noise to get any improvement in SNR from the stacking process. There is an optimal amount of noise, a standard deviation of half a count, that will minimize digitization errors. Whatever level of noise you select, you need to set the camera properties (gain and offset if available) so that the range of data including noise falls within the digitization range of the camera. Don’t run the histograms to the left and right limits! With proper exposure settings, the minimum stack size should be 256 images, no less, to get a 4-bit improvement in “quality of the stacked image”, and a lot more under typical (less than optimal) circumstances.
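A quick check for histograms that run into the limits can be sketched as follows. It is only an illustration: the function name `clipped_fractions` and the sample pixel values are made up, and a real frame would come from your capture software rather than a hand-typed list.

```python
def clipped_fractions(pixels, bit_depth=8):
    """Return the fractions of pixels sitting at the bottom and top counts."""
    top = 2 ** bit_depth - 1
    at_bottom = sum(1 for p in pixels if p <= 0)
    at_top = sum(1 for p in pixels if p >= top)
    return at_bottom / len(pixels), at_top / len(pixels)

frame = [0, 0, 3, 17, 120, 200, 254, 255]   # made-up 8-bit pixel values
low, high = clipped_fractions(frame)
print(f"{low:.1%} at zero, {high:.1%} at full scale")
# prints "25.0% at zero, 12.5% at full scale"
```

If either fraction is more than a trace, the offset or gain probably needs adjusting before capturing frames for stacking.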