Project 1: Colorizing the Prokudin-Gorskii photo collection
Sergei Mikhailovich Prokudin-Gorskii (1863-1944) photographed various scenes of the Russian Empire in the early 1900s. He recorded three exposures of every scene onto a glass plate using a red, a green, and a blue filter. At the time, it wasn't possible to capture color images, let alone print them. His RGB glass plate negatives, capturing the last years of the Russian Empire, survived and were purchased in 1948 by the Library of Congress.
Overview
The goal of this assignment is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. In order to do this, you will need to extract the three color channel images, place them on top of each other, and align them so that they form a single RGB color image.
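As a rough illustration of the extract-and-stack step, here is a minimal sketch using NumPy. The function names `split_plate` and `stack_rgb` are hypothetical, and the sketch assumes the scan stores the three exposures stacked vertically in blue, green, red order from top to bottom, which should be checked against the actual plates.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into (blue, green, red) thirds.

    Assumes the exposures are ordered blue, green, red from top to bottom.
    """
    height = plate.shape[0] // 3  # leftover rows at the bottom are dropped
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red

def stack_rgb(red, green, blue):
    """Stack three equally sized channels into an (H, W, 3) RGB image."""
    return np.dstack([red, green, blue])
```

Stacking the thirds without any shift gives the "unaligned" image; the alignment step only changes where the red and blue channels sit relative to green before stacking.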
Alignment Algorithm
First, I tried to align the images using a naive approach. I separated the original image into thirds to get the red, green, and blue channels. I then cropped each channel by 10% on each side and aligned the red and blue channels to the green channel using normalized cross-correlation (NCC). To do this, I searched over a displacement window of size 15x15 centered on the unshifted position, scoring every candidate alignment with NCC (every combination of x and y between -15 and 15), and kept the best-scoring displacement for each channel. I then stacked the aligned channels on top of each other to create a color image. This approach worked well for small images, but for larger images it was too slow and the alignments were poor.
I then implemented a more comprehensive alignment algorithm using a multi-scale approach: first align the images at a lower resolution with a smaller displacement window, then progressively align them at higher resolutions, scaling up the best displacement from each coarser level to seed the search at the next finer level. This structure is known as an "image pyramid". I reduced the search window for the coarser images to 75% of the full size at the second pyramid level and 50% at the third pyramid level. Otherwise, everything was the same as in the single alignment approach. This approach produced much better results for the larger images and was much faster.
Both algorithms also have an option to use Euclidean distance as the scoring metric instead of normalized cross-correlation. In my experience, both metrics gave the same image offsets. I decided to use NCC for all of the results below.
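The single-scale search described above can be sketched as follows. This is not the project's actual code: the function names are hypothetical, `radius=15` assumes the "-15 to 15" reading of the search range, shifting is done with `np.roll` (which wraps pixels around the edges), and the 10% border crop is applied at scoring time so that wrapped pixels are ignored whenever the shift is smaller than the border.

```python
import numpy as np

def crop_border(img, frac=0.10):
    """Remove `frac` of the height and width from each side."""
    h, w = img.shape
    dy, dx = int(h * frac), int(w * frac)
    return img[dy:h - dy, dx:w - dx]

def ncc(a, b):
    """Normalized cross-correlation of two equally sized images."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def neg_euclidean(a, b):
    """Negated Euclidean distance, so that higher is still better."""
    return -float(np.linalg.norm(a - b))

def align_single(channel, reference, radius=15, score=ncc):
    """Return the (dy, dx) shift of `channel` that best matches `reference`."""
    ref = crop_border(reference)
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            s = score(crop_border(shifted), ref)
            if s > best_score:
                best_score, best_shift = s, (dy, dx)
    return best_shift
```

Calling `align_single(red, green)` and `align_single(blue, green)` gives the two offsets to apply before stacking. Passing `score=neg_euclidean` swaps in the Euclidean-distance metric. The cost grows with both the image area and the square of the search radius, which is consistent with the approach being too slow on the large scans.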
Single Align Photo Results
Here are the results of the single alignment algorithm using NCC (Normalized Cross Correlation). For each image, the left shows the unaligned version (original channels simply stacked) and the right shows the aligned result. The offset values are listed below each aligned image.






Pyramid Align Photo Results
Here are the results of the pyramid alignment algorithm using NCC (Normalized Cross Correlation). A 10% border crop was applied before alignment, and 4 image pyramid levels were used. The finest pyramid level used a search window of size 15x15, the second level used a search window of size 11x11, and the two coarsest levels used a search window of size 7x7. For each image, the left shows the unaligned version (original channels simply stacked) and the right shows the aligned result. All tif images were converted to jpg for web display, but the original aligned tif images are available in the website git repository. The offset values are listed below each aligned image.
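A coarse-to-fine search with this window schedule might look like the sketch below. It is an illustration under several assumptions rather than the code that produced these results: the function names are hypothetical, each level halves the resolution by 2x2 block averaging, a window of size n is read as offsets from -(n // 2) to n // 2, and shifts use `np.roll`, which wraps around the image edges.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized images."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_shift(channel, reference, center, radius, border=0.10):
    """Search shifts within `radius` of `center`, scoring inside the border."""
    h, w = reference.shape
    by, bx = int(h * border), int(w * border)
    ref = reference[by:h - by, bx:w - bx]
    best_score, best = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            s = ncc(shifted[by:h - by, bx:w - bx], ref)
            if s > best_score:
                best_score, best = s, (dy, dx)
    return best

def downscale(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges are dropped)."""
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, window_sizes=(15, 11, 7, 7)):
    """Coarse-to-fine alignment; `window_sizes` runs from finest to coarsest."""
    levels = [(channel, reference)]
    for _ in window_sizes[1:]:
        c, r = levels[-1]
        levels.append((downscale(c), downscale(r)))
    shift = (0, 0)
    # Start at the coarsest level; double the estimate at each finer level.
    for (c, r), size in zip(reversed(levels), reversed(window_sizes)):
        center = (2 * shift[0], 2 * shift[1])
        shift = best_shift(c, r, center, radius=size // 2)
    return shift
```

Because each finer level only searches around twice the coarser estimate, the total number of candidates scored on the full-resolution image stays small even when the final offset is large. If the window sizes were instead meant as radii, the `radius=size // 2` line would need to change accordingly.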




























Extra Photo Results
We were also asked to download four more images and align them using the same methods. Here are the results for the images that I chose, again shown as unaligned and aligned versions. These images were aligned with the same pyramid alignment algorithm as the other images. The offset values are listed below each aligned image.







