This project aims to apply image processing techniques like convolving with finite difference operators, Gaussian filters, Gaussian and Laplacian stacks, and blending (with masks).
In this part, we aim to construct an edge image with a Gaussian kernel. To find the partial
derivative in x and y (which show the intensity change in the horizontal and vertical
directions, respectively), we first convolve the image with the finite difference operators
D_x = [1, -1]
and D_y = [1, -1]^T
. That is, we convolve the image with D_x to get img_gradient_x and with D_y to get img_gradient_y. Then, we compute the gradient magnitude at each pixel as $\sqrt{\text{img_gradient_x}^2 + \text{img_gradient_y}^2}$. To get the final edge image, we apply a threshold to binarize the magnitudes.
Visually inspecting the results, I decided that a threshold of 0.2 looked best. It highlighted the high frequencies of the image without including too much noise, which would clutter the binarized edge image. Going slightly lower or higher than 0.2 had tradeoffs: even more noise in the form of thicker stray marks, and broken lines, respectively.
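The steps above can be sketched as follows. This is a minimal NumPy/SciPy version, not my exact project code; I build the operators directly here.

```python
import numpy as np
from scipy.signal import convolve2d

def edge_image(img, threshold=0.2):
    """Binarized edge image from finite-difference gradients.

    img: 2-D grayscale array with values in [0, 1].
    threshold=0.2 is the value chosen by inspection above.
    """
    D_x = np.array([[1.0, -1.0]])    # horizontal finite difference
    D_y = np.array([[1.0], [-1.0]])  # vertical finite difference
    gx = convolve2d(img, D_x, mode="same", boundary="symm")
    gy = convolve2d(img, D_y, mode="same", boundary="symm")
    magnitude = np.sqrt(gx**2 + gy**2)
    return (magnitude > threshold).astype(np.float32)
```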
Now, we can create and use a derivative of Gaussian (DoG) filter to convolve the image. First, we blur the image with a Gaussian kernel, using cv2.getGaussianKernel. Slide 33 of our lecture on Image Processing II: Convolution and Derivatives shared a rule of thumb of choosing a half-width of 3$\sigma$, i.e., a kernel size of about $6\sigma$, so I chose kernel_size = 10 and sigma = 10/6. We can then apply the same finite difference operators from Part 1.1. The sigma was chosen after some manual checks. For both images, I chose threshold = 0.07.
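A sketch of the blur-then-differentiate step. For self-containment I build the separable Gaussian with NumPy instead of cv2.getGaussianKernel; with an explicit sigma the two should produce essentially the same kernel.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian2d(ksize, sigma):
    """Separable Gaussian: normalized 1-D kernel, outer product -> 2-D."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def blurred_gradients(img, ksize=10, sigma=10 / 6):
    """Blur with a Gaussian, then take finite-difference gradients."""
    blurred = convolve2d(img, gaussian2d(ksize, sigma), mode="same", boundary="symm")
    gx = convolve2d(blurred, np.array([[1.0, -1.0]]), mode="same", boundary="symm")
    gy = convolve2d(blurred, np.array([[1.0], [-1.0]]), mode="same", boundary="symm")
    return gx, gy
```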
These images are much cleaner than the ones in Part 1.1: the noise near the bottom of the resulting image is nearly all gone. The edges of the cameraman and of the camera and its stand are also now thicker.
Convolving the Gaussian with D_x and D_y first, and then convolving the image once with each resulting DoG filter, gives the same result, since convolution is associative.
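This equivalence, (img * G) * D_x = img * (G * D_x), can be checked numerically; a quick sketch with a random image and the same kernel parameters:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
img = rng.random((32, 32))

# Same Gaussian parameters as above: kernel_size = 10, sigma = 10/6.
ax = np.arange(10) - 4.5
g = np.exp(-(ax**2) / (2 * (10 / 6) ** 2))
g /= g.sum()
G = np.outer(g, g)
D_x = np.array([[1.0, -1.0]])

# Order 1: blur the image, then differentiate.
order1 = convolve2d(convolve2d(img, G), D_x)
# Order 2: build the DoG filter first, then convolve the image once.
dog = convolve2d(G, D_x)
order2 = convolve2d(img, dog)

same = np.allclose(order1, order2)  # convolution is associative
```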
To sharpen an image, we first take the original image and blur each individual RGB channel
with a low-pass Gaussian filter, again effectively removing high-frequency details like fine
edges and textures. Then, we take the difference between the raw image and this blurred one to
get the high frequencies. Then, we sharpen the image through unsharp masking by adding back the high-frequency parts, which are first multiplied by an $\alpha$ factor. For all examples
below, I blurred with the same kernel_size = 10
and
sigma = 10/6
from the earlier parts.
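Unsharp masking as described can be sketched as follows, one channel at a time, with the kernel parameters above; the final clip to [0, 1] is my assumption for keeping values displayable.

```python
import numpy as np
from scipy.signal import convolve2d

def unsharp_mask(channel, alpha, ksize=10, sigma=10 / 6):
    """Sharpen one channel: channel + alpha * (channel - blur(channel)).

    Run once per RGB channel; values are clipped back to [0, 1].
    """
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    blurred = convolve2d(channel, np.outer(g, g), mode="same", boundary="symm")
    high_freq = channel - blurred  # the details removed by the low-pass
    return np.clip(channel + alpha * high_freq, 0.0, 1.0)
```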
We can see that the image sharpening technique performed decently well on all 3 images; the lowest of the 3 $\alpha$s for each image looks best, with some sharpening but not overdone.
In this part, we aim to combine the low frequencies of one image with the high frequencies of another. In the first example, we align Nutmeg onto Professor Derek Hoiem. As expected, the aligned picture of Prof. Hoiem is the same as the original image since we're just aligning and adjusting Nutmeg to Prof. Hoiem. We obtain the grayscale aligned images by calling rgb2gray from skimage.color on the corresponding aligned image.
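A minimal sketch of the hybrid construction: low-pass one image, high-pass the other, and add. The cutoff sigmas here are hypothetical placeholders; in practice they were tuned by inspection per image pair.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian2d(ksize, sigma):
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def hybrid(im_low, im_high, sigma_low=4.0, sigma_high=2.0):
    """Low frequencies of im_low plus high frequencies of im_high.

    sigma_low / sigma_high are hypothetical cutoffs; the kernel size
    follows the ~6*sigma rule of thumb from earlier.
    """
    G_low = gaussian2d(int(6 * sigma_low), sigma_low)
    G_high = gaussian2d(int(6 * sigma_high), sigma_high)
    low = convolve2d(im_low, G_low, mode="same", boundary="symm")
    high = im_high - convolve2d(im_high, G_high, mode="same", boundary="symm")
    return np.clip(low + high, 0.0, 1.0)
```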
The example above is a failure case: the grayscale hybrid image is not very clear. We can see abrupt, clearly unnatural changes at the bottom of the lion's mane, and its face also looks a bit blurry.
Two variants generated decent results: using both color aligned images of Nutmeg and Prof. Hoiem, and using the color aligned image of Nutmeg with the grayscale aligned image of Prof. Hoiem. The hybrid image has nice contrast and retains the color of Prof. Hoiem's face while including the high frequencies (edges) from Nutmeg. The hybrid image of the snowboarder and skier is also clear, with some traces of the snowboarder's jump visible.
To set up for the multiresolution blending in Part 2.4, we will first obtain the Gaussian and
Laplacian stack of the apple image and orange image individually. For both stacks, we go to
num_levels = 5
.
To get the Gaussian stack, at each level, we blur the image from the previous level (at the
first level, we start with the original image) with the Gaussian filter, like before. For both
images, I used sigma = 4
and kernel_size = 6 * 4 = 24
.
To get the Laplacian stack, we take the delta between the previous and current levels of the Gaussian stack and normalize this delta. As the final level, we append the last level of the Gaussian stack directly (no delta here).
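The two stacks can be sketched as follows. I omit the per-delta normalization here (in my implementation it is mainly for visualization); note that the unnormalized deltas telescope back to the original image.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian2d(ksize, sigma):
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def gaussian_stack(img, num_levels=5, sigma=4, ksize=24):
    """Level 0 is the original image; each level re-blurs the previous one."""
    G = gaussian2d(ksize, sigma)
    stack = [img]
    for _ in range(num_levels - 1):
        stack.append(convolve2d(stack[-1], G, mode="same", boundary="symm"))
    return stack

def laplacian_stack(g_stack):
    """Deltas between consecutive Gaussian levels, plus the last Gaussian level."""
    deltas = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
    return deltas + [g_stack[-1]]
```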
Now, we can use the Gaussian and Laplacian stacks to blend our apple and orange images into an oraple. To blend, we create a blurred mask of the same shape as both input images to create a smoother final image. For the oraple, the mask is split vertically, with the left half completely white and the right half completely black, as shown below. Similar to what we do for the two input images, we also create a Gaussian stack for the blurred mask.
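Putting the pieces together, each level blends as mask * A + (1 - mask) * B, and the levels are summed; a self-contained sketch (repeating the stack helpers for completeness):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian2d(ksize, sigma):
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-(ax**2) / (2 * sigma**2))
    g /= g.sum()
    return np.outer(g, g)

def gaussian_stack(img, num_levels, G):
    stack = [img]
    for _ in range(num_levels - 1):
        stack.append(convolve2d(stack[-1], G, mode="same", boundary="symm"))
    return stack

def laplacian_stack(gs):
    return [gs[i] - gs[i + 1] for i in range(len(gs) - 1)] + [gs[-1]]

def multires_blend(im1, im2, mask, num_levels=5, sigma=4):
    """Per level: mask * im1 + (1 - mask) * im2, summed over all levels.

    mask is a float array in [0, 1]; its Gaussian stack supplies the
    progressively blurrier transition between the two images.
    """
    G = gaussian2d(6 * sigma, sigma)
    lap1 = laplacian_stack(gaussian_stack(im1, num_levels, G))
    lap2 = laplacian_stack(gaussian_stack(im2, num_levels, G))
    mask_gs = gaussian_stack(mask, num_levels, G)
    levels = [m * a + (1 - m) * b for a, b, m in zip(lap1, lap2, mask_gs)]
    return np.clip(sum(levels), 0.0, 1.0)
```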
For irregular masks, we can create them using Adobe Photoshop - there is a convenient
selection tool that we can use to highlight the entire image and automatically get a selected
region to create a mask from. Then, we can export this mask. Note that we should export the mask image from Adobe Photoshop as a JPG so our image has 3 channels. Exporting as a PNG would give us 4 channels (an extra alpha channel), which is inconsistent with 3-channel input images like the orange and apple.
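Alternatively, the channel mismatch can be handled in code. Here is a small illustrative helper (the name and behavior are hypothetical, not from my project code), assuming the mask arrives as a uint8 array, e.g. from skimage.io.imread:

```python
import numpy as np

def prepare_mask(mask_img):
    """Normalize an exported mask to float in [0, 1] with 3 channels.

    Hypothetical helper: drops the alpha channel if the export produced
    RGBA (a PNG), and rescales uint8 values so the mask matches the
    3-channel float inputs like the apple and orange images.
    """
    mask = np.asarray(mask_img, dtype=np.float64)
    if mask.max() > 1.0:  # uint8 export -> rescale to [0, 1]
        mask = mask / 255.0
    if mask.ndim == 3 and mask.shape[2] == 4:
        mask = mask[:, :, :3]  # drop the PNG alpha channel
    return mask
```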
In the first example, we blend Steph Curry's face onto LeBron James's. I chose not to do this the flipped way because Curry's face is larger than LeBron's in the aligned images, so LeBron's face would be less clear in the blended image. The quality of all 5 images below is not high; we will use higher-quality examples below to better illustrate the results.
The blended image above has rather abrupt changes from brightly lit buildings to ones at night. Also, the dark patches in some of Barcelona's city view are awkward; the shadow makes it feel like clouds are hovering over Barcelona's buildings near the blurred edge.
The blended image now has fewer sudden transitions from one image to the other, except near the Bay Bridge.
I enjoyed working on all parts of this project, especially the image sharpening part, because it brought out details of the image that I wouldn't have noticed at first. It is also intriguing to see algorithms extract details that seem impossible to perceive when looking at the blurred image with the naked eye. It was also fun to choose my own images for multiresolution blending.