Project 2 - Fun with Filters and Frequencies!

By Ethan Chen

Introduction

This project aims to apply image processing techniques like convolving with finite difference operators, Gaussian filters, Gaussian and Laplacian stacks, and blending (with masks).

Part 1: Fun with Filters

Goals

Part 1.1: Finite Difference Operator

In this part, we aim to construct an edge image using finite difference operators. To find the partial derivatives in x and y (which show the intensity change in the horizontal and vertical directions, respectively), we convolve the image with the finite difference operators D_x = [1, -1] and D_y = [1, -1]^T. That is, we convolve the image with D_x to get img_gradient_x and with D_y to get img_gradient_y. Then, we compute the gradient magnitude at each pixel as $\sqrt{\text{img_gradient_x}^2 + \text{img_gradient_y}^2}$. To get the final edge image, we apply a threshold to binarize the magnitudes.

Visually inspecting the image, I decided that a threshold of 0.2 looked the best. It highlighted the high frequencies of the image without including too much noise, which would clutter the binarized edge image. Going slightly lower or higher than 0.2 traded off more noise in the form of thicker stray marks or broken lines, respectively.
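A minimal sketch of the pipeline above. The helper name `edge_image` and the use of `scipy.signal.convolve2d` are my assumptions here, not necessarily the original implementation:

```python
import numpy as np
from scipy.signal import convolve2d

def edge_image(img, threshold=0.2):
    """Binarized edge image via finite difference operators (sketch).

    img: 2D float array (grayscale, values roughly in [0, 1]).
    """
    D_x = np.array([[1, -1]])   # horizontal finite difference
    D_y = np.array([[1], [-1]]) # vertical finite difference
    gx = convolve2d(img, D_x, mode="same")
    gy = convolve2d(img, D_y, mode="same")
    # Per-pixel gradient magnitude
    magnitude = np.sqrt(gx**2 + gy**2)
    # Binarize with the chosen threshold
    return (magnitude > threshold).astype(np.float64)
```

The `threshold=0.2` default mirrors the value chosen above for the cameraman image.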

Partial Derivative in x

11_partial_x.jpg

Partial Derivative in y

11_partial_y.jpg

Binarized Edge Image

11_edge_image.jpg

Part 1.2: Derivative of Gaussian (DoG) Filter

Now, we can create and use a derivative of Gaussian (DoG) filter to convolve the image. First, we blur the image with a Gaussian kernel, obtained via cv2.getGaussianKernel. Slide 33 of our lecture on Image Processing II: Convolution and Derivatives shared a rule of thumb of choosing a half-width of 3$\sigma$ (i.e., a kernel size of about 6$\sigma$), so I chose kernel_size = 10 and sigma = 10/6. We can then apply the same finite difference operators from Part 1.1. The sigma was chosen after some manual checks. For both images, I chose threshold = 0.07.
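A sketch of building the DoG filters themselves. To keep the example dependency-light I build the 1D Gaussian in numpy rather than calling cv2.getGaussianKernel (the result is equivalent in spirit); the function names are hypothetical:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel_1d(ksize, sigma):
    # Analogous to cv2.getGaussianKernel(ksize, sigma): a normalized column vector
    x = np.arange(ksize) - (ksize - 1) / 2
    g = np.exp(-x**2 / (2 * sigma**2))
    return (g / g.sum()).reshape(-1, 1)

def dog_filters(ksize=10, sigma=10 / 6):
    g1d = gaussian_kernel_1d(ksize, sigma)
    G = g1d @ g1d.T                # 2D Gaussian via outer product
    D_x = np.array([[1, -1]])
    D_y = np.array([[1], [-1]])
    dog_x = convolve2d(G, D_x)     # derivative-of-Gaussian in x
    dog_y = convolve2d(G, D_y)     # derivative-of-Gaussian in y
    return dog_x, dog_y
```

A useful sanity check: each DoG filter sums to zero, since the difference operator cancels the Gaussian's unit mass.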

Images after initially blurring the cameraman image

Partial Derivative in x

12_partial_x_blurred.jpg

Partial Derivative in y

12_partial_y_blurred.jpg

Gradient Magnitude

12_gradient_magnitude_blurred.jpg

Edge Image

12_edge_image_blurred.jpg

DoG filters

DoG x

12_DoG_x.jpg

DoG y

12_DoG_y.jpg

Images with DoG filter

Partial Derivative in x

12_partial_x_DoG.jpg

Partial Derivative in y

12_partial_y_DoG.jpg

Gradient Magnitude

12_gradient_magnitude_DoG.jpg

Edge Image

12_edge_image_DoG.jpg

These images are much cleaner than the ones in Part 1.1: the noise near the bottom of the resulting image is nearly all gone. The edges of the cameraman and of the camera and its stand are also now thicker.

Convolving the Gaussian with D_x and D_y first, and then convolving the image with the resulting DoG filters, gives us the same result as blurring first and differentiating second, as expected from the associativity of convolution.
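A quick numerical check of that associativity, using a small random image and a hypothetical 5x5 Gaussian (sizes chosen only for the demonstration):

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
img = rng.random((8, 8))

# Small 2D Gaussian built in numpy for the demo
x = np.arange(5) - 2
g = np.exp(-x**2 / (2 * (5 / 6) ** 2))
G = np.outer(g, g) / g.sum() ** 2
D_x = np.array([[1, -1]])

# Blur first, then differentiate ...
a = convolve2d(convolve2d(img, G), D_x)
# ... versus convolving the image with the DoG filter directly
b = convolve2d(img, convolve2d(G, D_x))
# a and b agree up to floating-point error
```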

Part 2: Fun with Frequencies!

Part 2.1: Image "Sharpening"

To sharpen an image, we first blur each individual RGB channel of the original image with a low-pass Gaussian filter, effectively removing high-frequency details like fine edges and textures. Then, we take the difference between the raw image and this blurred one to get the high frequencies. Finally, we sharpen the image through unsharp masking: adding back the high-frequency parts, scaled by a factor $\alpha$. For all examples below, I blurred with the same kernel_size = 10 and sigma = 10/6 from the earlier parts.
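The unsharp-masking step can be sketched as follows. I use `scipy.ndimage.gaussian_filter` for the blur (an assumption; any Gaussian blur works), applying it only over the spatial axes so each RGB channel is blurred independently:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(img, alpha=0.75, sigma=10 / 6):
    """Sharpen a float HxWx3 image in [0, 1] via unsharp masking (sketch)."""
    # Low-pass: blur spatial axes only, leaving the channel axis untouched
    blurred = gaussian_filter(img, sigma=(sigma, sigma, 0))
    # High frequencies = original minus low-pass
    high_freq = img - blurred
    # Add back alpha-scaled detail, clipping to the valid range
    return np.clip(img + alpha * high_freq, 0, 1)
```

Larger $\alpha$ exaggerates the added detail, which matches the over-sharpened look of the $\alpha = 5$ examples below.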

Original Taj Mahal

taj.jpg

Blurred Taj Mahal

21_taj_alpha_075_blurred.jpg

Sharpened Taj Mahal ($\alpha = 0.75$)

21_taj_alpha_075_sharpened.jpg

Sharpened Taj Mahal ($\alpha = 2$)

21_taj_alpha_2_sharpened.jpg

Sharpened Taj Mahal ($\alpha = 5$)

21_taj_alpha_5_sharpened.jpg

Two other examples

Original Campanile

campanile.jpg

Blurred Campanile

21_campanile_alpha_125_blurred.jpg

Sharpened Campanile ($\alpha = 1.25$)

21_campanile_alpha_125_sharpened.jpg

Sharpened Campanile ($\alpha = 2$)

21_campanile_alpha_2_sharpened.jpg

Sharpened Campanile ($\alpha = 5$)

21_campanile_alpha_5_sharpened.jpg

Original Pragser Wildsee

pragser_wildsee.jpg

Blurred Pragser Wildsee

21_pragser_wildsee_alpha_075_blurred.jpg

Sharpened Pragser Wildsee ($\alpha = 0.75$)

21_pragser_wildsee_alpha_075_sharpened.jpg

Sharpened Pragser Wildsee ($\alpha = 2$)

21_pragser_wildsee_alpha_2_sharpened.jpg

Sharpened Pragser Wildsee ($\alpha = 5$)

21_pragser_wildsee_alpha_5_sharpened.jpg

We can see that the image sharpening technique performed decently well on all 3 images; for each image, the lowest of the three $\alpha$ values gives the best result, with some sharpening but not overdone.

Part 2.2: Hybrid Images

In this part, we aim to combine the low frequencies of one image with the high frequencies of another to form a hybrid image. In the first example, we align Nutmeg onto Professor Derek Hoiem. As expected, the aligned picture of Prof. Hoiem is the same as the original image since we're just aligning and adjusting Nutmeg to Prof. Hoiem. We obtain the grayscale aligned images by calling rgb2gray from skimage.color on the corresponding aligned image.
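The combination step can be sketched as below: low-pass one image, high-pass the other, and sum. The function name and the two sigma values are placeholders, not the cutoffs actually used for the examples:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(low_img, high_img, sigma_low=8, sigma_high=4):
    """Hybrid of two aligned grayscale images (sketch; sigmas are assumptions)."""
    # Keep only the low frequencies of the first image
    low = gaussian_filter(low_img, sigma_low)
    # Keep only the high frequencies of the second image
    high = high_img - gaussian_filter(high_img, sigma_high)
    return np.clip(low + high, 0, 1)
```

Up close the high-frequency image dominates; from a distance the low-frequency one takes over.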

Nutmeg

nutmeg.jpg

Derek Hoiem

DerekPicture.jpg

Aligned Nutmeg

22_nutmeg_aligned.jpg

Aligned Derek Hoiem

22_DerekPicture_aligned.jpg

Aligned Grayscale Nutmeg

22_nutmeg_aligned.jpg

Aligned Grayscale Derek Hoiem

22_DerekPicture_aligned.jpg

Nutmeg and Derek Both Grayscale Hybrid Image

22_nutmeg_DerekPicture_grayscale_hybrid_image.jpg

Failed example

Lion

lion.jpg

Tiger

tiger.jpg

Aligned Lion

22_lion_aligned.jpg

Aligned Tiger

22_tiger_aligned.jpg

Lion and Tiger Both Grayscale Hybrid Image

22_lion_tiger_grayscale_hybrid_image.jpg

The example above is a failed one because the grayscale hybrid image is not very clear. We can see abrupt and clearly unnatural changes at the bottom of the lion's mane. Its face also seems a bit blurry.

Another example

Snowboarder

snowboarder_jumping.jpg

Skier

skier_jumping.jpg

Aligned Snowboarder

22_snowboarder_jumping_aligned.jpg

Aligned Skier

22_skier_jumping_aligned.jpg

Snowboarder and Skier Both Grayscale Hybrid Image

22_snowboarder_jumping_skier_jumping_grayscale_hybrid_image.jpg

Favorite Example: Frequency Analysis

Snowboarder

22_frequency_img_snowboarder.jpg

Skier

22_frequency_img_skier.jpg

Aligned Snowboarder

22_frequency_snowboarder_aligned.jpg

Aligned Skier

22_frequency_skier_aligned.jpg

Snowboarder and Skier Both Grayscale Hybrid Image

22_frequency_snowboarder_skier_hybrid_image.jpg

Bells & Whistles

Both Color Nutmeg and Derek Hybrid Image

22_nutmeg_DerekPicture_hybrid_image.jpg

Color Nutmeg and Grayscale Derek Hybrid Image

22_nutmeg_aligned_regular_DerekPicture_aligned_grayscale_hybrid_image.jpg

Grayscale Nutmeg and Regular Derek Hybrid Image

22_nutmeg_aligned_grayscale_DerekPicture_aligned_regular_hybrid_image.jpg

Both Color Snowboarder and Skier Hybrid Image

22_snowboarder_jumping_skier_jumping_hybrid_image.jpg

We can see that both the all-color hybrid (color Nutmeg with color Prof. Hoiem) and the color Nutmeg with grayscale Prof. Hoiem hybrid generated decent results: the hybrid image has nice contrast and retains the color of Prof. Hoiem's face while including the high frequencies (edges) from Nutmeg. The hybrid image of the snowboarder and skier is also clear, with some traces of the snowboarder's frequencies.

Multi-resolution Blending and the Oraple journey

Part 2.3: Gaussian and Laplacian Stacks

To set up for the multiresolution blending in Part 2.4, we will first obtain the Gaussian and Laplacian stacks of the apple image and orange image individually. For both stacks, we use num_levels = 5.

To get the Gaussian stack, at each level, we blur the image from the previous level (at the first level, we start with the original image) with the Gaussian filter, like before. For both images, I used sigma = 4 and kernel_size = 6 * 4 = 24.

To get the Laplacian stack, at each level we take the delta between the previous and current levels of the Gaussian stack and normalize this delta for display. At the end of the stack, we append the last level of the Gaussian stack (no delta here).
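The stack construction above can be sketched as follows (function names are mine; the display normalization is omitted since it is only for visualization):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, num_levels=5, sigma=4):
    """Repeatedly blur without downsampling (sketch)."""
    stack = [img]
    for _ in range(num_levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(g_stack):
    """Band-pass deltas between adjacent Gaussian levels, plus the final low-pass."""
    l_stack = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
    l_stack.append(g_stack[-1])  # last level carries the remaining low frequencies
    return l_stack
```

A nice property of this construction: summing all the Laplacian levels telescopes back to the original image exactly.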

Orange

orange.jpeg

Apple

apple.jpeg

Apple Gaussian and Laplacian Stack

23_apple_gaussian_laplacian_stack.jpg

Orange Gaussian and Laplacian Stack

23_orange_gaussian_laplacian_stack.jpg

Part 2.4: Multiresolution Blending (a.k.a. the oraple!)

Now, we can use the Gaussian and Laplacian stacks to blend our apple and orange images into an oraple. To blend, we create a blurred mask of the same shape as both input images to create a smoother final image. For the oraple, the mask is split vertically, with the left half completely white and the right half completely black, as shown below. Similar to what we do for the two input images, we also create a Gaussian stack for the blurred mask.
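Putting the pieces together, the blend can be sketched as one function: build Gaussian stacks for both images and the mask, form the two Laplacian stacks, blend each band with the correspondingly blurred mask, and sum. The sigma default matches the value from Part 2.3; the function name is mine:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multires_blend(img1, img2, mask, num_levels=5, sigma=4):
    """Multiresolution blend of two images with a mask (sketch).

    mask is 1.0 where img1 should show and 0.0 where img2 should.
    """
    def g_stack(im):
        s = [im]
        for _ in range(num_levels - 1):
            s.append(gaussian_filter(s[-1], sigma))
        return s

    g1, g2, gm = g_stack(img1), g_stack(img2), g_stack(mask)
    # Laplacian stacks: band-pass deltas plus the final low-pass level
    l1 = [g1[i] - g1[i + 1] for i in range(num_levels - 1)] + [g1[-1]]
    l2 = [g2[i] - g2[i + 1] for i in range(num_levels - 1)] + [g2[-1]]
    # Blend each band with its blurred mask, then collapse the stack
    blended = sum(gm[i] * l1[i] + (1 - gm[i]) * l2[i] for i in range(num_levels))
    return np.clip(blended, 0, 1)
```

Because the mask is blurred more at the coarser levels, low frequencies transition over a wide seam while fine details switch over sharply, which is what hides the join.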

Mask for Oraple

oraple.jpg

Recreation Process of Oraple

24_oraple_blended_stack.jpg

Blended Oraple

oraple.jpg

Irregular masks

For irregular masks, we can create them using Adobe Photoshop: there is a convenient selection tool that we can use to highlight the entire image and automatically get a selected region to create a mask from. Then, we can export this mask. Note that we should export the mask image from Adobe Photoshop as JPG so our image has 3 channels. Exporting as PNG will give us 4 channels, which is inconsistent with input images like orange and apple.

In the first example, we blend Steph Curry's face onto LeBron James's. I chose not to do this the flipped way because Curry's face is larger than LeBron's in the aligned images, so LeBron's face would be less clear in the blended image. The quality of all 5 images below is not high; we will use higher-quality examples below to better illustrate the results.

LeBron James

lebron.jpg

Steph Curry

curry.jpg

Aligned LeBron James

lebron_aligned.jpg

Aligned Steph Curry

curry_aligned.jpg

Mask of Aligned Steph Curry

curry_aligned_mask.jpg

Blended Image of Steph Curry's face onto LeBron James

blended_curry_lebron.jpg

Barcelona Skyline

barcelona_skyline.jpg

SF Night Sky

sf_night_sky.jpg

Aligned Barcelona Skyline

barcelona_skyline_aligned.jpg

Aligned SF Night Sky

sf_night_sky_aligned.jpg

Mask of Aligned Barcelona Skyline

barcelona_skyline_aligned_mask.jpg

Blended Image of Barcelona's Skyline with SF Night Sky

blended_barcelona_skyline_sf_night_sky.jpg

The blended image above has rather abrupt changes from bright buildings to ones at night. Also, the dark patches in some of Barcelona's city view are awkward: the shadow makes it feel like clouds are hovering over Barcelona's buildings near the blurred edge.

Barcelona Night Sky

barcelona_night_sky.jpg

SF Night Sky

sf_night_sky.jpg

Aligned Barcelona Night Sky

barcelona_night_sky_aligned.jpg

Aligned SF Night Sky

sf_night_sky_aligned.jpg

Mask of Aligned Barcelona Night Sky

barcelona_night_sky_aligned_mask.jpg

Blended Image of Barcelona's Night Sky with SF Night Sky

blended_barcelona_night_sky_sf_night_sky.jpg

Here are the Gaussian and Laplacian stacks for the Barcelona and SF night sky images.

Barcelona Night Sky Gaussian and Laplacian Stack

24_barcelona_night_sky_gaussian_laplacian_stack.jpg

SF Night Sky Gaussian and Laplacian Stack

24_sf_night_sky_gaussian_laplacian_stack.jpg

The blended image now has fewer sudden changes from one image to the other, apart from the Bay Bridge.

Reflection

I enjoyed working on all parts of this project, especially the image sharpening part because it brought out details of the images that I wouldn't have noticed at first. It is intriguing to see algorithms extract details that don't seem recoverable when looking at the blurred image with the naked eye. It was also fun to choose my own images for multiresolution blending.