Quantcast
Channel: CodeSection,代码区,Python开发技术文章_教程 - CodeSec
Viewing all articles
Browse latest Browse all 9596

用 NumPy 进行图像处理

$
0
0

I recently had to computationally alter some images, an ended up getting interested in some of the basic image manipulation techniques. The result is this post.

In python, there are a number of powerful libraries that make image processing easy, such as OpenCV , SciKit-Image and Pillow . For anyone thinking about doing serious image processing, they should be the first place to look.

However, I am not planning on putting anything into production. Instead the goal of this post is to try and understand the fundamentals of a few simple image processing techniques. Because of this, I am going to stick to using numpy to preform most of the manipulations, although I will use other libraries now and then.

Let's start by loading our libraries

In[1]:

import numpy as np import matplotlib.pylab as plt %matplotlib inline

And loading our image

In[2]:

im = plt.imread("BTD.jpg") im.shape

Out[2]:

(4608, 2592, 3)

We see that image is loaded into an array of dimension 4608 x 2592 x 3.

The first two indices represent the Y and X position of a pixel, and the third represents the RGB colour value of the pixel. Let's take a look at what the image is of

In[3]: def plti(im, h=8, **kwargs): """ Helper function to plot an image. """ y = im.shape[0] x = im.shape[1] w = (y/x) * h plt.figure(figsize=(w,h)) plt.imshow(im, interpolation="none", **kwargs) plt.axis('off') plti(im)
用 NumPy 进行图像处理

It is a photo of a painting of a dog. The painting itself was found in a flat in London I lived in may years ago, abandoned by its previous owner. I don't know what the story behind it is. If you do, please get in touch.

We can see that whichever bumbling fool took that photo of the painting also captured a lot of the wall. We can crop the photo so we are only focused on the painting itself. In numpy, this is just a matter of slicing the image array

In[4]: im = im[400:3800,:2000,:] plti(im)
用 NumPy 进行图像处理

Each pixel of the image is represented by three integers: the RGB value of its colour. Splitting the image into seperate colour components is just a matter of pulling out the correct slice of the image array:

In[6]: fig, axs = plt.subplots(nrows=1, ncols=3, figsize=(15,5)) for c, ax in zip(range(3), axs): tmp_im = np.zeros(im.shape, dtype="uint8") tmp_im[:,:,c] = im[:,:,c] ax.imshow(tmp_im) ax.set_axis_off()
用 NumPy 进行图像处理

When using matplotlib's imshow to display images, it is important to keep track of which data type you are using, as the colour mapping used is data type dependent: if a float is used, the values are mapped to the range 0-1, so we need to cast to type "uint8" to get the expected behavior. A good discussion of this issue can be found here here .

In my first edition of this post I made this mistake. Thanks to commenter Rusty Chris Holleman for pointing out the problem.

Colour Transformations

Representing colour in this way, we can think of each pixel as a point in a three dimensional space. Thinking of colour this way, we can apply various transformations to the colour "point". An interesting example of these is "rotating" the colour.

There is some subtleties - a legal colours officially exist as integer points in a three dimensional cube of side lengths 255. It is possible that a rotation could push a point out of this cube. To get around this I apply a sigmoid transformation to the data, a mapping from the range 0-1 to the full real line. Having applied this transformation we apply the rotation matrix then transform back to colour space.

The rotation matrix is applied pixel-wise to to the image using numpy's Einstein notation function, which I hadn't used before but, but make the operation concise. It is explained well in this post .

The following functions apply a sigmoid to the images colour space, and rotate it about the red axis by some angle, before returning the image to normal colour space.

In[7]: def do_normalise(im): return -np.log(1/((1 + im)/257) - 1) def undo_normalise(im): return (1 + 1/(np.exp(-im) + 1) * 257).astype("uint8") def rotation_matrix(theta): """ 3D rotation matrix around the X-axis by angle theta """ return np.c_[ [1,0,0], [0,np.cos(theta),-np.sin(theta)], [0,np.sin(theta),np.cos(theta)] ] im_normed = do_normalise(im) im_rotated = np.einsum("ijk,lk->ijl", im_normed, rotation_matrix(np.pi)) im2 = undo_normalise(im_rotated) plti(im2)
用 NumPy 进行图像处理

Not bad. It looks even more impressive if we continiously rotate the colour of the pixels. We can animate this transformation using matplotlib's FuncAnimation tool. This post has an excellent introduction to how to use it.

In[8]:

from matplotlib.animation import FuncAnimation fig, ax = plt.subplots(figsize=(5,8)) def update(i): im_normed = do_normalise(im) im_rotated = np.einsum("ijk,lk->ijl", im_normed, rotation_matrix(i * np.pi/10)) im2 = undo_normalise(im_rotated) ax.imshow(im2) ax.set_title("Angle: {}*pi/10".format(i), fontsize=20) ax.set_axis_off() anim = FuncAnimation(fig, update, frames=np.arange(0, 20), interval=50) anim.save('colour_rotation.gif', dpi=80, writer='imagemagick') plt.close() # <img src="colour_rotation.gif">


用 NumPy 进行图像处理

psychedelic.

On the topic of colour, we can also transform the image to greyscale easily. There are a number of ways to do this, but a straight forward way is to take the weighted mean of the RGB value of original image:

In[9]: def to_grayscale(im, weights = np.c_[0.2989, 0.5870, 0.1140]): """ Transforms a colour image to a greyscale image by taking the mean of the RGB values, weighted by the matrix weights """ tile = np.tile(weights, reps=(im.shape[0],im.shape[1],1)) return np.sum(tile * im, axis=2) In[10]:

Viewing all articles
Browse latest Browse all 9596

Trending Articles