用 NumPy 进行图像处理

I recently had to computationally alter some images, an ended up getting interested in some of the basic image manipulation techniques. The result is this post.

In python, there are a number of powerful libraries that make image processing easy, such as OpenCV , SciKit-Image and Pillow . For anyone thinking about doing serious image processing, they should be the first place to look.

However, I am not planning on putting anything into production. Instead the goal of this post is to try and understand the fundamentals of a few simple image processing techniques. Because of this, I am going to stick to using numpy to preform most of the manipulations, although I will use other libraries now and then.

Let's start by loading our libraries

In[1]:

import numpy as np import matplotlib.pylab as plt %matplotlib inline

And loading our image

In[2]:

im = plt.imread("BTD.jpg") im.shape

Out[2]:

(4608, 2592, 3)

We see that image is loaded into an array of dimension 4608 x 2592 x 3.

The first two indices represent the Y and X position of a pixel, and the third represents the RGB colour value of the pixel. Let's take a look at what the image is of

In[3]: def plti(im, h=8, **kwargs): """ Helper function to plot an image. """ y = im.shape[0] x = im.shape[1] w = (y/x) * h plt.figure(figsize=(w,h)) plt.imshow(im, interpolation="none", **kwargs) plt.axis('off') plti(im)
用 NumPy 进行图像处理

It is a photo of a painting of a dog. The painting itself was found in a flat in London I lived in may years ago, abandoned by its previous owner. I don't know what the story behind it is. If you do, please get in touch.

We can see that whichever bumbling fool took that photo of the painting also captured a lot of the wall. We can crop the photo so we are only focused on the painting itself. In numpy, this is just a matter of slicing the image array

In[4]: im = im[400:3800,:2000,:] plti(im)
用 NumPy 进行图像处理

Each pixel of the image is represented by three integers: the RGB value of its colour. Splitting the image into seperate colour components is just a matter of pulling out the correct slice of the image array:

In[6]: fig, axs = plt.subplots(nrows=1, ncols=3, figsize=(15,5)) for c, ax in zip(range(3), axs): tmp_im = np.zeros(im.shape, dtype="uint8") tmp_im[:,:,c] = im[:,:,c] ax.imshow(tmp_im) ax.set_axis_off()
用 NumPy 进行图像处理

When using matplotlib's imshow to display images, it is important to keep track of which data type you are using, as the colour mapping used is data type dependent: if a float is used, the values are mapped to the range 0-1, so we need to cast to type "uint8" to get the expected behavior. A good discussion of this issue can be found here here .

In my first edition of this post I made this mistake. Thanks to commenter Rusty Chris Holleman for pointing out the problem.

Colour Transformations

Representing colour in this way, we can think of each pixel as a point in a three dimensional space. Thinking of colour this way, we can apply various transformations to the colour "point". An interesting example of these is "rotating" the colour.

There is some subtleties - a legal colours officially exist as integer points in a three dimensional cube of side lengths 255. It is possible that a rotation could push a point out of this cube. To get around this I apply a sigmoid transformation to the data, a mapping from the range 0-1 to the full real line. Having applied this transformation we apply the rotation matrix then transform back to colour space.

The rotation matrix is applied pixel-wise to to the image using numpy's Einstein notation function, which I hadn't used before but, but make the operation concise. It is explained well in this post .

The following functions apply a sigmoid to the images colour space, and rotate it about the red axis by some angle, before returning the image to normal colour space.

In[7]: def do_normalise(im): return -np.log(1/((1 + im)/257) - 1) def undo_normalise(im): return (1 + 1/(np.exp(-im) + 1) * 257).astype("uint8") def rotation_matrix(theta): """ 3D rotation matrix around the X-axis by angle theta """ return np.c_[ [1,0,0], [0,np.cos(theta),-np.sin(theta)], [0,np.sin(theta),np.cos(theta)] ] im_normed = do_normalise(im) im_rotated = np.einsum("ijk,lk->ijl", im_normed, rotation_matrix(np.pi)) im2 = undo_normalise(im_rotated) plti(im2)
用 NumPy 进行图像处理

Not bad. It looks even more impressive if we continiously rotate the colour of the pixels. We can animate this transformation using matplotlib's FuncAnimation tool. This post has an excellent introduction to how to use it.

In[8]:

from matplotlib.animation import FuncAnimation fig, ax = plt.subplots(figsize=(5,8)) def update(i): im_normed = do_normalise(im) im_rotated = np.einsum("ijk,lk->ijl", im_normed, rotation_matrix(i * np.pi/10)) im2 = undo_normalise(im_rotated) ax.imshow(im2) ax.set_title("Angle: {}*pi/10".format(i), fontsize=20) ax.set_axis_off() anim = FuncAnimation(fig, update, frames=np.arange(0, 20), interval=50) anim.save('colour_rotation.gif', dpi=80, writer='imagemagick') plt.close() # <img src="colour_rotation.gif">

psychedelic.

On the topic of colour, we can also transform the image to greyscale easily. There are a number of ways to do this, but a straight forward way is to take the weighted mean of the RGB value of original image:

In[9]: def to_grayscale(im, weights = np.c_[0.2989, 0.5870, 0.1140]): """ Transforms a colour image to a greyscale image by taking the mean of the RGB values, weighted by the matrix weights """ tile = np.tile(weights, reps=(im.shape[0],im.shape[1],1)) return np.sum(tile * im, axis=2) In[10]:

用 NumPy 进行图像处理

Trending Articles

《沈冰自述——我和周永康的故事》全本

Moog - Subsequent 25

出售: 林憶蓮•回來愛的身邊 (東芝1A1頭版)

筆記 - 使用 PowerShell 清除停用 AD 帳號與 OU

df-dferh-01 中国区 Android 安装 Google Play Store 后报错的解决办法

「一棒接一棒、棒棒強棒」108學年度家長會長交接典禮

吸烟与MBTI类型判断捷径 (豆瓣 INFJ的奇幻之旅小组)

acermark龍璿國際展出多款包裝設備

枋寮北勢寮隆山宮睽違12年再辦迎王祭典

日本女优有村千佳COS集锦：狂三&黑白岩&亚丝娜&绫波丽

有遇到过这个问题么。/jsb-videoplayer.js not found, possible missing file.

MAS v2.8 magicgenius 汉化版 - 11.11更新

出售: Monster Cable Interlink Reference 2

福建佛教人士望云和尚(林斌)的九仙禅寺被强行收走，望云妈妈被赶出寺庙

R 语言中的OpenBLAS*和英特尔® 数学核心函数库的性能比较

[转载]煞貢、直星、人專吉日\金神七煞歌

HAKERS哈克士戶外 12月8~14日廠拍

OBS Studio 23.2.1 免安裝中文版 - 免費網路實況廣播軟體實況主必備軟體取代Fraps

<請教>行駛中安卓機會重新開機

Udp2raw-tunnel 及其一键安装脚本