a very incomplete and biased sampling of cvpr2011 papers and thoughts
This is a set of interesting papers from CVPR 2011, along with my thoughts after seeing them. My selection of papers and comments is very biased, because I am much more interested in producing nice pictures than in understanding their contents, and I wasn’t able to attend some of the posters/orals: sessions overlapped in time, I was presenting my own poster, or I didn’t want to squeeze into the terrible crowd around a poster…
Variational Bayes approach to kernel estimation
References:
[1] J. Miskin and D. J. C. MacKay, Ensemble Learning for Blind Image Separation and Deconvolution, in Advances in Independent Component Analysis, M. Girolami, Ed., Springer-Verlag, 2000.
[2] R. Fergus et al., Removing camera shake from a single photograph, SIGGRAPH 2006.
[3] O. Whyte, J. Sivic, A. Zisserman, and J. Ponce, Non-uniform deblurring for shaken images, CVPR 2010.
[4] C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
[5] A. Levin, Y. Weiss, F. Durand, and W. T. Freeman, Understanding and evaluating blind deconvolution algorithms, CVPR 2009.
[6] A. Levin, Y. Weiss, F. Durand, and W. T. Freeman, Efficient Marginal Likelihood Optimization in Blind Deconvolution, CVPR 2011.
The standard formulation for blurring is:
$$J = K \otimes I + n,$$
where $J$ is the observed image, $I$ is the latent sharp image, $K$ is the kernel, $\otimes$ denotes convolution, and $n$ is the noise.
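As a minimal sketch of this image formation model (blurry image = kernel convolved with sharp image, plus noise), the NumPy snippet below synthesizes a blurry observation. The function name `blur`, the circular boundary handling via the FFT, and the random stand-in image are assumptions made for illustration; they are not taken from any of the cited papers.

```python
import numpy as np

def blur(I, K, noise_sigma=0.0, rng=None):
    """Return J = K (*) I + n, with circular convolution computed via the FFT."""
    rng = np.random.default_rng() if rng is None else rng
    # Zero-pad the kernel to the image size; its origin stays at index (0, 0).
    K_pad = np.zeros_like(I, dtype=float)
    K_pad[:K.shape[0], :K.shape[1]] = K
    J = np.real(np.fft.ifft2(np.fft.fft2(I) * np.fft.fft2(K_pad)))
    # Additive i.i.d. Gaussian noise with standard deviation noise_sigma.
    return J + noise_sigma * rng.standard_normal(I.shape)

I = np.random.default_rng(0).random((32, 32))   # stand-in for a sharp image
K = np.full((1, 5), 1.0 / 5.0)                  # horizontal 5-pixel box blur
J = blur(I, K, noise_sigma=0.01, rng=np.random.default_rng(1))
```

Blind deconvolution is the inverse problem: only `J` is observed, and both `I` and `K` have to be recovered.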
In [1] and [2], the latent image $I$ and the kernel $K$ both follow Gaussian mixture model (GMM) distributions, and the noise is Gaussian. Reference [3] uses a slightly different formulation of the convolution, but it also ends up with a bilinear representation of the blur, with sparse priors on the kernel and image parameters.
It is straightforward to write out the conditional distribution as
$$p(I, K \mid J) \propto p(J \mid I, K)\, p(I)\, p(K).$$
However, as [5] shows, the marginal distribution over $K$, i.e. $p(K \mid J) = \int p(I, K \mid J)\, dI$, typically gives a better estimate of $K$ than the MAP estimate. With sparse priors on both $I$ and $K$, this distribution is intractable to compute. Levin solves this problem with a Laplace approximation, while references [1]-[3] use the variational Bayes approach [5].
MAP_K approach to kernel estimation under Gaussian prior
The following notes are excerpted from “Understanding and Evaluating Blind Deconvolution Algorithms” by Anat Levin et al.
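Let $X_w$, $Y_w$ and $K_w$ be the Fourier transforms of the original image $x$, the blurry image $y$ and the blurring kernel $k$, so that $Y_w = K_w X_w + N_w$, where $N_w$ is Gaussian noise with variance $\eta^2$.
Assume a Gaussian prior on $X_w$, i.e.
$$X_w \sim N(0, \sigma_w^2),$$
where $\sigma_w^{-2} = \beta G_w$ with $G_w = |G_{x,w}|^2 + |G_{y,w}|^2$.
Here $G_{x,w}$ and $G_{y,w}$ are the Fourier transforms of the image derivative operators $g_x$ and $g_y$.
Then $Y_w$ also follows a Gaussian distribution:
$$Y_w \sim N(0, \sigma_w^2 |K_w|^2 + \eta^2).$$
The log likelihood of $Y_w$ is therefore
$$\log p(Y_w \mid K_w) = C - \frac{|Y_w|^2}{2(\sigma_w^2 |K_w|^2 + \eta^2)} - \frac{1}{2} \log(\sigma_w^2 |K_w|^2 + \eta^2),$$
where $C$ is a constant.
The second term is the likelihood when $X_w$ takes its MAP estimate, and the log term is proportional to the volume of $X_w$.
If no assumption is made about the kernel, this constrains the squared MTF of the kernel, since the log likelihood is maximized when
$$\sigma_w^2 |K_w|^2 = |Y_w|^2 - \eta^2. \quad (1)$$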
Note that the squared MTF of a signal is the Fourier transform of its spatial auto-correlation. Therefore Eq. (1) can be interpreted as “The auto-correlation in the (denoised) blurry image is the convolution of kernel auto-correlation and sharp image’s auto-correlation.”
However, it also shows that, at least under a Gaussian prior:
- Levin’s approach does not seem to provide enough constraint on free-form kernels: Eq. (1) fixes only the magnitude of the kernel spectrum and leaves its phase undetermined. An additional prior on the kernel (e.g. non-negativity, continuity) might be necessary. However, if the kernel is limited to certain classes, e.g. disk or box filters, the approach seems to be sufficient.
- Although the choice between an L2 norm and an L1 norm might not be important, the choice of the prior $G_w$ is essential.
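To make the magnitude-only constraint concrete, here is a small 1-D NumPy sketch. It assumes that Eq. (1) has the form $|K_w|^2 = (|Y_w|^2 - \eta^2) / \sigma_w^2$ (clipped at zero), with $\sigma_w^2$ the prior variance of the sharp signal and $\eta^2$ the noise variance. The function names, the box kernel, the `[1, -1]` derivative filter and the small constant that regularizes the DC term are all illustrative choices. It also averages $|Y_w|^2$ over many simulated signals to suppress sampling noise, which a single photograph would not allow.

```python
import numpy as np

def kernel_power_estimate(Y_power, sigma2, eta2):
    """Per-frequency estimate |K_w|^2 = max(0, (|Y_w|^2 - eta^2) / sigma_w^2)."""
    return np.maximum(0.0, (Y_power - eta2) / sigma2)

def simulate_mean_power(k, sigma2, eta2, trials, rng):
    """Average |Y_w|^2 over signals drawn from the Gaussian prior and blurred by k."""
    K = np.fft.fft(k)
    shape = (trials, k.size)
    # Complex Gaussian samples with E|X_w|^2 = sigma2 and E|N_w|^2 = eta2.
    X = np.sqrt(sigma2 / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
    N = np.sqrt(eta2 / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
    return np.mean(np.abs(K * X + N) ** 2, axis=0)

n = 64
k = np.zeros(n); k[:5] = 1.0 / 5.0               # 1-D box kernel
g = np.zeros(n); g[0], g[1] = 1.0, -1.0          # derivative filter
sigma2 = 1.0 / (np.abs(np.fft.fft(g)) ** 2 + 0.01)   # prior variance (0.01 keeps DC finite)
eta2 = 1e-4
rng = np.random.default_rng(0)

Y_power = simulate_mean_power(k, sigma2, eta2, trials=20000, rng=rng)
K_power_hat = kernel_power_estimate(Y_power, sigma2, eta2)
K_power_true = np.abs(np.fft.fft(k)) ** 2
# A shifted kernel has exactly the same power spectrum, so it fits equally well.
shifted_power = np.abs(np.fft.fft(np.roll(k, 3))) ** 2
```

`K_power_hat` tracks `K_power_true` closely, but `shifted_power` equals `K_power_true` as well: the estimate cannot tell the kernel from its shifted copy, which is the phase ambiguity behind the first bullet.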
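A weird Matlab + X11 problem
My settings are definitely fine, because xclock does show up on the local display. But when I start matlab with exactly the same display setting, its windows appear on the remote host every time. I can't make sense of it…
Oh well, looking at figures on the big screen is not bad either…
OK… finally solved it… Start matlab with
matlab -noawt
Previously I was using
matlab -nodesktop
so the figures were not going through X11…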
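A compatibility problem between g++ and vnl_matrix_inverse
Today I finally fully figured out the problem I ran into while TAing: on my machine, vnl_matrix_inverse failed with a "private copy constructor" error. It looked like a syntax problem, but the vxl documentation clearly states that
x = vnl_matrix_inverse(A)*b;
and
x = vnl_svd(A).solve(b);
are equivalent.
It turned out that my g++ compiler was version 4.2.1; after switching g++ to 4.3.5, the code compiled.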
matlab in textmate + iterm + applescript
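I don't know when I came up with this perverse idea: instead of simply using matlab's built-in desktop + editor, I want to use a textmate + iterm combination. I think it is partly because a text editor + terminal setup feels lightweight; maybe it is just pure fastidiousness. If I have to give a reason, it is that textmate has bundles, so basically any repetitive task can be bound to a shortcut (once the script is written).
It is not 100% working yet, so I'll write down the rough idea first and fill in the rest over time.
The features I need are mainly about the interaction between textmate and iterm.
- Open matlab and cd into the project's root directory.
This can be done with applescript, or simply by hand. The only trick is that X11 must already be running when iterm opens, otherwise matlab's graphics cannot be displayed.
One addition: there is a utility called pos that you can find online, which opens a terminal from finder and cds to the current directory. It is very handy.
- Debug with breakpoints from textmate, using bookmarks to mark/unmark breakpoints.
This part is a bit tricky. My idea is to send dbstop/dbclear commands from textmate to the iterm window, and at the same time trigger the shortcut that bookmarks the current line. Yeah, very dirty…
The annoying part is that textmate uses a single command to add and remove a bookmark, while matlab uses different commands to set and clear a breakpoint. The crude hack I have come up with so far is to write a function that reads the breakpoint state to decide which one to call…
I still haven't figured out how to clear all the bookmarks in a project…
- Inspect the workspace.
No idea how to do this one.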
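The bookmark/breakpoint mismatch described above (one toggle command in textmate, separate set and clear commands in matlab) can be sketched as a small toggle helper. This is only an illustration of the hack: the function name `toggle_breakpoint` is hypothetical, and the breakpoint state is kept in a local Python set, which is an assumption, where a real setup would read it back from matlab (e.g. via `dbstatus`). The emitted strings use matlab's `dbstop in FILE at LINE` and `dbclear in FILE at LINE` forms.

```python
def toggle_breakpoint(breakpoints, filename, line):
    """Return the matlab command to send and update the local breakpoint set."""
    key = (filename, line)
    if key in breakpoints:
        # A breakpoint is already set here, so the toggle clears it.
        breakpoints.remove(key)
        return f"dbclear in {filename} at {line}"
    # No breakpoint yet, so the toggle sets one.
    breakpoints.add(key)
    return f"dbstop in {filename} at {line}"

bps = set()
first = toggle_breakpoint(bps, "demo.m", 12)    # sets the breakpoint
second = toggle_breakpoint(bps, "demo.m", 12)   # clears it again
```

The returned string is what the textmate command would forward to the iterm window, alongside the bookmark toggle for the same line.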
DxO
http://www.dxo.com
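A French company whose work is very, very closely related to mine… They do both research and products. These days I really appreciate seeing academic results turned into products.
btw, the EDoF technique that Nokia has been using for the past two years was bought from them; I mentioned it in an earlier blog post.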
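A strange phenomenon
When I calibrate the optical center at different object distances, it moves by as much as 40-50 pixels; the image size is 2820×1876.
I checked the data several times. The algorithm is quite stable, with noise jitter within 10 pixels. Looking at the data directly, the optical center really does move.
The magnification and the curvature of the distortion also differ across object distances, which is not surprising.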
Notes: Optical Imaging and Aberrations
1. Foundations of geometric optics
Light travels with different velocities in different media. This speed is often characterized by the refractive index $n$:
$$n = \frac{c}{v},$$
where $c$ is the speed of light in vacuum and $v$ denotes the (phase) speed in the medium.
Consequently, the time a ray takes to travel from point $P_1$ to $P_2$ through a path $C$ is proportional to the optical path length (OPL):
$$\mathrm{OPL}(C) = \int_C n \, ds.$$
Fermat’s Principle states that this “time” is stationary with respect to small changes of the path. The term “stationary” means that the actual path taken by the ray is a maximum, a minimum or a saddle point of the OPL. More specifically, for the actual path $C$ and a neighboring path $C'$ that deviates from it by up to the amount $\epsilon$:
$$\mathrm{OPL}(C') - \mathrm{OPL}(C) = O(\epsilon^2).$$
Fermat’s Principle directly gives the three laws of geometric optics:
- In a homogeneous medium, by Fermat’s Principle, the ray propagates rectilinearly.
- At an interface between two media of refractive indices $n_1$ and $n_2$, an incident ray is refracted according to Snell’s Law:
$$n_1 \sin\theta_1 = n_2 \sin\theta_2. \quad (3)$$
- At an interface between two media, an incident ray is reflected according to
$$\theta_1 = \theta_2. \quad (4)$$
In Eq. (3) and (4), $\theta_1$ and $\theta_2$ are the angles that the incident ray and the refracted (or reflected) ray make with the surface normal. From Snell’s Law, one can see that the OPLs of two neighboring refracted rays between two planes, each perpendicular to one or both of the rays, are the same.
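The link between Fermat's Principle and Snell's Law can be checked numerically. The sketch below assumes a flat interface at $y = 0$, a source at $(0, a)$ in a medium of index `n1` and a target at $(d, -b)$ in a medium of index `n2`; the geometry, the function names and the use of a ternary search are illustrative choices. It minimizes the OPL over the crossing point and then evaluates $n_1 \sin\theta_1 - n_2 \sin\theta_2$ there.

```python
import math

def opl(x, n1, n2, a, b, d):
    """OPL from (0, a) in medium 1 to (d, -b) in medium 2, crossing y = 0 at (x, 0)."""
    return n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)

def fermat_crossing(n1, n2, a, b, d, iters=200):
    """Ternary search for the crossing point that minimizes the OPL (convex in x)."""
    lo, hi = 0.0, d
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if opl(m1, n1, n2, a, b, d) < opl(m2, n1, n2, a, b, d):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

n1, n2, a, b, d = 1.0, 1.5, 1.0, 1.0, 2.0
x = fermat_crossing(n1, n2, a, b, d)
sin1 = x / math.hypot(x, a)            # sine of the angle of incidence
sin2 = (d - x) / math.hypot(d - x, b)  # sine of the angle of refraction
snell_residual = n1 * sin1 - n2 * sin2
```

`snell_residual` is the derivative of the OPL with respect to the crossing point, so it vanishes (up to numerical precision) exactly at the stationary path.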
Consider a pencil of rays that run along the gradient of the OPL, i.e. orthogonal to the wavefronts (a wavefront being the surface formed by the endpoints of paths that share the same OPL and the same starting point). Following from Fermat’s Principle, the Malus-Dupin Theorem states that these rays remain perpendicular to the wavefronts after being refracted.
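Hamilton’s point characteristic function $V(P_1, P_2)$ of two points $P_1$ and $P_2$ is defined as the OPL between these two points. The derivatives of $V$ at either point are related to the unit ray directions $\mathbf{s}_1$ (at $P_1$) and $\mathbf{s}_2$ (at $P_2$) by
$$\nabla_1 V = -n_1 \mathbf{s}_1, \qquad \nabla_2 V = n_2 \mathbf{s}_2.$$
Here $n_1$ and $n_2$ are the refractive indices of the medium, and $\nabla_1$ and $\nabla_2$ are gradient operators, at $P_1$ and $P_2$ respectively.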
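As a quick sanity check of the standard relations $\nabla_1 V = -n_1 \mathbf{s}_1$ and $\nabla_2 V = n_2 \mathbf{s}_2$, the sketch below takes the simplest case, a homogeneous medium, where the ray is a straight line and $V = n\,|P_2 - P_1|$. The helper names and the finite-difference step are illustrative assumptions.

```python
import numpy as np

def V(p1, p2, n):
    """Point characteristic in a homogeneous medium: OPL of the straight ray p1 -> p2."""
    return n * np.linalg.norm(p2 - p1)

def grad(f, p, h=1e-6):
    """Central-difference gradient of the scalar function f at point p."""
    g = np.zeros_like(p)
    for i in range(p.size):
        e = np.zeros_like(p)
        e[i] = h
        g[i] = (f(p + e) - f(p - e)) / (2 * h)
    return g

n = 1.5
p1 = np.array([0.0, 0.0, 0.0])
p2 = np.array([1.0, 2.0, 2.0])
s = (p2 - p1) / np.linalg.norm(p2 - p1)    # unit ray direction, same at both ends

grad1 = grad(lambda p: V(p, p2, n), p1)    # should be close to -n * s
grad2 = grad(lambda p: V(p1, p, n), p2)    # should be close to +n * s
```

Moving the end point along the ray lengthens the path, while moving the start point along the ray shortens it, which is where the opposite signs come from.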
2. Gaussian Imaging
To be continued:
Gaussian and Newtonian imaging equations, magnification factors, Lagrange invariants, focal length
Cases: spherical refracting surfaces, thin lens, general symmetric refractive lens system
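A few camera control tricks
Before taking a picture, make sure the new file to be written does not already exist, i.e. never overwrite,
because overwriting is slower; the program then sends in new commands, the camera gets too busy, and the whole control program crashes…
Refocusing alone, without taking a picture, requires liveview to be on, otherwise it doesn't work…
The program I use is basically the same as the example that comes with the canon sdk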
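The no-overwrite rule above can be sketched independently of the camera SDK: pick a file name that does not exist yet before each shot. The helper name `fresh_path` and the `shot_0000.jpg` naming scheme are hypothetical, and nothing here calls the canon sdk.

```python
import os

def fresh_path(directory, stem="shot", ext=".jpg"):
    """Return a path in `directory` that does not exist yet, so saving never overwrites."""
    i = 0
    while True:
        path = os.path.join(directory, f"{stem}_{i:04d}{ext}")
        if not os.path.exists(path):
            return path
        i += 1
```

Calling `fresh_path(output_dir)` right before each capture and passing the result as the save target avoids the slow overwrite path.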