gzr2017 / imageprocessing100wen

This project forked from ryoppippi/gasyori100knock

Chinese edition of 「画像処理100本ノック」 (Image Processing 100 Knocks)! 100 problems designed for image-processing beginners.

Home Page: https://github.com/yoyoyo-yo/Gasyori100knock

License: MIT License

Python 54.45% C++ 31.32% Jupyter Notebook 13.27% HTML 0.97%

imageprocessing100wen's Issues

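Question_81_90/README.md: line 24 contains an extra letter 'j'

In Question 81, line 24 contains an extra letter 'j':
"In Hessian corner detection, the local maxima of $\det{H}$ are treated as j corners."
Nothing in the surrounding text mentions this j, so it looks like a stray character.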

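Q25: an error is raised when the scaling factor is 2 or greater

The error is caused by the y coordinate going past the bounds of the original image (128*128).

Fix: the computation produces values such as 127.5, which rounding (round) turns into 128. Switching from rounding to rounding down (floor) resolves it.

With rounding (round):
With rounding down (floor):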
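A minimal sketch of the off-by-one described above, assuming nearest-neighbor upscaling of one 128-pixel axis. `source_indices` is a hypothetical helper written for illustration, not code from the repository.

```python
import numpy as np


def source_indices(size, scale, use_floor=True):
    """Map each destination coordinate back to a source index (hypothetical helper)."""
    dst = np.arange(int(size * scale))   # destination coordinates 0 .. size*scale - 1
    src = dst / scale                    # fractional source coordinates
    idx = np.floor(src) if use_floor else np.round(src)
    return idx.astype(int)


rounded = source_indices(128, 2.0, use_floor=False)
floored = source_indices(128, 2.0, use_floor=True)

# With round, the last coordinate 127.5 becomes 128, one past the valid range 0..127.
print(rounded.max(), floored.max())
```

Clipping the indices to `size - 1` would be another way to stay in range; floor is simply the change proposed in this issue.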

answer_10.cpp: median index problem

// get and assign median
std::sort(vs, vs + (kernel_size * kernel_size));
out.at<cv::Vec3b>(y, x)[c] = (uchar)vs[int(floor(count / 2)) + 1];
Since this takes the median and indexing starts at 0, the +1 should not be there, should it?
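The indexing point can be checked with a small sketch, assuming an odd window size such as 3x3 = 9 samples. `median_of_window` is a hypothetical name used only here.

```python
def median_of_window(values):
    """Return the median of an odd-length window using 0-based indexing."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]   # for 9 values this is index 4, the 5th smallest


window = [7, 1, 9, 3, 5, 8, 2, 6, 4]
print(median_of_window(window))                  # middle element
print(sorted(window)[len(window) // 2 + 1])      # what the extra +1 would pick instead
```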

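q4 answer is missing

The answer link for q4 is dead, although the Japanese version has it. Also, in the later linear-filter answers (cpp), the kernel arrays are sized at run time, which most current versions of VC (Visual C++) will probably refuse to compile.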

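Page not found.

Hello, when I click the links under "Answers",
Answers
Python >> answers/answer_11.py
C++ >> answers_cpp/answer_11.cpp
the page I land on is "Page not found", so I have been opening the folders above to view the answer files instead.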

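ValueError: could not broadcast input array from shape (130,130,3) into shape (128,128,3)

The problem shows up mainly in the affine transformation questions (28-31).
The assignment img[1:H+1, 1:W+1] = img looks like it goes out of bounds. Running the source in PyCharm raises this error. How can it be fixed?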
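One way this message can arise, as an assumption about the cause rather than a confirmed diagnosis: if the name `img` is reused for the padded (H+2, W+2) buffer, the right-hand side of the assignment is the 130x130 array and no longer fits into the 128x128 slice. The sketch below reproduces that and shows a padded copy kept under a separate name; both function names are hypothetical.

```python
import numpy as np


def reproduce_error(img):
    """Reuse `img` for the padded buffer and return the resulting error message."""
    H, W, C = img.shape
    img = np.zeros((H + 2, W + 2, C), dtype=np.float32)  # original pixels are lost here
    try:
        img[1:H + 1, 1:W + 1] = img                      # (130,130,3) into (128,128,3)
    except ValueError as exc:
        return str(exc)
    return None


def pad_one_pixel(img):
    """Copy `img` into the interior of a zero buffer that is one pixel larger on each side."""
    H, W, C = img.shape
    padded = np.zeros((H + 2, W + 2, C), dtype=np.float32)
    padded[1:H + 1, 1:W + 1] = img
    return padded


sample = np.ones((128, 128, 3), dtype=np.float32)
print(reproduce_error(sample))
print(pad_one_pixel(sample).shape)
```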

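Permission to quote

Hello, I have written CSDN blog posts about this problem series that quote your translation, and I would like your permission to do so.
Also, could you leave contact details (email or QQ) so that we can exchange ideas?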

An error in the description of Question 65

In item 4 of step 2 of the algorithm, the code uses x2, x4, x8.
The description says x2, x4, x6.

Working example for answer28-31 (e.g. answer28)

import cv2
import numpy as np
import matplotlib.pyplot as plt


# Affine
def affine(img, a, b, c, d, tx, ty):
    H, W, C = img.shape

    # temporary image
    _img = np.zeros((H + 2, W + 2, C), dtype=np.float32)
    _img[1:H + 1, 1:W + 1] = img

    # get new image shape
    H_new = np.round(H * d).astype(int)
    W_new = np.round(W * a).astype(int)
    out = np.zeros((H_new + 1, W_new + 1, C), dtype=np.float32)

    # get position of new image
    x_new = np.tile(np.arange(W_new), (H_new, 1))
    y_new = np.arange(H_new).repeat(W_new).reshape(H_new, -1)

    # get position of original image by affine
    adbc = a * d - b * c
    x = np.round((d * x_new - b * y_new) / adbc).astype(int) - tx + 1
    y = np.round((-c * x_new + a * y_new) / adbc).astype(int) - ty + 1

    x = np.minimum(np.maximum(x, 0), W + 1).astype(int)
    y = np.minimum(np.maximum(y, 0), H + 1).astype(int)

    # assign pixels to new image
    out[y_new, x_new] = _img[y, x]

    out = out[:H_new, :W_new]
    out = out.astype(np.uint8)

    return out


# Read image
img = cv2.imread("../imori.jpg").astype(np.float32)

# Affine
out = affine(img, a=1, b=0, c=0, d=1, tx=30, ty=-30)

# Save result
cv2.imshow("result", out)
cv2.waitKey(0)
cv2.imwrite("out.jpg", out)

In the description, the morphological operation dilates before eroding, but in your code you erode before you dilate. The opening operation has the same problem.
File: ImageProcessing100Wen/Question_41_50/answers_py/answer_49.py, answer_50.py
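For reference, the two compositions differ only in order: opening erodes and then dilates, while closing dilates and then erodes. The sketch below illustrates this on a boolean mask with a 4-neighborhood structuring element. The helper names, the structuring element, and the border handling are assumptions made for this example, not the repository's code.

```python
import numpy as np


def dilate(mask):
    """A pixel becomes True if it or any 4-neighbor is True (outside counts as False)."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    return p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1] | p[1:-1, :-2] | p[1:-1, 2:]


def erode(mask):
    """A pixel stays True only if it and all 4-neighbors are True (outside counts as True)."""
    p = np.pad(mask, 1, mode="constant", constant_values=True)
    return p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]


def opening(mask):
    return dilate(erode(mask))   # erosion first: removes small isolated specks


def closing(mask):
    return erode(dilate(mask))   # dilation first: fills small holes


speck = np.zeros((5, 5), dtype=bool)
speck[2, 2] = True
print(opening(speck).any(), closing(speck).any())
```

On a single isolated pixel, opening removes it while closing leaves it in place, which is a quick way to tell which order a given implementation uses.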

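Forward and inverse Fourier transform issue in answer_32.cpp

line 46:val /= sqrt(height * width);
line 73: g = std::abs(val) / sqrt(height * width);
While working on this yesterday I found it odd that a wrong formula could still give the correct answer. Looking at answer_32.cpp made it clear: the g = std::abs(val) / (height * width) of the inverse transform has been split into a division by sqrt(height * width) in both the forward and the inverse transform... 😂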
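Both normalization conventions give a correct round trip, which can be checked with NumPy's FFT as a stand-in for the hand-written DFT in the answer. This is only a sketch on a random 8x8 array: NumPy's default puts the whole 1/(H*W) factor on the inverse, and `norm="ortho"` puts 1/sqrt(H*W) on each direction.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 8))
H, W = img.shape

# Convention A: no factor on the forward transform, 1/(H*W) on the inverse.
G_a = np.fft.fft2(img)
rec_a = np.fft.ifft2(G_a).real

# Convention B: 1/sqrt(H*W) on both the forward and the inverse transform.
G_b = np.fft.fft2(img, norm="ortho")
rec_b = np.fft.ifft2(G_b, norm="ortho").real

# The spectra differ by a constant factor, but both reconstructions match the input.
print(np.allclose(G_b, G_a / np.sqrt(H * W)), np.allclose(rec_a, img), np.allclose(rec_b, img))
```

The catch is that the spectrum itself is scaled differently, so a power spectrum computed under one convention cannot be compared directly with one computed under the other.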

Standard deviation calculation in answer_22.cpp

  // get sum
  for (int y = 0; y < height; y++){
    for (int x = 0; x < width; x++){
      for (int c = 0; c < 3; c++){
        val = (float)img.at<cv::Vec3b>(y, x)[c];
        sum += val;
        squared_sum += (val * val);
      }
    }
  }

  // get standard deviation
  m = sum / (height * width * channel);
  s = sqrt(squared_sum / (height * width * channel) - m * m);

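The mean calculation in this part is fine, but when computing the standard deviation, how can the code simply sum the squares of all the RGB values, take their mean, and subtract m^2?
https://pdfs.semanticscholar.org/fee0/d0c91442d465f343af18b5fce8aeff594d02.pdf
According to the STANDARD DEVIATION calculation in the referenced article, the mean should be computed first, then the squared difference of each value from the mean, and finally the square root.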
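The two formulas are algebraically the same quantity, since mean((v - m)^2) = mean(v^2) - m^2, so the single-pass form in the quoted code is a legitimate way to compute the population standard deviation. The sketch below compares both forms on a few made-up pixel values. The function names are hypothetical, and float64 is used because the single-pass form can lose precision with narrow float types.

```python
import numpy as np


def std_from_sums(values):
    """Single pass: sqrt(mean of squares minus squared mean), as in the quoted C++."""
    v = np.asarray(values, dtype=np.float64)
    m = v.sum() / v.size
    return np.sqrt((v ** 2).sum() / v.size - m * m)


def std_from_deviations(values):
    """Two passes: mean first, then the mean squared deviation, then the square root."""
    v = np.asarray(values, dtype=np.float64)
    m = v.mean()
    return np.sqrt(((v - m) ** 2).mean())


pixels = [0, 12, 64, 128, 200, 255, 255, 31]   # made-up sample values
print(std_from_sums(pixels), std_from_deviations(pixels))
```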

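Formula problem in q27

The pixel-distance formula is problematic.

The a is easily confused with the a in the weight function below it.
If I understand correctly, ax/ay denote the scaling factors in the x/y directions, but the formula reads like a*x/a*y, especially in the rendered PDF.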

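Error in the README of the question31_40 folder

Using an affine transformation, output an image skewed by $30$ degrees along the $x$ axis as in (1) ($d_x=30$). This transformation is called X-sharing.
Using an affine transformation, output an image skewed by $30$ degrees along the $y$ axis as in (2) ($d_y=30$). This transformation is called Y-sharing.
Using an affine transformation, output an image skewed by $30$ degrees along both the $x$ and $y$ axes as in (3) ($d_x = 30$, $d_y = 30$).

In the two formulas below these as well, t_x and t_y should both be replaced with d_x and d_y.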

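The Canny implemented by the author is: p_low belongs to an edge only when there is a p_high among its neighbors.

But the actual algorithm should be: p_low belongs to an edge when it is connected to some p_high through a chain of p_low pixels (so it yields more edge points than the author's algorithm). This should probably be implemented with DFS.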
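The connected-chain rule described above can be sketched with an explicit stack (an iterative DFS). This is only an illustration under stated assumptions: `strength` is a 2-D array of edge magnitudes after non-maximum suppression, 8-connectivity is used, and the function name and thresholds are hypothetical.

```python
import numpy as np


def hysteresis(strength, low, high):
    """Keep strong pixels plus any weak pixel reachable from one through weak pixels."""
    H, W = strength.shape
    strong = strength >= high
    candidate = strength >= low              # weak or strong
    edges = strong.copy()
    stack = list(zip(*np.nonzero(strong)))   # start the search from every strong pixel
    while stack:
        y, x = stack.pop()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W and candidate[ny, nx] and not edges[ny, nx]:
                    edges[ny, nx] = True     # weak pixel connected to a strong one
                    stack.append((ny, nx))
    return edges


row = np.array([[100, 40, 40, 40, 0, 40]], dtype=np.float32)
print(hysteresis(row, low=30, high=90).tolist())
```

In the sample row, the third and fourth pixels are not adjacent to the strong pixel but are still kept because they connect to it through other weak pixels, whereas the last weak pixel is isolated and dropped.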

def otsu_binarization(img, th=128):
    max_sigma = 0
    max_t = 0
    
    # determine threshold
    for _t in range(1, 255):
        v0 = out[np.where(out < _t)]
        m0 = np.mean(v0) if len(v0) > 0 else 0.
        w0 = len(v0) / (H * W)
        v1 = out[np.where(out >= _t)]
        m1 = np.mean(v1) if len(v1) > 0 else 0.
        w1 = len(v1) / (H * W)
        sigma = w0 * w1 * ((m0 - m1) ** 2)
        if sigma > max_sigma:
            max_sigma = sigma
            max_t = _t

    # Binarization
    print("threshold =", max_t)
    th = max_t
    out[out < th] = 0
    out[out >= th] = 255

    return out

The out in the author's v0 = out[np.where(out < _t)] probably refers to the function argument img. I do not quite understand it.
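A self-contained sketch of the same search, assuming the lines missing from the quote were a shape lookup and a working copy of a grayscale input (`H, W = img.shape` and `out = img.copy()`). That reading is an assumption, not something confirmed by the quoted code.

```python
import numpy as np


def otsu_binarization(img):
    """Return (threshold, binary image) for a 2-D grayscale uint8 array."""
    H, W = img.shape          # assumed missing line
    out = img.copy()          # assumed missing line: `out` is a working copy of `img`
    max_sigma = 0.0
    max_t = 0

    # Try every threshold and keep the one with the largest between-class variance.
    for _t in range(1, 255):
        v0 = out[out < _t]
        v1 = out[out >= _t]
        w0 = len(v0) / (H * W)
        w1 = len(v1) / (H * W)
        m0 = v0.mean() if len(v0) > 0 else 0.0
        m1 = v1.mean() if len(v1) > 0 else 0.0
        sigma = w0 * w1 * ((m0 - m1) ** 2)
        if sigma > max_sigma:
            max_sigma = sigma
            max_t = _t

    # Binarize using the original values so the two assignments cannot interfere.
    out[img < max_t] = 0
    out[img >= max_t] = 255
    return max_t, out


gray = np.array([[50, 50], [200, 200]], dtype=np.uint8)
th, binary = otsu_binarization(gray)
print(th, binary.tolist())
```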

Question about the code

In the file https://github.com/gzr2017/ImageProcessing100Wen/blob/master/Question_01_10/answers_cpp/answer_9.cpp

for (int y = 0; y < height; y++){
    for (int x = 0; x < width; x++){
      for (int c = 0; c < channel; c++){

      v = 0;

      for (int dy = -pad; dy < pad + 1; dy++){
        for (int dx = -pad; dx < pad + 1; dx++){
          if (((x + dx) >= 0) && ((y + dy) >= 0)){
            v += (double)img.at<cv::Vec3b>(y + dy, x + dx)[c] * kernel[dy + pad][dx + pad];
          }
        }
      }
      out.at<cv::Vec3b>(y, x)[c] = v;
      }
    }
  }

Isn't there a problem with this condition:

if (((x + dx) >= 0) && ((y + dy) >= 0))

?
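The quoted condition checks only the lower bounds, while in the loop y + dy can reach height and x + dx can reach width near the bottom and right borders. A Python sketch of the same filter loop with both bounds checked is given below. It assumes a single-channel image and treats out-of-range pixels as zero, and the function name is hypothetical.

```python
import numpy as np


def filter_with_bounds_check(img, kernel):
    """Apply a square kernel, skipping neighbors that fall outside the image."""
    H, W = img.shape
    pad = kernel.shape[0] // 2
    out = np.zeros((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            v = 0.0
            for dy in range(-pad, pad + 1):
                for dx in range(-pad, pad + 1):
                    yy, xx = y + dy, x + dx
                    # Check the upper bounds as well as the lower ones.
                    if 0 <= yy < H and 0 <= xx < W:
                        v += img[yy, xx] * kernel[dy + pad, dx + pad]
            out[y, x] = v
    return out


ones = np.ones((4, 4))
mean_kernel = np.ones((3, 3)) / 9.0
result = filter_with_bounds_check(ones, mean_kernel)
print(result[0, 0], result[1, 1], result[3, 3])
```

With this symmetric check, the top-left and bottom-right corners are treated the same way: each sees four valid neighbors out of nine.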

It should be changed to (b-a)*(val-c)/(d-c)+a
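The issue gives no context for this expression. Assuming it refers to a linear gray-level mapping that stretches the input range [c, d] onto an output range [a, b] (as in histogram normalization), it can be sketched as follows. The function name, the default range, and the choice of c and d as the image minimum and maximum are assumptions made for this example.

```python
import numpy as np


def normalize_range(img, a=0.0, b=255.0):
    """Linearly map pixel values from [c, d] = [min, max] onto [a, b]."""
    vals = np.asarray(img, dtype=np.float64)
    c, d = vals.min(), vals.max()
    if d == c:                       # flat image: avoid division by zero
        return np.full_like(vals, a)
    return (b - a) * (vals - c) / (d - c) + a


stretched = normalize_range(np.array([50, 100, 150]))
print(stretched.tolist())
```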
