
imageprocessing100wen's Issues

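About JPEG compression

JPEG image compression has four steps, in this order: color-space conversion and subsampling, the DCT, quantization, and encoding.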
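A minimal sketch of the two middle steps (DCT and quantization) on a single 8x8 block, in case it helps to see them concretely. The helper names and the single uniform quantization step `step` are assumptions made for illustration; real JPEG uses per-frequency quantization tables and follows this with entropy coding, which is not shown.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II basis as an n x n matrix (rows are frequencies)."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def quantize_block(block, step):
    """Level-shift, apply the 2-D DCT, then quantize with one uniform step (assumption)."""
    d = dct_matrix(block.shape[0])
    coeff = d @ (block - 128.0) @ d.T
    return np.round(coeff / step)


def dequantize_block(quant, step):
    """Undo the quantization scaling and apply the inverse 2-D DCT."""
    d = dct_matrix(quant.shape[0])
    return d.T @ (quant * step) @ d + 128.0


rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
restored = dequantize_block(quantize_block(block, 1.0), 1.0)
print(np.abs(restored - block).max())
```

A larger `step` zeroes more high-frequency coefficients, which is where the lossy part of the compression comes from.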

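Q25: error when the scale factor is 2 or greater

The error message is shown below. The cause is that the y coordinate held in y goes beyond the size of the original image (128*128).
image

Fix: the computation can produce a value such as 127.5, which round finally rounds up to 128. Changing the rounding from round to floor solves the problem.

  • Rounding (round):
    image

  • Rounding down (floor):
    image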
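The effect can be reproduced without an image. The sketch below maps destination indices back to source indices for nearest-neighbor scaling; `source_indices` is a hypothetical helper, and it assumes the answer computes the source position as destination index divided by the scale factor.

```python
import numpy as np


def source_indices(size, scale, rounding):
    """Map each destination index back to a source index (hypothetical helper)."""
    dst = np.arange(int(size * scale))
    return rounding(dst / scale).astype(int)


size, scale = 128, 2
rounded = source_indices(size, scale, np.round)
floored = source_indices(size, scale, np.floor)

# The last destination index is 255, and 255 / 2 = 127.5.
print(rounded.max())  # 128, one past the last valid index 127
print(floored.max())  # 127
```

Keeping round and clamping with `np.clip(idx, 0, size - 1)` would be another way to stay in bounds.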

answer_10.cpp: median index

// get and assign median
std::sort(vs, vs + (kernel_size * kernel_size));
out.at<cv::Vec3b>(y, x)[c] = (uchar)vs[int(floor(count / 2)) + 1];
Since this takes the median and indices start at 0, the + 1 should not be there, should it?
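A small check of the indexing, written in Python for brevity. It assumes an odd number of samples, as in a full 3x3 window; `median_of_window` is a name made up for this sketch.

```python
def median_of_window(values):
    """Median of an odd-length window using 0-based indexing."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]


window = [9, 1, 8, 2, 7, 3, 6, 4, 5]   # 9 samples, true median is 5
ordered = sorted(window)

print(median_of_window(window))         # 5
print(ordered[len(ordered) // 2 + 1])   # 6, the value picked with the extra + 1
```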

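q4 answer missing

The link to the q4 answer is broken (the Japanese version has it). Also, in the linear-filter code for the questions that follow (cpp), the kernel arrays are all sized dynamically, which most current VC compilers probably will not compile.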

Page not found.

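Hello, when I click the links under "Answers",
Answers
Python >> answers/answer_11.py
C++ >> answers_cpp/answer_11.cpp
the page that opens is "Page not found", so I have been opening the folders above to look at the answer files instead.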

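About permission to quote

Hello, I have written CSDN blog posts about this problem series that quote your translation, and I would like to ask for your permission.
Also, could you leave a contact (email or QQ) so that we can exchange ideas?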

Working example for answer28-31 (e.g. answer28)

import cv2
import numpy as np
import matplotlib.pyplot as plt


# Affine
def affine(img, a, b, c, d, tx, ty):
    H, W, C = img.shape

    # temporary image
    _img = np.zeros((H + 2, W + 2, C), dtype=np.float32)
    _img[1:H + 1, 1:W + 1] = img

    # get new image shape
    H_new = np.round(H * d).astype(int)
    W_new = np.round(W * a).astype(int)
    out = np.zeros((H_new + 1, W_new + 1, C), dtype=np.float32)

    # get position of new image
    x_new = np.tile(np.arange(W_new), (H_new, 1))
    y_new = np.arange(H_new).repeat(W_new).reshape(H_new, -1)

    # get position of original image by affine
    adbc = a * d - b * c
    x = np.round((d * x_new - b * y_new) / adbc).astype(int) - tx + 1
    y = np.round((-c * x_new + a * y_new) / adbc).astype(int) - ty + 1

    x = np.minimum(np.maximum(x, 0), W + 1).astype(int)
    y = np.minimum(np.maximum(y, 0), H + 1).astype(int)

    # assign pixels to new image
    out[y_new, x_new] = _img[y, x]

    out = out[:H_new, :W_new]
    out = out.astype(np.uint8)

    return out


# Read image
img = cv2.imread("../imori.jpg").astype(np.float32)

# Affine
out = affine(img, a=1, b=0, c=0, d=1, tx=30, ty=-30)

# Save result
cv2.imshow("result", out)
cv2.waitKey(0)
cv2.imwrite("out.jpg", out)

q81

The answer is missing.

Answer_50 has a problem

According to the description, the morphological operation dilates first and then erodes. But the code erodes before it dilates. The opening operation has the same problem.
Files: ImageProcessing100Wen/Question_41_50/answers_py/answer_49.py, answer_50.py
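For reference, a minimal NumPy sketch of the usual ordering: closing is dilation followed by erosion, opening is erosion followed by dilation. It assumes a boolean mask, a 3x3 square structuring element, and background outside the image; the repository's answers may use a different kernel and iteration count.

```python
import numpy as np


def _neighbourhood(mask, pad_value):
    """Return the nine 3x3-shifted views of a padded boolean mask."""
    h, w = mask.shape
    p = np.pad(mask, 1, mode="constant", constant_values=pad_value)
    return [p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]


def dilate(mask):
    # A pixel is set if any pixel in its 3x3 neighbourhood is set.
    return np.logical_or.reduce(_neighbourhood(mask, False))


def erode(mask):
    # A pixel stays set only if its whole 3x3 neighbourhood is set.
    return np.logical_and.reduce(_neighbourhood(mask, False))


def closing(mask):
    return erode(dilate(mask))   # dilate first, then erode


def opening(mask):
    return dilate(erode(mask))   # erode first, then dilate


holed = np.zeros((7, 7), dtype=bool)
holed[1:6, 1:6] = True
holed[3, 3] = False              # one-pixel hole
print(closing(holed)[3, 3])      # True: closing fills the hole
```

Swapping the order changes the result: erosion first removes small bright specks, dilation first fills small dark holes.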

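Forward and inverse Fourier transform in answer_32.cpp

line 46: val /= sqrt(height * width);
line 73: g = std::abs(val) / sqrt(height * width);
When I was working on this yesterday I found it strange that a wrong formula could give the right answer, and answer_32.cpp explains it: it takes the g = std::abs(val) / (height * width); of the inverse transform and splits it into a division by sqrt(height * width) in each of the forward and inverse transforms...😂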
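The two normalization conventions can be compared with NumPy's FFT, which supports both: the default puts 1 / (H * W) on the inverse only, while `norm="ortho"` puts 1 / sqrt(H * W) on each direction. This is a sketch on random data, not the code from answer_32.cpp.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((4, 6))
h, w = img.shape

# Convention A: no scaling forward, 1 / (H * W) on the inverse.
spec_a = np.fft.fft2(img)
back_a = np.fft.ifft2(spec_a)

# Convention B: 1 / sqrt(H * W) on both the forward and the inverse transform.
spec_b = np.fft.fft2(img, norm="ortho")
back_b = np.fft.ifft2(spec_b, norm="ortho")

print(np.allclose(back_a.real, img), np.allclose(back_b.real, img))
print(np.allclose(spec_a, spec_b * np.sqrt(h * w)))
```

Both round trips recover the image; only the scale of the intermediate spectrum differs, by a factor of sqrt(H * W).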

Standard deviation computation in answer_22.cpp

  // get sum
  for (int y = 0; y < height; y++){
    for (int x = 0; x < width; x++){
      for (int c = 0; c < 3; c++){
        val = (float)img.at<cv::Vec3b>(y, x)[c];
        sum += val;
        squared_sum += (val * val);
      }
    }
  }

  // get standard deviation
  m = sum / (height * width * channel);
  s = sqrt(squared_sum / (height * width * channel) - m * m);

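The mean computation in this part is correct. But for the standard deviation, how can it simply add up the squares of all the RGB values, take their mean, and subtract m^2?
https://pdfs.semanticscholar.org/fee0/d0c91442d465f343af18b5fce8aeff594d02.pdf
Following the STANDARD DEVIATION computation in the referenced article, it should first compute the mean, then for each value take its square minus the square of the mean, and then take the square root.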
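As a point of reference, the one-pass form mean(x^2) - m^2 and the two-pass form mean((x - m)^2) are algebraically the same population variance. The sketch below checks this numerically on random data in float64, which is an assumption; in single precision, as in the C++ code, the one-pass form can lose some accuracy through cancellation.

```python
import numpy as np

rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(16, 16, 3)).astype(np.float64)

m = pixels.mean()

# One pass: accumulate sum and squared sum, as in the quoted loop.
one_pass = np.sqrt((pixels ** 2).mean() - m ** 2)

# Two passes: compute the mean first, then average the squared deviations.
two_pass = np.sqrt(((pixels - m) ** 2).mean())

print(np.isclose(one_pass, two_pass), np.isclose(one_pass, pixels.std()))
```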

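Formula problem in q27

The formula for the distance between pixels has problems:

  • a is easily confused with the a in the weight function below it
  • If I understand correctly, ax/ay are the scale factors in the x/y directions, but the formula looks like a*x/a*y, especially in the rendered PDF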

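Errors in the readme in the question31_40 folder

  1. Use an affine transformation to output an image like (1), with the $x$ axis skewed by $30$ degrees ($d_x=30$). This transformation is called X-sharing.
  2. Use an affine transformation to output an image like (2), with the y axis skewed by $30$ degrees ($d_y=30$). This transformation is called Y-sharing.
  3. Use an affine transformation to output an image like (3), with both the $x$ axis and the $y$ axis skewed by $30$ degrees ($d_x = 30$, $d_y = 30$).

Also, in the two formulas below this list, t_x and t_y should both be replaced with d_x and d_y.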

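About question 7, average pooling

The input and output matrices of average pooling should not be the same size, so why does the code assign the averaged value to every pixel?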
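The two behaviors can be put side by side. This sketch assumes a single-channel image whose sides are divisible by the cell size `k`; both function names are made up for illustration. `average_pool` returns the smaller matrix, and `pool_and_fill` writes each cell's mean back to every pixel of the cell, which keeps the original size and is what the question describes the answer code as doing.

```python
import numpy as np


def average_pool(img, k):
    """Average over non-overlapping k x k cells; output is (H // k, W // k)."""
    h, w = img.shape
    return img.reshape(h // k, k, w // k, k).mean(axis=(1, 3))


def pool_and_fill(img, k):
    """Same averages, repeated over each cell so the output keeps the input size."""
    pooled = average_pool(img, k)
    return np.repeat(np.repeat(pooled, k, axis=0), k, axis=1)


img = np.arange(16, dtype=np.float64).reshape(4, 4)
print(average_pool(img, 2))         # 2 x 2 matrix of cell means
print(pool_and_fill(img, 2).shape)  # (4, 4)
```

The filled version is convenient for viewing the result at the original resolution; the reduced version is what a pooling layer in a network would output.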

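The last step of the Canny algorithm (low and high thresholds) looks wrong

First of all, thank you for the translation!

Define p_low as a pixel whose value is greater than the low threshold, and p_high as a pixel whose value is greater than the high threshold.

The Canny implemented here is: a p_low belongs to an edge only if there is a p_high among its neighbors.

But the actual algorithm should be: a p_low belongs to an edge if it is connected to some p_high through a chain of p_low pixels (so it produces more edge pixels than this implementation). That would presumably be implemented with DFS.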
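A minimal sketch of the connected version described above, using an explicit stack as the DFS. It assumes `mag` is the gradient magnitude after non-maximum suppression and uses 8-connectivity; the function name and signature are made up for this example.

```python
import numpy as np


def hysteresis(mag, low, high):
    """Keep weak pixels (> low) that are connected to a strong pixel (> high)."""
    strong = mag > high
    weak = mag > low
    edges = strong.copy()
    h, w = mag.shape
    stack = list(zip(*np.nonzero(strong)))   # start the search from every strong pixel
    while stack:
        y, x = stack.pop()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and weak[ny, nx] and not edges[ny, nx]:
                    edges[ny, nx] = True
                    stack.append((ny, nx))   # continue along the chain of weak pixels
    return edges


mag = np.zeros((3, 7))
mag[1] = [100, 50, 50, 50, 0, 50, 0]
print(hysteresis(mag, low=30, high=80)[1])
```

In the example row, the three weak pixels chained to the strong one are kept, while the isolated weak pixel at index 5 is dropped. A one-neighbor rule would keep only the weak pixel at index 1.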

answer_28.py can't run

I can't run this answer, even on Colaboratory.
Error below: img[1:H+1, 1:W+1] = _img
NameError: name '_img' is not defined

q26

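The two min() calls should subtract 2, otherwise the index goes out of bounds. Also, the formula from a certain expert's PR is wrong.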

About answer_4.py

def otsu_binarization(img, th=128):
    max_sigma = 0
    max_t = 0
    
    # determine threshold
    for _t in range(1, 255):
        v0 = out[np.where(out < _t)]
        m0 = np.mean(v0) if len(v0) > 0 else 0.
        w0 = len(v0) / (H * W)
        v1 = out[np.where(out >= _t)]
        m1 = np.mean(v1) if len(v1) > 0 else 0.
        w1 = len(v1) / (H * W)
        sigma = w0 * w1 * ((m0 - m1) ** 2)
        if sigma > max_sigma:
            max_sigma = sigma
            max_t = _t

    # Binarization
    print("threshold =", max_t)
    th = max_t
    out[out < th] = 0
    out[out >= th] = 255

    return out

The out in the author's v0 = out[np.where(out < _t)] probably refers to the img passed into the function. I don't quite understand it.
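One way the function could be written so that every name is defined, assuming the input is already a 2-D grayscale array. This is a sketch, not the repository's fix: it drops the unused `th` parameter and returns the threshold together with the binarized image.

```python
import numpy as np


def otsu_binarization(img):
    """Otsu's method on a 2-D grayscale array; returns (threshold, binary image)."""
    H, W = img.shape
    max_sigma = 0.0
    max_t = 0

    # Try every threshold and keep the one with the largest between-class variance.
    for _t in range(1, 255):
        v0 = img[img < _t]
        v1 = img[img >= _t]
        m0 = v0.mean() if len(v0) > 0 else 0.0
        m1 = v1.mean() if len(v1) > 0 else 0.0
        w0 = len(v0) / (H * W)
        w1 = len(v1) / (H * W)
        sigma = w0 * w1 * ((m0 - m1) ** 2)
        if sigma > max_sigma:
            max_sigma = sigma
            max_t = _t

    # Binarize into a new array so the input is left unchanged.
    out = np.where(img < max_t, 0, 255).astype(np.uint8)
    return max_t, out


gray = np.array([[50, 50, 200, 200]] * 4, dtype=np.float32)
th, binary = otsu_binarization(gray)
print("threshold =", th)
```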

Question about the code

In the file https://github.com/gzr2017/ImageProcessing100Wen/blob/master/Question_01_10/answers_cpp/answer_9.cpp

for (int y = 0; y < height; y++){
    for (int x = 0; x < width; x++){
      for (int c = 0; c < channel; c++){

      v = 0;

      for (int dy = -pad; dy < pad + 1; dy++){
        for (int dx = -pad; dx < pad + 1; dx++){
          if (((x + dx) >= 0) && ((y + dy) >= 0)){
            v += (double)img.at<cv::Vec3b>(y + dy, x + dx)[c] * kernel[dy + pad][dx + pad];
          }
        }
      }
      out.at<cv::Vec3b>(y, x)[c] = v;
      }
    }
  }

This check

if (((x + dx) >= 0) && ((y + dy) >= 0))

looks like it has a problem, doesn't it?
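The quoted condition only tests the lower bounds, so near the right and bottom edges x + dx and y + dy can reach width and height. A Python sketch of the same loop with both bounds checked, assuming a single-channel image and a square, odd-sized kernel (the function name is made up):

```python
import numpy as np


def filter2d_zero_border(img, kernel):
    """Apply a kernel, treating pixels outside the image as zero."""
    h, w = img.shape
    pad = kernel.shape[0] // 2
    out = np.zeros((h, w), dtype=np.float64)
    for y in range(h):
        for x in range(w):
            v = 0.0
            for dy in range(-pad, pad + 1):
                for dx in range(-pad, pad + 1):
                    yy, xx = y + dy, x + dx
                    # Check the upper bounds as well as the lower ones.
                    if 0 <= yy < h and 0 <= xx < w:
                        v += img[yy, xx] * kernel[dy + pad, dx + pad]
            out[y, x] = v
    return out


img = np.ones((4, 4))
kernel = np.ones((3, 3)) / 9.0
print(filter2d_zero_border(img, kernel))
```

With an all-ones image and a 3x3 mean kernel, interior pixels come out as 1 and all four corners as 4/9, so the borders are handled symmetrically.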

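A question about Q41

The fx and fy used to get the edge are in uint8 format. Could that cause overflow when the edge is computed from fx and fy, so that the edge values end up too small?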
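The wraparound is easy to reproduce in NumPy. This sketch assumes the edge strength is computed as sqrt(fx^2 + fy^2); the sample values 200 and 100 are arbitrary.

```python
import numpy as np

fx = np.array([200], dtype=np.uint8)
fy = np.array([100], dtype=np.uint8)

# uint8 arithmetic wraps modulo 256: 200**2 -> 64 and 100**2 -> 16.
wrapped = np.sqrt(fx ** 2 + fy ** 2)

# Converting to float first keeps the full range.
fx_f = fx.astype(np.float32)
fy_f = fy.astype(np.float32)
exact = np.sqrt(fx_f ** 2 + fy_f ** 2)

print(wrapped[0], exact[0])
```

The uint8 version gives a value below 10 where the float version gives about 223.6, so casting before the arithmetic avoids the undersized edge values.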

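answer_21.cpp has some errors

line 41:
out.at<cv::Vec3b>(y, x)[_c] = (uchar)((b - a) / (d - c) * (val - c) + a);

Computing (b - a) / (d - c) first loses precision and makes the result wrong: if b-a=255 and d-c=178, the slope comes out as 1, which is wrong.

It should be changed to (b-a)*(val-c)/(d-c)+a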
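The difference shows up with the numbers from the report. The sketch uses Python's `//` to mimic C++ integer division, which matches for non-negative operands, and assumes a, b, c, d and val are all integers; the concrete values of c, d and val are picked for illustration so that d - c = 178.

```python
a, b = 0, 255      # target range
c, d = 50, 228     # source range, d - c = 178
val = 228          # brightest input pixel, should map to b

# Dividing first truncates the slope 255 / 178 = 1.43... to 1.
divide_first = (b - a) // (d - c) * (val - c) + a

# Multiplying first keeps the precision until the final division.
multiply_first = (b - a) * (val - c) // (d - c) + a

print(divide_first, multiply_first)  # 178 255
```

Casting one operand to a floating-point type before the division would be another way to keep the slope.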
