Comments (4)
I'd suggest filtering for the red pixels, finding the bounding box of those red pixels, and cropping it out.
A simple way to do this is:
Pix *pix1 = pixRead(<your input image>);
Pix *pix2 = pixMaskOverColorPixels(pix1, 50, 1); // filter for red pixels
Box *box1;
pixClipToForeground(pix2, NULL, &box1); // get bounding box of red pixels
Pix *pix3 = pixClipRectangle(pix1, box1, NULL); // extract that region from input image
from leptonica.
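To make the filter-then-bounding-box step concrete, here is a minimal plain-C sketch that does not use Leptonica. The `Rgb` and `Rect` types and both function names are hypothetical, made up for this illustration. The color test (largest minus smallest component exceeds a threshold) is an assumption of the sketch and may differ from what pixMaskOverColorPixels actually does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for this sketch; these are not Leptonica's Pix or Box. */
typedef struct { unsigned char r, g, b; } Rgb;
typedef struct { int x, y, w, h; } Rect;

/* Assumption: a pixel counts as "colored" when its largest and smallest
 * components differ by more than threshdiff. Gray pixels fail this test. */
static bool is_color_pixel(Rgb p, int threshdiff)
{
    int maxc = p.r, minc = p.r;
    if (p.g > maxc) maxc = p.g;
    if (p.b > maxc) maxc = p.b;
    if (p.g < minc) minc = p.g;
    if (p.b < minc) minc = p.b;
    return (maxc - minc) > threshdiff;
}

/* Scan a row-major w x h image and store the bounding box of all colored
 * pixels in *box. Returns false (and leaves *box untouched) if none exist. */
static bool color_bounding_box(const Rgb *img, int w, int h,
                               int threshdiff, Rect *box)
{
    int xmin = w, ymin = h, xmax = -1, ymax = -1;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            if (!is_color_pixel(img[(size_t)y * w + x], threshdiff))
                continue;
            if (x < xmin) xmin = x;
            if (x > xmax) xmax = x;
            if (y < ymin) ymin = y;
            if (y > ymax) ymax = y;
        }
    }
    if (xmax < 0)
        return false;
    box->x = xmin;
    box->y = ymin;
    box->w = xmax - xmin + 1;
    box->h = ymax - ymin + 1;
    return true;
}
```

Cropping is then just copying the rows and columns inside the returned rectangle out of the original image.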
@DanBloomberg
How can I remove the noise in the picture above? I have tried other methods to remove it.
The image is grayscale. The noise is small but black, so thresholding will not remove it.
Because the noise is small, you might consider morphological filtering. That is a viable approach. However, because the text has thin regions, it would need to be done carefully, with reconstructive seed-filling into connected components after removal of the noise. See prog/italic_reg.c for how this is done.
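As a rough illustration of the erode-then-reconstruct idea, here is a plain-C sketch that does not use Leptonica and is not the code in prog/italic_reg.c. The function names, the 3x3 structuring element, and the byte-per-pixel image layout are all assumptions chosen for the example. A pixel that survives the erosion acts as a seed, and the seed is flood-filled back into its whole connected component in the original image, so thin parts attached to a thick stroke are restored.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the full 3x3 neighborhood of (x, y) is foreground.
 * Pixels outside the image count as background. */
static int survives_erosion(const unsigned char *img, int w, int h,
                            int x, int y)
{
    for (int dy = -1; dy <= 1; dy++) {
        for (int dx = -1; dx <= 1; dx++) {
            int nx = x + dx, ny = y + dy;
            if (nx < 0 || ny < 0 || nx >= w || ny >= h) return 0;
            if (!img[ny * w + nx]) return 0;
        }
    }
    return 1;
}

/* Copies into dst every 8-connected component of src that contains at
 * least one pixel surviving a 3x3 erosion. src and dst are row-major
 * w x h arrays with 1 = foreground. Returns 0 on success, -1 if the
 * work stack cannot be allocated. */
static int open_and_reconstruct(const unsigned char *src,
                                unsigned char *dst, int w, int h)
{
    size_t n = (size_t)w * h;
    if (n == 0) return 0;
    memset(dst, 0, n);
    int *stack = malloc(n * sizeof *stack);
    if (!stack) return -1;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int s = y * w + x;
            if (dst[s] || !survives_erosion(src, w, h, x, y))
                continue;
            /* Seed found: flood-fill its component, constrained by src. */
            int top = 0;
            dst[s] = 1;
            stack[top++] = s;
            while (top > 0) {
                int p = stack[--top];
                int px = p % w, py = p / w;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int nx = px + dx, ny = py + dy;
                        if (nx < 0 || ny < 0 || nx >= w || ny >= h)
                            continue;
                        int q = ny * w + nx;
                        if (src[q] && !dst[q]) {
                            dst[q] = 1;
                            stack[top++] = q;
                        }
                    }
                }
            }
        }
    }
    free(stack);
    return 0;
}
```

The caveat about thin regions shows up directly here: a character made only of strokes 1 or 2 pixels wide has no pixel that survives the 3x3 erosion, so it would be dropped along with the noise.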
Another approach is to locate the connected components and filter out the small ones. This is simple.
Pix *pix4 = pixConvertTo1(pix3, 128); // convert to 1 bpp
Pix *pix5 = pixSelectBySize(pix4, 2, 2, 8, L_SELECT_IF_EITHER, L_SELECT_IF_GT, NULL); // remove small noise
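For readers who want to see what size filtering of connected components amounts to, here is a plain-C sketch that does not use Leptonica. The function name and the byte-per-pixel layout are hypothetical. It keeps a component if its bounding box is wider than maxw or taller than maxh, which is intended to match the keep-if-either-dimension-is-greater selection used above; it is not Leptonica's implementation.

```c
#include <stdlib.h>

/* Removes every 8-connected foreground component whose bounding box is
 * at most maxw wide AND at most maxh tall. img is a row-major w x h
 * array with 1 = foreground, modified in place. Returns the number of
 * components removed, or -1 on allocation failure. */
static int remove_small_components(unsigned char *img, int w, int h,
                                   int maxw, int maxh)
{
    size_t n = (size_t)w * h;
    if (n == 0) return 0;
    unsigned char *seen = calloc(n, 1);
    int *stack = malloc(n * sizeof *stack);   /* pixels still to visit */
    int *comp = malloc(n * sizeof *comp);     /* pixels of current component */
    int removed = 0;
    if (!seen || !stack || !comp) {
        free(seen); free(stack); free(comp);
        return -1;
    }

    for (int start = 0; start < (int)n; start++) {
        if (!img[start] || seen[start]) continue;
        int top = 0, count = 0;
        int xmin = w, xmax = -1, ymin = h, ymax = -1;
        seen[start] = 1;
        stack[top++] = start;
        while (top > 0) {
            int p = stack[--top];
            int x = p % w, y = p / w;
            comp[count++] = p;
            if (x < xmin) xmin = x;
            if (x > xmax) xmax = x;
            if (y < ymin) ymin = y;
            if (y > ymax) ymax = y;
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                    int q = ny * w + nx;
                    if (img[q] && !seen[q]) {
                        seen[q] = 1;
                        stack[top++] = q;
                    }
                }
            }
        }
        /* Small in both dimensions: erase the whole component. */
        if (xmax - xmin + 1 <= maxw && ymax - ymin + 1 <= maxh) {
            for (int i = 0; i < count; i++) img[comp[i]] = 0;
            removed++;
        }
    }
    free(seen); free(stack); free(comp);
    return removed;
}
```

With maxw = maxh = 2, an isolated speck is erased while a 1-pixel-wide stroke that is 4 pixels tall is kept, because its height exceeds the limit.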
If you want to also remove the outline from the original red box, do this:
Pix *pix4 = pixConvertTo1(pix3, 128); // convert to 1 bpp
pixSetOrClearBorder(pix4, 3, 3, 3, 3, PIX_CLR); // clear the pixels within 3 from the border
Pix *pix5 = pixSelectBySize(pix4, 2, 2, 8, L_SELECT_IF_EITHER, L_SELECT_IF_GT, NULL); // remove small noise
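The border-clearing step is simple enough to show in a few lines of plain C. This sketch does not use Leptonica; the function name and the single border width n (rather than four separate widths) are assumptions made for the example.

```c
#include <string.h>

/* Sets to 0 every pixel within n pixels of any edge of a row-major
 * w x h binary image (1 = foreground). The image is modified in place. */
static void clear_border(unsigned char *img, int w, int h, int n)
{
    if (n <= 0) return;
    int k = n < w ? n : w;   /* band width, clamped to the image width */
    for (int y = 0; y < h; y++) {
        unsigned char *row = img + (size_t)y * w;
        if (y < n || y >= h - n) {
            memset(row, 0, (size_t)w);          /* top or bottom band */
        } else {
            memset(row, 0, (size_t)k);          /* left band */
            memset(row + w - k, 0, (size_t)k);  /* right band */
        }
    }
}
```

Clearing the border before the size filter matters because the leftover outline of the box is one large connected component, which a filter that only removes small components would keep.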
Putting it all together:
Pix *pix1 = pixRead(<your input image>);
Pix *pix2 = pixMaskOverColorPixels(pix1, 50, 1); // filter for red pixels
Box *box1;
pixClipToForeground(pix2, NULL, &box1); // get bounding box of red pixels
Pix *pix3 = pixClipRectangle(pix1, box1, NULL); // extract that region from input image
Pix *pix4 = pixConvertTo1(pix3, 128); // convert from rgb to 1 bpp
pixSetOrClearBorder(pix4, 3, 3, 3, 3, PIX_CLR); // clear the pixels within 3 from the border
Pix *pix5 = pixSelectBySize(pix4, 2, 2, 8, L_SELECT_IF_EITHER, L_SELECT_IF_GT, NULL); // remove small noise
Note that the text is much thinner than in the original you show above. However, the thinning occurred when I saved your image from GitHub; it was not done by Leptonica.
Here's the result with the border from the red pixels removed: