Comments (21)
The issue is still open, but I'm pretty far along on local hardware. Will publish early next week & start the review process.
from libpng.
I'm taking a whack at this. @glennrp what is the recommended path to get external contributions into libpng?
from libpng.
You may submit a GIT pull request or just email me a patch.
from libpng.
Is the code in the process of being accepted?
Or is the issue still open?
from libpng.
@edelsohn
Should I implement runtime detection of VSX support in the CPU? If yes, please tell me which platforms must be supported (currently Linux is the only supported one).
from libpng.
Does libpng support runtime detection of SSE4 vs AVX vs AVX2 vs AVX512?
The intended target for the issue and bounty is PPC64 Linux support, not other OSes.
from libpng.
@edelsohn There is no runtime detection support for Intel, but there is for MIPS and ARM. Runtime detection for ppc64 on Linux has now been implemented, so I guess that is enough for this issue.
from libpng.
Does runtime support work for PPC64 on Linux? I'm unsure how to interpret your comment about "linux support for ppc64 has been made"
from libpng.
@edelsohn I mean runtime detection of "is the current CPU able to run Altivec && VSX code". Yes, it is done for PPC64 Linux.
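For readers unfamiliar with how such a check can work on Linux, here is a minimal sketch in Python. It is not the code from the patch: it assumes the kernel exposes the auxiliary vector at /proc/self/auxv and uses the `AT_HWCAP`, `PPC_FEATURE_HAS_ALTIVEC` and `PPC_FEATURE_HAS_VSX` values from the Linux headers. The function names `read_hwcap` and `has_altivec_and_vsx` are made up for illustration, and the feature bits only mean something on a PowerPC machine.

```python
import platform
import struct

# Constants as defined in the Linux headers (<elf.h> and the powerpc
# <asm/cputable.h>). The feature bits are only meaningful on PowerPC.
AT_HWCAP = 16
PPC_FEATURE_HAS_ALTIVEC = 0x10000000
PPC_FEATURE_HAS_VSX = 0x00000080


def read_hwcap(path="/proc/self/auxv"):
    """Return the AT_HWCAP word from the auxiliary vector, or 0 if unavailable."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError:
        return 0
    # The auxiliary vector is a sequence of (type, value) pairs of native
    # unsigned long; ignore any trailing partial entry.
    entry = struct.Struct("@LL")
    usable = len(data) - len(data) % entry.size
    for key, value in entry.iter_unpack(data[:usable]):
        if key == AT_HWCAP:
            return value
    return 0


def has_altivec_and_vsx(hwcap):
    """True only if both the Altivec and the VSX bits are set in hwcap."""
    required = PPC_FEATURE_HAS_ALTIVEC | PPC_FEATURE_HAS_VSX
    return (hwcap & required) == required


if __name__ == "__main__":
    if platform.machine().startswith("ppc"):
        print("Altivec && VSX usable:", has_altivec_and_vsx(read_hwcap()))
    else:
        print("Not a PowerPC machine; the hwcap bits mean something else here.")
```

A C library would more likely call `getauxval(AT_HWCAP)` from `<sys/auxv.h>` and test the same two bits; the bit test itself is the same.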
from libpng.
@barkovv @glennrp What is the final speed up and how does this compare with other SIMD architectures of similar width, such as ARM Neon and Intel SSE?
from libpng.
@edelsohn According to John Bowler's words:
- Analysis of the results suggests a 1/74% speed up from using the Altivec code; this is suspiciously large.
- Final results: about 8% improvement in (just) decode time of typical PNGs
from libpng.
I saw that in the pull request.
What is the speedup for other SIMD architectures for the same measurements? Is the POWER VSX code achieving equivalent or better speedup?
8% improvement for decode seems small, but I don't know what to compare it against.
from libpng.
@edelsohn
I've just run tests on Intel. Here are some quick results:
libpng-noopt $ ./timepng ../Earth10k.png
1.519951620
libpng-opt $ ./timepng ../Earth10k.png
1.192934654
Calculations:
>>> 1.192934654/1.509859932
0.79009623920532
So, according to timepng benchmarking, the PowerPC VSX optimisation is on par with Intel SSE (optimized run time is about 74% and 79% of the unoptimized run time, respectively).
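As a sanity check on how these percentages relate to "speed up", here is a small Python sketch. The helper names `time_ratio`, `speedup` and `percent_time_saved` are hypothetical. Note that the denominator used in the calculation above (1.509859932) differs slightly from the printed no-opt time (1.519951620), presumably because it came from a different run; the sketch evaluates both without picking one.

```python
def time_ratio(optimized_s, baseline_s):
    """Fraction of the baseline run time that the optimized build needs."""
    return optimized_s / baseline_s


def speedup(optimized_s, baseline_s):
    """How many times faster the optimized build is than the baseline."""
    return baseline_s / optimized_s


def percent_time_saved(optimized_s, baseline_s):
    """Run time saved by the optimized build, as a percentage of the baseline."""
    return 100.0 * (1.0 - time_ratio(optimized_s, baseline_s))


if __name__ == "__main__":
    optimized = 1.192934654
    # Both baseline values appear in the comment above; which one is the
    # intended measurement is not stated.
    for baseline in (1.519951620, 1.509859932):
        print(
            f"baseline={baseline:.9f}  "
            f"ratio={time_ratio(optimized, baseline):.4f}  "
            f"speedup={speedup(optimized, baseline):.3f}x  "
            f"saved={percent_time_saved(optimized, baseline):.1f}%"
        )
```

Either way the ratio lands near 0.79, so the conclusion does not depend on which baseline is used. A ratio of 0.74 corresponds to a speedup factor of 1/0.74, which is about 1.35x.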
from libpng.
I can measure these results in more depth and with more accuracy if @jbowler provides some more information about his measurement methods and scripts.
from libpng.
I'm running Earth10k.png through pngcrush for timing. It's not surprising that optimization will make a significant improvement because the image has mostly AVG and PAETH filtering.
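For context on why these two filter types matter, here is a minimal scalar sketch in Python of how a decoder reverses the Average and Paeth filters, following the definitions in the PNG specification. It is not libpng's code and it is not vectorized; the function names and the `bpp` (bytes per pixel) parameter are illustrative. Each output byte depends on neighbouring reconstructed bytes, and this per-byte arithmetic is the kind of work that SIMD filter code targets.

```python
def paeth_predictor(a, b, c):
    """Paeth predictor from the PNG spec: a = left, b = above, c = upper-left."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def unfilter_avg(filtered, prior, bpp):
    """Reverse the Average filter for one scanline (prior = previous row)."""
    out = bytearray(len(filtered))
    for i, x in enumerate(filtered):
        a = out[i - bpp] if i >= bpp else 0  # reconstructed byte to the left
        b = prior[i]                         # reconstructed byte above
        out[i] = (x + ((a + b) >> 1)) & 0xFF
    return bytes(out)


def unfilter_paeth(filtered, prior, bpp):
    """Reverse the Paeth filter for one scanline (prior = previous row)."""
    out = bytearray(len(filtered))
    for i, x in enumerate(filtered):
        a = out[i - bpp] if i >= bpp else 0
        b = prior[i]
        c = prior[i - bpp] if i >= bpp else 0  # byte above and to the left
        out[i] = (x + paeth_predictor(a, b, c)) & 0xFF
    return bytes(out)


if __name__ == "__main__":
    previous_row = bytes([0, 0, 0, 0])
    print(list(unfilter_avg(bytes([10, 10, 10, 10]), previous_row, 1)))
    print(list(unfilter_paeth(bytes([1, 1, 1, 1]), previous_row, 1)))
```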
from libpng.
I'm not questioning the result. I greatly appreciate the excellent work on the patch to implement POWER8 VSX optimizations. The milestone for this issue is not some perfect, unrealistic, magical improvement in performance nor some super-human effort to achieve an order of magnitude advantage. The goal for the issue is the implementation of architecture-specific SIMD optimizations for POWER8 VSX that are equivalent in form to Intel SSE, with the hope of achieving equivalent speedup.
I want to reach out to the experts for this important library to get their assessment of the benefit produced by the patch. Is the improvement produced by this patch about right?
from libpng.
@edelsohn
I don't understand you. My point is that the PowerPC VSX optimizations are equivalent to the Intel SSE ones.
Do I need to prove it by profiling? Maybe you want @glennrp and @jbowler to approve it?
What actions must be performed to approve this work?
from libpng.
@barkovv I want to know if @glennrp agrees with the test methodology. Again, I am not trying to create new hurdles. I simply want to be able to point at an objective test recommended by the libpng community that says, "yes, this is good enough."
from libpng.