qatzip's Introduction

Intel® QuickAssist Technology (QAT) QATzip Library

Introduction

QATzip is a user space library which builds on top of the Intel® QuickAssist Technology user space library, to provide extended accelerated compression and decompression services by offloading the actual compression and decompression request(s) to the Intel® Chipset Series. QATzip produces data using the standard gzip* format (RFC1952) with extended headers or lz4* blocks with lz4* frame format. The data can be decompressed with a compliant gzip* or lz4* implementation. QATzip is designed to take full advantage of the performance provided by Intel® QuickAssist Technology.

The currently supported formats include:

| Data Format | Algorithm | QAT device | Description |
|---|---|---|---|
| QZ_DEFLATE_4B | deflate* | QAT 1.x and QAT 2.0 | Data is in DEFLATE* with a 4 byte header |
| QZ_DEFLATE_GZIP | deflate* | QAT 1.x and QAT 2.0 | Data is in DEFLATE* wrapped by Gzip* header and footer |
| QZ_DEFLATE_GZIP_EXT | deflate* | QAT 1.x and QAT 2.0 | Data is in DEFLATE* wrapped by Intel® QAT Gzip* extension header and footer |
| QZ_DEFLATE_RAW | deflate* | QAT 1.x and QAT 2.0 | Data is in raw DEFLATE* without any additional header. (Only compression is supported; decompression falls back to software.) |
| QZ_LZ4 | lz4* | QAT 2.0 | Data is in LZ4* wrapped by lz4* frame |
| QZ_LZ4S | lz4s* | QAT 2.0 | Data is in LZ4S* blocks |
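
As a rough illustration of the table above, the device column can be encoded as a small lookup. This is only a sketch: the `Format` and `QatGen` enums and the `hw_lists_format()` helper are hypothetical names made up for this example and are not part of the QATzip API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enums mirroring the rows and device generations in the table. */
typedef enum {
    FMT_DEFLATE_4B,
    FMT_DEFLATE_GZIP,
    FMT_DEFLATE_GZIP_EXT,
    FMT_DEFLATE_RAW,
    FMT_LZ4,
    FMT_LZ4S,
    FMT_COUNT
} Format;

typedef enum { QAT_GEN_1X, QAT_GEN_2_0 } QatGen;

/* Returns true when the table lists the format for the given generation.
 * The deflate-based formats are listed for both generations; lz4 and lz4s
 * are listed for QAT 2.0 only. */
static bool hw_lists_format(QatGen gen, Format fmt)
{
    assert(fmt < FMT_COUNT);
    if (fmt == FMT_LZ4 || fmt == FMT_LZ4S) {
        return gen == QAT_GEN_2_0;
    }
    return true;
}
```

A caller could use such a check to decide up front whether a requested format can be offloaded on the installed device.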

Licensing

The Licensing of the files within this project is split as follows:

Intel® QuickAssist Technology (QAT) QATzip - BSD License. Please see the LICENSE file contained in the top level folder. Further details can be found in the file headers of the relevant files.

Example Intel® QuickAssist Technology Driver Configuration Files contained within the folder hierarchy config_file - Dual BSD/GPLv2 License. Please see the file headers of the configuration files, and the full GPLv2 license contained in the file LICENSE.GPL within the config_file folder.

Features

  • Acceleration of compression and decompression utilizing Intel® QuickAssist Technology, including a utility to compress and decompress files.

  • Dynamic memory allocation for zero copy: qzMalloc() and qzFree() are exposed so that working buffers can be pinned, contiguous buffers usable for DMA operations to and from the hardware.

  • Instance over-subscription, allowing a number of threads in the same process to seamlessly share a smaller number of hardware instances.

  • Memory allocation backed by huge pages and kernel memory to provide access to pinned, contiguous memory. Memory is allocated from huge pages when kernel memory is under contention.

  • Configurable accelerator device sharing among processes.

  • Optional software failover for both compression and decompression services. QATzip may switch to software if there are insufficient system resources, including acceleration instances or memory. This feature allows for a common software stack between server platforms that have acceleration devices and non-accelerated platforms.

  • Streaming interface for compression and decompression, to achieve a better compression ratio and throughput for data sets that are submitted piecemeal.

  • 'qzip' utility supports compression from regular files, pipelines and block devices.

  • For the QATzip GZIP* format, hardware decompression is tried first before switching to software decompression.

  • Adaptive polling mechanism to save CPU usage in stress mode.

  • 'qzip' utility supports compressing files and directories into 7z format.

  • Support for the QATzip Gzip* format, which includes a 10-byte header and an 8-byte footer:

    | ID1 (1B) | ID2(0x8B) (1B) | Compression Method (8 = DEFLATE*) (1B) | Flags (1B) | Modification Time (4B) | Extra Flags (1B) | OS (1B) | Deflate Block| CRC32(4B)| ISIZE(4B)|

  • Support for the QATzip Gzip* extended format. This consists of the standard 10-byte Gzip* header and follows RFC 1952 to extend the header by an additional 14 bytes. The extended header structure is shown below:

    | Length of ext. header (2B) | SI1('Q') (1B) | SI2('Z') (1B) | Length of subheader (2B) | Intel(R) defined field 'Chunksize' (4B) | Intel(R) defined field 'Blocksize' (4B) |

  • Support for the Intel® QATzip 4-byte header. The header indicates the length of the compressed block that follows it.

    | Intel(R) defined Header (4B)|deflate* block|

  • Support QATzip lz4* format. This format is structured as follows:

    | MagicNb(4B) |FLG(1B)|BD(1B)| CS(8B)|HC(1B)| |lz4* Block | EndMark(4B)|
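
The Gzip* extended header layout described above can be read back with a few lines of C. The following is a minimal sketch, not QATzip's own parser: `QzExtInfo` and `parse_qz_ext_header()` are hypothetical names, and the sketch assumes details taken from RFC 1952 rather than from this document (ID1 is 0x1f, the FEXTRA flag is bit 2 of Flags, multi-byte fields are little-endian, and the 2-byte length fields count only the bytes that follow them, giving 12 and 8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type holding the two Intel-defined fields. */
typedef struct {
    uint32_t chunk_size;  /* 'Chunksize' field */
    uint32_t block_size;  /* 'Blocksize' field */
} QzExtInfo;

/* Read little-endian integers byte by byte, independent of host order. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills *out when buf starts with a 10-byte gzip header
 * followed by the 14-byte 'QZ' extension; returns -1 otherwise. */
static int parse_qz_ext_header(const uint8_t *buf, size_t len, QzExtInfo *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 24) return -1;                          /* 10 + 14 bytes */
    if (buf[0] != 0x1f || buf[1] != 0x8b) return -1;  /* ID1, ID2 */
    if (buf[2] != 8) return -1;                       /* 8 = DEFLATE */
    if ((buf[3] & 0x04) == 0) return -1;              /* FEXTRA not set */

    const uint8_t *ext = buf + 10;
    if (read_le16(ext) != 12) return -1;              /* assumed XLEN */
    if (ext[2] != 'Q' || ext[3] != 'Z') return -1;    /* SI1, SI2 */
    if (read_le16(ext + 4) != 8) return -1;           /* two 4-byte fields */

    out->chunk_size = read_le32(ext + 6);
    out->block_size = read_le32(ext + 10);
    return 0;
}
```

Reading the fields byte by byte avoids any dependence on struct packing or host byte order, which is why the sketch does not overlay a C struct on the buffer.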

Hardware Requirements

This QATzip library supports compression and decompression offload to the following acceleration devices:

Software Requirements

This release was validated on the following:

  • QATzip has been tested with the latest Intel® QuickAssist Acceleration Driver. Please download the QAT driver from the link Intel® QuickAssist Technology
  • QATzip has been tested by Intel® on CentOS* 7.8.2003 with kernel 3.10.0-1127.19.1.el7.x86_64
  • Zlib* library version 1.2.7 or higher
  • GCC* version 4.8.5 or higher is suggested
  • lz4* library version 1.8.3 or higher
  • zstd* library version 1.5.0 or higher

Additional Information

  • For QAT 1.x, the compression level in QATzip can be mapped to standard zlib* as below:

    • QATzip levels 1 - 4 are similar to zlib* levels 1 - 4.
    • QATzip levels 5 - 8 are mapped to QATzip level 4.
    • QATzip level 9 uses software zlib* to compress at level 9.
  • For QAT 2.0, the compression level in QATzip can be mapped to standard zlib* or lz4* as below:

    • Will be updated in future releases.
  • QATzip Compression Level Mapping:

    | QATzip Level | QAT Level | QAT 2.0 (deflate*, LZ4*, LZ4s*) | QAT 1.7/1.8 (Deflate*) |
    |---|---|---|---|
    | 1 | CPA_DC_L1 | 2 (HW_L1) | DEPTH_1 |
    | 2 | CPA_DC_L2 | 2 (HW_L1) | DEPTH_4 |
    | 3 | CPA_DC_L3 | 2 (HW_L1) | DEPTH_8 |
    | 4 | CPA_DC_L4 | 2 (HW_L1) | DEPTH_16 |
    | 5 | CPA_DC_L5 | 2 (HW_L1) | DEPTH_16 |
    | 6 | CPA_DC_L6 | 8 (HW_L6) | DEPTH_16 |
    | 7 | CPA_DC_L7 | 8 (HW_L6) | DEPTH_16 |
    | 8 | CPA_DC_L8 | 8 (HW_L6) | DEPTH_16 |
    | 9 | CPA_DC_L9 | 16 (HW_L9) | DEPTH_16 |
    | 10 | CPA_DC_L10 | 16 (HW_L9) | Unsupported |
    | 11 | CPA_DC_L11 | 16 (HW_L9) | Unsupported |
    | 12 | CPA_DC_L12 | 16 (HW_L9) | Unsupported |
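
The level mapping table can be expressed as two lookup arrays. This is an illustrative sketch only: the array and function names are hypothetical, the values are copied from the table rather than from the QATzip sources, and -1 stands in for "Unsupported" or an out-of-range level.

```c
#include <assert.h>

#define QZ_MAX_LEVEL 12
#define UNSUPPORTED  (-1)

/* Index i holds the value for QATzip level i + 1, copied from the table. */
static const int QAT20_HW_LEVEL[] = {2, 2, 2, 2, 2, 8, 8, 8, 16, 16, 16, 16};
static const int QAT17_DEPTH[]    = {1, 4, 8, 16, 16, 16, 16, 16, 16,
                                     UNSUPPORTED, UNSUPPORTED, UNSUPPORTED};

/* Compile-time guard: each array must have one entry per QATzip level. */
static_assert(sizeof QAT20_HW_LEVEL / sizeof QAT20_HW_LEVEL[0] == QZ_MAX_LEVEL,
              "one QAT 2.0 entry per QATzip level");
static_assert(sizeof QAT17_DEPTH / sizeof QAT17_DEPTH[0] == QZ_MAX_LEVEL,
              "one QAT 1.7/1.8 entry per QATzip level");

/* QAT 2.0 column: returns 2, 8 or 16, or UNSUPPORTED when out of range. */
static int qat20_hw_level(int qz_level)
{
    if (qz_level < 1 || qz_level > QZ_MAX_LEVEL) return UNSUPPORTED;
    return QAT20_HW_LEVEL[qz_level - 1];
}

/* QAT 1.7/1.8 column: returns the DEPTH_n number, or UNSUPPORTED. */
static int qat17_depth(int qz_level)
{
    if (qz_level < 1 || qz_level > QZ_MAX_LEVEL) return UNSUPPORTED;
    return QAT17_DEPTH[qz_level - 1];
}
```

For example, under this sketch `qat20_hw_level(6)` yields 8 (HW_L6) and `qat17_depth(10)` yields `UNSUPPORTED`, matching the corresponding rows.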

Limitations

  • The partitioned internal chunk size of 16 KB is disabled; this chunk is used for QAT hardware DMA.
  • For stream objects, the user should reset the stream object by calling qzEndStream() before reusing it in another session.
  • For stream objects, the user should clear the stream object by calling qzEndStream() before clearing the session object with qzTeardownSession(). Otherwise, a memory leak occurs.
  • For stream objects, the stream length must be smaller than strm_buff_sz; otherwise QATzip generates multiple deflate blocks in order, with BFIN set on the last block.
  • For stream objects, the performance of the pre-allocation process will be optimized using a thread-local stream buffer list in a future release.
  • For 7z format, decompression only supports *.7z archives compressed by qzip.
  • For 7z format, decompression is only supported in software.
  • For 7z format, header compression is not supported.
  • For lz4* (de)compression, QATzip only supports a 32KB history buffer.
  • For zstd format compression, qzstd only supports hw_buffer_sz values less than 128KB.
  • Stream APIs currently only support "DEFLATE_GZIP", "DEFLATE_GZIP_EXT", "DEFLATE_RAW" for compression and "DEFLATE_GZIP", "DEFLATE_GZIP_EXT" for decompression.

Installation Instructions

Install with the in-tree QAT package

  • Please refer to link.

Install with the out-of-tree QAT package

  1. For installation of the out-of-tree QAT package, refer to link.

Note

  • If you run QAT as a non-root user, additional steps must be applied manually; please refer to link.
  • If SVM is not enabled, memory passed to QAT hardware must be DMA-able. Intel provides a USDM component which allocates and frees DMA-able memory. Please refer to link for USDM settings.
  2. Install the package dependencies by running the below command:

    sudo dnf install -y autoconf automake libtool zlib-devel lz4-devel

     For Debian-based distros like Ubuntu, use these names for the latter two packages:

    sudo apt -y install zlib1g-dev liblz4-dev

  3. Configure the QATzip library by running the following commands:

    cd QATzip/
    export QZ_ROOT=`pwd`
    export ICP_ROOT=/QAT/PACKAGE/PATH
    ./autogen.sh
    ./configure

Note
For more configure options, please run "./configure -h" for help.

  4. Build and install the QATzip library by running the below commands:

    make clean
    make
    sudo make install

Configuration

Note
This section is only required when you are using the out-of-tree QAT package. If you are using qatlib with the in-tree QAT package, please refer to link for details on configuring qatlib.

The QAT programmer’s guide, which provides information on the architecture of the software and usage guidelines, describes how runtime operation can be customized.

Intel® QATzip comes with some tuned example conf files. You can replace the old conf files (under /etc/) with them. For detailed information about the configurable options, please refer to the Programmer's Guide.

The process section name (in the configuration file) is the key change for QATzip. There are two ways to change it:

  • The QAT driver's default conf file does not contain the [SHIM] section which Intel® QATzip requires by default. You can follow the steps below to replace it.
  • The default section name in QATzip can be modified if required by setting the environment variable "QAT_SECTION_NAME".

To update the configuration file, copy the configuration file(s) from $QZ_ROOT/config_file/$YOUR_PLATFORM/$CONFIG_TYPE/*.conf to /etc.

YOUR_PLATFORM: the QAT hardware platform, c6xx for Intel® C62X Series Chipset, dh895xcc for Intel® Communications Chipset 8925 to 8955 Series

CONFIG_TYPE: tuned configuration file(s) for different usage, multiple_process_opt for multiple process optimization, multiple_thread_opt for multiple thread optimization.

Restart QAT driver

    service qat_service restart

With the current configuration, each PCIe device on the C6XX platform can support a maximum of 32 processes.

Enable qzstd

If you want to enable the lz4s + postprocessing pipeline, you have to compile qzstd, a sample app that supports ZSTD format compression/decompression. Before enabling qzstd, make sure that you have installed the zstd static library.

Compile qzstd

    cd $QZ_ROOT
    ./autogen.sh
    ./configure --enable-lz4s-postprocessing
    make clean
    make qzstd

Test qzstd

    qzstd $your_input_file

Test QATzip

Run the following command to check whether QATzip is set up correctly for compressing or decompressing files:

    qzip -k $your_input_file  -O gzipext -A deflate

File compression in 7z:

    qzip -O 7z FILE1 FILE2 FILE3... -o result.7z

Directory compression in 7z:

    qzip -O 7z DIR1 DIR2 DIR3... -o result.7z

File decompression in 7z:

    qzip -d result.7z

Directory decompression with -R:

If the directory contains files that were compressed by qzip using the gzip/gzipext format, add the -R option to decompress them:

    qzip -d -R DIR

Performance Test With QATzip

Run the QATzip (de)compression performance test with the following command. Before running it, update the driver configuration and the process/thread arguments in run_perf_test.sh. Note that when the number of threads changes, the "max_huge_pages_per_process" argument in run_perf_test.sh should be changed accordingly, to at least 6 times the number of threads.

    cd $QZ_ROOT/test/performance_tests
    ./run_perf_test.sh
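
The sizing rule for "max_huge_pages_per_process" is simple arithmetic. The helper below is a hypothetical illustration and not part of run_perf_test.sh; the factor of 6 comes from the note above.

```c
#include <assert.h>

/* Factor taken from the note above: at least 6 huge pages per thread. */
enum { HUGE_PAGES_PER_THREAD = 6 };

/* Returns the smallest max_huge_pages_per_process value that satisfies
 * the rule for the given number of threads. */
static unsigned min_huge_pages_per_process(unsigned threads)
{
    assert(threads > 0);
    return threads * HUGE_PAGES_PER_THREAD;
}
```

For example, a run with 8 threads would need the argument set to at least 48.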

QATzip API Manual

Please refer to the file QATzip-man.pdf under the docs folder. Please refer to the link for QAT documents.

Open Issues

Known issues relating to the QATzip are described in this section.

QATAPP-26069

| Title | Buffers allocated with qzMalloc() can't be freed after calling qzMemDestory |
|---|---|
| Reference | QATAPP-26069 |
| Description | If users call qzFree after qzMemDestory, they may encounter the free memory error "free(): invalid pointer" |
| Implication | Users use the qzMalloc API to allocate contiguous memory |
| Resolution | Ensure qzMemDestory is invoked after qzFree; an attribute destructor is now used to invoke qzMemDestory |
| Affected OS | Linux |

Intended Audience

The target audience is software developers, test and validation engineers, system integrators, end users and consumers for QATzip integrated with Intel® QuickAssist Technology.

Legal

Intel® disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel® representative to obtain the latest forecast, schedule, specifications and roadmaps.

The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.

Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

qatzip's People

Contributors

adenilsoncavalcanti, bbrowne-intel, cfzhu, davidsha-intel, daweiq, embg, forde-dev, garenjian-intel, gmcfadde, haiyan1x, iomartin, ipuustin, junwa15x, liangintel, lihuizha, linxqiao, mathana96, mprinn, mythi, pfl, scorchfly, wangzhux, wkoux, xinghongchenintel, yanpen1x, yanzegux, yuxcao, yyao9, zhangp8x, zm6int


qatzip's Issues

initStream: insufficient output buffer for SW decompression

This looks faulty:

    stream_buf->in_buf = qzMalloc(stream_buf->buf_len, NODE_0, PINNED_MEM);
    stream_buf->out_buf = qzMalloc(stream_buf->buf_len, NODE_0, PINNED_MEM);

because stream_buf->buf_len = qz_sess->sess_params.strm_buff_sz. Since the input buffer is filled completely (up to strm_buff_sz bytes), how can the decompressed data fit in an output buffer of the same size?

I'm getting "invalid stored block lengths" errors from zlib with this, and they disappear when I make out_buf 10 x buf_len.

stopQAT: reset pcie_count

g_process.pcie_count is not correctly de-initialized in stopQat(). Please consider the following patch:

diff --git a/src/qatzip.c b/src/qatzip.c
index 967ae03..c7b9f2c 100755
--- a/src/qatzip.c
+++ b/src/qatzip.c
@@ -346,6 +346,7 @@ static void stopQat(void)
     (void)icp_sal_userStop();
     g_process.num_instances = (Cpa16U)0;
     g_process.qz_init_status = QZ_NONE;
+    g_process.pcie_count = -1;
 }

The problem is that in a "no HW" environment (pcie_count == 0), when qzInit() is called again after qzClose(), the initialization does not complete correctly (QZ_DUPLICATE is returned).

configure: add bash shebang

It seems configure uses bash syntax, and running it fails with other shells:

./configure: 49: ./configure: Syntax error: "(" unexpected

Some errors occurred during installation.

Hello Team,

There are no errors while running configure:
[root@locahost:QATzip-0.2.7]$ ./configure --with-ICP_ROOT=/usr/local/src/ssl/qat1.7.l.4.4.0-00023
Checking for zlib.h... OK
Checking for pthread.h... OK
Checking for stdio.h... OK
Checking for sys/time.h... OK
Checking for stdarg.h... OK
Checking for stdlib.h... OK
Checking for sys/types.h... OK
Checking for sys/stat.h... OK
Checking for string.h... OK
Checking for unistd.h... OK
Checking for memory.h... OK
Checking for stdint.h... OK
Checking for zlib version-1.2.3... OK
Checking for ICP_ROOT:/usr/local/src/ssl/qat1.7.l.4.4.0-00023... OK
Checking for /usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/include/cpa.h... OK
Checking for /usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/include/dc/cpa_dc.h... OK
Checking for /usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/lookaside/access_layer/include/icp_sal_poll.h... OK
Checking for /usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/lookaside/access_layer/include/icp_sal_user.h... OK
Checking for /usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/utilities/libusdm_drv/qae_mem.h... OK
Checking for pthread library... OK
Checking for zlib library... OK
Checking for usdm library... OK
Checking for qat library... OK
Checking for QAT driver version... OK
Checking for vfprintf function... OK
Checking for stand file API... OK
Checking for strtoul API... OK
Checking for atexit API... OK
Checking for getopt_long API... OK
Checking for gettimeofday API... OK
Checking for build static library... OK
Checking for build shared library... OK

But some errors occurred while running 'make all install':
[root@localhost:QATzip-0.2.7]$ make all install
make -C /usr/local/src/ssl/QATzip-0.2.7/src libqatzip.a
make[1]: Entering directory `/usr/local/src/ssl/QATzip-0.2.7/src'
gcc -Wall -Werror -std=gnu99 -pedantic -O2 -fstack-protector -fPIE -fPIC -D_FORTIFY_SOURCE=2 -m64 -I/usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/include -I/usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/include/dc -I/usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/lookaside/access_layer/include -I/usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/utilities/libusdm_drv -I/usr/local/src/ssl/QATzip-0.2.7/include -I/usr/local/src/ssl/QATzip-0.2.7/src -c qatzip_sw.c -o qatzip_sw.o

qatzip_sw.c: In function ‘qzSWCompress’:
qatzip_sw.c:143: error: ‘z_const’ undeclared (first use in this function)
qatzip_sw.c:143: error: (Each undeclared identifier is reported only once
qatzip_sw.c:143: error: for each function it appears in.)
qatzip_sw.c:143: error: expected ‘)’ before ‘Bytef’
qatzip_sw.c:143: error: expected ‘;’ before ‘src’
qatzip_sw.c: In function ‘qzSWDecompress’:
qatzip_sw.c:219: error: ‘z_const’ undeclared (first use in this function)
qatzip_sw.c:219: error: expected ‘)’ before ‘Bytef’
qatzip_sw.c:219: error: expected ‘;’ before ‘src’
make[1]: *** [qatzip_sw.o] Error 1
make[1]: Leaving directory `/usr/local/src/ssl/QATzip-0.2.7/src'
make: *** [libqatzip.a] Error 2

OS: CentOS 6.8
kernel: 2.6.32-642.el6.x86_64
QATzip version: 0.2.7
QAT Driver version: qat1.7.l.4.4.0-00023

How to work around this problem?

Thanks in advance!

qaeOpenFd:772 Unable to initialize memory file handle /dev/usdm_drv

After installing the QAT drivers and QATzip as per the instructions, when I try to run the test application I get the following error:
./test -m 3
qaeOpenFd:772 Unable to initialize memory file handle /dev/usdm_drv
qaeOpenFd:772 Unable to initialize memory file handle /dev/usdm_drv
Error no hardware, switch to SW if permitted
g_process.qz_init_status = QZ_NO_HW
Error from pthread_exit qzInit failed

I do have the QAT card present in the server. OS is RedHat 7.4

Replace zlib or snappy

Is there any way to replace zlib/snappy just by recompiling and relinking my app, without changing any of my code? I just want to run a simple test.

undefined reference to `qaeMemFreeNUMA'

Hi,
I've installed the driver for Intel® Xeon® with Intel® C62X Series Chipset, and the QuickAssist Technology (QAT) OpenSSL Engine* was able to run successfully. While installing the driver, the contiguous memory driver could not be built due to an error, so the User Space DMA-able Memory (USDM) component was loaded instead.

I also installed libelf-dev through

apt-get install libelf-dev

before running ./configure

It seems that qaeMemFreeNUMA, cpaDcCompressData, ... could not be found.

root@ubuntu:/opt/QATzip# make all install
make -C /opt/QATzip/src libqatzip.a
make[1]: Entering directory '/opt/QATzip/src'
gcc -Wall -Werror -std=gnu99 -pedantic -O2 -fstack-protector -fPIE -fPIC -D_FORTIFY_SOURCE=2 -m64 -I/opt/QAT/quickassist/include -I/opt/QAT/quickassist/include/dc -I/opt/QAT/quickassist/lookaside/access_layer/include -I/opt/QAT/quickassist/utilities/libusdm_drv -I/opt/QATzip/include -c qatzip.c -o qatzip.o
gcc -Wall -Werror -std=gnu99 -pedantic -O2 -fstack-protector -fPIE -fPIC -D_FORTIFY_SOURCE=2 -m64 -I/opt/QAT/quickassist/include -I/opt/QAT/quickassist/include/dc -I/opt/QAT/quickassist/lookaside/access_layer/include -I/opt/QAT/quickassist/utilities/libusdm_drv -I/opt/QATzip/include -c qatzip_counter.c -o qatzip_counter.o
gcc -Wall -Werror -std=gnu99 -pedantic -O2 -fstack-protector -fPIE -fPIC -D_FORTIFY_SOURCE=2 -m64 -I/opt/QAT/quickassist/include -I/opt/QAT/quickassist/include/dc -I/opt/QAT/quickassist/lookaside/access_layer/include -I/opt/QAT/quickassist/utilities/libusdm_drv -I/opt/QATzip/include -c qatzip_gzip.c -o qatzip_gzip.o
gcc -Wall -Werror -std=gnu99 -pedantic -O2 -fstack-protector -fPIE -fPIC -D_FORTIFY_SOURCE=2 -m64 -I/opt/QAT/quickassist/include -I/opt/QAT/quickassist/include/dc -I/opt/QAT/quickassist/lookaside/access_layer/include -I/opt/QAT/quickassist/utilities/libusdm_drv -I/opt/QATzip/include -c qatzip_sw.c -o qatzip_sw.o
gcc -Wall -Werror -std=gnu99 -pedantic -O2 -fstack-protector -fPIE -fPIC -D_FORTIFY_SOURCE=2 -m64 -I/opt/QAT/quickassist/include -I/opt/QAT/quickassist/include/dc -I/opt/QAT/quickassist/lookaside/access_layer/include -I/opt/QAT/quickassist/utilities/libusdm_drv -I/opt/QATzip/include -c qatzip_mem.c -o qatzip_mem.o
gcc -Wall -Werror -std=gnu99 -pedantic -O2 -fstack-protector -fPIE -fPIC -D_FORTIFY_SOURCE=2 -m64 -I/opt/QAT/quickassist/include -I/opt/QAT/quickassist/include/dc -I/opt/QAT/quickassist/lookaside/access_layer/include -I/opt/QAT/quickassist/utilities/libusdm_drv -I/opt/QATzip/include -c qatzip_utils.c -o qatzip_utils.o
ar rcs libqatzip.a qatzip.o qatzip_counter.o qatzip_gzip.o qatzip_sw.o qatzip_mem.o qatzip_utils.o
make[1]: Leaving directory '/opt/QATzip/src'
make -C /opt/QATzip/src libqatzip.so
make[1]: Entering directory '/opt/QATzip/src'
gcc qatzip.o qatzip_counter.o qatzip_gzip.o qatzip_sw.o qatzip_mem.o qatzip_utils.o -o libqatzip.so -fstack-protector -fPIC -pie -z relro -z now -Wl,-z,noexecstack -L/opt/QAT/build -Wl,-R/opt/QAT/build -shared -Wl,-soname,libqatzip.so -lqat_s -lusdm_drv_s -lz -lpthread -lnuma
make[1]: Leaving directory '/opt/QATzip/src'
make -C /opt/QATzip/utils qzip
make[1]: Entering directory '/opt/QATzip/utils'
gcc -Wall -Werror -std=gnu99 -pedantic -O2 -fstack-protector -fPIE -fPIC -D_FORTIFY_SOURCE=2 -m64 -I/opt/QAT/quickassist/include -I/opt/QAT/quickassist/include/dc -I/opt/QAT/quickassist/lookaside/access_layer/include -I/opt/QAT/quickassist/utilities/libusdm_drv -I/opt/QATzip/include -c qzip.c -o qzip.o
gcc qzip.o -o qzip -fstack-protector -fPIC -pie -z relro -z now -Wl,-z,noexecstack -L/opt/QAT/build -Wl,-R/opt/QAT/build -lqat_s -lusdm_drv_s -lz -lpthread -lnuma /opt/QATzip/src/libqatzip.a
/opt/QATzip/src/libqatzip.a(qatzip.o): In function `cleanUpInstMem':
qatzip.c:(.text+0x36d): undefined reference to `qaeMemFreeNUMA'
qatzip.c:(.text+0x3a6): undefined reference to `qaeMemFreeNUMA'
qatzip.c:(.text+0x455): undefined reference to `qaeMemFreeNUMA'
qatzip.c:(.text+0x4c5): undefined reference to `qaeMemFreeNUMA'
qatzip.c:(.text+0x4fe): undefined reference to `qaeMemFreeNUMA'
/opt/QATzip/src/libqatzip.a(qatzip.o):qatzip.c:(.text+0x595): more undefined references to `qaeMemFreeNUMA' follow
/opt/QATzip/src/libqatzip.a(qatzip.o): In function `doCompressIn':
qatzip.c:(.text+0x82a): undefined reference to `cpaDcCompressData'
/opt/QATzip/src/libqatzip.a(qatzip.o): In function `doCompressOut':
qatzip.c:(.text+0xad0): undefined reference to `icp_sal_DcPollInstance'
qatzip.c:(.text+0xdfc): undefined reference to `crc32_combine'
/opt/QATzip/src/libqatzip.a(qatzip.o): In function `doDecompressIn':
qatzip.c:(.text+0x1183): undefined reference to `cpaDcDecompressData'
/opt/QATzip/src/libqatzip.a(qatzip.o): In function `doDecompressOut':
qatzip.c:(.text+0x154b): undefined reference to `icp_sal_DcPollInstance'
/opt/QATzip/src/libqatzip.a(qatzip.o): In function `stopQat.part.3':
qatzip.c:(.text+0x186b): undefined reference to `cpaDcStopInstance'
qatzip.c:(.text+0x18aa): undefined reference to `icp_sal_userStop'
/opt/QATzip/src/libqatzip.a(qatzip.o): In function `qzInit':
qatzip.c:(.text+0x1a8c): undefined reference to `icp_sal_userStartMultiProcess'
qatzip.c:(.text+0x1a9c): undefined reference to `cpaDcGetNumInstances'
qatzip.c:(.text+0x1bbd): undefined reference to `cpaDcGetInstances'
qatzip.c:(.text+0x1c1b): undefined reference to `cpaDcInstanceGetInfo2'
qatzip.c:(.text+0x1c37): undefined reference to `cpaDcQueryCapabilities'
/opt/QATzip/src/libqatzip.a(qatzip.o): In function `qzSetupHW':
qatzip.c:(.text+0x1fbd): undefined reference to `cpaDcBufferListGetMetaSize'
qatzip.c:(.text+0x2010): undefined reference to `cpaDcGetSessionSize'
qatzip.c:(.text+0x202f): undefined reference to `qaeMemAllocNUMA'
qatzip.c:(.text+0x2064): undefined reference to `cpaDcInitSession'
qatzip.c:(.text+0x2097): undefined reference to `cpaDcGetNumIntermediateBuffers'
qatzip.c:(.text+0x20a7): undefined reference to `numa_set_preferred'
qatzip.c:(.text+0x212a): undefined reference to `qaeMemAllocNUMA'
qatzip.c:(.text+0x215f): undefined reference to `qaeMemAllocNUMA'
qatzip.c:(.text+0x2196): undefined reference to `qaeMemAllocNUMA'
qatzip.c:(.text+0x21f9): undefined reference to `qaeMemAllocNUMA'
qatzip.c:(.text+0x2329): undefined reference to `qaeMemAllocNUMA'
/opt/QATzip/src/libqatzip.a(qatzip.o):qatzip.c:(.text+0x2363): more undefined references to `qaeMemAllocNUMA' follow
/opt/QATzip/src/libqatzip.a(qatzip.o): In function `qzSetupHW':
qatzip.c:(.text+0x25ea): undefined reference to `qaeVirtToPhysNUMA'
qatzip.c:(.text+0x25f3): undefined reference to `cpaDcSetAddressTranslation'
qatzip.c:(.text+0x261d): undefined reference to `cpaDcStartInstance'
/opt/QATzip/src/libqatzip.a(qatzip.o): In function `removeSession':
qatzip.c:(.text+0x271a): undefined reference to `cpaDcRemoveSession'
qatzip.c:(.text+0x273b): undefined reference to `qaeMemFreeNUMA'
/opt/QATzip/src/libqatzip.a(qatzip_sw.o): In function `qzSWCompress':
qatzip_sw.c:(.text+0x19d): undefined reference to `deflateSetHeader'
qatzip_sw.c:(.text+0x1f1): undefined reference to `deflate'
qatzip_sw.c:(.text+0x275): undefined reference to `deflateEnd'
qatzip_sw.c:(.text+0x2a6): undefined reference to `deflateInit2_'
/opt/QATzip/src/libqatzip.a(qatzip_sw.o): In function `qzSWDecompress':
qatzip_sw.c:(.text+0x348): undefined reference to `inflateInit2_'
qatzip_sw.c:(.text+0x35b): undefined reference to `inflateEnd'
qatzip_sw.c:(.text+0x381): undefined reference to `inflate'
/opt/QATzip/src/libqatzip.a(qatzip_mem.o): In function `qzMalloc':
qatzip_mem.c:(.text+0xe4): undefined reference to `qaeMemAllocNUMA'
/opt/QATzip/src/libqatzip.a(qatzip_mem.o): In function `qzFree':
qatzip_mem.c:(.text+0x216): undefined reference to `qaeMemFreeNUMA'
collect2: error: ld returned 1 exit status
Makefile:40: recipe for target 'qzip' failed
make[1]: *** [qzip] Error 1
make[1]: Leaving directory '/opt/QATzip/utils'
Makefile:92: recipe for target 'qzip' failed
make: *** [qzip] Error 2

why "can not detect int size"?

    ./configure \
      --prefix=$NGINX_INSTALL_DIR \
      --without-http_rewrite_module \
      --with-http_ssl_module \
      --with-http_stub_status_module \
      --with-http_v2_module \
      --with-stream \
      --with-stream_ssl_module \
      --add-dynamic-module=modules/nginx_qatzip_module \
      --add-dynamic-module=modules/nginx_qat_module/ \
      --with-cc-opt="-DNGX_SECURE_MEM -I$OPENSSL_LIB/include -I$QZ_ROOT/include -Wno-error=deprecated-declarations" \
      --with-ld-opt="-Wl,-rpath=$OPENSSL_LIB/lib -L$OPENSSL_LIB/lib -L$QZ_ROOT/src -lqatzip -lz"

    checking for OS
     + Linux 3.10.0-693.el7.x86_64 x86_64
    checking for C compiler ... found
     + using GNU C compiler
     + gcc version: 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)
    checking for gcc -pipe switch ... found
    checking for --with-ld-opt="-Wl,-rpath=/etc/nginx/.openssl/lib -L/etc/nginx/.openssl/lib -L/usr/local/src/QATzip-v1.0.7/src -lqatzip -lz" ... found
    checking for -Wl,-E switch ... found
    checking for gcc builtin atomic operations ... found but is not working
    checking for C99 variadic macros ... found but is not working
    checking for gcc variadic macros ... found but is not working
    checking for gcc builtin 64 bit byteswap ... found
    checking for unistd.h ... found
    checking for inttypes.h ... found
    checking for limits.h ... found
    checking for sys/filio.h ... not found
    checking for sys/param.h ... found
    checking for sys/mount.h ... found
    checking for sys/statvfs.h ... found
    checking for crypt.h ... found
    checking for Linux specific features
    checking for epoll ... found but is not working
    checking for O_PATH ... found
    checking for sendfile() ... found but is not working
    checking for sendfile64() ... found but is not working
    checking for sys/prctl.h ... found
    checking for prctl(PR_SET_DUMPABLE) ... found but is not working
    checking for prctl(PR_SET_KEEPCAPS) ... found but is not working
    checking for capabilities ... found
    checking for crypt_r() ... found
    checking for sys/vfs.h ... found
    checking for nobody group ... found
    checking for poll() ... found
    checking for /dev/poll ... not found
    checking for kqueue ... not found
    checking for crypt() ... not found
    checking for crypt() in libcrypt ... found
    checking for F_READAHEAD ... not found
    checking for posix_fadvise() ... found
    checking for O_DIRECT ... found
    checking for F_NOCACHE ... not found
    checking for directio() ... not found
    checking for statfs() ... found
    checking for statvfs() ... found
    checking for dlopen() ... not found
    checking for dlopen() in libdl ... found
    checking for sched_yield() ... found
    checking for sched_setaffinity() ... found
    checking for SO_SETFIB ... not found
    checking for SO_REUSEPORT ... found
    checking for SO_ACCEPTFILTER ... not found
    checking for SO_BINDANY ... not found
    checking for IP_TRANSPARENT ... found
    checking for IP_BINDANY ... not found
    checking for IP_BIND_ADDRESS_NO_PORT ... not found
    checking for IP_RECVDSTADDR ... not found
    checking for IP_SENDSRCADDR ... not found
    checking for IP_PKTINFO ... found
    checking for IPV6_RECVPKTINFO ... found
    checking for TCP_DEFER_ACCEPT ... found
    checking for TCP_KEEPIDLE ... found
    checking for TCP_FASTOPEN ... found
    checking for TCP_INFO ... found
    checking for accept4() ... found
    checking for int size ...objs/autotest: error while loading shared libraries: libqatzip.so.1: cannot open shared object file: No such file or directory
    bytes

./configure: error: can not detect int size

FreeBSD compile failed - error: unknown argument: -fstack-protector-strong

Hi,
I compiled the qatzip library but I have an issue with these parameters.

FreeBSD clang version 8.0.1 (tags/RELEASE_801/final 366581) (based on LLVM 8.0.1)
Target: x86_64-unknown-freebsd12.1

cc  -O2 -pipe -fstack-protector-strong -fno-strict-aliasing  -DQAT_UIO -Wformat -Wformat-security -DICP_POLL_ONLY -Werror -D_KERNEL -DKLD_MODULE -nostdinc  -I/usr/home/maxfx/Documents/FreeBSD-Ports/qatzip/work/intel-qat-1.7.b.3.7.0/quickassist/qat/freebsd/sys/modules/qat/qat_common/../../../../src/ -I/usr/home/maxfx/Documents/FreeBSD-Ports/qatzip/work/intel-qat-1.7.b.3.7.0/quickassist/qat/freebsd/sys/modules/qat/qat_common/../../../../../common/freeBSD/drivers/crypto/qat/qat_common -I. -I/usr/src/sys -I/usr/src/sys/contrib/ck/include -fno-common  -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include -fdebug-prefix-map=./x86=/usr/src/sys/x86/include   -MD  -MF.depend.adf_freebsd_pfvf_ctrs_dbg.o -MTadf_freebsd_pfvf_ctrs_dbg.o -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -fstack-protector -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -Wno-address-of-packed-member  -mno-aes -mno-avx  -std=iso9899:1999 -I/usr/src/sys/compat/linuxkpi/common/include -c /usr/home/maxfx/Documents/FreeBSD-Ports/qatzip/work/intel-qat-1.7.b.3.7.0/quickassist/qat/freebsd/sys/modules/qat/qat_common/../../../../src//adf_freebsd_pfvf_ctrs_dbg.c -o adf_freebsd_pfvf_ctrs_dbg.o
ld -m elf_x86_64_fbsd -fstack-protector-strong -d -warn-common --build-id=sha1 -r -d -o qat_common.ko adf_accel_engine.o adf_freebsd_admin.o adf_aer.o adf_freebsd_cfg.o adf_ctl_drv.o adf_heartbeat.o adf_freebsd_heartbeat_dbg.o adf_dev_mgr.o adf_hw_arbiter.o adf_init.o adf_transport.o adf_isr.o adf_fw_counters.o adf_dev_err.o adf_freebsd_dev_processes.o adf_freebsd_uio.o adf_freebsd_uio_cleanup.o adf_freebsd_kmc.o qat_freebsd.o qat_hal.o qat_uclo.o adf_vf_isr.o adf_pf2vf_msg.o adf_vf2pf_msg.o adf_pf2vf_capabilities.o adf_freebsd_transport_debug.o adf_clock.o adf_freebsd_cnvnr_ctrs_dbg.o adf_freebsd_pfvf_ctrs_dbg.o
ld: error: unknown argument: -fstack-protector-strong
*** Error code 1

Stop.
make[4]: stopped in /usr/home/maxfx/Documents/FreeBSD-Ports/qatzip/work/intel-qat-1.7.b.3.7.0/quickassist/qat/freebsd/sys/modules/qat/qat_common
*** Error code 1

Calling qzCompress to compress an 8 KB data block in hardware mode, then decompressing in software mode, fails with "invalid block type"

There is 3 QAT acceleration device(s) in the system:
qat_dev0 - type: c6xx, inst_id: 0, node_id: 0, bsf: 0000:33:00.0, #accel: 5 #engines: 10 state: up
qat_dev1 - type: c6xx, inst_id: 1, node_id: 0, bsf: 0000:34:00.0, #accel: 5 #engines: 10 state: up
qat_dev2 - type: c6xx, inst_id: 2, node_id: 0, bsf: 0000:35:00.0, #accel: 5 #engines: 10 state: up

  1. qzCompress is called with the parameter last set to 1.
    (No configuration was changed; a QAT hardware instance performs the compression.)

An 8192-byte data block is compressed to 7809 bytes.

  2. qzDecompress is called via the code below (zlib 1.2.11's inflate(stream, Z_SYNC_FLUSH) is eventually called):

QzSession_T *sess; // points to the session that is later passed to qzDecompress
sess->hw_session_stat = QZ_NO_HW; // so that qzDecompress switches to software mode
sess->thd_sess_stat = QZ_OK; // ensure the session state is normal

Decompression is then called, and the run-time error log looks like this:

qzDecompress data_fmt: 2
decompression src_len=7809, hdr->extra.qz_e.src_sz = 8192, g_process.qz_init_status = 0, sess->hw_session_stat = 11, isQATProcessable = 1, switch to software.
Start qzSWDecompressMultiGzip: src_len 7809 dest_len 8192
decomp_sw data_fmt: 2

****** inflate init done with win_bits: 31 *****
ERR: inflate failed with Z_DATA_ERROR
Exit qzSWDecompress total_in: 7802 total_out: 8192 avail_in: 7 avail_out: 0 msg: invalid block type src_len: 0 dest_len: 0
Exit qzSWDecompressMultiGzip: src_len 0 dest_len 0

Performance Test

Is there an introduction to how QATzip works, along with performance test results?

CAP_SYS_ADMIN needed with 4.x kernels

qzip run by a non-root user needs CAP_SYS_ADMIN.

The hugepage code in usdm_drv reads the process's pagemap, which gained restrictions in 4.x kernels:

Since Linux 4.0 only users with the CAP_SYS_ADMIN capability can get PFNs.
In 4.0 and 4.1 opens by unprivileged fail with -EPERM.  Starting from
4.2 the PFN field is zeroed if the user does not have CAP_SYS_ADMIN.
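The three outcomes described in the kernel note above (open fails, PFN zeroed, PFN visible) can be told apart with a small check. The following is a minimal sketch, not code from usdm_drv: the function names and the `pfn_status` enum are made up for illustration, and it assumes the documented pagemap layout (bits 0-54 hold the PFN, bit 63 marks the page as present).

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Layout of one 64-bit /proc/<pid>/pagemap entry:
 *   bits 0-54  page frame number (PFN), meaningful only if the page is present
 *   bit  63    page is present in RAM */
#define PAGEMAP_PFN_MASK ((UINT64_C(1) << 55) - 1)
#define PAGEMAP_PRESENT  (UINT64_C(1) << 63)

/* Hypothetical classification used only in this sketch. */
typedef enum {
    PFN_NOT_PRESENT, /* page is not resident, so there is no PFN to report */
    PFN_HIDDEN,      /* page is resident but the PFN field reads as zero   */
    PFN_VISIBLE      /* page is resident and the PFN is readable           */
} pfn_status;

/* Decodes one pagemap entry; stores the PFN in *pfn_out when it is visible. */
pfn_status classify_pagemap_entry(uint64_t entry, uint64_t *pfn_out)
{
    if (!(entry & PAGEMAP_PRESENT))
        return PFN_NOT_PRESENT;
    uint64_t pfn = entry & PAGEMAP_PFN_MASK;
    if (pfn == 0)
        return PFN_HIDDEN; /* what 4.2+ reports without CAP_SYS_ADMIN */
    if (pfn_out)
        *pfn_out = pfn;
    return PFN_VISIBLE;
}

/* Reads the pagemap entry for the page containing addr.
 * Returns 0 on success, -1 if the file cannot be opened or read
 * (for example the -EPERM case on 4.0 and 4.1). */
int read_pagemap_entry(const void *addr, uint64_t *entry)
{
    long page = sysconf(_SC_PAGESIZE);
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0 || page <= 0) {
        if (fd >= 0)
            close(fd);
        return -1;
    }
    off_t off = (off_t)((uintptr_t)addr / (uintptr_t)page) * (off_t)sizeof(*entry);
    ssize_t n = pread(fd, entry, sizeof(*entry), off);
    close(fd);
    return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

Running `read_pagemap_entry` on a touched buffer as the non-root user and passing the result to `classify_pagemap_entry` should show whether that user falls into the "zeroed PFN" case, which would explain why physical addresses cannot be resolved without CAP_SYS_ADMIN.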

qzip falls back to software at parallelism=8

I'm evaluating the QuickAssist 8970 card and I've run into an issue doing basic testing with qzip. I'm using a build with --enable-debug.

My configuration is set up for multi-process. The config is as follows, repeated for all three endpoints.

[GENERAL]
ServicesEnabled = dc
DcIntermediateBufferSizeInKB = 64

ConfigVersion = 2

#Default values for number of concurrent requests
CyNumConcurrentSymRequests = 512
CyNumConcurrentAsymRequests = 64

#Statistics, valid values: 1,0
statsGeneral = 1
statsDh = 1
statsDrbg = 1
statsDsa = 1
statsEcc = 1
statsKeyGen = 1
statsDc = 1
statsLn = 1
statsPrime = 1
statsRsa = 1
statsSym = 1
KptEnabled = 0

# This flag is to enable SSF features (CNV and BnP)
StorageEnabled = 0

# Disable public key crypto and prime number
# services by specifying a value of 1 (default is 0)
PkeServiceDisabled = 0

# Specify size of intermediate buffers for which to
# allocate on-chip buffers. Legal values are 32 and
# 64 (default is 64). Specify 32 to optimize for
# compressing buffers <=32KB in size.
DcIntermediateBufferSizeInKB = 64

##############################################
# Kernel Instances Section
##############################################
[KERNEL]
NumberCyInstances = 0
NumberDcInstances = 0

# Crypto - Kernel instance #0
Cy0Name = "IPSec0"
Cy0IsPolled = 0
Cy0CoreAffinity = 0

# Data Compression - Kernel instance #1
Dc0Name = "IPComp0"
Dc0IsPolled = 0
Dc0CoreAffinity = 0

[SHIM]
NumberCyInstances = 0
NumberDcInstances = 4
NumProcesses = 32
LimitDevAccess = 0

Dc0Name = "Dc0"
Dc0IsPolled = 1
Dc0CoreAffinity = 1
Dc1Name = "Dc1"
Dc1IsPolled = 1
Dc1CoreAffinity = 1
Dc2Name = "Dc2"
Dc2IsPolled = 1
Dc2CoreAffinity = 1
Dc3Name = "Dc3"
Dc3IsPolled = 1
Dc3CoreAffinity = 1

I calculated these settings based on the documentation here https://01.org/sites/default/files/downloads//336210qatswprogrammersguiderev006.pdf and tweaked them for a desired number of 32 processes.

Status looks good:

# adf_ctl status
Checking status of all devices.
There is 3 QAT acceleration device(s) in the system:
 qat_dev0 - type: c6xx,  inst_id: 0,  node_id: 0,  bsf: 0000:05:00.0,  #accel: 5 #engines: 10 state: up
 qat_dev1 - type: c6xx,  inst_id: 1,  node_id: 0,  bsf: 0000:06:00.0,  #accel: 5 #engines: 10 state: up
 qat_dev2 - type: c6xx,  inst_id: 2,  node_id: 0,  bsf: 0000:07:00.0,  #accel: 5 #engines: 10 state: up

This configuration works great with the following test command. Note: large-test-data is a 3.7 GiB OS image.

# qzip -k -L1 large-test-data > /dev/null

And also in parallel for parallelism 2-7.

# parallel -j7 -N0 qzip -k -L1 large-test-data > /dev/null ::: {1..7}

But when I try parallelism=8, every qzip process falls back to software:

# parallel -j8 -N0 qzip -k -L1 large-test-data > /dev/null ::: {1..8}
g_process.qz_init_status = QZ_NO_HW
g_process.qz_init_status = QZ_NO_HW
g_process.qz_init_status = QZ_NO_HW
[repeated several times]

Additionally, when looking at the fw_counters values, I only see the counters increasing for a single endpoint (the first one). The other counters do not increase at all. For the parallelism=8 test, none of the counters increase. Any ideas?

qzClose doesn't seem to work as advertised

The documentation claims that qzClose "terminates a QATzip session". However, if you look at the code, the session pointer is not really used except to check whether it's NULL. If it's not NULL, the function appears to proceed to remove ALL sessions that are known about. This seems wrong.

I am trying to use QATzip to add hardware acceleration in a streaming application. It can handle many connections at a time and I'm trying to figure out whether I should have one global session or one session per connection, etc. It would seem I can't be calling this qzClose() function while I still have any active sessions, based on the code. Which is not what I expected from reading the documentation.

It seems to work better if I just don't call qzClose at all, and now that maybe makes sense. However, the problem I'm seeing is that hugepages are leaking and I eventually run out of them after perhaps 15GB of compression traffic. If you have any insight as to why that might be happening please comment. :) I currently create many sessions, and call qzTeardownSession when I'm done with one but not qzClose.
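If qzClose really is process-wide, as the reading of the code above suggests, one way to keep per-connection sessions safe is to count live sessions and run the global close only when the last one goes away. The sketch below shows only that reference-counting idea. Every name in it is a hypothetical stand-in so that it compiles without QATzip: in a real program `conn_close` would call qzTeardownSession and `global_close` would call qzClose. It is not thread-safe as written, and it does not address the hugepage leak.

```c
#include <stddef.h>

/* Hypothetical per-connection wrapper; a real one would hold a QzSession_T. */
typedef struct {
    int open; /* 1 while the connection's session is in use */
} conn_session;

static int g_live_sessions = 0; /* sessions currently in use              */
static int g_global_closes = 0; /* how many times the global close ran    */

/* Stand-in for the process-wide cleanup call. */
static void global_close(void)
{
    g_global_closes++;
}

/* Marks a session as live. A real version would set up the session here. */
void conn_open(conn_session *s)
{
    s->open = 1;
    g_live_sessions++;
}

/* Releases one session and runs the global close only for the last one. */
void conn_close(conn_session *s)
{
    if (!s->open)
        return;                 /* ignore a double close */
    s->open = 0;                /* per-session teardown would go here */
    if (--g_live_sessions == 0)
        global_close();         /* no other session can be affected now */
}
```

In a multi-threaded server the counter would need a mutex or an atomic, and whether the library has to be initialised again after the global close is something to confirm against the QATzip documentation.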

No performance improvement when using QATzip

1. Versions

Driver: qat1.7.l.4.9.0-00008
Qat_Engine:v0.5.44
OpenSSL-1.1.1g
QATzip:v1.0.1

2. ./nginx/sbin/nginx -V

nginx version: openresty/1.15.8.1
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)
built with OpenSSL 1.1.1g 21 Apr 2020
TLS SNI support enabled
configure arguments: --prefix=/export/servers/OpenResty-1.15.8/nginx --with-cc-opt='-O2 -ggdb -O2 -I/export/servers/OpenResty-1.15.8/include -I/export/servers/OpenSSL-1.1.1g/include -I/export/servers/qat/QATzip/include -I/export/servers/zlib-1.2.11/include -D NGX_SECURE_MEM -D JD_NGX_SSL_HANDSHAKE_TIME -D JD_NGX_HTTP_UPSTREAM_RANDOM -Wno-error=deprecated-declarations' --add-module=../ngx_devel_kit-0.3.1rc1 --add-module=../echo-nginx-module-0.61 --add-module=../xss-nginx-module-0.06 --add-module=../ngx_coolkit-0.2 --add-module=../set-misc-nginx-module-0.32 --add-module=../form-input-nginx-module-0.12 --add-module=../encrypted-session-nginx-module-0.08 --add-module=../srcache-nginx-module-0.31 --add-module=../ngx_lua-0.10.15 --add-module=../ngx_lua_upstream-0.07 --add-module=../headers-more-nginx-module-0.33 --add-module=../array-var-nginx-module-0.05 --add-module=../memc-nginx-module-0.19 --add-module=../redis2-nginx-module-0.15 --add-module=../redis-nginx-module-0.3.7 --add-module=../rds-json-nginx-module-0.15 --add-module=../rds-csv-nginx-module-0.09 --add-module=../ngx_stream_lua-0.0.7 --with-ld-opt='-Wl,-rpath,/export/servers/OpenResty-1.15.8/luajit/lib -Wl,-rpath=/export/servers/OpenSSL-1.1.1g/lib -L/export/servers/OpenSSL-1.1.1g/lib -L/export/servers/qat/QATzip/lib64 -lqatzip -L/export/servers/zlib-1.2.11/lib -lz -lssl' --with-pcre=/root/rpmbuild/BUILD/OpenResty-1.15.8-2.3-56.851dbdb/thirdparty/pcre-8.39 --with-pcre-jit --with-threads --with-http_auth_request_module --with-http_ssl_module --with-http_gzip_static_module --with-http_stub_status_module --with-http_v2_module --with-http_realip_module --with-http_addition_module --with-http_slice_module --add-module=/root/rpmbuild/BUILD/OpenResty-1.15.8-2.3-56.851dbdb/thirdparty/lua-ssl-nginx-module --add-module=/root/rpmbuild/BUILD/OpenResty-1.15.8-2.3-56.851dbdb/thirdparty/ngx_http_dyups_module --add-module=/root/rpmbuild/BUILD/OpenResty-1.15.8-2.3-56.851dbdb/thirdparty/ngx_http_sticky_module --with-stream 
--with-stream_ssl_module --with-openssl-async --with-http_gunzip_module --with-pcre-opt='-g -Ofast -fPIC -m64 -march=native -fstack-protector-strong -D_FORTIFY_SOURCE=2' --add-dynamic-module=/root/rpmbuild/BUILD/OpenResty-1.15.8-2.3-56.851dbdb/thirdparty/nginx_qat_module --add-dynamic-module=/root/rpmbuild/BUILD/OpenResty-1.15.8-2.3-56.851dbdb/thirdparty/nginx_qatzip_module --with-stream --with-stream_ssl_preread_module

3. ldd nginx/sbin/nginx

linux-vdso.so.1 => (0x00007fff5dadb000)
libqatzip.so.1 => /export/servers/qat/QATzip/lib64/libqatzip.so.1 (0x00007f62f51d1000)
libz.so.1 => /export/servers/zlib-1.2.11/lib/libz.so.1 (0x00007f62f4fb6000)
libssl.so.1.1 => /export/servers/OpenSSL-1.1.1g/lib/libssl.so.1.1 (0x00007f62f4d1f000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f62f4b1b000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f62f48ff000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f62f46c8000)
libluajit-5.1.so.2 => /export/servers/OpenResty-1.15.8/luajit/lib/libluajit-5.1.so.2 (0x00007f62f4449000)
libm.so.6 => /lib64/libm.so.6 (0x00007f62f4147000)
libcrypto.so.1.1 => /export/servers/OpenSSL-1.1.1g/lib/libcrypto.so.1.1 (0x00007f62f3c8a000)
libc.so.6 => /lib64/libc.so.6 (0x00007f62f38bd000)
libqat_s.so => /export/servers/qat/QAT_Driver/lib/libqat_s.so (0x00007f62f35e8000)
libusdm_drv_s.so => /export/servers/qat/QAT_Driver/lib/libusdm_drv_s.so (0x00007f62f33d0000)
/lib64/ld-linux-x86-64.so.2 (0x00007f62f583c000)
libfreebl3.so => /lib64/libfreebl3.so (0x00007f62f31cd000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f62f2fb7000)
libudev.so.1 => /lib64/libudev.so.1 (0x00007f62f2da1000)
librt.so.1 => /lib64/librt.so.1 (0x00007f62f2b99000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007f62f2994000)
libdw.so.1 => /lib64/libdw.so.1 (0x00007f62f2745000)
libattr.so.1 => /lib64/libattr.so.1 (0x00007f62f2540000)
libelf.so.1 => /lib64/libelf.so.1 (0x00007f62f2328000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f62f2102000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f62f1ef2000)

4. ./bin/wrk -c 500 -t 50 -H 'Accept-Encoding: gzip' http://my-qat.jd.com:5543/

5. cat nginx.conf

      load_module modules/ngx_ssl_engine_qat_module.so;
      load_module modules/ngx_http_qatzip_filter_module.so;

      events {
             use epoll;
             multi_accept on;
             worker_connections  102400;
      }

      ssl_engine {
             use_engine qat;
             default_algorithms ALL;
             qat_engine {
                      qat_offload_mode async;
                      qat_notify_mode poll;
                      qat_poll_mode heuristic;
                      qat_sw_fallback on;
             }
      }
http {
       gzip  on;
       gzip_http_version   1.0;
       gzip_proxied any;
    qatzip_sw failover;
    qatzip_min_length 128;
    qatzip_comp_level 1;
    qatzip_buffers 16 8k;
    qatzip_types text/css text/javascript text/xml text/plain text/x-component application/javascript application/json application/xml application/rss+xml font/truetype font/opentype application/vnd.ms-fontobject image/svg+xml application/octet-stream image/jpeg;
    qatzip_chunk_size   64k;
    qatzip_stream_size  256k;
    qatzip_sw_threshold 256;
}

6. top and perf top

[Two screenshots of the top and perf top output: top-cpu-qat-zip, WX20200624-103151@2x]

As far as I can tell, performance does not improve when using QATzip.

qzCompressStream() fails to stream

I'm having a problem very similar to #18, but for the case of HW compression.

It looks like qzCompressStream() compresses full data blobs only and cannot be used with streaming data.

My use case that fails:
I'm compressing a big file in smaller data blobs, setting last = 1 when the last blob is written. Then I decompress the compressed stream and compare it to my original data. However, reading stops after the first small data blob I sent for compression.

The source code suggests that if QzSessionParams_T.data_fmt is set to QZ_DEFLATE_GZIP_EXT (the default), then every data chunk is wrapped with its own gzip header and footer.

Am I supposed to use QZ_DEFLATE_RAW for a session and to wrap the stream with a header and footer myself?
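When every chunk carries its own gzip header, it can help to inspect those headers directly. RFC 1952 allows a header to carry "extra" subfields when the FEXTRA flag (bit 2 of FLG) is set, each tagged by two ID bytes. The sketch below is a generic RFC 1952 extra-field lookup, not QATzip code. That the QATzip extension header uses the subfield ID 'Q','Z' to carry chunk sizes is an assumption here and should be checked against the QATzip sources.

```c
#include <stddef.h>
#include <stdint.h>

#define GZ_FLAG_FEXTRA 0x04u /* RFC 1952: FLG bit 2 */

/* Reads a 16-bit little-endian value. */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Looks for the extra subfield with ID (si1, si2) in the gzip member that
 * starts at buf. Returns a pointer to the subfield data and stores its
 * length in *len_out, or returns NULL if the header is not gzip/deflate,
 * has no extra field, is truncated, or lacks that subfield. */
const uint8_t *gzip_find_extra(const uint8_t *buf, size_t len,
                               uint8_t si1, uint8_t si2, uint16_t *len_out)
{
    /* 10 fixed header bytes plus the 2-byte XLEN field */
    if (len < 12 || buf[0] != 0x1f || buf[1] != 0x8b || buf[2] != 8)
        return NULL;
    if (!(buf[3] & GZ_FLAG_FEXTRA))
        return NULL;

    size_t xlen = le16(buf + 10);
    if (len < 12 + xlen)
        return NULL;

    const uint8_t *p = buf + 12;
    const uint8_t *end = p + xlen;
    while ((size_t)(end - p) >= 4) {            /* SI1, SI2, LEN(2) */
        uint16_t sub_len = le16(p + 2);
        if ((size_t)(end - p - 4) < sub_len)
            return NULL;                        /* subfield overruns XLEN */
        if (p[0] == si1 && p[1] == si2) {
            *len_out = sub_len;
            return p + 4;
        }
        p += 4 + (size_t)sub_len;
    }
    return NULL;
}
```

Calling `gzip_find_extra(chunk, chunk_len, 'Q', 'Z', &n)` at the point where decompression stopped would show whether another gzip member begins there. RFC 1952 defines a gzip file as a series of members, so a reader that stops after the first member's footer sees only the first chunk.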

run_perf_test.sh: stderr skewing test results

Currently, stderr is written to result_comp and result_decomp. Since
throughput is calculated with awk ... result_comp, stderr messages can
vastly skew the reported compthroughput value. This problem is also
theoretically possible for decompthroughput, but it was not observed.

We fix this by only writing stdout to result_comp and result_decomp, NOT stderr.
Instead, stderr is written to result_comp_stderr/result_decomp_stderr, which
are deleted when this script is invoked.

I will create a PR addressing this issue.

request: examples using qzCompressStream/qzDecompressStream

It would be helpful to have some examples using the stream API. qzip is limited to the buffer API and the test cases in test/main.c do not provide a real-world example since they attempt to allocate an input buffer equal to the size of the input file rather than setting a suitable buffer and making several calls to the qz function and setting last at the end.

I ask because I've run into a case where qzDecompressStream sets strm->in_sz to a negative number, and the documentation is unclear about how to handle that case. Hopefully I can provide code soon, but the issue appears to be that in_sz is set to copied_input + consumed, where consumed can include bytes from pending_in, which should not be deducted from in_sz. This seems to occur when qzDecompressStream goes through the loop once, processes some data and produces some output, then goes through the loop again but pending_in is less than buf_len, so it returns (with a debug message to batch more data).
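The calling pattern requested above (a fixed-size buffer, several calls, last set only on the final call) can be sketched independently of the library. This is a minimal illustration, not a QATzip example: `feed_in_chunks`, `chunk_fn`, `chunk_stats` and `record_chunk` are hypothetical names, and the callback stands in for a function that would hand each chunk to qzCompressStream or qzDecompressStream and drain the output.

```c
#include <stddef.h>

/* Called once per chunk; `last` is 1 only for the final chunk.
 * Returns 0 to continue or a non-zero error code to stop. */
typedef int (*chunk_fn)(const unsigned char *data, size_t n, int last, void *ctx);

/* Splits src into chunks of at most `chunk` bytes and feeds them to fn.
 * Returns 0 on success, -1 on a bad argument, or the first non-zero
 * value returned by fn. An empty input yields one empty final chunk. */
int feed_in_chunks(const unsigned char *src, size_t len, size_t chunk,
                   chunk_fn fn, void *ctx)
{
    if (chunk == 0 || fn == NULL)
        return -1;
    size_t off = 0;
    do {
        size_t n = (len - off < chunk) ? len - off : chunk;
        int last = (off + n == len);
        int rc = fn(src + off, n, last, ctx);
        if (rc != 0)
            return rc;
        off += n;
    } while (off < len);
    return 0;
}

/* Example callback that only records what it was given. */
typedef struct {
    size_t calls;      /* number of chunks seen          */
    size_t bytes;      /* total bytes seen               */
    int    last_flags; /* how many chunks had last == 1  */
} chunk_stats;

int record_chunk(const unsigned char *data, size_t n, int last, void *ctx)
{
    chunk_stats *st = ctx;
    (void)data;
    st->calls++;
    st->bytes += n;
    st->last_flags += last;
    return 0;
}
```

A real callback would also have to cope with the stream call consuming fewer bytes than it was offered. That is the case where the in_sz accounting described above matters, so the unconsumed tail would need to be carried into the next call.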

QATzip Unable to Find Hardware

I am trying to install QATzip on a box with a QuickAssist 8950. I was able to compile it successfully, but when I run it, I get a message:

"Error no hardware, switch to SW if permitted"

The machine is not virtualized; the intent is to eventually run QATzip inside a (single) Docker container, but I'm currently just trying to get it to work outside of Docker. I am guessing the problem is due to the QAT driver: I built and installed the driver from the 01.org website, but when I run:

service qat_service start

I see the following error message:

[ 1343.743455] dh895xcc 0000:0c:00.0: Cannot use PF with IOMMU enabled

I tried adding intel_iommu=off to grub, but this made no difference. When I run service qat_service status, it reports that it is able to see the QAT acceleration device. I'd greatly appreciate any assistance you can provide.

Question: Xeon D-2100'NT' QAT

The Xeon D-2100 series with the "NT" designation have some amount of integrated QAT. Although it's not called out in the README, do you know if QATzip is compatible with this SOC?

Thanks!

doCompressOut's behavior does not match doDecompressIn/doDecompressOut

This mismatch does not affect behavior with the default compression format, but it is logically incorrect.

doCompressOut:

                outputFooterGen(qz_sess, resl, data_fmt);
                qz_sess->next_dest += outputFooterSz(data_fmt);    <<<< outputFooterSz(data_fmt) could be zero
                qz_sess->qz_out_len += outputFooterSz(data_fmt);   <<<< outputFooterSz(data_fmt) could be zero

doDecompressIn:

        src_avail_len -= (outputHeaderSz(data_fmt) + src_send_sz + stdGzipFooterSz()); <<<< stdGzipFooterSz()
        dest_avail_len -= dest_receive_sz;

        dest_ptr += dest_receive_sz;

        src_ptr += (src_send_sz + stdGzipFooterSz());    <<<<  stdGzipFooterSz() >0 always
        remaining -= (src_send_sz + stdGzipFooterSz());  <<<<< stdGzipFooterSz() >0 always 

doDecompressOut:

           qz_sess->next_dest += resl->produced;
            qz_sess->qz_in_len += (outputHeaderSz(data_fmt) + src_send_sz +
                                   stdGzipFooterSz());
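The effect of the mismatch can be shown with a toy model. This is not QATzip code: the enum and function names are hypothetical, and the header sizes are illustrative (10 bytes for a minimal gzip header per RFC 1952, 4 bytes for the 4-byte-header format, none for raw deflate). The only point is that the compress side sizes the footer by format, while a decompress side that always assumes a gzip footer advances 8 bytes too far for formats that have none.

```c
#include <stddef.h>

/* Toy formats for illustration only. */
typedef enum { FMT_RAW, FMT_4B, FMT_GZIP } toy_fmt;

/* Header size per format (illustrative values). */
size_t toy_header_sz(toy_fmt f)
{
    return f == FMT_GZIP ? 10 : (f == FMT_4B ? 4 : 0);
}

/* A gzip member ends with CRC32 + ISIZE, 8 bytes; the others have no footer. */
size_t toy_footer_sz(toy_fmt f)
{
    return f == FMT_GZIP ? 8 : 0;
}

/* Bytes the compress side emits for one block of `payload` compressed bytes. */
size_t bytes_written(toy_fmt f, size_t payload)
{
    return toy_header_sz(f) + payload + toy_footer_sz(f);
}

/* Bytes a decompress side skips if it always assumes a gzip footer. */
size_t bytes_skipped_fixed_footer(toy_fmt f, size_t payload)
{
    return toy_header_sz(f) + payload + 8;
}
```

For FMT_GZIP the two functions agree, which matches the observation that the default format is unaffected. For FMT_RAW and FMT_4B they differ by 8 bytes per block.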

qzSWCompress fails to stream

I'm using the SW fallback via the streaming API. It looks like qzSWCompress compresses full data blobs only and cannot be used with streaming data.

My use case that fails:
I'm compressing a big file in smaller data blobs, setting last = 1 when the last blob is written. Then I decompress the compressed stream and compare it to my original data. However, reading stops after the first small data blob I sent for compression.

Does QATzip support statically linking the zlib-1.2.11 library?

./sbin/nginx -V
nginx version: nginx/1.16.1
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)
built with OpenSSL 1.1.1g 21 Apr 2020
TLS SNI support enabled
configure arguments: --prefix=/export/servers/asynch_mode_nginx --with-http_ssl_module --with-http_stub_status_module --with-http_v2_module --with-stream --with-stream_ssl_module --with-pcre=/export/servers/myqat/pcre-8.40 --with-pcre-jit --with-zlib=/export/servers/myqat/zlib-1.2.11 --with-pcre-opt='-g -Ofast -fPIC -m64 -march=native -fstack-protector-strong -D_FORTIFY_SOURCE=2' --with-zlib-opt='-g -Ofast -fPIC -m64 -march=native -fstack-protector-strong -D_FORTIFY_SOURCE=2' --add-dynamic-module=modules/nginx_qatzip_module --add-dynamic-module=modules/nginx_qat_module --with-cc-opt=' -fPIC -DNGX_SECURE_MEM -I/export/servers/Openssl-1.1.1g/include -I/export/servers/myqat/qat/QAT-1.7/QATzip/include -Wno-error=deprecated-declarations' --with-ld-opt='-Wl,-rpath=/export/servers/Openssl-1.1.1g/lib -L/export/servers/Openssl-1.1.1g/lib -L/export/servers/myqat/qat/QAT-1.7/QATzip/src -lqatzip -L/export/servers/myqat/zlib-1.2.11 -lz -lssl'

ldd ./sbin/nginx
linux-vdso.so.1 => (0x00007ffddecf2000)
libqatzip.so.1 => /export/servers/qat/QATzip/lib64/libqatzip.so.1 (0x00007ff951723000)
libssl.so.1.1 => /export/servers/Openssl-1.1.1g/lib/libssl.so.1.1 (0x00007ff95148c000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007ff951288000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff95106c000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007ff950e35000)
libcrypto.so.1.1 => /export/servers/Openssl-1.1.1g/lib/libcrypto.so.1.1 (0x00007ff950978000)
libc.so.6 => /lib64/libc.so.6 (0x00007ff9505ab000)
libqat_s.so => /export/servers/myqat/qat/qat-1.7-driver/lib/libqat_s.so (0x00007ff9502d6000)
libusdm_drv_s.so => /export/servers/myqat/qat/qat-1.7-driver/lib/libusdm_drv_s.so (0x00007ff9500be000)
libz.so.1 => /lib64/libz.so.1 (0x00007ff94fea8000)
/lib64/ld-linux-x86-64.so.2 (0x00007ff951cbb000)
libfreebl3.so => /lib64/libfreebl3.so (0x00007ff94fca5000)
libudev.so.1 => /lib64/libudev.so.1 (0x00007ff94fa8f000)
librt.so.1 => /lib64/librt.so.1 (0x00007ff94f887000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007ff94f682000)
libm.so.6 => /lib64/libm.so.6 (0x00007ff94f380000)
libdw.so.1 => /lib64/libdw.so.1 (0x00007ff94f131000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ff94ef1b000)
libattr.so.1 => /lib64/libattr.so.1 (0x00007ff94ed16000)
libelf.so.1 => /lib64/libelf.so.1 (0x00007ff94eafe000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007ff94e8d8000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00007ff94e6c8000)

I found that nginx links against /lib64/libz.so.1, not /export/servers/myqat/zlib-1.2.11. Does QATzip not support statically linking the zlib-1.2.11 library?

Does the QAT Communications Chipset 8950 support stateful Deflate compression?

Hi,
I am trying to run the stateful data compression sample located in /dc/stateful_sample. My QAT device is a Communications Chipset 8950. However, I found that this function is not supported by the device. When I run the sample, the debug output is as follows:

[root@node5 stateful_sample]# ./dc_stateful_sample
main(): Starting Stateful Compression Sample Code App ...
dcStatefulSample(): cpaDcQueryCapabilities
dcStatefulSample(): Error: Unsupported functionality compression
main():
Stateful Compression Sample Code App failed

Then I found the definition of the compression capabilities in sal_compression.c:

#ifdef CNV_STRICT_MODE
pInstanceCapabilities->statefulDeflateCompression = CPA_FALSE;
#else
pInstanceCapabilities->statefulDeflateCompression = CPA_TRUE;

I did not find a definition of CNV_STRICT_MODE, so does my QAT device really not support stateful deflate compression?

Regards.

`qzSWCompress` fails to handle small input buf (20 bytes in this case)

Steps to reproduce:

  1. cd $QATZIP_ROOT && ./configure --with-ICP_ROOT=$ICP_ROOT --enable-debug && make all install
  2. cd $QATZIP_ROOT/test && echo "THIS IS TEST STRING" > tinydata && ./test -m 4 -B swBack -i tinydata -t 1
[root@qat0 test]# echo "THIS IS TEST STRING" > tinydata && ./test -m 4 -B swBack -i tinydata -t 1
2 MB page is 0x2aaaaac00000
Inserting 0x2aaaaac00000 at slot 0
Read 20 bytes from file tinydata
Found 0x2aaaaac00000 at slot 0
Found 0x2aaaaac00000 at slot 0
Hello from qzCompressAndDecompress tid=0, count=2, service=0, verify_data=0
Number of instance: 4
qzInit  rc = 0
qzSetupSession rc = 0
thread 0 before Compressed 20 bytes into 40
qzCompressCrc data_fmt: 2, input crc32 is 0x0
compression src_len=20, sess_params.input_sz_thrshold = 1024, process.qz_init_status = 0, sess->hw_session_stat = 0, qz_sess->sess_params.comp_lvl = 1, switch to software.
ERR: deflate failed with return code: 0
ERROR: Compression FAILED with return value: -2
		freeing 0x2aaaaac01800
Found 0x2aaaaac00000 at slot 0
		freeing 0x2aaaaac01c00
Found 0x2aaaaac00000 at slot 0
		freeing 0x2aaaaac01400
Found 0x2aaaaac00000 at slot 0
Call stopQat.
[INFO]: Compression num_th 1
th_id: 1975191296 comp_hw_count: 0 comp_sw_count: 1 decomp_hw_count: 0 decomp_sw_count: 0

Why include qat driver header files in qatzip.h?

If qatzip.h did not contain #include <cpa_dc.h>, I would not need to add include_directories("${QatDrv_INCLUDE_DIRS}") to CMakeLists.txt, because I have already set the property INTERFACE_INCLUDE_DIRECTORIES "${QatDrv_INCLUDE_DIRS}" on the imported targets. But now that <qatzip.h> is installed in the system include directory, it cannot find <cpa_dc.h>.

So could you move #include <cpa_dc.h> to another file?

The code below is in <qatzip.h>:

#include <cpa_dc.h>
#if (CPA_DC_API_VERSION_NUM_MAJOR >= 3) && (CPA_DC_API_VERSION_NUM_MINOR >= 0)
#define QZ_DEFLATE_COMP_LVL_MAXIMUM   (12)
#else
#define QZ_DEFLATE_COMP_LVL_MAXIMUM   (9)
#endif

I wrote

if (NOT DEFINED ENV{ICP_ROOT})
        message(FATAL_ERROR "Not defined environment variable: ICP_ROOT.")
endif()
message(STATUS "ICP_ROOT=$ENV{ICP_ROOT}")

set(QatDrv_INCLUDE_DIRS
  $ENV{ICP_ROOT}/quickassist/include
  $ENV{ICP_ROOT}/quickassist/include/dc
  $ENV{ICP_ROOT}/quickassist/lookaside/access_layer/include
  $ENV{ICP_ROOT}/quickassist/include/lac
  $ENV{ICP_ROOT}/quickassist/utilities/libusdm_drv
  $ENV{ICP_ROOT}/quickassist/utilities/libusdm_drv/include
  )

set(QatDrv_COMPONENTS qat_s usdm_drv_s)
foreach(component ${QatDrv_COMPONENTS})
  find_library(QatDrv_${component}_LIBRARIES
        NAMES ${component}
        HINTS $ENV{ICP_ROOT}/build/)
  mark_as_advanced(
    QatDrv_INCLUDE_DIRS
    QatDrv_${component}_LIBRARIES)
  list(APPEND QatDrv_LIBRARIES "${QatDrv_${component}_LIBRARIES}")
endforeach()

include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(QatDrv
  REQUIRED_VARS QatDrv_LIBRARIES QatDrv_INCLUDE_DIRS
  )

if(QatDrv_FOUND)
  foreach(component ${QatDrv_COMPONENTS})
    if (NOT TARGET QatDrv::${component})
      add_library(QatDrv::${component} SHARED IMPORTED GLOBAL)
      set_target_properties(QatDrv::${component} PROPERTIES
        INTERFACE_INCLUDE_DIRECTORIES "${QatDrv_INCLUDE_DIRS}"   <<<<<<<<<<<<<< This properties is link to QatDrv::usdm_drv_s and QatDrv::qat_s.
        IMPORTED_LINK_INTERFACE_LANGUAGES "C"
        IMPORTED_LOCATION "${QatDrv_${component}_LIBRARIES}")
    endif()
  endforeach()
endif()

If qatzip.h did not contain #include <cpa_dc.h>, I would not need to add include_directories("${QatDrv_INCLUDE_DIRS}") to CMakeLists.txt. So could you move #include <cpa_dc.h> to another file?

if(HAVE_QATZIP AND HAVE_QATDRV)
   #  include_directories("${QatDrv_INCLUDE_DIRS}")   <============== This is only for  <cpa_dc.h> file. It shouldn't add this here.
  target_link_libraries(compressor PRIVATE
          QatDrv::qat_s
          QatDrv::usdm_drv_s
          qatzip::qatzip
          )
endif()
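Besides moving the include, one possible alternative is to make the public header tolerate a missing driver header. This is only a sketch of that idea, not what qatzip.h does. It relies on `__has_include`, which GCC, Clang and C23 provide, and it falls back to the lower limit when <cpa_dc.h> is not on the include path. `qz_max_level` is a hypothetical helper added for illustration.

```c
/* Include the driver header only when the compiler can find it.
 * The nested #if keeps older preprocessors from choking on __has_include. */
#if defined(__has_include)
#  if __has_include(<cpa_dc.h>)
#    include <cpa_dc.h>
#  endif
#endif

/* Same version test as in the excerpt above, guarded so that it also
 * compiles when the driver macros are not defined at all. */
#if defined(CPA_DC_API_VERSION_NUM_MAJOR) && (CPA_DC_API_VERSION_NUM_MAJOR >= 3)
#  define QZ_DEFLATE_COMP_LVL_MAXIMUM (12)
#else
#  define QZ_DEFLATE_COMP_LVL_MAXIMUM (9)
#endif

/* Hypothetical helper: reports the limit this translation unit was built with. */
int qz_max_level(void)
{
    return QZ_DEFLATE_COMP_LVL_MAXIMUM;
}
```

The trade-off is that the macro's value then depends on each consumer's include path, so an application and the library could disagree about the maximum level. Resolving the value at library build time and exposing it through a function would avoid that.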

Are Xeon D QAT boards supported?

The hardware section here only mentions the QAT adapters. However, QuickAssist is also available on several Xeon D boards. Does this library support QAT on Xeon D as well?

You can see the Skylake Xeon D boards with QAT here.

Do you know the reason for this segfault?

I use QATzip in Ceph, but when I test performance, it always hits this error.

Do you know the reason?

This is from the Ceph log, but the error seems to come from the QATzip library.

 ceph version 17.0.0-9995-g1ba2b31b78c (1ba2b31b78c83869a3d755ed43c82805fc6c1ae8) quincy (dev)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0) [0x7f8c372013c0]
 2: deflateReset()
 3: qzSWCompress()
 4: qzCompress()
 5: (QatAccel::compress(ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list&, boost::optional<int>&)+0xce) [0x7f8c3668cd3e]
 6: (ZlibCompressor::compress(ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list&, boost::optional<int>&)+0x53) [0x7f8c30033ea3]
 7: (RGWPutObj_Compress::process(ceph::buffer::v15_2_0::list&&, unsigned long)+0x8e) [0x7f8c379506fe]
 8: (RGWPutObj::execute(optional_yield)+0xd5c) [0x7f8c37b7d17c]
 9: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, bool)+0x1387) [0x7f8c377552a7]
 10: (process_request(rgw::sal::Store*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSink*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*, int*)+0x3868) [0x7f8c3775a7d8]
 11: /usr/local/lib/libradosgw.so.2(+0x4af6e9) [0x7f8c376be6e9]
 12: /usr/local/lib/libradosgw.so.2(+0x4b0dd5) [0x7f8c376bfdd5]
 13: make_fcontext()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

And this information is from dmesg:

[10796.294735] radosgw[57746]: segfault at 0 ip 00007f5d49498a41 sp 00007f5d3c36cd00 error 6
[10796.294735] radosgw[58036]: segfault at 8 ip 00007f5d4949821b sp 00007f5d246f3cf8 error 4 in libqatzip.so.1.0.7[7f5d49497000+8000]
[10796.294755]  in libqatzip.so.1.0.7[7f5d49497000+8000]

[10796.294765] Code: 95 80 00 00 00 48 83 40 10 01 4a 8d 04 ed 00 00 00 00 48 89 44 24 10 48 8b 81 58 02 00 00 8b 4c 24 30 4a 8b 04 e8 48 8b 40 08 <89> 08 8b 45 20 8d 14 c0 c1 ea 03 81 c2 00 04 00 00 3b 54 24 34 89
[10796.294766] Code: 7e 3f 48 8b 8a 70 02 00 00 48 63 d0 48 8d 34 52 48 8d 14 b2 48 8d 14 d1 eb 10 0f 1f 44 00 00 83 c0 01 48 83 c2 68 39 f8 74 2d <48> 8b 72 08 48 3b 72 10 75 eb 48 3b 72 18 75 e5 48 3b 72 20 75 df
[10796.706838] c6xx 0000:d0:00.0: Process 57492 radosgw exit with orphan rings
[10796.706897] c6xx 0000:d0:00.0: Process 57492 radosgw exit with orphan rings
[10796.706950] c6xx 0000:ce:00.0: Process 57492 radosgw exit with orphan rings
[10796.707002] c6xx 0000:ce:00.0: Process 57492 radosgw exit with orphan rings
[10796.707055] c6xx 0000:cc:00.0: Process 57492 radosgw exit with orphan rings
[10796.707108] c6xx 0000:cc:00.0: Process 57492 radosgw exit with orphan rings

[11418.031149] radosgw[59489]: segfault at 7f14fcd2d46e ip 00007f1bf471852b sp 00007f1b08168cc0 error 4
[11418.031149] radosgw[59346]: segfault at 7f14fbc40503 ip 00007f1bf471852b sp 00007f19eb2fdcc0 error 4
[11418.031149] radosgw[59487]: segfault at 7f14fcd5b31b ip 00007f1bf471852b sp 00007f19eb27ccc0 error 4 in libz.so.1.2.11[7f1bf4716000+11000]
[11418.031167]  in libz.so.1.2.11[7f1bf4716000+11000]
[11418.031167]  in libz.so.1.2.11[7f1bf4716000+11000]


[11418.031174] Code: 01 00 00 8b 93 ac 00 00 00 8b bb a0 00 00 00 44 8b 83 b0 00 00 00 48 8b 4b 60 8d 42 02 8b b3 80 00 00 00 41 89 d1 44 23 4b 58 <0f> b6 04 01 8b 8b 90 00 00 00 d3 e6 48 8b 4b 78 31 f0 23 83 8c 00
[11418.031174] Code: 01 00 00 8b 93 ac 00 00 00 8b bb a0 00 00 00 44 8b 83 b0 00 00 00 48 8b 4b 60 8d 42 02 8b b3 80 00 00 00 41 89 d1 44 23 4b 58 <0f> b6 04 01 8b 8b 90 00 00 00 d3 e6 48 8b 4b 78 31 f0 23 83 8c 00

[11418.031181] Code: 01 00 00 8b 93 ac 00 00 00 8b bb a0 00 00 00 44 8b 83 b0 00 00 00 48 8b 4b 60 8d 42 02 8b b3 80 00 00 00 41 89 d1 44 23 4b 58 <0f> b6 04 01 8b 8b 90 00 00 00 d3 e6 48 8b 4b 78 31 f0 23 83 8c 00
[11418.031199] radosgw[59364]: segfault at 7f14fd19f0ee ip 00007f1bf471852b sp 00007f1b30168cc0 error 4
[11418.031207] Code: 01 00 00 8b 93 ac 00 00 00 8b bb a0 00 00 00 44 8b 83 b0 00 00 00 48 8b 4b 60 8d 42 02 8b b3 80 00 00 00 41 89 d1 44 23 4b 58 <0f> b6 04 01 8b 8b 90 00 00 00 d3 e6 48 8b 4b 78 31 f0 23 83 8c 00
[11418.031208] radosgw[59213]: segfault at 7f14fce84066 ip 00007f1bf471852b sp 00007f1b24177cc0 error 4
[11418.031217] Code: 01 00 00 8b 93 ac 00 00 00 8b bb a0 00 00 00 44 8b 83 b0 00 00 00 48 8b 4b 60 8d 42 02 8b b3 80 00 00 00 41 89 d1 44 23 4b 58 <0f> b6 04 01 8b 8b 90 00 00 00 d3 e6 48 8b 4b 78 31 f0 23 83 8c 00
[11418.031231] radosgw[59538]: segfault at 7f14fc6c30fc ip 00007f1bf471852b sp 00007f1bb8364cc0 error 4
[11418.031233] radosgw[59164]: segfault at 7f14fcc635fb ip 00007f1bf471852b sp 00007f1b0046ecc0 error 4


[11418.031242] Code: 01 00 00 8b 93 ac 00 00 00 8b bb a0 00 00 00 44 8b 83 b0 00 00 00 48 8b 4b 60 8d 42 02 8b b3 80 00 00 00 41 89 d1 44 23 4b 58 <0f> b6 04 01 8b 8b 90 00 00 00 d3 e6 48 8b 4b 78 31 f0 23 83 8c 00
[11418.031243] Code: 01 00 00 8b 93 ac 00 00 00 8b bb a0 00 00 00 44 8b 83 b0 00 00 00 48 8b 4b 60 8d 42 02 8b b3 80 00 00 00 41 89 d1 44 23 4b 58 <0f> b6 04 01 8b 8b 90 00 00 00 d3 e6 48 8b 4b 78 31 f0 23 83 8c 00
[11418.031242] radosgw[59253]: segfault at 7f14fbf50a54 ip 00007f1bf471852b sp 00007f1b3046ecc0 error 4

[11418.031250] Code: 01 00 00 8b 93 ac 00 00 00 8b bb a0 00 00 00 44 8b 83 b0 00 00 00 48 8b 4b 60 8d 42 02 8b b3 80 00 00 00 41 89 d1 44 23 4b 58 <0f> b6 04 01 8b 8b 90 00 00 00 d3 e6 48 8b 4b 78 31 f0 23 83 8c 00
[11418.031253] radosgw[59413]: segfault at 7f14fd1b135d ip 00007f1bf471852b sp 00007f1bd81e9cc0 error 4
[11418.031258] Code: 01 00 00 8b 93 ac 00 00 00 8b bb a0 00 00 00 44 8b 83 b0 00 00 00 48 8b 4b 60 8d 42 02 8b b3 80 00 00 00 41 89 d1 44 23 4b 58 <0f> b6 04 01 8b 8b 90 00 00 00 d3 e6 48 8b 4b 78 31 f0 23 83 8c 00
[11418.031284] radosgw[59382]: segfault at 1748 ip 00007f1bf470a921 sp 00007f1b106f3d90 error 4
[11418.031296] Code: 8b 04 20 4c 8b 68 08 4d 85 ed 0f 84 8c 00 00 00 ba 01 00 00 00 31 f6 4c 89 ff e8 da cf ff ff 48 8b 53 10 49 89 45 08 48 01 ea <4c> 8b aa 48 02 00 00 4b 8b 4c 25 00 48 8b 41 08 48 83 78 08 00 74
[11418.270954] c6xx 0000:d0:00.0: Process 59016 radosgw exit with orphan rings

QATzip Test Performance Metrics

Hi,

What kind of performance should be expected with qzip and $QZ_ROOT/test/performance_tests/run_perf_test.sh?
Also, what steps can be taken to improve performance of $QZ_ROOT/test/test and qzip?

The only metrics I have found online are from NGINX, which recorded 68 Gbps of verified compression.

However, attempts with run_perf_test.sh have shown much smaller numbers:

compthroughput=8.42882 Gbps
decompthroughput=6.6209 Gbps

This is with the C62X chipset.

Thanks!

Misleading instructions in README

I tried to follow the README when testing ./util/qzip on a machine with QAT HW, but it failed constantly with

mmap: exceeded max huge pages allocations for this process

It turned out the README is a bit misleading. It says:

Enable huge page

echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
rmmod usdm_drv
insmod $ICP_ROOT/build/usdm_drv.ko max_huge_pages=1024 max_huge_pages_per_process=16

It looks like the library allocates one huge page per CPU core. My machine has more than 16 cores, and I got qzip working only after replacing 16 with the number of CPU cores.
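
A rough sketch of that workaround, in Python: it derives the per-process limit from the core count and prints the matching insmod line. That one huge page per core is needed is only the observation reported above, not something the driver documentation confirms; the helper name `usdm_insmod_command` is hypothetical, and the module path and parameters are copied from the README excerpt.

```python
import os


def usdm_insmod_command(icp_root, cores, max_huge_pages=1024):
    """Build the insmod line with one huge page per CPU core per process.

    Assumption (from the report above, not from the driver docs): each
    process needs at least as many huge pages as there are CPU cores.
    """
    # The per-process limit cannot usefully exceed the global limit.
    per_process = min(cores, max_huge_pages)
    return (
        f"insmod {icp_root}/build/usdm_drv.ko "
        f"max_huge_pages={max_huge_pages} "
        f"max_huge_pages_per_process={per_process}"
    )


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # cpu_count() may return None
    print(usdm_insmod_command("$ICP_ROOT", cores))
```

The printed line still has to be run as root after `rmmod usdm_drv`, as in the README steps quoted above.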

run_perf_test.sh: move shebang to top of file

Hi, please move the shebang in QATzip/test/performance_tests/run_perf_test.sh to the top of the file so that it will work properly [with Ubuntu 18.04]. You have recorded similar issues with #13 and #25, however this issue is for a different file. I will attach a PR related to this issue.

Using a relative path for $ICP_ROOT leads to a compile error

Either make this clear in the README or fix it in the Makefile (ideally the latter). Using an absolute path for $ICP_ROOT works as expected.

$ ./configure --with-ICP_ROOT=../driver && make all
<snip>
gcc -Wall -Werror -std=gnu99 -pedantic -O2 -fstack-protector -fPIE -fPIC -D_FORTIFY_SOURCE=2 -fno-strict-overflow -fno-delete-null-pointer-checks -fwrapv -m64 -DADF_PCI_API -I../driver/quickassist/include -I../driver/quickassist/include/dc -I../driver/quickassist/lookaside/access_layer/include -I../driver/quickassist/utilities/libusdm_drv -I/home/[email protected]/tmp-qat/QATzip/include -I/home/[email protected]/tmp-qat/QATzip/src -c qatzip.c -o qatzip.o
qatzip.c:48:10: fatal error: cpa.h: No such file or directory
 #include "cpa.h"
          ^~~~~~~
compilation terminated.
Makefile:58: recipe for target 'qatzip.o' failed
make[1]: *** [qatzip.o] Error 1
make[1]: Leaving directory '/home/[email protected]/tmp-qat/QATzip/src'
Makefile:91: recipe for target 'libqatzip.a' failed
make: *** [libqatzip.a] Error 2

This appears to be a duplicate of this issue but it is hard to tell since Intel blocked the discussion for non-collaborators ;-).
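
A likely explanation, judging only from the log above: the `-I../driver/...` flags are passed unchanged to a compiler that runs inside `src/`, so the relative path is resolved against `src/` instead of the top-level directory. The Python sketch below illustrates that path arithmetic; the `/work/...` layout and the helper `resolve_include` are hypothetical.

```python
import posixpath


def resolve_include(cwd, include_dir):
    """Resolve an -I path the way the compiler does: relative to its cwd."""
    return posixpath.normpath(posixpath.join(cwd, include_dir))


# Hypothetical layout: QATzip checked out next to the driver directory.
top = "/work/QATzip"
icp_root = "../driver"
include = posixpath.join(icp_root, "quickassist/include")

# Where the user meant the path to point (relative to the top-level dir).
intended = resolve_include(top, include)
# Where it points once make has changed into src/.
actual = resolve_include(posixpath.join(top, "src"), include)

print(intended)  # /work/driver/quickassist/include
print(actual)    # /work/QATzip/driver/quickassist/include

# An absolute ICP_ROOT resolves identically from any directory.
abs_root = resolve_include(top, icp_root)
print(resolve_include(posixpath.join(top, "src"), abs_root))  # /work/driver
```

If this is indeed the cause, converting the value to an absolute path at configure time would make both invocations behave the same.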

Inaccurate calculation of throughput in performance test

I noticed that run_perf_test.sh calculates the compression/decompression throughput by simply adding up the throughput numbers of all the threads in result_comp/result_decomp.

for((numProc_comp = 0; numProc_comp < $process; numProc_comp ++))
do
    $QZ_ROOT/test/test -m 4 -l 1000 -t $thread -D comp $extra_args >> result_comp 2>&1  &
done
wait
compthroughput=`awk '{sum+=$8} END{print sum}' result_comp`
echo "compthroughput=$compthroughput Gbps"

echo > result_decomp
for((numProc_decomp = 0; numProc_decomp < $process; numProc_decomp ++))
do
    $QZ_ROOT/test/test -m 4 -l 1000 -t $thread -D decomp $extra_args >> result_decomp 2>&1  &
done
wait
decompthroughput=`awk '{sum+=$8} END{print sum}' result_decomp`
echo "decompthroughput=$decompthroughput Gbps"

This assumes that all threads start and finish at the same time, which is evidently not true in most cases.

For example, below is the output of result_comp after running the performance test script:
[INFO] srv=COMP, tid=1, verify=0, count=1000, msec=8782055, bytes=1029744, 0.873621 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=0, verify=0, count=1000, msec=8895848, bytes=1029744, 0.862446 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=3, verify=0, count=1000, msec=9924964, bytes=1029744, 0.773019 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=2, verify=0, count=1000, msec=9935799, bytes=1029744, 0.772177 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=0, verify=0, count=1000, msec=8787976, bytes=1029744, 0.873033 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=1, verify=0, count=1000, msec=8872350, bytes=1029744, 0.864730 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=2, verify=0, count=1000, msec=9932433, bytes=1029744, 0.772438 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=3, verify=0, count=1000, msec=9947483, bytes=1029744, 0.771270 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=0, verify=0, count=1000, msec=8867084, bytes=1029744, 0.865244 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=2, verify=0, count=1000, msec=8892059, bytes=1029744, 0.862814 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=1, verify=0, count=1000, msec=9919737, bytes=1029744, 0.773427 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=3, verify=0, count=1000, msec=9930240, bytes=1029744, 0.772609 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=0, verify=0, count=1000, msec=8843361, bytes=1029744, 0.867565 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=2, verify=0, count=1000, msec=8902020, bytes=1029744, 0.861848 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=1, verify=0, count=1000, msec=9906058, bytes=1029744, 0.774495 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=3, verify=0, count=1000, msec=9932502, bytes=1029744, 0.772433 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=3, verify=0, count=1000, msec=7337955, bytes=1029744, 1.045549 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=2, verify=0, count=1000, msec=7390413, bytes=1029744, 1.038127 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=0, verify=0, count=1000, msec=10132327, bytes=1029744, 0.757199 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=1, verify=0, count=1000, msec=10137029, bytes=1029744, 0.756848 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=2, verify=0, count=1000, msec=7311454, bytes=1029744, 1.049339 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=3, verify=0, count=1000, msec=7355766, bytes=1029744, 1.043017 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=1, verify=0, count=1000, msec=10117367, bytes=1029744, 0.758319 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=0, verify=0, count=1000, msec=10140230, bytes=1029744, 0.756609 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=0, verify=0, count=1000, msec=7371144, bytes=1029744, 1.040841 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=1, verify=0, count=1000, msec=7394235, bytes=1029744, 1.037591 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=3, verify=0, count=1000, msec=10113654, bytes=1029744, 0.758597 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=2, verify=0, count=1000, msec=10136290, bytes=1029744, 0.756903 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=2, verify=0, count=1000, msec=7283644, bytes=1029744, 1.053345 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=3, verify=0, count=1000, msec=7319118, bytes=1029744, 1.048240 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=0, verify=0, count=1000, msec=10126481, bytes=1029744, 0.757636 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%
[INFO] srv=COMP, tid=1, verify=0, count=1000, msec=10134585, bytes=1029744, 0.757031 Gbps, input_len=1029744, comp_len=211675, ratio=20.556080%

The compression throughput printed to the terminal is:
compthroughput=27.5284 Gbps

Wouldn't it be more accurate to always compute it as follows?

Throughput = Total Processed Data of All Threads / (End Time of Last Thread - Start Time of First Thread)
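
To make the difference concrete, here is a small Python sketch comparing the two calculations. It assumes each thread reports a start time, an end time, and a byte count; the test program's output shown above does not include start and end timestamps, so the `(start_s, end_s, bytes)` records and both function names are hypothetical.

```python
def summed_throughput(records):
    """Sum of per-thread rates in bit/s (the approach the script uses)."""
    return sum(nbytes * 8 / (end - start) for start, end, nbytes in records)


def wall_clock_throughput(records):
    """Total bits divided by the span from first start to last finish."""
    first_start = min(start for start, _, _ in records)
    last_end = max(end for _, end, _ in records)
    total_bits = sum(nbytes * 8 for _, _, nbytes in records)
    return total_bits / (last_end - first_start)


# Hypothetical (start_s, end_s, bytes) per thread; the second thread
# starts only after the first has finished, so they never overlap.
records = [(0.0, 1.0, 1_000_000_000), (1.0, 2.0, 1_000_000_000)]

print(summed_throughput(records) / 1e9)      # 16.0
print(wall_clock_throughput(records) / 1e9)  # 8.0
```

In this extreme case the summed figure is twice the rate the system actually sustained; the two agree only when all threads run over the same interval.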

dataLenInBytes in destBuffers set to wrong size.

g_process.qz_inst[i].dest_buffers[j]->pBuffers->dataLenInBytes

This seems to set the destBuffer pData size to the maximum expected size of the compressed input data, not the amount of memory that has actually been allocated for this buffer via qzMalloc on line 779.

You've probably not run into an issue with this because the HW probably doesn't really care about the value of this field. I'm trying to trap and forward QAT API calls, and I kept getting a segfault when trying to copy destBuffer (my forwarding library was trying to memcpy dataLenInBytes bytes from pData and was either copying garbage off the heap or running into an unmapped page)...

Compiling outside source directory doesn't work

When integrating QATzip into sandboxed build environments, it is important that compilation works outside the source code directory. Currently it doesn't:

[QATzip]$ mkdir b
[QATzip]$ cd b
[QATzip/b]$ ../configure
../configure: line 48: src/qatzip_internal.h: No such file or directory

Assertion failure in qzCompress() that sometimes occurs with the QZ_DEFLATE_RAW format

OS: Ubuntu 18.04.6 x86_64 Server
Kernel: 5.4.0-65
HW: Intel QuickAssist 8970
 

QzSessionParams

direction: QZ_DIR_COMPRESS
data_fmt: QZ_DEFLATE_RAW
max_forks: 0

other ... Default
 

Console output:

qatzip.c:1687: qzCompressCrc: Assertion `*dest_len == sess->total_out' failed.
 

gdb backtrace

(gdb) bt
#0  0x00007ff4a51a8fb7 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ff4a51aa921 in __GI_abort () at abort.c:79
#2  0x00007ff4a519a48a in __assert_fail_base (fmt=0x7ff4a5321750 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0x7ff26acec331 "*dest_len == sess->total_out", file=file@entry=0x7ff26acec328 "qatzip.c",
    line=line@entry=1687, function=function@entry=0x7ff26acec6c0 <__PRETTY_FUNCTION__.6849> "qzCompressCrc") at assert.c:92
#3  0x00007ff4a519a502 in __GI___assert_fail (assertion=assertion@entry=0x7ff26acec331 "*dest_len == sess->total_out",
    file=file@entry=0x7ff26acec328 "qatzip.c", line=line@entry=1687,
    function=function@entry=0x7ff26acec6c0 <__PRETTY_FUNCTION__.6849> "qzCompressCrc") at assert.c:101
#4  0x00007ff26ace90f1 in qzCompressCrc (sess=sess@entry=0x7ff245b3b800, src=src@entry=0x7ff1a8c0b010 "",
    src_len=0x7ff2537fd8f0, dest=0x7ff0056b0010 "\354]\t`\023U\032~\223\206\022.\023<\353IЪqQ)\326u\213EH -\023,l=\351*J\261\002\301
    \213.\024-\352B\240T\023\306j\301\253\270\256ƃ5\353YE\245\350*)\b\004P)\340Q\320Ŋ\256\004\253R
    \217Ū\v\331\367\336\377\377i\337@\345Xtw\335|M\372\315\377\375\377;\346\275\067of^&-c̢\365pZ\323KKN\237\064\346Za\362\067
    ;,\213\343l\361:\343\364\254\276\327\217\037s\372\365\340e\032\377uh\337I\023\257u\217-+\351;\246\244tB߲1W\\3v2\023\340\211;\363\234\256
    \230\060\021l+c\275\222Ye\213\254&L*\233\060\361\264\353Ɩ"..., dest_len=0x7ff2537fd8f4, last=1, crc=<optimized out>) at qatzip.c:1687
#5  0x00007ff26ace9135 in qzCompress (sess=0x7ff245b3b800, src=0x7ff1a8c0b010 "", src_len=<optimized out>, dest=<optimized out>,
    dest_len=<optimized out>, last=<optimized out>) at qatzip.c:1524
#6  0x000055a0ca76ae10 in ft_compress_worker_func (seg_state=0x7ff24400d870, comp_state=0x7ff244132460) at migration/ft-message.c:945
#7  0x00007ff4a91d2c70 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#8  0x00007ff4a91d22a5 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#9  0x00007ff4a55626db in start_thread (arg=0x7ff2537fe700) at pthread_create.c:463
#10 0x00007ff4a528b71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) 

 

Update:

Initial understanding after testing: this occurs during raw-format compression when resl->status equals CPA_DC_VERIFY_ERROR.

make install problems

I've discovered the following installation gaps with v1.0.1:

  1. --prefix=/usr fails
ln -s -f /usr/lib64/libqatzip.so /lib64
ln: '/usr/lib64/libqatzip.so' and '/lib64/libqatzip.so' are the same file
make: *** [Makefile:78: install] Error 1
  2. make install does not handle SOVERSIONS correctly but only copies the .so file

Compile error on Ubuntu 16.04

I am trying to compile this on Ubuntu, and I am getting this link error:
$ make all install
..
..
gcc qzip.o -o qzip -fstack-protector -fPIC -pie -z relro -z now -Wl,-z,noexecstack -L/home/zhuang1/QAT/QAT1.7.Upstream.L.1.0.3_42/build -Wl,-R/home/zhuang1/QAT/QAT1.7.Upstream.L.1.0.3_42/build /home/zhuang1/QAT/QATzip/src/libqatzip.a -lqat_s -lusdm_drv_s -lz -lpthread
/home/zhuang1/QAT/QATzip/src/libqatzip.a(qatzip.o): In function `qzInit':
qatzip.c:(.text+0x209c): undefined reference to `icp_adf_get_numDevices'
collect2: error: ld returned 1 exit status
Makefile:40: recipe for target 'qzip' failed
make[1]: *** [qzip] Error 1
make[1]: Leaving directory '/home/zhuang1/QAT/QATzip/utils'
Makefile:93: recipe for target 'qzip' failed
make: *** [qzip] Error 2

I don't see icp_adf_get_numDevices() defined anywhere, any ideas?

qzip: snprintf format-truncation

qzip.c: In function ‘mkPath’:
qzip.c:478:42: error: ‘%s’ directive output may be truncated writing up to 1023 bytes into a region of size between 0 and 1023 [-Werror=format-truncation=]
  478 |         snprintf(path, MAX_PATH_LEN, "%s/%s", dirpath, file);
      |                                          ^~
In file included from /usr/include/stdio.h:867,
                 from qzip.c:41:
/usr/include/bits/stdio2.h:67:10: note: ‘__builtin___snprintf_chk’ output between 2 and 2048 bytes into a destination of size 1024
   67 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   68 |        __bos (__s), __fmt, __va_arg_pack ());
      |        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[1]: *** [Makefile:43: qzip.o] Error 1
make[1]: Leaving directory '/QATzip/utils'
make: *** [Makefile:97: qzip] Error 2

My fix:

--- a/utils/qzip.c
+++ b/utils/qzip.c
@@ -474,7 +474,7 @@ void mkPath(char *path, const char *dirpath, char *file)

     len = strlen(dirpath);

-    if (len < MAX_PATH_LEN && strlen(file) < MAX_PATH_LEN - len) {
+    if (len + strlen(file) + 1 < MAX_PATH_LEN) {
         snprintf(path, MAX_PATH_LEN, "%s/%s", dirpath, file);
     } else {
         assert(0);
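
The byte arithmetic behind the fix can be checked with a short Python sketch. It mirrors the two conditions from the diff rather than the actual C code; `MAX_PATH_LEN = 1024` is taken from the destination size in the compiler note, and the function names `fits` and `old_check` are hypothetical.

```python
MAX_PATH_LEN = 1024  # destination size reported in the compiler note above


def fits(dirpath, file, max_len=MAX_PATH_LEN):
    """True if "dirpath/file" plus the trailing NUL fits in max_len bytes."""
    needed = len(dirpath) + 1 + len(file) + 1  # '/' separator and NUL
    return needed <= max_len


def old_check(dirpath, file, max_len=MAX_PATH_LEN):
    """The original condition, which ignores the separator and the NUL."""
    n = len(dirpath)
    return n < max_len and len(file) < max_len - n


# 512 + 1 + 511 + 1 = 1025 bytes: one more than the buffer holds.
d, f = "d" * 512, "f" * 511
print(old_check(d, f), fits(d, f))  # True False
```

`fits` is equivalent to the patched condition `len + strlen(file) + 1 < MAX_PATH_LEN`; the sketch uses ASCII strings so that character counts equal byte counts.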

Can the c3xx series support qatzip tools?

I want to compress one file into another using a c3xx-series QAT device, but I don't know how.
How can I use QATzip to compress a file that is 1 GB in size?
Thank you!

How do I start multiple processes with the same QAT section?

I use QATzip in Ceph. During debugging, with a single OSD there is no error and everything works normally, but when I start multiple OSDs I get this error message:

Error userStarMultiProcess(-1), switch to SW if permitted
g_process.qz_init_status = QZ_NO_HW

All OSDs use the same section: export QAT_SECTION_NAME=SSL

So how do I start multiple processes with the same QAT section?

non-standard/incomplete configure/make scripts

Reproduced on tags/v0.2.7.
./configure --prefix=/opt/qat --includedir=/opt/qat/include
make -j8
make install, which prints:
install -D -m 750 /home/QATzip/src/libqatzip.a /opt/qat/lib
install -D -m 750 /home/QATzip/src/libqatzip.so /opt/qat/lib
install -D -m 750 /home/QATzip/include/qatzip.h /usr/include/
ln -s -f /opt/qat/lib/libqatzip.so /lib64/libqatzip.so
install -D -m 777 /home/QATzip/utils/qzip /opt/qat/bin

The include file is installed to /usr/include, regardless of any prefix or includedir setting.
The lib file is installed to /lib64, regardless of any prefix setting; it is only symlinked back to /opt/qat.
The bin and lib subfolders are not created. Instead, the files are copied to paths named bin and lib, each overwriting the previous one:

ls -l /opt/qat
-rwxrwxrwx 1 root root 57552 Dec 30 05:26 bin
-rwxr-x--- 1 root root 50904 Dec 30 05:26 lib

make install DESTDIR=/home/build
This has no effect. Other configure-based projects support this installation redirection.
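
For reference, the usual GNU convention is that every install target is written as `$(DESTDIR)$(prefix)/...`, with DESTDIR simply prepended. A minimal Python sketch of that path composition follows; the helper `install_target` is hypothetical and only illustrates the convention, not the current QATzip Makefile.

```python
import posixpath


def install_target(prefix, subdir, filename, destdir=""):
    """Compose $(DESTDIR)$(prefix)/<subdir>/<filename> as GNU makefiles do."""
    return destdir + posixpath.join(prefix, subdir, filename)


print(install_target("/opt/qat", "lib", "libqatzip.so"))
# /opt/qat/lib/libqatzip.so
print(install_target("/opt/qat", "include", "qatzip.h", destdir="/home/build"))
# /home/build/opt/qat/include/qatzip.h
```

With this scheme, bin, lib, and include become directories under the prefix, and a staged install never touches /usr/include or /lib64.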

Single threaded execution results in a hung process

doCompressIn((void *)sess);

As doCompressIn is currently written, it does not return until the entire input has been offloaded to the QAT. doCompressOut must run alongside it in a separate thread, dequeuing the requests the QAT has completed, for doCompressIn to finish...
So the attempt to execute in a single thread on line 1580 results in a deadlock.
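
The pattern can be modelled in a few lines of Python. This is a toy model, not the QATzip code: it assumes the device accepts only a fixed number of in-flight requests (`RING_SIZE`, a made-up value) and that a submitter with a full ring must wait for a response to be dequeued. Both function names are hypothetical.

```python
from collections import deque

RING_SIZE = 4  # hypothetical number of in-flight request slots


def submit_all_then_poll(chunks):
    """Models the reported pattern: no polling until all input is submitted."""
    ring = deque()
    for chunk in chunks:
        if len(ring) == RING_SIZE:
            # A real single-threaded caller would spin here forever,
            # because nothing ever drains the ring.
            return None
        ring.append(chunk)
    return list(ring)


def interleaved(chunks):
    """Single-threaded alternative: drain one response when the ring is full."""
    ring, done = deque(), []
    for chunk in chunks:
        if len(ring) == RING_SIZE:
            done.append(ring.popleft())  # poll one completed request
        ring.append(chunk)
    done.extend(ring)  # collect what is still in flight
    return done


chunks = list(range(10))
print(submit_all_then_poll(chunks))  # None (the real code would hang)
print(interleaved(chunks))           # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Under these assumptions, a single-threaded path only works if submission and polling alternate, or if the input is small enough to fit in the ring at once.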
