Skip to content

Commit

Permalink
zstd: Import upstream v1.5.7
Browse files Browse the repository at this point in the history
In addition to keeping the kernel's copy of zstd up to date, this update
was requested by Intel to expose upstream's APIs that allow QAT to accelerate
the LZ match finding stage of Zstd.

This patch is imported from the upstream tag v1.5.7-kernel [0], which is signed
with upstream's signing key EF8FE99528B52FFD [1]. It was imported from upstream
using this command:

  export ZSTD=/path/to/repo/zstd/
  export LINUX=/path/to/repo/linux/
  cd "$ZSTD/contrib/linux-kernel"
  git checkout v1.5.7-kernel
  make import LINUX="$LINUX"

This patch has been tested on x86-64, and has been boot tested with
a zstd compressed kernel & initramfs on i386 and aarch64. I benchmarked
the patch on x86-64 with gcc-14.2.1 on an Intel i9-9900K by measruing the
performance of compressed filesystem reads and writes.

Component,  Level,  Size delta, C. time delta,  D. time delta
Btrfs    ,      1,      +0.00%,         -6.1%,          +1.4%
Btrfs    ,      3,      +0.00%,         -9.8%,          +3.0%
Btrfs    ,      5,      +0.00%,         +1.7%,          +1.4%
Btrfs    ,      7,      +0.00%,         -1.9%,          +2.7%
Btrfs    ,      9,      +0.00%,         -3.4%,          +3.7%
Btrfs    ,     15,      +0.00%,         -0.3%,          +3.6%
SquashFS ,      1,      +0.00%,           N/A,          +1.9%

The major changes that impact the kernel use cases for each version are:

v1.5.7: https://github.com/facebook/zstd/releases/tag/v1.5.7
* Add zstd_compress_sequences_and_literals() for use by Intel's QAT driver
  to implement Zstd compression acceleration in the kernel.
* Fix an underflow bug in 32-bit builds that can cause data corruption when
  processing more than 4GB of data with a single `ZSTD_CCtx` object, when an
  input crosses the 4GB boundry. I don't believe this impacts any current kernel
  use cases, because the `ZSTD_CCtx` is typically reconstructed between
  compressions.
* Levels 1-4 see 5-10% compression speed improvements for inputs smaller than
  128KB.

v1.5.6: https://github.com/facebook/zstd/releases/tag/v1.5.6
* Improved compression ratio for the highest compression levels. I don't expect
  these see much use however, due to their slow speeds.

v1.5.5: https://github.com/facebook/zstd/releases/tag/v1.5.5
* Fix a rare corruption bug that can trigger on levels 13 and above.
* Improve compression speed of levels 5-11 on incompressible data.

v1.5.4: https://github.com/facebook/zstd/releases/tag/v1.5.4
* Improve copmression speed of levels 5-11 on ARM.
* Improve dictionary compression speed.

Signed-off-by: Nick Terrell <terrelln@fb.com>
  • Loading branch information
Nick Terrell committed Mar 13, 2025
1 parent 7eb1721 commit 65d1f55
Show file tree
Hide file tree
Showing 60 changed files with 8,749 additions and 4,381 deletions.
87 changes: 82 additions & 5 deletions include/linux/zstd.h
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: GPL-2.0+ OR BSD-3-Clause */
/*
* Copyright (c) Yann Collet, Facebook, Inc.
* Copyright (c) Meta Platforms, Inc. and affiliates.
* All rights reserved.
*
* This source code is licensed under both the BSD-style license (found in the
Expand Down Expand Up @@ -160,7 +160,6 @@ typedef ZSTD_parameters zstd_parameters;
zstd_parameters zstd_get_params(int level,
unsigned long long estimated_src_size);


/**
* zstd_get_cparams() - returns zstd_compression_parameters for selected level
* @level: The compression level
Expand All @@ -173,9 +172,20 @@ zstd_parameters zstd_get_params(int level,
zstd_compression_parameters zstd_get_cparams(int level,
unsigned long long estimated_src_size, size_t dict_size);

/* ====== Single-pass Compression ====== */

typedef ZSTD_CCtx zstd_cctx;
typedef ZSTD_cParameter zstd_cparameter;

/**
* zstd_cctx_set_param() - sets a compression parameter
* @cctx: The context. Must have been initialized with zstd_init_cctx().
* @param: The parameter to set.
* @value: The value to set the parameter to.
*
* Return: Zero or an error, which can be checked using zstd_is_error().
*/
size_t zstd_cctx_set_param(zstd_cctx *cctx, zstd_cparameter param, int value);

/* ====== Single-pass Compression ====== */

/**
* zstd_cctx_workspace_bound() - max memory needed to initialize a zstd_cctx
Expand All @@ -190,6 +200,20 @@ typedef ZSTD_CCtx zstd_cctx;
*/
size_t zstd_cctx_workspace_bound(const zstd_compression_parameters *parameters);

/**
* zstd_cctx_workspace_bound_with_ext_seq_prod() - max memory needed to
* initialize a zstd_cctx when using the block-level external sequence
* producer API.
* @parameters: The compression parameters to be used.
*
* If multiple compression parameters might be used, the caller must call
* this function for each set of parameters and use the maximum size.
*
* Return: A lower bound on the size of the workspace that is passed to
* zstd_init_cctx().
*/
size_t zstd_cctx_workspace_bound_with_ext_seq_prod(const zstd_compression_parameters *parameters);

/**
* zstd_init_cctx() - initialize a zstd compression context
* @workspace: The workspace to emplace the context into. It must outlive
Expand Down Expand Up @@ -424,6 +448,16 @@ typedef ZSTD_CStream zstd_cstream;
*/
size_t zstd_cstream_workspace_bound(const zstd_compression_parameters *cparams);

/**
* zstd_cstream_workspace_bound_with_ext_seq_prod() - memory needed to initialize
* a zstd_cstream when using the block-level external sequence producer API.
* @cparams: The compression parameters to be used for compression.
*
* Return: A lower bound on the size of the workspace that is passed to
* zstd_init_cstream().
*/
size_t zstd_cstream_workspace_bound_with_ext_seq_prod(const zstd_compression_parameters *cparams);

/**
* zstd_init_cstream() - initialize a zstd streaming compression context
* @parameters The zstd parameters to use for compression.
Expand Down Expand Up @@ -583,6 +617,18 @@ size_t zstd_decompress_stream(zstd_dstream *dstream, zstd_out_buffer *output,
*/
size_t zstd_find_frame_compressed_size(const void *src, size_t src_size);

/**
* zstd_register_sequence_producer() - exposes the zstd library function
* ZSTD_registerSequenceProducer(). This is used for the block-level external
* sequence producer API. See upstream zstd.h for detailed documentation.
*/
typedef ZSTD_sequenceProducer_F zstd_sequence_producer_f;
void zstd_register_sequence_producer(
zstd_cctx *cctx,
void* sequence_producer_state,
zstd_sequence_producer_f sequence_producer
);

/**
* struct zstd_frame_params - zstd frame parameters stored in the frame header
* @frameContentSize: The frame content size, or ZSTD_CONTENTSIZE_UNKNOWN if not
Expand All @@ -596,7 +642,7 @@ size_t zstd_find_frame_compressed_size(const void *src, size_t src_size);
*
* See zstd_lib.h.
*/
typedef ZSTD_frameHeader zstd_frame_header;
typedef ZSTD_FrameHeader zstd_frame_header;

/**
* zstd_get_frame_header() - extracts parameters from a zstd or skippable frame
Expand All @@ -611,4 +657,35 @@ typedef ZSTD_frameHeader zstd_frame_header;
size_t zstd_get_frame_header(zstd_frame_header *params, const void *src,
size_t src_size);

/**
* struct zstd_sequence - a sequence of literals or a match
*
* @offset: The offset of the match
* @litLength: The literal length of the sequence
* @matchLength: The match length of the sequence
* @rep: Represents which repeat offset is used
*/
typedef ZSTD_Sequence zstd_sequence;

/**
* zstd_compress_sequences_and_literals() - compress an array of zstd_sequence and literals
*
* @cctx: The zstd compression context.
* @dst: The buffer to compress the data into.
* @dst_capacity: The size of the destination buffer.
* @in_seqs: The array of zstd_sequence to compress.
* @in_seqs_size: The number of sequences in in_seqs.
* @literals: The literals associated to the sequences to be compressed.
* @lit_size: The size of the literals in the literals buffer.
* @lit_capacity: The size of the literals buffer.
* @decompressed_size: The size of the input data
*
* Return: The compressed size or an error, which can be checked using
* zstd_is_error().
*/
size_t zstd_compress_sequences_and_literals(zstd_cctx *cctx, void* dst, size_t dst_capacity,
const zstd_sequence *in_seqs, size_t in_seqs_size,
const void* literals, size_t lit_size, size_t lit_capacity,
size_t decompressed_size);

#endif /* LINUX_ZSTD_H */
30 changes: 20 additions & 10 deletions include/linux/zstd_errors.h
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: GPL-2.0+ OR BSD-3-Clause */
/*
* Copyright (c) Yann Collet, Facebook, Inc.
* Copyright (c) Meta Platforms, Inc. and affiliates.
* All rights reserved.
*
* This source code is licensed under both the BSD-style license (found in the
Expand All @@ -12,13 +13,18 @@
#define ZSTD_ERRORS_H_398273423


/*===== dependency =====*/
#include <linux/types.h> /* size_t */
/* ===== ZSTDERRORLIB_API : control library symbols visibility ===== */
#define ZSTDERRORLIB_VISIBLE

#ifndef ZSTDERRORLIB_HIDDEN
# if (__GNUC__ >= 4) && !defined(__MINGW32__)
# define ZSTDERRORLIB_HIDDEN __attribute__ ((visibility ("hidden")))
# else
# define ZSTDERRORLIB_HIDDEN
# endif
#endif

/* ===== ZSTDERRORLIB_API : control library symbols visibility ===== */
#define ZSTDERRORLIB_VISIBILITY
#define ZSTDERRORLIB_API ZSTDERRORLIB_VISIBILITY
#define ZSTDERRORLIB_API ZSTDERRORLIB_VISIBLE

/*-*********************************************
* Error codes list
Expand All @@ -43,33 +49,37 @@ typedef enum {
ZSTD_error_frameParameter_windowTooLarge = 16,
ZSTD_error_corruption_detected = 20,
ZSTD_error_checksum_wrong = 22,
ZSTD_error_literals_headerWrong = 24,
ZSTD_error_dictionary_corrupted = 30,
ZSTD_error_dictionary_wrong = 32,
ZSTD_error_dictionaryCreation_failed = 34,
ZSTD_error_parameter_unsupported = 40,
ZSTD_error_parameter_combination_unsupported = 41,
ZSTD_error_parameter_outOfBound = 42,
ZSTD_error_tableLog_tooLarge = 44,
ZSTD_error_maxSymbolValue_tooLarge = 46,
ZSTD_error_maxSymbolValue_tooSmall = 48,
ZSTD_error_cannotProduce_uncompressedBlock = 49,
ZSTD_error_stabilityCondition_notRespected = 50,
ZSTD_error_stage_wrong = 60,
ZSTD_error_init_missing = 62,
ZSTD_error_memory_allocation = 64,
ZSTD_error_workSpace_tooSmall= 66,
ZSTD_error_dstSize_tooSmall = 70,
ZSTD_error_srcSize_wrong = 72,
ZSTD_error_dstBuffer_null = 74,
ZSTD_error_noForwardProgress_destFull = 80,
ZSTD_error_noForwardProgress_inputEmpty = 82,
/* following error codes are __NOT STABLE__, they can be removed or changed in future versions */
ZSTD_error_frameIndex_tooLarge = 100,
ZSTD_error_seekableIO = 102,
ZSTD_error_dstBuffer_wrong = 104,
ZSTD_error_srcBuffer_wrong = 105,
ZSTD_error_sequenceProducer_failed = 106,
ZSTD_error_externalSequences_invalid = 107,
ZSTD_error_maxCode = 120 /* never EVER use this value directly, it can change in future versions! Use ZSTD_isError() instead */
} ZSTD_ErrorCode;

/*! ZSTD_getErrorCode() :
convert a `size_t` function result into a `ZSTD_ErrorCode` enum type,
which can be used to compare with enum list published above */
ZSTDERRORLIB_API ZSTD_ErrorCode ZSTD_getErrorCode(size_t functionResult);
ZSTDERRORLIB_API const char* ZSTD_getErrorString(ZSTD_ErrorCode code); /*< Same as ZSTD_getErrorName, but using a `ZSTD_ErrorCode` enum argument */


Expand Down
Loading

0 comments on commit 65d1f55

Please sign in to comment.