
v1.5.0

facebook/zstd

Release date: 2021-05-15 00:01:54

Latest facebook/zstd release: v1.5.6 (2024-03-31 02:57:28)

v1.5.0 is a major release featuring large performance improvements as well as API changes.

Performance

Improved Middle-Level Compression Speed

1.5.0 introduces a new default match finder for the compression strategies greedy, lazy, and lazy2 (which map to levels 5-12 for inputs larger than 256KB). The optimization brings a large improvement in compression speed, with slight perturbations in compression ratio (< 0.5%) and equal or decreased memory usage.

Benchmarked with gcc, on an i9-9900K:

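| level | silesia.tar speed delta | enwik7 speed delta |
|-------|-------------------------|--------------------|
| 5     | +25%                    | +25%               |
| 6     | +50%                    | +50%               |
| 7     | +40%                    | +40%               |
| 8     | +40%                    | +50%               |
| 9     | +50%                    | +65%               |
| 10    | +65%                    | +80%               |
| 11    | +85%                    | +105%              |
| 12    | +110%                   | +140%              |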

On heavily loaded machines with significant cache contention, we have internally measured even larger gains: 2-3x+ speed at levels 5-7. 🚀
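The table reports relative deltas, while the note on loaded machines quotes multipliers. The two notations are related by a one-line conversion, sketched here with a hypothetical helper (`speed_multiplier` is not part of zstd):

```c
/* Hypothetical helper, not part of zstd: converts a relative speed delta in
 * percent (as listed in the table) into a throughput multiplier. */
double speed_multiplier(double delta_percent)
{
    return 1.0 + delta_percent / 100.0;
}
```

For instance, `speed_multiplier(110.0)` evaluates to roughly 2.1, so the +110% entry at level 12 corresponds to a bit more than twice the previous speed.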

The biggest gains are achieved on files typically larger than 128KB. On files smaller than 16KB, by default we revert to the legacy match finder, which is the faster one at that size. This default policy can be overridden manually: the new match finder can be forcibly enabled with the advanced parameter ZSTD_c_useRowMatchFinder, or through the CLI option --[no-]row-match-finder.

Note: only CPUs that support SSE2 realize the full extent of this improvement.

Improved High-Level Compression Ratio

Block splitting, which improves compression ratio, is now enabled by default for high compression levels (16+). The amount of benefit varies depending on the workload. Compressing archives composed of heavily differing files will see more improvement than compressing single files that don’t vary much entropically (like text files/enwik). At levels 16+, we observe no measurable regression to compression speed.

level 22 compression

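| file        | ratio 1.4.9 | ratio 1.5.0 | ratio % delta |
|-------------|-------------|-------------|---------------|
| silesia.tar | 4.021       | 4.041       | +0.49%        |
| calgary.tar | 3.646       | 3.672       | +0.71%        |
| enwik7      | 3.579       | 3.579       | +0.0%         |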
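The "ratio % delta" column is the relative change between the two ratio columns. A minimal sketch of that arithmetic, using a hypothetical helper name that is not part of zstd:

```c
#include <assert.h>

/* Hypothetical helper, not part of zstd: relative change in compression
 * ratio between two versions, in percent. A positive result means the newer
 * version compresses better. */
double ratio_delta_percent(double old_ratio, double new_ratio)
{
    assert(old_ratio > 0.0); /* a compression ratio is always positive */
    return (new_ratio / old_ratio - 1.0) * 100.0;
}
```

Applied to the calgary.tar row, `ratio_delta_percent(3.646, 3.672)` gives about 0.71, matching the table. Recomputing from the rounded ratios shown here may differ in the last digit from the published deltas.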

The block splitter can be forcibly enabled on lower compression levels as well with the advanced parameter ZSTD_c_splitBlocks. When forcibly enabled at lower levels, speed regressions can become more notable. Additionally, since more compressed blocks may be produced, decompression speed on these blobs may also see small regressions.

Faster Decompression Speed

The decompression speed of data compressed with large window settings (such as --long or --ultra) has been significantly improved in this version. The gains vary depending on compiler brand and version, with clang generally benefiting the most.

The following benchmark measures decompression of enwik9 compressed with --ultra -22 (using a 128 MB window size) on a Core i7-9700K.

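| Compiler version | Decompression speed improvement |
|------------------|---------------------------------|
| gcc-7            | +15%                            |
| gcc-8            | +10%                            |
| gcc-9            | +5%                             |
| gcc-10           | +1%                             |
| clang-6          | +21%                            |
| clang-7          | +16%                            |
| clang-8          | +16%                            |
| clang-9          | +18%                            |
| clang-10         | +16%                            |
| clang-11         | +15%                            |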

Average decompression speed for “normal” payloads is slightly improved too, though the impact is smaller. Once again, mileage varies depending on exact compiler version, payload, and even compression level. In general, a majority of scenarios see benefits ranging from +1% to +9%. There are also a few outliers, from -4% to +13%. The average gain across all these scenarios stands at about +4%.

Library Updates

Dynamic Library Supports Multithreading by Default

It was already possible to compile libzstd with multithreading support, but this required an explicit opt-in: by default, the make build script would build libzstd as a single-thread-only library.

This changes in v1.5.0. Now the dynamic library (typically libzstd.so.1 on Linux) supports multi-threaded compression by default. Note that this property is not extended to the static library (typically libzstd.a on Linux) because doing so would have impacted the build script of existing client applications (requiring them to add -pthread to their recipe), thus potentially breaking their build. In order to avoid this disruption, the static library remains single-threaded by default. Luckily, this build disruption does not extend to the dynamic library, which can be built with multi-threading support while existing applications linking to libzstd.so and expecting only single-thread capabilities will be none the wiser, and remain completely unaffected.

The idea is that starting from v1.5.0, applications can expect the dynamic library to support multi-threading should they need it, which will progressively lead to increased adoption of this capability over time. That being said, since the locally deployed dynamic library may or may not support multi-threaded compression, depending on local build configuration, it’s always better to check this capability at runtime. To do so, it’s enough to check the return value when setting the parameter ZSTD_c_nbWorkers to a non-zero value: if it results in an error, then multi-threading is not supported.

Q: What if I prefer to keep the libraries in single-thread mode only?
A: The target make lib-nomt will ensure this outcome.

Q: Actually, I want both static and dynamic library versions to support multi-threading!
A: The target make lib-mt will generate this outcome.

Promotions to Stable

Moving up to the higher digit 1.5 signals an opportunity to extend the stable portion of zstd public API. This update is relatively minor, featuring only a few non-controversial newcomers.

ZSTD_defaultCLevel() indicates which level is default (applied when selecting level 0). It completes existing ZSTD_minCLevel() and ZSTD_maxCLevel(). Similarly, ZSTD_getDictID_fromCDict() is a straightforward equivalent to already promoted ZSTD_getDictID_fromDDict().

Deprecations

Zstd-1.4.0 stabilized a new advanced API which allows users to pass advanced parameters to zstd. We’re now deprecating all the old experimental APIs that are subsumed by the new advanced API. They will be considered for removal in the next major release, zstd-1.6.0. Note that only experimental symbols are impacted. Stable functions, like ZSTD_initCStream(), remain fully supported.

The deprecated functions are listed below, together with the migration. All the suggested migrations are stable APIs, meaning that once you migrate, the API will be supported forever. See the documentation for the deprecated functions for more details on how to migrate.

Header File Locations

Zstd has slightly re-organized the library layout to move all public headers to the top level lib/ directory. This is for consistency, so all public headers are in lib/ and all private headers are in a sub-directory. If you build zstd from source, this may affect your build system.

Single-File Library

We have moved the scripts in contrib/single_file_libs to build/single_file_libs. These scripts, originally contributed by @cwoffenden, produce a single compilation-unit amalgamation of the zstd library, which can be convenient for integrating Zstandard into other source trees. This move reflects a commitment on our part to support this tool and this pattern of using zstd going forward.

Windows Release Artifact Format

We are slightly changing the format of the Windows release .zip files, to match our other release artifacts. The .zip files now bundle everything in a single folder whose name matches the archive name. The contents of that folder exactly match what was previously included in the root of the archive.

Signed Releases

We have created a signing key for the Zstandard project. This release and all future releases will be signed by this key. See #2520 for discussion.

Changelog

Assets

1. zstd-1.5.0.tar.gz (1.77MB)
2. zstd-1.5.0.tar.gz.sha256 (84B)
3. zstd-1.5.0.tar.gz.sig (858B)
4. zstd-1.5.0.tar.zst (1.35MB)
5. zstd-1.5.0.tar.zst.sha256 (85B)
6. zstd-1.5.0.tar.zst.sig (858B)
7. zstd-v1.5.0-win32.zip (1.5MB)
8. zstd-v1.5.0-win64.zip (1.66MB)