0.127.0
Release date: 2024-04-25 17:53:49
Latest release of jqnatividad/qsv: 0.134.0 (2024-09-10 20:11:27)
📊 Enhanced Frequency Analysis 📊
This is a quick release adding several frequency
enhancements for more detailed frequency analysis. The frequency
command now includes a percentage column, calculates other
values, and supports limiting unique counts and negative limits.
These options provide additional context for Datapusher+, qsv-pro and describegpt
so their metadata inferences are more accurate and comprehensive.
Previously, for a 775-row CSV file containing one column named state
with entries for all 50 states, frequency only showed[^1]:
qsv frequency freq_state_example.csv | qsv table
field value count
state NY 100
state NJ 70
state CA 60
state MA 55
state FL 45
state TX 43
state NM 40
state AZ 39
state NV 38
state MI 35
Now, there's a new percentage column and an other values calculation,
both of which have configurable options:
qsv frequency freq_state_example.csv | qsv table
field value count percentage
state NY 100 12.90323
state NJ 70 9.03226
state CA 60 7.74194
state MA 55 7.09677
state FL 45 5.80645
state TX 43 5.54839
state NM 40 5.16129
state AZ 39 5.03226
state NV 38 4.90323
state MI 35 4.51613
state Other (40) 250 32.25806
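The arithmetic behind the percentage column and the Other row can be checked with a short sketch. This is not qsv's implementation. It is a minimal Python illustration that assumes the percentage is count / total rows × 100, rounded to five decimal places, and that the Other row sums every value past the limit. The `frequency_table` function is hypothetical, and so are the 40 placeholder states and their counts (the release notes only give their combined total of 250).

```python
from collections import Counter


def frequency_table(values, limit=10):
    """Return (value, count, percentage) rows for the top `limit` values,
    plus one aggregated "Other (n)" row covering the n remaining values."""
    counts = Counter(values)
    total = sum(counts.values())
    ranked = counts.most_common()  # sorted by count, descending
    top, rest = ranked[:limit], ranked[limit:]

    rows = [(value, count, round(100 * count / total, 5)) for value, count in top]
    if rest:
        other_count = sum(count for _, count in rest)
        rows.append(
            (f"Other ({len(rest)})", other_count, round(100 * other_count / total, 5))
        )
    return rows


# Counts for the top 10 states, taken from the example output above.
top_counts = {
    "NY": 100, "NJ": 70, "CA": 60, "MA": 55, "FL": 45,
    "TX": 43, "NM": 40, "AZ": 39, "NV": 38, "MI": 35,
}

# ASSUMPTION: the other 40 states are placeholders with invented counts,
# chosen only so that they add up to the 250 rows reported as "Other".
other_counts = {f"S{i:02d}": (7 if i <= 10 else 6) for i in range(1, 41)}

values = [
    state
    for state, n in {**top_counts, **other_counts}.items()
    for _ in range(n)
]

rows = frequency_table(values, limit=10)
for value, count, pct in rows:
    print(f"state  {value:<12} {count:>4}  {pct:.5f}")
```

With 775 rows in total, NY's 100 rows come out to 12.90323 percent and the 40 remaining states to 32.25806 percent, matching the table above.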
This release is also out of cycle to address a big performance regression in the excel
command caused by unnecessary formula info retrieval for the --error-format
option introduced in 0.126.0. This has been fixed, and the excel
command is now back to its speedy self.
Added
- frequency: added percentage column; other values calculation, implementing https://github.com/jqnatividad/qsv/issues/1774 https://github.com/jqnatividad/qsv/pull/1775
- benchmarks: added new frequency and excel benchmarks https://github.com/jqnatividad/qsv/commit/b83ad3aae1cdf9a1750201cbf9b3ccd4ac3a4192
Changed
- contrib(bashly): update completions.bash for qsv v0.126.0 by @rzmk in https://github.com/jqnatividad/qsv/pull/1771
- build(deps): bump mimalloc from 0.1.39 to 0.1.41 by @dependabot in https://github.com/jqnatividad/qsv/pull/1772
- build(deps): bump qsv-stats from 0.14.0 to 0.15.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1773
- updated several indirect dependencies
- applied select clippy recommendations
Fixed
- excel: fixed performance regression because qsv was unnecessarily getting formula info (an expensive operation) for the --error-format option even when not required https://github.com/jqnatividad/qsv/commit/772af3420c44c864e06cd2cb61606900bff17947
- renamed 0.126.0 sqlp_vs_duckdb benchmark results so they're next to each other for easy direct comparison. https://github.com/jqnatividad/qsv/commit/7bcd59e301965b9e8737a9230d1236e8d34ab4bf. Per the benchmarks, sqlp is 2.87 times faster than duckdb v0.10.2 for a simple aggregation (0.066 secs vs 0.19 secs), and 1.42 times faster for an "expensive" aggregation (0.143 secs vs 0.203 secs).
Full Changelog: https://github.com/jqnatividad/qsv/compare/0.126.0...0.127.0
[^1]: With its default --limit setting of 10, frequency only shows the top 10 unique values in the column, sorted by occurrence.
1. qsv-0.127.0-aarch64-apple-darwin.zip 123.73MB
2. qsv-0.127.0-aarch64-unknown-linux-gnu.zip 14.83MB
3. qsv-0.127.0-geocode-index.bincode 14.23MB
4. qsv-0.127.0-geocode-index.bincode.cities15000 14.23MB
5. qsv-0.127.0-geocode-index.bincode.cities15000.sz 5.63MB
6. qsv-0.127.0-i686-pc-windows-msvc.zip 14.3MB
7. qsv-0.127.0-i686-unknown-linux-gnu.zip 15.32MB
8. qsv-0.127.0-x86_64-apple-darwin.zip 73.38MB
9. qsv-0.127.0-x86_64-pc-windows-gnu.zip 32.05MB
10. qsv-0.127.0-x86_64-pc-windows-msvc.zip 73.56MB
11. qsv-0.127.0-x86_64-unknown-linux-gnu.zip 131.16MB
12. qsv-0.127.0-x86_64-unknown-linux-musl.zip 58.72MB
13. qsv-0.127.0.msi 33.18MB