
0.123.0

jqnatividad/qsv

Release date: 2024-03-05 22:18:55

Latest jqnatividad/qsv release: 0.127.0 (2024-04-25 17:53:49)

OPEN DATA DAY 2024 Release! 🎉🎉🎉

In celebration of Open Data Day, we're releasing qsv 0.123.0 - the biggest release ever with 330+ commits! qsv 0.123.0 keeps its focus on performance, stability and reliability as we set the stage for qsv's big brother - qsv pro.

We've been baking qsv pro for a while now, and it's almost ready for release. qsv pro is a cross-platform Desktop Data Wrangling tool that marries an Excel-like UI with the power of qsv, backed by a cloud-based data cleaning, enrichment and enhancement service. It is easy to use for casual Excel users and Data Publishers, yet powerful enough for data scientists and data engineers.

Stay tuned!

Highlights:

# with fast path optimization turned off
/usr/bin/time qsv sqlp taxi.csv --no-optimizations "select VendorID,sum(total_amount) from taxi group by VendorID order by VendorID"
VendorID,total_amount
1,52377417.52985942
2,89959869.13054822
4,600584.610000027
(3, 2)
        6.09 real         6.82 user         0.16 sys

# with fast path optimization, fully exploiting Polars' multithreaded, mem-mapped CSV reader!
/usr/bin/time qsv sqlp taxi.csv "select VendorID,sum(total_amount) from taxi group by VendorID order by VendorID"
VendorID,total_amount
1,52377417.52985942
2,89959869.13054822
4,600584.610000027
(3, 2)
        0.14 real         1.09 user         0.09 sys

# in contrast, csvq takes 72.46 seconds - 517.57x slower
/usr/bin/time csvq "select VendorID,sum(total_amount) from taxi group by VendorID order by VendorID"
+----------+---------------------+
| VendorID |  SUM(total_amount)  |
+----------+---------------------+
| 1        |  52377417.529256366 |
| 2        |    89959869.1264675 |
| 4        |   600584.6099999828 |
+----------+---------------------+
       72.46 real        65.15 user        75.17 sys

"Traditional" SQL engines

qsv and csvq both operate on "bare" CSVs. For comparison, let's contrast qsv's performance against "traditional" SQL engines that require setup and import (aka ETL). Not counting setup and import time (which alone takes several minutes), we get:

SQLite 3.43.2 takes 2.910 seconds - 20.79x slower

sqlite> .timer on
sqlite> select VendorID,sum(total_amount) from taxi group by VendorID order by VendorID;
1,52377417.53
2,89959869.13
4,600584.61
Run Time: real 2.910 user 2.569494 sys 0.272972

PostgreSQL 15.6 using PgAdmin 4 v6.12 takes 18.527 seconds - 132.34x slower


Even with an index on the PostgreSQL table, qsv sqlp is still 5.96x faster.
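
The slowdown factors quoted in this section follow from dividing each engine's wall-clock ("real") time by the 0.14 seconds qsv sqlp takes with the fast path enabled. Below is a minimal shell sketch of that arithmetic, assuming a POSIX shell with awk available. The `slowdown` helper and the `QSV_FAST` variable are hypothetical names used only for illustration. The factor for the `--no-optimizations` run is not stated in the release notes and is derived here from the two timings shown.

```shell
#!/bin/sh
# Baseline: "real" seconds for qsv sqlp with the fast path enabled,
# taken from the timing output in these release notes.
QSV_FAST=0.14

# slowdown SECONDS
# Hypothetical helper: prints SECONDS / QSV_FAST rounded to two decimals.
slowdown() {
    awk -v t="$1" -v base="$QSV_FAST" 'BEGIN { printf "%.2f\n", t / base }'
}

echo "qsv sqlp --no-optimizations: $(slowdown 6.09)x"
echo "csvq:                        $(slowdown 72.46)x"
echo "SQLite 3.43.2:               $(slowdown 2.910)x"
echo "PostgreSQL 15.6:             $(slowdown 18.527)x"
```

Because every factor uses the same 0.14 second baseline, a small measurement error in that baseline shifts all of the ratios proportionally, so they are best read as rough orders of magnitude.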


Added

Changed

Fixed

Removed

New Contributors

Full Changelog: https://github.com/jqnatividad/qsv/compare/0.122.0...0.123.0

Related links: original page, download (tar), download (zip)

1. qsv-0.123.0-aarch64-apple-darwin.zip 100.5MB
2. qsv-0.123.0-aarch64-unknown-linux-gnu.zip 13.63MB
3. qsv-0.123.0-geocode-index.bincode 13.58MB
4. qsv-0.123.0-geocode-index.bincode.cities15000 13.58MB
5. qsv-0.123.0-geocode-index.bincode.cities15000.sz 2.35MB
6. qsv-0.123.0-i686-pc-windows-msvc.zip 13.27MB
7. qsv-0.123.0-i686-unknown-linux-gnu.zip 14.23MB
8. qsv-0.123.0-x86_64-apple-darwin.zip 110.99MB
9. qsv-0.123.0-x86_64-pc-windows-gnu.zip 30.66MB
10. qsv-0.123.0-x86_64-pc-windows-msvc.zip 115MB
11. qsv-0.123.0-x86_64-unknown-linux-gnu.zip 141.31MB
12. qsv-0.123.0-x86_64-unknown-linux-musl.zip 42.18MB
13. qsv-0.123.0.msi 31.6MB
