v0.8.33
Release date: 2022-04-16 02:46:27
Latest release of datahub-project/datahub: v0.13.3 (2024-05-24 07:11:13)
Release Highlights
User Experience
- Refreshed the ML Entity page to match the feel of all other entity types; improved ML lineage functionality
Ingestion Improvements
- Airflow improvements (as demoed in the March Town Hall)
  - Add support to capture Airflow execution runs from lineage backend
  - Introduce new high-level API for generating dataflow/job/dataprocessinstance
- MS SQL ingestion now captures table & column descriptions
- Trino platform support for Great Expectations
- New Presto-on-Hive ingestion source
- BigQuery ingestion now supports extraction of usage info from audit logs
- Fix to Looker ingestion to extract Explore Views from join names
- Fix to Tableau ingestion to avoid duplicating schema in URNs for upstream tables
- Simplify & annotate Redshift Usage source
Full Commit Log
- feat(gms): Expose kafka listener concurrency as a GMS setting by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/4536
- feat(ingest): add option for external Spark cluster by @kevinhu in https://github.com/datahub-project/datahub/pull/4571
- fix(upgrade): Renaming kafka producer since it clashes with spring-internal by @dexter-mh-lee in https://github.com/datahub-project/datahub/pull/4573
- feat(GraphQL): Add data platform query to GraphQL API by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/4574
- build(ui): Fix Windows UI lint by @mattmatravers in https://github.com/datahub-project/datahub/pull/4556
- doc: make note prominent on quickstart by @anshbansal in https://github.com/datahub-project/datahub/pull/4558
- fix(protobuf) minor bugfixes for protobuf by @leifker in https://github.com/datahub-project/datahub/pull/4553
- feat(docs) Improves docs around developing datahub, removes deprecated docs on building metadata service by @pedro93 in https://github.com/datahub-project/datahub/pull/4552
- chore: cleanup extra file by @anshbansal in https://github.com/datahub-project/datahub/pull/4541
- feat(snowflake): reduce permissions provisioned by default by @anshbansal in https://github.com/datahub-project/datahub/pull/4543
- fix(ingestion): Redshift usage refactoring - simplify, annotate, fix bugs by @rslanka in https://github.com/datahub-project/datahub/pull/4572
- fix(graphql): Adding PRE FabricType to GraphQL by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/4582
- feat(search) - add DATETIME FieldType by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/4407
- fix(tableau): fix for incorrect schema returned by tableau api for sn… by @mayurinehate in https://github.com/datahub-project/datahub/pull/4577
- chore: update default cli for managed ingestion by @anshbansal in https://github.com/datahub-project/datahub/pull/4581
- feat(okta) - add support for filtering/searching when ingesting Okta groups and users by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/4586
- doc(snowflake): add example of table pattern by @anshbansal in https://github.com/datahub-project/datahub/pull/4580
- fix(doc): try to fix broken link by @daha in https://github.com/datahub-project/datahub/pull/4593
- fix(bigquery): incorrect lineage when views are present by @anshbansal in https://github.com/datahub-project/datahub/pull/4568
- feat(metadata-service): Supporting a configurable Authorizer Chain by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/4584
- fix(search): Make sure home page and search pages are consistent by @dexter-mh-lee in https://github.com/datahub-project/datahub/pull/4588
- fix(browse): Reduce browse aggregation size by @dexter-mh-lee in https://github.com/datahub-project/datahub/pull/4601
- doc: add page for handling deprecations, breaking changes etc. by @anshbansal in https://github.com/datahub-project/datahub/pull/4590
- docs(GraphQL): fix typo by @Falci in https://github.com/datahub-project/datahub/pull/4605
- feat(search): Add SearchScore annotation to use fields for search ranking by @dexter-mh-lee in https://github.com/datahub-project/datahub/pull/4596
- feat(ingestion): Redshift Usage Source - simplify OperationalStats workunit generation. by @rslanka in https://github.com/datahub-project/datahub/pull/4585
- feat(tableau): add some logic to normalize table names in tableau by @gabe-lyons in https://github.com/datahub-project/datahub/pull/4609
- fix: urlencode slash in urns too by @daha in https://github.com/datahub-project/datahub/pull/4527
- fix(bigquery): fix lineage bug, improve docs, add dataset filter config by @anshbansal in https://github.com/datahub-project/datahub/pull/4607
- fix(protobuf) fix test instability by @leifker in https://github.com/datahub-project/datahub/pull/4612
- fix(ui): Fix dashboard tags display by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/4611
- feat(ui): Adding GraphQL queries to fetch entity deprecation status by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/4614
- feat(ingest): enable connection string for all sqlalchemy datasources by @ms32035 in https://github.com/datahub-project/datahub/pull/4508
- fix(docs): add grant statements for redshift-ingestion by @Abhiram98 in https://github.com/datahub-project/datahub/pull/4559
- chore: fix lint and remove incorrect integration mark from unit tests by @anshbansal in https://github.com/datahub-project/datahub/pull/4621
- feat: adding gradle, pip cache via gh cache, docker cache via dockerhub by @anshbansal in https://github.com/datahub-project/datahub/pull/4387
- doc(scheduling): make it easier to find ui ingestion by @anshbansal in https://github.com/datahub-project/datahub/pull/4610
- feat(glue): add CatalogId parameter for cross-account access by @BoyuanZhangDE in https://github.com/datahub-project/datahub/pull/4608
- doc(cli): add env variables and options for ingest command by @anshbansal in https://github.com/datahub-project/datahub/pull/4598
- fix(ingest): Restricting pytest docker version to <0.12 by @treff7es in https://github.com/datahub-project/datahub/pull/4639
- fix(cypress) - add waits for cypress search test to remove flakiness by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/4640
- Revert "feat: adding gradle, pip cache via gh cache, docker cache via dockerhub" by @dexter-mh-lee in https://github.com/datahub-project/datahub/pull/4637
- feat(search): Only reindex if the mappings for an existing field changed by @dexter-mh-lee in https://github.com/datahub-project/datahub/pull/4629
- feat: add presto-on-hive metadata ingestion source by @jchen0824 in https://github.com/datahub-project/datahub/pull/4625
- feat(ingest): add trino platform for great expectations by @ms32035 in https://github.com/datahub-project/datahub/pull/4594
- fix(kafka): Stop overriding kafka registry props with empty values by @jsotelo in https://github.com/datahub-project/datahub/pull/4604
- [model]: Dataprocess instance entity to model datajob/jobflow runs by @treff7es in https://github.com/datahub-project/datahub/pull/4459
- feat(ingest): add Urn python library for DataJob, DataFlow, Domain and Tag by @tc350981 in https://github.com/datahub-project/datahub/pull/4618
- fix(ingestion): ensure source/sink reports are always logged by @anshbansal in https://github.com/datahub-project/datahub/pull/4592
- fix(ingestion): extract explore views from join name in Looker by @dyanarose in https://github.com/datahub-project/datahub/pull/4627
- feat(ingestion): Enable lower-casing of the name part of dataset urn if env variable is set. by @rslanka in https://github.com/datahub-project/datahub/pull/4649
- feat: Enable the ingestion of bigquery audit logs to parse usage info… by @tha23rd in https://github.com/datahub-project/datahub/pull/4441
- fix(ingest): Fix snowflake KEY_PAIR auth by @mkamalas in https://github.com/datahub-project/datahub/pull/4638
- fix(home): Fix issue where some browse cards are missing by @dexter-mh-lee in https://github.com/datahub-project/datahub/pull/4652
- fix(tableau): avoid duplicate schema in URNs for upstream tables by @maaaikoool in https://github.com/datahub-project/datahub/pull/4645
- feat(ingest): capture MSSQL table+column descriptions by @kevinhu in https://github.com/datahub-project/datahub/pull/4579
- feat(ml): bringing ml screens up to date w/ the modern ui layout & improving ml lineage by @gabe-lyons in https://github.com/datahub-project/datahub/pull/4651
- (feat:airflow) Add support to capture airflow executions + high level dataflow/jobs api by @treff7es in https://github.com/datahub-project/datahub/pull/4615
- fix(ingestion): add missing workunit ids by @anshbansal in https://github.com/datahub-project/datahub/pull/4657
- fix(ingestion): Adding missing init.py by @anshbansal in https://github.com/datahub-project/datahub/pull/4659
- fix(bigquery-usage): missing dependency by @anshbansal in https://github.com/datahub-project/datahub/pull/4661
- feat(cypress) - add cypress dashboard view to CI by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/4654
- feat(autocomplete): show fully qualified name in autocomplete by @gabe-lyons in https://github.com/datahub-project/datahub/pull/4663
- feat(ingestion) dbt: Fixing issue with strip_user_ids_from_email and adding owner_naming_pattern by @arunvasudevan in https://github.com/datahub-project/datahub/pull/4587
- fix(sqlparser): fix sqlparser breaking due to # sign by @anshbansal in https://github.com/datahub-project/datahub/pull/4662
- fix(ingestion): validate datasource in Tableau connector, before creating its upstream by @nandacamargo in https://github.com/datahub-project/datahub/pull/4613
- Added Relative Routing on the Users & Groups screen by @Ankit-Keshari-Vituity in https://github.com/datahub-project/datahub/pull/4664
- fix(airflow): Not importing emitters directly to eliminate unneeded dependency by @treff7es in https://github.com/datahub-project/datahub/pull/4668
- docs: remove ingestion source summary table by @maggiehays in https://github.com/datahub-project/datahub/pull/4670
- feat(ml): some machine learning followups by @gabe-lyons in https://github.com/datahub-project/datahub/pull/4669
- fix(search): Fix urn component settings by @dexter-mh-lee in https://github.com/datahub-project/datahub/pull/4672
- fix(ingestion): update example recipes by @anshbansal in https://github.com/datahub-project/datahub/pull/4660
- feat(theming): set custom logo without rebuilding by @gabe-lyons in https://github.com/datahub-project/datahub/pull/4674
- feat(data-platform): Add platform entities for the connectors we support by @dexter-mh-lee in https://github.com/datahub-project/datahub/pull/4676
- refactor(authorization): Add authorizedActor function to Authorizer interface by @dexter-mh-lee in https://github.com/datahub-project/datahub/pull/4678
- docs(tags) - add tags usage guide by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/4677
- fix(cli): Suppress printing variables to logs during ingestion failure by @atulsaurav in https://github.com/datahub-project/datahub/pull/4566
- fix(docs): Improving Add Users Doc by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/4679
- Fix/modal validations by @ShubhamThakre in https://github.com/datahub-project/datahub/pull/4673
New Contributors
- @Falci made their first contribution in https://github.com/datahub-project/datahub/pull/4605
- @ms32035 made their first contribution in https://github.com/datahub-project/datahub/pull/4508
- @jchen0824 made their first contribution in https://github.com/datahub-project/datahub/pull/4625
- @dyanarose made their first contribution in https://github.com/datahub-project/datahub/pull/4627
- @mkamalas made their first contribution in https://github.com/datahub-project/datahub/pull/4638
- @atulsaurav made their first contribution in https://github.com/datahub-project/datahub/pull/4566
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.8.32...v0.8.33