magic_pdf-0.6.2b1-released
Release date: 2024-07-31 17:56:29
Latest opendatalab/MinerU release: magic_pdf-0.7.0b1-released (2024-08-09 21:37:57)
What's Changed
- Optimized the model loading logic: models are now loaded only once during batch processing.
- Command-line interface now supports batch input.
- When an import fails, the complete error message is now printed to facilitate troubleshooting.
- Fixed a bug where overlapping spans were incorrectly removed multiple times.
- Improved OCR recognition areas, doubling the OCR speed.
- Embedded language identification models within the whl package for easier offline deployment.
- Replaced interline_equation_blocks with interline_equations to enhance interline formula recognition capabilities in non-academic paper scenarios.
- Added page number indexing to the output results of content_list.
- Pinned some dependency versions and adjusted the dependency installation logic to reduce conflicts and redundant installations. This cuts the number of packages by 30% and improves the success rate of first-time installation.
New Contributors
- @yzztin made their first contribution in https://github.com/opendatalab/MinerU/pull/214
- @eltociear made their first contribution in https://github.com/opendatalab/MinerU/pull/231
Full Changelog: https://github.com/opendatalab/MinerU/compare/magic_pdf-0.6.1-released...magic_pdf-0.6.2b1-released
1. magic_pdf-0.6.2b1-py3-none-any.whl (1.06 MB)