v0.6.0

open-mmlab/mmocr

Release date: 2022-05-05 22:20:47

Latest release of open-mmlab/mmocr: v1.0.1 (2023-07-04 15:11:53)

Highlights

  1. MASTER, a new text recognition algorithm, has been added to MMOCR. It was the winning solution of the "ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX". A model pre-trained on SynthText and MJSynth is available for testing. Credit to @JiaquanYe.
  2. DBNet++ has been released. It adds an Adaptive Scale Fusion module for feature enhancement, with which the new model achieves a 2% higher h-mean score than its predecessor on the ICDAR2015 dataset.
  3. Three more dataset converters have been added: LSVT, RCTW and HierText. See the dataset zoo (Det & Recog) for details.
  4. To improve data storage efficiency, MMOCR now supports loading both images and labels from .lmdb annotations for the text recognition task. The new lmdb_converter.py packs your cropped images and labels into an lmdb file. For a detailed tutorial, see the following sections and the doc.
  5. Testing models on multiple datasets is a widely used evaluation strategy. MMOCR now automatically reports mean scores when more than one dataset is evaluated, which makes comparing checkpoints more convenient. (Doc)
  6. Evaluation is now more flexible and customizable. For text detection tasks, you can set the range of score thresholds in which the best result is searched for. (Doc) If too many results flood your text recognition training log, you can trim it by specifying a subset of metrics in the evaluation config. See the Evaluation section for details.
  7. MMOCR provides a script that converts the .json labels produced by the popular annotation toolkit Labelme into an MMOCR-supported data format. In addition, @Y-M-Y contributed a log analysis tool that helps users better understand the entire training process. Read the tutorial docs to get started.
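
To make highlights 5 and 6 concrete, here is a minimal sketch of the two ideas: averaging each metric over several datasets, and sweeping a range of score thresholds to find the one with the best h-mean. It is plain Python and does not use MMOCR's own code. The function names, metric names (`word_acc`, `char_recall`) and dataset names are hypothetical and chosen only for illustration.

```python
from statistics import mean


def mean_scores(per_dataset):
    """Average each metric over the datasets that report it."""
    metrics = sorted({name for scores in per_dataset.values() for name in scores})
    return {
        f"mean_{name}": mean(s[name] for s in per_dataset.values() if name in s)
        for name in metrics
    }


def hmean(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def best_threshold(pr_by_threshold):
    """Return the score threshold whose (precision, recall) gives the highest h-mean."""
    return max(pr_by_threshold, key=lambda thr: hmean(*pr_by_threshold[thr]))


# Hypothetical per-dataset recognition results.
results = {
    "dataset_a": {"word_acc": 0.80, "char_recall": 0.90},
    "dataset_b": {"word_acc": 0.60, "char_recall": 0.70},
}
print(mean_scores(results))

# Hypothetical (precision, recall) pairs measured at three score thresholds.
print(best_threshold({0.3: (0.6, 0.9), 0.5: (0.8, 0.8), 0.7: (0.9, 0.5)}))
```

In MMOCR itself these behaviours are controlled through the evaluation config; the docs linked in the highlights give the actual option names.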

Lmdb Dataset

Reading images or labels from individual files can be slow when the dataset is large, e.g. on the scale of millions of samples. Moreover, most academic scene text recognition datasets are stored in lmdb format, including both images and labels. To follow mainstream practice and improve data storage efficiency, MMOCR now officially supports loading images and labels from lmdb datasets via a new pipeline, LoadImageFromLMDB. This section is a quick walkthrough of the update and how to apply it in your research.
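
To illustrate why a single key-value store helps, the sketch below packs images and labels into one mapping and reads a sample back by index. A plain `dict` stands in for the lmdb file so the example runs without extra dependencies. The key layout (`num-samples`, `image-000000001`, `label-000000001`) is the one commonly seen in academic scene-text lmdb datasets and is an assumption here, not a statement of MMOCR's required specification. The function names are hypothetical.

```python
def pack_samples(samples):
    """Build the key-value pairs for an lmdb-style recognition dataset.

    `samples` is a list of (image_bytes, text) pairs. The key names follow a
    layout common in academic datasets (an assumption, see the text above).
    """
    store = {}
    for index, (image_bytes, text) in enumerate(samples, start=1):
        store[f"image-{index:09d}".encode()] = image_bytes
        store[f"label-{index:09d}".encode()] = text.encode("utf-8")
    # Record the dataset size so readers know the valid index range.
    store[b"num-samples"] = str(len(samples)).encode()
    return store


def read_sample(store, index):
    """Return (image_bytes, text) for a 1-based sample index."""
    total = int(store[b"num-samples"].decode())
    if not 1 <= index <= total:
        raise IndexError(f"index {index} out of range 1..{total}")
    image = store[f"image-{index:09d}".encode()]
    label = store[f"label-{index:09d}".encode()].decode("utf-8")
    return image, label


# Two fake "images" (arbitrary bytes) with their transcriptions.
store = pack_samples([(b"fake-image-a", "hello"), (b"fake-image-b", "world")])
print(read_sample(store, 2))
```

With the `lmdb` Python package, the same pairs would be written with `txn.put(key, value)` and read with `txn.get(key)` inside a transaction. For real data, prefer the lmdb_converter.py tool mentioned above.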

Specifications

To better align with the academic community, MMOCR now requires the following specifications for lmdb datasets:

Usage

  1. Use an existing academic lmdb dataset if it meets the specifications, or use the tool provided by MMOCR to pack images and annotations into an lmdb dataset.
  2. Modify the configuration files. For example, to train CRNN on MJ and ST datasets:
  3. You are good to go! Start training and MMOCR will load data from your lmdb dataset.
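
As a rough sketch of the configuration step above, the fragment below shows what pointing two training sets at lmdb files might look like. Only `LoadImageFromLMDB` is named in these release notes. The other type names (`OCRDataset`, `AnnFileLoader`, `LineJsonParser`) and the field layout are assumptions recalled from MMOCR 0.x configs, and all paths are hypothetical. Check everything against the doc before use.

```python
# Hypothetical paths; point them at your own lmdb datasets.
mj_lmdb = 'data/mixture/MJ/label.lmdb'
st_lmdb = 'data/mixture/ST/label.lmdb'

# Loader settings shared by both datasets.
# Assumption: type and field names follow MMOCR 0.x conventions.
lmdb_loader = dict(
    type='AnnFileLoader',
    repeat=1,
    file_format='lmdb',
    parser=dict(type='LineJsonParser', keys=['filename', 'text']))

train_pipeline = [
    # Named in the release notes: loads the image from the lmdb dataset.
    dict(type='LoadImageFromLMDB'),
    # ... the remaining CRNN transforms would follow here unchanged
]

# One dataset entry per lmdb file (MJ and ST).
# Assumption: images live inside the lmdb file, so img_prefix is left empty.
train1 = dict(
    type='OCRDataset',
    img_prefix='',
    ann_file=mj_lmdb,
    loader=lmdb_loader,
    pipeline=None,
    test_mode=False)

train2 = dict(
    type='OCRDataset',
    img_prefix='',
    ann_file=st_lmdb,
    loader=lmdb_loader,
    pipeline=None,
    test_mode=False)

train_list = [train1, train2]
```

The relevant change is that the annotation file points at an lmdb dataset, the loader is told to read lmdb, and the first pipeline step reads the image from lmdb instead of from an image file.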

New Features & Enhancements

Bug Fixes

Docs

New Contributors

Full Changelog: https://github.com/open-mmlab/mmocr/compare/v0.5.0...v0.6.0
