Data Module Guide

Code Structure

├── README.md
├── __init__.py
├── base_dataset.py               # base dataset class with __getitem__
├── builder.py                    # API for creating datasets and data loaders
├── det_dataset.py                # general text detection dataset class
├── rec_dataset.py                # general text recognition dataset class
├── rec_lmdb_dataset.py           # LMDB dataset class
└── transforms
    ├── det_transforms.py         # processing and augmentation ops (callable classes) for detection tasks
    ├── general_transforms.py     # general processing and augmentation ops (callable classes)
    ├── modelzoo_transforms.py    # transformations adopted from modelzoo
    ├── rec_transforms.py         # processing and augmentation ops (callable classes) for recognition tasks
    └── transforms_factory.py     # API for creating and running transforms

How to Add Your Own Dataset Class

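  1. Inherit from the BaseDataset class.

  2. In your subclass, override the following BaseDataset methods, which load the label file(s) and parse each annotation line.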

    def load_data_list(self, label_file: Union[str, List[str]], sample_ratio: Union[float, List] = 1.0, shuffle: bool = False, **kwargs) -> List[dict]

    def _parse_annotation(self, data_line: str) -> Union[dict, List[dict]]
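
The two steps above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the `BaseDataset` below is a simplified stand-in so the example runs on its own, and `TabSeparatedDataset`, the `image_path<TAB>text` label format, and the `img_path`/`label` keys are all hypothetical. In real code you would import the module's own `BaseDataset` and follow its constructor and key conventions.

```python
import random
from typing import List, Union


class BaseDataset:
    """Simplified stand-in for the module's BaseDataset (assumption, for illustration only)."""

    def __init__(self, label_file, sample_ratio=1.0, shuffle=False, **kwargs):
        self.data_list = self.load_data_list(label_file, sample_ratio, shuffle, **kwargs)

    def load_data_list(self, label_file, sample_ratio=1.0, shuffle=False, **kwargs):
        raise NotImplementedError

    def _parse_annotation(self, data_line):
        raise NotImplementedError

    def __len__(self):
        return len(self.data_list)

    def __getitem__(self, index):
        return self.data_list[index]


class TabSeparatedDataset(BaseDataset):
    """Hypothetical dataset: each label-file line is `image_path<TAB>text`."""

    def load_data_list(
        self,
        label_file: Union[str, List[str]],
        sample_ratio: Union[float, List] = 1.0,
        shuffle: bool = False,
        **kwargs,
    ) -> List[dict]:
        # Normalize both arguments to lists of equal length.
        files = [label_file] if isinstance(label_file, str) else list(label_file)
        if isinstance(sample_ratio, (int, float)):
            ratios = [float(sample_ratio)] * len(files)
        else:
            ratios = list(sample_ratio)
        if len(ratios) != len(files):
            raise ValueError("sample_ratio must match the number of label files")

        data_list = []
        for path, ratio in zip(files, ratios):
            with open(path, "r", encoding="utf-8") as f:
                lines = [line.rstrip("\n") for line in f if line.strip()]
            if shuffle:
                random.shuffle(lines)
            # Keep only the requested fraction of this file's samples.
            keep = int(len(lines) * ratio)
            data_list.extend(self._parse_annotation(line) for line in lines[:keep])
        return data_list

    def _parse_annotation(self, data_line: str) -> Union[dict, List[dict]]:
        # Split on the first tab only, so the text itself may contain tabs.
        img_path, label = data_line.split("\t", 1)
        return {"img_path": img_path, "label": label}
```

With a label file of four lines and `sample_ratio=0.5`, this sketch keeps the first two samples; passing a list of files with a list of ratios samples each file separately.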

How to Add Your Own Data Transforms

Please refer to the guide on developing customized data transformations.
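
As a rough orientation before reading that guide: the code structure above describes transforms as callable classes. The sketch below shows that general pattern only. It assumes each transform receives a sample as a `dict` and returns the updated `dict`; that interface, the `LowercaseLabel` class, the `label` key, and the `apply_in_order` helper are all hypothetical and may differ from the module's actual conventions.

```python
from typing import Callable, List


class LowercaseLabel:
    """Hypothetical transform: lowercases the text label of a sample dict."""

    def __init__(self, label_key: str = "label"):
        # Configuration is fixed at construction time.
        self.label_key = label_key

    def __call__(self, data: dict) -> dict:
        # Per-sample work happens when the instance is called.
        data[self.label_key] = data[self.label_key].lower()
        return data


def apply_in_order(data: dict, transforms: List[Callable[[dict], dict]]) -> dict:
    """Hypothetical helper: run each transform on the sample in sequence."""
    for transform in transforms:
        data = transform(data)
    return data


sample = {"img_path": "a.jpg", "label": "HeLLo"}
result = apply_in_order(sample, [LowercaseLabel()])
```

Keeping configuration in `__init__` and per-sample logic in `__call__` lets the same class be instantiated with different settings and chained with other transforms.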