Commit c90e499, committed by Satyam Goyal
Parent(s): 3c5af06
Merge pull request #95 from Satgoy152:adding-doc

Improved help messages for demo programs (#95)
- Added Demo Documentation
- Updated help messages
- Changed exception link
README.md
CHANGED
@@ -5,7 +5,7 @@ An End-to-End Trainable Neural Network for Image-based Sequence Recognition and
 Results of accuracy evaluation with [tools/eval](../../tools/eval) at different text recognition datasets.
 
 | Model name   | ICDAR03(%) | IIIT5k(%) | CUTE80(%) |
-
+| ------------ | ---------- | --------- | --------- |
 | CRNN_EN      | 81.66      | 74.33     | 52.78     |
 | CRNN_EN_FP16 | 82.01      | 74.93     | 52.34     |
 | CRNN_EN_INT8 | 81.75      | 75.33     | 52.43     |
@@ -16,10 +16,11 @@ Results of accuracy evaluation with [tools/eval](../../tools/eval) at different
 \*: 'FP16' or 'INT8' stands for 'model quantized into FP16' or 'model quantized into int8'
 
 Note:
+
 - Model source:
-
-
-
+  - `text_recognition_CRNN_EN_2021sep.onnx`: https://docs.opencv.org/4.5.2/d9/d1e/tutorial_dnn_OCR.html (CRNN_VGG_BiLSTM_CTC.onnx)
+  - `text_recognition_CRNN_CH_2021sep.onnx`: https://docs.opencv.org/4.x/d4/d43/tutorial_dnn_text_spotting.html (crnn_cs.onnx)
+  - `text_recognition_CRNN_CN_2021nov.onnx`: https://docs.opencv.org/4.5.2/d4/d43/tutorial_dnn_text_spotting.html (crnn_cs_CN.onnx)
 - `text_recognition_CRNN_EN_2021sep.onnx` can detect digits (0\~9) and letters (return lowercase letters a\~z) (view `charset_36_EN.txt` for details).
 - `text_recognition_CRNN_CH_2021sep.onnx` can detect digits (0\~9), upper/lower-case letters (a\~z and A\~Z), and some special characters (view `charset_94_CH.txt` for details).
 - `text_recognition_CRNN_CN_2021nov.onnx` can detect digits (0\~9), upper/lower-case letters (a\~z and A\~Z), some Chinese characters and some special characters (view `charset_3944_CN.txt` for details).
@@ -28,26 +29,35 @@ Note:
 ## Demo
 
 ***NOTE***:
+
 - This demo uses [text_detection_db](../text_detection_db) as text detector.
 - Selected model must match with the charset:
-
-
-
+  - Try `text_recognition_CRNN_EN_2021sep.onnx` with `charset_36_EN.txt`.
+  - Try `text_recognition_CRNN_CH_2021sep.onnx` with `charset_94_CH.txt`.
+  - Try `text_recognition_CRNN_CN_2021nov.onnx` with `charset_3944_CN.txt`.
 
 Run the demo detecting English:
+
 ```shell
 # detect on camera input
 python demo.py
 # detect on an image
 python demo.py --input /path/to/image
+
+# get help regarding various parameters
+python demo.py --help
 ```
 
 Run the demo detecting Chinese:
+
 ```shell
 # detect on camera input
 python demo.py --model text_recognition_CRNN_CN_2021nov.onnx --charset charset_3944_CN.txt
 # detect on an image
 python demo.py --input /path/to/image --model text_recognition_CRNN_CN_2021nov.onnx --charset charset_3944_CN.txt
+
+# get help regarding various parameters
+python demo.py --help
 ```
 
 ### Examples
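The README note in this diff says the selected model must match the charset. As a rough illustration only, the pairing could be checked before launching the demo with a small helper like the one below. The helper names `expected_charset` and `is_matching_pair` are hypothetical and are not part of `demo.py`; the file names are the ones listed in the README, with the CN model spelled as in the demo commands (`2021nov`).

```python
from pathlib import Path
from typing import Optional

# Model/charset pairings as listed in the README note.
# The CN model name follows the demo commands (2021nov).
MODEL_TO_CHARSET = {
    "text_recognition_CRNN_EN_2021sep.onnx": "charset_36_EN.txt",
    "text_recognition_CRNN_CH_2021sep.onnx": "charset_94_CH.txt",
    "text_recognition_CRNN_CN_2021nov.onnx": "charset_3944_CN.txt",
}


def expected_charset(model_path: str) -> Optional[str]:
    """Return the charset file name paired with a model, or None if unknown."""
    return MODEL_TO_CHARSET.get(Path(model_path).name)


def is_matching_pair(model_path: str, charset_path: str) -> bool:
    """Hypothetical check: True when the charset is the one paired with the model."""
    expected = expected_charset(model_path)
    return expected is not None and Path(charset_path).name == expected


if __name__ == "__main__":
    # A matching pair and a mismatched pair.
    print(is_matching_pair("text_recognition_CRNN_CN_2021nov.onnx", "charset_3944_CN.txt"))
    print(is_matching_pair("text_recognition_CRNN_EN_2021sep.onnx", "charset_94_CH.txt"))
```

Only the file name is compared, so the check works regardless of the directory the model or charset lives in. Models that are not in the table are treated as non-matching rather than guessed at.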
demo.py
CHANGED
@@ -33,17 +33,17 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
 
 parser = argparse.ArgumentParser(
     description="An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition (https://arxiv.org/abs/1507.05717)")
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--model', '-m', type=str, default='text_recognition_CRNN_EN_2021sep.onnx', help='
+parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
+parser.add_argument('--model', '-m', type=str, default='text_recognition_CRNN_EN_2021sep.onnx', help='Usage: Set model path, defaults to text_recognition_CRNN_EN_2021sep.onnx.')
 parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
 parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
-parser.add_argument('--charset', '-c', type=str, default='charset_36_EN.txt', help='
-parser.add_argument('--save', '-s', type=str, default=False, help='Set
-parser.add_argument('--vis', '-v', type=str2bool, default=True, help='
+parser.add_argument('--charset', '-c', type=str, default='charset_36_EN.txt', help='Usage: Set the path to the charset file corresponding to the selected model.')
+parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save a file with results. Invalid in case of camera input. Default will be set to “False”.')
+parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
 parser.add_argument('--width', type=int, default=736,
                     help='Preprocess input image by resizing to a specific width. It should be a multiple of 32.')
 parser.add_argument('--height', type=int, default=736,
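The hunk above parses `--vis` with `str2bool`, whose body is not shown in this diff. The sketch below is one plausible way such a converter could be written; it is an assumption for illustration, not the implementation in `demo.py`. It also shows why the distinction matters: in the hunk, `--save` is declared with `type=str`, and in Python any non-empty string, including `'False'`, is truthy.

```python
import argparse


def str2bool(value):
    """Hypothetical converter; the real str2bool in demo.py is not shown in the diff."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "yes", "on", "y", "t", "1"):
        return True
    if lowered in ("false", "no", "off", "n", "f", "0"):
        return False
    raise argparse.ArgumentTypeError("Boolean value expected, got {!r}".format(value))


# A reduced parser with only the two boolean-like flags from the hunk.
# Using str2bool for --save is this sketch's choice, not what the diff does.
parser = argparse.ArgumentParser(description="str2bool sketch")
parser.add_argument('--save', '-s', type=str2bool, default=False)
parser.add_argument('--vis', '-v', type=str2bool, default=True)

args = parser.parse_args(['--save', 'False', '--vis', 'false'])
print(args.save, args.vis)  # False False

# With type=str, the value would stay a non-empty string, which is truthy.
print(bool('False'))  # True
```

With a converter like this, `--save False` and `--vis False` both yield the boolean `False`, so a later `if args.save:` check behaves the way the help text describes.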