Update README.md

README.md CHANGED
@@ -34,6 +34,33 @@ As an example, here is a video of running the model on an iPhone 13 device - ful
 
 https://user-images.githubusercontent.com/1991296/197385372-962a6dea-bca1-4d50-bf96-1d8c27b98c81.mp4
 
+## Implementation details
+
+- The core tensor operations are implemented in C ([ggml.h](ggml.h) / [ggml.c](ggml.c))
+- The transformer model and the high-level C-style API are implemented in C++ ([whisper.h](whisper.h) / [whisper.cpp](whisper.cpp))
+- Sample usage is demonstrated in [main.cpp](examples/main)
+- Sample real-time audio transcription from the microphone is demonstrated in [stream.cpp](examples/stream)
+- Various other examples are available in the [examples](examples) folder
+
+The tensor operators are heavily optimized for Apple silicon CPUs. Depending on the computation size, Arm Neon SIMD
+intrinsics or CBLAS Accelerate framework routines are used. The latter are especially effective for bigger sizes since
+the Accelerate framework utilizes the special-purpose AMX coprocessor available in modern Apple products.
+
+## Limitations
+
+- Inference only
+- No GPU support
+- Very basic greedy sampling scheme - always picks the token with the highest probability.
+This should be similar to the [GreedyDecoder](https://github.com/openai/whisper/blob/main/whisper/decoding.py#L249-L274)
+from the original Python implementation, so in order to make a fair comparison between the two implementations, make sure
+to run the Python code with the following parameters:
+
+```
+whisper --best_of None --beam_size None ...
+```
+
+In the future, `whisper.cpp` will support more sampling strategies.
+
 ## Quick start
 
 First, download one of the Whisper models converted in [ggml format](models). For example:
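As a rough illustration of what the greedy sampling scheme in the section above means - always taking the single most probable token at each step - here is a minimal sketch. The `logits_fn` interface, token IDs, and toy model are hypothetical stand-ins for illustration, not whisper.cpp's actual API:

```python
def greedy_decode(logits_fn, eot_token, max_tokens):
    """Greedy sampling: at each step, pick the single most likely token."""
    tokens = []
    for _ in range(max_tokens):
        logits = logits_fn(tokens)  # a score for every vocabulary token
        # argmax over the vocabulary - no beam, no sampling temperature
        next_token = max(range(len(logits)), key=lambda i: logits[i])
        if next_token == eot_token:
            break  # end-of-text: stop decoding
        tokens.append(next_token)
    return tokens

# Toy stand-in model: favors token 2 for three steps, then end-of-text (token 0).
def toy_logits(tokens):
    if len(tokens) < 3:
        return [0.1, 0.2, 0.9]
    return [0.9, 0.2, 0.1]

print(greedy_decode(toy_logits, eot_token=0, max_tokens=10))  # [2, 2, 2]
```

Because each step commits to one token, greedy decoding is cheap but can miss sequences whose early tokens are individually less likely - which is what `--best_of` and `--beam_size` in the original implementation would otherwise compensate for.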
@@ -319,33 +346,6 @@ https://user-images.githubusercontent.com/1991296/199337538-b7b0c7a3-2753-4a88-a
 
 ---
 
-## Implementation details
-
-- The core tensor operations are implemented in C ([ggml.h](ggml.h) / [ggml.c](ggml.c))
-- The transformer model and the high-level C-style API are implemented in C++ ([whisper.h](whisper.h) / [whisper.cpp](whisper.cpp))
-- Sample usage is demonstrated in [main.cpp](examples/main)
-- Sample real-time audio transcription from the microphone is demonstrated in [stream.cpp](examples/stream)
-- Various other examples are available in the [examples](examples) folder
-
-The tensor operators are heavily optimized for Apple silicon CPUs. Depending on the computation size, Arm Neon SIMD
-intrinsics or CBLAS Accelerate framework routines are used. The latter are especially effective for bigger sizes since
-the Accelerate framework utilizes the special-purpose AMX coprocessor available in modern Apple products.
-
-## Limitations
-
-- Inference only
-- No GPU support
-- Very basic greedy sampling scheme - always picks the token with the highest probability.
-This should be similar to the [GreedyDecoder](https://github.com/openai/whisper/blob/main/whisper/decoding.py#L249-L274)
-from the original Python implementation, so in order to make a fair comparison between the two implementations, make sure
-to run the Python code with the following parameters:
-
-```
-whisper --best_of None --beam_size None ...
-```
-
-In the future, `whisper.cpp` will support more sampling strategies.
-
 ## Benchmarks
 
 In order to have an objective comparison of the performance of the inference across different system configurations,
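The size-dependent dispatch the moved section describes - hand-written SIMD for small computations, CBLAS/Accelerate routines for large ones - can be sketched as follows. The threshold value and helper names are hypothetical stand-ins for illustration, not whisper.cpp's actual code:

```python
def matmul_naive(a, b):
    """Stand-in for the hand-optimized SIMD path used on small sizes."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def matmul_blas(a, b):
    """Stand-in for a CBLAS sgemm call (Accelerate/AMX on Apple silicon)."""
    return matmul_naive(a, b)  # same math; a real build would call into BLAS

BLAS_THRESHOLD = 32  # hypothetical cutoff; in practice tuned empirically

def matmul(a, b):
    # Large problems amortize the BLAS call overhead and benefit from the
    # AMX coprocessor; small ones are faster on the inline SIMD path.
    if len(a) >= BLAS_THRESHOLD:
        return matmul_blas(a, b)
    return matmul_naive(a, b)

print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

The design point is that neither path wins everywhere: BLAS has fixed call overhead that only pays off once the operation is big enough.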
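For context on the Benchmarks section's goal of an objective comparison across system configurations, one common pattern is to time repeated runs and report the best, which is less noisy than the mean on a busy machine. This helper is a hypothetical sketch, not the project's bench tool:

```python
import time

def benchmark(fn, *args, repeats=5):
    """Run fn several times; return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

elapsed = benchmark(sum, range(100_000))
print(f"{elapsed:.6f} s")
```

Taking the minimum filters out interference from other processes, since the fastest observed run is the closest to the workload's true cost.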