ggerganov committed
Commit 3ad485f · unverified · 1 parent: 05261df

Update README.md

Files changed (1):
  1. README.md +27 -27
README.md CHANGED
@@ -34,6 +34,33 @@ As an example, here is a video of running the model on an iPhone 13 device - ful

https://user-images.githubusercontent.com/1991296/197385372-962a6dea-bca1-4d50-bf96-1d8c27b98c81.mp4

+ ## Implementation details
+
+ - The core tensor operations are implemented in C ([ggml.h](ggml.h) / [ggml.c](ggml.c))
+ - The transformer model and the high-level C-style API are implemented in C++ ([whisper.h](whisper.h) / [whisper.cpp](whisper.cpp))
+ - Sample usage is demonstrated in [main.cpp](examples/main)
+ - Sample real-time audio transcription from the microphone is demonstrated in [stream.cpp](examples/stream)
+ - Various other examples are available in the [examples](examples) folder
+
+ The tensor operators are optimized heavily for Apple silicon CPUs. Depending on the computation size, Arm Neon SIMD
+ intrinsics or CBLAS Accelerate framework routines are used. The latter are especially effective for bigger sizes since
+ the Accelerate framework utilizes the special-purpose AMX coprocessor available in modern Apple products.
+
+ ## Limitations
+
+ - Inference only
+ - No GPU support
+ - Very basic greedy sampling scheme - always picks the token with the highest probability.
+   This should be similar to the [GreedyDecoder](https://github.com/openai/whisper/blob/main/whisper/decoding.py#L249-L274)
+   from the original Python implementation, so in order to make a fair comparison between the two implementations, make sure
+   to run the Python code with the following parameters:
+
+ ```
+ whisper --best_of None --beam_size None ...
+ ```
+
+ In the future, `whisper.cpp` will support more sampling strategies.
+
## Quick start

First, download one of the Whisper models converted in [ggml format](models). For example:
@@ -319,33 +346,6 @@ https://user-images.githubusercontent.com/1991296/199337538-b7b0c7a3-2753-4a88-a

---

- ## Implementation details
-
- - The core tensor operations are implemented in C ([ggml.h](ggml.h) / [ggml.c](ggml.c))
- - The transformer model and the high-level C-style API are implemented in C++ ([whisper.h](whisper.h) / [whisper.cpp](whisper.cpp))
- - Sample usage is demonstrated in [main.cpp](examples/main)
- - Sample real-time audio transcription from the microphone is demonstrated in [stream.cpp](examples/stream)
- - Various other examples are available in the [examples](examples) folder
-
- The tensor operators are optimized heavily for Apple silicon CPUs. Depending on the computation size, Arm Neon SIMD
- intrinsics or CBLAS Accelerate framework routines are used. The latter are especially effective for bigger sizes since
- the Accelerate framework utilizes the special-purpose AMX coprocessor available in modern Apple products.
-
- ## Limitations
-
- - Inference only
- - No GPU support
- - Very basic greedy sampling scheme - always picks the token with the highest probability.
-   This should be similar to the [GreedyDecoder](https://github.com/openai/whisper/blob/main/whisper/decoding.py#L249-L274)
-   from the original Python implementation, so in order to make a fair comparison between the two implementations, make sure
-   to run the Python code with the following parameters:
-
- ```
- whisper --best_of None --beam_size None ...
- ```
-
- In the future, `whisper.cpp` will support more sampling strategies.
-
## Benchmarks

  In order to have an objective comparison of the performance of the inference across different system configurations,
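
The "Limitations" item on greedy sampling, moved by this diff, amounts to taking an argmax over the vocabulary logits at each decoding step. A minimal sketch of that rule in C, for illustration only; `sample_greedy` is a hypothetical helper name, not the actual whisper.cpp function:

```c
#include <stddef.h>

// Greedy sampling as described in the README: at each decoding step,
// pick the token with the highest logit. Since softmax is monotonic,
// the largest logit is also the most probable token, so no explicit
// softmax is needed. Hypothetical helper, for illustration only.
static int sample_greedy(const float * logits, size_t n_vocab) {
    size_t best = 0;
    for (size_t i = 1; i < n_vocab; i++) {
        if (logits[i] > logits[best]) {
            best = i;
        }
    }
    return (int) best;
}
```

This mirrors the behavior the README compares against OpenAI's `GreedyDecoder` when the Python CLI is run with `--best_of None --beam_size None`.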
 
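The moved "Implementation details" paragraph says that larger computations are routed to CBLAS routines from Apple's Accelerate framework. A sketch of what such a call looks like, assuming row-major single-precision matrices; this is illustrative only, and the actual dispatch logic lives in ggml.c:

```c
#include <Accelerate/Accelerate.h> // build with: -framework Accelerate

// C = A * B for row-major float matrices: A is m x k, B is k x n, C is m x n.
// On modern Apple hardware, Accelerate can execute this on the AMX
// coprocessor, which is why BLAS is preferred over NEON for larger sizes.
static void matmul_f32(const float * A, const float * B, float * C,
                       int m, int n, int k) {
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k,
                1.0f, A, /*lda =*/ k,
                      B, /*ldb =*/ n,
                0.0f, C, /*ldc =*/ n);
}
```

For smaller operands the fixed overhead of a BLAS call tends to outweigh its throughput advantage, which is consistent with the README's choice of hand-written Arm Neon SIMD intrinsics for the smaller computation sizes.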