-
Notifications
You must be signed in to change notification settings - Fork 4.7k
cli: clearer errors for unsupported extensions and failed audio decode #3387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
???????????? |
@aryanrahar Can you take a look at the commit and update it, as it is a little difficult to review at the moment. It looks like some formatting has been applied perhaps. |
90a7428
to
6734aac
Compare
|
Perhaps the extension validation could be extracted into a separate function to not "interfer" with the normal code path. For example, something like this: diff --git a/examples/cli/cli.cpp b/examples/cli/cli.cpp
index a87dab1c..7a7ba113 100644
--- a/examples/cli/cli.cpp
+++ b/examples/cli/cli.cpp
@@ -1,5 +1,5 @@
#include <filesystem>
-#include <algorithm>
+#include <algorithm>
#include "common.h"
#include "common-whisper.h"
@@ -128,6 +128,38 @@ static char * requires_value_error(const std::string & arg) {
exit(0);
}
+static bool validate_audio_extension(const std::string & fname_inp) {
+ if (fname_inp == "-") {
+ return true; // stdin is always valid
+ }
+
+ std::string ext;
+ try {
+ ext = std::filesystem::path(fname_inp).extension().string();
+ std::transform(ext.begin(), ext.end(), ext.begin(),
+ [](unsigned char c){ return std::tolower(c); });
+ } catch (...) {
+ // ignore; let the decoder try
+ return true;
+ }
+
+ auto ext_supported = [](const std::string &e) {
+ // keep in sync with usage: "supported audio formats: flac, mp3, ogg, wav"
+ return e == ".wav" || e == ".mp3" || e == ".flac" || e == ".ogg";
+ };
+
+ if (!ext.empty() && !ext_supported(ext)) {
+ fprintf(stderr,
+ "error: unsupported audio extension '%s' for '%s'.\n"
+ "supported: flac, mp3, ogg, wav.\n"
+ "hint: convert with ffmpeg, e.g.:\n"
+ " ffmpeg -i \"%s\" -ar 16000 -ac 1 -c:a pcm_s16le out.wav\n",
+ ext.c_str(), fname_inp.c_str(), fname_inp.c_str());
+ return false;
+ }
+ return true;
+}
+
static bool whisper_params_parse(int argc, char ** argv, whisper_params & params) {
for (int i = 1; i < argc; i++) {
std::string arg = argv[i];
@@ -1108,43 +1140,20 @@ int main(int argc, char ** argv) {
std::vector<float> pcmf32; // mono-channel F32 PCM
std::vector<std::vector<float>> pcmf32s; // stereo-channel F32 PCM
- std::string ext;
-if (fname_inp != "-") {
- try {
- ext = std::filesystem::path(fname_inp).extension().string();
- std::transform(ext.begin(), ext.end(), ext.begin(),
- [](unsigned char c){ return std::tolower(c); });
- } catch (...) {
- // ignore; let the decoder try
- }
-}
-
-auto ext_supported = [](const std::string &e) {
- // keep in sync with usage: "supported audio formats: flac, mp3, ogg, wav"
- return e == ".wav" || e == ".mp3" || e == ".flac" || e == ".ogg";
-};
-
-if (fname_inp != "-" && !ext.empty() && !ext_supported(ext)) {
- fprintf(stderr,
- "error: unsupported audio extension '%s' for '%s'.\n"
- "supported: flac, mp3, ogg, wav.\n"
- "hint: convert with ffmpeg, e.g.:\n"
- " ffmpeg -i \"%s\" -ar 16000 -ac 1 -c:a pcm_s16le out.wav\n",
- ext.c_str(), fname_inp.c_str(), fname_inp.c_str());
- continue;
-}
+ if (!validate_audio_extension(fname_inp)) {
+ continue;
+ }
-// Try to read/decode the audio. If it fails, give an actionable hint.
-if (!::read_audio_data(fname_inp, pcmf32, pcmf32s, params.diarize)) {
- std::string det = ext.empty() ? "" : (" (detected extension: " + ext + ")");
- fprintf(stderr,
- "error: failed to decode audio from '%s'%s.\n"
- "Make sure the file is not corrupted and has one of: flac, mp3, ogg, wav.\n"
- "If you still hit this, convert to a standard WAV with:\n"
- " ffmpeg -i \"%s\" -ar 16000 -ac 1 -c:a pcm_s16le out.wav\n",
- fname_inp.c_str(), det.c_str(), fname_inp.c_str());
- continue;
-}
+ // Try to read/decode the audio. If it fails, give an actionable hint.
+ if (!::read_audio_data(fname_inp, pcmf32, pcmf32s, params.diarize)) {
+ fprintf(stderr,
+ "error: failed to decode audio from '%s'.\n"
+ "Make sure the file is not corrupted and has one of: flac, mp3, ogg, wav.\n"
+ "If you still hit this, convert to a standard WAV with:\n"
+ " ffmpeg -i \"%s\" -ar 16000 -ac 1 -c:a pcm_s16le out.wav\n",
+ fname_inp.c_str(), fname_inp.c_str());
+ continue;
+ }
if (!whisper_is_multilingual(ctx)) {
if (params.language != "en" || params.translate) { |
…timings only on success
6734aac
to
a892322
Compare
I have refactored the extension check into validate_audio_extension(), simplified the decode-failed message, and guarded timings with processed_any. The diff is now minimal and free of formatting changes. When you have a moment, could you take another look? I’m excited to keep contributing and can iterate quickly on any feedback!!!! |
?? |
can you approve the workflows? |
Add early extension check (stdin “-” still supported).
On decode failure, print a concise hint with a one-line ffmpeg command.
Keep supported set consistent with usage text (flac, mp3, ogg, wav).
Print timings only if at least one file was processed successfully.
Improves UX for cases like #3375 without changing core behavior.
Manual tests:
samples/jfk.wav → transcribes + timings
bad.m4a → “unsupported extension … (flac, mp3, ogg, wav) + ffmpeg hint”, no timings
bad.mp3 (corrupt) → “failed to decode … (detected .mp3) + ffmpeg hint”, no timings
stdin (-f -) still works