My project is a voice-controlled email website. The user speaks commands through the browser microphone. The input audio should not be stored as a file; instead, it should be streamed directly to Hugging Face's Whisper model via the Inference API. The model converts the speech to text so that further processing can be done. I'll provide the Inference API JavaScript code below, but it expects a file to read rather than a stream, so I need help modifying this code as well:
const fs = require("fs"); // Node-only; this is the part I need to replace for the browser

// Reads the audio file and POSTs its raw bytes to the Whisper model,
// returning the transcription JSON.
async function query(filename) {
  const data = fs.readFileSync(filename);
  const response = await fetch(
    "https://api-inference.huggingface.co/models/openai/whisper-medium",
    {
      headers: { Authorization: "Bearer ...." }, // token elided
      method: "POST",
      body: data,
    }
  );
  const result = await response.json();
  return result;
}

query("sample1.flac").then((response) => {
  console.log(JSON.stringify(response));
});
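From what I can tell, the endpoint simply reads the raw audio bytes from the POST body, so I imagine a browser-side version could take a Blob instead of a filename. Here is my rough sketch (queryBlob and HF_TOKEN are my own placeholder names, and I haven't verified this against the API):

  // Browser-side sketch: send recorded audio bytes (a Blob) instead of reading a file.
  // HF_TOKEN is a placeholder for my actual API token.
  const HF_TOKEN = "hf_....";

  async function queryBlob(audioBlob) {
    const response = await fetch(
      "https://api-inference.huggingface.co/models/openai/whisper-medium",
      {
        headers: { Authorization: `Bearer ${HF_TOKEN}` },
        method: "POST",
        body: audioBlob, // fetch sends the Blob's bytes as the request body
      }
    );
    return await response.json();
  }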
So, keeping in mind that the audio is to be streamed rather than saved, how do I record the user's input in the browser and stream it to Hugging Face?
So far, the most likely solution I have found is this article: Building a client-side web app which streams audio from a browser microphone to a server (Part II). However, that article has the client send the audio to a separately built intermediate server, which then makes the API calls to Dialogflow.
I need the same functionality, but without the intermediate server: the audio should go from the browser directly to Hugging Face's existing servers via their Inference API.
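For reference, this is the rough client-side recording flow I have in mind. It is a sketch only: it assumes the Inference API can decode the browser's default audio/webm recording format, and it reuses the hypothetical queryBlob function from above.

  // Sketch: capture mic audio with MediaRecorder and send it to the Inference API.
  // Assumes the API can decode audio/webm (the browser default); queryBlob is from above.
  async function recordAndTranscribe(durationMs = 5000) {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const recorder = new MediaRecorder(stream);
    const chunks = [];

    recorder.ondataavailable = (event) => chunks.push(event.data);

    const done = new Promise((resolve) => {
      recorder.onstop = async () => {
        // Combine the recorded chunks into one Blob and send it off.
        const audioBlob = new Blob(chunks, { type: recorder.mimeType });
        resolve(await queryBlob(audioBlob));
      };
    });

    recorder.start();
    setTimeout(() => {
      recorder.stop();
      stream.getTracks().forEach((track) => track.stop()); // release the mic
    }, durationMs);

    return done;
  }

  recordAndTranscribe().then((result) => console.log(result));

I realize this buffers the whole recording in memory before sending it rather than streaming chunk by chunk; nothing is written to a file, but I'd like to know whether true streaming to this endpoint is possible at all.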