
How To Speed Up Processing Time Of AWS Transcribe?

I have a 6-second audio recording (ar-01.wav) in WAV format. I want to transcribe the audio file to text using Amazon services. For that purpose I created a bucket by name test-voip a

Solution 1:

For me, AWS Transcribe took 20 minutes to transcribe a 17-minute file. One possible approach is to split the audio file into chunks and transcribe them in parallel, for example using multiprocessing with 16 cores on EC2, such as a g3.4xlarge instance.

Split the audio file into 16 parts using a silence threshold of -20 dB, then convert to .wav:

$ sudo apt-get install mp3splt
$ sudo apt-get install ffmpeg
$ mp3splt -s -p th=-20,nt=16 splitted.mp3
$ ffmpeg -i splitted.mp3 splitted.wav 
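The ffmpeg command above converts a single file, while the split step produces 16 chunks. A minimal sketch of converting every chunk, assuming the chunks have been renamed to splitted0.mp3 through splitted15.mp3 (hypothetical names chosen to match the splitted<N>.wav pattern used in the job URIs of this answer; mp3splt's default output names differ), could look like this:

```python
import subprocess

def conversion_commands(n_chunks=16, stem="splitted"):
    """Build one ffmpeg command per chunk: <stem><i>.mp3 -> <stem><i>.wav."""
    return [
        ["ffmpeg", "-i", f"{stem}{i}.mp3", f"{stem}{i}.wav"]
        for i in range(n_chunks)
    ]

def convert_all(n_chunks=16, stem="splitted"):
    """Run the conversions one after another; raises if ffmpeg fails."""
    for cmd in conversion_commands(n_chunks, stem):
        subprocess.run(cmd, check=True)
```

Calling convert_all() in the directory that holds the chunks would run the 16 conversions; the resulting .wav files still have to be uploaded to the S3 bucket before transcription.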

Then use multiprocessing with 16 worker processes transcribing simultaneously, mapping a function that calls transcribe.start_transcription_job over each TranscriptionJobName and job_uri:

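import multiprocessing
import time

import boto3

output = []
data = range(0, 16)

def f(x):
    # Start the transcription job for chunk x, then poll until it finishes.
    transcribe = boto3.client('transcribe')  # one client per call, created inside the worker
    job_name = "Name" + str(x)
    job_uri = "https://s3.amazonaws.com/bucket/splitted" + str(x) + ".wav"
    transcribe.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={'MediaFileUri': job_uri},
        MediaFormat='wav',
        LanguageCode='pt-BR',
        OutputBucketName="bucket",
        MediaSampleRateHertz=8000,
        Settings={"MaxSpeakerLabels": 2,
                  "ShowSpeakerLabels": True})
    while True:
        status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
        if status['TranscriptionJob']['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
            break
        time.sleep(5)  # pause between polls instead of calling the API in a tight loop
    # Return the final status so the caller can see which chunks failed.
    return status['TranscriptionJob']['TranscriptionJobStatus']

def mp_handler():
    # Run f over all 16 chunk indices in parallel.
    p = multiprocessing.Pool(16)
    r = p.map(f, data)
    return r

if __name__ == '__main__':
    output.append(mp_handler())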
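Once all jobs complete, the per-chunk transcripts still have to be put back together in the right order. The following is only a sketch under assumptions: it presumes each job's output JSON has already been downloaded from the output bucket and parsed into a dict, and that the transcript text sits under results → transcripts → transcript, as in the standard Transcribe output. The helper names job_index and merge_transcripts are hypothetical.

```python
def job_index(job_name, prefix="Name"):
    """Extract the numeric chunk index from a job name such as 'Name7'."""
    return int(job_name[len(prefix):])

def merge_transcripts(results_by_job):
    """Join per-chunk transcripts in chunk order.

    results_by_job maps a job name to the parsed Transcribe output JSON.
    """
    # Sort numerically so that 'Name10' comes after 'Name2'.
    ordered = sorted(results_by_job, key=job_index)
    parts = []
    for name in ordered:
        transcripts = results_by_job[name]["results"]["transcripts"]
        parts.append(" ".join(t["transcript"] for t in transcripts))
    # Skip chunks that produced no text (for example, pure silence).
    return " ".join(p for p in parts if p)
```

Sorting on the numeric suffix matters because a plain string sort would place Name10 before Name2. Speaker labels are assigned per job, so they are not guaranteed to be consistent across chunks.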

Solution 2:

I searched for a transcription speed guarantee, with no luck.

In this forum post (requires an AWS account), a poster reports a benchmark with the following results:

  • A 10-minute clip took about 5 minutes
  • 40-minute clips take around 17 minutes
  • A 2-hour file took 36 minutes
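To make the trend in these figures easier to see, the processing time can be divided by the audio duration. A small sketch using only the numbers reported above (the labels and the processing_ratio helper are illustrative):

```python
# (label, audio minutes, processing minutes) as reported in the benchmark
benchmarks = [
    ("10 min clip", 10, 5),
    ("40 min clip", 40, 17),
    ("2 hour file", 120, 36),
]

def processing_ratio(audio_minutes, processing_minutes):
    """Return processing time divided by audio duration."""
    return processing_minutes / audio_minutes

ratios = {label: processing_ratio(a, p) for label, a, p in benchmarks}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:.3f}")
# prints:
# 10 min clip: 0.500
# 40 min clip: 0.425
# 2 hour file: 0.300
```

On these figures the ratio falls as the audio gets longer, so splitting a file into many short chunks trades better parallelism against a worse per-chunk ratio.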

What seems to be an official Amazon source states: "At this time, transcription speeds are better optimized for audio longer than 30 seconds. You'll start to see a better processing time to audio duration time ratio when the audio file length is about 2 minutes or longer. Having said this, we are working hard to enhance transcription speeds overall."

I hope this helps other researchers.
