I have a customer service conversation in a WAV file, which I split into its 2 audio channels. Now I have 2 WAV files, one per speaker, and each contains periods of silence. I need to cut out those silent periods to "compress" each person's speech into a shorter file.

I googled and found this link. It has this code:

    def addFrameWithTransition(self, image_file, audio_file, transition_file):
        media_info = MediaInfo.parse(transition_file)
        duration_in_ms = media_info.tracks[0].duration
        audio_file = audio_file.replace("\\", "/")
        try:
            audio_clip = AudioSegment.from_wav(r"%s" % audio_file)
            f = sf.SoundFile(r"%s" % audio_file)
        except Exception as e:
            print(e)
            audio_clip = AudioSegment.from_wav("%s/pause.wav" % settings.assetPath)
            f = sf.SoundFile("%s/pause.wav" % settings.assetPath)
        duration = (len(f) / f.samplerate)
        audio_clip_with_pause = audio_clip
        self.imageframes.append(image_file)
        self.audiofiles.append(audio_clip_with_pause)
        self.durations.append(duration)
        self.transitions.append((transition_file, len(self.imageframes) - 1, duration_in_ms / 1000))

But it requires some kind of image file. Are there any other options?

  • How do you define silence? Is it a period with no sound above a certain threshold, or a lack of voice? Commented Apr 10, 2020 at 20:12
  • @LukaszTracewski lack of voice Commented Apr 10, 2020 at 20:18
  • github.com/pradbajaj/bothoven/blob/master/sound.py. This detects silence, keeps the parts that contain sound, and finds the frequency nodes. Commented Apr 21, 2020 at 2:19

1 Answer

I found a small script, vad.py, that splits a conversation into two and compresses each voice track. In the end you will have 2 WAV files, each with only one person speaking.

https://github.com/mauriciovander/silence-removal/blob/master/vad.py

It works like this:

python vad name_of_new_file.wav 
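
If you would rather not depend on an external script, the general idea of threshold-based silence removal can be sketched with the standard library alone. This is not the code from vad.py, just a minimal illustration under some assumptions: the input is a mono, 16-bit PCM WAV file (one of the two channel files from the question), and the function name `remove_silence`, the 20 ms frame size, and the RMS threshold of 500 are all hypothetical values you would need to tune for your recordings.

```python
import array
import math
import sys
import wave


def remove_silence(in_path, out_path, threshold=500.0, frame_ms=20):
    """Write out_path with only the frames of in_path whose RMS >= threshold.

    Assumes mono, 16-bit PCM input. threshold is in raw sample units
    (0..32767); both threshold and frame_ms are illustrative defaults.
    Returns the duration of the kept audio in seconds.
    """
    with wave.open(in_path, "rb") as src:
        if src.getnchannels() != 1 or src.getsampwidth() != 2:
            raise ValueError("expected a mono 16-bit PCM WAV file")
        rate = src.getframerate()
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))

    # WAV data is little-endian; convert on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()

    frame_len = max(1, rate * frame_ms // 1000)
    kept = array.array("h")
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        # Root-mean-square amplitude of this short frame.
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms >= threshold:
            kept.extend(frame)

    # Duration is computed before any byte swap (swapping keeps the length).
    duration = len(kept) / rate
    if sys.byteorder == "big":
        kept.byteswap()

    with wave.open(out_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(kept.tobytes())
    return duration
```

Usage would look like `remove_silence("customer.wav", "customer_compact.wav")`, where both file names are placeholders. Keep in mind that an energy threshold keeps any sound louder than the threshold, not only speech, and that cutting at hard frame boundaries can produce audible clicks; keeping a few frames of padding around each kept region is a common refinement.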

1 Comment

The code you shared is not, despite the name, a voice activity detector (VAD). It's a crude "activity" detector that will be triggered by any noise, not just voice, that exceeds the threshold.
