pepper.framework.sensor.vad module

class pepper.framework.sensor.vad.VAD(microphone)[source]

Bases: object

Perform Voice Activity Detection on Microphone Input

Parameters: microphone (AbstractMicrophone)
AUDIO_FRAME_MS = 10
AUDIO_TYPE

alias of numpy.int16

AUDIO_TYPE_BYTES = 2
BUFFER_SIZE = 100
MODE = 3
VOICE_THRESHOLD = 0.6
VOICE_WINDOW = 50
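For intuition, the constants above pin down the frame geometry: 10 ms frames of 16-bit samples. A small sketch, assuming a 16 kHz microphone (the sample rate comes from the microphone, not from this class):

```python
# Frame geometry implied by the class constants, assuming 16 kHz input.
SAMPLE_RATE = 16_000          # Hz (assumed; set by the microphone, not the VAD)
AUDIO_FRAME_MS = 10           # frame length in milliseconds
AUDIO_TYPE_BYTES = 2          # numpy.int16 -> 2 bytes per sample

samples_per_frame = SAMPLE_RATE * AUDIO_FRAME_MS // 1000
bytes_per_frame = samples_per_frame * AUDIO_TYPE_BYTES

print(samples_per_frame)  # 160 samples per 10 ms frame
print(bytes_per_frame)    # 320 bytes per frame
```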
activation

VAD Activation

Returns: activation
Return type: float
microphone

VAD Microphone

Returns: microphone
Return type: AbstractMicrophone
voices

Get Voices from Microphone Stream

Yields: voices (Iterable[Voice])
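Taken together, activation, VOICE_WINDOW and VOICE_THRESHOLD suggest a windowed gating scheme: speech is considered active once the fraction of recent speech-positive frames exceeds the threshold. A minimal, self-contained sketch of that logic, with the per-frame decisions stubbed out (the real class derives them from a frame-level detector on the microphone stream; the variable names here are illustrative):

```python
from collections import deque

VOICE_WINDOW = 50      # number of recent frame decisions considered
VOICE_THRESHOLD = 0.6  # fraction of speech frames needed to activate

def activation(window):
    """Fraction of frames in the window judged to contain speech."""
    return sum(window) / len(window) if window else 0.0

# Illustrative stream of per-frame speech decisions (True = speech),
# standing in for the output of a frame-level detector.
decisions = [False] * 60 + [True] * 60 + [False] * 60

window = deque(maxlen=VOICE_WINDOW)
active = []
for is_speech in decisions:
    window.append(is_speech)
    active.append(activation(window) > VOICE_THRESHOLD)

print(any(active))  # True: activation crosses the threshold mid-stream
```

With a window of 50 and a threshold of 0.6, at least 31 of the last 50 frames must be speech-positive before a segment opens, which suppresses isolated false positives.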
class pepper.framework.sensor.vad.Voice[source]

Bases: object

Voice Object (for Voice Activity Detection: VAD)

add_frame(frame)[source]

Add a Voice Frame (called by the VAD)

Parameters: frame (np.ndarray)
audio

Get Voice Audio (Concatenated Frames)

Returns: audio
Return type: np.ndarray
frames

Get Voice Frames (chunks of audio)

Yields: frames (Iterable[np.ndarray])
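A minimal sketch of the behaviour these members describe, assuming a simple list-backed container (real instances are filled frame-by-frame by the VAD; the constructor layout here is an assumption):

```python
import numpy as np

class Voice:
    """Sketch of a Voice container: frames in, concatenated audio out."""

    def __init__(self):
        self._frames = []

    def add_frame(self, frame):
        # Called by the VAD for each audio chunk judged to contain speech
        self._frames.append(frame)

    @property
    def frames(self):
        # Yield the raw frames (chunks of audio)
        yield from self._frames

    @property
    def audio(self):
        # Concatenate all frames into a single array
        return np.concatenate(self._frames)

voice = Voice()
voice.add_frame(np.zeros(160, dtype=np.int16))
voice.add_frame(np.ones(160, dtype=np.int16))
print(voice.audio.shape)  # (320,)
```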