emvoice.formants

Formant-related voice features.

Module Contents

Classes

FormantFrames

Estimate and store formant frames.

FormantAmplitudeFrames

Estimate and store formant amplitudes.

class emvoice.formants.FormantFrames(frames: List, sr: int, frame_len: int, hop_len: int, center: bool = True, pad_mode: str = 'constant', max_formants: int = 5, lower: float = 50.0, upper: float = 5450.0, preemphasis_from: Optional[float] = 50.0, window: Optional[Union[str, float, Tuple]] = 'praat_gaussian')[source]

Bases: emvoice.frames.BaseFrames

Estimate and store formant frames.

Parameters:
  • frames (list) – Formant frames. Each frame is a list of tuples, one per formant, where the first item is the central frequency and the second is the bandwidth.

  • max_formants (int, default=5) – The maximum number of formants that were extracted.

  • lower (float, default=50.0) – Lower limit for formant frequencies (in Hz).

  • upper (float, default=5450.0) – Upper limit for formant frequencies (in Hz).

  • preemphasis_from (float, default=50.0) – Starting value for the applied preemphasis function (in Hz).

  • window (str, float, or tuple, default='praat_gaussian') – Window function that was applied before formant estimation.
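The nested layout of `frames` described above can be awkward to index. A minimal sketch of one way to pack it into a NaN-padded array follows; the helper name `formant_frames_to_array` and the sample values are hypothetical and not part of emvoice.

```python
import numpy as np


def formant_frames_to_array(frames, max_formants=5):
    """Pack per-frame (frequency, bandwidth) tuples into a NaN-padded array.

    Hypothetical helper for illustration only; not part of emvoice.
    Returns an array of shape (num_frames, max_formants, 2).
    """
    out = np.full((len(frames), max_formants, 2), np.nan)
    for i, frame in enumerate(frames):
        # Keep at most max_formants entries per frame
        for j, (freq, bandwidth) in enumerate(frame[:max_formants]):
            out[i, j] = (freq, bandwidth)
    return out


# Made-up data: two frames with three and two formants respectively
example_frames = [
    [(700.0, 90.0), (1200.0, 110.0), (2600.0, 150.0)],
    [(650.0, 80.0), (1100.0, 100.0)],
]
formant_array = formant_frames_to_array(example_frames)
```

Slots for formants that were not found in a frame stay NaN, so functions such as `np.nanmean` can be applied per formant.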

Notes

See the Algorithms section for details.

property idx: numpy.ndarray[source]

Frame indices (read-only).

classmethod from_frames(sig_frames_obj: emvoice.frames.BaseFrames, max_formants: int = 5, lower: float = 50.0, upper: float = 5450.0, preemphasis_from: Optional[float] = 50.0, window: Optional[Union[str, float, Tuple]] = 'praat_gaussian')[source]

Extract formants from signal frames.

Parameters:
  • sig_frames_obj (BaseFrames) – Signal frames object.

  • max_formants (int, default=5) – The maximum number of formants that are extracted.

  • lower (float, default=50.0) – Lower limit for formant frequencies (in Hz).

  • upper (float, default=5450.0) – Upper limit for formant frequencies (in Hz).

  • preemphasis_from (float, default=50.0) – Starting value for the preemphasis function (in Hz).

  • window (str, float, or tuple, default='praat_gaussian') – Window function that is applied before formant estimation.
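To illustrate what `preemphasis_from` might control, here is a minimal sketch of a first-order preemphasis filter in the style used by Praat. That emvoice uses this exact filter is an assumption suggested by the 'praat_gaussian' default, not something this page confirms; the function name `preemphasize` is hypothetical.

```python
import numpy as np


def preemphasize(sig, sr, preemphasis_from=50.0):
    """Apply a first-order preemphasis filter (Praat-style sketch).

    Assumption: emvoice's actual filter may differ. Here the coefficient
    is alpha = exp(-2 * pi * preemphasis_from / sr) and each sample has
    alpha times the previous sample subtracted from it.
    """
    sig = np.asarray(sig, dtype=float)
    if preemphasis_from is None:
        # No preemphasis requested: return an unchanged copy
        return sig.copy()
    alpha = np.exp(-2.0 * np.pi * preemphasis_from / sr)
    out = sig.copy()
    out[1:] = sig[1:] - alpha * sig[:-1]
    return out


# A constant signal is attenuated strongly, since it has no high-frequency content
flat = np.ones(4)
emphasized = preemphasize(flat, sr=16000)
untouched = preemphasize(flat, sr=16000, preemphasis_from=None)
```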

class emvoice.formants.FormantAmplitudeFrames(frames: numpy.ndarray, sr: int, frame_len: int, hop_len: int, center: bool, pad_mode: str, lower: float, upper: float, rel_f0: bool)[source]

Bases: emvoice.frames.BaseFrames

Estimate and store formant amplitudes.

Parameters:
  • frames (np.ndarray) – Formant amplitude frames of shape (num_frames, max_formants) in dB.

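  • lower (float) – Lower boundary for the peak amplitude search interval, as a factor of the formant's central frequency.

  • upper (float) – Upper boundary for the peak amplitude search interval, as a factor of the formant's central frequency.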

  • rel_f0 (bool) – Whether the amplitude is relative to the fundamental frequency amplitude.

Notes

Estimate the formant amplitude as the maximum amplitude of the harmonics of the fundamental frequency that fall within the interval [lower*f, upper*f], where f is the central frequency of the formant in each frame. If rel_f0=True, divide the amplitude by the amplitude of the fundamental frequency.
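The search described above can be sketched for a single formant in a single frame as follows. This is an illustration under assumptions, not emvoice's implementation: the function name and sample values are hypothetical, amplitudes are treated as linear (the conversion to dB is left out), and returning NaN when no harmonic falls inside the interval is a choice made here.

```python
import numpy as np


def formant_amplitude(harmonic_freqs, harmonic_amps, formant_freq,
                      f0_amp=None, lower=0.8, upper=1.2):
    """Peak harmonic amplitude within [lower * f, upper * f].

    Hypothetical helper; if f0_amp is given, the peak is divided by it,
    mirroring rel_f0=True.
    """
    freqs = np.asarray(harmonic_freqs, dtype=float)
    amps = np.asarray(harmonic_amps, dtype=float)
    in_range = (freqs >= lower * formant_freq) & (freqs <= upper * formant_freq)
    if not in_range.any():
        # Assumption: no harmonic in the interval yields NaN
        return np.nan
    peak = amps[in_range].max()
    if f0_amp is not None:
        peak = peak / f0_amp
    return peak


# Made-up harmonics of a 100 Hz fundamental with linear amplitudes
harm_freqs = np.arange(1, 11) * 100.0
harm_amps = np.array([2.0, 0.8, 0.6, 0.5, 0.9, 0.7, 0.3, 0.2, 0.1, 0.05])

# Formant at 520 Hz: the interval [416, 624] contains the 500 and 600 Hz harmonics
abs_amp = formant_amplitude(harm_freqs, harm_amps, 520.0)
rel_amp = formant_amplitude(harm_freqs, harm_amps, 520.0, f0_amp=harm_amps[0])
no_amp = formant_amplitude(harm_freqs, harm_amps, 5000.0)
```

With the defaults lower=0.8 and upper=1.2, the search window spans 20 % on either side of the formant's central frequency.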

property idx: numpy.ndarray[source]

Frame indices (read-only).

classmethod from_formant_harmonics_and_pitch_frames(formant_frames_obj: FormantFrames, harmonics_frames_obj: emvoice.pitch.PitchHarmonicsFrames, pitch_frames_obj: emvoice.pitch.PitchFrames, lower: float = 0.8, upper: float = 1.2, rel_f0: bool = True)[source]

Estimate formant amplitudes from formant, pitch harmonics, and pitch frames.

Parameters:
  • formant_frames_obj (FormantFrames) – Formant frames object.

  • harmonics_frames_obj (PitchHarmonicsFrames) – Pitch harmonics frames object.

  • pitch_frames_obj (PitchFrames) – Pitch frames object.

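  • lower (float, optional, default=0.8) – Lower boundary for the peak amplitude search interval, as a factor of the formant's central frequency.

  • upper (float, optional, default=1.2) – Upper boundary for the peak amplitude search interval, as a factor of the formant's central frequency.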

  • rel_f0 (bool, optional, default=True) – Whether the amplitude is divided by the fundamental frequency amplitude.