Vocal Signals
Voice is a strong stress indicator. Under stress, vocal pitch tends to rise, speech rate changes, and tonal quality shifts. Nefesh accepts pre-extracted vocal features and fuses them with other signals to build a more complete picture of the speaker's state.
Accepted Fields
| Field | Type | Values / Unit | Description |
|---|---|---|---|
| tone | string | calm, focused, hesitant, tense, frustrated, anxious, hostile, excited, neutral | Classified vocal tone |
| speech_rate | float | words/min | Speaking speed. Elevated rates may indicate agitation |
| pitch_variability | float | Hz | Variation in fundamental frequency. Low variability can indicate monotone or suppressed speech |
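A client can check features against the table before sending them. The following is a minimal sketch, not part of Nefesh: the function name `validate_vocal_features` is hypothetical, the fields are treated as optional because this page does not say which are required, and the non-negative check is an assumption based on the units.

```python
from numbers import Real

# Tone values copied from the Accepted Fields table.
ALLOWED_TONES = {
    "calm", "focused", "hesitant", "tense", "frustrated",
    "anxious", "hostile", "excited", "neutral",
}


def validate_vocal_features(features: dict) -> list[str]:
    """Return a list of problems; an empty list means the features match the table.

    Hypothetical helper. Missing fields are skipped (assumed optional).
    """
    problems = []

    tone = features.get("tone")
    if tone is not None and tone not in ALLOWED_TONES:
        problems.append(f"tone: unknown value {tone!r}")

    for name in ("speech_rate", "pitch_variability"):
        value = features.get(name)
        if value is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            problems.append(f"{name}: expected a number, got {type(value).__name__}")
        elif value < 0:
            # Assumption: words/min and Hz variability cannot be negative.
            problems.append(f"{name}: must be non-negative")

    return problems
```

Returning a list of problems instead of raising lets the caller log every issue in one pass.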
How It Works
Nefesh does not process raw audio. You send pre-classified vocal features (from your own speech analysis pipeline or a third-party service), and Nefesh fuses them with cardiovascular, visual, and textual signals.
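Because feature extraction happens on your side, the two numeric fields have to be computed before the request is made. The sketch below shows one plausible way to do that, under stated assumptions: this page does not define how pitch_variability is measured, so the standard deviation of voiced-frame F0 values is used here as an assumption, and both function names are hypothetical. The F0 track itself would come from whatever pitch tracker your pipeline already uses.

```python
import statistics


def speech_rate_wpm(word_count: int, duration_seconds: float) -> float:
    """Words per minute from a transcript word count and the audio duration."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return word_count * 60.0 / duration_seconds


def pitch_variability_hz(f0_hz: list[float]) -> float:
    """Spread of fundamental frequency in Hz.

    Assumptions: the input holds one F0 estimate per frame, unvoiced frames
    are marked with 0, and variability means population standard deviation.
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        return 0.0  # not enough voiced frames to measure any spread
    return statistics.pstdev(voiced)


# Usage: 42 words spoken over 15 seconds, plus a short F0 track.
rate = speech_rate_wpm(42, 15.0)
variability = pitch_variability_hz([100.0, 120.0, 0.0, 140.0])
```

If your speech analysis service already reports these values, pass them through unchanged instead of recomputing them.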
The tone field has the strongest impact. Values like tense, anxious, and hostile shift the stress score upward, while calm and neutral shift it downward.
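For client-side logging or debugging, the documented directions can be captured in a small lookup. This sketch only restates the paragraph above and is not Nefesh's scoring logic: the helper name is hypothetical, no weights are implied, and because the text says "values like", the tones it does not mention are reported as unspecified.

```python
# Tones this page names as raising or lowering the stress score.
RAISES_STRESS = {"tense", "anxious", "hostile"}
LOWERS_STRESS = {"calm", "neutral"}


def documented_tone_direction(tone: str) -> str:
    """Return "up", "down", or "unspecified" for a tone value.

    Hypothetical helper: it mirrors the documentation only and says
    nothing about how large the shift is.
    """
    if tone in RAISES_STRESS:
        return "up"
    if tone in LOWERS_STRESS:
        return "down"
    return "unspecified"
```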
Example Payload
{
  "session_id": "sess_abc123",
  "timestamp": "2026-03-30T14:30:00Z",
  "tone": "tense",
  "speech_rate": 168.5,
  "pitch_variability": 42.3
}
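A payload with this shape can be assembled with the standard library alone. The sketch below is an illustration under assumptions: `build_vocal_payload` is a hypothetical name, and no endpoint or authentication scheme is given on this page, so the code stops at producing the JSON string.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_vocal_payload(
    session_id: str,
    tone: str,
    speech_rate: float,
    pitch_variability: float,
    when: Optional[datetime] = None,
) -> str:
    """Serialize vocal features into the JSON shape shown in the example."""
    when = when or datetime.now(timezone.utc)
    # Normalize to UTC and use the "Z" suffix, as in the example timestamp.
    # Note: a naive datetime is interpreted as local time by astimezone().
    timestamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    payload = {
        "session_id": session_id,
        "timestamp": timestamp,
        "tone": tone,
        "speech_rate": speech_rate,
        "pitch_variability": pitch_variability,
    }
    return json.dumps(payload)


# Usage: reproduce the example payload for a fixed point in time.
body = build_vocal_payload(
    "sess_abc123", "tense", 168.5, 42.3,
    when=datetime(2026, 3, 30, 14, 30, tzinfo=timezone.utc),
)
```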
Privacy
No audio is sent to or stored by Nefesh. Only the extracted features (tone classification, speech rate, pitch variability) are transmitted. Raw audio stays on the client.
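One way to enforce this on the client is to pass every outgoing record through an allowlist, so that audio buffers or transcripts held locally can never end up in the request. This is a sketch, not a Nefesh requirement: the helper name is hypothetical, and the `audio` and `transcript` keys stand in for whatever your pipeline keeps locally.

```python
# The only keys that appear in the example payload on this page.
TRANSMITTED_FIELDS = (
    "session_id", "timestamp", "tone", "speech_rate", "pitch_variability",
)


def strip_to_transmitted_fields(record: dict) -> dict:
    """Return a copy of record that contains only allowlisted keys."""
    return {key: record[key] for key in TRANSMITTED_FIELDS if key in record}


# Usage: a local analysis record that still holds raw data (hypothetical keys).
local_record = {
    "session_id": "sess_abc123",
    "tone": "tense",
    "speech_rate": 168.5,
    "audio": b"\x00\x01",           # stays on the client
    "transcript": "example words",  # stays on the client
}
outgoing = strip_to_transmitted_fields(local_record)
```

An allowlist is safer than removing known-sensitive keys, because fields added to the local record later are excluded by default.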