Sales Call Analysis Pipeline
Automated pipeline that extracts 150+ call recordings from a legacy dashboard with no API, transcribes them via Deepgram, and uses GPT-4 to surface converting phrases and pain points.
The Problem
Outsourced call handlers used a legacy dashboard with no API and no export. The goal was to extract 150+ recordings, transcribe them, and identify which language converts prospects into customers, in order to inform marketing spend.
My Approach
1. Attempted Puppeteer DOM scraping on dynamic selectors: brittle and unreliable between sessions.
2. Discovered predictable URL patterns for recordings, pivoted to Excel-based ID export plus batch download with session cookies.
3. Added FFmpeg validation to catch corrupt audio files before sending to Deepgram for transcription.
4. Ran two-level GPT-4 analysis: per-call rubric scoring and batch marketing analysis across groups of 5 transcripts.
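The download and grouping steps above can be sketched roughly as follows. This is a minimal illustration, not the original code: the URL template, the cookie string, and the function names `build_request` and `group_transcripts` are all hypothetical, since the real dashboard's URL pattern is not given here.

```python
from urllib.request import Request

# Hypothetical URL template; the real dashboard's pattern is an assumption.
URL_TEMPLATE = "https://dashboard.example.com/recordings/{call_id}.mp3"


def build_request(call_id: str, session_cookie: str) -> Request:
    """Build an authenticated download request for one exported recording ID."""
    url = URL_TEMPLATE.format(call_id=call_id)
    # Reuse the logged-in browser session by sending its cookie header.
    return Request(url, headers={"Cookie": session_cookie})


def group_transcripts(transcripts: list[str], size: int = 5) -> list[list[str]]:
    """Split transcripts into consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [transcripts[i:i + size] for i in range(0, len(transcripts), size)]
```

Each request could then be passed to `urllib.request.urlopen` in a loop over the exported IDs, and each group of transcripts sent as one batch analysis prompt.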
Key Decisions & Trade-offs
Chose API-less extraction via URL pattern discovery over DOM scraping because it was more reliable. Added FFmpeg pre-validation to avoid wasting Deepgram credits on corrupt files. Grouped transcripts into batches to stay within GPT-4 token limits.
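One common way to do this kind of pre-validation is to ask FFmpeg to decode the whole file and discard the output, treating any reported error as a sign of corruption. The sketch below assumes `ffmpeg` is on the PATH. The helper names are hypothetical, and the exact check used in the original pipeline may have differed.

```python
import subprocess


def ffmpeg_check_command(path: str) -> list[str]:
    """Command that decodes the full file and writes the result nowhere."""
    # -v error: print only errors; -f null -: decode but discard the output.
    return ["ffmpeg", "-v", "error", "-i", path, "-f", "null", "-"]


def is_valid_audio(path: str) -> bool:
    """True if FFmpeg exits with code 0 and reports no decode errors."""
    result = subprocess.run(
        ffmpeg_check_command(path), capture_output=True, text=True
    )
    return result.returncode == 0 and not result.stderr.strip()
```

Files that fail the check can be skipped or re-downloaded before any transcription request is made.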
Outcome & Impact
- Processed 150+ recordings end-to-end
- Surfaced specific phrases prospects responded to, which were used directly in Google PPC ads, social ads, and landing page CTAs
- Pipeline ran in minutes vs. 4–5 hours of manual audio listening
Reflection
When a process becomes time-consuming, tedious, and repetitive, automate it. Real customer data produces real marketing insights: no number of persona workshops replaces hearing actual prospect language.