AI Integration · July 2025

Filipino Speech Coach

Role

Backend and AI integration developer

Tech stack
Next.js, FastAPI, Azure VM, WhisperX, Supabase, Cloudflare R2
Overview

A cloud-native speech training application that sends user audio to a FastAPI backend hosted on an Azure VM. The backend runs the WhisperX model for character-level alignment and phoneme transcription. Metadata and transaction logs are stored in Supabase, while audio recordings and public assets are stored in and served from Cloudflare R2 bucket storage.
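
As an illustration of how character-level timing might be consumed downstream, the sketch below groups per-character timestamps into per-word spans. The input shape (a list of dicts with "char", "start", and "end" keys, in seconds) is an assumption modeled on WhisperX character alignments; the names `WordTiming` and `group_chars_into_words` are hypothetical and not taken from the project.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class WordTiming:
    text: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def group_chars_into_words(chars: List[Dict]) -> List[WordTiming]:
    """Group per-character timings into per-word timings.

    Assumed input: dicts with a "char" key and optional "start"/"end"
    keys in seconds. Whitespace characters mark word boundaries.
    """
    words: List[WordTiming] = []
    text = ""
    start: Optional[float] = None
    end: Optional[float] = None
    # A trailing sentinel space flushes the final word.
    for entry in list(chars) + [{"char": " "}]:
        ch = entry["char"]
        if ch.isspace():
            # Only emit words that have at least one timed character.
            if text and start is not None and end is not None:
                words.append(WordTiming(text, start, end))
            text, start, end = "", None, None
            continue
        text += ch
        if "start" in entry and "end" in entry:
            start = entry["start"] if start is None else min(start, entry["start"])
            end = entry["end"] if end is None else max(end, entry["end"])
    return words
```

Per-word or per-syllable durations like these are one possible input to a stress analysis, alongside acoustic features such as pitch and energy; the project's actual features are not specified here.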

Key Contributions & Findings
  • 01 Integrated WhisperX speech-to-text with phoneme alignment, pairing character-level timing with acoustic features for lexical stress analysis.
  • 02 Deployed a FastAPI server on an Azure VM.
  • 03 Used Supabase for real-time user database sync and Cloudflare R2 bucket storage for fast, responsive asset loading.
  • 04 Implemented spaced repetition scheduling logic to improve user retention and pronunciation over time.
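
The spaced repetition scheduling mentioned in the list can be sketched as follows. The project does not state which algorithm it uses, so this is a minimal SM-2-style example under that assumption; `Card` and `review` are hypothetical names, and the 0 to 5 quality grade is the SM-2 convention rather than something confirmed by the source.

```python
from dataclasses import dataclass


@dataclass
class Card:
    interval_days: int = 0   # days until the next review
    repetitions: int = 0     # consecutive successful reviews
    ease: float = 2.5        # SM-2 ease factor, floored at 1.3


def review(card: Card, quality: int) -> Card:
    """Return the card's next state after a review graded 0 (fail) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the sequence, leave the ease factor unchanged.
        return Card(interval_days=1, repetitions=0, ease=card.ease)
    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        # Later intervals grow by the ease factor held before this review.
        interval = round(card.interval_days * card.ease)
    miss = 5 - quality
    ease = card.ease + (0.1 - miss * (0.08 + miss * 0.02))
    return Card(interval, card.repetitions + 1, max(1.3, ease))
```

In an app like this one, a pronunciation score for a word could be mapped onto the quality grade, so that poorly pronounced words come back sooner; that mapping is an assumption and would need tuning.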
Infrastructure

System Architecture

The system is a cloud-native machine learning pipeline. The Next.js (React) frontend sends audio payloads through REST API endpoints to a FastAPI server hosted on an Azure VM, which runs phoneme-level WhisperX inference. User scores and metadata are stored in Supabase, while raw audio files are persisted securely in Cloudflare R2 bucket storage.
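
The split between the two stores can be sketched as a pure function that, for one practice attempt, derives an object key for the audio and a metadata row that points at it. The key layout, the column names, and the function name `plan_storage` are all hypothetical; the actual upload to R2 and insert into Supabase are deliberately left out.

```python
import hashlib
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


def plan_storage(
    user_id: str,
    audio: bytes,
    score: float,
    now: Optional[datetime] = None,
) -> Tuple[str, Dict]:
    """Split one attempt into an object-store key and a database row."""
    now = now or datetime.now(timezone.utc)
    digest = hashlib.sha256(audio).hexdigest()
    # Hypothetical key layout: one prefix per user, content hash as file name.
    object_key = f"recordings/{user_id}/{digest[:16]}.wav"
    # Hypothetical row shape: the database stores only a pointer to the audio.
    row = {
        "user_id": user_id,
        "audio_key": object_key,
        "score": round(score, 3),
        "created_at": now.isoformat(),
    }
    return object_key, row
```

Keeping only the key in the database row means the large audio blob never passes through the relational store, which matches the division of responsibilities described above.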