VocalizeWeb is a scalable AI platform that lets organizations build intelligent voice assistants on top of their own documents and knowledge bases. Users upload PDFs, research papers, and other files, then query the system by voice or text and receive answers grounded in those documents, with source citations. The platform combines speech recognition, text-to-speech, and Retrieval-Augmented Generation (RAG) to support fast, natural conversations in over 99 languages. Built with FastAPI, React, Whisper, and Kokoro-82M, VocalizeWeb isolates each client's data, supports more than 100 concurrent users, and responds in real time with low latency, making it a practical enterprise option for privacy-focused, voice-first AI applications.
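The retrieval flow described above, where each client's documents are searched in isolation and every answer carries its source, can be illustrated with a minimal sketch. This is not VocalizeWeb's implementation: `TenantStore`, `Chunk`, and the `client_id` parameter are hypothetical names, and a toy bag-of-words cosine score stands in for the learned embeddings and FAISS index a real deployment would presumably use.

```python
from __future__ import annotations

import math
import re
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source: str  # file name returned to the user as the citation
    text: str


def _vectorize(text: str) -> Counter:
    # Toy bag-of-words vector (assumption: stands in for a real embedding model).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TenantStore:
    """Hypothetical per-client store: each client_id has its own chunk list."""

    def __init__(self) -> None:
        self._chunks: dict[str, list[Chunk]] = defaultdict(list)

    def add(self, client_id: str, source: str, text: str) -> None:
        self._chunks[client_id].append(Chunk(source, text))

    def search(self, client_id: str, query: str, k: int = 2) -> list[Chunk]:
        # Only the requesting client's chunks are scored; other clients'
        # documents are never touched, which is the isolation property.
        q = _vectorize(query)
        scored = [
            (_cosine(q, _vectorize(c.text)), c)
            for c in self._chunks.get(client_id, [])
        ]
        scored = [(s, c) for s, c in scored if s > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [c for _, c in scored[:k]]


if __name__ == "__main__":
    store = TenantStore()
    store.add("acme", "handbook.pdf", "Employees accrue vacation days monthly.")
    store.add("globex", "policy.pdf", "Vacation requests need manager approval.")
    for hit in store.search("acme", "How many vacation days do employees get?"):
        print(f"{hit.text} [source: {hit.source}]")
```

In a full pipeline, the retrieved chunks would be passed to a language model as context and the `source` fields surfaced as citations. Keying every lookup by client keeps one organization's documents out of another's answers.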
Tools: FastAPI, LangChain, TTS: Kokoro, Docker, Azure, STT: Faster Whisper, PostgreSQL, FAISS
Department: Department of Computer Science
Poster