Voice to Voice AI Assistant for Universities and Enterprises

VocalizeWeb is a scalable AI platform that lets organizations build intelligent voice assistants on top of their own documents and knowledge bases. Users upload PDFs, research papers, and other files, then query the system by voice or text and receive document-grounded answers with source citations. The platform combines speech recognition, text-to-speech, and Retrieval-Augmented Generation (RAG) to hold fast, natural conversations in over 99 languages. Built with FastAPI, React, Whisper, and Kokoro-82M, VocalizeWeb isolates each client's data, supports more than 100 concurrent users, and responds in real time with low latency, making it a practical option for enterprises that need privacy-focused, voice-first AI applications.
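
To make the per-client isolation and source-citation ideas concrete, here is a minimal sketch of the retrieval step only. It is not the project's implementation: the names TenantStore, Chunk, add, and retrieve are hypothetical, and a toy bag-of-words cosine score stands in for the embedding search that a vector index such as FAISS would provide. Speech-to-text and text-to-speech are left out entirely.

```python
import math
import re
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # citation label, e.g. file name and page (hypothetical format)
    text: str


def _vectorize(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TenantStore:
    """Hypothetical per-client store: each client_id has its own chunk list."""

    def __init__(self) -> None:
        self._chunks = defaultdict(list)

    def add(self, client_id: str, source: str, text: str) -> None:
        self._chunks[client_id].append(Chunk(source, text))

    def retrieve(self, client_id: str, query: str, k: int = 2) -> list:
        # Only this client's chunks are ever scored, so one client's
        # documents cannot appear in another client's answers.
        q = _vectorize(query)
        scored = [
            (_cosine(q, _vectorize(c.text)), c)
            for c in self._chunks.get(client_id, [])
        ]
        scored = [(s, c) for s, c in scored if s > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [c for _, c in scored[:k]]


if __name__ == "__main__":
    store = TenantStore()
    store.add("uni_a", "handbook.pdf p.3", "The library opens at 8 am on weekdays.")
    store.add("corp_b", "policy.pdf p.1", "The library of shared code lives in the monorepo.")
    for chunk in store.retrieve("uni_a", "When does the library open?"):
        print(f"{chunk.text} [source: {chunk.source}]")
```

In a full pipeline, the retrieved chunks and their source labels would be passed to the language model as context, and the labels returned alongside the answer as citations. Keying every lookup by client ID is one simple way to enforce isolation; separate indexes or databases per client are another.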

Keywords: Conversational AI, Text-to-Speech, Agentic AI Systems, Speech-to-Text, Digital Learning Assistant, API Integration, Microservices Architecture, Large Language Models
Tools: FastAPI, LangChain, Kokoro (TTS), Docker, Azure, Faster Whisper (STT), PostgreSQL, FAISS
Department: Department of Computer Science

Team Members

Name	Email	CV
Shehzana Bibi	bscs22f44@namal.edu.pk
Sarmad Sultan	bscs22f15@namal.edu.pk
