NLTK | SpaCy | Docker | FastAPI | AWS | Scikit-Learn | Gensim | HuggingFace
1. IMDb Movie Review Sentiment Analysis | Link
- Built an end-to-end binary NLP classification pipeline on 50K+ IMDb reviews, evaluating 5+ models (including Logistic Regression, Naive Bayes, and SVM) with TF-IDF, BoW, and Word2Vec features.
- Achieved ~88–90% F1-score by selecting Logistic Regression + TF-IDF, and improved model reliability through positive and negative keyword-level interpretability analysis.
- Deployed the final model as a Dockerized FastAPI service on AWS EC2 (Free Tier), enabling real-time sentiment predictions via REST APIs.
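A minimal sketch of the Logistic Regression + TF-IDF approach named above, assuming scikit-learn. The toy reviews, the helper names (`build_sentiment_model`, `predict_sentiment`), and the hyperparameters are illustrative assumptions, not the project's actual data or settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical stand-in for the IMDb reviews (1 = positive, 0 = negative).
REVIEWS = [
    "great wonderful film",
    "loved it brilliant acting",
    "terrible boring movie",
    "awful waste of time",
]
LABELS = [1, 1, 0, 0]


def build_sentiment_model(texts, labels):
    """Fit a TF-IDF + Logistic Regression pipeline on raw review text."""
    model = make_pipeline(
        TfidfVectorizer(lowercase=True),
        LogisticRegression(max_iter=1000),
    )
    model.fit(texts, labels)
    return model


def predict_sentiment(model, text):
    """Map the binary prediction for one review to a readable label."""
    return "positive" if model.predict([text])[0] == 1 else "negative"


model = build_sentiment_model(REVIEWS, LABELS)
print(predict_sentiment(model, "great brilliant film"))
```

Keeping the vectorizer and classifier in one pipeline object means a single artifact can be serialized and loaded by the API service, so preprocessing at inference time matches training.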
2. News Article Classification | Link
- Developed a multi-class news classification system across 10+ categories (Business, Politics, Sports, Entertainment, etc.) using TF-IDF and classical ML models on large-scale text data.
- Improved classification performance by ~10–15% F1-score through systematic model comparison, selecting SVM + TF-IDF as the best-performing approach.
- Extracted and visualized category-specific keyword drivers to demonstrate model interpretability, and deployed the solution as a Dockerized FastAPI service on AWS EC2 (Free Tier).
3. Call Summarizer and Sentiment Analyzer | Link
- Built a simple Gradio UI that accepts a call transcript as text.
- Used an LLM to summarize the transcript, and SpaCy and TextBlob to classify its sentiment as Positive, Neutral, or Negative.
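TextBlob reports sentiment as a polarity score in the range [-1.0, 1.0], so a three-way label needs a cutoff around zero. The sketch below shows one possible mapping; the `label_from_polarity` helper and the 0.1 threshold are assumptions for illustration, not the project's actual values.

```python
def label_from_polarity(polarity: float, threshold: float = 0.1) -> str:
    """Map a polarity score in [-1.0, 1.0] to Positive / Neutral / Negative.

    The score would typically come from TextBlob(text).sentiment.polarity;
    the threshold is a hypothetical cutoff and should be tuned on real calls.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be within [-1.0, 1.0]")
    if polarity > threshold:
        return "Positive"
    if polarity < -threshold:
        return "Negative"
    return "Neutral"


print(label_from_polarity(0.6))
print(label_from_polarity(0.0))
print(label_from_polarity(-0.4))
```

A dead zone around zero keeps mildly worded transcripts from being pushed into the Positive or Negative class.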