Ketaki Dabade
Columbia University
[chose the former. see projects ↓]
— The Shawshank Redemption (1994) · Andy Dufresne · citing this before it cites me
CS Grad Student
I like building things that work,
breaking things that don't,
and occasionally writing about it.
Machine Learning Track
CRIS Lab under Prof. Venkat Venkatasubramanian. Coursework: Neural Networks & Deep Learning, NLP, Analysis of Algorithms, Continual Learning & Memory Models, Financial Engineering, Databases.
CGPA: 3.74 / 4.0 — Published 3 research papers (1 IEEE, 2 Springer)
Coursework: Data Structures, OS, Computer Networks, OOP, AI, Statistics & Probability, Distributed Computing, HPC, Compiler Design.
CRIS Laboratory, Columbia University
Built a scientific content analysis pipeline: MinerU-based PDF extraction across 3,000+ textbook pages, Qwen3-Embedding for 17,000+ dense vectors, and BERTopic with HDBSCAN to discover 493 semantically coherent topics.
Leveraged Gemma for topic labeling and hierarchical clustering to map prerequisite knowledge relationships. This work lays the foundation for Sparse Autoencoder training on structured knowledge.
LivingScopeHealth
Analyzing large-scale patient data in PostgreSQL to identify early indicators of diabetes onset before clinical diagnosis.
Building classification models (XGBoost, Random Forest) with SMOTE for class imbalance and SHAP for feature importance to enable preventive interventions.
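As a rough illustration of how SMOTE addresses class imbalance, the sketch below generates synthetic minority-class samples by interpolating between a sample and one of its nearest minority neighbours. The function name `smote_sketch`, the toy feature tuples, and the parameter defaults are assumptions made for illustration; they are not taken from the project, which presumably relies on a library implementation.

```python
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, target)))
    return synthetic

# Hypothetical two-feature minority class (values are made up)
minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (0.9, 1.9)]
new_points = smote_sketch(minority, n_new=4)
```

Each synthetic point lies on the segment between two real minority samples, so the minority class is densified without duplicating rows exactly.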
AI4M Technology Private Limited
Trained YOLOv7/v8 defect detection models for manufacturing QC. Deployed on NVIDIA Jetson with DeepStream SDK and TensorRT (FP16/INT8), achieving a 3x inference speedup and 25% lower detection latency.
Designed Flask REST APIs for real-time inference across 3 production lines. Built multi-threaded Docker backend with AWS/Azure data pipelines, CI/CD, and 85% test coverage.
ViLA EmachWirken Private Limited
Applied K-Means clustering to identify 5 customer personas. Designed Grafana dashboards tracking 15+ KPIs, including revenue, churn, CAC, and operational efficiency.
Conducted EDA on 100K+ transactions using Python and SQL. Automated reporting pipelines, reducing manual work by 40% and enhancing operational visibility by 30%.
End-to-end BCI pipeline from EEG signal acquisition (Emotiv EPOC X) to 3D hand visualization in Blender, achieving 97.63% gesture classification accuracy.
AI-powered career counselor using GPT-3, EasyOCR resume parsing, and RIASEC psychometric assessments to generate personalized career path recommendations.
Assistive driving system with real-time obstacle detection on NVIDIA Jetson Nano using custom YOLOv7, achieving 0.681 mAP with audio feedback.
Patrona: AI Voice Safety Companion
Voice-first AI that walks home with users. Hands-free safety through natural conversation — silence detection, safe words, and live GPS alerts to emergency contacts.
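One way the silence-detection and safe-word checks could fit together is sketched below. Everything here is an assumption for illustration: the function name `first_alert_time`, the 30-second silence limit, the example safe word, and the reading that a spoken safe word triggers an alert rather than cancelling one. The actual app logic is not described in this page.

```python
def first_alert_time(events, silence_limit_s=30.0, safe_word="pineapple"):
    """Return the timestamp at which an alert would fire, or None.

    events: (timestamp_seconds, transcript_or_None) tuples in time order,
    where None means nothing was heard at that tick.
    """
    last_heard = None
    for timestamp, transcript in events:
        if last_heard is None:
            last_heard = timestamp  # start the silence clock at the first tick
        if transcript:
            if safe_word in transcript.lower():
                return timestamp  # safe word spoken: alert immediately
            last_heard = timestamp
        elif timestamp - last_heard >= silence_limit_s:
            return timestamp  # user has been silent for too long
    return None

# Hypothetical walk: the user goes quiet after the 20-second mark
quiet_walk = [(0, "hi"), (10, None), (20, "still here"), (40, None), (50, None)]
alert_at = first_alert_time(quiet_walk)
```

In a real system the returned timestamp would be the point at which a GPS alert is sent to emergency contacts; that step is left out of this sketch.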
[App preview: home screen greeting with the last walk (Today · 11:24 PM · 18 min), a listening screen heading to 548 W 113th St, New York, and an alert confirmation listing the notified contacts: Mom (Parent) and Princess Leia (Roommate).]
More Projects
Real-time analytics with 15+ risk metrics (Sharpe, VaR, CVaR), mean-variance optimization via SciPy, Monte Carlo simulations, and TWR/IRR calculations with S&P 500 benchmarks.
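For readers unfamiliar with the risk metrics named above, here is a minimal sketch of the Sharpe ratio and of historical (empirical) VaR and CVaR. The function names, the 252-period annualisation, the 5% tail, and the sample returns are illustrative assumptions; the project may well use a parametric or Monte Carlo estimator instead.

```python
import math
import statistics

def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=252):
    """Annualised mean excess return divided by its standard deviation."""
    excess = [r - risk_free_per_period for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)

def _tail_cutoff(returns, alpha):
    """Sorted returns plus the index of the alpha-quantile cutoff."""
    ordered = sorted(returns)
    return ordered, int(alpha * len(ordered))

def historical_var(returns, alpha=0.05):
    """Loss at the alpha tail cutoff, reported as a positive number."""
    ordered, index = _tail_cutoff(returns, alpha)
    return -ordered[index]

def historical_cvar(returns, alpha=0.05):
    """Average loss over the returns at or below the VaR cutoff."""
    ordered, index = _tail_cutoff(returns, alpha)
    return -statistics.mean(ordered[: index + 1])

# Hypothetical daily returns (made-up values)
daily_returns = [
    0.01, -0.02, 0.015, 0.003, -0.04, 0.02, 0.007, -0.01, 0.012, 0.005,
    -0.03, 0.018, 0.009, -0.005, 0.011, 0.002, 0.014, -0.008, 0.006, 0.01,
]
sharpe = sharpe_ratio(daily_returns)
var_95 = historical_var(daily_returns)
cvar_95 = historical_cvar(daily_returns)
```

CVaR averages the losses in the tail beyond VaR, so it is never smaller than VaR for the same tail level.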
Real-time SEC filing contradiction detector for S&P 500 companies. Fine-tuned FinBERT for claim classification, hybrid retrieval with pgvector + Neo4j temporal knowledge graph, and LLM agent orchestration via LangChain with custom tools for negation detection, temporal reasoning, and insider transaction lookup. Next.js dashboard with live WebSocket feed.
LoRA achieves F1 ~0.80 with only 0.95% parameter updates while full fine-tuning catastrophically fails. MuRIL outperforms IndicBERT-v2 in few-shot by 2.1% F1 with just 50 examples.
CLIP embeddings + FAISS vector indexing for sub-second similarity search across 10K+ images. Multi-metric scoring combines perceptual hashing, SSIM, and neural embeddings.
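The idea of multi-metric scoring can be sketched in a few lines: compute a perceptual hash similarity and an embedding similarity, then blend them. This sketch uses a simple average hash and plain cosine similarity, omits SSIM, and uses made-up weights (0.4 and 0.6); the names, weights, and toy inputs are assumptions, and the real system would use CLIP vectors and a FAISS index rather than Python lists.

```python
import math

def average_hash(pixels):
    """Bit list: 1 where a grayscale pixel is above the image mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]

def hash_similarity(h1, h2):
    """1.0 for identical hashes, 0.0 when every bit differs."""
    differing = sum(a != b for a, b in zip(h1, h2))
    return 1.0 - differing / len(h1)

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def combined_score(pixels_a, pixels_b, emb_a, emb_b,
                   hash_weight=0.4, embedding_weight=0.6):
    """Weighted blend of hash and embedding similarity (weights are hypothetical)."""
    h = hash_similarity(average_hash(pixels_a), average_hash(pixels_b))
    c = cosine_similarity(emb_a, emb_b)
    return hash_weight * h + embedding_weight * c

# Toy 2x2 grayscale images and 2-D stand-ins for neural embeddings
img_a = [[10, 200], [220, 30]]
img_b = [[200, 10], [30, 220]]
emb_a, emb_b = [1.0, 0.0], [0.0, 1.0]
same_score = combined_score(img_a, img_a, emb_a, emb_a)
different_score = combined_score(img_a, img_b, emb_a, emb_b)
```

Blending a cheap pixel-level signal with a semantic embedding lets near-duplicates and visually related images both score highly, at the cost of having to tune the weights.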
Full-stack canteen management with NLP chatbot for natural language food ordering. D3.js analytics dashboard for sales trends and demand forecasting. Flask + MongoDB + React.
Control a 3D hand using brainwaves. Emotiv EPOC X at 256Hz, FFT/wavelet feature extraction, KNN classifier at 97.63% accuracy, real-time Blender visualization. Led team of 4.
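The classification step can be illustrated with a minimal k-nearest-neighbours sketch. The two-value feature vectors stand in for extracted FFT or wavelet features, and the gesture labels, the value of `k`, and the function name are assumptions for illustration rather than details of the published pipeline.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical band-power features per EEG window (made-up values)
features = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15),
            (0.1, 0.9), (0.2, 0.8), (0.15, 0.85)]
labels = ["open", "open", "open", "fist", "fist", "fist"]
prediction = knn_predict(features, labels, query=(0.82, 0.18))
```

KNN needs no training phase, which suits small labelled EEG datasets, though prediction cost grows with the number of stored windows.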
Real-time obstacle detection for visually impaired individuals on NVIDIA Jetson Nano. Custom YOLOv7 with TensorRT, Raspberry Pi camera, and audio feedback. Published in Springer CCIS.
AI career guidance platform using GPT-3 + LangChain, EasyOCR resume parsing, RIASEC psychometric assessments, and NLTK entity extraction to generate personalized career path recommendations.
Event management app with intelligent photo organization. DBSCAN clustering on facial embeddings to automatically group event photos by person — no manual tagging needed.
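To show why DBSCAN suits grouping photos by person without knowing the number of people in advance, here is a compact pure-Python sketch of the algorithm. The 2-D points stand in for facial embeddings, and `eps`, `min_samples`, and the sample data are illustrative assumptions; a production system would more likely call a library implementation on high-dimensional embeddings.

```python
import math

def dbscan(points, eps, min_samples):
    """Return one cluster label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Includes the point itself, as in the standard formulation
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels

# Hypothetical 2-D stand-ins for face embeddings: two people plus one stray face
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
              (10.0, 0.0)]
photo_groups = dbscan(embeddings, eps=0.5, min_samples=2)
```

Faces that appear only once fall out as noise instead of being forced into a cluster, which matches the "no manual tagging" goal.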
Biometric door lock with R307 fingerprint sensor and Arduino. Optimized matching algorithm for sub-second authentication with secure enrollment system. Led a team of 5.
Literature survey covering neural networks for match prediction, inverse RL for player valuation, multi-agent decision-making in formations, and game-theoretic models for penalty kicks and set pieces.
2026
Patrona AI Voice Safety Companion. Awarded $5,000 in ElevenLabs credits.
2024
CanMan Canteen Management System with NLP chatbot.
2022
ViziAssist ADAS assistive driving project.
2025
1 IEEE (First Author) + 2 Springer (LNNS and CCIS) conference proceedings.
Member
Member
Google / Coursera
DeepLearning.AI / Stanford Online
Accenture / Forage
NVIDIA Deep Learning Institute
Udemy
Google / Coursera
Udemy
[hi, yes. this is where you do exactly that.]
— Taxi Driver (1976) · Travis Bickle