Suraj Bhardwaj
Data Science | AI Systems | Multimodal Representation Learning | SSL | VLMs | Agentic AI
Data Scientist/AI Engineer
I’m Suraj Bhardwaj, a Data Scientist and AI Engineer based in Germany, focused on turning messy real-world data into measurable outcomes and reliable systems. I work comfortably across the full data science lifecycle: framing problems with stakeholders, building data pipelines and analysis, training and validating ML/DL models, and delivering production-ready services with monitoring and continuous improvement.
At Fraunhofer IOSB, I worked as a Research Assistant in the Perceptual User Interfaces research group within the Human–AI Interaction department (headed by Dr.-Ing. Michael Voit) on the KARLI and SALSA projects. There, I built reproducible data and ML pipelines for heterogeneous multimodal datasets and translated stakeholder requirements into clear KPIs and maintainable components. I accelerated iteration by eliminating major performance bottlenecks, implemented monitoring and dashboards for throughput, latency, error rates, and resource utilization, and automated large-scale, Ray-based benchmark runs and reporting. In parallel, I integrated vision–language models and multimodal pipelines for driver attention analysis and sleep stage recognition, collaborating with Dr.-Ing. Michael Voit, Dr.-Ing. Frederik Diederichs, and M.Sc. David Lerch, and developed an evidence-grounded RAG Q&A capability over internal documents.
My academic work includes an M.Sc. in Mechatronics (University of Siegen, AI specialization) and research spanning computer vision, self-supervised learning, multimodal learning, generative AI, 3D deep learning, and adversarial robustness. My thesis on self-supervised driver distraction detection was conducted in collaboration with Fraunhofer IOSB and the Computer Vision Group at Uni Siegen under the guidance of Prof. Michael Möller, Prof. Dr. Jovita Lukasik, and M.Sc. David Lerch. In it, I proposed Clustered Feature Weighting (CFW), a label-free batching method for imbalanced data, and improved RGB-to-IR cross-modality generalization using self-supervised representations. I also co-authored an IEEE ITSC 2025 paper and served as a peer reviewer for the venue.
Earlier, in the Visual Computing Group led by Prof. Margret Keuper, I worked on out-of-distribution robustness, GANs, latent diffusion models, and neural radiance fields (NeRFs), building strong foundations in computer vision and generative AI.
What you can expect from me
- Data Science & Analytics: EDA, feature engineering, statistical thinking (ANOVA, A/B testing), forecasting, calibration/error analysis, KPI storytelling, and dashboarding (Tableau/Plotly).
- Data Engineering foundations: SQL (data modeling, quality checks), APIs, pipeline thinking; currently building hands-on PySpark experience for big-data workflows.
- ML/DL & Applied Research: robust evaluation, ablations, generalization checks, imbalance-aware metrics; CV/NLP/multimodal systems.
- Production delivery (MLOps mindset): Dockerized services, CI/CD (GitHub/GitLab), MLflow/DVC, monitoring/drift detection, cloud deployment on Azure/AWS.
- GenAI & Agentic AI: RAG pipelines, evaluation-first iteration (RAGAS/Giskard), and agentic tooling (AutoGen, LangGraph, MCP-style servers).
If you’re hiring for Data Scientist / AI Engineer roles and care about both solid analytics and production-grade AI, feel free to reach out (LinkedIn is the fastest).
KARLI Final Event
At the KARLI Final Event, I presented an advanced occupant monitoring ML system, deployed inside a Level 3 Mercedes-Benz vehicle, to stakeholders and researchers. The demo highlighted how multimodal perception and real-time ML pipelines can support safer, more intuitive human–vehicle interaction.
More about the system: Advanced Occupant Monitoring System. More context on the department’s mission and research directions: Human–AI Interaction @ Fraunhofer IOSB.
news
| Date | Update |
|---|---|
| Nov 18, 2025 | Project Alert: Developed the GDPR RAG Assistant – Evaluation-First Legal Compliance Chatbot. Framed and implemented the problem of trustworthy GDPR Q&A: a RAG system that provides auditable, citation-backed answers and clearly signals when the knowledge base lacks coverage. |
| Jul 07, 2025 | Publication Alert: My paper “Self-supervised Driver Distraction Detection for Imbalanced Datasets” was accepted for publication and presentation as a full paper at the IEEE 28th International Conference on Intelligent Transportation Systems (ITSC 2025). |
| Sep 19, 2024 | KARLI Final Event: Led technical demonstrations of a Level 3 Mercedes-Benz Advanced Occupant Monitoring System, communicating its machine learning pipeline and real-world relevance to investors, scientists, and public sector officials. |
selected publications
- Self-supervised Driver Distraction Detection for Imbalanced Datasets. In Proceedings of the IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), Nov 2025