- Large Language Models (LLMs), Prompt Engineering, Markdown Prompt Frameworks
- Retrieval-Augmented Generation (RAG): embeddings, vector databases (Qdrant, FAISS, Pinecone, Weaviate), hybrid search
- LangChain, MCP (Model Context Protocol), FastMCP tool development
- Building AI chatbots, virtual agents, and domain-specific reasoning systems
- NLP tasks: text classification, sentiment analysis, NER, topic modeling, conversational AI
- Algorithms: Linear/Logistic Regression, Decision Trees, Random Forests, SVM, KNN, Naive Bayes, Gradient Boosting, Neural Networks
- Deep Learning: CNNs, RNNs, LSTMs
- Reinforcement Learning: Q-learning, DQN, Policy Gradients
- Model evaluation: cross-validation, hyperparameter tuning, accuracy, precision, recall, F1, ROC-AUC, MSE
- Feature scaling, encoding, handling missing data
- Dimensionality reduction: PCA, LDA
- Feature selection & engineering for ML & DL models
- Data preprocessing pipelines (Python / Scikit-learn)
- Deploying ML/LLM models via TensorFlow Serving, Flask, Django
- Cloud AI platforms: AWS SageMaker, Azure AI, GCP AI Platform, IBM Watson
- Experience with containerized and on-prem LLM deployment (Ollama)
- Python (NumPy, Pandas, Scikit-learn, TensorFlow, Keras), R
- Exploratory Data Analysis (EDA), statistical analysis, insights generation
- Jupyter Notebooks workflow
- Visualization: Matplotlib, Seaborn, Plotly
- Image classification, object detection, segmentation
- Tools: OpenCV, TensorFlow, PyTorch
AI Team Balancing and Roster Optimization Platform - Full-Stack AI Developer (POC)
- Engineered a full-stack, AI-driven roster optimization platform (Go, React/Vite, PostgreSQL) to solve dynamic, multi-criteria team balancing challenges for a 30+ player volleyball league.
- Integrated a local LLM (Qwen via Ollama) into the Go backend to function as a constrained optimization engine, generating three distinct, highly balanced team configurations weekly.
- Designed a comprehensive PostgreSQL schema to capture and unify disparate data points, including 8+ player skill ratings (1-10 scale), historical team assignment consistency, and inter-player conflict data.
- Implemented a structured prompt and response mechanism to feed complex criteria to the LLM and receive actionable JSON outputs, ensuring reliability in team assignment (6 players per team, 4 teams total).
- Developed a RESTful Go API (Gin) for data orchestration, handling CRUD operations and managing the state of the LLM interaction workflow, from availability input to final team commitment.
- Created a modern React/Vite frontend allowing the coach to easily input weekly availability, visualize the LLM's proposed team balancing scores (e.g., Consistency vs. Balance), and commit the final roster.
- Delivered a proof-of-concept system that consistently achieved the dual objectives of maximizing team competitive balance (close matches) and fostering long-term teamwork consistency, resolving a recurring organizational challenge through AI.
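The structured prompt-and-JSON mechanism above can be illustrated with a short sketch. The production backend was written in Go with Gin; the Python version below, calling Ollama's local HTTP API, only illustrates the same pattern, and the model name, prompt wording, and validation rules are assumptions rather than the deployed code.

```python
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local Ollama endpoint

def propose_teams(players: list[dict], model: str = "qwen2.5") -> list[list[str]]:
    """Ask a local LLM for a balanced 4-team split (6 players each) as strict JSON."""
    prompt = (
        "You are a roster optimizer. Given the players below with skill ratings (1-10), "
        "previous team assignments, and conflicts, split them into exactly 4 teams of 6 "
        "so that total skill per team is as even as possible.\n"
        f"Players: {json.dumps(players)}\n"
        'Respond ONLY with JSON of the form {"teams": [["name", ...], ...]}.'
    )
    resp = requests.post(OLLAMA_URL, json={
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "format": "json",   # ask Ollama to constrain the output to valid JSON
        "stream": False,
    }, timeout=120)
    resp.raise_for_status()
    data = json.loads(resp.json()["message"]["content"])
    teams = data["teams"]
    # Validate the structural constraints before trusting the LLM's answer.
    assert len(teams) == 4 and all(len(t) == 6 for t in teams), "invalid roster shape"
    return teams
```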
AI Chatbot & Intelligent Analytics Platform - AI Engineer (POC)
- Built a privacy-preserving RAG chatbot using Python, FastAPI, and LangChain, enabling natural-language access to proprietary business data.
- Deployed and evaluated local LLMs with Ollama; benchmarked multiple models and selected Gemma for optimal on-prem performance.
- Compared FAISS, Pinecone, Qdrant, and Weaviate, and deployed Qdrant as the vector store for high-performance semantic search.
- Created MCP tools with FastMCP to securely connect to PostgreSQL for sales analytics, forecasting, and inventory health checks.
- Integrated backend LLM reasoning with structured data via LangChain + MCP, enabling natural-language predictive insights.
- Developed React + Vite and Vue UIs delivering a modern conversational interface for business and sales teams.
- Delivered an end-to-end platform blending RAG, LLMs, vector search, and predictive modeling, demonstrating real-world value for sales and customer-service workflows.
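A minimal, framework-agnostic sketch of the embed/store/retrieve loop behind the chatbot above, assuming a local Ollama instance with `nomic-embed-text` and a Gemma-family chat model; the actual platform wired this up through LangChain, and the collection and model names here are illustrative.

```python
import ollama                              # local embedding + chat models via Ollama
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

client = QdrantClient(":memory:")          # swap for a real Qdrant instance in production
client.recreate_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

def embed(text: str) -> list[float]:
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

def index(chunks: list[str]) -> None:
    points = [PointStruct(id=i, vector=embed(c), payload={"text": c})
              for i, c in enumerate(chunks)]
    client.upsert(collection_name="docs", points=points)

def ask(question: str) -> str:
    # Retrieve the most similar chunks, then answer strictly from that context.
    hits = client.search(collection_name="docs", query_vector=embed(question), limit=3)
    context = "\n".join(h.payload["text"] for h in hits)
    reply = ollama.chat(model="gemma2", messages=[
        {"role": "system", "content": f"Answer using only this context:\n{context}"},
        {"role": "user", "content": question},
    ])
    return reply["message"]["content"]
```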
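The MCP tooling can be sketched in the same spirit: a FastMCP server exposing one PostgreSQL-backed analytics tool. The schema (an `orders` table with `total` and `placed_at` columns) and the connection string are hypothetical placeholders, not the client's actual database.

```python
from fastmcp import FastMCP
import psycopg2

mcp = FastMCP("sales-analytics")           # MCP server exposing database-backed tools

def _query(sql: str, params: tuple = ()) -> list[tuple]:
    # Connection details are illustrative; use environment variables / secrets in practice.
    with psycopg2.connect("dbname=sales user=readonly") as conn:
        with conn.cursor() as cur:
            cur.execute(sql, params)
            return cur.fetchall()

@mcp.tool()
def monthly_revenue(year: int, month: int) -> float:
    """Total revenue for a given month (hypothetical schema: orders.total, orders.placed_at)."""
    rows = _query(
        "SELECT COALESCE(SUM(total), 0) FROM orders "
        "WHERE EXTRACT(YEAR FROM placed_at) = %s AND EXTRACT(MONTH FROM placed_at) = %s",
        (year, month),
    )
    return float(rows[0][0])

if __name__ == "__main__":
    mcp.run()                              # serves the tool over MCP (stdio by default)
```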
Developed and optimized a multimodal AI solution based on the latest MiniCPM-V series models for a client-facing proof of concept.
- The project focused on MiniCPM-Llama3-V 2.5 and MiniCPM-V 2.0, implementing real-world applications in OCR and multilingual dialogue systems using advanced MLLMs (Multimodal Large Language Models).
- Built interactive web-based QA systems using Gradio and Streamlit, enhancing user experience through responsive interfaces and real-time inference integration.
- Applied quantization techniques (8-bit and 4-bit) using the llama.cpp framework to optimize model performance under resource constraints.
- Explored vLLM-based inference, including source code analysis to understand internal model execution and streamline deployment.
- Implemented both full-parameter fine-tuning and LoRA-based fine-tuning using Hugging Face Transformers with DeepSpeed, enabling scalable model customization on proprietary datasets.
- Configured advanced training pipelines with customized datasets to address client-specific tasks, improving adaptability and task accuracy.
- Performed model performance evaluation and optimization, including experiment tracking, output quality assessment, and latency reduction.
- Researched and integrated the latest techniques in multimodal learning, aligning the solution with current industry trends and research directions.
- Delivered a fully functional prototype with detailed technical documentation and deployment instructions for client adoption and future scalability.
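A rough sketch of the LoRA fine-tuning step with Hugging Face Transformers, PEFT, and a DeepSpeed config. It uses a small text-only stand-in checkpoint and a hypothetical JSONL corpus for brevity; the actual work targeted MiniCPM-V checkpoints with the project's own multimodal training scripts and datasets.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base = "Qwen/Qwen2-0.5B"                     # small stand-in; the real project used MiniCPM-V
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Attach low-rank adapters to the attention projections; the base weights stay frozen.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()           # typically well under 1% of all parameters

data = load_dataset("json", data_files="client_corpus.jsonl")["train"]  # hypothetical corpus
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=2,
                           num_train_epochs=1, learning_rate=2e-4,
                           deepspeed="ds_config.json"),  # assumed DeepSpeed config file
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```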
Designed and implemented a localized, privacy-focused AI solution as a proof of concept for a client.
- The POC involved deploying and managing large language models (LLMs) such as Gemma via Ollama on the client's Windows and Mac environments using Docker.
- Focused on data privacy and on-premise control, the solution included full environment setup and container orchestration.
- Integrated advanced AI tools like openWebUI and anythingLLM to develop a customized AI assistant and secure knowledge base.
- Configured environment variables, managed local model files, and optimized inference performance with precise control over system resources.
- Built an interactive user interface with openWebUI to enhance usability and AI interaction.
- Constructed a local knowledge repository using anythingLLM for intelligent document management and semantic retrieval.
- Bridged the gap between cutting-edge AI capabilities and real-world deployment through a hands-on, secure architecture.
- Provided detailed guidance to the client for operating, maintaining, and expanding the AI system autonomously.
- Empowered the client to adopt AI innovations internally while maintaining full ownership of their data and infrastructure.
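A small smoke-test sketch against the local Ollama HTTP API (default port 11434), illustrating how the on-prem deployment can be exercised without any data leaving the machine; the model name is an assumption.

```python
import requests

OLLAMA = "http://localhost:11434"          # default port exposed by the Ollama container

def list_models() -> list[str]:
    """Return the names of models already pulled into the local Ollama instance."""
    tags = requests.get(f"{OLLAMA}/api/tags", timeout=10).json()
    return [m["name"] for m in tags.get("models", [])]

def generate(prompt: str, model: str = "gemma2") -> str:
    """One-shot completion against the local, on-prem model; no data leaves the machine."""
    r = requests.post(f"{OLLAMA}/api/generate",
                      json={"model": model, "prompt": prompt, "stream": False},
                      timeout=120)
    r.raise_for_status()
    return r.json()["response"]

if __name__ == "__main__":
    print("Available models:", list_models())
    print(generate("Summarize our data-privacy policy in one sentence."))
```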
Developed a sophisticated knowledge-based ontology graph database application tailored for a federal client's needs.
- Seamlessly transitioned from a traditional relational data model to a dynamic Graph data model, leveraging the robust capabilities of Neo4j for enhanced data representation and querying efficiency.
- Implemented a comprehensive Natural Language Processing (NLP) pipeline utilizing cutting-edge Python libraries and Azure Cognitive Services APIs. This pipeline adeptly analyzed extensive datasets, extracting pertinent entities, identifying key phrases, categorizing content, and performing sentiment analysis.
- Leveraged advanced data science algorithms within Jupyter Notebook to derive meaningful insights from the analyzed data, empowering decision-makers with actionable intelligence.
- Utilized Neo4j Bloom to visualize intricate ontology knowledge graphs, facilitating intuitive exploration and understanding of complex relationships within the data.
- Expanded the scope of data acquisition by integrating additional feeds from diverse social media platforms. Leveraged the updated data to re-run the NLP pipeline, ensuring that insights remained current and reflective of evolving trends and sentiments across various digital channels.
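A condensed sketch of the entity-extraction-to-graph step, assuming the Azure Text Analytics (Cognitive Services) SDK and the Neo4j Python driver; the endpoint, credentials, and node labels are placeholders rather than the client's actual schema.

```python
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential
from neo4j import GraphDatabase

# Endpoint, key, and Neo4j URI are placeholders for the client's environment.
text_client = TextAnalyticsClient(endpoint="https://<resource>.cognitiveservices.azure.com/",
                                  credential=AzureKeyCredential("<key>"))
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>"))

def ingest(doc_id: str, text: str) -> None:
    """Extract entities from one document and link them to it in the ontology graph."""
    entities = text_client.recognize_entities(documents=[text])[0].entities
    with driver.session() as session:
        session.run("MERGE (d:Document {id: $id}) SET d.text = $text", id=doc_id, text=text)
        for ent in entities:
            session.run(
                "MERGE (e:Entity {name: $name, category: $category}) "
                "MERGE (d:Document {id: $id})-[:MENTIONS]->(e)",
                name=ent.text, category=ent.category, id=doc_id,
            )

ingest("report-001", "Acme Corp opened a new office in Ottawa in 2023.")
```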
Developed a Python-based AI project for a government client aimed at predicting the likelihood of payment for issued parking tickets.
- Gathered diverse datasets from multiple sources to ensure comprehensive coverage and accuracy in training the predictive model.
- Employed robust data preprocessing techniques to filter out outliers and ensure the integrity of the dataset, enhancing the reliability of subsequent analyses.
- Split the dataset into distinct training and testing sets to enable the evaluation of model performance on unseen data, guarding against overfitting and ensuring generalizability.
- Leveraged a variety of machine learning algorithms to train predictive models, rigorously evaluating their performance against established metrics.
- Utilized Area Under the ROC Curve (AUC-ROC) as a primary evaluation metric to identify the most effective model in predicting the likelihood of payment for parking tickets.
- Selected the best-performing model based on AUC-ROC scores, ensuring optimal predictive accuracy and reliability for informing decision-making processes within the government agency.
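A simplified sketch of the model-comparison step, assuming a hypothetical `parking_tickets.csv` with a binary `paid` target column; candidate models are trained and ranked by AUC-ROC on a held-out split.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical columns: fine_amount, days_overdue, vehicle_type, ..., paid (0/1 target).
df = pd.read_csv("parking_tickets.csv")
X = pd.get_dummies(df.drop(columns=["paid"]))   # one-hot encode categorical features
y = df["paid"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    stratify=y, random_state=42)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}

# Rank candidate models by AUC-ROC on the held-out test set and keep the best one.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

best = max(scores, key=scores.get)
print(scores, "-> selected:", best)
```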
Developed a Python data science project tailored for a prominent supermarket chain client to optimize their store location strategy and enhance customer experience.
- Initiated the project by gathering comprehensive neighborhood data from Wikipedia, leveraging web scraping techniques to extract relevant information such as demographics, population density, and socioeconomic indicators.
- Augmented the neighborhood dataset by incorporating geographical coordinates using the Geocoder library, facilitating precise mapping and analysis of locations.
- Employed the Foursquare Places API to retrieve detailed location and venue data, including nearby amenities, competitors, and popular attractions within each neighborhood.
- Utilized advanced data preprocessing techniques to clean and prepare the dataset for analysis, including handling missing values, normalizing features, and encoding categorical variables.
- Applied the k-means clustering algorithm to segment neighborhoods based on similarities in venue characteristics, enabling the identification of distinct clusters representing different customer preferences and market segments.
- Conducted thorough exploratory data analysis (EDA) to gain insights into the distribution of clusters and understand the underlying patterns driving customer behavior and preferences.
- Leveraged visualization tools such as Matplotlib and Seaborn to create informative visualizations, including scatter plots, heatmaps, and cluster maps, to effectively communicate findings and insights to stakeholders.
- Provided actionable recommendations to the client based on the clustering results, including insights into optimal store locations, potential expansion opportunities, and strategies for improving customer engagement and retention.
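A minimal sketch of the clustering step, assuming a pre-built table of per-neighborhood venue-frequency and demographic features; the file and column names are hypothetical.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical input: one row per neighborhood, columns are venue-category frequencies
# (e.g. share of cafes, grocery stores, gyms) plus demographic features.
features = pd.read_csv("neighborhood_features.csv", index_col="neighborhood")

scaled = StandardScaler().fit_transform(features)        # normalize before clustering

kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
features["cluster"] = kmeans.fit_predict(scaled)

# Inspect each segment's average profile to interpret what distinguishes it.
print(features.groupby("cluster").mean().round(2))
```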
- Amazon SageMaker: A fully managed machine learning service that enables developers to build, train, and deploy machine learning models at scale.
- Amazon Lex: A service for building conversational interfaces into any application using voice and text.
- Amazon Polly: A text-to-speech service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice.
- Amazon Rekognition: A service for adding image and video analysis to applications, including object and scene detection, facial analysis, and celebrity recognition.
- Amazon Comprehend: A natural language processing (NLP) service for extracting insights and relationships from unstructured text.
- Amazon Transcribe: A service for converting speech to text, enabling developers to transcribe audio files or streams into text in real time.
- AWS DeepLens: A deep learning-enabled video camera that allows developers to experiment with and deploy deep learning models on the edge.
- AWS DeepRacer: A fully autonomous 1/18th scale race car driven by reinforcement learning models, designed to help developers learn about reinforcement learning and machine learning in a fun way.
- AWS Deep Learning AMIs: Preconfigured Amazon Machine Images (AMIs) that come with popular deep learning frameworks and libraries installed, allowing developers to quickly set up deep learning environments on EC2 instances.
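As a small illustration of how these managed services are consumed in practice, a short boto3 sketch calling Amazon Comprehend for sentiment and entity extraction; the region and input text are arbitrary examples.

```python
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")  # region is illustrative

text = "The checkout process was quick, but delivery took far too long."

sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
entities = comprehend.detect_entities(Text=text, LanguageCode="en")

print(sentiment["Sentiment"], sentiment["SentimentScore"])
print([(e["Text"], e["Type"]) for e in entities["Entities"]])
```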
- Google Cloud AI Platform: A suite of machine learning services that enables developers to build, train, and deploy machine learning models at scale.
- Google Cloud AutoML: A suite of machine learning products that enables developers with limited machine learning expertise to train high-quality custom models for specific use cases, such as image recognition and natural language processing.
- Google Cloud Vision API: A service for integrating image recognition and analysis capabilities into applications.
- Google Cloud Speech-to-Text: A service for converting speech into text in over 120 languages and variants.
- Google Cloud Text-to-Speech: A service for converting text into natural-sounding speech in over 30 languages and variants.
- Google Cloud Translation API: A service for translating text between languages using pre-trained models or custom machine learning models.
- Google Cloud Natural Language API: A service for analyzing and extracting insights from unstructured text using machine learning models.
- Google Cloud Video Intelligence API: A service for annotating and analyzing videos using machine learning models to detect objects, faces, and activities.
- Google Cloud Speech Command Recognition: A service for recognizing speech commands in audio data using machine learning models, designed for IoT and other edge device applications.
- Google Cloud AI Hub: A hosted repository and collaboration platform for discovering, sharing, and deploying machine learning models and datasets.
- Azure Machine Learning: A cloud-based service for building, training, and deploying machine learning models at scale, with support for various programming languages and frameworks.
- Azure Cognitive Services: A collection of AI-powered APIs and pre-built models for vision, speech, language, and decision-making, enabling developers to add intelligent features to their applications.
- Azure Bot Service: A service for building, testing, and deploying chatbots across multiple channels, with built-in natural language understanding (NLU) capabilities.
- Azure Cognitive Search: A fully managed search-as-a-service solution that uses AI to provide relevant search results and enrich content with semantic understanding.
- Azure Speech Service: A cloud-based service for speech recognition and speech synthesis, with support for custom speech models and real-time transcription.
- Azure Computer Vision: A service for analyzing and extracting insights from images and videos, including object detection, image recognition, and image classification.
- Azure Language Understanding (LUIS): A service for building natural language understanding models to interpret user intents and extract relevant information from text inputs.
- Azure Personalizer: A reinforcement learning-based service for personalizing content and recommendations in applications, based on user interactions and feedback.
- Azure Video Analyzer: A service for analyzing and extracting insights from videos, including object tracking, motion detection, and scene understanding.
- Azure IoT Edge: A service for deploying AI models and analytics to edge devices, enabling real-time processing and decision-making at the edge of the network.
- Watson Assistant: A conversational AI platform that enables developers to build and deploy chatbots and virtual assistants across multiple channels.
- Watson Discovery: A service for analyzing unstructured data and extracting insights using natural language processing (NLP) and machine learning.
- Watson Language Translator: A service for translating text between languages using advanced machine learning models.
- Watson Natural Language Understanding: A service for analyzing text and extracting metadata such as entities, keywords, and sentiment.
- Watson Speech to Text: A service for converting spoken language into text in real time, with support for multiple languages and audio formats.
- Watson Text to Speech: A service for synthesizing natural-sounding speech from text in multiple languages and voices.
- Watson Visual Recognition: A service for analyzing and classifying images using machine learning models, with support for custom image classifiers.
- Watson Studio: A cloud-based platform for building and deploying machine learning models, with support for data preparation, model training, and model deployment.
- Watson Knowledge Catalog: A service for managing and curating data assets, including datasets, models, and notebooks, to facilitate collaboration and reuse.
- Watson OpenScale: A service for monitoring and managing AI models in production, including bias detection, model fairness, and performance monitoring.
To ensure fair foul judgments in basketball games with friends, we can design a system that leverages sensor technology and AI algorithms. Here's a feasible invention idea:
Integrate sensors and AI to record match data in real time and automatically determine fouls.
- Smart Wristbands or Protective Gear: Each player wears lightweight smart devices (e.g., wristbands, knee pads) equipped with accelerometers, gyroscopes, and pressure sensors to monitor body movements, collision intensity, and direction.
- Shoe Sensors: Embedded pressure sensors to detect foot movement, stepping out of bounds, or illegal steps.
- Built-in sensors: Detect the ball's speed, position, and contact details.
- Integration with wearables: Monitor illegal contacts (e.g., hand-checking or slapping the ball).
- Pressure sensors under the floor: Track player movements, positions, and foot placement in real time.
- Camera system: Capture player actions from multiple angles and use AI algorithms to identify fouls (e.g., pushing, grabbing).
- Action Recognition: Use computer vision and motion capture algorithms to analyze player movements against rules.
- Collision Intensity Detection: Determine if the intensity of contact exceeds foul thresholds using sensor data.
- Rule Matching: Compare sensor data with a built-in rule database to decide on fouls in real-time.
- Penalty Notifications: Display foul decisions via screens on the court or through players' devices.
- Replay Support: Provide slow-motion replays to explain foul calls, helping both sides understand the rationale.
- Advantages:
  - Deliver more objective and fair decisions.
  - Reduce disputes caused by foul controversies.
  - Add a sense of technology to casual games.
- Challenges:
  - Equipment cost: making smart devices affordable for widespread use.
  - Data accuracy: the AI algorithm requires sufficient training data to minimize errors.
  - Rule adaptation: setting reasonable penalty thresholds to avoid overly strict calls.
Such a system is not only suitable for casual games with friends but also for amateur or semi-professional leagues.
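As a toy illustration of the collision-intensity check described above, the sketch below flags probable fouls from wearable readings; the thresholds and field names are made-up placeholders rather than calibrated values.

```python
from dataclasses import dataclass

# Toy thresholds; a real system would learn these from labelled game data.
CONTACT_G_THRESHOLD = 4.0       # peak acceleration (in g) treated as significant contact
FOUL_DURATION_MS = 150          # sustained contact longer than this is flagged

@dataclass
class ContactEvent:
    player_id: str
    peak_acceleration_g: float  # from the wristband accelerometer
    duration_ms: float          # how long the contact lasted
    ball_in_hands: bool         # from the smart-ball sensors

def is_probable_foul(event: ContactEvent) -> bool:
    """Flag contact that exceeds intensity and duration thresholds while the opponent holds the ball."""
    hard_contact = event.peak_acceleration_g >= CONTACT_G_THRESHOLD
    sustained = event.duration_ms >= FOUL_DURATION_MS
    return event.ball_in_hands and hard_contact and sustained

print(is_probable_foul(ContactEvent("player_7", 5.2, 210, True)))   # True -> send to replay review
```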
I need some sponsors!
Here is my WIP project:
- Gather historical data for all criminal events in my city
- Take live feeds from different sources (e.g., police reports, call centers, government agencies, witnesses)
- Do the data cleansing and remove outliers
- Split the dataset into training and test data
- Choose the best ML algorithm and use it to analyze the data
- Predict the highest-frequency days/hours for each particular zone
- Notify police stations and recommend that they increase preparedness during peak crime days/hours
Here's a high-level overview of how I will proceed:
- Data Gathering: Collect historical data on criminal events in our city from relevant sources such as police records, crime databases, news reports, and government statistics. Additionally, set up mechanisms to gather live data from sources like police reports, emergency call centers, government agencies, and eyewitness accounts.
- Data Cleansing and Preprocessing: Cleanse the collected data to remove inconsistencies, errors, and outliers. This may involve tasks such as handling missing values, standardizing data formats, and identifying and removing erroneous entries.
- Data Splitting: Split the cleaned dataset into training and testing subsets. The training dataset will be used to train our machine learning model, while the testing dataset will be used to evaluate its performance.
- Model Selection and Training: Choose the most appropriate machine learning algorithm for our problem. This could involve experimenting with different algorithms such as decision trees, random forests, support vector machines, or neural networks. Train the selected model using the training dataset.
- Model Evaluation: Evaluate the performance of our trained model using the testing dataset. Common evaluation metrics for classification tasks include accuracy, precision, recall, and F1 score. Choose the evaluation metrics that are most relevant to our project's objectives.
- Prediction: Once our model is trained and evaluated, use it to make predictions on new data. In this case, use the trained model to predict the highest-frequency days and hours for criminal events in particular zones of our city.
- Notification and Recommendation: Develop a mechanism to notify relevant authorities, such as police stations, about the predicted high-frequency days and hours for criminal activity in specific zones. Additionally, provide recommendations on how they can prepare and allocate resources accordingly.
- Deployment and Monitoring: Deploy our model and notification system in a production environment. Continuously monitor the system's performance and update the model as needed with new data and insights.
- Feedback Loop: Establish a feedback loop to collect data on the effectiveness of our predictions and recommendations. Use this feedback to refine and improve our model over time.
- Documentation and Maintenance: Document the entire project, including data sources, preprocessing steps, model selection, training, and deployment processes. Regularly maintain and update our system to ensure its effectiveness and relevance over time.