panwarnalini-hub/README.md

Published Packages

docling-extractor

Python 3.8+

Production-grade document extraction with intelligent fallback chains

pip install docling-extractor

Documentation | Source

delta-lake-utils

Python 3.8+

Production utilities for Delta Lake table management and optimization

pip install delta-lake-utils

Documentation | Source



Tech Stack

Cloud & Data Platforms

Azure Databricks Spark Delta Lake

Languages & Tools

Python PySpark SQL Pandas NumPy

ML & Visualization

Transformers scikit-learn Power BI Streamlit





Featured Projects

Clinical Document Intelligence Pipelines

Repo Demo PyPI

Production medallion architecture for clinical trial document processing.

Key Achievements:

  • Published extraction library to PyPI
  • 87-category classification from ICH-GCP taxonomy
  • Fine-tuned SapBERT biomedical NER: 74.1% F1 score
  • Unity Catalog governance with Delta Lake

Tech: Databricks PySpark Delta Lake Transformers NLP

Delta Lake Optimization Toolkit

Repo PyPI

Production-grade utilities for Delta Lake table management.

Key Achievements:

  • Published package to PyPI
  • Smart OPTIMIZE with auto Z-ORDER
  • Table health diagnostics
  • Medallion pipeline generator
  • Unity Catalog auditor

Tech: Databricks PySpark Delta Lake Python
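
As a rough illustration of the "Smart OPTIMIZE with auto Z-ORDER" idea, the sketch below builds a Delta Lake `OPTIMIZE ... ZORDER BY` statement from a list of columns seen in query filters. It is not the delta-lake-utils API: the function names, the "most frequently filtered columns" heuristic, and the `sales.orders` table are all assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List


def pick_zorder_columns(filter_columns: Iterable[str], max_cols: int = 2) -> List[str]:
    """Hypothetical heuristic: choose the columns that appear most often in filters."""
    counts = Counter(filter_columns)
    # most_common() sorts by count, highest first; ties keep first-seen order.
    return [col for col, _ in counts.most_common(max_cols)]


def build_optimize_sql(table: str, zorder_cols: List[str]) -> str:
    """Build a Delta Lake OPTIMIZE statement, adding ZORDER BY only when columns are given."""
    stmt = f"OPTIMIZE {table}"
    if zorder_cols:
        stmt += " ZORDER BY (" + ", ".join(zorder_cols) + ")"
    return stmt


# Assumed input: column names observed in WHERE clauses of recent queries.
observed = ["customer_id", "order_date", "customer_id", "region", "order_date", "customer_id"]
cols = pick_zorder_columns(observed)
sql = build_optimize_sql("sales.orders", cols)
print(sql)  # OPTIMIZE sales.orders ZORDER BY (customer_id, order_date)
```

On Databricks the resulting string could be passed to `spark.sql(sql)`; the sketch stops at building the statement so it runs without a cluster.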

Real-Time Gesture ML Pipeline

Repo

Medallion platform for computer vision feature engineering.

Key Achievements:

  • Bronze-Silver-Gold architecture
  • Real-time webcam landmark processing
  • 21 hand + 468 face landmarks
  • ML-ready feature vectors

Tech: Python MediaPipe Computer Vision Real-Time
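
To make "ML-ready feature vectors" concrete, here is a minimal sketch of flattening 21 hand and 468 face landmarks into one vector. It assumes a single hand and that each landmark is an `(x, y, z)` tuple already extracted from the tracker; the function name and layout are illustrative and not taken from the repository.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

N_HAND = 21   # hand landmarks, per the project description
N_FACE = 468  # face landmarks, per the project description


def to_feature_vector(hand: Sequence[Point], face: Sequence[Point]) -> List[float]:
    """Concatenate hand then face landmarks as x, y, z triples in a flat list."""
    if len(hand) != N_HAND or len(face) != N_FACE:
        raise ValueError(f"expected {N_HAND} hand and {N_FACE} face landmarks")
    vector: List[float] = []
    for x, y, z in list(hand) + list(face):
        vector.extend((float(x), float(y), float(z)))
    return vector


# Placeholder coordinates, only to show the resulting shape.
hand = [(0.5, 0.5, 0.0)] * N_HAND
face = [(0.4, 0.6, 0.0)] * N_FACE
features = to_feature_vector(hand, face)
print(len(features))  # 1467, i.e. (21 + 468) * 3
```

A fixed-length vector like this can be fed directly to a classifier; any normalization step (for example, centering on the wrist) would be a further design choice not shown here.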

NASA Exoplanet Analysis

Repo

Databricks Hackathon Submission

Scientific exploration of 5000+ confirmed exoplanets.

Key Achievements:

  • PySpark data transformations
  • SQL analytics and habitability scoring
  • Kepler's law validation

Tech: Databricks PySpark SQL Scientific Computing
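
One way a Kepler's law check could work is to compare each planet's observed orbital period with the period predicted by Kepler's third law, P² = a³ / M, with P in years, a in AU, and M in solar masses (planet mass neglected). The sketch below is an assumption about the approach, not the project's code; the function names and the 1% tolerance are hypothetical.

```python
import math

DAYS_PER_YEAR = 365.25  # Julian year, used to convert years to days


def kepler_period_days(semi_major_axis_au: float, stellar_mass_solar: float) -> float:
    """Predicted orbital period in days from Kepler's third law."""
    if semi_major_axis_au <= 0 or stellar_mass_solar <= 0:
        raise ValueError("semi-major axis and stellar mass must be positive")
    period_years = math.sqrt(semi_major_axis_au ** 3 / stellar_mass_solar)
    return period_years * DAYS_PER_YEAR


def is_consistent(observed_days: float, semi_major_axis_au: float,
                  stellar_mass_solar: float, rel_tol: float = 0.01) -> bool:
    """True when the observed period is within rel_tol of the prediction."""
    predicted = kepler_period_days(semi_major_axis_au, stellar_mass_solar)
    return abs(observed_days - predicted) / predicted <= rel_tol


# Sanity check with Solar System values: Earth, then Jupiter.
print(kepler_period_days(1.0, 1.0))         # 365.25
print(is_consistent(4332.59, 5.2044, 1.0))  # True
```

In a PySpark pipeline the same formula could be expressed as a column expression; plain Python is used here so the example runs anywhere.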

Real-Time Stock Market Analytics

Repo Demo

Production-grade stock market analytics pipeline with medallion architecture and live dashboard.

Key Achievements:

  • Near real-time stock ingestion with API rate-limit awareness
  • Bronze–Silver–Gold pipeline on Databricks
  • Delta Lake storage with partitioning
  • Interactive Streamlit dashboard for ad-hoc analysis

Tech: Databricks PySpark Delta Lake Streamlit APIs
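
"API rate-limit awareness" can be approached in several ways; a sliding-window limiter is one common option. The sketch below is a generic illustration rather than the project's implementation. The `RateLimiter` class, the limit of 2 calls per 60 seconds, and the fake clock used in the demo are all assumptions.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls within any sliding window of period_s seconds."""

    def __init__(self, max_calls, period_s, clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period = period_s
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls inside the current window

    def acquire(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the window.
            while self._calls and now - self._calls[0] >= self._period:
                self._calls.popleft()
            if len(self._calls) < self._max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self._period - (now - self._calls[0]))


class FakeClock:
    """Test double so the demo finishes instantly instead of really sleeping."""

    def __init__(self):
        self.t = 0.0

    def now(self):
        return self.t

    def sleep(self, seconds):
        self.t += seconds


fake = FakeClock()
limiter = RateLimiter(max_calls=2, period_s=60, clock=fake.now, sleep=fake.sleep)
for _ in range(3):
    limiter.acquire()  # the third call has to wait for the window to roll over
print(fake.t)  # 60.0
```

In an ingestion loop, `limiter.acquire()` would be called before each API request, with the default clock and sleep left in place.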

Spotify Listening Analytics

Repo

Spotify Web API pipeline with medallion architecture.

Key Achievements:

  • Real-time API data ingestion
  • Listening pattern analysis
  • Power BI dashboards

Tech: Databricks PySpark APIs Power BI



Connect

Building scalable data solutions with Azure and Databricks


LinkedIn PyPI Kaggle Email



Popular repositories

  1. clinical-doc-pipelines (Public)

    Clinical trial document intelligence pipelines using medallion architecture. Classification (87 categories) + NER (8 entity types) on Databricks.

    Python 1 1

  2. fabric-data-engineer-project (Public)

    Data engineering project using Microsoft Fabric, Spark, and Power BI. Covers ingestion, schema documentation, transformations, and visualization.

    Python 1

  3. executive-sales-summary (Public)

    Power BI dashboard analyzing Contoso sales data (2011–2013 actuals, 2014 forecast)

  4. panwarnalini-hub (Public)

  5. sales-performance-dashboard (Public)

    Jupyter Notebook

  6. sql-window-functions-ecommerce (Public)

    Jupyter Notebook