RaymondSHANG/README.md

Hi there, I'm Yuan Shang, PhD 👋

I am a Data Scientist based in Cary, NC. I combine advanced statistical modeling, causal inference, and machine learning to solve complex problems in both Marketing Science and Biomedical Research.

  • 🔭 Currently: Consultant, building models to quantify incremental impact and optimize budget decisions.
  • 🎓 Education: Ph.D. in Biochemistry (Structural Biology) (HKUST), M.S. in Statistics (U of Arizona), and M.S. in Computer Science (Georgia Tech).
  • 🔬 Research: Author of 30+ peer-reviewed publications with 2100+ citations.

🛠️ Tech Stack & Skills

Machine Learning & AI

  • Deep Learning: PyTorch, CNN, Transformers, Seq2Seq
  • LLMs: Fine-tuning (LoRA), AI Agents, Function Calling, LangChain
  • Domains: Image Processing, Natural Language Processing

Causal Inference & Statistics

  • Methods: Difference-in-Differences (DiD), Synthetic Controls, TBR-MM, A/B Testing, Matching Markets, Marketing Mix Modeling
  • Modeling: Bayesian Modeling (MCMC), Time-Series Analysis, Survival Analysis, Mixed Models

Data Engineering & Tools

  • Languages: Python, R, SQL, SAS
  • Big Data: PySpark/Hive, Google Cloud Platform (GCP)
  • Workflows: Nextflow, ETL Pipelines, Flask, R Shiny

📊 Professional Highlights

Marketing Decision Science

I currently optimize geo-pair selection and inform multi-million-dollar budget decisions using causal analysis. My work involves:

  • Implementing clustering algorithms to identify optimal geo-pair selections.
  • Applying Time-Based Regression Matched Markets (TBR-MM) models.
  • Using Bayesian inference for robust uncertainty quantification.

Bioinformatics & Computational Biology

With over 8 years of experience in high-dimensional omics, I have:

  • Led analytics collaborations for the University of Arizona, processing 1,000+ animal/human datasets.
  • Applied ML to real-world evidence datasets (NACC, ADNI) to uncover drug repositioning opportunities.
  • Developed computational pattern-matching methods for target identification at HKUST.

🔗 Connect with Me

Pinned

  1. image-caption-attention-transformer (Public)

    Python 2

  2. MyRNAPipe (Public)

    HTML 1

  3. RL4OVERCOOKED (Public)

    Multi-agent reinforcement learning for the Overcooked game

    Jupyter Notebook 1

  4. AudioCollector (Public)

    Python

  5. chatBot_RNASeq (Public)

    Python

  6. Clinical-Summarization (Public)

    Jupyter Notebook