👋 Hi, I'm Clébio Júnior
Data Scientist with 4+ years of experience building machine learning and NLP solutions in production, focused on solving real-world business problems in risk, credit, legal, and customer analytics.
I specialize in working with structured and unstructured data, developing predictive models, extracting insights from text and documents, and translating data into actionable decisions with measurable business impact.
I have hands-on experience with libraries such as scikit-learn, spaCy, pdfplumber, pytesseract, and pandas, as well as techniques like NLP, clustering, and supervised learning.
Get my resume
Connect with me
👨‍💻 About me
- 🎓 Master's degree in Natural Sciences (UENF) and Bachelor's degree in Physics (IFF)
- 📊 Strong background in predictive modeling, NLP, and customer analytics
- 🛠️ Skilled with Python, scikit-learn, spaCy, pandas, pdfplumber, and pytesseract
- 🗂️ Hands-on experience with unstructured data extraction from PDFs and images
- ✍️ I share technical content on Medium
💼 Professional Experience
Data Scientist | Vert Analytics (Oct/2024 – Present, Remote)
- Automated legal opinion defense recommendations using NLP and FAISS.
- Developed solutions for unstructured data extraction (PDFs and images) using pdfplumber, Tesseract OCR, and regex.
- Applied NLP (spaCy) to analyze social media comments, identifying customer concerns and dissatisfaction patterns.
Data Scientist | Datarisk (Jan/2022 – Aug/2024, Remote)
- Built credit scoring, sales forecasting, and customer segmentation models using machine learning and clustering.
- Developed predictive models for customer behavior (default risk, plan upgrade likelihood, job instability).
- Delivered insights supporting strategic decision-making and risk reduction.
Data Scientist | Be.X! (Mar/2021 – Jan/2022, Remote)
- Processed structured and unstructured data using regex and data cleaning techniques.
- Implemented outlier detection algorithms based on business rules for risk mitigation.
- Created ML models for delivery delay prediction, improving logistics operations.
📌 Featured Topics & Interests
- Applied Machine Learning in production
- Natural Language Processing (NLP)
- Unstructured data extraction and analysis
- Credit risk and customer behavior modeling
- Turning data into actionable business insights