Milena Šošić

PhD in Computer Science, NLP specialization

Faculty of Mathematics, University of Belgrade

Serbia

Computer scientist with twenty years of experience in machine learning, text mining, business intelligence, and the development and implementation of general software algorithms. Worked on high-visibility projects in the commercial, banking and telecommunications sectors. Familiar with the CRISP-DM methodology for data mining.

Able to influence the strategic direction of the company by identifying opportunities in large, rich data sets and by creating and implementing data-driven strategies that fuel growth in revenue and profit.

Design and implement predictive models and cutting-edge algorithms that use diverse sources of data to predict demand and optimize business processes.

Utilize analytical tools (Weka, SPSS, STATISTICA, Azure Cloud) to identify trends and relationships between different pieces of data, draw appropriate conclusions and translate analytical findings into strategies that drive value.

Work experience in software development, architecture and management includes, but is not limited to, the following areas: .NET technology development, information systems architecture and optimization, algorithm design, database architecture, business intelligence and machine learning. In particular:

  • Methodologies: Object-oriented design and programming (OOD and OOP),

  • Programming languages, databases and technologies: C#/ASP.NET, MS MVC, MS SQL Server, PostgreSQL, HTML, JavaScript, jQuery, Python,

  • Machine learning tools: Weka, SPSS, STATISTICA, IBM InfoSphere, IBM Intelligent Miner, Microsoft Azure Cloud Services, Python libraries for machine learning (scikit-learn, numpy, pandas, matplotlib, seaborn, statsmodels, jupyter notebook), Python libraries for deep learning (keras, tensorflow, pytorch), LLM & AI tools (langchain, langgraph, CrewAI, promptflow, LiteLLM, vLLM, Unsloth, DSPy, Langfuse, Docling),

  • Specialization in the field of NLP, especially: computational linguistics, semantic text analysis, named entity recognition, part-of-speech tagging, topic modeling, n-gram language modeling, knowledge construction from unstructured textual data, summarization, information retrieval,

  • Hands-on experience with data preparation including: data cleansing, feature engineering and visualization,

  • Reading and writing scientific publications on machine learning and transforming their results into business value.

Education

Interests

Natural Language Processing (NLP)
Computational linguistics · Semantic text analysis · Named Entity Recognition (NER) · PoS tagging · Topic modeling · Language modeling · Knowledge construction from richly formatted textual data · Summarization
LLMs
Architectures · Bias · Content Generation · Annotation · Validation · Hallucination · Serbian language
Serbian Language Resources
JeRTeh · Group · Membership · Resources · Technologies · Serbian language
Hobbies
Hiking · Reading · Playing flute

news

Jan 10, 2026 New project pages, updated CV, and additional references are now available. ✨ 📝 📚
Aug 6, 2023 An initial version of the website has been created! ✨ 😊

selected publications

2026

  1. Building an emotion lexicon for Serbian using curated language resources
    Milena Šošić, Jelena Graovac, and Ranka Stanković
    Language Resources and Evaluation, 2026

2022

  1. Effective methods for email classification: Is it a business or personal email?
    Milena Šošić and Jelena Graovac
    Computer Science and Information Systems, 2022