Milena Šošić

Computer scientist with twenty years of experience in machine learning, text mining, business intelligence, and general software algorithm development and implementation. Worked on high-visibility projects in the commercial, banking and telecommunications sectors. Familiar with the CRISP-DM methodology for data mining.

Able to influence the strategic direction of the company by identifying opportunities in large, rich data sets and by creating and implementing data-driven strategies that fuel growth in revenue and profits.

Design and implement predictive models and cutting-edge algorithms that use diverse sources of data to predict demand and optimize business processes.

Utilize analytical tools (Weka, SPSS, STATISTICA, Azure Cloud) to identify trends and relationships between different pieces of data, draw appropriate conclusions and translate analytical findings into strategies that drive value.

Work experience in software development, architecture and management includes, but is not limited to, the following areas: .NET development, information systems architecture and optimization, algorithm design, database architecture, business intelligence and machine learning. In particular:

Methodologies: Object-oriented software development and design (OOD and OOP),
Programming languages, databases and technologies: C#/ASP.NET, MS MVC, MS SQL Server, PostgreSQL, HTML, JavaScript, jQuery, Python,
Machine learning tools: Weka, SPSS, STATISTICA, IBM InfoSphere, IBM Intelligent Miner, Microsoft Azure Cloud Services, Python libraries for machine learning (scikit-learn, numpy, pandas, matplotlib, seaborn, statsmodels, jupyter notebook), Python libraries for deep learning (keras, tensorflow, pytorch), LLM & AI tools (langchain, langgraph, CrewAI, promptflow, LiteLLM, vLLM, Unsloth, DSPy, Langfuse, Docling),
Specialization in the field of NLP, especially: computational linguistics, semantic text analysis, named entity recognition, part of speech tagging, topic modeling, n-gram language modeling, knowledge construction from unstructured textual data, summarization, information retrieval,
Hands-on experience with data preparation including: data cleansing, feature engineering and visualization,
Reading and writing scientific publications on machine learning and transforming their results into business value.

Education

2020.10 - 2025.12
Doctor of Philosophy (Dr/PhD)

Faculty of Mathematics, University of Belgrade, Serbia

Computer Science, Machine Learning, Natural Language Processing

Thesis titled: Modeling moral and emotional language aspects in conversational texts classification
- Text classification · Semantic Knowledge Bases (English, Serbian) · Conversation · Emotions · Morality
2004.10 - 2010.12
Magister of Science (Mr/Mag/MPhil)

Faculty of Mathematics, University of Belgrade, Serbia

Computer Science

Thesis titled: Applying classification methods on N-gram genome analysis
- Databases · Data Mining · Bioinformatics
1997.10 - 2004.04
Graduated Mathematician and Computer Scientist (BSc/MSc)

Faculty of Mathematics, University of Belgrade, Serbia

Computer Science, Mathematics
- Programming Languages (Pascal, C, C++, Java, Prolog, Lisp) · Algorithms Architecture · Computer System Architecture · Mathematical Logic · Automata Theory · Databases · Microprocessors · Microcomputers · Assembly Language · Compilers and Interpreters · Programmable Systems Architecture · Algebra (I and II) · Analytical, Constructive and Euclidean Geometry · Calculus (I and II) · Real and Complex Functions · Numerical Methods (I and II) · Probability and Statistics

Interests

Natural Language Processing (NLP)

Computational linguistics · Semantic text analysis · Named Entity Recognition (NER) · PoS tagging · Topic modeling · Language modeling · Knowledge construction from richly formatted textual data · Summarization

LLMs

Architectures · Bias · Content Generation · Annotation · Validation · Hallucination · Serbian language

Serbian Language Resources

JeRTeh · Group · Membership · Resources · Technologies · Serbian language

Hobbies

Hiking · Reading · Playing flute

News

Jan 10, 2026	New project pages, updated CV, and additional references are now available.
Aug 6, 2023	An initial version of the website has been created!

Selected publications

2026

Building an emotion lexicon for Serbian using curated language resources

Milena Šošić, Jelena Graovac, and Ranka Stanković

Language Resources and Evaluation, 2026


@article{vsovsic2026building,
  title = {Building an emotion lexicon for Serbian using curated language resources},
  author = {{\v{S}}o{\v{s}}i{\'c}, Milena and Graovac, Jelena and Stankovi{\'c}, Ranka},
  journal = {Language Resources and Evaluation},
  volume = {60},
  number = {1},
  pages = {9},
  year = {2026},
  publisher = {Springer},
  doi = {10.1007/s10579-025-09639-3},
  url = {https://doi.org/10.1007/s10579-025-09639-3},
}

2022

Effective methods for email classification: Is it a business or personal email?

Milena Šošić and Jelena Graovac

Computer Science and Information Systems, 2022


@article{vsovsic2022effective,
  title = {Effective methods for email classification: Is it a business or personal email?},
  author = {{\v{S}}o{\v{s}}i{\'c}, Milena and Graovac, Jelena},
  journal = {Computer Science and Information Systems},
  volume = {19},
  number = {3},
  pages = {1155--1175},
  year = {2022},
  doi = {1820-0214/2022/1820-02142200034S},
}
