Szymon Woźniak

Wrocław, Dolnośląskie, Poland
614 followers 500+ connections

About

I currently work as a Data Scientist at Surfer.
I previously worked as a Machine…

Experience

  • Surfer

    Wrocław, Dolnośląskie, Poland

  • -

    Wrocław, Dolnośląskie, Poland

  • -

    Wrocław, Dolnośląskie, Poland

  • -

    Wrocław, Dolnośląskie, Poland

  • -

    Hamburg area, Germany

Education

  • Wrocław University of Science and Technology

    Politechnika Wrocławska

    5,5 (excellent)

    -

    While studying, I worked with deep neural networks, probabilistic models, natural language processing, computer vision, social media analysis, network science, representation learning, and DevOps methods for machine learning.

    Technologies I worked with:
    - Python, scikit-learn, pandas, TensorFlow/Keras, PyTorch, PyTorch Lightning, wandb, NetworkX
    - Celery, Docker, Kubernetes, Helm, PySpark
    - MongoDB, InfluxDB, Grafana, Redash

  • -

    -

Publications

  • Assessment of Massively Multilingual Sentiment Classifiers

    WASSA @ ACL 2022: Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

    Models are increasing in size and complexity in the hunt for SOTA. But what if that 2% increase in performance does not make a difference in a production use case? Maybe the benefits of a smaller, faster model outweigh those slight performance gains. Also, equally good performance across languages in multilingual tasks is more important than SOTA results on a single one. We present the biggest unified, multilingual collection of sentiment analysis datasets. We use these to assess 11 models and 80 high-quality sentiment datasets (out of 342 raw datasets collected) in 27 languages and include results on the internally annotated datasets. We deeply evaluate multiple setups, including fine-tuning transformer-based models for measuring performance. We compare results in numerous dimensions, addressing the imbalance in both language coverage and dataset sizes. Finally, we present some best practices for working with such a massive collection of datasets and models from a multilingual perspective.

  • Hex2vec -- Context-Aware Embedding H3 Hexagons with OpenStreetMap Tags

    4th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery (GEOAI '21)

  • Parameter-Less Population Pyramid for Permutation-Based Problems

    Parallel Problem Solving from Nature -- PPSN XVI, Springer International Publishing

    Linkage learning is frequently employed in state-of-the-art methods dedicated to discrete optimization domains. Information about linkage identifies a subgroup of genes that are found to be dependent on each other. If such information is precise and properly used, it may significantly improve a method's effectiveness. Recent research shows that to solve problems with so-called overlapping blocks, it is not enough to use linkage of high quality -- it is also necessary to use many different linkages that are diverse. Taking into account that the overlapping nature of problem structure is typical for practical problems, it is important to propose methods that are capable of gathering many different linkages (preferably of high quality) and keeping them diverse. One such method is the Parameter-less Population Pyramid (P3), which was shown to be highly effective for overlapping problems in binary domains. Since P3 does not apply to permutation optimization problems, we propose a new P3-based method to fill this gap. Our proposition, namely the Parameter-less Population Pyramid for Permutations (P4), is compared with the state-of-the-art methods dedicated to solving permutation optimization problems: Generalized Mallows Estimation of Distribution Algorithm (GM-EDA) and Linkage Tree Gene-pool Optimal Mixing Evolutionary Algorithm (LT-GOMEA) for Permutation Spaces. As a test problem, we use the Permutation Flowshop Scheduling problem (Taillard benchmark). Statistical tests show that P4 significantly outperforms GM-EDA for almost all considered problem instances and is superior to LT-GOMEA for large instances of this problem.

    Other authors
    • Michał Przewoźniczek
    • Marcin Komarnicki

Projects

  • Billy

    - Present

    Billy is an application for bill splitting with support for receipt scanning. It is a group project that took 3rd place in the API 2019 competition at Wrocław University of Science and Technology. The front end is a progressive web app written in the Angular framework. The back end is based on the Spring framework, the Kotlin language, and a PostgreSQL database.

Honors & Awards

  • 2nd place in the competition for the best graduate of the Faculty of Computer Science and Management, Wrocław University of Science and Technology

    Politechnika Wrocławska

  • 1st place in the Second Object-Oriented Programming in C++ Competition - Genetic Algorithms

    Dr. Michał Przewoźniczek, Marcin Komarnicki, MSc

    The competition was organized at Wrocław University of Science and Technology. The goal was to program an evolutionary method that solves various binary-encoded optimization problems (e.g. Ising Spin Glass, NK-Landscapes). The solution had to be implemented in C++ without using automatic memory management mechanisms.

  • 1st place in the Object-Oriented Programming Competition 2017 - Genetic Programming

    Dr. Michał Przewoźniczek, Marcin Komarnicki, MSc

    The competition was organized at Wrocław University of Science and Technology. The goal was to program an evolutionary algorithm that finds the symbolic function formulas best describing a dataset given as triples (x, y, f(x,y)). The whole solution had to be implemented in C++ without using automatic memory management mechanisms.

Languages

  • English

    Professional working proficiency
