My passion as a Data Scientist is leveraging the power of machine learning, especially NLP. People would describe me as a very organized, self-taught software developer who writes clean code. I enjoy solving real business problems by designing successful algorithms and machine learning models. My advanced physics and mathematics background makes me a great problem solver. I am most excited to use my creativity to build new products from scratch.
I have over two years of work experience in ML and NLP. The main Python libraries I use are: pandas, NumPy, and scikit-learn for basic ML; spaCy, RegEx, Gensim, and NLTK for NLP; and TensorFlow or PyTorch for neural networks. I have been responsible for algorithms in the following fields: author name disambiguation, topic modelling, text generation (e.g. autocompletion), record linkage and deduplication, and hierarchical multi-label classification. I have worked comfortably with both Hebrew and English datasets. I am familiar with different approaches used in NLP, such as count-based vectorizers (e.g. BoW, TF-IDF), word embeddings (e.g. Word2Vec, GloVe), and encoders (e.g. BERT). I have developed modules and classes for deploying my code into production. Finally, I am comfortable creating my own tools for assessing the performance of my models, e.g. an interactive confusion matrix.