Hello,

I'm Marko.

Full Stack Data Scientist.

Skill set:

Statistics

Data Science

Machine Learning

Teaching

Web course design

Analytical thinking

Operations Research

R

Python

SQL (databases)

Tableau

Linux

"Statistics is the grammar of science."

Karl Pearson

Projects:

Optimization of delivery frequencies

In an efficient retail company, it is essential to optimize the delivery process. Doing so means weighing the demand for goods against their supply, along with all the physical limitations of space and time. It also includes optimizing weekly delivery frequencies from warehouses to stores.

The project's main goal was to provide a platform that gives its users access to all the information on flows of goods and estimated weekly delivery frequencies. The end product was a tailor-made dashboard showing current, past, and optimal states from the point of view of a warehouse or a store.

Tools used: SQL, Python, Tableau

Store sales prediction after store renewal

A retail store renovation usually leads to increased sales in the first few days after reopening. The reasons are the new and better store appearance, the improved store assortment, and the marketing campaigns related to the event. Accurate estimates of the boosted sales are essential for having just the right amount of goods on-site.

We proposed a forecasting platform with the primary goal of predicting boosted sales of the selected store by considering the store's past sales and the behavior of similar stores after renovation. The platform has been applied in a few dozen store renewal events and always yielded promising results.

Tools used: SQL, R, Tableau

Cash-in-transit process optimization

Cash-in-transit (CIT) operations supply the outlets and ATMs of retail banks with cash. The main goal is to balance cash levels so that each location holds the right amount and stays within the permitted limits. In our case, a third-party CIT provider executed CIT operations, so the project's goal was to propose "an optimal solution" to decrease CIT-related costs for our bank.

We proposed a two-fold solution. First, it gathered all the relevant CIT information and presented it on one page. Second, it showed an estimate of an "optimal CIT network solution" from an operations research standpoint. Our tool let decision-makers tweak the proposed solution so that it resembled its real-world counterpart. The idea was applied in practice and helped our bank significantly decrease CIT-related costs.

Tools used: R, MS Excel

Advanced client clustering pipeline

Companies like to label clients based on their behavior toward the company. The estimated labels can then be used in marketing campaigns for more efficient client targeting. Banks are no exception here. The bank in question decided to outsource the client clustering solution to an external company.

I joined the bank after the solution had already been proposed, so my main contribution was implementing the proposed clustering procedure in production, where I had to develop a SQL design based on the generated clustering rules.

Tools used: SQL, R

Propensity-to-buy model for account overdraft & short-term loan

Banks use prediction models to improve client targeting for marketing-related strategies. The selected bank was planning to develop models for account overdrafts and short-term loans on the commercial client side. The main idea is to build a model that uses clients' past data to predict their propensity to buy a specific product.

We proposed a machine-learning-based solution for both products. The main challenge was the imbalanced classes (client buying: 1 versus client not buying: 0), which we successfully overcame.

Tools used: R

Client churn prediction

The cost of preventing a client from churning is much lower than the cost of acquiring a new client. It is therefore essential to detect when a good client is likely to churn and to use retention strategies to prevent that client from leaving the company.

We proposed a statistical model for client churn prediction for a given bank. Model development was quite challenging, since we had to tackle issues such as the definition of churn, how far into the future to predict churn, imbalanced classes (client churning: 1 versus client not churning: 0), and many more. Nevertheless, we successfully designed the model and deployed it in production.

Tools used: R

Propensity-to-buy model for pay-later cards

For a given bank, we proposed a machine-learning-based solution to predict the propensity of clients to buy a pay-later card. We designed the model for consumer clients. Here we faced challenges similar to those in the other models. In the end, we successfully deployed the model in production.

Tools used: Python, R

Automated web scraping & analysis of real estate price data

The project's main goal was to design a procedure for collecting and analyzing Slovenian real estate price data. I have written the code for the web scraping and data gathering part of the project. I will develop the code for price analysis once a substantial amount of data has been collected.

Tools used: Python, Tableau

Automated output of the monthly Udemy instructor report

A personal project whose main idea is to automate the output of the monthly Udemy instructor report. It is a two-step design: the first step cleans and parses the data, and the second step generates a dashboard showing all the relevant information on enrolled students, instructor revenue, and effort invested in course design. The procedure is already deployed and used every month.

Tools used: R, Tableau

Automated process for bank client target list creation

Banks usually use a list of existing and potential commercial clients to execute marketing strategies. Creating such a list requires collecting various data from different data sources, which normally introduces several obstacles in the data cleaning, parsing, and wrangling steps. Automating the list creation procedure is essential to avoid human error.

I took part in designing this process at two different banks. In the end, we proposed a procedure that is almost fully automated. The shared approach speeds up client list creation and minimizes human error.

Tools used: R, SQL

"Models should be as simple as possible, but not more so."

Albert Einstein

R for Beginners

Data Visualization with R and ggplot2

Data science with R: tidyverse