Below is a quick summary of some of the projects I keep in this repository.

Project 1 - Analyzing NY High School Data (SAT results)

This is a project about New York City public schools, based on a Dataquest guided project.
I combined data from several datasets into a single dataframe. The list of the datasets I combined is available in the schools folder.

After combining the data, I did the following:

Project 2 - Covid cases in Greece.

In this project I focused on the covid cases in Greece and did the following:

Project 3 - Web Scraping and Wordcloud using Python.

In this project I extracted data from a webpage and then used it to create a Wordcloud and a DataFrame, both of which contain useful information. The webpage used for this example is an interesting music blog on WordPress. You can explore it here.
I extracted the following information for each post:


Below you can see an example of the image I created with Wordcloud.

Project 4 - Study of EKPA.

I made some visualizations using data from a recent study of the National and Kapodistrian University of Athens (EKPA), in which I was responsible for the study documentation. A survey was organized, asking graduates of EKPA postgraduate programs questions about employment, satisfaction with their studies, etc.
Some interesting observations are the following:

Project 5 - House Prices: Prediction and Data analysis.

This dataset comes from a Kaggle competition that challenges participants to predict the final price of each home. There are 79 explanatory variables describing every aspect of residential homes in Ames, Iowa.

Project 6 - Lotto probabilities.

In this project I worked on the 6/49 lottery (Lotto) and answered the following questions:

Finally, I used historical data from the national 6/49 lottery game in Canada, with drawings dating from 1982 to 2018. By comparing a ticket against this historical data, a player can determine whether they would ever have won by now.
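
To give a feel for the kind of calculation involved, here is a minimal sketch in Python. It is not the code from the project notebook. The function names and the sample draws are hypothetical, made up purely for illustration. It computes the chance that a single 6/49 ticket matches all six drawn numbers, and counts how often a ticket matches a list of past draws.

```python
from math import comb  # available in Python 3.8+


def one_ticket_probability(n_numbers=49, n_picked=6):
    """Probability that a single ticket matches all drawn numbers."""
    return 1 / comb(n_numbers, n_picked)


def count_past_wins(ticket, past_draws):
    """Count how many past draws match the ticket exactly (order ignored)."""
    ticket_set = set(ticket)
    return sum(1 for draw in past_draws if set(draw) == ticket_set)


# Hypothetical draws, invented for this example (not real lottery results).
sample_draws = [
    (3, 11, 12, 14, 41, 43),
    (8, 33, 36, 37, 39, 41),
    (1, 6, 23, 24, 27, 39),
]

print(comb(49, 6))               # number of possible 6/49 combinations
print(one_ticket_probability())  # chance of winning with one ticket
print(count_past_wins([41, 3, 14, 11, 43, 12], sample_draws))
```

There are 13,983,816 possible combinations in a 6/49 game, so a single ticket has roughly a 1 in 14 million chance of matching all six numbers. With real historical data, `sample_draws` would be replaced by the winning numbers read from the dataset.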

Project 7 - Visualizing Earnings Based On College Majors.

This is a dataset on the job outcomes of students who graduated from college between 2010 and 2012. The original data on job outcomes was released by the American Community Survey, which conducts surveys and aggregates the data.
This is part of a project I did on Dataquest, where I had the chance to work with plots such as histograms, scatterplots and barplots in Jupyter notebooks.

Project 8 - Titanic dataset.

This is the famous Titanic dataset. I’ve downloaded the data from Kaggle.