As a rule, we use Power BI to present our findings in dashboards and reports. But Microsoft Power BI can also be useful at the stage of initial exploratory data analysis.
I discovered this when I needed to examine a really wide data table containing hundreds of columns. Usually, I write an R script and create scatterplot matrices with pairs(). But with so many features, and wanting to browse them in different combinations, that would be rather onerous.
That is why I created a ggpairs R Visual that shows the same kind of chart in Power BI. There are two reasons for that. First, I can quickly select the features to display simply by ticking them in the “Fields” pane in Power BI. Second, Power BI offers a lot of data sources, which can be accessed much more easily than in R.
Of course, Power BI has a few drawbacks. It tries to refresh the chart every time you select or deselect a field, which is annoying. And do not forget the data size limitation in R Visuals: Power BI passes no more than the first 150,000 rows.
The source code of ggpairs.R can be found there
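For readers who just want the idea, here is a minimal sketch of what such an R Visual script could look like. It is not the ggpairs.R script mentioned above, only an illustration. It relies on the fact that Power BI hands the selected fields to an R Visual as a data frame named dataset, and it assumes the GGally package is installed in the R installation that Power BI uses. Keeping only the numeric columns and falling back to the built-in iris data outside Power BI are assumptions of this sketch.

```r
library(GGally)  # provides ggpairs(); assumed to be installed

# Inside a Power BI R Visual the selected fields arrive as a data frame
# called `dataset`. Outside Power BI that object does not exist, so this
# sketch falls back to the built-in iris data (an assumption for testing).
if (!exists("dataset", inherits = FALSE)) {
  dataset <- iris
}

# Assumption of this sketch: keep only numeric columns so the matrix stays readable.
is_num <- vapply(dataset, is.numeric, logical(1))
numeric_cols <- names(dataset)[is_num]
plot_data <- dataset[, numeric_cols, drop = FALSE]

# Build the scatterplot matrix and draw it on the current graphics device,
# which is what Power BI renders in the visual.
p <- ggpairs(plot_data)
print(p)
```

With a script like this, ticking or unticking a field in the “Fields” pane changes the columns of dataset, so the matrix is rebuilt for the new selection on the next refresh.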
A year ago I decided to take a course on Machine Learning and Data Science. Microsoft offered the “Microsoft Professional Program for Data Science”, built on massive open online courses (MOOCs) on the edX platform.
The program consists of 9 courses grouped into 4 units, plus a final project (see more at https://academy.microsoft.com/en-us/professional-program/data-science/). Some of the units let you choose between different courses. For example, you can choose courses that require knowledge of either R or Python.
I completed the following courses:
Each course, including the Capstone Project, costs $99 for a verified certificate. That adds up to $990 for the whole program (9 courses plus the Capstone Project), provided Microsoft does not raise the price again, as it has already done twice.
The courses are well structured: some theory presented by a trainer with hands-on demos, plus quizzes, labs, and exams.
The most enjoyable part for me was the Capstone Project. It is a competition in which you have to predict some values from a given set of data and write a report on your analysis and findings. You can use any techniques you want, but the final score depends on the accuracy of your predictions.
It was a really excellent experience, but I have to take a breath before I start looking for new courses.