Although I have a graduate degree in econometrics and am familiar with statistical data analysis, I am still relatively new to programming languages and have yet to learn a lot about computers and data structures. Figuring that there might be many people who are only as familiar with this material as I am, I decided to start blogging here from the very beginning of my data science endeavor. In the following, I will describe how I set up my Windows 10 computer for future tasks. If you want to start data science as well, I hope my blog can be of some help.
First, I downloaded the programming languages Python (Anaconda) and R. They are very popular among data scientists, and I have some experience using them. For Python, I opted for the Anaconda distribution with Python 3.7, as it comes with packages suited to data analysis, and for R, I just downloaded the latest version (3.6.3).
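If you want to double-check what actually got installed, a small Python script can report it. This is only a minimal sketch: the function name `describe_setup` is made up for illustration, the check for a `conda-meta` folder is a heuristic for spotting a conda-based install rather than an official test, and `Rscript` will only be found if R's folder is on your PATH (on Windows that may not be the case by default).

```python
import os
import platform
import shutil
import sys


def describe_setup():
    """Collect basic facts about the current Python install and R's availability."""
    return {
        # Version of the interpreter running this script, e.g. "3.7.x"
        "python_version": platform.python_version(),
        # Full path to the interpreter, useful for spotting which install is active
        "python_executable": sys.executable,
        # Heuristic (assumption): conda environments contain a "conda-meta" folder
        "looks_like_conda": os.path.isdir(os.path.join(sys.prefix, "conda-meta")),
        # Path to Rscript if it is on PATH, otherwise None
        "rscript_path": shutil.which("Rscript"),
    }


info = describe_setup()
for key, value in info.items():
    print(f"{key}: {value}")
```

If `rscript_path` comes back as `None`, that does not necessarily mean R is missing; it may just not be on the PATH.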
Second, I chose Jupyter Lab as my main platform for data analysis because it seemed to me a great tool for a blog series on data science. What makes Jupyter Lab so special for me is that you can use both Python and R on the same platform. I decided to use Jupyter Lab after watching this presentation (40 mins):
You can install Jupyter Lab by running the conda command below in Command Prompt, as explained in the tutorial.
conda install -c conda-forge jupyterlab
When the installation is complete, you can launch Jupyter Lab in your default web browser by running the following in Command Prompt:
jupyter lab
You will see that Python can be used in Jupyter Lab. Now, let's add R. You can do this by following this tutorial. After doing this, however, I was greeted with some errors when I tried to use R in Jupyter Lab. I don't know what was wrong, but the errors went away after I followed the steps shown here, which basically do the same thing from the R console instead of Anaconda Prompt.
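One way to confirm that R really was registered is to ask Jupyter which kernels it knows about. The sketch below simply wraps the `jupyter kernelspec list` command from Python; the helper name `list_jupyter_kernels` is my own, and it assumes `jupyter` is on your PATH. If the R kernel was registered through the IRkernel package (an assumption, since the tutorials may use a different route), it usually shows up under the name `ir`.

```python
import shutil
import subprocess


def list_jupyter_kernels():
    """Return the text printed by `jupyter kernelspec list`, or None if it cannot be run."""
    jupyter = shutil.which("jupyter")
    if jupyter is None:
        # jupyter is not on PATH (for example, outside the Anaconda environment)
        return None
    try:
        result = subprocess.run(
            [jupyter, "kernelspec", "list"],
            capture_output=True,  # requires Python 3.7 or later
            text=True,
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout


kernels = list_jupyter_kernels()
if kernels is None:
    print("Could not run 'jupyter kernelspec list'; is jupyter on your PATH?")
else:
    print(kernels)
```

If only a Python kernel is listed, the R registration step probably did not go through, and repeating it from the R console may be worth a try.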
Third, we need a relational database management system (RDBMS) to create a database on a computer, which will be necessary, or at least very useful, for data analysis. I decided to install SQLite, not for any specific reason to be honest, but because it seemed pretty straightforward to use and has a lot of applications. I also downloaded DB Browser for SQLite, a tool that displays databases created with SQLite through a user-friendly interface.
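To give a feel for how SQLite fits in with Python, here is a minimal sketch using the `sqlite3` module from Python's standard library. The table name `observations`, its columns, and the three rows are placeholders I made up purely for illustration; they are not real data.

```python
import sqlite3


def build_demo_database(path=":memory:"):
    """Create a tiny demo table and return the open connection.

    The default ":memory:" keeps the database in RAM; pass a file path
    to write it to disk instead.
    """
    conn = sqlite3.connect(path)
    with conn:  # commits automatically if the block succeeds
        conn.execute(
            "CREATE TABLE IF NOT EXISTS observations (label TEXT, value INTEGER)"
        )
        # Placeholder rows, invented for this example only
        conn.executemany(
            "INSERT INTO observations (label, value) VALUES (?, ?)",
            [("a", 1), ("b", 2), ("c", 3)],
        )
    return conn


conn = build_demo_database()
rows = conn.execute(
    "SELECT label, value FROM observations ORDER BY label"
).fetchall()
total = conn.execute("SELECT SUM(value) FROM observations").fetchone()[0]
print(rows)   # [('a', 1), ('b', 2), ('c', 3)]
print(total)  # 6
conn.close()
```

Passing a file name instead of `":memory:"` (for example a hypothetical `demo.db`) writes the database to disk, and that file can then be opened in DB Browser for SQLite.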
We now have Python (Anaconda) and R for doing data analysis, both of which can be run in Jupyter Lab, and SQLite, with DB Browser for SQLite, for making and managing databases. All set!
What are you going to do now? I'm going to get some data on COVID-19, hoping I can do something useful with it later.