
1. Getting COVID-19 Data


As you may have heard already, Johns Hopkins University launched its Coronavirus Resource Center soon after the outbreak became known. They have been collecting all the pandemic data they can get their hands on and visualizing it with maps and graphs. They also share their data sets through their GitHub account.


Because I plan to analyze how the spread of the virus is associated with socioeconomic factors, I downloaded CSV files of COVID-19 data from their repository using R (3.6.3) in Jupyter Lab. Below, I show how I did it, in case others want the same data. For the setup needed to replicate the task, please refer to my post "Setting Up Windows 10 for Data Science".


Acknowledgement: For this task, I referred to this Stack Overflow discussion and this DataCamp tutorial. I appreciate their trailblazing.


First, visit that GitHub repository. There, click the link for the first daily data set, "01-22-2020.csv". On that page, you will see a "Raw" button at the top right of the data; click it. This takes you to a page showing the raw contents of the CSV file. Copy its URL.


The rest of the work is done in Jupyter Lab using the R kernel, as shown below (if you don't know how to launch it, see the previous post).
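For readers who cannot see the embedded notebook, here is a minimal sketch of the download step in base R. It is not necessarily identical to the notebook's code. The URL is what I understand the "Raw" address of the first daily report to be, so paste in the one you actually copied; the variable names (`base_url`, `file_name`, `csv_url`, `covid_0122`) are my own choices for illustration.

```r
# Assumed raw URL of the daily-reports folder; replace it with the folder
# part of the URL you copied from the "Raw" page if it differs.
base_url <- "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/"
file_name <- "01-22-2020.csv"
csv_url <- paste0(base_url, file_name)

# Save a local copy in the working directory ("wb" avoids line-ending
# changes on Windows), then read it into a data frame.
download.file(csv_url, destfile = file_name, mode = "wb")
covid_0122 <- read.csv(file_name, stringsAsFactors = FALSE)

# Quick look at the first few rows to confirm the download worked.
head(covid_0122)
```

If you only need the data in memory, `read.csv(csv_url)` can read straight from the URL, but keeping a local file means you do not have to download it again later.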


(If the notebook above appears shrunk, keep refreshing the page until it displays at full size.)


Now we have the Johns Hopkins data on our computer. However, rewriting the date in the file name by hand for every daily file takes a lot of time. So, let's learn how to automate the process here!
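To give a sense of what automating the dates could look like, here is a small sketch that generates the file names instead of typing them. It is only one possible approach, not necessarily the one used in the follow-up post. The end date is a hypothetical example, and the name `file_names` is my own.

```r
# Hypothetical date range: the first report through an example end date.
dates <- seq(as.Date("2020-01-22"), as.Date("2020-01-26"), by = "day")

# The daily reports are named MM-DD-YYYY.csv, as in "01-22-2020.csv".
file_names <- paste0(format(dates, "%m-%d-%Y"), ".csv")

print(file_names)
# "01-22-2020.csv" "01-23-2020.csv" "01-24-2020.csv" "01-25-2020.csv" "01-26-2020.csv"
```

Each name can then be appended to the folder part of the raw URL you copied to get the address of that day's file.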
