5 votes
Get the data and put it in order (20 marks)

Make sure you open a new project in R. Load the WDI library.

a- WDI allows you to search and download data from the World Bank World Development Indicators directly from R. For example, search for the series containing information on GDP per capita. Note that to search for more than one term you need to include them within single quotes as 'term1.*term2'. We will use the following series: "GDP per capita, PPP (constant 2017 international $)". Note its code, which can be used to download the series as a dataframe using the WDI command. We want it for all countries and for the years between 2000 and 2020. Let's call this series WB_GDP. Your code should look like:

    WB_GDP = WDI(indicator = 'series code', country = "all", start = 2000, end = 2020)

For education, we will use the Barro-Lee percentage of 15+ with secondary education: 'BAR.SEC.ICMP.15UP.ZS'

b- Merge these two dataframes by country and year.

c- The data is messy. Do the following:

  - Eliminate the 4 iso variables. This can be done using the subset command with a negative sign to indicate that you want to eliminate variables:

    DF = subset(DF, select = -c(var1, var2, var3, var4))

  - Keep only the years for which the Barro-Lee data is available; i.e. 2000, 2005, 2010!!! Hint: This can also be done using the subset command. To make a conditional "or" statement you need to use the | sign; i.e. year==2000 | year==2005.
  - Rename the GDP and Education variables as GDP and Edu. Hint: There are various ways of doing this, either using subset or, more elegantly, with the rename command.
  - Generate a log(GDP) variable.
  - Use the Skim command to summarise the data. Discuss the main issues.

1 Answer

4 votes

Final answer:

To get the data and put it in order in R: load the WDI library, search for and download the two series with the WDI commands, merge the dataframes by country and year, clean the merged data, and summarise it with the Skim command.

Step-by-step explanation:

The steps are as follows:

  1. Open a new project in R and load the WDI library.
  2. Use the WDI library to search and download World Bank World Development Indicators data directly from R. Search for the GDP per capita series with WDIsearch, using a pattern such as 'GDP.*capita' to match more than one term, and note the code of the series named "GDP per capita, PPP (constant 2017 international $)". Pass that code, not the series name, as the indicator when downloading the dataframe: WB_GDP = WDI(indicator = 'series code', country = 'all', start = 2000, end = 2020).
  3. For education, download the Barro-Lee percentage of 15+ with secondary education in the same way, using the code 'BAR.SEC.ICMP.15UP.ZS'.
  4. Merge the two dataframes by country and year.
  5. Clean the data: eliminate the 4 iso variables, keep only the years for which Barro-Lee data is available (2000, 2005, and 2010), rename the GDP and Education variables as GDP and Edu, and generate a log(GDP) variable. Then use the Skim command to summarise the data and identify any issues.
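
The download and merge steps could be sketched roughly as below. This is only an illustration, not the assignment's official solution: the helper name fetch_wb_data is hypothetical, and the GDP code NY.GDP.PCAP.PP.KD is an assumption that should be confirmed with WDIsearch before use. Calling the function needs the WDI package and an internet connection.

```r
# Sketch of steps 1-4, wrapped in a hypothetical helper so that nothing is
# downloaded until the function is actually called.
# install.packages("WDI")   # once, if the package is not installed yet

fetch_wb_data <- function(gdp_code = "NY.GDP.PCAP.PP.KD",   # ASSUMED code, check it
                          edu_code = "BAR.SEC.ICMP.15UP.ZS") {
  # Download each series for all countries, 2000 to 2020
  WB_GDP <- WDI::WDI(indicator = gdp_code, country = "all",
                     start = 2000, end = 2020)
  WB_EDU <- WDI::WDI(indicator = edu_code, country = "all",
                     start = 2000, end = 2020)

  # Merge on country and year; any other columns the two frames share
  # come back twice, with .x and .y suffixes
  merge(WB_GDP, WB_EDU, by = c("country", "year"))
}

# Usage (not run here):
#   WDI::WDIsearch("GDP.*capita")   # look up the series code first
#   DF <- fetch_wb_data()
```

Merging only on country and year is what leaves duplicated identifier columns in the result, which the cleaning step then has to remove.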

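The cleaning step can be tried out on a small made-up dataframe before running it on the real download. Everything in the toy data below is hypothetical: the countries, the values, and the column names, which assume the merge left four iso columns with .x and .y suffixes and that the GDP series code is NY.GDP.PCAP.PP.KD. Check names(DF) on the real data and adjust.

```r
# Made-up stand-in for the merged dataframe (values are NOT real data)
DF <- data.frame(
  country = c("A", "A", "A", "B", "B", "B"),
  year    = c(2000, 2003, 2005, 2000, 2005, 2010),
  iso2c.x = "XX", iso3c.x = "XXX", iso2c.y = "XX", iso3c.y = "XXX",
  NY.GDP.PCAP.PP.KD    = c(1000, 1100, 1200, NA, 20000, 22000),
  BAR.SEC.ICMP.15UP.ZS = c(10.5, NA, 12.0, 30.1, 31.4, NA)
)

# Drop the four iso columns: the minus sign inside select removes variables
DF <- subset(DF, select = -c(iso2c.x, iso3c.x, iso2c.y, iso3c.y))

# Keep only the Barro-Lee years, combining conditions with |
DF <- subset(DF, year == 2000 | year == 2005 | year == 2010)

# Rename with base R (dplyr::rename is an alternative)
names(DF)[names(DF) == "NY.GDP.PCAP.PP.KD"]    <- "GDP"
names(DF)[names(DF) == "BAR.SEC.ICMP.15UP.ZS"] <- "Edu"

# Log of GDP; missing GDP values stay missing
DF$logGDP <- log(DF$GDP)

# Base R overview; with the skimr package installed, skimr::skim(DF)
# gives the fuller summary the question asks for
summary(DF)
colSums(is.na(DF))   # missing values per column
```

The count of missing values per column is a natural starting point for the "discuss the main issues" part, since the summary makes gaps in either series visible.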
by User John Bartholomew
7.8k points