Exercise 1

Questions

  1. Use the long-format file pgen as the master data and merge the variables sampreg, psample, pop, sex, gebjahr and phrf from the file ppfadl into it. Gross labor income is stored in the variable pglabgro.

  2. You will have to transform the income data using a consumer price index (see table A2 in SOEPmonitor).

  3. Replicate the tables concerning current monthly individual gross labor income from SOEP-Monitor (p.73): https://www.diw.de/de/diw_02.c.222728.de/soepmonitor.html

Data Prep

Stata

use "_data/ex_mydf.dta", clear
    
* Sample for Analysis
**********************************************
    tab pop

    cap drop asample
    gen asample=1 if pop<3              // only private households

R

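library(dplyr)  # provides %>% and filter()

ex_mydf <- readRDS(file = "_data/ex_mydf.rds")

# Sample for analysis: private households only (pop 1 and 2)
asample <- ex_mydf %>% filter(pop < 3)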

Answers

1.1 Create Data

Use the long-format file pgen as the master data and merge the variables sampreg, psample, pop, sex, gebjahr and phrf from the file ppfadl into it. Gross labor income is stored in the variable pglabgro.

See data creation
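The merge step can be sketched in base R as follows. This is only an illustration on invented toy data, not the course's data-creation script: the join keys `pid` and `syear` are an assumption about the long-format files, and all values are made up.

```r
# Toy stand-ins for pgen and ppfadl (all values invented for illustration).
pgen <- data.frame(
  pid      = c(1, 1, 2, 3),
  syear    = c(2014, 2015, 2015, 2015),
  pglabgro = c(2500, 2600, 1800, 3100)
)
ppfadl <- data.frame(
  pid     = c(1, 1, 2, 3, 4),
  syear   = c(2014, 2015, 2015, 2015, 2015),
  sampreg = c(1, 1, 2, 1, 2),
  psample = c(1, 1, 3, 1, 3),
  pop     = c(1, 1, 1, 3, 1),
  sex     = c(1, 1, 2, 2, 1),
  gebjahr = c(1970, 1970, 1985, 1990, 1960),
  phrf    = c(1200.5, 1180.0, 950.2, 0, 800)
)

# Assumed join keys (pid, syear) plus the variables requested in the exercise.
keep <- c("pid", "syear", "sampreg", "psample", "pop", "sex", "gebjahr", "phrf")

# pgen is the master: all.x = TRUE keeps every pgen row,
# and ppfadl rows without a match in pgen (pid 4 here) are dropped.
mydf <- merge(pgen, ppfadl[, keep], by = c("pid", "syear"), all.x = TRUE)
mydf <- mydf[order(mydf$pid, mydf$syear), ]
print(mydf)
```

Keeping pgen as the left-hand table mirrors the "master data" wording of the question: the result has exactly one row per pgen observation.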

1.2 Wrangle Data

You will have to transform the income data using a consumer price index (see table A2 in SOEPmonitor).

Also see data creation
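As a rough sketch of the deflation step, nominal income can be converted to constant prices by dividing by the price index of the survey year and multiplying by 100. The CPI values below are invented placeholders, not the figures from Table A2. The base year 2010 is an assumption suggested by the variable name breink10, and treating negative values of pglabgro as missing is likewise an assumption about the coding.

```r
# Hypothetical CPI series with base year 2010 = 100 (NOT the real Table A2 values).
cpi <- data.frame(syear = c(2005, 2010, 2015), cpi = c(90, 100, 110))

# Toy income data; -2 stands in for a negative missing-value code (assumption).
income <- data.frame(
  pid      = c(1, 1, 1, 2),
  syear    = c(2005, 2010, 2015, 2015),
  pglabgro = c(1800, 2000, 2200, -2)
)

# Treat negative codes as missing before computing anything.
income$pglabgro[income$pglabgro < 0] <- NA

# Attach the index of the matching survey year.
income <- merge(income, cpi, by = "syear", all.x = TRUE)

# Real income in prices of the base year: nominal * 100 / CPI.
income$breink10 <- income$pglabgro * 100 / income$cpi
print(income)
```

In this toy example nominal income grows exactly in line with prices, so the three valid observations all have the same real income.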

1.3 Replicate Table

Replicate the tables concerning current monthly individual gross labor income from SOEP-Monitor (p.73): https://www.diw.de/de/diw_02.c.222728.de/soepmonitor.html

Stata

use "_data/ex_mydf.dta", clear
    
* Sample for Analysis
**********************************************
    tab pop

    cap drop asample
    gen asample=1 if pop<3              // exclude the institutional population
    
* 3. Replication of SOEP-Monitor (Exercise 1)
***************************************************************************
    table syear sampreg   [pw=phrf], c(mean breink10) center col
    table syear erwstatus [pw=phrf], c(mean breink10) center col by(sampreg) format(%9.2fc)

. use "_data/ex_mydf.dta", clear
(PGEN: Feb 12, 2017 13:00:53-1 DBV32L)

.         
. * Sample for Analysis
. **********************************************
.         tab pop

     Aktuelle Populationszugehoerigkeit |      Freq.     Percent        Cum.
----------------------------------------+-----------------------------------
         [1] Privathaush., deutscher HV |    519,155       87.28       87.28
         [2] Privathaush., auslaend. HV |     72,022       12.11       99.39
         [3] Anstaltshh.,  deutscher HV |      2,568        0.43       99.82
         [4] Anstaltshh,   auslaend. HV |        653        0.11       99.93
       [5] n.real. Privathaush., dt. HV |        396        0.07       99.99
     [6] n.real. Privathaush., ausl. HV |         30        0.01      100.00
       [7] n.real. Anstaltshh.,  dt. HV |          4        0.00      100.00
----------------------------------------+-----------------------------------
                                  Total |    594,828      100.00

. 
.         cap drop asample

.         gen asample=1 if pop<3                          // exclude the institutional population
(3,651 missing values generated)

.         
. * 3. Replication of SOEP-Monitor (Exercise 1)
. ***************************************************************************
.         table syear sampreg   [pw=phrf], c(mean breink10) center col

----------------------------------------------------------------------------
Erhebungs |
jahr      |
(Survey-Y |           Aktuelle Stichprobenregion (Berlin West-Ost)          
ear)      | [1] Westdeutschland,  [2] Ostdeutschland,           Total       
----------+-----------------------------------------------------------------
     1984 |       2150.266                                    2150.266      
     1985 |       2114.438                                    2114.438      
     1986 |       2210.113                                    2210.113      
     1987 |       2240.134                                    2240.134      
     1988 |       2313.789                                    2313.789      
     1989 |       2347.261                                    2347.261      
     1990 |       2355.683                                    2355.683      
     1991 |         2363.4              1298.328               2133.14      
     1992 |       2458.036              1536.907              2277.377      
     1993 |       2525.121              1690.638              2368.518      
     1994 |        2528.86              1812.548              2393.563      
     1995 |       2541.935              1887.492              2418.383      
     1996 |       2640.174               1917.95              2509.297      
     1997 |       2585.446              1936.169              2468.956      
     1998 |       2610.403              1901.663               2483.35      
     1999 |       2635.446              1926.642              2511.707      
     2000 |       2602.331               1944.52               2488.17      
     2001 |       2574.108               1951.02              2467.288      
     2002 |        2644.25              1973.334              2531.548      
     2003 |       2683.347              2049.873              2579.703      
     2004 |        2660.85              1976.247              2547.776      
     2005 |       2588.603              1962.568              2487.351      
     2006 |        2591.64              1930.182              2481.239      
     2007 |       2553.655              1862.697              2435.244      
     2008 |       2507.967              1865.118               2399.36      
     2009 |         2539.8              1954.229               2438.52      
     2010 |       2504.635              1955.423              2409.267      
     2011 |        2526.58              1917.365              2419.531      
     2012 |       2490.234              1967.234              2402.278      
     2013 |       2487.072              1936.126              2390.287      
     2014 |       2498.136              1988.372              2410.679      
     2015 |       2562.444               2044.89              2473.146      
----------------------------------------------------------------------------

.         table syear erwstatus [pw=phrf], c(mean breink10) center col by(sampr
> eg) format(%9.2fc)

-----------------------------------------------------------------------
Aktuelle Stichprobenregion (Berlin       |      RECODE of pgemplst     
West-Ost) and Erhebungsjahr              |     (Employment Status)     
(Survey-Year)                            | Vollzeit  Teilzeit    Total 
-----------------------------------------+-----------------------------
[1] Westdeutschland, alte Bundeslaender  |
                                    1984 | 2,524.24    933.91  2,252.48
                                    1985 | 2,468.76    880.50  2,186.75
                                    1986 | 2,568.27  1,029.22  2,307.93
                                    1987 | 2,636.95    949.78  2,347.15
                                    1988 | 2,716.84  1,020.73  2,422.29
                                    1989 | 2,753.86    989.33  2,449.56
                                    1990 | 2,770.59  1,021.27  2,453.24
                                    1991 | 2,805.45  1,038.36  2,463.05
                                    1992 | 2,887.26  1,127.26  2,552.92
                                    1993 | 2,984.69  1,090.48  2,619.89
                                    1994 | 2,973.13  1,107.11  2,609.82
                                    1995 | 3,018.67  1,097.27  2,626.09
                                    1996 | 3,147.05  1,197.99  2,722.11
                                    1997 | 3,080.75  1,136.08  2,666.22
                                    1998 | 3,128.58  1,159.52  2,702.82
                                    1999 | 3,208.76  1,143.09  2,731.11
                                    2000 | 3,195.41  1,134.32  2,683.36
                                    2001 | 3,195.14  1,105.96  2,668.04
                                    2002 | 3,323.16  1,106.64  2,746.24
                                    2003 | 3,396.16  1,163.09  2,796.68
                                    2004 | 3,346.17  1,158.97  2,761.15
                                    2005 | 3,303.90  1,126.71  2,690.02
                                    2006 | 3,306.01  1,082.16  2,676.91
                                    2007 | 3,277.37  1,102.98  2,639.73
                                    2008 | 3,242.78  1,034.07  2,588.77
                                    2009 | 3,251.35  1,127.42  2,626.37
                                    2010 | 3,251.80  1,054.28  2,587.15
                                    2011 | 3,224.20  1,077.21  2,602.36
                                    2012 | 3,197.46  1,073.62  2,566.41
                                    2013 | 3,241.60  1,091.38  2,573.60
                                    2014 | 3,280.71  1,107.76  2,576.22
                                    2015 | 3,364.24  1,180.52  2,650.76
-----------------------------------------+-----------------------------
[2] Ostdeutschland, neue Bundeslaender ( |
                                    1984 |                             
                                    1985 |                             
                                    1986 |                             
                                    1987 |                             
                                    1988 |                             
                                    1989 |                             
                                    1990 |                             
                                    1991 | 1,386.07    729.64  1,325.86
                                    1992 | 1,641.03    986.63  1,591.71
                                    1993 | 1,830.22  1,063.01  1,766.58
                                    1994 | 1,983.73  1,107.00  1,900.95
                                    1995 | 2,082.88  1,112.50  1,970.43
                                    1996 | 2,123.27  1,159.78  2,005.26
                                    1997 | 2,135.56  1,217.68  2,019.18
                                    1998 | 2,128.20  1,129.91  1,996.40
                                    1999 | 2,151.43  1,249.14  2,018.00
                                    2000 | 2,228.21  1,123.32  2,040.05
                                    2001 | 2,248.54  1,144.69  2,053.01
                                    2002 | 2,295.54  1,184.70  2,092.15
                                    2003 | 2,404.22  1,180.51  2,148.72
                                    2004 | 2,390.41  1,149.77  2,090.16
                                    2005 | 2,316.41  1,176.05  2,063.21
                                    2006 | 2,294.99  1,148.18  2,016.19
                                    2007 | 2,229.73  1,068.30  1,948.06
                                    2008 | 2,196.47  1,032.96  1,928.59
                                    2009 | 2,371.43  1,004.52  2,031.32
                                    2010 | 2,352.29  1,056.86  2,024.60
                                    2011 | 2,286.47  1,001.40  1,982.48
                                    2012 | 2,373.58    997.28  2,027.58
                                    2013 | 2,331.49  1,059.28  1,987.80
                                    2014 | 2,423.47  1,057.17  2,059.64
                                    2015 | 2,488.00  1,144.59  2,111.84
-----------------------------------------------------------------------

R

#### 1.3 Describe Data ####
# # Means by Region 
# prep1.3 <- asample %>% 
#       select(syear, breink10, erwstatus, ost, phrf) %>% 
#       dplyr::group_by(syear, ost) %>% 
#       # mutate_at(c("breink10"), funs(Total = mean(., na.rm = T))) %>% 
#       dplyr::summarise(TotalMean = mean(breink10, weights = phrf, na.rm = T))
# 
#      xtabs(TotalMean ~ syear + ost, prep1.3) 

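# Means by Region and Job Status
# Note: base R's mean() has no `weights` argument; it is silently absorbed by
# `...`, so the output below shows UNWEIGHTED means. This is one likely reason
# the figures differ from the weighted Stata table. For weighted means use
# weighted.mean(breink10, w = phrf, na.rm = TRUE).
result1.3 <- asample %>% 
      group_by(syear, ost, erwstatus) %>%
      dplyr::summarise(mean = mean(breink10, weights = phrf, na.rm = TRUE))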

xtabs(mean ~ syear + erwstatus + ost, result1.3) 
## , , ost = 0
## 
##       erwstatus
## syear     1    2
##   1984 2406  945
##   1985 2366  896
##   1986 2466 1024
##   1987 2540  934
##   1988 2607  975
##   1989 2622  981
##   1990 2653  987
##   1991 2667  989
##   1992 2743 1078
##   1993 2802 1044
##   1994 2799 1064
##   1995 2830 1036
##   1996 2941 1122
##   1997 2908 1069
##   1998 2967 1087
##   1999 3007 1080
##   2000 3174 1100
##   2001 3163 1073
##   2002 3827 1220
##   2003 3792 1221
##   2004 3740 1224
##   2005 3641 1201
##   2006 3623 1170
##   2007 3579 1173
##   2008 3500 1131
##   2009 3497 1168
##   2010 3490 1106
##   2011 3472 1137
##   2012 3450 1115
##   2013 3308 1080
##   2014 3379 1128
##   2015 3425 1181
## 
## , , ost = 1
## 
##       erwstatus
## syear     1    2
##   1984    0    0
##   1985    0    0
##   1986    0    0
##   1987    0    0
##   1988    0    0
##   1989    0    0
##   1990    0    0
##   1991 1398  739
##   1992 1652  953
##   1993 1836 1048
##   1994 1982 1077
##   1995 2065 1079
##   1996 2104 1156
##   1997 2134 1232
##   1998 2123 1181
##   1999 2147 1140
##   2000 2245 1142
##   2001 2249 1148
##   2002 2579 1235
##   2003 2631 1211
##   2004 2568 1191
##   2005 2547 1213
##   2006 2521 1184
##   2007 2466 1194
##   2008 2378 1124
##   2009 2508 1131
##   2010 2487 1111
##   2011 2460 1062
##   2012 2480 1054
##   2013 2461 1100
##   2014 2528 1094
##   2015 2586 1138
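The R figures above differ from the Stata table, and one plausible contributor is the weighting: `mean()` accepts a `weights =` argument without complaint but ignores it. The sketch below shows the difference on invented toy data, using `weighted.mean()` from base R (stats package) as a possible counterpart to Stata's `[pw=phrf]`. It uses base R only, and the numbers are made up.

```r
# Toy data (invented): two employment-status groups in one survey year.
toy <- data.frame(
  syear     = c(2015, 2015, 2015, 2015),
  erwstatus = c(1, 1, 2, 2),
  breink10  = c(3000, 4000, 1000, 2000),
  phrf      = c(1, 3, 1, 1)
)

full_time <- toy[toy$erwstatus == 1, ]

# mean() swallows `weights` via `...`: this is the plain average.
unweighted <- mean(full_time$breink10, weights = full_time$phrf, na.rm = TRUE)

# weighted.mean() actually applies the weights: (3000*1 + 4000*3) / 4.
weighted <- weighted.mean(full_time$breink10, w = full_time$phrf, na.rm = TRUE)

# Weighted means for every year-by-status cell.
groups <- split(toy, list(toy$syear, toy$erwstatus))
wmeans <- sapply(groups, function(g) {
  weighted.mean(g$breink10, w = g$phrf, na.rm = TRUE)
})

print(c(unweighted = unweighted, weighted = weighted))
print(wmeans)
```

Note that `na.rm = TRUE` in `weighted.mean()` only drops missing values of the income variable; if a weight is missing for an observation with valid income, the result is `NA`, so missing weights would need to be handled beforehand.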