@marloz wrote:
I'm working with the ENIGH database (ENIGH stands for "National Survey of Household Income and Expenses" in Spanish). It is a survey conducted by the Mexican government and, like most surveys of its kind, it uses sampling weights.
What I'm trying to do is calculate the mean, maximum and minimum household income by decile. In other words: what is the income of each 10% of households, when households are grouped by their income?
To be honest, I haven't gotten very far, but this is what I have so far:
- I need my svydesign object
- Convert that into a table using svytable
- Arrange using desc() on my income variable
ENIGH_design <- svydesign(id = ~upm, strata = ~est_dis, weights = ~factor_hog, data = ENIGH)
ENIGH_table <- svytable(~ing_cor, ENIGH_design)
Here is where it gets tricky: supposing I have 100 rows, I can't simply take the first 10 of them, because once the weights are taken into account they might represent 9% or 20% (I'm just throwing out numbers) of the actual population.
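To make the point about weights concrete, here is a minimal base R sketch using five rows copied from the glimpse further down. The data frame name `toy` and the column `cum_share` are made up for illustration, and the weight column is called `factor` as in the glimpse. It sorts households by income and accumulates their weights, so each row shows what share of the weighted population lies at or below that income.

```r
# Toy data: five rows copied from the sample shown in this post
toy <- data.frame(
  factor  = c(180, 180, 178, 223, 232),
  ing_cor = c(22723, 17920, 41587, 68042, 190033)
)

# Sort by income, then accumulate the weights
toy <- toy[order(toy$ing_cor), ]

# Cumulative share of the weighted population up to and including each row
toy$cum_share <- cumsum(toy$factor) / sum(toy$factor)

print(toy)
```

Each row is 20% of this toy sample, but the richest household alone carries 232 of the 993 total weight (about 23%), so equal row counts do not correspond to equal population shares.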
I could use cut() on my income variable, but I would be ignoring the weights, and the results would only be representative of the sample, not of the total population.
I think the best approach would be to use a combination of:
- mutate() to create a new variable
- base if() in conjunction with mutate() to define which decile each row falls into
- group_by() and mean() to calculate what I'm aiming for
This way I will have an extra variable that I can use to calculate whatever I want with any other variable I wish. But again, I haven't defined my groups yet, so it's pretty much useless for now.
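One possible way to define the groups is sketched below in base R. It rests on several assumptions: `weighted_decile()` is a hypothetical helper, not part of the survey package or dplyr; a household is placed in decile k when the cumulative weight share of households at or below its income falls in ((k-1)/10, k/10]; and the weight column is named `factor` as in the glimpse (the svydesign call above uses `factor_hog`). The official ENIGH decile methodology may differ, and this only gives point estimates, with no standard errors from the survey design.

```r
# Hypothetical helper: assign each row to a weighted decile (1..10)
weighted_decile <- function(x, w) {
  ord <- order(x)                        # row positions sorted by income
  cum_share <- cumsum(w[ord]) / sum(w)   # cumulative weighted population share
  dec <- integer(length(x))
  # Small epsilon guards against floating point noise at exact boundaries
  k <- as.integer(ceiling(cum_share * 10 - 1e-9))
  dec[ord] <- pmax(1L, pmin(10L, k))
  dec
}

# The 19 sample rows from the glimpse in this post (weights and incomes only)
sample_rows <- data.frame(
  factor  = c(rep(180, 4), rep(178, 5), rep(223, 4), rep(232, 3), 171, 184, 184),
  ing_cor = c(22723, 17920, 27506, 56236, 41587, 135437, 62386, 103502, 27323,
              68042, 98537, 53237, 132861, 190033, 28654, 74408, 80761, 38640,
              81672)
)

sample_rows$decile <- weighted_decile(sample_rows$ing_cor, sample_rows$factor)

# Weighted mean, minimum and maximum income within each decile
summary_by_decile <- do.call(rbind, lapply(
  split(sample_rows, sample_rows$decile),
  function(d) data.frame(
    decile      = d$decile[1],
    mean_income = weighted.mean(d$ing_cor, d$factor),
    min_income  = min(d$ing_cor),
    max_income  = max(d$ing_cor)
  )
))

print(summary_by_decile)
```

With only 19 rows some deciles may come out empty; on the full data set each decile should hold roughly 10% of the weighted households. The same `decile` column could also be created inside mutate() and then used with group_by().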
Thank you for reading. Thank you for your help.
Database available: https://www.inegi.org.mx/programas/enigh/nc/2016/default.html#Datos_abiertos
Here is a glimpse of how my DB looks:
folioviv   foliohog  ubica_geo  est_dis  upm  factor  ing_cor
100587003  1         10010000   2        610  180     22,723
100587004  1         10010000   2        610  180     17,920
100587005  1         10010000   2        610  180     27,506
100587006  1         10010000   2        610  180     56,236
100605201  1         10010000   2        620  178     41,587
100605202  1         10010000   2        620  178     135,437
100605203  1         10010000   2        620  178     62,386
100605205  1         10010000   2        620  178     103,502
100605206  1         10010000   2        620  178     27,323
100606301  1         10010000   3        630  223     68,042
100606302  1         10010000   3        630  223     98,537
100606305  1         10010000   3        630  223     53,237
100606306  1         10010000   3        630  223     132,861
100609801  1         10010000   3        640  232     190,033
100609802  1         10010000   3        640  232     28,654
100609805  1         10010000   3        640  232     74,408
100631401  1         10010000   1        650  171     80,761
100711503  1         10010000   1        770  184     38,640
100711504  1         10010000   1        770  184     81,672
There are many more columns, but they aren't necessary for this exercise.