I theorized that the biggest reason this happened is cause of "old-money". This means that people that have accumulated wealth over the years can pass it over to their children and make heavy investments in their future, subsidizing in the process the future opportunities of those that are not this fortunate. If you look around you (in your university circles or the internet), you can notice that those that are educated in the greatest universities come from families with established wealth. This is mostly due to the fact that colleges have adopted business model and prefer money over bright students (or prefer money more than a poor bright kid, I am not implying that wealthy kids are not smart). This increases the opportunities of wealthy families keeping their wealth in the future while decreasing the opportunities of poorer families to get richer and "make it".
For this reason, I created a model based on real world data that simulates a N people population each having a set of characteristics and then i did what computers do best and simulated this population through time to see if inequality will change if we let people inherit the money of their parents. Inheritance has a lot more to do with the money one's parents have while they are alive and lot less while they are dead but in this way we can still get notable results while still staying true to the reality.
The results of both the model and the simulation follow in the lines of code below. I have to be clear that this project was made for fun and for me to try to test my Python, statistics and R skills while finding out something I was really curious about. For this reason, I INSIST that you do take what is described with a big grain of salt as it is only a model of society and does not try to replicate every single aspect of reality. In the future I hope I improve on both the model and the simulation, adding more features and making it as close to reality as I can.
Creating a python module with functions that statistically calculate the following variables of a person
**Sex, Age, Social Class, Income, Marital Status, Capital, Children Status **
To achieve this I used object-oriented programming to create a Population comprised of objects of the class Person.
The python code for this class is given in the following cell. While in the one below it I describe the attributes and methods used in it.
class Person:
def __init__(self,gender, social_class, age, income, capital, marital_status):
global id_count
self.id = id_count
self.social_class = social_class
self.gender = gender
self.age = age
self.income = income
self.husband_wife = 0
self.children_id = []
self.mother_id = 0
self.father_id = 0
self.children_status = 0
self.marital_status = marital_status
self.capital = capital
id_count += 1
def get_married(self, id):
self.marital_status = 1
self.husband_wife = id
def assign_child(self, id):
self.children_status = 1
self.children_id.append(id)
def assign_parents(self, father_id, mother_id):
self.father_id = father_id
self.mother_id = mother_id
def set_social_class(self, new_class):
self.social_class = new_class
#def update_income(self):
def update_capital(self,costs):
self.capital += self.income
self.capital -= costs
def increment_age(self, increment):
self.age += increment
def pass_capital(self, cap, siblings):
self.capital += (cap/siblings)
def __repr__(self):
return "<Person ID:%s, Gender:%s, Social Class:%s, Age:%s, Income:%s, Capital:%s, Husband/Wife ID:%s Children: %s Father: %s Mother: %s>" % (self.id, self.gender,
self.social_class, self.age, self.income,self.capital, self.husband_wife, self.children_id, self.father_id, self.mother_id)
Variable Name | Description |
ID |
An identification number that distinguishes |
Gender |
The person's gender, can be male or female. |
Age |
The person's age is an integer in [0,100] |
Social Class |
The person's social class is an integer in [1,5]. Concretely, we have 5 classes.
 |
Income |
The yearly income of a person, which depends on his age and gender (Figure 1) and his social class (Figures 2-6). As we can see in Figure 1 the income to age distribution follows a normal Gaussian distribution, which means that a person's income is on it's highest when he is middle aged. Moreover we can see from the data that women's income is lower than that of men. The data used to create the income distribution's was provided by IPUMS and the tidying of the data was done in R. Specifically, I fitted the data in 5 different distributions (Figures 2-6), one for each social class. |
Capital |
The capital accumulated by each person. It depends on the income and the age of each person. More can be found about the heuristic used, in Charles Farell's book: "Your Money Ratios: 8 Simple Tools for Financial Security" 2 Â |
Husband_wife | Variable that contains the id of the spouse of the individual. |
Children_id | List that contains the ids of the children of the individual. |
Mother_id | Id of mother. |
Father_id | Id of father. |
Children_status | Whether or not the person has a child or more. |
Marital_status | Whether or not the person is married. |
Â
Procedure Name | Description |
set_social_class(new_class)Â | Â Assigns new class to the individual. |
get_married(id) | Sets spouse of individual to ID and marital_status to 1. |
assign_child(id) | Sets children_status to 1 and appends id to children_id list. |
assign_parents(father_id, mother_id) | Sets the parents' ids. |
increment_age(increment) | Increments age by a number. |
update_capital(cap, siblings) | Updates a person's capital after his/her parent has died. It adds it to the existing one after dividing it by the number of children in the family. |
First, to create a person we need to make it a he or she. So we will be taking a random binary probability of the sex being 0 for males and 1 for females.
import numpy as np
np.random.seed(1)
gender = np.random.randint(2)
gender
1
Next we will define that person's age based on his/her gender and according to the UN's data (http://data.un.org/Data.aspx?d=POP&f=tableCode%3A22)
The age data follows this dataframe:
import pandas as pd
ages = pd.read_csv("data/ages.csv")
ages_df = pd.DataFrame(ages)
ages_df
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
Country or Area | Year | Area | Sex | Age | Record Type | Reliability | Source Year | Value | Value Footnotes | |
---|---|---|---|---|---|---|---|---|---|---|
0 | United States of America | 2015 | Total | Both Sexes | Total | Estimate - de jure | Final figure, complete | 2016.0 | 321418820.0 | 1,2 |
1 | United States of America | 2015 | Total | Both Sexes | 0 | Estimate - de jure | Final figure, complete | 2016.0 | 3978038.0 | 1,2 |
2 | United States of America | 2015 | Total | Both Sexes | 0 - 4 | Estimate - de jure | Final figure, complete | 2016.0 | 19907281.0 | 1,2 |
3 | United States of America | 2015 | Total | Both Sexes | 1 | Estimate - de jure | Final figure, complete | 2016.0 | 3968564.0 | 1,2 |
4 | United States of America | 2015 | Total | Both Sexes | 1 - 4 | Estimate - de jure | Final figure, complete | 2016.0 | 15929243.0 | 1,2 |
5 | United States of America | 2015 | Total | Both Sexes | 2 | Estimate - de jure | Final figure, complete | 2016.0 | 3966583.0 | 1,2 |
6 | United States of America | 2015 | Total | Both Sexes | 3 | Estimate - de jure | Final figure, complete | 2016.0 | 3974061.0 | 1,2 |
7 | United States of America | 2015 | Total | Both Sexes | 4 | Estimate - de jure | Final figure, complete | 2016.0 | 4020035.0 | 1,2 |
8 | United States of America | 2015 | Total | Both Sexes | 5 | Estimate - de jure | Final figure, complete | 2016.0 | 4018158.0 | 1,2 |
9 | United States of America | 2015 | Total | Both Sexes | 5 - 9 | Estimate - de jure | Final figure, complete | 2016.0 | 20487176.0 | 1,2 |
10 | United States of America | 2015 | Total | Both Sexes | 6 | Estimate - de jure | Final figure, complete | 2016.0 | 4019207.0 | 1,2 |
11 | United States of America | 2015 | Total | Both Sexes | 7 | Estimate - de jure | Final figure, complete | 2016.0 | 4148360.0 | 1,2 |
12 | United States of America | 2015 | Total | Both Sexes | 8 | Estimate - de jure | Final figure, complete | 2016.0 | 4167887.0 | 1,2 |
13 | United States of America | 2015 | Total | Both Sexes | 9 | Estimate - de jure | Final figure, complete | 2016.0 | 4133564.0 | 1,2 |
14 | United States of America | 2015 | Total | Both Sexes | 10 | Estimate - de jure | Final figure, complete | 2016.0 | 4121289.0 | 1,2 |
15 | United States of America | 2015 | Total | Both Sexes | 10 - 14 | Estimate - de jure | Final figure, complete | 2016.0 | 20622330.0 | 1,2 |
16 | United States of America | 2015 | Total | Both Sexes | 11 | Estimate - de jure | Final figure, complete | 2016.0 | 4130328.0 | 1,2 |
17 | United States of America | 2015 | Total | Both Sexes | 12 | Estimate - de jure | Final figure, complete | 2016.0 | 4101021.0 | 1,2 |
18 | United States of America | 2015 | Total | Both Sexes | 13 | Estimate - de jure | Final figure, complete | 2016.0 | 4084306.0 | 1,2 |
19 | United States of America | 2015 | Total | Both Sexes | 14 | Estimate - de jure | Final figure, complete | 2016.0 | 4185386.0 | 1,2 |
20 | United States of America | 2015 | Total | Both Sexes | 15 | Estimate - de jure | Final figure, complete | 2016.0 | 4249742.0 | 1,2 |
21 | United States of America | 2015 | Total | Both Sexes | 15 - 19 | Estimate - de jure | Final figure, complete | 2016.0 | 21108903.0 | 1,2 |
22 | United States of America | 2015 | Total | Both Sexes | 16 | Estimate - de jure | Final figure, complete | 2016.0 | 4184296.0 | 1,2 |
23 | United States of America | 2015 | Total | Both Sexes | 17 | Estimate - de jure | Final figure, complete | 2016.0 | 4194286.0 | 1,2 |
24 | United States of America | 2015 | Total | Both Sexes | 18 | Estimate - de jure | Final figure, complete | 2016.0 | 4217995.0 | 1,2 |
25 | United States of America | 2015 | Total | Both Sexes | 19 | Estimate - de jure | Final figure, complete | 2016.0 | 4262584.0 | 1,2 |
26 | United States of America | 2015 | Total | Both Sexes | 20 | Estimate - de jure | Final figure, complete | 2016.0 | 4363440.0 | 1,2 |
27 | United States of America | 2015 | Total | Both Sexes | 20 - 24 | Estimate - de jure | Final figure, complete | 2016.0 | 22739313.0 | 1,2 |
28 | United States of America | 2015 | Total | Both Sexes | 21 | Estimate - de jure | Final figure, complete | 2016.0 | 4456790.0 | 1,2 |
29 | United States of America | 2015 | Total | Both Sexes | 22 | Estimate - de jure | Final figure, complete | 2016.0 | 4529472.0 | 1,2 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
342 | United States of America | 2015 | Total | Female | 78 | Estimate - de jure | Final figure, complete | 2016.0 | 840960.0 | 1,2 |
343 | United States of America | 2015 | Total | Female | 79 | Estimate - de jure | Final figure, complete | 2016.0 | 803984.0 | 1,2 |
344 | United States of America | 2015 | Total | Female | 80 | Estimate - de jure | Final figure, complete | 2016.0 | 772956.0 | 1,2 |
345 | United States of America | 2015 | Total | Female | 80 - 84 | Estimate - de jure | Final figure, complete | 2016.0 | 3386676.0 | 1,2 |
346 | United States of America | 2015 | Total | Female | 81 | Estimate - de jure | Final figure, complete | 2016.0 | 693056.0 | 1,2 |
347 | United States of America | 2015 | Total | Female | 82 | Estimate - de jure | Final figure, complete | 2016.0 | 671319.0 | 1,2 |
348 | United States of America | 2015 | Total | Female | 83 | Estimate - de jure | Final figure, complete | 2016.0 | 640387.0 | 1,2 |
349 | United States of America | 2015 | Total | Female | 84 | Estimate - de jure | Final figure, complete | 2016.0 | 608958.0 | 1,2 |
350 | United States of America | 2015 | Total | Female | 85 | Estimate - de jure | Final figure, complete | 2016.0 | 582641.0 | 1,2 |
351 | United States of America | 2015 | Total | Female | 85 - 89 | Estimate - de jure | Final figure, complete | 2016.0 | 2422101.0 | 1,2 |
352 | United States of America | 2015 | Total | Female | 86 | Estimate - de jure | Final figure, complete | 2016.0 | 522563.0 | 1,2 |
353 | United States of America | 2015 | Total | Female | 87 | Estimate - de jure | Final figure, complete | 2016.0 | 487256.0 | 1,2 |
354 | United States of America | 2015 | Total | Female | 88 | Estimate - de jure | Final figure, complete | 2016.0 | 440879.0 | 1,2 |
355 | United States of America | 2015 | Total | Female | 89 | Estimate - de jure | Final figure, complete | 2016.0 | 388762.0 | 1,2 |
356 | United States of America | 2015 | Total | Female | 90 | Estimate - de jure | Final figure, complete | 2016.0 | 347555.0 | 1,2 |
357 | United States of America | 2015 | Total | Female | 90 - 94 | Estimate - de jure | Final figure, complete | 2016.0 | 1261957.0 | 1,2 |
358 | United States of America | 2015 | Total | Female | 91 | Estimate - de jure | Final figure, complete | 2016.0 | 297448.0 | 1,2 |
359 | United States of America | 2015 | Total | Female | 92 | Estimate - de jure | Final figure, complete | 2016.0 | 246825.0 | 1,2 |
360 | United States of America | 2015 | Total | Female | 93 | Estimate - de jure | Final figure, complete | 2016.0 | 205146.0 | 1,2 |
361 | United States of America | 2015 | Total | Female | 94 | Estimate - de jure | Final figure, complete | 2016.0 | 164983.0 | 1,2 |
362 | United States of America | 2015 | Total | Female | 95 | Estimate - de jure | Final figure, complete | 2016.0 | 125996.0 | 1,2 |
363 | United States of America | 2015 | Total | Female | 95 - 99 | Estimate - de jure | Final figure, complete | 2016.0 | 366919.0 | 1,2 |
364 | United States of America | 2015 | Total | Female | 96 | Estimate - de jure | Final figure, complete | 2016.0 | 90302.0 | 1,2 |
365 | United States of America | 2015 | Total | Female | 97 | Estimate - de jure | Final figure, complete | 2016.0 | 69285.0 | 1,2 |
366 | United States of America | 2015 | Total | Female | 98 | Estimate - de jure | Final figure, complete | 2016.0 | 47272.0 | 1,2 |
367 | United States of America | 2015 | Total | Female | 99 | Estimate - de jure | Final figure, complete | 2016.0 | 34064.0 | 1,2 |
368 | United States of America | 2015 | Total | Female | 100 + | Estimate - de jure | Final figure, complete | 2016.0 | 61886.0 | 1,2 |
369 | footnoteSeqID | Footnote | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
370 | 1 | Excluding U.S. Armed Forces overseas and civil... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
371 | 2 | Postcensal estimates. | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
372 rows Ă— 10 columns
The Age column contains data in bins of 5, but in the get_age_percents() function, we convert the bins to bins of 5 and then calculate the probability density function of the data manually.
def get_age_percents():
ages = pd.read_csv("data/ages.csv")
ages_df = pd.DataFrame(ages)
starting_columns = ["Total", "0 - 4", "5 - 9", "10 - 14", "15 - 19", "20 - 24", "25 - 29", "30 - 34", "35 - 39", "40 - 44", "45 - 49"
, "50 - 54", "55 - 59", "60 - 64", "65 - 69", "70 - 74", "75 - 79", "80 - 84", "85 - 89", "90 - 94", "95 - 99", "100 +"]
final_columns = ["Total", "0 - 9", "10 - 19", "20 - 29", "30 - 39", "40 - 49", "50 - 59", "60 - 69", "70 - 79", "80 - 89", "90 - 99", "100 +"]
# ages_1 = ages_df["Value"].loc[(ages_df["Age"].isin(starting_columns[5:7])) & (ages_df["Sex"] == "Male")]
# print ages_1
ages_m = []
ages_f = []
t = 0
for i in range(1,11):
ages_m .append(sum(ages_df["Value"].loc[(ages_df["Age"].isin(starting_columns[i+t:i+t+2])) & (ages_df["Sex"] == "Male")]))
ages_f .append(sum(ages_df["Value"].loc[(ages_df["Age"].isin(starting_columns[i+t:i+t+2])) & (ages_df["Sex"] == "Female")]))
t = t + 1
total_m = sum(ages_df["Value"].loc[(ages_df["Age"] == "Total") & (ages_df["Sex"] == "Male")])
hundred_m = sum(ages_df["Value"].loc[(ages_df["Age"] == "100 +") & (ages_df["Sex"] == "Male")])
total_f = sum(ages_df["Value"].loc[(ages_df["Age"] == "Total") & (ages_df["Sex"] == "Female")])
hundred_f = sum(ages_df["Value"].loc[(ages_df["Age"] == "100 +") & (ages_df["Sex"] == "Female")])
ages_m .append(hundred_m)
ages_f .append(hundred_f)
#Get percents of ages
import numpy as np
ages_m = np.array(ages_m)
ages_f = np.array(ages_f)
ages_m_percent = (ages_m/total_m)*100
ages_f_percent = (ages_f/total_f)*100
ages_m_percent = ["%.2f"%item for item in ages_m_percent ]
ages_f_percent = ["%.2f"%item for item in ages_f_percent ]
return ages_m_percent, ages_f_percent
When we have the p.d.f of the ages, we will use a get_age function to get a random age based on the gender of the individual like this:
import ages
age = ages.get_age(gender)
age
8
Then we have to find the Social Class of the individual which follows the Thompson, Hickey 2005 model I described in the table above.
Concretely:
import scipy.stats as sts
def get_class():
xk = np.arange(5)
pk = (0.2, 0.32, 0.32, 0.15, 0.01)
custm = sts.rv_discrete(name='custm', values=(xk, pk))
return custm.rvs() + 1
social_class = get_class()
social_class
1
I used R and some data collected from IPUMS.org, to achieve my goal. Then I used some R code to divide the data instances to the classes based on the proposed model[1], after I analysed the income to the gender and age of the people as shown in Figure 1.
Figure 1:Age to Income Scatterplot (Blue:Male, Red: Female)
With the following R code I divided the data I gathered from IPUMS.org to social classes based on the model proposed by Thompson and Hickey[1].
df <- read.csv(file="cps_00005.csv", header=TRUE, sep=",")
#Remove nan, d$INCTOT == 99999999, d$INCTOT == 99999998,
df <- na.omit(df)
df <- df[!(df$FTOTVAL == 99999999 | df$FTOTVAL == 99999998),]
df <- df[!(df$INCTOT == 99999999 | df$INCTOT == 99999998),]
df <- df[order(df$FTOTVAL),]
#We get the 5 classes. According to William Thompson & Joseph Hickey, 2005 we have
lc = (floor(.2*nrow(df)))
wc = (floor(.52*nrow(df)))
lmc = (floor(.84*nrow(df)))
umc = (floor(.99*nrow(df)))
uc = (floor(.01*nrow(df)))
lower_class = df[1:lc,]
working_class = df[(lc+1):wc,]
lower_middle_class = df[(wc+1):lmc,]
upper_middle_class = df[(lmc+1):umc,]
upper_class = df[(umc+1):nrow(df),]
#print(nrow(lower_class)/nrow(df)*100) -> 20%
#print(nrow(working_class)/nrow(df)*100) -> 32%
#print(nrow(lower_middle_class)/nrow(df)*100) -> 32%
#print(nrow(upper_middle_class)/nrow(df)*100) -> 15%
#print(nrow(upper_class)/nrow(df)*100) -> 1%
print(fitdist(lower_class$INCTOT, distr = "unif", method = "mme"))
print(fitdist(working_class$INCTOT, distr = "norm", method = "mme"))
print(fitdist(lower_middle_class$INCTOT, distr = "norm", method = "mme"))
print(fitdist(upper_middle_class$INCTOT, distr = "norm", method = "mme"))
print(fitdist(upper_class$INCTOT, distr = "gamma", method = "mme"))
Figure 2:Lower Class Distribution fit
Â
Figure 3:Working Class Distribution fit
Â
Figure 4:Lower Middle Class Distribution fit
Â
Figure 5:Upper Middle Class Distribution fit
Â
Turns out that the last 5 distributions give good enough results to model a Generation 0 population. To be honest, I tried creating a machine learning model to try and predict the Income, but the nature of the data(Age, social class and sex) didn't help that much in my endeavor. However, the code for the model created can be found in the stats.py under the name get_income2. But when we will simulate the generations, we will have to keep in mind that men tend to have higher incomes than women and that the age-income distribution is a Gaussian
 References:
1.Thompson, W. & Hickey, J. (2005). Society in Focus. Boston, MA: Pearson, Allyn & Bacon; Beeghley, L. (2004). The Structure of Social Stratification in the United States. Boston, MA: Pearson, Allyn & Bacon.
2.Farell, C. (2018). How The Ratios Work « Your Money Ratios. [online] Yourmoneyratios.com. Available at: http://www.yourmoneyratios.com/how-the-ratios-work/ [Accessed 16 Aug. 2018].
Â
According to Figures 2-6, I fit each social class' income to the following distributions respectively:
**Lower Class**: Uniform
**Working Class**: Normal
**Lower Middle Class**: Normal
**Upper Middle Class**: Normal
**Upper Class**: Gamma
The python function to get the income based on the social class is the following:
*I did not take into account the age or gender of the individual as far as the model goes, in the simulation process later I took this variables into account. The reason for this is that I did not have enough data to make something valuable and that the nominal variables *Gender, Social Class, will hardly provide a good Income prediction in a machine learning algorithm, as the prediction value is nominal. In testing this theory I found out I was right as the maximum correlation i got was about 0.5.
def get_income(social_class):
#Got classes from https://en.wikipedia.org/wiki/Household_income_in_the_United_States Social_Class
#lower_class = 1 (14-20%) , working_class = 2 (20 -52%), lower_middle_class = 3 (52-84%), upper_middle_class= 4 (84-99%), upper_class = 5 (99-100%)
if(social_class == 1):
inc = np.random.uniform(-3744.26,15749.78)
elif(social_class == 2):
inc = np.random.normal(15879.603, 9528.182)
elif(social_class == 3):
inc = np.random.normal(27564.15, 19854.43)
elif(social_class == 4):
inc = np.random.normal(55957.25, 48711.24)
else:
inc = np.random.gamma(1/0.6346247, 1/0.000002982905)
return inc
# ro.r.source("income.R")
# ro.r('''source("income.R")''')
# inc = ro.globalenv['income']
# x = inc(18,2,3)
# return x
income = get_income(social_class)
if(age<15):
income = 0
income
0
To compute a person's capital we used Charles Farell's heuristic[2] (I won't be posting heuristic code as they are quite long and trivial)
import stats
capital = stats.get_capital(age, social_class, income)
capital
0.0
We then compute other statistical relevant features as if the person is married or not and if he/she is the number of children they have.
The functions used are get_marital_status(gender, age), get_children_status(mother, father),get_number_of_children are heuristics based on real life data.[3,4,5]
from family import *
marital_status = family.get_marital_status(gender,age) #1 for yes, 0 for no
marital_status
0
To get the children status we will first have to create the Person object in Python.
person = Person(gender, social_class, age, income, capital, marital_status)
print person
<Person ID:1, Gender:1, Social Class:1, Age:8, Income:0, Capital:0.0, Husband/Wife ID:0 Children: [] Father: 0 Mother: 0>
The model works in the following way. For as long as a people's list called population does not exceed N, we will be creating people. If a person is married than we will create his/her partner and his/her partner's parents, along with his/her own. Then it runs a probability of the couple having kids and creates kids for them according to a probability and adds all these people in the first Generation of the simulation.
def model(N):
population = []
while( len(population) < N ):
# Create a person, if it aged more than 18, run a probability of it being married, else just give his parents
# If he/she is create a partner for him/her and give them a child
# If it's a male the social class has chances of being lower, females on the other hand, tend to marry up
gender = np.random.randint(2)
social_class = get_class()
age = ages.get_age(gender)
marital_status = 0
partner = 0
if(age < 15):
inc = 0
else:
inc = get_income(social_class)
cap = get_capital(age, social_class, inc)
person = Person(gender, social_class, age, inc, cap, marital_status)
#print ( "Created person-id %d" % person.id)
marital_status = get_marital_status(gender,age)
if(marital_status == 1):
#Create another human being with different gender, if male then social class is same or lower,
#if female marry only to same class
partner = create_partner(person)
#Give parents to partner
partner_mother, partner_father = create_parents(partner)
if(partner_mother!=0):
partner.mother_id = partner_mother.id
population.append(partner_mother)
if(partner_father!=0):
partner.father_id = partner_father.id
population.append(partner_father)
#Marry Them
population.append(partner)
person.get_married(partner.id)
partner.get_married(person.id)
#Check for and give them Children
if(person.gender == 1):
m = person
f = partner
else:
m = partner
f = person
#m = mother, f = father of child
if(get_children_status(m, f)):
no_of_children = get_number_of_children(m,f)
if(no_of_children is None):
no_of_children = 1
for _ in range(no_of_children):
child = create_child(person, partner, 1)
if (child!=0):
partner.assign_child(child.id)
person.assign_child(child.id)
child.assign_parents(f.id, m.id)
population.append(child)
mother, father = create_parents(person)
population.append(person)
if(mother!=0):
person.mother_id = mother.id
population.append(mother)
if(father!=0):
person.father_id = father.id
population.append(father)
return population
The results of the model for a population of a 1000,2000 and 5000 people, in respect to the class-capital distribution are the following
import numpy as np
np.random.seed(1)
population1 = model(1000)
population2 = model(2000)
population3 = model(5000)
def plot_capital_to_class(population, N):
# Check the capital to social class distribution
capital = []
total_cap = 0
class_size = [0] * 5
class_cap = [0] * 5
tot_inc = 0
class_inc = [0] *5
for p in population:
capital.append(p.capital)
total_cap += p.capital
tot_inc +=p.income
class_size[p.social_class-1] += 1
class_cap[p.social_class-1] += p.capital
class_inc[p.social_class-1] +=p.income
class_inc = np.array(class_inc)
print "Income:"
print (class_inc/tot_inc) *100
class_cap = np.array(class_cap)
print "Capital:"
arr = (class_cap/total_cap)*100
print arr
class_size = np.array(class_size)
print "Population:"
pop = (class_size/float(N))*100
print pop
print "Capital:Population ratio"
print (arr/pop)
import matplotlib.pyplot as plt
ind = np.arange(5)
plt.bar(ind, arr, align= 'center', width = 0.5)
plt.xticks(ind,("Lower Class", "Working Class", "Lower Middle Class", "Upper Middle Class", "Upper class") )
plt.ylabel("Capital")
plt.title("Capital distribution per social class (Simulation)")
plt.tight_layout()
plt.ylim(0.0,60.0)
plt.show()
return capital
cap1 = plot_capital_to_class(population1,1000)
cap2 = plot_capital_to_class(population2,2000)
cap3 = plot_capital_to_class(population3,5000)
Income:
[ 3.25754056 17.64907108 27.34955917 36.03464325 15.70918595]
Capital:
[ 2.94931656 14.77192988 27.60128711 35.05683699 19.62062946]
Population:
[16.9 33.3 31.1 18. 1.4]
Capital:Population ratio
[ 0.17451577 0.4436015 0.88750119 1.94760205 14.01473533]
Income:
[ 3.60685866 17.54332042 32.38809354 26.961163 19.50056438]
Capital:
[ 2.92336124 14.31667082 31.78975533 29.25399279 21.71621981]
Population:
[17.95 32.5 33.4 14.65 1.55]
Capital:Population ratio
[ 0.16286135 0.44051295 0.95178908 1.99685958 14.01046439]
Income:
[ 4.81334394 17.38036787 33.43219905 28.23697775 16.13711139]
Capital:
[ 3.92188299 14.30665683 32.61451713 29.31294689 19.84399616]
Population:
[21.52 29.48 33.76 14.1 1.16]
Capital:Population ratio
[ 0.18224363 0.48530044 0.96606982 2.0789324 17.10689324]
As we can see, the upper class despite it's size, has a substantial percentage of the capital of the population, which increases with the increase of the size of the population. This can be further proved by examining the capital to population ratio of each class which is proportionally larger for the upper class.
A good estimator to give also would be the gini coefficient of our population[5],which is defined as:
To do this we create a gini_index function that takes as input a numpy array and returns the single index of capital in our population.
def gini_index(capital):
capital = capital.flatten()
if np.amin(capital) < 0:
# Values cannot be negative:
capital -= np.amin(capital)
# Values cannot be 0:
capital += 0.0000001
# Values must be sorted:
capital = np.sort(capital)
# Index per array element:
index = np.arange(1,capital.shape[0]+1)
# Number of array elements:
n = capital.shape[0]
# Gini coefficient:
return ((np.sum((2 * index - n - 1) * capital)) / (n * np.sum(capital)))
print gini_index(np.array(cap1))
print gini_index(np.array(cap2))
print gini_index(np.array(cap3))
0.19617587628517835
0.18205903508393706
0.12080484386107655
These numbers show that the population's we created have diminishing gini indexes as the population get bigger (and thus more representative of reality), making the model pretty unequal
Now that we have our unequal population we will simulate the one with 1000 people, to try to check how more unequal it will be in the future, if each dying parent passes his/her capital to his/her children.
The code that will perform the simulation is the following and it is programmed to run for as many years as we want. In each year it will output the number of people in the population as well as the statistics provided before.
def marriage_query(person,partner):
if(partner.marital_status == 0):
if(partner.gender != person.gender):
if(person.gender == 0):
#if(partner.age in xrange((person.age/2) + 7, person.age + 8)):
if(partner.social_class <= person.social_class):
person_marital_status = get_marital_status(person.gender, person.age)
partner_marital_status = get_marital_status(partner.gender, partner.age)
if((person_marital_status) & (partner_marital_status)):
return 1
else:
return 0
#else:
# return 0
else:
#if(partner.age in xrange((person.age) , person.age + 21)):
person_marital_status = get_marital_status(person.gender, person.age)
partner_marital_status = get_marital_status(partner.gender, partner.age)
if((person_marital_status) & (partner_marital_status)):
return 1
# else:
#return 0
else:
return 0
else:
return 0
# This function updates the people's classes after each iteration
def update_class(population):
# Sort ascending based on capital
population.sort(key=lambda x: x.capital)
# Get classes
T = len(population)
l = int(0.2 * T)
w = int(0.32 * T)
lm = int(0.32 * T)
um = int(0.15 * T)
u = int(0.01 * T)
# Set new social class
for i, p in enumerate(population):
if(i <= l):
p.social_class = 1
elif(i <= l+w):
p.social_class = 2
elif(i <= l+w+lm):
p.social_class = 3
elif(i <= l+w+lm+ um):
p.social_class = 4
else:
p.social_class = 5
return population
def run_simulation(population, years):
for year in range(years):
print len(population)
for p in population:
if(check_death(p) == 0):
p.increment_age(1)
# TODO:
#Create a costs variable from data and initialize that too
#Then adjust income and capital according to the already given data?! or make them interact?!
#ADJUST Income
if(p.age == 15):
p.income = get_income(p.social_class)
else:
#p.update_income()
p.income += p.income * 0.01
costs = 0
p.update_capital(0)
#If not married, run probability of him getting get_married
if(p.marital_status == 0):
#Find a potential marriage partner
partners = [partner for partner in population if(marriage_query(p,partner))]
if(partners):
prtnr = random.choice(partners)
prtnr.get_married(p)
p.get_married(prtnr)
#If married run probability of giving birth
if((p.marital_status == 1)):
for person in population:
if(person.id == p.husband_wife):
if(person.gender == 0):
husband = person
wife = p
else:
wife = person
husband = p
child_status = family.get_children_status(wife, husband)
if(child_status == 1):
child = family.create_child(husband, wife, 0)
husband.assign_child(child.id)
wife.assign_child(child.id)
child.assign_parents(husband.id, wife.id)
population.append(child)
break
else:
#If that person has children, then pass its capital to his/her kids
children = [child for child in population if (child.id in p.children_id) ]
no_of_children = len(children)
for c in children:
c.pass_capital(p.capital, no_of_children)
population = [person for person in population if p.id != person.id]
population = update_class( population)
cap = plot_capital_to_class(population,len(population))
print gini_index(np.array(cap))
run_simulation(population1, 20)
1007
1462
1881
2277
2650
3043
3404
3723
4028
4379
4688
4982
5263
5543
5820
6093
6358
6621
6856
7104
Income:
[ 1.96802942 20.26639224 34.34630328 30.3449897 13.07428536]
Capital:
[-2.49912499 9.9415439 33.48622741 41.4580972 17.61325647]
Population:
[20.01363327 31.99727335 31.99727335 14.99659168 0.99522836]
Capital:Population ratio
[-0.12487113 0.31069972 1.04653378 2.7645013 17.69770359]
0.18507320299921423
What we see from the above graph is that with the passage of time if capital keeps being passed from parents to children, then the distribution of capital between the social classes, is getting even more unequal with the top classes getting most of the spoils while the lower ones are almost non-existent.
Next goal of this simulation is to remove inheritance from the equation and see if the inequality persists.