regressions team #13
We didn't talk about this today. Please reply with a link to your file with tables! |
@mromano224 @SebastianStoneham What about regressions? These would be easy to implement from the bank tract data! We can talk on Monday about how! --> c would show issues. y1 = Denial Rate, TractStat1 = Hispanic%. Reg 1: DenialRate = a + b*H + c*{BOW==1} + d*H*{BOW==1} + e |
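For reference, here is a minimal sketch of how Reg 1 could be written with the statsmodels formula API. It runs on synthetic data, not the bank tract file: the column names `hisp_rate`, `which_bank`, `is_bow`, and `denial_rate`, and the simulated coefficient values, are assumptions made only for illustration. In a formula, `hisp_rate * is_bow` expands to both main effects plus the interaction, so the fitted parameters line up with a, b, c, and d in the equation.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols as sm_ols

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in for the bank-tract data (all values are made up).
df = pd.DataFrame({
    "hisp_rate": rng.uniform(0, 100, n),            # H: Hispanic % of the tract
    "which_bank": rng.choice(["BOW", "Other"], n),  # hypothetical bank label
})

# Indicator for {BOW == 1}
df["is_bow"] = (df["which_bank"] == "BOW").astype(int)

# Simulate y with known coefficients: a=10, b=0.05, c=2, d=0.10
df["denial_rate"] = (
    10
    + 0.05 * df["hisp_rate"]
    + 2 * df["is_bow"]
    + 0.10 * df["hisp_rate"] * df["is_bow"]
    + rng.normal(0, 1, n)
)

# 'hisp_rate * is_bow' = hisp_rate + is_bow + hisp_rate:is_bow
reg1 = sm_ols("denial_rate ~ hisp_rate * is_bow", data=df).fit()
print(reg1.params)
```

In the output, `is_bow` corresponds to c (the level shift for BOW) and `hisp_rate:is_bow` corresponds to d (how the slope on Hispanic % differs for BOW).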
You can find the tables we produced in my branch called juan4, the files are in code folder and they are called TablesAZ and TablesCA. |
Cool, I see some numbers you should show Matt Monday in your existing tables...
|
@jum223 @XiaozheZhangLehigh @mromano224 @SebastianStoneham What's the status here? Please be prepared with which tables (and which numbers specifically in the tables) you want to show. Please reply with a link to the files with the tables you want to show. |
You need to update the regression. See the picture above. Reg 1 uses y1 and x1, 2 uses y1 x2, and so on |
Just updated the regressions in regression_1 @donbowen |
Better. Still not right.
|
another update... please lmk, sorry for the confusion @donbowen |
Am I looking at the right file? Good job with the column names... it helped me figure out the key problem.
Here, just restart the file with this (some issues fixed; others I left pointers to):
import pandas as pd
import numpy as np
from statsmodels.formula.api import ols as sm_ols
from statsmodels.iolib.summary2 import summary_col
bank_tract = pd.read_csv('../input_data_clean/bank_tract_clean_WITH_CENSUS.csv')
# adjust this next line to drop the BMO rows
bank_tract = bank_tract.query('which_bank != "BMO"')
# create vars
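bank_tract['hisp_rate'] = (bank_tract['HispanicLatinoPop'] / bank_tract['Tot.Pop']) * 100
# cast to int so the coefficient is labeled 'hisp_over_med' (a bool column shows up as 'hisp_over_med[T.True]')
bank_tract['hisp_over_med'] = (bank_tract['hisp_rate'] > bank_tract['hisp_rate'].median()).astype(int)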
bank_tract['log_num_apps'] = np.log(bank_tract['num_applications'])
# skip all the one-off regressions (just show them all together...)
# regressions
# define the regression models (YOU'LL NEED TO MAKE ONE MORE VARIABLE ABOVE, AND THEN UPDATE THESE TO MATCH FORMULA)
# fit on the full bank_tract sample; the BOW comparison comes from the indicator and interaction terms you add
m1 = sm_ols('denial_rate ~ hisp_rate', data=bank_tract).fit()
m2 = sm_ols('denial_rate ~ hisp_over_med', data=bank_tract).fit()
m3 = sm_ols('log_num_apps ~ hisp_rate', data=bank_tract).fit()
m4 = sm_ols('log_num_apps ~ hisp_over_med', data=bank_tract).fit()
# set up the formatting for the table
info_dict = {'No. observations': lambda x: f"{int(x.nobs):d}"}
float_format = '%0.3f'
# UPDATE THIS AS NEEDED:
regressor_order = ['Intercept', 'hisp_rate', 'hisp_over_med']
# UPDATE THE COLUMN NAMES (just using the y variable in each column works)
table = summary_col(results=[m1, m2, m3, m4],
model_names=['|all banks denial rate reg|',
'|BOW denial rate reg|',
'|all banks log num apps reg|',
'|BOW log num apps reg|'],
regressor_order=regressor_order,
float_format=float_format,
info_dict=info_dict,
stars=True)
table.title = 'OLS Regressions'
# print the table
print(table) |
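As a rough sketch of the "Reg 1 uses y1 and x1, 2 uses y1 x2, and so on" layout, the four models can be built in a loop and passed to `summary_col`. This uses synthetic data, and it assumes y2 is `log_num_apps`, x2 is `hisp_over_med`, and that a 0/1 column named `is_bow` marks Bank of the West rows; none of that is confirmed by the thread, so adjust the names to match the real file.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols as sm_ols
from statsmodels.iolib.summary2 import summary_col

rng = np.random.default_rng(1)
n = 300

# Synthetic stand-in for bank_tract (values are made up for illustration).
demo = pd.DataFrame({
    "hisp_rate": rng.uniform(0, 100, n),
    "is_bow": rng.integers(0, 2, n),               # assumed 0/1 BOW indicator
    "num_applications": rng.integers(1, 200, n),   # at least 1, so the log is finite
})
demo["hisp_over_med"] = (demo["hisp_rate"] > demo["hisp_rate"].median()).astype(int)
demo["log_num_apps"] = np.log(demo["num_applications"])
demo["denial_rate"] = 10 + 0.05 * demo["hisp_rate"] + rng.normal(0, 1, n)

# Order: (y1, x1), (y1, x2), (y2, x1), (y2, x2)
specs = [(y, x)
         for y in ["denial_rate", "log_num_apps"]
         for x in ["hisp_rate", "hisp_over_med"]]

# Each model follows the Reg 1 pattern: y = a + b*x + c*BOW + d*x*BOW + e
models = [sm_ols(f"{y} ~ {x} * is_bow", data=demo).fit() for y, x in specs]

table = summary_col(
    results=models,
    model_names=[f"{y} ({i})" for i, (y, _) in enumerate(specs, start=1)],
    float_format="%0.3f",
    info_dict={"No. observations": lambda m: f"{int(m.nobs):d}"},
    stars=True,
)
print(table)
```

The numbered column names keep every column label unique while still showing the y variable in each column.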
Regression Analysis: |