Comments: R file #2

Open
jmarkgraf opened this issue Apr 29, 2016 · 2 comments

Comments

@jmarkgraf
Owner

Hi Malte, here are my comments on the R file:

  • variable "TotalTerms" (line 82-86): the variable does not tell us whether it is the mayor's first term or maybe already her fifth term; this, however, is crucial, as we presume that re-election is more likely the more terms the incumbent has served as mayor.
  • variable "Year" (line 88-89): what is the value added of this variable?
  • variable "Reelection" (line 97-103): our dataset is ordered by incumbent name, not by municipality ID, which leads to a wrong estimation. You need to order it by IDMunicipality (1st order) and election year (2nd order) before you do the steps (this is a problem for all further slides you do!). Furthermore, you first need to exclude one round of each runoff election (either the 1st round that led to the runoff or the 2nd round, which is the outcome of the runoff election).
  • subsetting for time period (line 114-116): a more elegant way to do this is to use the lubridate package, with a command like MayorElection$ElectionDate <- ymd(MayorElection$ElectionDate); subsetting is then more parsimonious with MayorElection <- subset(x = MayorElection, ElectionDate > "2006-01-01") (for all three subsetting steps). But your way works too!
  • subsetting for runoffs (line 118-120): as mentioned above, this comes too late, as you have already slid everything.
  • merging datasets (line 130-137): you need to do that before you subset; otherwise you exclude non-contested elections before you slide the board membership info.
  • other aspects:
    • you might want to slide the gender variable, too.
    • you need to slide the vote share variable of the winner.
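
The ordering and date-based subsetting suggested above could look roughly like this. It is only a minimal sketch on a made-up toy data frame: the column names IDMunicipality, ElectionDate and NameCandidate1 are taken from this thread, all values are invented, and base R's as.Date() stands in for lubridate's ymd() so that the snippet runs without extra packages.

```r
# Toy data; all values are invented for illustration only
MayorElection <- data.frame(
  IDMunicipality = c(2, 1, 2, 1),
  ElectionDate   = c("2008-03-02", "2002-03-03", "2002-03-03", "2008-03-02"),
  NameCandidate1 = c("B", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# Parse the date strings (lubridate::ymd() would do the same job here)
MayorElection$ElectionDate <- as.Date(MayorElection$ElectionDate)

# Order by municipality first, election date second
MayorElection <- MayorElection[order(MayorElection$IDMunicipality,
                                     MayorElection$ElectionDate), ]

# Keep only elections after the cutoff date
Recent <- subset(MayorElection, ElectionDate > as.Date("2006-01-01"))
```

Sorting before any lagging step matters because a lag simply takes the value from the previous row, whatever that row happens to be.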

These are my first comments on the R file. Hope that is helpful for your next steps!

@mberneaud
Collaborator

mberneaud commented Apr 30, 2016

First of all: Thanks for the extensive comments on my file, especially the highlighting of errors in the order of subsetting / merging.

Some clarifications regarding my steps:

variable "TotalTerms" (line 82-86): variable does not tell us whether it is mayor's first term or maybe already her fifth term; this, however, is crucial as we presume that re-election is more likely the more terms the incumbent has served as a mayor.

I agree that this is a real downside of the variable. I did it this way because I couldn't find any way of calculating the number of preceding terms for each row in the data set. I remember you telling me that Christopher added such functionality to his DataCombine package after you discussed this bilaterally. Could you please share the code you used to create the term variable with the GitHub version of DataCombine, either here or in a GitHub Gist, so I can include it in our R source file?

variable "Year" (line 88-89): what is the value added of this variable?

As I didn't convert the date strings into R's date format using lubridate, I extracted the year numbers from the date string and coerced them into an integer to be used in the subsetting. While this might not be the most parsimonious way of doing it, I'll leave it "as is" as it should suffice for our analysis, which is conducted on the year-level.
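
For readers following along, the year extraction described here might look something like the sketch below. The actual layout of the date strings is not shown in this thread, so the "YYYY-MM-DD" format and the example values are assumptions.

```r
# Invented example strings; assumes an ISO-style "YYYY-MM-DD" layout
ElectionDate <- c("2002-03-03", "2008-03-02", "2014-03-16")

# Take the first four characters and coerce them to integer
Year <- as.integer(substr(ElectionDate, 1, 4))

# Year-level subsetting then works with plain integer comparisons
RecentYears <- Year[Year > 2006]
```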

variable "Reelection" (line 97-103): our dataset is not ordered by municipality ID, but by incumbent name, which leads to a wrong estimation - you need to order it by IDMunicipality (1st order) and election year (2nd order) before you do the steps

You're absolutely right about this. Thanks for highlighting this issue. I have moved the code that creates the Reelection variable and all the lagged variables so that it runs after the subsetting of the data set, and I now order the data prior to merging.

you need to first exclude 1 round of the runoff election (either the 1st round that led to the runoff or the 2nd round, which is the outcome of the runoff election)

I have excluded those elections which are coded "3" for ElectionType. This excludes all first rounds where run-offs were necessary and instead uses the result of the run-off election for the year under scrutiny. While this probably overstates the margin by which the candidate won, I consider this better than the danger of falsely declaring as winners those candidates who were leading in the first round but lost in the run-off.
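
As a rough illustration of this exclusion step: the sketch below drops rows coded 3 for ElectionType from a toy data frame. Only the code 3 is described in this thread; the other codes (1 and 2) and all values are placeholders invented for the example.

```r
# Toy data; codes 1 and 2 are placeholders, only code 3 is described above
MayorElection <- data.frame(
  IDMunicipality = c(1, 1, 2),
  ElectionDate   = as.Date(c("2008-03-02", "2008-03-16", "2008-03-02")),
  ElectionType   = c(3, 2, 1)
)

# Drop first rounds that led to a run-off (coded 3), keep everything else
NoFirstRounds <- subset(MayorElection, ElectionType != 3)
```

Note that subset() also drops rows where ElectionType is NA, which may or may not be wanted in the real data.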

you might want to slide the gender variable, too.
you need to slide the vote share variable of the winner.

I've done that in the source code, but you might have just missed it given the length of the document. The code I used is below.

```r
MayorElection <- slide(MayorElection, Var = "VoteShareWinner", TimeVar = "ElectionDate", NewVar = "L.VoteShareWinner")
MayorElection <- slide(MayorElection, Var = "Geschlecht1", TimeVar = "ElectionDate", NewVar = "L.Geschlecht1")
```
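
One thing worth checking in the calls above: they do not name a grouping variable, so the lag may run across municipality boundaries. DataCombine's slide() has a GroupVar argument for this, so adding GroupVar = "IDMunicipality" is probably what is needed. To show what a within-municipality lag should produce, here is a package-free sketch in base R on invented data; it illustrates the intended result and is not the code from our source file.

```r
# Toy data; values are invented for illustration only
MayorElection <- data.frame(
  IDMunicipality  = c(1, 1, 2, 2),
  ElectionDate    = as.Date(c("2002-03-03", "2008-03-02",
                              "2002-03-03", "2008-03-02")),
  VoteShareWinner = c(55, 60, 70, 52)
)

# Sort so that each municipality's elections are in chronological order
MayorElection <- MayorElection[order(MayorElection$IDMunicipality,
                                     MayorElection$ElectionDate), ]

# Lag by one election within each municipality; the first election gets NA
MayorElection$L.VoteShareWinner <- ave(
  MayorElection$VoteShareWinner,
  MayorElection$IDMunicipality,
  FUN = function(x) c(NA, head(x, -1))
)
```

The first election of each municipality gets NA rather than a value borrowed from the municipality listed before it.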

Once again, thanks for the feedback. Your expertise with the data set is really showing here.

@jmarkgraf
Owner Author

Very quickly: here is the code that I used for the term number of mayors. It takes quite some time to run, though:

```r
## Number of Terms of Mayors
MayorElection$FakeVar <- 1
MayorElection <- CountSpell(MayorElection, TimeVar = "ElectionDate",
                            SpellVar = "FakeVar", GroupVar = "NameCandidate1",
                            NewVar = "TermMayor", SpellValue = 1)
```
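
If the runtime becomes a problem, a running count per candidate name can also be sketched in base R. This is an alternative illustration on invented data, not a drop-in replacement: it has not been checked against what CountSpell returns, it identifies mayors by name only (so two different people with the same name would be merged), and it cannot see terms served before the data begin.

```r
# Toy data; values are invented for illustration only
MayorElection <- data.frame(
  NameCandidate1 = c("A", "B", "A", "A", "B"),
  ElectionDate   = as.Date(c("2002-03-03", "2002-03-03", "2008-03-02",
                             "2014-03-16", "2008-03-02")),
  stringsAsFactors = FALSE
)

# Sort by candidate and date so the running count follows time
MayorElection <- MayorElection[order(MayorElection$NameCandidate1,
                                     MayorElection$ElectionDate), ]

# Running count of elections won per candidate name
MayorElection$TermMayor <- ave(
  seq_along(MayorElection$NameCandidate1),
  MayorElection$NameCandidate1,
  FUN = seq_along
)
```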
