stats_interview_questions.md

Probability Questions

  1. You have a Magic: The Gathering deck of 60 cards, which contains cards of the following types:
  • Swamp (15 in deck): Generates 1 mana. One can be played each turn.
  • Dark Ritual (4 in deck): Consumes 1 mana, generates 3 mana. Any number can be played in a turn.
  • Juzam Djinn (4 in deck): Consumes 4 mana.

Your goal on the first turn, in which you draw 7 cards from a randomized deck, is to play a Juzam Djinn. What is the probability that you can play a Juzam Djinn on the first turn?
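One way to check an answer is to count hands exactly. The sketch below assumes that mana accumulates within the turn, so a playable hand needs at least one Swamp, at least two Dark Rituals (1 mana, then 3, then 5), and at least one Juzam Djinn. That reading of the rules, and the helper name `prob_hand_at_least`, are assumptions made for illustration, not part of the original question.

```python
from math import comb

# Deck composition from the question; everything else counts as "other".
DECK = {"swamp": 15, "ritual": 4, "djinn": 4}
DECK_SIZE = 60
HAND_SIZE = 7


def prob_hand_at_least(min_swamp, min_ritual, min_djinn):
    """Exact probability that a 7-card hand holds at least the given counts."""
    other = DECK_SIZE - sum(DECK.values())
    total_hands = comb(DECK_SIZE, HAND_SIZE)
    favourable = 0
    for s in range(min_swamp, DECK["swamp"] + 1):
        for r in range(min_ritual, DECK["ritual"] + 1):
            for j in range(min_djinn, DECK["djinn"] + 1):
                rest = HAND_SIZE - s - r - j
                if rest < 0:
                    continue  # more named cards than the hand can hold
                favourable += (
                    comb(DECK["swamp"], s)
                    * comb(DECK["ritual"], r)
                    * comb(DECK["djinn"], j)
                    * comb(other, rest)
                )
    return favourable / total_hands


# Assumed playability condition: 1+ Swamp, 2+ Dark Rituals, 1+ Juzam Djinn.
p_first_turn_djinn = prob_hand_at_least(1, 2, 1)
print(f"P(first-turn Juzam Djinn) = {p_first_turn_djinn:.4f}")
```

Calling `prob_hand_at_least(0, 0, 0)` sums over every possible hand, which is a quick check that the counting covers the whole sample space.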

  2. Suppose X and Y are two independent normally distributed random variables with unknown means but equal variance. How far apart must the means be so that the mixture distribution (an equal-weight mixture of the distributions of X and Y, as opposed to the distribution of the sum X + Y, which is always normal) is bimodal?
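A numerical check can support the analytic answer here. The sketch below reads "mixture" as an equal-weight mixture of the two normal densities (an assumption about the intended question) and counts local maxima of that density on a grid. For this setup the known threshold is a separation of two standard deviations; the function names are illustrative.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2.0 * pi))


def count_modes(separation, sigma=1.0, n_grid=4001):
    """Count local maxima of 0.5*N(-d/2, sigma^2) + 0.5*N(+d/2, sigma^2)."""
    half = separation / 2.0
    lo, hi = -half - 5.0 * sigma, half + 5.0 * sigma
    xs = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    ys = [
        0.5 * normal_pdf(x, -half, sigma) + 0.5 * normal_pdf(x, half, sigma)
        for x in xs
    ]
    # A grid point is a mode if it rises from the left and does not rise to the right.
    return sum(
        1 for i in range(1, n_grid - 1) if ys[i - 1] < ys[i] >= ys[i + 1]
    )


for d in (1.0, 1.5, 2.5, 3.0):
    print(f"separation = {d} sigma -> {count_modes(d)} mode(s)")
```

Separations very close to two standard deviations produce an almost flat peak, so a grid-based count is unreliable there; the analytic argument is what settles the boundary case.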

Statistics: Things You Must Know

  1. What is a p-value?

  2. What is the law of large numbers?

  3. Suppose I flip a coin, and receive 10 heads in a row. Why does the law of large numbers not tell me that tails is most likely on the next flip?

  4. What is the central limit theorem?

  5. Why is the central limit theorem important? Give a common application.

  6. Describe the process of null hypothesis significance testing.

  7. Suppose that all scientists in the world run all of their experiments as properly formulated null hypothesis tests at a significance level of 5%. What does this say about the errors that they will make, on average?

  8. What is an A/B test? Give a simple example.

  9. What is an A/A test, and why would you run one?

  10. What is an unbiased estimator?

  11. Critique the following statement: All estimators used in statistical tests must be unbiased.

  12. A VP of your company criticises your team: "75% of your experiments do not show statistical significance, so are a waste of money". How would you respond?

  13. At AllFarm insurance company, a new initiative is dreamed up that executives believe will make agents (people who sell their product) more productive. To test this theory, the AllFarm team picks one state (out of 50) to run a test for the procedure. A product manager insists that it is important that the test group includes high-performing agents, as they will generate the most revenue with the new procedures. Please take the role of a data scientist in this meeting.

  14. InternetCompany would like to incentivise its army of car-driving contractors to pick up more customers at peak hours by offering a new routing algorithm to its drivers (the old application is stable and well loved, so InternetCompany does not want to force drivers to switch). To test the efficacy of the app, InternetCompany selects 50% of its drivers totally at random, and sends an email to them with information about the new app, and a download link.

In evaluating the results, Matt the amazing data scientist reasons that drivers who received the link but did not actually download the app cannot benefit, and so re-assigns them to the control group. Please discuss this procedure with Matt.
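A small simulation can make the discussion with Matt concrete. The sketch below is purely hypothetical: it invents a latent "motivation" trait that drives both pickups and the decision to download, and it sets the true effect of the app to zero. All numbers and names are assumptions chosen for illustration, not data from InternetCompany.

```python
import random
from statistics import mean


def simulate(n_drivers=20000, true_effect=0.0, seed=0):
    """Compare the as-randomized contrast with Matt's reassigned contrast."""
    rng = random.Random(seed)
    emailed_group, holdout_group = [], []  # groups as randomized
    matt_treated, matt_control = [], []    # groups after Matt's reassignment
    for _ in range(n_drivers):
        motivation = rng.gauss(0.0, 1.0)         # hypothetical latent trait
        emailed = rng.random() < 0.5             # random 50% assignment
        downloaded = emailed and motivation > 0  # assumed: only motivated drivers install
        pickups = 10.0 + 2.0 * motivation + rng.gauss(0.0, 1.0)
        if downloaded:
            pickups += true_effect
        (emailed_group if emailed else holdout_group).append(pickups)
        (matt_treated if downloaded else matt_control).append(pickups)
    as_randomized = mean(emailed_group) - mean(holdout_group)
    after_reassignment = mean(matt_treated) - mean(matt_control)
    return as_randomized, after_reassignment


as_randomized, after_reassignment = simulate()
print(f"emailed vs. not emailed:        {as_randomized:+.2f} pickups")
print(f"downloaded vs. everyone else:   {after_reassignment:+.2f} pickups")
```

Under these assumptions the comparison of groups as randomized stays near zero, while the reassigned comparison shows a sizeable difference even though the app does nothing, because downloading is no longer random.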

Kahneman's Questions

All families of six children in a city were surveyed. In 72 families the exact birth order of all of their children was GBGBBG.

What is your estimate for the number of families surveyed in which the exact order of births was BGBBBB?
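The arithmetic behind an estimate can be written out directly. The sketch below assumes each birth is independently a boy or a girl with probability 1/2, which is a simplification and not something the survey states; `sequence_probability` is an illustrative helper.

```python
from fractions import Fraction


def sequence_probability(order, p_boy=Fraction(1, 2)):
    """Probability of one exact birth order, assuming independent births."""
    prob = Fraction(1)
    for child in order:
        prob *= p_boy if child == "B" else 1 - p_boy
    return prob


p_observed = sequence_probability("GBGBBG")
p_target = sequence_probability("BGBBBB")

# Scale the observed count by the ratio of the two sequence probabilities.
estimate = 72 * p_target / p_observed
print(p_observed, p_target, estimate)
```

Passing a different `p_boy` shows how the estimate would shift if boys and girls were not equally likely.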

On each round of a game, 20 marbles are distributed at random among five children: Alan, Ben, Carl, Dan, and Ed. Consider the following distributions:

Alan 4, Ben 4, Carl 5, Dan 4, Ed 3

vs.

Alan 4, Ben 4, Carl 4, Dan 4, Ed 4

In many rounds of the game, which distribution will occur more often?
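The two outcomes can be compared with the multinomial formula. The sketch below assumes each marble goes to one of the five children independently and with equal probability, which is one reading of "distributed at random"; the function name is illustrative.

```python
from math import factorial


def multinomial_probability(counts):
    """P(exactly these counts per named child) with equally likely children."""
    n = sum(counts)
    k = len(counts)
    arrangements = factorial(n)
    for c in counts:
        arrangements //= factorial(c)  # each partial quotient is an integer
    return arrangements / k ** n


p_uneven = multinomial_probability([4, 4, 5, 4, 3])
p_even = multinomial_probability([4, 4, 4, 4, 4])
print(f"uneven: {p_uneven:.6f}  even: {p_even:.6f}  ratio: {p_uneven / p_even:.2f}")
```

These probabilities refer to the exact assignment to named children. The unordered pattern "one child with 5, three with 4, one with 3" covers 20 such assignments, so it is a different event from the single uneven distribution listed in the question.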

The average heights of adult males and females in the US are, respectively, 5 ft 10 inches, and 5 ft 4 inches. Both height distributions (males and females) are approximately normal with a standard deviation of about 2.5 inches.

An investigator has selected one population by chance (either males or females) and has drawn a random sample from it.

In which of the following situations is it more likely that the investigator has chosen the male population:

  • The sample consists of a single person whose height is 5 ft 10 inches.
  • The sample consists of 6 people whose average height is 5 ft 8 inches.
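The two situations can be compared through likelihood ratios. The sketch below assumes exactly normal heights with the stated means (70 and 64 inches) and standard deviation, and equal prior odds for the two populations, so that the likelihood ratio equals the posterior odds; `odds_male` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

SIGMA = 2.5                        # standard deviation in inches, from the question
MALE_MEAN, FEMALE_MEAN = 70.0, 64.0  # 5 ft 10 in and 5 ft 4 in


def odds_male(sample_mean, n):
    """Likelihood ratio (male : female) for a sample mean of n people."""
    standard_error = SIGMA / sqrt(n)  # the mean of n heights is less variable
    male = NormalDist(MALE_MEAN, standard_error).pdf(sample_mean)
    female = NormalDist(FEMALE_MEAN, standard_error).pdf(sample_mean)
    return male / female


odds_single = odds_male(70.0, 1)  # one person at 5 ft 10 in
odds_six = odds_male(68.0, 6)     # six people averaging 5 ft 8 in
print(f"single person: {odds_single:.1f} to 1")
print(f"six people:    {odds_six:.1f} to 1")
```

The comparison turns on the standard error shrinking with sample size, which is why the sample mean of six people is evaluated against a narrower distribution than a single height.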

A certain town is serviced by two hospitals. In the larger hospital about 45 babies are born each day, and in the smaller hospital about 15 babies are born each day. As you know, about 50 percent of all babies are boys, though the exact percentage of boys varies day to day, sometimes higher, sometimes lower.

For a period of 1 year, each hospital recorded the days on which more than 60 percent of the babies born were boys. Which hospital do you think recorded more such days?

  • The larger hospital.
  • The smaller hospital.
  • About the same.
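An exact binomial calculation gives a reference point for this question. The sketch below simplifies by assuming exactly 15 and 45 births every day, with each birth independently a boy with probability 1/2; the function name is illustrative.

```python
from math import comb


def prob_more_than_60_percent_boys(n_births):
    """P(strictly more than 60% boys) among n_births fair, independent births."""
    favourable = sum(
        comb(n_births, k)
        for k in range(n_births + 1)
        if 10 * k > 6 * n_births  # integer form of k / n_births > 0.6
    )
    return favourable / 2 ** n_births


p_small = prob_more_than_60_percent_boys(15)
p_large = prob_more_than_60_percent_boys(45)
print(f"smaller hospital: {p_small:.4f} per day, about {365 * p_small:.0f} days a year")
print(f"larger hospital:  {p_large:.4f} per day, about {365 * p_large:.0f} days a year")
```

The integer comparison avoids floating-point trouble at the boundary, since exactly 9 boys out of 15 is 60% and should not count as "more than 60 percent".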