The project I am most proud of won me a Data Science Hackathon hosted by the University of Nairobi. I cleaned and restructured Kenyan HIV epidemiological data, then analyzed it to draw out fact-based insights that can help stakeholders decide how best to combat the disease. I also merged the HIV data with contraceptive data, using the county name as the index, to examine the effect of different contraceptives (condoms, IUDs, etc.) on the HIV infection rate in each county.
I did this in Python, using the pandas, matplotlib, and seaborn libraries. Some of the insights I came up with:
1 - Counties with large populations, such as those with big towns or cities, have the highest numbers of HIV-positive patients.
2 - Counties in the Western and Nyanza regions have higher numbers of HIV-positive people. This could be attributed to several factors, e.g. cultural factors such as polygamy, bride inheritance, and superstition, and historical factors such as disinformation about HIV and how it spreads.
3 - Condoms, especially male condoms (the most widely used type), prove effective in curbing the spread of HIV. They should therefore be distributed to high-risk counties as widely as possible.
4 - Condoms are by far the most popular kind of contraceptive.
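
A minimal sketch of the merge step described above, and of one way to compare condom distribution against infection rates. Everything in it is an assumption made for illustration: the county labels, the column names (`hiv_positive`, `population`, `male_condoms`, `iuds`), and the numbers are made-up placeholders, not the hackathon data. Normalising by population and using a Pearson correlation are likewise just one plausible approach, not necessarily the one used in the project.

```python
import pandas as pd

# Made-up placeholder values, purely for illustration (not the real dataset).
hiv = pd.DataFrame(
    {
        "county": ["County A", "County B", "County C", "County D"],
        "hiv_positive": [1200, 300, 800, 150],
        "population": [100000, 50000, 80000, 30000],
    }
).set_index("county")

contraceptives = pd.DataFrame(
    {
        "county": ["County A", "County B", "County C", "County E"],
        "male_condoms": [5000, 9000, 6000, 2000],
        "iuds": [400, 300, 500, 100],
    }
).set_index("county")

# Inner join on the shared county index: only counties present in both
# tables survive, so County D and County E are dropped here.
merged = hiv.join(contraceptives, how="inner")

# Express both measures per 1,000 residents so that large counties
# do not dominate the comparison simply because of their size.
merged["hiv_rate_per_1000"] = merged["hiv_positive"] / merged["population"] * 1000
merged["condoms_per_1000"] = merged["male_condoms"] / merged["population"] * 1000

# Pearson correlation between condom distribution and infection rate.
corr = merged["condoms_per_1000"].corr(merged["hiv_rate_per_1000"])

print(merged[["hiv_rate_per_1000", "condoms_per_1000"]])
print(f"correlation: {corr:.2f}")
```

An inner join is used here so that counties missing from either table do not introduce empty values; an outer join would be the alternative if spotting those gaps were part of the cleaning work.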