This document states guidelines for the handling of data and code in the University of Chicago’s Data Science Clinic. Note that any other agreement between the University of Chicago’s DSI and the client (such as a DSI partnership agreement or MOU) supersedes this document.
These guidelines should be considered the “baseline” policies that the Data Science Clinic will abide by. Any deviation from these policies should be communicated to the project team by either the Clinic Administration or the Faculty Mentor on the project.
The Clinic deals with two types of data and code:
- Confidential Data/Code: Any data/code provided to the Clinic by the client or created under the direction of the client for the project. Such data/code should be considered the property of the client. Examples include data provided by the client for analysis, algorithms provided by the client, and work product produced under the direction of the client and mentor during the quarter.
- Non-confidential Data/Code: All other data/code used during the project. This includes public data/code that may be covered by other licenses.
The Faculty Mentor on the project can provide guidance on whether particular data or code is confidential.
For Confidential Data/Code: The only acceptable use is during the project in order to fulfil the goals of the project. This activity may include training models, producing visualizations and exploring the data, but only as a way toward completing the goals of the project.
For Non-confidential Data/Code: If a license exists, that license takes precedence; otherwise, all data and code created for the project may be used by the project.
All Data/Code (confidential and non-confidential) should be analyzed and used on an encrypted hard drive and not shared with any person who is not affiliated with the project or the Data Science Clinic Administration. No backups outside the code/data repository are to be stored locally.
All Data/Code will be stored in a private GitHub repository with backups and access control managed via GitHub. Only people working on the project and approved by the Faculty Mentor should be provided access to this storage. Two-factor authentication must be enabled on your GitHub account.
If a non-GitHub repository is required, the access method will be provided by either the Clinic Administration or the Faculty Mentor. Only people working on the project and approved by the Faculty Mentor should be provided access to this storage. If a non-GitHub repository is used and multi-factor authentication is available, it must be used.
At the conclusion of the project, access will be revoked. At that time, you will receive an email asking you to delete any data or code that was stored locally.
Disclosure of data/code of either type (confidential or non-confidential) is forbidden without the explicit consent of the Faculty Mentor on the project.
No local backups should exist; all data/code for the project should be put in the private GitHub repo (unless otherwise directed).
At the conclusion of the project, all local copies should be deleted and access to the repository on GitHub will be revoked. The repository itself should also be deleted once the project concludes and there is no longer an ongoing engagement with the client.
An incident is defined as any event which indicates that the code or data on a project may have fallen into the possession of anyone not associated with the project. If an incident occurs, such as a laptop being stolen or an account being compromised, the student needs to alert both the Faculty Mentor and the Clinic Administration as quickly as possible.
Can I share the code with prospective employers or during a job search?
Only if the project client approves it. If you aren’t sure if this applies to you, please talk to your Faculty Mentor or the Clinic Administration.
I really want to talk about this project when I do interviews and write up what I’ve done on my resume. Am I allowed to do that?
For many projects, there are things that you can share and things that you shouldn’t. If you aren’t sure exactly what you should or shouldn’t say, or what you should write on your resume, the Faculty Mentor or Clinic Administration will gladly help.