Data for Data Science's four-class series on data management and organization

Data for Data Science


Researchers face a growing data management challenge that starts with data collection and continues through data analysis, publication, and archival. Research labs may struggle to scale their data management methods to many data files, very large data files, or both; to fully document their data and its organization; and to meet grant and publication requirements for data sharing. This four-class course introduces attendees to best practices in data organization and management. Each one-hour class will include lecture, discussion, and practice exercises. The course assumes no prior training in data science. By the end of the course, you will be able to identify resources at Fred Hutch for data management and apply best practices in data organization to your own research projects.

Software requirements for this course can be found on the course's Software page.