Data Science

The rapidly expanding collection of massive amounts of data is leading to transformations across broad segments of industry, science, and society. These changes have sparked great demand for individuals with skills in managing and analyzing complex data sets. Such skills are interdisciplinary, involving ideas typically associated with computing, information processing, mathematics, and statistics as well as the development of new methodologies spanning these fields. Our major in Data Science (offered jointly with the Dietrich School of Arts & Sciences Departments of Mathematics and Statistics) will enable students to participate in this data revolution.

This undergraduate major allows students to gain critical skill sets that span key areas of statistics, computing, and mathematics, with foundational training providing literacy in the four areas (data, algorithmic, mathematical, and statistical) that every student needs to master data science. Students will develop expertise that connects theory to the solution of real-world problems and will be able to specialize their studies toward a more specific career focus. Completing this major will prepare students to work as data science professionals or to pursue graduate study in a direction that involves data in a significant way.

Major Requirements

Foundational Skills - 31 credits

The foundational courses provide students with fundamental knowledge across four "literacies": data, algorithmic, mathematical, and statistical. Courses in this area will help students develop baseline computational capabilities, will teach students to think about data in a statistical framework, and will introduce students to fundamental mathematical concepts arising in data analysis. These courses are drawn from three main disciplines (CS/IS, Math, and Statistics) and include an introductory course in the fundamental skills of working with data (Python/R programming, exploratory data analysis, data visualization).

Expertise - 18 credits

This is where students become data scientists, integrating skills from the foundational areas to develop expertise in the realm of data. Skills will be developed in the description and analysis of data in terms of sources of variability and key relationships, the development of algorithms and data handling skills to extract and interpret information from complex data sets, as well as in the visualization and communication of results. The critical issue of the ethical use of data will also be addressed in the context of data science.

Specializations

Students within the data science major will have the opportunity to pursue an area of specialization through the selection of elective courses in a targeted direction relating to data analytics, computer systems, modeling, or data science in context. While selecting all three courses from the same category is advised for students seeking a focus, students may also choose courses across categories to suit their interests. The specialization course groupings are as follows.

  • Computer Systems: Students pursuing this specialization will gain depth of knowledge in the development, deployment, and analysis of the complex computer and information systems necessary for tackling large-scale data science problems.
  • Data Analytics: Students pursuing a data analytics specialization will enhance their ability to make sound inferences and decisions using the science and art of learning from data: specifically, the design, collection, analysis, and interpretation of data in an uncertain world, and the communication of findings.
  • Data Science in Context: Students pursuing this specialization will gain depth of knowledge in both the technical and organizational aspects of the management, curation, description, preservation, and application of digital datasets of varying sizes in specific business, professional, or scientific contexts. We expect the collection of courses within the specialization to expand as more domain-specific data science courses begin to be offered across campus.
  • Modeling: Students pursuing a modeling specialization will enhance their ability to develop and harness theoretical tools to characterize structure within data and to represent and analyze processes that may underlie this structure.

Capstone - 3 credits

Data science is a hands-on field. Comprehensive training in data science requires substantive experience working on a problem outside of the realm of usual classroom experiences, with the complications of messy data, ambiguity, and lack of clear structure that characterize "real-world" scenarios. This experience should include work with others with diverse skill sets as well as communication with non-specialists. The capstone course will provide students with such an experience. In the short term, the capstone course requirement can be fulfilled through completion of CMPINF 1981 (Project Studio), MATH 1103 (Mathematical Problems in Business, Industry, and Government), or STAT 1961 (Statistical Data Science in Action). Students will be advised to select a capstone course based on their specializations but ultimately will be able to choose from among the full list of capstone options. In addition, the capstone requirement may be satisfied via a faculty-guided research project that is relevant to data science, subject to approval by the Data Science program director(s). After we have more experience in integrating and coordinating our courses across the three units, we will consider a unified cross-listed capstone course if that is deemed more desirable.

For full major requirement details, visit the Data Science course catalog.

Admissions Requirements