Interested in this course for Spring 2026? If you are a new UVM Advance/Non-Degree student, choose your course and complete the application form. If you are a current UVM student, enroll in a course through MY UVM Portal. If you have questions, please chat with us or schedule a 15- or 30-minute virtual meeting with an Enrollment Coach.

About STAT 5870 A

Data harvesting, cleaning, and summarizing; working with non-traditional, non-numeric data (social network, natural language textual data, etc.); scientific visualization; advanced data pipelines with a practical focus on real datasets and developing good habits for rigorous and reproducible computational science; project-based. Credit not awarded for both STAT 5870 and STAT 3870. Prerequisites: Knowledge of CS 1210 and either STAT 1410 or STAT 2430 required; knowledge of CS 2100 and MATH 2522 or MATH 2544 recommended; Graduate student or Instructor permission. Cross-listed with: CSYS 5870, CS 5870.

Notes

Cross-listed with CS 5870 and CSYS 5870; Total combined enrollment: 40; Prereqs enforced by the system: Graduate student or instructor permission; Knowledge of CS 1210 & (STAT 1410 or STAT 2430) required; Knowledge of CS 2100 & (MATH 2522 or MATH 2544) recommended; PACE students with permission and override.

Section Description

This course offers a rigorous, hands-on introduction to data science as both a technical and ethical research practice. Students will learn how to collect, curate, clean, and interpret data with methodological care, transparency, and accountability. Emphasizing the process of inquiry, the course situates data science within broader social, scientific, and epistemic contexts, asking not only what we can do with data, but how and why we do it. Core topics include data acquisition and documentation, exploratory data analysis, reproducible workflows, uncertainty and error, and ethical reasoning in the design and communication of data-driven research. Students will engage with questions of data provenance, consent, and representation while developing practical fluency in Python-based tools for data wrangling and visualization. Through applied exercises and reflective projects, students will practice designing responsible research pipelines and communicating insights.

Section Expectation

The course is organized around lectures, weekly assignments, and a final applied research project. Each lecture introduces key technical and ethical dimensions of the data science lifecycle, which are then explored through structured, hands-on assignments. Assignments will emphasize data collection and documentation, exploratory analysis, and the ethical interpretation of empirical findings. The final project invites students to design and conduct a small-scale data-driven study (from sourcing and cleaning the data to analyzing, visualizing, and communicating results) accompanied by a reflective ethical analysis of their methodology and assumptions.

Evaluation

Grades will be determined as follows:

Homework assignments (30%)
Exams or short written reflections (20%)
Class participation (10%)
Final project (40%)

Homework Assignments: Python will be used throughout for data analysis and visualization, primarily through Jupyter notebooks. Students are expected to have basic proficiency in Python. Assignments will require both technical implementation and written reflection, integrating computational skills with conceptual understanding of research ethics and design.

Important Dates

Note: These dates may not be accurate for select courses during the Summer Session.

Courses may be cancelled due to low enrollment. Show your interest by enrolling.

Deadlines
Last Day to Add
Last Day to Drop
Last Day to Withdraw with 50% Refund
Last Day to Withdraw with 25% Refund
Last Day to Withdraw
