CSYS 5870 A (CRN: 12893)
Complex Systems: Data Science I - Experience
3 Credit Hours—Seats Available!
Registration
For crosslists see: CS 5870 A, STAT 5870 A
About CSYS 5870 A
Data harvesting, cleaning, and summarizing; working with non-traditional, non-numeric data (social networks, natural-language text, etc.); scientific visualization; advanced data pipelines, with a practical focus on real datasets and on developing good habits for rigorous and reproducible computational science. Project-based. Credit not awarded for both CSYS 5870 and CS 3870. Prerequisites: Knowledge of CS 1210 and either STAT 1410 or STAT 2430 required; knowledge of CS 2100 and MATH 2522 or MATH 2544 recommended; Graduate student or Instructor permission. Cross-listed with: STAT 5870, CS 5870.
Notes
Graduate student or instructor permission; Knowledge of CS 1210 and either STAT 1410 or STAT 2430; CS 2100 and MATH 2522 or MATH 2544 recommended; Cross-listed with CS 5870 A and STAT 5870 A; Total combined enrollment: 40; PACE students with permission and override
Section Description
This course offers a rigorous, hands-on introduction to data science as both a technical and ethical research practice. Students will learn how to collect, curate, clean, and interpret data with methodological care, transparency, and accountability. Emphasizing the process of inquiry, the course situates data science within broader social, scientific, and epistemic contexts, asking not only what we can do with data, but how and why we do it. Core topics include data acquisition and documentation, exploratory data analysis, reproducible workflows, uncertainty and error, and ethical reasoning in the design and communication of data-driven research. Students will engage with questions of data provenance, consent, and representation while developing practical fluency in Python-based tools for data wrangling and visualization. Through applied exercises and reflective projects, students will practice designing responsible research pipelines and communicating insights.
Section Expectation
The course is organized around lectures, weekly assignments, and a final applied research project. Each lecture introduces key technical and ethical dimensions of the data science lifecycle, which are then explored through structured, hands-on assignments. Assignments will emphasize data collection and documentation, exploratory analysis, and the ethical interpretation of empirical findings. The final project invites students to design and conduct a small-scale data-driven study (from sourcing and cleaning the data to analyzing, visualizing, and communicating results) accompanied by a reflective ethical analysis of their methodology and assumptions.
Evaluation
Grades will be determined as follows:
- Homework assignments (30%)
- Exams or short written reflections (20%)
- Class participation (10%)
- Final project (40%)

Homework Assignments: Python will be used throughout for data analysis and visualization, primarily through Jupyter notebooks. Students are expected to have basic proficiency in Python. Assignments will require both technical implementation and written reflection, integrating computational skills with conceptual understanding of research ethics and design.
Important Dates
Note: These dates may not be accurate for select courses during the Summer Session.
Courses may be cancelled due to low enrollment. Show your interest by enrolling.
| Last Day to Add | |
|---|---|
| Last Day to Drop | |
| Last Day to Withdraw with 50% Refund | |
| Last Day to Withdraw with 25% Refund | |
| Last Day to Withdraw | |
