Data science research often involves handling a variety of data sources and analysis methods, which can quickly become complex. The desire to share and reproduce this research easily adds to the complexity. This is where tools and methodologies borrowed from software engineering can help.
In this workshop, I want to share my experience of being both a software engineer and a social science researcher. By building an example project using Python and PostgreSQL, we'll go over the following items:
- Building a research application by creating our very own CLI tool
- Designing a PostgreSQL database to power our research application
- Using tools like QGIS and pgAdmin to read from our database for exploratory data analysis
- Enabling reproducibility by packaging and sharing our software
To illustrate how all of this comes together, we'll walk through an example project that studies travel times to parks within the city of Berlin. We'll use data from the 2022 German Census and OpenStreetMap as the basis for the study. The tutorial will focus on how to import and organize this data in our database and how to add new tables for measuring travel times. At the end, we'll package everything up and share it with the world.
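To give a rough idea of what a research CLI tool can look like, here is a minimal sketch built on Python's standard `argparse` module. It is not the workshop's actual code: the program name `parks`, the subcommands `import-data` and `travel-times`, the `--dsn` and `--mode` options, and the default connection string are all hypothetical names chosen for illustration, and the commands only report what they would do instead of touching a database.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per research step (names are hypothetical)."""
    parser = argparse.ArgumentParser(
        prog="parks",
        description="Illustrative research CLI for a park travel-time study.",
    )
    subcommands = parser.add_subparsers(dest="command", required=True)

    # Step 1: load a raw data source into the database.
    import_cmd = subcommands.add_parser("import-data", help="import a data source")
    import_cmd.add_argument("source", choices=["census", "osm"])
    import_cmd.add_argument(
        "--dsn",
        default="postgresql://localhost/research",  # assumed placeholder, not a real server
        help="PostgreSQL connection string",
    )

    # Step 2: derive travel-time tables from the imported data.
    travel_cmd = subcommands.add_parser("travel-times", help="compute travel times")
    travel_cmd.add_argument("--mode", choices=["walk", "bike"], default="walk")

    return parser


def main(argv=None) -> str:
    """Parse arguments and describe the action; a real tool would do the work here."""
    args = build_parser().parse_args(argv)
    if args.command == "import-data":
        return f"would import {args.source} data into {args.dsn}"
    return f"would compute {args.mode} travel times"


# Example invocation with an explicit argument list.
print(main(["import-data", "census"]))
```

Keeping each research step behind its own subcommand means the whole pipeline can be rerun from the command line in a fixed order, which is one simple way to support reproducibility.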
Prior to the tutorial, please ensure you have git (https://git-scm.com/) and the pixi package manager (https://prefix.dev/) installed on your computer.
online