Docker Compose: Making Our Data Pipeline Project Portable
by Yin Yin Chan 🤓 June 2026
TL;DR
This is a continuation of the post on Data Pipeline: LA Meter Parking Occupancy and Citations with Python and PostgreSQL.
Now that our data pipeline scripts run in GitHub Codespaces, we need to Dockerize the entire program to make it portable.
Docker Compose
Prerequisite skills
We have an ingestion script that imports from one of the data endpoints we need. Let’s see how it runs inside a Docker container.
Create
- Create a docker-compose.yml to simplify starting all services. We can run docker-compose to start up both the pipeline build and PostgreSQL.
From your project root:
touch docker-compose.yml
In docker-compose.yml:
services:
  app:
    build: ./pipeline
    env_file:
      - .env.docker
    depends_on:
      postgres:
        condition: service_healthy
  postgres:
    image: postgres:18
    env_file:
      - .env.docker
    environment:
      POSTGRES_DB: la_meter_parking
    volumes:
      - la_meter_parking_postgres_data:/var/lib/postgresql
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "root", "-d", "la_meter_parking"]
      interval: 5s
      retries: 5
volumes:
  la_meter_parking_postgres_data:
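One detail that is easy to miss: inside the Compose network, the app container reaches the database at the service name postgres, not localhost. Here is a minimal sketch of how the pipeline could build its connection URL from the environment. The function name build_dsn and the POSTGRES_HOST and POSTGRES_PORT overrides are hypothetical, and the actual ingestion script may read its settings differently. Note also that POSTGRES_DB is set only on the postgres service in the file above, so the sketch falls back to the same database name.

```python
import os
from urllib.parse import quote


def build_dsn(env=os.environ):
    """Build a PostgreSQL URL for use inside the `app` container (sketch)."""
    user = env["POSTGRES_USER"]
    password = env["POSTGRES_PASSWORD"]
    # POSTGRES_DB is only set on the postgres service in docker-compose.yml,
    # so fall back to the same name if the app container does not receive it.
    db = env.get("POSTGRES_DB", "la_meter_parking")
    # Hypothetical overrides; "postgres" is the Compose service name.
    host = env.get("POSTGRES_HOST", "postgres")
    port = env.get("POSTGRES_PORT", "5432")
    # Percent-encode credentials so characters like "@" or "/" stay valid.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )
```

With POSTGRES_USER=root and POSTGRES_PASSWORD=secret in the environment, build_dsn() would return postgresql://root:secret@postgres:5432/la_meter_parking.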
Note that we’re asking Docker Compose to read from .env.docker, so let’s create that file and give it the same keys you had in .env:
touch .env.docker
And then set your values.
Because we’ve already added it to .gitignore at the beginning of the tutorial, this .env.docker should already be ignored.
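Before starting the stack, it can help to confirm that .env.docker has the values the database needs. The sketch below assumes the file uses the official postgres image's POSTGRES_USER and POSTGRES_PASSWORD variables. Your own .env may carry extra keys for the pipeline that are not shown in this post. The helper names are hypothetical, and the parser only handles plain KEY=value lines, not quoting.

```python
from pathlib import Path

# Assumed keys: the variables the official postgres image reads at first start.
REQUIRED_KEYS = {"POSTGRES_USER", "POSTGRES_PASSWORD"}


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, sorted by name."""
    values = parse_env_text(text)
    return sorted(key for key in required if not values.get(key))


def check_env_file(path=".env.docker"):
    """Read the env file from disk and report which required keys are missing."""
    return missing_keys(Path(path).read_text())
```

Calling check_env_file() from the project root returns an empty list when both keys are set. Since the healthcheck in docker-compose.yml passes -U root, POSTGRES_USER is presumably meant to be root here.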
Run
- Let’s run it now
docker-compose up --build
If all is well and good, your terminal should start and end with this:
[+] Building 1.3s (16/16) FINISHED
=> [internal] load local bake definitions
...
app-1 | Inserted: 100
app-1 | Inserted: 39
app-1 exited with code 0
Check
- Check that LA City’s data is in our database. In a separate terminal, list the running containers:
docker-compose ps
Take the NAME shown for the postgres container and open psql inside it:
docker exec -it the_name_that_was_output psql -U same_as_POSTGRES_USER -d same_as_POSTGRES_DB
You should now be inside the container and can use psql to check the table that the docker-compose run created:
\dt
SELECT * FROM meter_occupancy LIMIT 10;
An output like this means it works:
 index |   SpaceID    |      EventTime_UTC      | OccupancyState
-------+--------------+-------------------------+----------------
     0 | WP150        | 2026 Jun 04 05:52:15 PM | OCCUPIED
     1 | CB588        | 2026 Jun 04 04:31:17 PM | OCCUPIED
     2 | C1087        | 2026 Jun 04 07:54:01 PM | OCCUPIED
     3 | WP69         | 2026 Jun 04 07:12:46 PM | VACANT
     4 | WP67         | 2026 Jun 04 07:30:50 PM | OCCUPIED
     5 | C291         | 2026 Jun 04 08:01:42 PM | OCCUPIED
     6 | V167         | 2026 Jun 04 03:37:15 PM | OCCUPIED
     7 | HO281B       | 2026 Jun 04 06:01:53 PM | OCCUPIED
     8 | HO879A-const | 2026 Jun 04 06:17:45 PM | UNKNOWN
     9 | SV34         | 2026 Jun 04 07:08:42 PM | OCCUPIED
(10 rows)
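If you would rather script this check than type it each time, here is a rough sketch that asks psql for the row count through docker-compose exec, which targets the service by name so you do not need to look up the container name. The function names are hypothetical. The user and database you pass in should match your .env.docker and docker-compose.yml, and the stack needs to be running for row_count to work.

```python
import subprocess


def row_count_command(table, user, db, service="postgres"):
    """Build the argv that counts rows via psql inside the Compose service."""
    return [
        "docker-compose", "exec", "-T", service,  # -T: no pseudo-TTY, for scripts
        "psql", "-U", user, "-d", db,
        "-At",  # unaligned, tuples-only output, so only the number is printed
        "-c", f'SELECT count(*) FROM "{table}";',
    ]


def row_count(table, user, db):
    """Run the command and return the count as an int (stack must be up)."""
    result = subprocess.run(
        row_count_command(table, user, db),
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())
```

For example, row_count("meter_occupancy", "root", "la_meter_parking") should return a positive number after a successful run, assuming root is your POSTGRES_USER.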
Clean up
Once you’re done, list what Docker has created and delete anything you don’t need:
# Containers
docker ps -a
docker rm <container_id>
# Images
docker images
docker rmi <image>
# Volumes
docker volume ls
docker volume rm <volume_name>
# Networks
docker network ls
docker network rm <network_id>
