Projects

CitiBike Network Analysis

Fall 2021
Intro to Urban Data Informatics
Columbia GSAPP
Prof. Boyeong Hong

Summer ridership is 19x winter — and the network rewires itself with the seasons. Mapping 204K Citi Bike trips across NYC and Jersey City using Python, NetworkX, and Plotly.

01

The Question

Method
NetworkX DiGraph
Plotly scatter plots
Seasonal comparison

204K trips sampled
4 months of 2021 data

New York's continental climate brings four distinct seasons. This study examines how Citi Bike ridership and station-level usage change over the year to answer one question: does ridership differ significantly between seasons?

Data was sampled from four representative months of 2021 (February, May, August, and November), one for each season. The analysis focuses less on absolute volume between months than on relative activity at docking stations and inter-station connectivity within each month.
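One way to express "relative activity within each month" is each station's share of that month's trips. The sketch below is an illustration, not the project's notebook code: the `month` and `start_station_name` column names are assumptions, and the rows are toy data.

```python
import pandas as pd

# Toy stand-in for the sampled trips. Column names are assumptions,
# and the rows are invented for illustration.
trips = pd.DataFrame({
    "month": ["Feb", "Feb", "Feb", "Aug", "Aug", "Aug", "Aug", "Aug", "Aug"],
    "start_station_name": [
        "Newport PATH", "Newport PATH", "14 St Ferry",
        "Newport PATH", "Newport PATH", "Newport PATH",
        "14 St Ferry", "14 St Ferry", "11 St & Washington St",
    ],
})

def station_share(trips: pd.DataFrame) -> pd.Series:
    """Return each station's share of trip starts within its own month."""
    # Trips started at each station, per month.
    counts = trips.groupby(["month", "start_station_name"]).size()
    # Divide by the month's total so months of very different size are comparable.
    month_totals = counts.groupby(level="month").transform("sum")
    return (counts / month_totals).rename("share")

share = station_share(trips)
print(share)
```

Normalizing within each month keeps a low-volume month like February on the same scale as August, so a station's change in share reflects a shift in prominence rather than the overall seasonal swing.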

02

Approach

Building the Network
Network
139–218 active stations
Geographic node layout
Directed edges by trip count

Each month was modeled as a directed graph in NetworkX. Stations are nodes positioned at their real latitude and longitude, and trips between each pair of stations become directed edges weighted by trip count. Because the layout mirrors actual geography, the shape of the NYC and Jersey City waterfront is visible in the network itself.
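A minimal sketch of that construction, assuming trips arrive as (start station, end station) pairs and station coordinates as a lookup table. The helper name `build_month_graph`, the toy trips, and the placeholder coordinates are all hypothetical and are not taken from the dataset.

```python
import networkx as nx

# Toy trips as (start station, end station) pairs, invented for illustration.
trips = [
    ("Newport PATH", "14 St Ferry"),
    ("Newport PATH", "14 St Ferry"),
    ("14 St Ferry", "Newport PATH"),
    ("11 St & Washington St", "Newport PATH"),
]

# Placeholder (lat, lng) values, not the real station coordinates.
coords = {
    "Newport PATH": (40.73, -74.03),
    "14 St Ferry": (40.75, -74.02),
    "11 St & Washington St": (40.75, -74.03),
}

def build_month_graph(trips, coords):
    """Build a directed graph with one weighted edge per station pair."""
    G = nx.DiGraph()
    for start, end in trips:
        if G.has_edge(start, end):
            G[start][end]["weight"] += 1  # another trip on an existing edge
        else:
            G.add_edge(start, end, weight=1)
    # Store positions as (x, y) = (lng, lat) so a plot follows real geography.
    pos = {name: (coords[name][1], coords[name][0]) for name in G.nodes}
    nx.set_node_attributes(G, pos, "pos")
    return G

G = build_month_graph(trips, coords)

# Weighted degree (trips in plus trips out) as a simple measure of station activity.
activity = dict(G.degree(weight="weight"))
print(G.number_of_nodes(), G.number_of_edges(), activity)
```

Only stations that appear in at least one trip become nodes, so the node count of each monthly graph corresponds to that month's active stations.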

With an interactive Plotly scatter plot for each month and some basic data wrangling, we can see which stations gain or lose prominence across seasons. Overlaying connection lines on the scatter plots shows how inter-station connectivity shifts.

Tools & Libraries

Python, pandas, NumPy, NetworkX, Plotly, Folium, GeoPandas, Shapely, scikit-learn, Matplotlib, Seaborn

03

Seasonal Networks

Station activity by season — node size = trip volume, edges = inter-station connections

[Figure: CitiBike network graph, February (winter)]
WINTER. February: 4,881 trips. The network contracts to core commuter corridors.

[Figure: CitiBike network graph, May (spring)]
SPRING. May: spring ramp-up. Peripheral stations reactivate as ridership climbs.

[Figure: CitiBike network graph, August (summer)]
SUMMER · PEAK. August: 93,800 trips. 218 active stations, 57% more than winter.

[Figure: CitiBike network graph, November (fall)]
FALL. November: fall decline. Recreational trips drop first, commuter routes persist.

04

Findings

  • Summer ridership is ~19x winter. 93,800 trips in August vs 4,881 in February. The system effectively hibernates in winter.
  • The active network expands geographically. From 139 stations in February to 218 in August — the system's spatial footprint grows by 57% in warm months.
  • People take longer rides in summer and fall. Shorter, utilitarian rides in winter and spring give way to longer recreational trips when the weather allows.
  • Hub stations maintain centrality year-round. Newport PATH (5,063 trips in August), 11 St & Washington St, and 14 St Ferry anchor the network regardless of season.
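
The two headline ratios in the findings follow directly from the February and August figures quoted above. A short arithmetic check (the variable names are only illustrative):

```python
# Figures quoted in the findings above.
feb_trips, aug_trips = 4_881, 93_800
feb_stations, aug_stations = 139, 218

# August trips as a multiple of February trips.
trip_ratio = aug_trips / feb_trips

# Relative growth in active stations from February to August.
station_growth = (aug_stations - feb_stations) / feb_stations

print(f"Summer vs winter trips: {trip_ratio:.1f}x")     # 19.2x
print(f"Active-station growth: {station_growth:.0%}")   # 57%
```
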
This project began as coursework for Introduction to Urban Informatics at Columbia GSAPP (Fall 2021, Professor Boyeong Hong). The interactive Plotly visualizations and full write-up live in the original article on Medium. Browse the notebooks and data on GitHub.