
The data is for the census tracts where the stops are located, which admittedly makes more sense for subway/streetcar/bus stops than for commuter rail stops (where people drive in from other tracts and park at the stations). BART is particularly troublesome because it takes on characteristics of both types of transit.

A data science friend of mine said we should do a "watershed" model where we define an area where people flow into a stop, but I'm not sure how best to do that! Maybe someone smart can fork the project and improve on our methods.



You could probably just group the census tracts within a certain distance of any stop, assign each one to its nearest stop, and then average them (weighting correctly for population differences). If you wanted to do it better, you could have the weight for each census tract fall off with distance from the stop.
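Here is a rough sketch of that idea in Python, in case it helps anyone who forks the project. Everything in it is my own assumption rather than anything from the project: the function names, the input layout (tract centroid, population, and some per-tract `value`), the 1 km cutoff, and especially the exponential fall-off, which is just one of many possible decay shapes.

```python
import math
from collections import defaultdict

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def stop_averages(stops, tracts, max_km=1.0, decay_km=None):
    """Population-weighted average of a tract value for each stop.

    stops:    {name: (lat, lon)}                      (hypothetical layout)
    tracts:   [(lat, lon, population, value), ...]    (tract centroids)
    max_km:   tracts farther than this from every stop are ignored
    decay_km: if given, weight is also multiplied by exp(-distance / decay_km)
              (an assumed fall-off shape, not the only reasonable one)
    """
    num = defaultdict(float)  # sum of weight * value per stop
    den = defaultdict(float)  # sum of weights per stop
    for lat, lon, pop, value in tracts:
        # Find the nearest stop to this tract's centroid.
        name, dist = min(
            ((n, haversine_km(lat, lon, slat, slon)) for n, (slat, slon) in stops.items()),
            key=lambda pair: pair[1],
        )
        if dist > max_km:
            continue  # outside the catchment of every stop
        weight = pop
        if decay_km is not None:
            weight *= math.exp(-dist / decay_km)
        num[name] += weight * value
        den[name] += weight
    # Skip stops whose tracts all had zero weight.
    return {n: num[n] / den[n] for n in num if den[n] > 0}
```

With `decay_km=None` this is the plain "nearest stop, weighted by population" version; passing something like `decay_km=0.5` makes closer tracts count for more. Using tract centroids is a simplification, and for commuter rail you would probably want a much larger `max_km`.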



