The General Transit Feed Specification (GTFS) is a standard format for public transportation schedules and geographic information. GTFS data is typically stored in a relational database, and you can use SQL queries to retrieve and analyze the data. Below are some sample SQL queries for common GTFS data queries:
Assuming you have the following GTFS tables in your database:
agency
: Contains information about the transit agency.routes
: Contains information about transit routes.trips
: Contains information about specific trips on routes.stops
: Contains information about transit stops.stop_times
: Contains information about stop times for specific trips.calendar
: Contains service dates and day-of-week availability.calendar_dates
: Contains exceptions to the normal service schedule.
Here are some example SQL queries:
- Get all transit agencies:
sqlSELECT * FROM agency;
- Get all routes of a specific agency:
sqlSELECT * FROM routes WHERE agency_id = 'agency_id_here';
- Get all trips for a specific route:
sqlSELECT * FROM trips WHERE route_id = 'route_id_here';
- Get all stops for a specific trip:
sqlSELECT stop_times.*, stops.* FROM stop_times
JOIN stops ON stop_times.stop_id = stops.stop_id
WHERE trip_id = 'trip_id_here'
ORDER BY stop_sequence;
- Get all trips that stop at a specific stop:
sqlSELECT trips.* FROM trips
JOIN stop_times ON trips.trip_id = stop_times.trip_id
WHERE stop_id = 'stop_id_here';
- Get all trips for a specific route on a specific date:
sqlSELECT trips.* FROM trips
JOIN calendar ON trips.service_id = calendar.service_id
WHERE route_id = 'route_id_here'
AND 'date_here' BETWEEN start_date AND end_date
AND day_of_week_column = 1; -- 1 for Monday, 2 for Tuesday, and so on.
- Get all stops within a certain distance from a specific latitude and longitude (using Haversine formula for distance calculation):
sqlSELECT stop_id, stop_name, stop_lat, stop_lon,
3959 * acos(cos(radians(latitude_here)) * cos(radians(stop_lat))
* cos(radians(stop_lon) - radians(longitude_here)) + sin(radians(latitude_here))
* sin(radians(stop_lat))) AS distance
FROM stops
HAVING distance < radius_in_miles
ORDER BY distance;
Note: The exact SQL queries may vary depending on your database schema and table structures. Make sure to replace placeholders like 'agency_id_here'
, 'route_id_here'
, 'trip_id_here'
, 'stop_id_here'
, 'date_here'
, 'latitude_here'
, 'longitude_here'
, and 'radius_in_miles'
with actual values or parameters as needed.
Always ensure that you have appropriate indexes on the columns used in your queries to optimize performance, especially for large GTFS datasets.