The General Transit Feed Specification (GTFS) is a standard format for public transportation schedules and geographic information. GTFS data is typically stored in a relational database, and you can use SQL queries to retrieve and analyze the data. Below are some sample SQL queries for common GTFS data queries:

Assuming you have the following GTFS tables in your database:

  1. agency: Contains information about the transit agency.
  2. routes: Contains information about transit routes.
  3. trips: Contains information about specific trips on routes.
  4. stops: Contains information about transit stops.
  5. stop_times: Contains information about stop times for specific trips.
  6. calendar: Contains service dates and day-of-week availability.
  7. calendar_dates: Contains exceptions to the normal service schedule.

Here are some example SQL queries:

  1. Get all transit agencies:
sql
SELECT * FROM agency;
  1. Get all routes of a specific agency:
sql
SELECT * FROM routes WHERE agency_id = 'agency_id_here';
  1. Get all trips for a specific route:
sql
SELECT * FROM trips WHERE route_id = 'route_id_here';
  1. Get all stops for a specific trip:
sql
SELECT stop_times.*, stops.* FROM stop_times JOIN stops ON stop_times.stop_id = stops.stop_id WHERE trip_id = 'trip_id_here' ORDER BY stop_sequence;
  1. Get all trips that stop at a specific stop:
sql
SELECT trips.* FROM trips JOIN stop_times ON trips.trip_id = stop_times.trip_id WHERE stop_id = 'stop_id_here';
  1. Get all trips for a specific route on a specific date:
sql
SELECT trips.* FROM trips JOIN calendar ON trips.service_id = calendar.service_id WHERE route_id = 'route_id_here' AND 'date_here' BETWEEN start_date AND end_date AND day_of_week_column = 1; -- 1 for Monday, 2 for Tuesday, and so on.
  1. Get all stops within a certain distance from a specific latitude and longitude (using Haversine formula for distance calculation):
sql
SELECT stop_id, stop_name, stop_lat, stop_lon, 3959 * acos(cos(radians(latitude_here)) * cos(radians(stop_lat)) * cos(radians(stop_lon) - radians(longitude_here)) + sin(radians(latitude_here)) * sin(radians(stop_lat))) AS distance FROM stops HAVING distance < radius_in_miles ORDER BY distance;

Note: The exact SQL queries may vary depending on your database schema and table structures. Make sure to replace placeholders like 'agency_id_here', 'route_id_here', 'trip_id_here', 'stop_id_here', 'date_here', 'latitude_here', 'longitude_here', and 'radius_in_miles' with actual values or parameters as needed.

Always ensure that you have appropriate indexes on the columns used in your queries to optimize performance, especially for large GTFS datasets.

Have questions or queries?
Get in Touch