Tapping into TNC data

By Chris McCahill

With the rise of transportation network companies (TNCs) like Uber and Lyft, and growing concerns about their effects on traffic and curb usage, transportation agencies and local governments are eager for data. Data from TNCs, however, are heavily guarded. Many governments are trying to negotiate agreements with these companies and working on laws that require data sharing. Others, however, are getting more creative.

Almost as long as TNCs have existed, folks have gathered data simply by using the services and recording as much information as they can. In 2014, one Lyft driver tracked 955 of his rides—including routes, costs, and earnings—using a combination of apps. A University of Colorado PhD student did something similar but also incorporated passenger surveys. In Chicago, DePaul University researchers studied TNC travel times by riding as passengers.

Those interested in general trends around TNC usage, but not specific trip data, rely on surveys. Researchers from Harvard and MIT surveyed people throughout the U.S. about TNC usage, and researchers at UC-Davis conducted surveys in Boston, Chicago, Los Angeles, New York, San Francisco, Seattle, and Washington, D.C. The Metropolitan Area Planning Council in Boston also conducted passenger intercept surveys.

Detailed trip data are still the gold standard, however. FiveThirtyEight was able to access some Uber data by issuing multiple Freedom of Information Law requests to the New York City Taxi and Limousine Commission. Meanwhile, in perhaps the most innovative approach, researchers from Northeastern University came up with a system for collecting data using application programming interfaces (APIs), which TNCs offer to third-party app developers. By tapping into these APIs, the researchers monitored drivers who were logged in to the Uber and Lyft apps but hadn’t picked up passengers. Lyft drivers, specifically, are assigned persistent but anonymous IDs. When a driver went offline and returned some time later, the researchers could assume that the last coordinates recorded before the gap and the first coordinates recorded after it indicated passenger pick-up and drop-off locations, respectively. That information let the San Francisco County Transportation Authority produce an interactive map of those locations, broken down by time of day and day of the week. The information lacks route details, but can be used to generate trip matrices.
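The gap-based inference described above can be illustrated with a minimal Python sketch. It is not the researchers' actual code: the Sighting record, its field names, the two-minute gap threshold, and the simple grid used as a stand-in for travel zones are all assumptions made for illustration. The idea is only that a long gap in a driver's sightings is treated as one trip, and that those trips can then be counted by origin and destination.

```python
import math
from collections import Counter, defaultdict
from typing import NamedTuple


class Sighting(NamedTuple):
    """One observation of an available driver (hypothetical record layout)."""
    driver_id: str  # persistent but anonymous ID
    t: int          # seconds since the start of observation
    lat: float
    lon: float


def infer_trips(sightings, min_gap=120):
    """Treat each long gap in a driver's sightings as one inferred trip.

    Returns (pickup, dropoff) pairs: the last sighting before the gap
    and the first sighting after it. min_gap is an assumed threshold.
    """
    by_driver = defaultdict(list)
    for s in sightings:
        by_driver[s.driver_id].append(s)

    trips = []
    for track in by_driver.values():
        track.sort(key=lambda s: s.t)
        # Compare each sighting with the next one for the same driver.
        for before, after in zip(track, track[1:]):
            if after.t - before.t >= min_gap:
                trips.append((before, after))
    return trips


def zone(s, cell=0.01):
    """Map a sighting to a coarse grid cell (a stand-in for real zones)."""
    return (math.floor(s.lat / cell), math.floor(s.lon / cell))


def trip_matrix(trips):
    """Count inferred trips by (origin zone, destination zone)."""
    return Counter((zone(pickup), zone(dropoff)) for pickup, dropoff in trips)
```

A gap can also mean the driver simply logged off, so counts produced this way are estimates rather than observed trips.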

These second-hand sources aren’t necessarily as reliable or robust as data directly from the TNCs, but until laws and agreements are in place to enable sharing, they provide the best information we have to understand our fast-changing transportation landscape.

Chris McCahill is an Associate Researcher at SSTI.