Using Big Data to Reimagine Urban Transportation

Imagine the streets of Manhattan with only one-quarter the amount of taxi traffic. Think of the benefits of reduced noise and congestion, cleaner air and improved personal safety.

Researchers at MIT have been thinking about those things lately, and they’ve come up with a radical new model for urban transportation that offers an example of how new data analysis technology can enable us to completely rethink our assumptions about design.

The scientists at MIT’s Computer Science and Artificial Intelligence Laboratory mined data from more than three million taxi rides to come up with a smarter way to ferry people around the city. Using a combination of ride-sharing, route optimization and on-demand delivery enabled by smartphones, they calculated that 95% of the current demand for taxis in Manhattan could be covered by just 2,000 highly coordinated 10-person vehicles. That compares with the 14,000 taxis that roam the streets today, looking for riders. Even if smaller, four-passenger cars were used, 98% of current demand could be met with only 3,000 cars. In both cases, wait times would be reduced.
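To put those fleet sizes in proportion, the arithmetic can be sketched in a few lines of Python. The numbers are the ones quoted above; the variable and function names are illustrative only and do not come from the study.

```python
# Fleet figures quoted in the article; all names here are illustrative.
CURRENT_TAXIS = 14_000

scenarios = {
    "10-person vehicles": {"fleet": 2_000, "demand_served": 0.95},
    "4-passenger cars": {"fleet": 3_000, "demand_served": 0.98},
}


def fleet_share(fleet: int, baseline: int = CURRENT_TAXIS) -> float:
    """Return the proposed fleet as a fraction of the current taxi fleet."""
    return fleet / baseline


for name, s in scenarios.items():
    share = fleet_share(s["fleet"])
    print(f"{name}: {share:.0%} of today's fleet, "
          f"{s['demand_served']:.0%} of demand served")
```

Either fleet works out to less than a quarter of the 14,000 cabs on the street today.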

The team applied algorithms that not long ago were too expensive or resource-intensive to be processed by commercial computers. Thanks to big data advances like machine learning and predictive analytics, combined with large cloud data stores, these programs are now practical and affordable.

Taxicab fleets and ride-sharing services like Uber and Lyft gather a lot of data about routes and trip times, but relatively little is put to use outside of their own captive services. The MIT researchers aggregated this data and analyzed it by time of day, trip length, route frequency, number of riders and other usage patterns to create a model of how the current system works. They then used tried-and-true linear programming techniques to optimize routes that maximize passenger loads and minimize travel time.
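As a rough illustration of the aggregation step only (not the optimization that follows it), here is a minimal Python sketch that groups trips by time of day. The trip records and the `summarize_by_hour` helper are made up for this example; the article does not describe the researchers' actual data format or code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trip records: (pickup hour 0-23, riders, trip minutes).
trips = [
    (8, 1, 14.0),
    (8, 2, 22.5),
    (8, 1, 9.5),
    (17, 3, 18.0),
    (17, 1, 30.0),
    (23, 2, 12.0),
]


def summarize_by_hour(records):
    """Group trips by pickup hour; report count, mean riders, mean minutes."""
    buckets = defaultdict(list)
    for hour, riders, minutes in records:
        buckets[hour].append((riders, minutes))

    summary = {}
    for hour, rows in sorted(buckets.items()):
        summary[hour] = {
            "trips": len(rows),
            "avg_riders": mean(r for r, _ in rows),
            "avg_minutes": mean(m for _, m in rows),
        }
    return summary


for hour, stats in summarize_by_hour(trips).items():
    print(hour, stats)
```

Summaries like these, built from millions of rides rather than six, are the kind of usage pattern a model of the current system would start from.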

It’s the first time scientists have been able to quantify the trade-off between fleet size, capacity, waiting time, operational costs and other factors for a wide range of vehicles, according to Daniela Rus, Director of the Computer Science and Artificial Intelligence Laboratory at MIT.

The team applied an “anytime optimal algorithm,” which is a form of machine learning that runs models repeatedly to seek out patterns that point to potential areas for improvement. As the machines get smarter, the scheduling algorithms improve. So a year down the road the fleet might be pared to 2,500 vehicles without impacting service. Widespread adoption of autonomous vehicles could improve efficiencies even further.
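In general computing terms, an "anytime" algorithm is one that has a usable answer ready early and keeps improving it for as long as it is allowed to run. The Python sketch below shows that idea on a toy vehicle-to-rider assignment. It is not the MIT team's algorithm: the cost matrix is invented, `anytime_assign` is a hypothetical helper, and the simple pairwise-swap search used here is not guaranteed to reach the best possible assignment.

```python
import itertools

# Hypothetical matrix: cost[v][r] = minutes for vehicle v to reach rider r.
cost = [
    [9, 4, 7],
    [3, 8, 6],
    [5, 8, 2],
]


def total_cost(assignment, cost):
    """Sum the pickup minutes when vehicle v serves rider assignment[v]."""
    return sum(cost[v][r] for v, r in enumerate(assignment))


def anytime_assign(cost, max_steps):
    """Yield progressively better (assignment, cost) pairs.

    The first answer is available immediately; every later answer is
    strictly cheaper. Stopping early still leaves a valid assignment.
    """
    n = len(cost)
    best = list(range(n))  # trivial start: vehicle i takes rider i
    best_cost = total_cost(best, cost)
    yield best[:], best_cost

    steps = 0
    improved = True
    while improved and steps < max_steps:
        improved = False
        # Try swapping the riders of every pair of vehicles.
        for a, b in itertools.combinations(range(n), 2):
            steps += 1
            candidate = best[:]
            candidate[a], candidate[b] = candidate[b], candidate[a]
            candidate_cost = total_cost(candidate, cost)
            if candidate_cost < best_cost:
                best, best_cost = candidate, candidate_cost
                improved = True
                yield best[:], best_cost
            if steps >= max_steps:
                break


for assignment, minutes in anytime_assign(cost, max_steps=100):
    print(assignment, minutes)
```

A dispatcher using this pattern would take the latest result whenever its time budget runs out, rather than waiting for the search to finish.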

Would this work in New York? Probably not, at least not at first. For one thing, not all riders will trade the privacy of a taxicab for a shared minivan. But the data indicate that a surprisingly large number will. Lyft reported that half of its San Francisco trips and 30% of its New York trips are already carpools. There are many ways shared vehicles could be made more attractive to use. And with an average wait time of under three minutes, many riders may be happy to share a ride rather than stand in the street waving at passing cabs.

Either way, “that can be accounted for in the algorithm,” said Dr. Javier Alonso-Mora, Assistant Professor at Delft University of Technology in the Netherlands and one of the principal authors of the study.

So can changes in public transportation use, which may be sparked by the availability of comfortable low-cost alternatives. “This could be modified in future iterations of the model,” Alonso-Mora said. The machine-learning model is flexible enough to adapt to changes in real time without disruption.

Other variables may be harder to model, such as the inevitable opposition from labor unions, auto repair shops and fleet owners, not to mention city agencies that benefit from taxi licensing fees. It’s more likely that the early adopters will be private limousine companies and shuttle services. “A number of companies have contacted us to investigate how this algorithm could be applied to the transportation services that they provide,” Alonso-Mora said. Big ideas usually start small.

It’s been said that modeling traffic patterns is more complex than forecasting the weather. With new technologies that make large-scale data analysis practical – and affordable via the cloud – urban planners are ready to tackle problems that were considered impossible to solve just five years ago.

Paul Gillin

Writer, Speaker and B2B Content Marketing Strategist
Paul Gillin is a writer, speaker and B2B content marketing strategist who specializes in social media. He helps organizations understand and use social media to build their brands and strengthen customer relationships. Paul is the author of five books and more than 400 articles on the topic of social media and digital marketing. He was the social media columnist for B2B magazine for seven years and is currently a staff columnist at He also writes regularly for the tech news site SiliconAngle. Previously, Paul was a technology journalist for 23 years. He was founding editor-in-chief of B2B technology publisher TechTarget and editor-in-chief and executive editor of the technology weekly Computerworld. His website is
