How Expedia Finds your Flights: An Overview
When you search for flights on Expedia, how do we come up with results? Many flight searches on Expedia are powered by Best Fare Search (BFS), Expedia’s in-house flight search and pricing engine. BFS is somewhat unique; typically most online travel agencies (OLTAs) rely on external technology providers for flight search. At Expedia we like to tackle hard problems that are core to our business ourselves – and as we’ll present in this two-part series, flight search is a very hard problem!
Historically searching for flights was a totally opaque process for customers. They would have to work directly with an agent in a brick-and-mortar travel agency to find out what flight itineraries were available for their desired destination and travel dates. Behind the scenes the travel agent relied on a Global Distribution System (GDS) green screen interface to inform them of the current state of flights availability and pricing. The process was manual and slow, and the options that could be shown, considered, or recommended were very limited. Ultimately the customer’s ability to book a desired itinerary depended quite a bit on the expertise and know-how of the individual travel agent that they worked with.
With the onset of the internet and online computation in the late 90s, Expedia invested in the BFS (Best Fare Search) search engine as an opportunity to automate the flight search process and expand the breadth of selection and transparency for customers. In 2000, BFS was introduced as the first PC-based (as compared to CRS mainframe systems) online flight search engine in the industry. This represented an order of magnitude improvement in Expedia’s ability to provide flight search results at scale. The first version of BFS increased the set of flight itineraries presented to customers from 10 to 100. Fast forward to today and Expedia customers have the ability to choose amongst 1000s of itinerary options, selected and presented to them within mere seconds. To make this happen, there is a lot that occurs under the hood.
So what makes flight search at scale a challenging problem in the first place? First we need to consider the scale of the problem space for any given flight search. As an example, let’s take a customer search for flights from Seattle to Atlanta. For BFS to produce a high quality set of itineraries for the customer, we first need to construct the flight domain – essentially a graph that represents all theoretical paths/routes that connect SEA to ATL. It turns out that the flight domain graph can quickly grow to be very large. If one considers all direct and indirect ways to get from A to B (A-B, A-C-B, A-C-D-B and so on), layer on all operating airlines and combinations of multiple airlines (aka interlines), and then account for a variety of flight schedules – it is not uncommon for the graph to grow to millions of paths in each direction. That’s a fairly large domain from which to choose from, and it requires some smart algorithms to quickly traverse this graph and derive an attractive mix of relevant results. For a trip between Seattle and Atlanta, the flight domain contains more than 19 quadrillion unique round trip itineraries.
The large size of the flight domain can to some degree be attributed to the “hub and spoke” model of air traffic distribution. In other words, airlines tend to base their operations in a few central hubs and rely on a large volume of connecting flights into/out of these hub cities to build out their ‘route map’. It is then not surprising that connecting itineraries involving one or more stopovers are often the more reasonably priced ones (as opposed to direct service that may have fewer choices and generally be more expensive).
We briefly touched upon the selection challenge – in other words, given such a large domain, how do we figure out what subset of itineraries are meaningful to our customers? This makes the search problem even more interesting, as it introduces an element of subjectivity into the equation. Obviously, a flight that is attractive to one customer may be completely irrelevant to another. Or perhaps there are certain customers who need to see a diverse variety of options (cheap, expensive, long, short, all potential airlines, etc.), to be able to compare them and make the best choice for themselves. At Expedia, our approach to situations like these is to learn through a disciplined approach of applied, iterative testing. In this context, BFS is key – it allows us to control and customize the flight search result completely. This enables more testing, more learning and ultimately better outcomes for our customers.
At BFS, we often debate the natural tradeoffs that we encounter in our space. For example, attempts to provide a more complete or comprehensive set of results to our customers can negatively impact our search speed. Similarly, efforts to find cheaper and cheaper flights can introduce complexity that can be costly (in terms of computation and/or real cost). We often need to negotiate between competitiveness (the qualitative nature of our search results), accuracy (making sure the prices we compute are correct and current) and performance (speed and size of the search results). The challenge for us is to tune BFS to perform optimally against these dimensions, while understanding that any such optimization exercise is ultimately fluid. This is because our engine is largely dependent on domain data (flights, fares and availability) that changes constantly in response to market demand, airline revenue management practices, and seasonality.
So where does this leave us when dealing with millions of combinations, the importance of speedy search and the underlying volatility of airline data? In the second part of this post we will explore the technical challenges in BFS in more detail, and dive deeper into the computer science behind flight search.