A utility for data mining flight prices

In August, my girlfriend will be starting medical school in Colorado, and I will be here in San Francisco. Naturally, I have been looking at plane tickets between SF and Denver, and I noticed that, contrary to my expectation, prices were not a monotonic function of how far in advance I searched. I had expected tickets to get steadily more expensive as the flight date got closer, but the pattern I found was something like this:

  • 3+ months before the flight date, the prices are in a "steady state" with little change.
  • Somewhere around 1.5-3 months before the flight date, there is a noticeable dip in prices (the tickets get cheaper).
  • Within a week or two of the flight date, prices get much higher.

My hypothesis is that at least two semi-independent variables affect the price of an airline ticket. One is how far ahead of the flight date you search for the ticket. As I observed, this seems to have a local minimum at 1.5-3 months. I suspect that the flight date itself is the other factor: all other things equal, flights around Christmas and July 4th should be more expensive. I am tentatively calling this date-dependent component the ticket's "intrinsic price". I wanted to test this hypothesis by gathering actual data on ticket prices, and to help with mining that data I created a tool named (so creatively) flight_price.
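To make the hypothesis concrete, here is a minimal sketch of how the two factors could be separated once data exists. It assumes the simplest possible form, price = intrinsic(flight date) + lead effect(weeks in advance), which is my own simplification and may well be wrong (the real relationship could be multiplicative, for instance). The function name fit_additive and the observation format are hypothetical and are not part of flight_price.

```python
from collections import defaultdict
from statistics import mean


def fit_additive(observations, iterations=20):
    """Split prices into a per-flight-date part and a per-lead-time part.

    observations: iterable of (flight_date, lead_weeks, price) tuples.
    Assumes price = intrinsic[flight_date] + lead_effect[lead_weeks],
    and fits both parts by alternating averages of the residuals.
    """
    observations = list(observations)
    intrinsic = defaultdict(float)    # hypothetical "intrinsic price" per flight date
    lead_effect = defaultdict(float)  # price shift per number of weeks in advance

    for _ in range(iterations):
        # Intrinsic price: average of what is left after removing the lead effect.
        by_flight = defaultdict(list)
        for flight, lead, price in observations:
            by_flight[flight].append(price - lead_effect[lead])
        for flight, residuals in by_flight.items():
            intrinsic[flight] = mean(residuals)

        # Lead effect: average of what is left after removing the intrinsic price.
        by_lead = defaultdict(list)
        for flight, lead, price in observations:
            by_lead[lead].append(price - intrinsic[flight])
        for lead, residuals in by_lead.items():
            lead_effect[lead] = mean(residuals)

    return dict(intrinsic), dict(lead_effect)
```

Only differences are meaningful in the result: a constant can be moved freely between the two parts without changing the fit, so the useful outputs are things like "which lead time is cheapest" and "how much more a holiday flight costs than an ordinary one".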

flight_price is a collection of CasperJS scripts and some Python wrappers that automatically retrieve and record flight prices. When you run it, it collects information about airline tickets serving your specified route for the next six months, then stores that ticket information in a database. I am releasing an "alpha quality" version today, which you can use with your own routes and days. Once I have collected a few months of information, I will update flight_price to use the collected data and perform some basic analyses on it. Hopefully that will allow me to characterize the "intrinsic price" of an airline ticket, and determine exactly how many weeks in advance I should buy plane tickets for maximum savings.
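As a rough illustration of the "collect six months of dates, store what you find" loop, here is a small Python sketch using the standard library's sqlite3 module. It is not the actual flight_price code: the table layout, the function names, the weekly sampling step, and the use of SQLite at all are assumptions made for the example, and the scraping step itself is left out.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical schema: one row per (day searched, day of flight) price reading.
SCHEMA = """
CREATE TABLE IF NOT EXISTS price_observations (
    search_date TEXT NOT NULL,
    flight_date TEXT NOT NULL,
    origin      TEXT NOT NULL,
    destination TEXT NOT NULL,
    price_usd   REAL NOT NULL
)
"""


def upcoming_flight_dates(today, days_ahead=182, step_days=7):
    """Return flight dates to query: one per week for roughly six months."""
    return [
        today + timedelta(days=offset)
        for offset in range(step_days, days_ahead + 1, step_days)
    ]


def record_price(conn, search_date, flight_date, origin, destination, price_usd):
    """Store a single price reading, with dates as ISO 8601 strings."""
    conn.execute(
        "INSERT INTO price_observations VALUES (?, ?, ?, ?, ?)",
        (search_date.isoformat(), flight_date.isoformat(),
         origin, destination, price_usd),
    )
    conn.commit()


def lead_days(search_date, flight_date):
    """Number of days in advance that the search was made."""
    return (flight_date - search_date).days
```

Storing the search date alongside the flight date is the important design choice here: it is what lets a later analysis compute how far in advance each price was seen, instead of only knowing the price of each flight.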

