Category “Big Data”

Published
July 6, 2017
Author
Ram Hariharan
Estimated reading time: 5 minutes

Turbocharging Machine Learning

By Ram Hariharan in Big Data, Cloud, Data Science on July 6, 2017

The Challenge: How can we scale bidding? Expedia attracts customers through meta search sites like Trivago, Google Hotel Ads, TripAdvisor, and Kayak. As you might guess, meta sites are very competitive marketplaces. This comes as no surprise: the winner spot gets 80% of the clicks, and traffic drops off exponentially from there. To compete effectively, good prices and an optimal bid are both important, because the meta site uses a proprietary algorithm that involves the hotel’s price and associated bid to choose a hotel offer for the winner spot. We have to be smart about our bids for each of…

Published
December 29, 2016
Estimated reading time: 10 minutes

Operationalizing Spark Streaming (Part 1)

In Big Data, Data Science, Lessons Learned on December 29, 2016

For those looking to run Spark Streaming in production, this two-part article contains tips and best practices collected from the front lines during a recent exercise in taking Spark Streaming to production. For my use case, Spark Streaming serves as the core processing engine for a new real-time Lodging Market Intelligence system used across the Lodging Shopping stack on Expedia.com, Hotels.com and other brands. The system integrates with Kafka, S3, Aurora and Redshift and processes an average of 500 msg/sec, with spikes up to 2000 msg/sec. The topics discussed are: Availability: Getting Spark running and…

Published
July 28, 2016
Author
Willie Wheeler
Estimated reading time: 10 minutes

Applying data science to monitoring

By Willie Wheeler in Big Data, Data Science, Devops on July 28, 2016

Lately, in collaboration with Karan Shah, I’ve been focusing most of my efforts on operational monitoring. We need to know when bad things are happening, are about to happen, or have been happening for a long time. Monitoring is a good example of a problem that’s easier to state than it is to solve. In practice, a lot goes wrong: Sometimes we fail to monitor things that we care about. Sometimes we monitor things we care about, but we route the alerts to the wrong audience. Sometimes we monitor things…

Published
September 25, 2015
Author
Patrick Bradley, Philippe Deschenes and Rolland Mewanou
Estimated reading time: 8 minutes

Solving problems with very large Java heaps

By Patrick Bradley, Philippe Deschenes and Rolland Mewanou in Big Data, Devops, Lessons Learned on September 25, 2015

Users depend on our sites to be up and running at all times. We have many critical services that must stay up and running to deliver that uptime, so those services need to be able to respond to changes in user behavior and load. This is a story about one of our hotel content services that feeds a large amount of hotel content to systems throughout Expedia, how it temporarily ran into problems, and how we fixed it. This content service is responsible for serving…
