Data Deduplication For One Of The Most Popular Travel Websites Of The UK

The client:

One of the most popular travel websites in the UK.

Client background:

The website has been ranked among the ten most popular travel websites in the UK by a leading national newspaper. It is primarily a hotel comparison site that aims to help vacationers find the best accommodation at the cheapest rates. It has done brisk business since its launch and is expected to outpace its competitors soon, all the more so because it has attracted some high-profile technology investors and is expected to draw more in the near future.

Project requirement:

Identifying and merging duplicate records of hotels on the client’s website.

The challenges:
  • The volume of data was very large, as the client had uploaded information on hotels from all over the world to its website. In all, we had to sift through records for 150,000 hotels to identify the duplicates.
  • Many of the hotels had very similar sounding names, which made identifying true duplicates a complicated process.
  • The client’s reputation hinged on the success of the project, so the work had to be clean and one hundred per cent accurate.
  • The project had to be completed as quickly as possible, since the work was being done while the website was live.

The solution:

We offered data deduplication as the solution. A team of twenty-five data deduplication professionals meticulously combed through all the data to identify and remove the duplicated content. Where needed, web research experts assisted them in distinguishing between hotels with similar sounding names. The work was coordinated and overseen by a dedicated Project Manager, and every piece of work was double-checked to ensure accuracy and consistency.
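
To illustrate the similar-name challenge described above, here is a minimal sketch of how candidate duplicates could be flagged for a human reviewer. It is not the process used on this project, which relied on manual review. The record fields (`id`, `name`, `city`), the helper names and the 0.85 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase the name, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def candidate_duplicates(hotels, threshold=0.85):
    """Return (id, id, score) for same-city records with similar names.

    The threshold is a hypothetical starting point, not a tuned value.
    """
    pairs = []
    for a, b in combinations(hotels, 2):
        # Only compare hotels in the same city (assumed field).
        if a["city"].lower() != b["city"].lower():
            continue
        score = SequenceMatcher(
            None, normalize(a["name"]), normalize(b["name"])
        ).ratio()
        if score >= threshold:
            pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs


# Made-up sample records, for illustration only.
hotels = [
    {"id": 1, "name": "Grand Hotel, London", "city": "London"},
    {"id": 2, "name": "Grand Hotel London", "city": "London"},
    {"id": 3, "name": "Grand Hotel London", "city": "Paris"},
    {"id": 4, "name": "Seaview Inn", "city": "London"},
]

print(candidate_duplicates(hotels))
```

Comparing every pair this way would be slow for 150,000 records; in practice the records would likely be grouped by city or postcode first so that only hotels within the same group are compared. Flagged pairs would still need a person to confirm whether they are the same property.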

Find out how SunTecData can help you make the most of your critical data. Please get in touch with our experts for a free one-on-one consultation or email us at