InfoSpace, a subsidiary of Blucora, is a leading provider of metasearch and monetization solutions for customers and partners worldwide. Founded in 1996, InfoSpace offers search monetization solutions to a global network of more than 100 partners. The company blends top search results from Google, Yahoo! and other popular search engines to deliver relevant results for customers, such as Publishers Clearing House, Info.com and Iminent, as well as its own branded search sites, which include Dogpile, MetaCrawler, and WebCrawler. InfoSpace is based in Bellevue, WA, and has approximately 120 employees, 52 percent of whom are engineers.
Prior to AWS, InfoSpace was using colocation facilities in Washington and Virginia to manage its infrastructure. Contracts with both facilities were due to expire by mid-2013, and power, distribution, and maintenance problems led the IT organization to evaluate the viability of remaining with these data centers. It was also time to refresh server and network equipment, and continuing to use the data centers would have meant more than $1.3 million in capital investment in one year.
InfoSpace processes about 128 million queries per day and collects between 75 and 80 GB of log data daily. Furthermore, the company’s international presence was expanding. “We operate a large global network of partners and traffic and we wanted to be able to locate infrastructure close to partners to improve search response times,” says Wayson Vannatta, Sr. Director of IT and Operations. “Once we reviewed our options, we decided to move to a cloud solution.”
Why Amazon Web Services
After considering several cloud service providers, InfoSpace chose Amazon Web Services (AWS) because of the maturity of the platform and the availability of APIs and tools that engineers could use to automate processes. Additionally, as Paul Kearney, Chief Architect of InfoSpace, explains, “AWS has a wealth of knowledge and best practices that we wanted to leverage to run a highly available infrastructure.” After finalizing a proof of concept, InfoSpace started the migration process in January 2013, with the goal of moving all traffic to the cloud by June.
Running a Microsoft stack on AWS
To keep the business running while meeting a tight deadline, the IT organization created a master plan to migrate its data centers’ traffic to AWS and restructured the IT organization’s engineers and operations staff into functional teams for the transition. One team worked on making the search application stack cloud ready while another team developed tools to support the cloud environment. InfoSpace also restructured a team to maintain the current infrastructure and keep partners informed about the move to AWS.
The search application was created using the Microsoft .NET Framework and runs exclusively on Microsoft Windows Server 2008 R2. The application’s back end consists of a set of APIs that accept requests for queries. When a search request comes in from a partner site, the application looks up and retrieves configuration information to identify how the partner wants the search results displayed. After retrieving the configuration information, the search application makes calls to content sources (e.g., Google or Yahoo) to retrieve results. Then it applies an algorithm to dedupe and order results in a way that is useful to the partner. If the search request is from an InfoSpace site, the application turns the XML search results data into HTML. Partner sites are responsible for displaying the XML results on an HTML page for their customers.
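The dedupe-and-order step described above can be illustrated with a short Python sketch. The case study does not describe InfoSpace's actual algorithm or its .NET code, so everything here is an assumption made for illustration: the `Result` type, the URL normalization used as the dedupe key, and the reciprocal-rank scoring are all hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Result:
    source: str  # content source name, e.g. "google" (hypothetical field)
    rank: int    # 1-based position in that source's result list
    url: str
    title: str


def normalize_url(url: str) -> str:
    """Assumed dedupe key: lowercased host without 'www.' plus the path
    without a trailing slash. Scheme and query string are ignored."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def blend(result_lists: list[list[Result]], limit: int = 10) -> list[Result]:
    """Merge per-source result lists, drop duplicates, and order by score.

    Assumed scoring: each occurrence of a URL adds 1/rank, so a page that
    several sources rank highly rises to the top.
    """
    scores: dict[str, float] = {}
    best: dict[str, Result] = {}
    for results in result_lists:
        for r in results:
            key = normalize_url(r.url)
            scores[key] = scores.get(key, 0.0) + 1.0 / r.rank
            # Keep the best-ranked copy of each duplicate.
            if key not in best or r.rank < best[key].rank:
                best[key] = r
    # Highest score first; ties are broken by key so the order is stable.
    ordered = sorted(best, key=lambda k: (-scores[k], k))
    return [best[k] for k in ordered[:limit]]
```

Under these assumptions, a page returned by two content sources appears once in the blended list and is ranked above pages returned by only one source at a similar position.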
The IT team provisioned Amazon Virtual Private Cloud (Amazon VPC) to create a private section of the AWS Cloud for the application. “Content partners like Google and Yahoo whitelist (register) IP addresses based on origin of request,” explains Kearney. “By using Amazon VPC technology, we can easily maintain a manageable list of IP addresses that are accepted by our providers.” The environment includes Microsoft Windows Server Amazon Machine Images (AMIs) running on Amazon Elastic Compute Cloud (Amazon EC2) instances across multiple Availability Zones in the US East (Northern Virginia), US West (Northern California), and EU (Ireland) Regions.
InfoSpace uses Amazon CloudFront as its content delivery network, Amazon Route 53 for DNS service, and Amazon Simple Storage Service (Amazon S3) to store assets as well as log files. Amazon S3 is also an intermediary transfer point to move log files from Amazon EC2 to the company’s on-premises data warehouse for reporting and analytics. Figure 1 shows InfoSpace’s search architecture on AWS.
Figure 1. InfoSpace Architecture on AWS
Before moving to AWS, the engineering team created a test tool called “fire and forget” that sent requests to AWS whenever the application received a request in InfoSpace’s data center environment. The data center request was processed and returned to the user. A duplicate of the request was processed in the AWS Cloud, which allowed InfoSpace to test production-level loads that matched actual traffic patterns. Using this tool, engineers were able to identify the capacity requirements for a given traffic level and determine the size and number of instances they would need in each Region. InfoSpace currently uses Elastic Load Balancing to distribute traffic across 490 Amazon EC2 instances.
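The request-duplication idea behind “fire and forget” can be sketched in Python as follows. This is not InfoSpace's tool, which the case study does not show. The `RequestMirror` class and its handler callables are hypothetical names, and in a real deployment the shadow handler would presumably be an HTTP call to the cloud environment with a short timeout.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A handler takes a query string and returns a response body (assumed shape).
Handler = Callable[[str], str]


class RequestMirror:
    """Serve each request from the primary handler and copy it to a shadow."""

    def __init__(self, primary: Handler, shadow: Handler, workers: int = 4) -> None:
        self._primary = primary
        self._shadow = shadow
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def _send_shadow(self, query: str) -> None:
        try:
            self._shadow(query)
        except Exception:
            # Shadow failures must never affect the user-facing path.
            pass

    def handle(self, query: str) -> str:
        # Queue the duplicate and do not wait for its result.
        self._pool.submit(self._send_shadow, query)
        # Only the primary (data center) response goes back to the user.
        return self._primary(query)

    def close(self) -> None:
        # Wait for queued shadow requests to finish, e.g. at shutdown.
        self._pool.shutdown(wait=True)
```

The design choice that matters is that the shadow response is discarded and its errors are swallowed, so the environment under test sees real traffic patterns without being able to slow down or break responses to users.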
Optimizing a Windows environment on the AWS Cloud
In May 2013, InfoSpace began an incremental migration with several deployment dates. The company segmented its business into two groups: hosted traffic for InfoSpace branded sites and distribution traffic for partners. After a few pilot tests, InfoSpace moved the bulk of its traffic to AWS within a two-and-a-half-week period.
Following a successful migration, the InfoSpace team started to stabilize the environment, which included refining the release process and moving DNS records to the correct address. InfoSpace uses Sumo Logic to manage over 200 GB of data daily and Chef to automate deployment and configuration processes. “It used to take two weeks to build, configure, and deploy a new machine at our colocation centers. There wasn’t a lot of automation even though the environment was virtualized,” says Kearney. “Now we can take a generic AWS pre-configured Windows Server AMI and use Chef at boot time to install .NET, Internet Information Services (IIS), and our application onto an instance in 20 minutes. Instead of deploying new versions of the application to existing machines, we just create new instances.”
With careful planning, and by working closely with AWS solutions architects, InfoSpace was able to complete a full data center migration, including its Microsoft Windows stack, within six months while supporting a traffic increase of more than 30 percent. Using AWS, InfoSpace is able to create a global infrastructure to support its international clients. “Using AWS makes our approach to solving problems simpler and quicker,” says Vannatta. “There are a lot of cost and tax considerations when opening up facilities overseas. AWS provides a very easy path to an international presence.”
Search response times have improved for both international and domestic customers. Vannatta estimates that response times improved by 20 percent for international traffic and by 10 percent for domestic traffic. “Moreover, we estimate that moving to AWS reduced our 2013 capital budget by 72 percent,” he continues. “We’re able to eliminate the need for 24/7 staffing by automating our monitoring, alerting, and response process, and we’re trending towards a 32 percent decrease in operational expenses in 2014. When our business unit told us that traffic in South America and Asia was increasing, we knew that we could deploy our application stack into these regions rapidly.”
The IT organization didn’t have cloud service experience prior to moving to AWS. “By working with AWS, breaking down internal boundaries, and staying close to our partners, we were able to do something amazing for our business,” says Vannatta. “Our employees gained cloud experience and we are now seen as a value-add organization instead of a cost center. It’s a much tighter technical organization and I think that AWS allows us to be an even closer and more talented team.”