CrowdStrike is a leader in cloud-delivered, next-generation services for endpoint protection, threat intelligence, and response. The CrowdStrike Falcon platform stops breaches by preventing and responding to all types of attacks—both malware and malware-free. The company has revolutionized endpoint protection by combining next-generation anti-virus technology with endpoint detection and response, coupled with a 24/7 managed hunting service, all delivered via the cloud in a single integrated solution. Falcon uses the patented CrowdStrike Threat Graph™ to analyze and correlate billions of events in real time, providing complete protection and five-second visibility across all endpoints. CrowdStrike Falcon is currently deployed in more than 170 countries.
To proactively stop cyberattacks, CrowdStrike relies heavily on machine learning to analyze data for Falcon Host, a software-as-a-service (SaaS) endpoint protection solution designed to integrate seamlessly into customer environments. Using Apache Spark, the open source big-data processing engine, CrowdStrike performs feature extraction for machine learning workloads to classify event data sent from Falcon Host. “We use machine learning and behavioral analysis techniques to get the full context of what an attacker is trying to do and give customers a deeper understanding of what’s going on,” says Dr. Sven Krasser, chief scientist at CrowdStrike. As a startup, however, the company needed more agility to effectively perform analysis with Spark. Krasser says, “Our team must stay focused on solving hard security problems, so we use AWS to reduce operational overhead.”
The company also needed a more agile way to support its Apache Cassandra distributed database system, which is the foundation of the CrowdStrike Threat Graph. “We use Cassandra to help us get an idea of the current state of a customer’s environment,” says Jim Plush, senior director of engineering for CrowdStrike. “However, we needed the ability to start and stop Cassandra instances when we wanted, so we could have the flexibility to rebuild the environment if, for example, a virtual machine wasn’t working.” CrowdStrike also needed to find a new solution for storing Cassandra data. “We knew we were going to have petabytes of storage in Cassandra, but we needed to find a cost-effective solution for storing that amount of data,” says Plush.
Additionally, CrowdStrike needed more scalability for its Falcon Host environment. “Doing the amount of processing we need to do in a traditional data center would be very challenging because the data set is growing very fast,” says Krasser. “We needed to be able to quickly scale to meet the demands of the data coming in.” High availability was also a major concern for CrowdStrike. Plush says, “We always need to maximize uptime and availability.”
Why Amazon Web Services
From its founding, CrowdStrike was determined to take advantage of the cloud to revolutionize the security industry by being able to quickly aggregate threat information and deliver groundbreaking new protection for customers, without the scalability and performance challenges of on-premises solutions. “If you’re a startup and you’re not in the cloud, you’re already behind the competition,” says Plush. “We knew the cloud could help us spin up resources when we needed them, pivot where we had to, and not use capital expenditures to build our own data centers.” The company decided that Amazon Web Services (AWS) offered the best cloud solution for its purposes. “AWS is the gold standard for cloud computing,” says Krasser. “And as a startup, we were interested in AWS because there are so many services provided. That made it very easy for us to get off the ground quickly.”
CrowdStrike began by using Amazon Elastic Compute Cloud (Amazon EC2) instances for its Falcon Host environment. The company also chose to run its Spark implementation in Amazon Elastic MapReduce (Amazon EMR), a web service that simplifies big data processing by providing a managed big data framework. CrowdStrike pushes data from its sensors to Amazon Simple Storage Service (Amazon S3), and then uses Amazon EMR with Spark to process hundreds of terabytes of event data and roll it up into higher-level behavioral descriptions on the hosts. From that data, CrowdStrike can pull event data together and determine whether malicious activity is present. “We can stream the data into Amazon S3 and go back in and analyze it,” says Krasser. “We run three clusters operated by different internal groups, and they all tap into the same data pool.”
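The roll-up step described above can be pictured with a minimal sketch. It is plain Python rather than Spark, and the event fields (`host`, `type`) and the sample records are hypothetical; the case study does not describe CrowdStrike's actual schema or jobs.

```python
from collections import Counter, defaultdict

# Hypothetical raw sensor events; field names are illustrative only.
events = [
    {"host": "host-a", "type": "process_start"},
    {"host": "host-a", "type": "network_connect"},
    {"host": "host-a", "type": "process_start"},
    {"host": "host-b", "type": "file_write"},
]


def roll_up(raw_events):
    """Group raw events by host and count how often each event type occurs."""
    summary = defaultdict(Counter)
    for event in raw_events:
        summary[event["host"]][event["type"]] += 1
    # Convert to plain dicts so the result is easy to inspect or serialize.
    return {host: dict(counts) for host, counts in summary.items()}


per_host = roll_up(events)
```

In Spark, a step of this shape would typically be written as a grouped aggregation over data read from Amazon S3, so the same logic can be spread across the nodes of an Amazon EMR cluster.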
Most recently, CrowdStrike began moving its Cassandra database from local instance stores to Amazon Elastic Block Store (Amazon EBS), which provides persistent block-level storage volumes for use with Amazon EC2 instances. “We looked at other options, but it came down to cost,” says Plush. “Amazon EBS offered the performance we needed, at a third of the cost of the SSD-backed instance storage.” Even so, CrowdStrike had to overcome some concerns. “Availability is our number-one concern and Amazon EBS historically had some challenges,” says Plush. “But after talking with the EBS team and learning more about the new capabilities in EBS, including independent failover protection for availability zones, we felt very confident with how much work had gone into ensuring a stable product. In our experience over the past year, we have never encountered EBS unavailability.”
CrowdStrike also paired Amazon EBS with the newer, compute-optimized Amazon EC2 C4 instances. “The C4 instances give us the best balance of performance and memory for Cassandra,” says Plush. “We needed the CPUs to handle write loads of one million per second.”
By using AWS to support Falcon Host, CrowdStrike now has the agility to quickly spin up a new Amazon EMR cluster when needed. “It’s very convenient for us to add some task nodes to our Amazon EMR cluster if, for example, a processing job runs behind,” says Krasser. “That agility is critical for us when we want to try something new using large-scale analysis we hadn’t provisioned for. With EMR, it’s as simple as running a script, and the cluster is configured to run the new kinds of jobs we want to run. That means we can move a lot faster on validating a new hypothesis or training model.”
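As an illustration of what such a script might involve, the sketch below builds the parameters for the boto3 EMR client's `add_instance_groups` call, which adds an instance group to a running cluster. The helper name, cluster ID, group name, instance type, and node count are all hypothetical; the case study does not say how CrowdStrike's own scripts work.

```python
def build_task_group_request(cluster_id, instance_type, instance_count):
    """Build keyword arguments for the EMR add_instance_groups API call."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return {
        "JobFlowId": cluster_id,
        "InstanceGroups": [
            {
                "Name": "extra-task-nodes",  # hypothetical group name
                "Market": "ON_DEMAND",
                "InstanceRole": "TASK",      # task nodes run work but hold no HDFS data
                "InstanceType": instance_type,
                "InstanceCount": instance_count,
            }
        ],
    }


# Hypothetical cluster ID and instance type, for illustration only.
request = build_task_group_request("j-EXAMPLE", "m4.xlarge", 4)

# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   boto3.client("emr").add_instance_groups(**request)
```

Keeping the request construction separate from the API call makes it possible to review or test the parameters without touching a live cluster.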
Similarly, CrowdStrike has more agility in managing its Cassandra environment. “Running Cassandra on Amazon EBS, we can pick and choose our instance types when we want to experiment, without having to reload all the data,” Plush says. “We recently had a situation where we couldn’t expand our cluster anymore because of a bug in Cassandra. Using Amazon EBS, we were able to quickly bring up a separate 100-plus node cluster. Then, after a few weeks, we got rid of the old cluster. All of this was done without affecting performance. We could never have done that in a traditional physical environment, because it would require a large maintenance window and a lot of downtime. And, ultimately, that would have cost us a lot of our developers’ time in terms of developing new features.”
The company is also taking advantage of the scalability of AWS to meet its data growth demands. “We’re growing very rapidly, and with AWS we can scale quickly without having to do a lot of capacity planning,” says Krasser. “In a traditional data center environment, you always need to overprovision to plan for the worst-case scenario. But with AWS, if that scenario happens, you can still provision more instances without needing to install new hardware.” That scalability also helps the company support its growing Cassandra database. “With Amazon EBS, we started with a smaller cluster that could handle the performance we needed initially, and as our Cassandra data scaled, we were able to easily add new nodes and increase capacity and performance,” says Plush.
In addition, the organization is benefiting from having higher availability for its machine learning and Cassandra environments. “We are getting the high availability and security we need by being on AWS,” says Krasser. “We can replicate geographically within a region across multiple availability zones. In a traditional data center environment, it would have taken us years to get that kind of availability and redundancy baked in.”
As the company continues to grow, CrowdStrike anticipates adding other AWS services. “When we started out as a security company in the cloud, people were skeptical,” Plush says. “But now more companies are realizing that the cloud is actually more secure than many data centers, and it provides agility and scalability as well. We’re seeing more of a shift to the cloud, which shows we were ahead of the curve when we went to the cloud. We’re looking forward to continuing to add more AWS services in the future.”