Evaluating the need for speed: When does IoT data analysis require a supercomputer?


Internet of Things devices and sensors generate a lot of data, and analyzing all of it to gain actionable insights can be an imposing prospect.

While many businesses may be content to make modest investments in data analysis and gain insights at a leisurely pace, those that need rapid or complex analysis stand to benefit from growth and advancement in the supercomputing industry.

That's the assessment from Barry Bolding, Cray senior vice president and chief strategy officer, who argued that businesses ought to consider a supercomputing data analysis solution when speed is a critical factor.

"If you want to move data from the Internet of Things in and out of a simulation or model very quickly, having a Cray [supercomputing] resource rather than a cloud resource can actually remove bottlenecks," Bolding said.

That's an important distinction because many businesses, even those for whom speed is somewhat important, are defaulting to a cloud solution after concluding that the added speed of supercomputing doesn't justify the higher cost.

However, according to IDC analyst Steve Conway, companies making that judgment might simply not fully understand the cost-benefit tradeoff that high-powered computing could provide them. "Even in big companies, there are archaic perceptions about what supercomputers are," Conway said.

Of course, that's not to say that every business that wants to analyze IoT data needs a supercomputer. Ultimately, companies that could benefit from a supercomputer are those that need data insights delivered to them at breakneck speeds, while businesses with more modest IoT data needs, or those that can afford to let insights trickle in, can safely stick with a cloud-based solution.

When is a supercomputer warranted?

One of the oldest and largest supercomputer vendors, Cray has been in the space since 1972, accounts for 31 percent of the system share and manufactures five of the 10 most powerful supercomputers in the world.

For years Cray and the supercomputer market overall limped along, at risk of being eclipsed by ever more powerful and less expensive smaller computers. That's changed recently, though. "The market has really been on a tear, and continues to be on a tear, and there's a reason for that. It's not because government is buying a whole lot more, it's because industry is buying a lot more," Conway said.

Once an arena of multi-million-dollar systems used only by a select few government bodies, the supercomputer market has shifted: 80 percent of supercomputers now cost under $100,000, according to Conway.

That puts them within financial reach for many enterprises, although whether the cost economics work out in a supercomputer's favor varies tremendously from use case to use case.

IoT is one use case that can be a good fit for supercomputers, given the sheer volume of data it generates. Still, businesses are using IoT in so many different ways that drawing a clear line about when supercomputing resources would help is a tricky proposition. Broadly speaking, though, some companies can afford to wait a little longer for data analysis reports, while for others, fast data analysis is a critical part of day-to-day operations.

Consider connected cars, which depend on huge quantities of traffic data passing back and forth to operate. The systems behind them need to take in traffic data from each connected car, analyze it as part of live traffic management, then use the insights from that analysis to send route information back to the car.

Furthermore, if that analysis doesn't happen near-instantaneously, the route information can come back to the car too late to be of use: the car will already have moved on. That, according to Conway, makes traffic data a prime use case for supercomputers, because it depends so heavily on real-time insights.
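To make that loop concrete, here is a minimal Python sketch of the ingest-analyze-respond cycle a connected-car system runs. The names, thresholds and two-second latency budget are illustrative assumptions, not any vendor's actual traffic platform; the point is simply that a report which misses its latency window is no longer worth acting on.

```python
# Minimal sketch of an ingest-analyze-respond loop for live traffic data.
# All names and thresholds are hypothetical illustrations.
import time
from dataclasses import dataclass

LATENCY_BUDGET_S = 2.0   # hypothetical: older insights arrive after the car has moved on
SLOW_TRAFFIC_KPH = 20.0  # hypothetical congestion threshold

@dataclass
class TrafficReport:
    vehicle_id: str
    road_segment: str
    speed_kph: float
    sent_at: float  # epoch seconds when the car sent the report

def handle(report: TrafficReport, segment_speeds: dict) -> str:
    """Ingest one report, update the live traffic picture, and answer the car."""
    if time.time() - report.sent_at > LATENCY_BUDGET_S:
        return "discard: analysis would arrive too late to be useful"
    # Keep a running view of how fast traffic is moving on this segment.
    prev = segment_speeds.get(report.road_segment, report.speed_kph)
    segment_speeds[report.road_segment] = 0.5 * prev + 0.5 * report.speed_kph
    # Send route information back based on the freshly analyzed data.
    if segment_speeds[report.road_segment] < SLOW_TRAFFIC_KPH:
        return f"reroute: avoid {report.road_segment}"
    return "keep current route"

if __name__ == "__main__":
    speeds: dict = {}
    fresh = TrafficReport("car-42", "I-90-exit-3", speed_kph=12.0, sent_at=time.time())
    stale = TrafficReport("car-7", "I-90-exit-3", speed_kph=55.0, sent_at=time.time() - 10)
    print(handle(fresh, speeds))  # acted on within the latency budget
    print(handle(stale, speeds))  # dropped: the car has already moved on
```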

If you'll pardon the phrasing, that need for speed is a determining factor in whether a use case requires a supercomputer. Bolding made it one of three criteria he laid out for when a company would need one:

  1. Complex simulations that would need to be restarted if a hardware component goes down, as often happens in distributed, or cloud, computing.

  2. Cases where actionable data is needed faster than a cloud-based simulation can provide.

  3. Analyses that need to run continuously, thereby justifying the cost of a supercomputer.

Bolding suggested that if you're using a system continuously, and therefore would be paying high monthly service fees for a cloud solution, then the "one-time cost" of a supercomputer would be cheaper in the long run.

That's an analysis based on Cray's in-house cost prediction models that it doesn't share with the public, so take that with a healthy dose of skepticism. But Conway agreed that there are some use cases that make supercomputers the most cost-effective option.
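For a sense of how that long-run argument works in principle, here is a back-of-the-envelope sketch. Every figure in it is a hypothetical placeholder, not Cray's non-public cost model or real cloud pricing; the structure of the calculation, not the numbers, is the point.

```python
# Back-of-the-envelope break-even sketch for the continuous-use argument above.
# All figures are hypothetical placeholders, not Cray's (non-public) cost
# models or real cloud pricing.

def breakeven_months(system_cost: float,
                     monthly_ops: float,
                     cloud_monthly: float) -> float:
    """Months of continuous use after which an owned system becomes cheaper
    than renting equivalent always-on cloud capacity."""
    monthly_saving = cloud_monthly - monthly_ops
    if monthly_saving <= 0:
        return float("inf")  # the cloud never costs more, so owning never pays off
    return system_cost / monthly_saving

if __name__ == "__main__":
    # Hypothetical: a $90,000 entry-level system with $1,500/month in power and
    # administration, versus $6,000/month for comparable always-on cloud capacity.
    months = breakeven_months(system_cost=90_000,
                              monthly_ops=1_500,
                              cloud_monthly=6_000)
    print(f"Break-even after roughly {months:.0f} months of continuous use")
```

The same arithmetic cuts the other way for intermittent workloads: if the system sits idle most of the month, the cloud bill shrinks and the break-even point recedes.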

When should companies default to the cloud?

There are plenty of use cases that don't fit Bolding's criteria – the storefront that's using in-store sensors to track data on customers coming and going isn't going to need a supercomputer. And to a certain extent, Bolding conceded that.

"There are many Internet of Things applications where speed isn't the most important thing," he said. "We're not even attacking the small to medium businesses at this point."

But interestingly, some companies that do fit Bolding's criteria are also choosing to avoid the supercomputer route.

For instance, while GE uses supercomputing resources in its research, it depends on its cloud-based Predix engine for a number of programs that run on data generated from IoT sensors. The GE departments using Predix rely on that data analysis for their day-to-day operations, but don't feel that the added speed of a supercomputer would justify the higher costs.

"The use cases I think we're looking at don't necessarily need that extra horsepower yet," said Wesley Mukai, chief technology officer of software at GE Transportation. "So far we haven't seen where we need to have more computer power."

Mukai did say that depending on the cloud added a layer of difficulty in developing the platform, as programmers needed to correct for the possibility of a component going down, but the cost economics required that they take that route.

Conway suggested that may not actually be the case. He said that there's a real lack of awareness about the capabilities and costs of current-day supercomputers. "Of companies that have moved up to supercomputers, a high proportion were because someone had a background in supercomputing," he said.

Still, "you shouldn't overbuy," Conway said. "People can get along without a supercomputer even if they have large volumes of data."

In fact, big data cases can be ideally suited to a cloud solution. Conway pointed to Novartis, a pharmaceutical company, which used Amazon Web Services to analyze 21 million molecules in a four-hour period.

That's not an IoT use case, but it's an example of a solution that worked well in the cloud, according to Conway, because it didn't matter if a component went down: testing each molecule was its own process, so the entire run didn't need to restart when one failed part way through. And while the company paid a sizable sum for the testing, around $20,000, according to Conway, it could have paid less if it had been willing to wait longer for the results.
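The pattern Conway describes, where each molecule is an independent task, is precisely what makes a failed component a non-event. The sketch below is a generic Python illustration of that per-task retry pattern, not the actual Novartis or AWS workflow; a task that fails is simply resubmitted, and the run as a whole never restarts.

```python
# Generic illustration of per-task independence: each molecule is its own task,
# so a failure is retried on its own and the run as a whole never restarts.
# score_molecule and the simulated failure rate are hypothetical stand-ins.
import random
from concurrent.futures import ProcessPoolExecutor, as_completed

def score_molecule(molecule_id: int) -> float:
    """Stand-in for scoring one molecule; occasionally a 'node' fails."""
    if random.random() < 0.01:
        raise RuntimeError(f"node lost while scoring molecule {molecule_id}")
    return (molecule_id * 2654435761) % 1000 / 1000.0  # deterministic fake score

def run_screen(molecule_ids, max_retries=3):
    results, retries = {}, {m: 0 for m in molecule_ids}
    with ProcessPoolExecutor() as pool:
        pending = {pool.submit(score_molecule, m): m for m in molecule_ids}
        while pending:
            for future in as_completed(list(pending)):
                m = pending.pop(future)
                try:
                    results[m] = future.result()
                except RuntimeError:
                    if retries[m] < max_retries:
                        retries[m] += 1
                        # Resubmit only the failed task; everything else keeps going.
                        pending[pool.submit(score_molecule, m)] = m
                    else:
                        results[m] = None  # give up on this one molecule only
    return results

if __name__ == "__main__":
    scores = run_screen(range(1_000))
    done = sum(v is not None for v in scores.values())
    print(f"scored {done} of {len(scores)} molecules without restarting the run")
```

That independence is also what opens up the time-for-money tradeoff: the same tasks can be spread across fewer machines over a longer window.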

And that's the key factor: The cloud can handle big questions, and up to a point, it can handle complex ones. But if speed is a critical factor, and a company needs an answer back not in four hours, not in an hour, but in a couple minutes or seconds, then it likely needs a supercomputing solution.

For more:
- learn about GE's Predix Engine
- check out Cray's product line
