Expanding the field of complex event processing software with another offering, Twitter will release as open source its software for analysing large-scale data streams live, called Storm, the company said.
Twitter acquired the software when it purchased BackType in July. BackType offered a service that analysed the impact of an organisation’s Twitter feed, summarising how often Twitter messages were repeated by others, the company explained.
Although the software has been compared to Hadoop, Storm is best suited for analysing live data streams, such as millions of Twitter feeds, the company pointed out.
This approach could provide a speedier and more practical alternative to the traditional approach to real-time analysis, which can involve first storing the data in a database, data store or data warehouse. Its use is not limited to Twitter, however. Storm could be used to study other forms of unstructured, frequently updated data, the company reported.
“The beauty of Storm is that it’s able to solve such a wide variety of use cases with just a simple set of primitives,” said Nathan Marz, lead engineer at BackType, which Twitter acquired in July 2011.
“The user creates a query, or search term, that will continue to run against an ever-updating stream of data until the query is terminated. Because Storm can be distributed across multiple servers, it is capable of analysing large amounts of data,” he explained.
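The idea of a query that keeps running against incoming data can be illustrated with a minimal sketch. It is written in plain Python and does not use Storm's actual API; the function name `standing_query`, the sample messages and the running-count output are all assumptions made purely for illustration.

```python
from typing import Iterable, Iterator, Tuple


def standing_query(stream: Iterable[str], term: str) -> Iterator[Tuple[str, int]]:
    """Yield (message, running match count) for each message containing term.

    Hypothetical helper: it processes messages one at a time as they arrive
    and keeps going until the stream ends or the caller stops iterating.
    """
    matches = 0
    needle = term.lower()
    for message in stream:
        # Case-insensitive substring match stands in for a real query.
        if needle in message.lower():
            matches += 1
            yield message, matches


# A short list stands in for a live, ever-updating feed.
messages = ["Storm is open source", "lunch time", "trying out storm today"]
results = list(standing_query(messages, "storm"))
```

Because the function is a generator, nothing is stored before it is analysed: each message is examined as it is read, which mirrors the contrast with store-first analysis. Distributing the work across multiple servers, as described for Storm, is outside the scope of this sketch.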
This sort of analysis is sometimes called complex event processing (CEP). Oracle, StreamBase, SAP and other companies also offer CEP software. “Unlike most of these products, however, Storm does not have a built-in storage layer, relying instead on external data stores,” Marz pointed out.
Twitter will launch the software next month at the Strange Loop conference in St. Louis, the company announced.