Big Data: An Overused Term with Underused Value

By David Meitz, Managing Director & CTO, ITG

Nowhere has the term “Big Data” been more overused than in financial services, particularly in capital markets. The financial services industry produces massive amounts of data daily. Firms in this industry spend millions annually on capturing and storing every imaginable aspect of a transaction. The granularity of data extracted from queries made and transactions processed allows for a wealth of information to be obtained. But how much of that “Big Data” is actually used, and are firms investing enough time in understanding the value that data and information provide? Certainly plenty of funds are being allocated to building the Big Data infrastructure. What is the return?

Five years ago, Big Data was at the top of conversations in technology and data management. So what has happened in the past five years to capitalize on this phenomenon? From my own interaction with senior technologists at firms similar to ours, as well as with suppliers, it seems that there are two distinct approaches. While most of those I speak with can elaborate on their various storage infrastructures (including the use of the cloud and hybrid storage environments), there is a divide when it comes to the use of the data. A number of firms try to capitalize on the data they capture by using it to identify trends, gain insight into behaviors and build predictive models. But I find it interesting how often I hear the comment “we store a lot of data and have a tremendous amount of information available to our firm”, which suggests to me that these firms have little idea what to do with the data they have amassed.

The value in your data comes from first knowing what you want to get out of it. Just as capital sitting on a balance sheet for long periods of time can be a liability for a firm that should be investing and growing, storing troves of data with no strategic plan for its use is equally a liability: a cost center without a clear return.

"Big Data is an extremely valuable byproduct of a highly volatile, hyper-competitive electronic environment, and it has benefitted various firms by leveraging troves of data"

In a trading firm, where I draw my perspective on this topic, certain sets of data have always been captured. There are regulatory requirements around the globe that mandate being able to trace the history of exchange-bound orders. So storing trade data is nothing new. What is new is the amount of data that is captured from what has become a highly volatile, global and often fragmented securities trading environment. The trading environment itself is much more complex and has seen explosive growth in the data generated just over the past decade. The development of complex trading algorithms, which can creatively manage large lists of securities to trade without significantly moving the market, is one example of how our markets have changed and how modern trading produces much more data. Electronic trading strategies (trading algorithms) seek out liquidity across dozens of execution venues. Each time an algorithm accesses a liquidity pool, that is a transaction, and every transaction produces data. So the “Big Data” question in the case of this particular type of trading becomes: how can the data I have captured help me improve the quality of the trade? And how can that data help right now, while I am still in the process of trading?

This is where storing a lot of data becomes interesting. Using stored data to help make decisions in real time reveals its potential value. Storing data for historical purposes is one thing, but leveraging data in real time to make better-informed decisions can lead to tangible improvements in the form of reduced trading costs.

Algorithms that can leverage both their real-time trading results and historical data (I use the term “historical” loosely as historical data in this context might be only microseconds old) all within the processor can dramatically improve the speed of the decision making and the quality of the trade.
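To make the idea concrete, here is a minimal sketch, not a description of any production trading system. It assumes a hypothetical per-venue fill-rate estimate that starts from a value derived from stored historical data and is then nudged by each real-time order outcome using an exponentially weighted average. The venue names, the starting rates and the smoothing factor alpha are all illustrative assumptions.

```python
def update_estimate(estimate: float, filled: bool, alpha: float = 0.2) -> float:
    """Blend one live order outcome into a venue's fill-rate estimate.

    `estimate` starts as the historical fill rate; `alpha` (an assumed
    smoothing factor) controls how quickly live results override history.
    """
    outcome = 1.0 if filled else 0.0
    return (1 - alpha) * estimate + alpha * outcome


def best_venue(estimates: dict[str, float]) -> str:
    """Return the venue with the highest current fill-rate estimate."""
    return max(estimates, key=estimates.get)


# Hypothetical starting points taken from stored (historical) data.
estimates = {"VENUE_A": 0.60, "VENUE_B": 0.55}

# Two consecutive unfilled orders on VENUE_A arrive in real time.
for filled in (False, False):
    estimates["VENUE_A"] = update_estimate(estimates["VENUE_A"], filled)

print(best_venue(estimates))
```

On history alone the sketch would favor VENUE_A. After two live misses its estimate drops below that of VENUE_B, so the next order would be routed differently. A real algorithm would weigh far more than fill rates, but the principle of combining stored and live data in one decision is the same.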

I have focused on leveraging data for the trading environment simply because that is the challenge we face daily, but the strategic value of leveraging data in real time for decision making potentially has wide applications across the private and public sectors. For example, both physical and information security environments benefit from continually leveraging data sources in real time.

The value is in asking, and answering, the question: “what value is in the data I have captured?”

To really leverage troves of data to benefit your firm and your clients, a thoughtful approach to investment is necessary. Many firms have already begun to invest in their data centers or in the partner relationships that host data for them. But is that money well spent? Not if the data sits idle. While it can be a costly venture at first, using data in real time for decision making is what the promise of Big Data is really all about. What aspects of all of my data do I need to retain? What do I capture in real time? What can be replayed, and how quickly or easily can these replays be accessed? What trends can I derive from the data? These questions have to be asked to generate the most value from your data investments.

What I want to know from the next person who tells me how much “big data” they store are three things:

1. What is your strategy for your use of “big data” in real time?

2. What material benefits have you realized from leveraging your data?

3. Over the last 5 years, have your returns from data storage offset your investment?

Big Data is not just a passing fad, but an extremely valuable byproduct of a highly volatile, hyper-competitive electronic environment. I believe the firms that will benefit most from the growth of Big Data over the long term will be those able to leverage troves of data for immediate information, analysis and decision making in real time, rather than just storing terabytes of data for posterity.