Speaking Data Science with an Investment Accent

doi:10.2469/cfm.v28.n4.4

Cynthia Harrington, CFA

CFA charterholders are using investment knowledge to translate data science into business models and career growth.

Can investment professionals use investment knowledge to translate data science into business models and career growth? Some professionals, including CFA charterholders, have already done it, and their experiences can help others find a way.

Key Points

As the investment industry’s use of machine learning and data science accelerates, the growing need for qualified candidates has caused “a big imbalance” between supply and demand, with increasing shortages of professionals capable of filling data-science roles.

For every data scientist that firms hire, they need even more “translators” with the knowledge and skills to connect what the data reveal to real-world business problems.

Some data scientists are seeking the CFA charter to give them domain expertise in investing and markets.

Introduction

The investment industry’s demand for data scientists is outstripping supply. Some data scientists are commanding multimillion-dollar salaries. In Silicon Valley, fresh-out-of-school salaries for data scientist jobs top $100,000 plus benefits and equity stakes, and once employed, new hires are contacted daily by recruiters with half-million-dollar offers to switch companies. The talent shortage could reach as many as 250,000 data scientists by 2024, according to “The Age of Analytics: Competing in a Data-Driven World,” a McKinsey Global Institute (MGI) study released in December 2016.

Considering this kind of demand and lucrative compensation, investment professionals need to figure out how to position themselves for these new opportunities.

A Quant by Any Other Name

The first wave of quants was made up of PhDs in physics and computer science who flooded Wall Street in the 1980s and 1990s (after the Cold War ended) to satisfy demand for investment decision making that eliminated human bias. These early investment quants didn’t want to know anything about fundamental investing; they surprised and upset the old order of fundamental investors in their operational hierarchy of analysts and portfolio managers.

The change was abrupt. In fact, Renaissance Technologies became the first quant fund in 1982. Based in East Setauket, New York, the firm made a point of having a name with nothing in it that smacked of investments, capital, or finance. And its approach was also different. PhDs in math and computer science created algorithms that scoured financial and economic data for correlations, and the firm’s hiring process was reputed to screen out finance (and even Wall Street) backgrounds. The company’s performance, however, made its employees rich as assets grew to $65 billion.

New quants build on some of these old foundations in such ways as devising algorithms for automated trading, but with today’s new techniques and tools, modern quants are out of the back rooms and employing much broader datasets from both inside and outside their firms. Recent advances in artificial intelligence (AI) technology, combined with increased computational power, now make it possible to generate trading models that could not be imagined two or three years ago.

Rebrain.AI, an AI-powered investment management company based in San Francisco, is putting new technology to work after a decade of testing with the launch of Darwin in 2017. “Unfortunately, the rigidity of these [older quantitative] models is poorly adapted to the constant evolution of the financial markets,” says Sylvain Morel, CEO at Rebrain.AI and founder and executive chairman at digital media group Adthink, headquartered in Tassin-la-Demi-Lune, France. “Our AI-driven trading models can teach themselves to adapt to changing market conditions without any human guidance or instruction.”

In fact, the recent advances in data-driven asset management represent a third wave of quantitative investing. “The third wave is characterized by AI and deep learning, supported by sophisticated machine-learning networks running on Theano and TensorFlow,” says Aditya Khandekar, CFA, chief analytics officer at Scienaptic in New York City.

Scienaptic offers data analytics services to various sectors on its proprietary Ether platform. Clients from all sectors use the platform to “infuse intelligence in the way organizations decipher customer behavior patterns, design intelligent interventions, and create real value in delivering a superior customer experience.”

The first wave of quants devised statistical models based on the theory-driven normal distribution of data. The second wave brought models based on heteroskedasticity (when standard deviations of a variable are not constant over time) and other complicating abnormalities. To address these conditions, the second wave of quants lightened up on statistical theory in favor of machine-learning models with the goal of optimizing the math.
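Heteroskedasticity can be made concrete with a small simulation. The sketch below is illustrative only: the two volatility regimes (1% and 3% per period), the 100-period window, and the function names `simulate_returns` and `rolling_std` are assumptions chosen for the example, not anything drawn from the firms quoted here.

```python
import random
import statistics


def simulate_returns(n_calm=500, n_stressed=500,
                     calm_vol=0.01, stressed_vol=0.03, seed=7):
    """Zero-mean returns whose volatility changes partway through the sample."""
    rng = random.Random(seed)
    calm = [rng.gauss(0.0, calm_vol) for _ in range(n_calm)]
    stressed = [rng.gauss(0.0, stressed_vol) for _ in range(n_stressed)]
    return calm + stressed


def rolling_std(values, window):
    """Sample standard deviation over each trailing window of observations."""
    return [statistics.stdev(values[i - window:i])
            for i in range(window, len(values) + 1)]


returns = simulate_returns()
vol = rolling_std(returns, window=100)

calm_vol_estimate = vol[0]       # first window: calm regime only
stressed_vol_estimate = vol[-1]  # last window: stressed regime only
print(f"calm: {calm_vol_estimate:.4f}  stressed: {stressed_vol_estimate:.4f}")
```

A model that assumes a single, constant standard deviation would describe neither half of this series well, which is the kind of complication the second wave of quants set out to handle.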

The third wave focuses on deep learning, supported by advances in machines and software. Graphics processing units (developed for gaming) not only speed up massive calculations but also make it practical to use big data to classify images and to perform video analytics, speech recognition, and natural language processing. Accompanying these foundational technologies are libraries of machine-learning routines (like Theano) so users don’t have to reinvent the wheel for each new process. “What used to take five or six days to converge data can now take a couple of hours,” says Khandekar. “With these tools, we’re able to train large data, be efficient and scalable.”

Data science is not a single body of knowledge. It encompasses a range of new fields. Whereas most early quants were out looking for alpha, new data scientists have a broader mandate. Daniel McAuley, CFA, a manager of data science at Wealthfront in San Francisco, illustrates this point. Wealthfront’s quant team performs essentially the same duties McAuley did while working at a hedge fund. At Wealthfront, this group is called the research team. They all have PhDs and work every day to develop tactical harvesting, automated indexing, and software from the finance perspective. McAuley, by contrast, uses his finance background and top-level programming skills to look at data across the business. His team runs A/B tests on product features, determines from whom and where growth in the business is coming, and quantifies customer acquisition costs and customer lifetime value. The team also analyzes customer-service patterns to project the costs and resources the company would need to serve 10 times the number of customers it previously served. “The old saw in our sector is true,” says McAuley. “If you ask three data scientists to define their job title, you’ll get four answers.”
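As one illustration of the A/B testing mentioned above, the following sketch compares conversion rates for two versions of a product feature using a standard two-proportion z-test. The visitor and conversion counts are invented for the example, and the function name `two_proportion_z_test` is hypothetical; nothing here describes Wealthfront’s actual tooling.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_error
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: version A converts 200 of 5,000 visitors,
# version B converts 260 of 5,000.
z_stat, p_val = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z_stat:.2f}, p = {p_val:.4f}")
```

A small p-value suggests the difference between versions is unlikely to be noise, but the translation step the article describes still matters: someone has to decide whether the lift is large enough to justify changing the product.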

McAuley, who hires data scientists for his team, says there is a big imbalance between supply and demand when it comes to the need for people who have the necessary skills. What are those skills? The technical skills—being able to write code and construct and understand models—are necessary but not sufficient. Data scientists must also have communication skills so executives trust and understand the output of the models. “Our mission at Wealthfront is to improve decision making using data and the scientific method,” says McAuley. “My team delivers an understanding of how the wealth products are working and why. If our work doesn’t cause anyone to change behavior, then it’s not fulfilling.”

Morel at Rebrain.AI is aggressively hiring to support growth at his fund. The most important goal for him is to build a team composed of very different profiles in order to have as much horizontal skill as possible. Good ideas do not necessarily come from mathematicians or AI specialists but rather from those who take a different view of a problem. Thus, in the coming months, he is looking for market specialists, quants, and computer scientists—as well as biologists, philosophers, and psychologists. “We are still at the very beginning of the AI, and no one can say what the evolutions of tomorrow will be,” he says.

As long as it is still possible, Morel is looking for profiles with the broadest skills and will gradually look for more and more specialized profiles. Rebrain.AI must hire and manage wisely because it charges no annual fees, only performance fees. “We also hire data scientists with the understanding that a single individual cannot master the end-to-end AI production chain and that knowing what will happen in one year is very difficult.”

Lost in Translation

At early, systematic quant funds like Renaissance, the quant’s daily tasks included cleaning up securities pricing data, creating algorithms, and backtesting to identify signals that would become the foundations of investing algorithms. The final task was to devise algorithms that would transact trades at the precise moment of best price advantage. The old quant worked in a context where data was scarce, even commoditized, because everyone looked at the same securities pricing and corporate-activities data. Quants got an edge by cleaning scarce data better than the competition and by devising algorithms that made best use of the valuable (though small) cache of past prices.

Some of today’s data scientists consider this specialized focus to be esoteric. Though some still want to trade by understanding financial markets, GDP, open volume, monetary rates, and bond prices, today’s data scientist has less subject specialization and broader domain expertise.

Driving the broader scope of knowledge is the abundance of high-quality data available. No longer content with running algorithms on a closed set of pricing data, the new data scientist scours the marketplace for unique datasets that best inform the question needing to be answered. “With faster machines and an unending supply of highly granular data, quants don’t want to do the predictive models on scarce data anymore,” says Sri Krishnamurthy, CFA, chief data scientist and president of QuantUniversity.com in Boston. “Now they want it all, like the rich sentiment data [gathered from social media] and geospatial-specific data, to make sense of outputs.”

Krishnamurthy also provides data analytics services for customers through his company. In that role, he reviews and evaluates vendors. As a way to illustrate his different perspective on data, he shared a report from JP Morgan evaluating data vendors. The report states, “[RavenPack’s] new platform uses proprietary sentiment analysis technology to ‘monitor market-moving events and quickly surface insights by combining a wide variety of datasets, including stock prices, geopolitical events, news flow, social media activity, payments data, weather, apps, and data from the Internet of Things.’” JP Morgan used the service and presented the results of their tests at a RavenPack research symposium using three strategies: (1) trading equity indexes with an annualized return of 5.6% and Sharpe ratio of 0.44, uncorrelated with traditional risk factors; (2) trading developed-market currencies with a contrarian long–short portfolio, similarly uncorrelated; and (3) trading sovereign bonds, from which the use of RavenPack services generated alpha streams uncorrelated to traditional risk factors.
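For readers less familiar with the Sharpe ratio quoted above, a minimal sketch of the usual calculation follows. It assumes the common textbook definition (mean excess return divided by the standard deviation of excess returns, scaled by the square root of the number of periods per year). The methodology behind the reported figures is not described here, the sample monthly returns are made up, and the implied-volatility line further assumes a zero risk-free rate.

```python
import math
import statistics


def annualized_sharpe(periodic_returns, periods_per_year,
                      risk_free_per_period=0.0):
    """Annualized Sharpe ratio from a list of periodic returns."""
    excess = [r - risk_free_per_period for r in periodic_returns]
    mean_excess = statistics.mean(excess)
    volatility = statistics.stdev(excess)  # sample standard deviation
    return (mean_excess / volatility) * math.sqrt(periods_per_year)


# Made-up monthly returns, purely for illustration.
sample_monthly = [0.01, -0.01, 0.02, 0.0]
sample_sharpe = annualized_sharpe(sample_monthly, periods_per_year=12)

# Under the assumptions above, a 5.6% annualized return with a Sharpe
# ratio of 0.44 would imply annualized volatility of return / Sharpe.
implied_volatility = 0.056 / 0.44
print(f"sample Sharpe: {sample_sharpe:.2f}")
print(f"implied volatility: {implied_volatility:.1%}")
```

Read this way, the equity-index strategy would have earned its 5.6% with annualized volatility somewhere near 13%, a useful reminder that a headline return means little without the risk taken to earn it.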

The explosion of big data and machine-learning theory also means a parallel evolution of new ways of computing. Examples include the neural-network gradient method of identifying and optimizing model parameters to minimize errors in the estimated output, numerical partial derivatives that help data scientists understand local minima, and trained networks that classify data and then predict outcomes. “These are the directions of the next wave; deep learning and self-learning solutions incorporate cognitive computing,” says Khandekar. “This is leading to more sophisticated predictive models.”
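The gradient method and numerical partial derivatives can be sketched in a few lines. The example below is a deliberately tiny stand-in for what a deep-learning library does at scale: it fits a straight line by repeatedly nudging two parameters against numerically estimated partial derivatives of the error. The data, learning rate, iteration count, and function names are all assumptions made for illustration.

```python
def mean_squared_error(params, xs, ys):
    """Average squared gap between the line's predictions and the data."""
    slope, intercept = params
    return sum((slope * x + intercept - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)


def numerical_gradient(loss, params, step=1e-6):
    """Central-difference estimate of each partial derivative of the loss."""
    gradient = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += step
        down[i] -= step
        gradient.append((loss(up) - loss(down)) / (2 * step))
    return gradient


def gradient_descent(loss, start, learning_rate=0.05, iterations=5000):
    """Move the parameters downhill along the estimated gradient."""
    params = list(start)
    for _ in range(iterations):
        gradient = numerical_gradient(loss, params)
        params = [p - learning_rate * g for p, g in zip(params, gradient)]
    return params


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
fitted = gradient_descent(lambda p: mean_squared_error(p, xs, ys),
                          start=[0.0, 0.0])
print(f"slope = {fitted[0]:.3f}, intercept = {fitted[1]:.3f}")
```

This error surface has a single minimum, so the procedure settles on the line that generated the data. Neural networks have far more parameters and many local minima, which is why the starting point and step size matter much more in practice.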

Today’s business world is looking to data scientists for more than just predictive analytics. MGI’s recent study advises employers that for every data scientist they need to hire, even more “translators” will be needed to connect what data reveals to real-world business problems. (In fact, the report estimates a demand for two to four million such translators in the US alone over the next decade.) The data scientist is called on not only to analyze data but to create outputs in a format that supports decisions. This task crosses into the domain of business analytics. Visualizing the output in a single dashboard and including descriptors that solve such operational problems as network inclusion, security, and marketing questions will be of key importance.

"You'll Need to Learn How to Program"

Data scientists with more than about five years’ experience have traveled a self-defined path in this brand-new field. The stops on the path may have been similar: an undergrad or graduate degree in computer science, frequently an MBA or master’s degree in finance, perhaps some courses in natural language processing and big-data languages, and/or some work experience using data to answer business questions. But they typically have had no formal training in data science. Now, however, the demand for data scientists has professionals scrambling for new courses and designations. Online courses, accredited colleges, and brand-new schools and certification programs are springing up to meet the demand.

Also, in a complete reversal from the early quants, data scientists are adding the CFA charter to their credentials to claim domain expertise in investing and markets. And the CFA Program curriculum has been expanding to include content and case studies for fintech applications. For investment firms (and especially fintech firms), Krishnamurthy, Khandekar, and McAuley all contend that the CFA Program provides a unique background for translating business and investment-market contexts.

In its report “The Age of Analytics,” MGI notes that in answer to employers’ demands, degrees in data science and analytics grew by 7.5% between 2010 and 2015. Several top universities, including NYU, Carnegie Mellon, and Columbia, now offer master’s degrees in data science. Despite this increase in the number of programs (now in the hundreds), business leaders surveyed by MGI said finding and retaining analytics talent was far more difficult than in other areas.

Necessity being the mother of invention, savvy entrepreneurs have been moving in to fill the void. For Krishnamurthy, the task of educating clients with whom he consulted became his major business offering. During five years as a senior consultant with MATLAB, he worked extensively for financial institutions and banks. In this role, he repeatedly saw the need for—and heard clients request—big solutions to big data. “I saw another solution was to teach them how to fish,” he says.

The result is QuantUniversity, where Krishnamurthy is chief data scientist and president. QuantUniversity offers “customized training courses and workshops for executives to enable them with the tools needed to handle data and implementation challenges when designing and developing quantitative solutions.” He further trains students and executives as a data science professor at Northeastern University, which offers a certificate in data science, and has earned the certified analytics professional (CAP) designation from INFORMS, the leading international association for operations research and analytics.

Gaining skills or even awareness of the new field of data science is important for everyone. MGI warns that even if a job title remains the same, the skills needed for all jobs will soon be dramatically changed. In a December 2016 article for Education Week, Benjamin Herold predicts that as “data-driven automation yields new advances in machines’ ability to process natural language, recognize patterns, and even sense human emotion, everyone from administrative assistants to lawyers to industrial engineers will see core aspects of their daily work evolve or disappear.”

Facebook, a major employer of new data scientists, readily hands out advice on how to prepare for such a job. In a December 2016 blog post, the company highlighted the average person’s daily use of artificial intelligence on the Facebook platform and teed up a series of videos on AI that it is producing. It warns students and professionals alike to prepare. “Take all the math class[es] you can possibly take, including Calc I, Calc II, Calc III, Linear Algebra, Probability, and Statistics. Computer science, too, is essential; you’ll need to learn how to program. Engineering, economics, and neuroscience are also helpful. You may also want to consider some areas of philosophy, such as epistemology, which is the study of what is knowledge, what is a scientific theory, and what does it mean to learn.”

With the potential existential threat to traditional finance that technology poses, data-science education also enhances job security and supports higher salary demands through business knowledge and data expertise across verticals. Keeping up with knowledge across sectors means data scientists can satisfy demand from such disparate fields as health care, retail, and banking, and they can even help solve such social problems as poverty, climate change, and the challenges of urban living.

Even though programs are expanding now, many working in the field of data science today are self-taught for some of the skills. Khandekar’s background spans computers and finance. He started as a computer engineer coding and programming in the early 1990s. Next, he gained an MBA in finance. As part of the MBA, he had to choose to follow a technical leg or marketing or general management. “I could have kept with IT or could have gone the business strategy route with a firm like Cap Gemini,” he says. “Once I got into the case studies in corporate finance I realized I enjoyed the application of the numbers and logic to the business side.”

His subsequent job in corporate management collapsed in the tech crash of the early 2000s, and the bursting bubble led him to pursue the CFA charter on a whim. As he was working in finance and coding strategic projects, he got deeper and deeper into business analytics. His work was driven by systems and business model forecasting, and he was defining and tracking key performance indicators. With the CFA charter, an MBA, coding experience and education, and experience working in the banking space with statistics-based programs, the analytics bug bit him. “I finished the CFA Program but had no interest in traditional portfolio management,” he says. “In fact, I was doing work in the banking space and working with a statistics-based program to contextualize a lot of the business issues.”

Krishnamurthy traveled a similar path. With a BS in mechanical engineering and master’s degrees in both computer science and computer systems, he pursued an MBA. During the MBA program, he joined a team from Babson College that won the first CFA Institute Investment Research Challenge in 2007. “I was a quant first, trained in computer science,” he says. “We were applying financial models to investing. We analyzed a proposed divestiture for ADI and won because the judges hadn’t seen anything like it.” Krishnamurthy says he was lucky to work on his team with a fundamental analyst as the team analyzed drivers of the company: “I still look at the CFA Program as the basis for rapidly understanding financial markets and underlying relationships.”

McAuley earned a BA in international finance and later an MBA from Wharton. For his undergraduate senior thesis, he and a partner analyzed the leveraged ETF, a fairly new financial product at the time. Few people understood how to use the new vehicle, and even fewer understood the risks. Through analysis and modeling, the pair learned that the relationship between the underlying assets and the leveraged ETF broke down over time because of the volatility of the underlying assets.
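The arithmetic behind that breakdown is easy to reproduce. The sketch below assumes an idealized fund that delivers exactly twice the underlying asset’s return each period, with no fees or financing costs, and an underlying that swings up 10% and back down to where it started. These numbers are hypothetical and are not taken from McAuley’s thesis.

```python
def compound(returns):
    """Cumulative return from compounding a sequence of periodic returns."""
    value = 1.0
    for r in returns:
        value *= 1.0 + r
    return value - 1.0


def leveraged(returns, leverage):
    """Per-period returns of an idealized fund that resets leverage each period."""
    return [leverage * r for r in returns]


# Underlying rises 10%, then falls by just enough to return to its start,
# and repeats that round trip ten times.
up = 0.10
down = 1.0 / (1.0 + up) - 1.0  # about -9.09%
underlying_returns = [up, down] * 10

underlying_total = compound(underlying_returns)
etf_total = compound(leveraged(underlying_returns, leverage=2.0))

print(f"underlying: {underlying_total:+.2%}")
print(f"2x fund:    {etf_total:+.2%}")
```

The underlying ends flat, yet the idealized 2x fund loses roughly 17% over the same stretch, because each period’s doubled loss is applied to a larger base than the doubled gain that follows it. The more volatile the underlying, the faster the two drift apart.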

To accomplish the research, McAuley taught himself statistics, math, and programming. Out of school, he convinced Verus Analytics (founded by serial fintech entrepreneur Dr. Carr Bettis) to hire him without a computer science degree or PhD. “While I was the only person without a PhD, I was not the only person without the CFA charter,” says McAuley. “I learned to write algorithms, but I couldn’t have done that without knowing how to pore through financial statements. The CFA Program was directly pertinent to being able to apply the technical skills at the job.”

Now, as director of data science at Wealthfront, McAuley looks for one skill above all others: a potential hire’s desire to learn on their own. He sees prospects from business school who want to make the transition into an analytics role. They’ve taught themselves how to automate, or maybe they’ve written a script to enter data. They’re spending time with online courses in basic machine learning or R or Python. With these skills, according to McAuley, a job candidate is 80% to 90% of the way to being qualified. When he sees CFA charterholders applying to work as one of his four team members, he knows they’ve satisfied two requirements: a passion for finance and a mastery of basic statistics. “They’ve also worked a few years and have spent three years of intense self-study,” says McAuley. “They’ve suffered through the program and that proves they’re going to keep learning.”

For those who want to go into data science, finance and fintech are relatively untapped areas, according to McAuley. And though he gets 30 applicants for each job he posts, he realizes a career in finance isn’t as sexy as working on better-known consumer applications, at least at first glance. But with developing fintech comes greater opportunity.

“Finance is one of the last industries to be impacted,” McAuley says. “At Facebook, you might be part of a team of hundreds and hundreds stuck on optimizing a search function bar. In finance, even as an intern in data science today, you will be working on projects that change the direction of not just your business but of the industry.

“It might be riskier,” he adds, “but with risk comes reward.”