AIG: Data Science in the Insurance Industry and Financial Services (CXOTalk #259)

I’m a
data scientist by training. I used to run data science monetization at
Foursquare. Before that I was the first data scientist
in residence at Andreessen Horowitz. The company I now run is The Data Incubator,
which I founded. It’s a Cornell Tech-backed data science education
company. We have offices in New York, D.C., and S.F. We do data science training for corporates
who are looking to have a customized, in-house data science training offering. We also help companies with their data science
hiring. My role at AIG, Michael, was about fundamentally
helping reshape the role of human and machine intelligence and decision-making across sales,
underwriting, and claims. In that capacity, I had the privilege of building
what is among the first few C-suite data science functions at a mature, large firm spanning
the globe. Here at BCG, at the Boston Consulting Group,
I’m working in particular with the insurance practice, but generally across industries
with the specific intent of building some IP in the space of analytics as a service. If you think about the traditional construct
of many of the high-end strategy consulting firms, they tend to be much more oriented
toward defined engagements that are time bound and more people intensive. My aspiration in joining BCG is to help them
develop some intellectual property through analytics as a service. I think we need to start this conversation
with some background about the insurance industry to give us context around how data science
is used. Murli, share with us about the insurance industry,
and what do we need to know in the context of data science? Certainly. The core challenge for the insurance sector
is similar to some of financial services. In insurance you’re trying to predict your
cost of goods sold at the point of sale. Getting that right is absolutely critical
in your ability to achieve margins down the road. Anything and everything that you can do to
understand that at its core will give you a significant competitive advantage. Now if you zoom out from that problem statement,
in general there are many similarities between insurance and other industries around the role
of data science and machine learning in augmenting human intelligence and making better decisions–more
structured, granular, sophisticated, consistent decisions–in sales and marketing, as well
as in pricing, underwriting, and in claims, which is a significant part of the fulfillment
of the promise that insurance carriers make to their customers. What we call data science today is really
part of a long history of the application of mathematics and computing to industry. When I joined the industry, and I started out in finance on Wall Street, back then we used to call these jobs quant roles. You would figure out how to trade in capital
markets, make predictions about which way the stock price would move. I think what we’ve seen is that the tools
and the technologies that we used there were then really adopted in Silicon Valley, really
turbocharged and, frankly, made much more usable. Then the cost of computing made it so that
you could apply this not just to a few select problems on Wall Street, but all over Main Street, all over the rest of the financial services industry. Really, if we zoom out, as Michael was just
describing, can you talk about some of the similarities between data science in the insurance
industry and other non-insurance data science applications as well, since it seems there
are a lot of commonalities there? Most certainly. The first big dissimilarity, so to speak,
when comparing insurance to other sectors is that the role of the actuarial profession
dates back to the early days when insurance was actually created as a sector. The role of analytics in insurance has largely
been driven by the actuarial function, which brings a certain set of nuanced competencies
and capabilities that are relevant to insurance. The challenge has been that if you were to
think about the broader role that data science could play in particular in the world that
we live in today in insurance, you can actually fundamentally reshape human judgment when
it comes to sales, when it comes to underwriting judgment, and even when it comes to claims
through the lens of data and technology in ways that might not have been feasible 10,
12, 15 years ago. The similarity lies in the fact that, much
like many other sectors, in insurance you’ve got a sales or distribution channel. You’ve got a product channel that is around
pricing the product. Some of that is around your cost of goods
sold, and some of that is trying to understand the market’s appetite and the customers’ demands,
so to speak, or demand elasticity, if you would. Last, but not the least, you’ve got the fulfillment
of that promise that you’ve made that is very, very data rich, so if you break down that
value chain to its core elements, there are similarities to other sectors. Now the difference could be that if you think
about healthcare, for instance, healthcare is much more of a transaction-data-rich industry
perhaps compared to insurance because you’re engaging with the customers on a very consistent
basis, just as you are in financial services, in banking, and credit cards and such. The difference perhaps between insurance and
these other sectors is, while certainly getting your cost of goods sold right early on is
absolutely critical, you’re not necessarily as data rich, as transaction data rich, as
some other sectors are. Right, but you see this with retail. You see this through the smart phone, and
we were doing a lot of that when I was at Foursquare trying to make that retail brick
and mortar experience a bit more digital through your smart phone. You see this all over the place. I think that’s going to be a major driver of a lot of the consumer electronics that you’re going to see coming up: the need for companies to have data is going to drive a lot of those interactions onto smart phones, tablets, [and]
wearables. To build on what you just said, Michael, if
you were to contextualize that to insurance, where I see the big leap in innovation happening
in the next two to three years is around this notion of making much more granular, real
time decisions on the basis of machine learning and by really defining data not just in the
traditional internal structured terms, but thinking of it in four quadrants: internal
and external on one dimension, and structured and unstructured on the other dimension.
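To make that four-quadrant framing concrete, here is a minimal Python sketch of how an insurer might tag its data sources along those two dimensions; the source names and their placements are illustrative assumptions, not any company's actual data inventory.

```python
# Hypothetical sketch: tagging data sources along the two dimensions
# described above (internal/external x structured/unstructured).
# Source names are illustrative assumptions, not a real inventory.
from collections import defaultdict

data_sources = [
    # (name, origin, form)
    ("policy_admin_records",  "internal", "structured"),
    ("claims_adjuster_notes", "internal", "unstructured"),
    ("weather_station_feeds", "external", "structured"),
    ("social_media_mentions", "external", "unstructured"),
]

quadrants = defaultdict(list)
for name, origin, form in data_sources:
    quadrants[(origin, form)].append(name)

for (origin, form), names in sorted(quadrants.items()):
    print(f"{origin:9s} / {form:12s}: {', '.join(names)}")
```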
The ability to build machine learning algorithms on some of these platforms will reshape what humans do in terms of decision-making and
judgment and where models harmonize or balance human judgment with machine intelligence. The way I would frame it is oftentimes people
think of it as an either/or. But if you were to reframe machine intelligence
as nothing but the collective experience of the institution manifested through some data,
what it does is bring more consistency and granularity to decision-making. That’s not to say that it would obviate the
role of human judgment completely, but it is to say that that balance, that harmony
should and will look dramatically different two years, three years from now than it has
for the last decade and before that. The next big step-change that I see for this
sector as a whole is evolving from a predictor of risk to an actual risk partner that can
actually mitigate outcomes through the power of real time insights. The most obvious example of that is the role
that sensors can play in providing real-time feedback to drivers of vehicles in a way that
hopefully reduces risky driving and mitigates the likelihood of accidents. To me that is the true power of data science
in insurance. The beauty of that is not only does it mitigate
accidents from happening, or adverse events from happening, but what it does in doing
so is reduce the cost of insurance and expand the reach of insurance to a much broader population,
both in the developed and developing world. To me, that’s a beautiful thing if you think
about society having a much higher level of financial protection across every aspect of
our lives. If we think about what’s new in data science,
that is, why is data science different from or how does data science expand upon things
like the actuarial tradition, like statisticians, the quants of [indiscernible, 00:09:27], I
think it really does kind of come down to this idea that, one, we’re using not just
structured data, so it’s not just SQL queries anymore, but it’s semi-structured and unstructured
data. How do you start handling things when they
don’t come in nice tables that you can load into Excel or that you can put into SQL? We are also in a world where data is much
larger. You mentioned telematics. If you were taking a reading off of every
car every second, that’s a lot of numbers you’ve got to store, and that’s a very different
paradigm for computation. You start having to think about, how do you
store this data? How do you deal with data now that it’s stored
across multiple computers? How do you think about computation in that
context?
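As a rough back-of-envelope illustration of why per-second telematics readings change the storage paradigm, here is a small Python calculation; the fleet size, number of fields, and bytes per field are assumptions chosen purely for illustration.

```python
# Back-of-envelope estimate of raw telematics data volume.
# All input figures are illustrative assumptions.
vehicles = 1_000_000          # insured cars reporting
readings_per_day = 24 * 3600  # one reading per second
fields = 10                   # e.g., GPS, speed, acceleration axes
bytes_per_field = 8           # 64-bit values

daily_bytes = vehicles * readings_per_day * fields * bytes_per_field
print(f"~{daily_bytes / 1e12:.1f} TB of raw readings per day")
# ~6.9 TB per day before compression -- far beyond a single-machine,
# load-it-into-Excel workflow, which is why storage and computation
# have to be spread across many machines.
```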
Then of course the last thing is always this idea around real-time data. I think that analytics has historically been–you
might call it–kind of a batch process. Run it once; generate a report; show it to
people; you’re done. Now it’s a continuous process. You run it; you have to instantly find the
latest trends; put that into production so that you can adapt to that in an intelligent
way; and then do that again the next hour, the next minute.
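A minimal sketch of that shift from a one-off batch run to a recurring retrain-and-deploy loop follows; the function names passed in (load_latest_data, train_model, deploy_model) and the hourly cadence are hypothetical stand-ins for whatever pipeline a team actually runs.

```python
# Illustrative contrast between a one-off batch job and a recurring
# retrain-and-deploy loop. The callables are hypothetical stand-ins.
import time

def batch_report(load_data, train_model):
    """Old style: run once, produce a report, done."""
    model = train_model(load_data())
    return model.summary()

def continuous_loop(load_latest_data, train_model, deploy_model,
                    interval_seconds=3600):
    """New style: refit on the latest data, push to production,
    then do it again the next hour (or minute)."""
    while True:
        model = train_model(load_latest_data())
        deploy_model(model)           # serving layer picks up the new model
        time.sleep(interval_seconds)  # wait for the next cycle
```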
That’s kind of where competition is driving you. If you look at what Silicon Valley has been
doing, it is very much your server is constantly learning from user behavior and then able
to adjust how it interacts with users in a way that–to borrow their expression–delights
the user. I think that we’re seeing that. Traditional companies, that is non-tech-based
companies, are having to emulate that level of customer service and satisfaction. I think a lot of that comes down to big data
and being able to have a team that’s capable of understanding how to manipulate this new
type of data faster, more data, different kinds of data in a world that’s rapidly evolving. That’s right, Michael. If you think about the historic definition
of transactional data in healthcare and banking, we know that that’s been at the core of how
they think about analytics for quite a while now. Traditionally, most of insurance has not had
that version. But if you were to zoom out and define data
in a much broader sense that includes images, that includes audio, that includes all sorts
of unstructured data, now insurance, layered on top with IoT and such, has its own version of transactional
data. The ability to harness that and dramatically
change the cycle time of decision-making, as well as the granularity of decision-making,
is where the goldmine is for insurance in the coming five years or so. I was going to say, no, that’s right. Just to give you one example, we work with
a large consumer bank, both on their training side and to help them hire their data science
talent. One of the really cool applications they’ve
been able to develop is around merging data that they get from multiple channels, so from
the Web, from their mobile, from tablets, from even in-store visits and phone calls
to customer service. Right? [They can] bring all that data together so
that their customer service representatives can see that in one clear, simple visualization.
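A hedged sketch of what that multi-channel merge might look like with pandas follows; the channel names, column names, and customer_id key are assumptions made for illustration, not the bank's actual schema.

```python
# Illustrative merge of customer interactions from several channels
# into one view for a service rep. Schemas are assumed, not real.
import pandas as pd

web = pd.DataFrame({"customer_id": [1], "last_page": ["open-checking-account"],
                    "ts": ["2017-10-01 09:12"]})
mobile = pd.DataFrame({"customer_id": [1], "last_screen": ["account-setup-error"],
                       "ts": ["2017-10-01 09:20"]})
calls = pd.DataFrame({"customer_id": [1], "last_call_topic": ["checking account"],
                      "ts": ["2017-09-30 16:05"]})

# Outer-join the channels on the customer key so the rep sees one row
# per customer combining the latest activity from each channel.
view = (web.merge(mobile, on="customer_id", how="outer", suffixes=("_web", "_mobile"))
           .merge(calls, on="customer_id", how="outer"))
print(view.to_string(index=False))
```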
When a customer calls in, they instantly get this information, and they know that the customer has been having trouble opening a
checking account, [for example]. [Then] they can directly target the question
that the customer would like, and then solve that problem for them. That’s even to the point where, if the customer
has been browsing around trying to get the answer to a question, the answer might actually
just be populated from their knowledge base straight onto their screen so that they don’t
have to have this awkward process of asking the customer the question, then slowly
searching for it for themselves, but the customer service rep is able just to see that and answer
the question right away. That creates a much more pleasant customer
experience. It certainly makes the customer service reps
at the bank seem far more knowledgeable. I think that that’s just one example of how
you can have so much more data, where none of what I just talked about was the traditional data of transactions and moving money in and out of your bank account. This is all a new type of data, from new
sources that we’re talking about. Murli, we have an interesting comment from
Arsalan Khan on Twitter who is asking about the question of bias because, of course, sensors
have no bias; but when creating models and selecting data, people do. How do we deal with that issue? All models are incorrect; some are useful;
and so the question to me is not about whether the model is perfection personified, but rather,
how much of an improvement is it relative to the status quo? What I’ve seen very consistently in the sectors
that I’ve been exposed to in my career is that if you were to use that litmus test,
models are vastly superior to the judgment-oriented decision-making that occurs certainly in a
good chunk of the insurance value chain today, but also in other sectors too. What we’ve got to teach ourselves is to not
be naïve to the data gods and assume that the models are perfection personified, to
understand where they’re prone to bias or error, but also to realize that if we were
to hold human judgment to the same standards of bias and objectivity that we’d like to
hold models to, it would not be a competition at any level whatsoever. Mm-hmm. The question, to me, isn’t whether the model
is biased or not, because models do have an inherent bias; they tend to be biased by historical experiences. The question is rather where and when and to what extent they are biased. The even more important question is, how much
of a step change are they from the caliber of decision-making that is the current status quo in that particular part of the value chain? We often end up wanting to criticize models
for their imperfections in a way that we may not hold up the same scrutiny to, say, human
judgments. There was a famous article that came out
in the last few months about how neural nets that were being trained on large corpuses
of human text were inherently sexist. They would say things; they would have gender
associations with, for example, occupations, or perhaps certain derogatory phrases that
we might really cringe at. I think that that comes down to the models
are being a mirror, right? They are holding up to society the data that
we’re feeding into them. They’re showing it back to us. That’s not just an idle, philosophical point. I think it’s a very real point that, for example,
in an industry that’s highly regulated like finance, there are lots of laws around equal
opportunity for lending. You can’t make judgments that disproportionately
negatively impact people of a certain race, sex, or various other protected categories. I think that that’s one of those places where
we have to be careful as we train these models that they haven’t then picked up some of the
biases that may be inherent in society that we don’t want to keep.
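One simple, widely used screen for that kind of unwanted bias is to compare a model's approval rates across protected groups, as in the minimal sketch below; the toy decisions and the 0.8 rule-of-thumb threshold are illustrative assumptions, not a legal test for any particular jurisdiction.

```python
# Minimal sketch of a disparate-impact check on model decisions.
# The data and the 0.8 threshold are illustrative.
import pandas as pd

decisions = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [1,   1,   1,   0,   1,   0,   0,   0],
})

rates = decisions.groupby("group")["approved"].mean()
ratio = rates.min() / rates.max()   # disparate-impact ratio
print(rates)
print(f"disparate-impact ratio: {ratio:.2f}")
if ratio < 0.8:  # common rule-of-thumb screen, not a legal standard
    print("Flag for review: approval rates differ substantially across groups.")
```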
I think that that’s part of the reason why I don’t really buy this whole scaremongering that these computers will take over all of
our jobs because, in the end, there is this human judgment that comes in where we say,
“Well, okay. That model probably did learn something about
our society, and we would rather that not exist, and so we’re going to tweak that a
little bit and find ways to mitigate those effects.” Murli, this issue of bias, when you’re inside
an organization like an insurance company, what are the practical implications of that
and how do you ensure that the model is as fair as possible and that it doesn’t embody
preconceived bias? Even recognizing what you were just saying
that the question is how it compares to what the human would do, it’s still something that
I’m sure you have to be concerned about. Most certainly. I think the imperative is to go into this
process and effort with eyes wide open. I’ll give you a classic example of what you
just described. Oftentimes historically, whether it’s in insurance
or even in financial services, healthcare, credit cards, and such, the ability to detect,
sniff out fraudulent activity has typically depended on human judgment. That’s not to say that human judgment isn’t
valuable. It’s extremely valuable. However, when you then try to augment that
human intelligence with machine intelligence, effectively what you are doing is actually
propagating a little bit of the historic bias that the human judgment has had because you’re
using historic data to be able to predict future fraud based on human judgment in the
past. The way to break that cycle is two-fold. Frankly, when you do use algorithms, it’ll
tease out noise in human judgment. The beauty of noise in human judgment is that
it truly is noise, and it’s very inconsistent. Models have that ability to overcome that
inconsistency in judgment in the past, which is actually a very good thing. The second thing that I would do in that particular
instance, and there are many other analogies to this, is also to be very purposeful in creating
a particular random sample that stretches the range of the predictions of some of these
models. That allows you to go to the periphery and
assess how well the models are or are not working based on their predictions so that
you create a virtuous cycle where you’re actually challenging the assumptions of your models. You’ve got some human beings at the other
end of that spectrum coming up with their own judgment and throwing out the gauntlet
to create a feedback loop that will allow the models to get better because they’re invariably
going to miss insights that human beings might have.
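A minimal sketch of what a random sample that "stretches the range of the predictions" could mean in practice: stratify cases by model score and send a fixed number from every stratum, including the periphery, to human reviewers. The synthetic scores, bin count, and sample sizes below are arbitrary illustrations.

```python
# Illustrative stratified sample across the full range of model scores,
# so human reviewers also see the periphery, not just borderline cases.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
cases = pd.DataFrame({"case_id": range(10_000),
                      "fraud_score": rng.beta(2, 8, size=10_000)})

# Stratify by score decile and draw a fixed number from each stratum.
cases["stratum"] = pd.qcut(cases["fraud_score"], q=10, labels=False)
review_sample = cases.groupby("stratum").sample(n=20, random_state=0)

print(review_sample["stratum"].value_counts().sort_index())  # 20 per decile
```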
How you choose your sample size, or your sample, is really, really important for this. Just to cite an example from my old field
of being a quant, I think we all know about the 2008 financial crisis and how, if you
train a bunch of models on the bull years, those models may not apply very well in a
situation that’s a bear year. You have to be very careful to realize that
just because the last 5, 10, 15 years of macroeconomic data have been bullish doesn’t mean that next
year’s will be. You have to really think about: How do I stress
test my model? How do I give it examples that may be things
that I expect will happen even if I haven’t quite seen them in my data or in the data
I’ve been training on? How do you select that data set correctly
so that you do find representative elements so that your data set isn’t horribly biased
and, therefore, giving you a badly biased model?
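A hedged sketch of the kind of stress test being described: fit only on "bull" years, then check the error on a held-out "bear" period rather than on a random split. The synthetic data, the regime labels, and the choice of a linear model are all assumptions made for illustration.

```python
# Illustrative regime-based stress test: fit on bull-market years,
# evaluate on a bear-market year the model never saw. Data is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)

def make_year(n, drift):
    x = rng.normal(size=(n, 3))
    y = x @ np.array([0.5, -0.2, 0.1]) + drift + rng.normal(scale=0.1, size=n)
    return x, y

bull_x, bull_y = make_year(2000, drift=0.3)   # training regime
bear_x, bear_y = make_year(500, drift=-0.8)   # stress regime

model = LinearRegression().fit(bull_x, bull_y)
print("error on the bull training data:",
      mean_absolute_error(bull_y, model.predict(bull_x)))
print("error under bear-year stress:",
      mean_absolute_error(bear_y, model.predict(bear_x)))
# The gap between the two errors is the warning sign that the training
# window did not represent the environment you may face next year.
```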
That’s right, Michael. I think the secret that you and I know quite well that perhaps is not more widely understood is that there’s quite a bit of art in the
science of data science. Absolutely. That art is absolutely critical because the
biggest risk is one of the blind leading the blind and the data scientists not really appreciating
the context of the historic data and not having a basis on which they could test the efficacy
of the predictions in a different environment than the one that the model was actually built
on. A classic example of that, in addition to
the financial crisis in 2008, is the O-ring debacle with the Challenger space shuttle where they didn’t actually test the effectiveness of the O-rings in a different temperature
setting, and they ended up extrapolating. Yes. I think that’s really where the magic of human
judgment and machine intelligence actually comes in. As important as the science of the data science
is, the art of the data science is perhaps equally and sometimes even more critical depending
on the consequences of your errors in prediction, whether they’re false positives or false negatives. That’s really where understanding the context;
making sure that you’re asking the right questions and framing them appropriately; and understanding
what data you have, what you don’t have, and how that could bias your understanding of
the future; is absolutely critical. Data science is very technical, lots of math,
lots of programming. It’s really far down the rabbit hole of technical
stuff that you could be doing. But as a manager, it’s still very important
to understand it. And so, when we are talking to managers and
we’re training managers on how to do this, one of the things we have to really focus
on is, “Well, you may not be able to understand the probabilities and the nuances perfectly,
but you can understand what happens if you have a false positive or false negative; which
way you’re more willing to make a mistake,” right? Then use that to set your thresholds and your
comfort level about, “Okay, I’d rather have more of one type or the other.” That even goes down to training, right? You can train models that are focused more
on one type of error than another.
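A minimal sketch of how that business choice shows up in code: the same model scores can be cut at different thresholds depending on which mistake is more costly. The scores, labels, and threshold values below are toy illustrations.

```python
# Illustrative tradeoff between false positives and false negatives:
# same model scores, different decision thresholds. Values are toy data.
import numpy as np

scores = np.array([0.05, 0.20, 0.35, 0.55, 0.70, 0.90])  # model outputs
actual = np.array([0,    0,    1,    0,    1,    1])      # ground truth

def errors(threshold):
    predicted = (scores >= threshold).astype(int)
    false_pos = int(((predicted == 1) & (actual == 0)).sum())
    false_neg = int(((predicted == 0) & (actual == 1)).sum())
    return false_pos, false_neg

for threshold in (0.3, 0.6):
    fp, fn = errors(threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
# A lower threshold catches more true cases but raises false positives;
# which mistake you would rather make is the business call, not a math one.
```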
The ultimate call about which way you want to go, that’s a business decision, and that’s why it’s such an important lesson for businesspeople
to know about this distinction between false positives and false negatives. That’s right, Michael, which is why I get
very excited about the notion of really pairing data scientists with economists or business
analysts who can really shape the context of how those models are built because you’ve
got issues around stability; you’ve got issues around the standard deviation of some of your
predictions and noise in your predictions perhaps being higher in certain segments;
you’ve got issues around tradeoffs that you make between false positives and false negatives. Depending on the context in which you operate,
you value that dramatically differently. If you do value that dramatically differently,
that difference should actually be reflected in the art of the choices that you make in
practicing data science. The art of understanding those priors, those
assumptions that I’m making, how does that impact the data science; how do the results that come out of these models then impact my business decision-making, my P&L? I think that that kind of high-level understanding
is so important on the business side. We’re talking about the business side. What about the organizational issues? How do you introduce this type of thinking
into an organization for which data science is relatively new? I’ll start with my perspective, Michael, and
I’d be delighted for you to jump in as well, please. I don’t think there are any obvious, easy answers. The challenge that I see today in most mature
firms is you’ve got C-suite leaders that say, “I’m doing data science,” or, “I’m doing data,”
or, “I’m doing digital.” “There’s my chief data officer. There’s my chief digital officer. There’s my chief analytics officer.” The way I would reframe that is you help them
fundamentally recognize that this is not just a separate pillar that you should be thinking
of as being incremental to how you will shape your business strategy. These competencies are, in the very near future or, in fact, even in the here and now, in effect the mitochondria that will shape the energy and the life that your firm will have in terms of its sustainability in a world of data- and tech-driven disruption. The challenge then is that typically in many
of these large institutions, you’ve got leaders who have risen to those senior positions on
the basis of historic experiences, which are less relevant if you extrapolate them to the
future. And so it really does become an issue around
having the humility to develop much more of a learning mindset; and recognizing that the
more ambitious you are in terms of really re-sculpting and reshaping your competitive
positioning, the more you have to be willing to break glass based on the insights that
you achieved through data science. It is very much a leadership, courage-of-conviction issue. It’s very much about CXOs having a view on
what legacy do they want to build in that organization, what is their courage of conviction,
and how do they shape the problems and questions that they would like to tackle through the
competencies in a way that will fundamentally shape their sustenance in the medium and longer
term? You need a broad swath of the organization
to understand the value of data, how you use data–think about some of the issues that
Murli and I were just talking about earlier–and to really embrace taking time to have their employees
learn about data science and big data. On the cultural side, actually, I’d be curious,
Murli, to ask you this question. I think one of the things that’s maybe unique
about insurance or banking is that there is kind of a legacy of data around the actuarials,
around the statisticians. How does that change the dynamic of creating
a data culture when you have a legacy group that’s somewhat already steeped in this? I think there are two parts to that, Michael. One is, how does that change decision-making
today, and how should that change decision-making tomorrow? If one were to zoom out, in general I think
the actuarial function, the profession, and the exams have not embraced, from my point
of view, the power of data science in its totality the way perhaps they should. Maybe they will, looking into the coming few
years. As you move to that, or if we move to that,
I think it obviates the need for having rigid titles such as an actuary or a data scientist. It infuses, into the fundamental DNA of the
organization, a sense of curiosity and a comfort with challenging one’s own assumptions, and
the ability to consistently ask the question, “What do the data tell us, and where could
the data possibly mislead us?” even if the models seem perfection personified. That’s one piece of it, I think. The other piece of it is, if you disaggregate
the entire value chain of insurance, there’s data science that can be applied to many,
many, many aspects of it that can fundamentally shape the sophistication, timeliness, [and]
granularity of decision-making in ways that the industry could not have imagined a decade
ago. To me, the role of data science is very, very
widespread, even if one were to dodge the traditional domain of the actuarial sciences. Where I’m hoping the industry is going to
head toward is, rather than have this mindset of creating rigid silos or pillars, see that
the competencies are interchangeable and they’re one and the same. Let’s actually move to a world where we’re
challenging; we understand our assumptions and are challenging those assumptions to shape
the caliber, effectiveness, and efficiency of decision-making as opposed to hanging our
hats on what titles we’ve got, what professional credentials we’ve got, or what academic experiences
we have because those are an interesting starting point, but are really not particularly relevant
in a world where everything around us is changing at a more profound pace than ever before. With the actuarials, I think that a lot of
the really farsighted ones, the ones who are really looking to the future, seem to really
understand this and are embracing a lot of these new techniques around data science,
around big data, really looking to challenge the assumptions that maybe their own discipline
has engrained into them through indoctrination. [They’re] really leveraging the existing knowledge
that they have, this really strong knowledge of probability and statistics, and then seeing
how they can apply that to the data science, which of course is very rich in probability
and stats. Much of this falls into the general category
of helping change an organization. Are there aspects of this that are specific
to the data science as opposed to general change issues? The way I would frame that is, Michael, data
science is the engine that is fundamentally reshaping practically every industry that
we know about. The pace at which the aggregation of data
is changing and the definition of data itself is so rapid today that it necessitates this
discussion about the pace at which firms need to fundamentally question their paradigms
on how they’ve made decisions historically. Yes, you’re right. It’s a broader sort of change management issue. If the question is, “Why is this specific
to data science and why isn’t it broader?” the answer is it is broader, but the force
of change that data and technology are imposing upon our society across sectors makes this
issue much, much more critical in the here and now than perhaps other forces of change
might. That pace of change is only going to accelerate. If you think about a lot of the secular trends
that are pushing data into the forefront of this conversation, those things are not going
away: the falling costs of storage, the falling costs of being able to transmit data, the
increasing speed of CPUs. Then, on the social side, there’s the greater demand by consumers to have instant responses, to be on their phones interacting with their friends as well as with companies in digital ways. I think that trend is not going away. It’s only accelerating. That’s going to be forcing companies to move
more and more in the direction of data. As we finish up, I’ll ask each one of you. Maybe I’ll start with Murli. What is your advice for organizations that
want to adopt data science in a larger way? What should they do? Number one is, develop a sense of humility
about yourselves and evolve from what Carol Dweck would describe as a fixed mindset to
a growth mindset; i.e., please recognize that the future is not an extrapolation of the past, or at least certainly not a linear extrapolation of the past. The fundamental foundations on which you’ve
made decisions and built your businesses are shifting today at a faster pace than ever
before. That requires you to develop, as an organization,
as individual professionals, that mental agility to question your assumptions and to challenge
your traditional paradigms in which you run your functions or businesses. The first critical aspect of that is to develop
that curiosity and ask questions around the art of the possible by drawing from learnings
that you have around innovation across sectors and across fields. It kind of comes down to two basic first steps. The first step: get the data, collect it,
[and] store it, what have you. The second step is to find the talent that’s necessary
to deal with the data, manipulate the data, and be able to come up with actionable insights
from that data. If you can do both of those things, then I
think you will be at least taking the first few steps in the direction of building a data-driven culture. Okay. Well, thank you so much. This has been a very interesting show. We’ve been talking about the ins and outs
of data science. I want to thank our guests, Michael Li, who
is the CEO of The Data Incubator; and Murli Buluswar, who is the former chief science
officer at AIG and currently is working with Boston Consulting Group. Thanks so much, everybody. You’ve been watching Episode #259 of CXOTalk. We will see you next time. Bye-bye.
