Gianthomas Volpe & Bertrand Cariou | DataWorks Summit Europe 2017

(upbeat music)>>Announcer: Live from Munich, Germany, it’s the Cube covering DataWorks Summit Europe, 2017. Brought to you by Hortonworks.>>Hey, welcome back everyone. We’re here live in Munich, Germany, at the DataWorks 2017 Summit. I’m John Furrier, my co-host
Dave Vellante with the Cube, and our next two guests
are Gianthomas Volpe, head of customer development,
EMEA, for Alation. Welcome to the Cube. And we have Bertrand Cariou, who’s the director of solution and partner marketing at Trifacta. Guys, welcome to the Cube.>>Thank you.
>>Thank you for having us.>>Big fans of both your
start-ups and growing. You guys are doing great. We had your CEO on at our Big Data SV, Joe Hellerstein; he talked about data wrangling, all the cool stuff that’s going on, and Alation, we know Stephanie
has been on many times, but you guys are start ups
that are doing very well and growing in this ecosystem, and, you know, everyone’s going public. Cloudera has filed their S-1,
great news for those guys, so the data world has
changed beyond Hadoop. You’re seeing it, obviously Hadoop is not dead, but it’s still going to
be a critical component of a larger ecosystem that’s developing. You guys are part of that. So I want to get your thoughts of why you’re here in Europe, okay? And how you guys are working together to take data to the next level, because, you know, we’re
hearing more and more data is a foundational
conversation starter, because now there’s
other things happening, IOT, business analysts, you
guys are in the heart of it. Your thoughts?>>You know, going to be you.>>All in, yeah, sure. So definitely at Alation
what we’re seeing is more and more people
across the organization want to get access to the data, and we’re kind of breaking out of the traditional roles around
IT managing both metadata and data preparation, like
Trifacta’s focused on. So we’re pretty squarely focused on how do we bring that access
to a wider range of people? How do we enable that social
and collaborative approach to working with that data, whether it’s in a data lake
(here at DataWorks, clearly that’s one of the main topics) or in other data sources
within the organization.>>So you’re freeing the data up and the whole collaboration thing is more of, okay, don’t just
look at IT as this black box of give me some data and now
spit out some data at me. Maybe that’s the old way. The new way is okay, all
of the data’s out there, they’re doing their thing, but the collaboration is for the user to get into that data you know, ingestion. Playing with the data, using
the data, shaping the data. Developing with the data. Whatever they’re doing, right?>>It’s just bringing transparency to not only what IT is doing and making that accessible to users, but also helping users collaborate across different silos within
an organization, so. We look at things like logs to understand who is
doing what with the data, so if I’m working in one group, I can find out that somebody in a completely different
group in the organization is working with similar data, bringing new techniques to their analysis, and can start leveraging that and have a conversation that
others can learn from, too.>>So basically it’s like a
discovery platform for saying hey, you know, Mary in department X has got these models. I can leverage that. Is that kind of what
you guys are all about?>>Yeah, definitely. And breaking through that,
enabling communication across the different
levels of the organization, and teaching other people at all different levels of
maturity within the company, how they can start interacting with data and giving them the tools to up skill
throughout that process.>>Bertrand, how about Trifacta? ‘Cause one of the things
that I find exciting about your value proposition
and talking to Joe, the founder, besides the fact that they all have GitHub on their about page, which is the coolest thing ever, ’cause they’re all developers. But the reality is that a business person or
person dealing with data in some part of a geography, could be whether it’s
in Europe or in the US, might have a completely different
view and interest in data than someone in another area. It could be sales data,
could be retail data, it doesn’t matter, but it’s never going to be the same schema. So the issue is, you’ve got to take that complexity away
from the user. That is a really fundamental change.>>Yeah. You’re totally correct. So information is there, it is available. Alation helps identify what is the right information that can be used, so if I’m in marketing, I could reuse sales information, associating it maybe with
web log information. Alation will give me
the opportunity to know what information is available
and if I can trust it. If someone in finance is
using that information, I can trust that data. So now as a user, I
want to take that data, maybe combine the data, and
the data is always in a different format, structure, level of quality, and the work of data wrangling
is really for the end user, who can be an analyst, someone in the line of
business most of the time. At some of the customers we have here in Germany, like Munich Re, these would be actuaries, building risk models and
doing claim forecasting, payment forecasting. So they are not technologists at all, but they need to combine these data sets by themselves, and at scale, and with the work they’re doing, they are producing new information, and this information is used
directly in their own business, but as soon as they
share this information back to the data lake, Alation
will index this information, see how it is used, and
make it visible to the other users for reuse as well.>>So you guys have a partnership, or is this more of a
standard API kind of thing?>>So we do have a partnership, we have planned development on the roadmap. It’s currently happening. So I think by the end of the quarter, we’re going to be
delivering a new integration where, whether I’m in
Alation and looking for data and finding something
that I want to work with that I know needs to be prepared, I can quickly jump into
Trifacta to do that. Or the other way around: in Trifacta, if I’m looking for data to prepare, I can open the catalog,
quickly find out what exists and how to work with it better.>>So basically the relationship,
if I get this right is, you guys pass on your
expertise of the data wrangling all the back processes you guys have, and advertise that into Alation. They discover it, make it surfaceable for the social collaboration
or the business collaboration.>>Exactly. And when the data is wrangled, it gets indexed, and so
it’s a virtuous circle where all the data that
is created and combined is exposed to the user to be reused.>>So if I were Chief Data Officer, I’d say okay, there’s
three sequential things that I need to do, and you can maybe help
me with a couple of them. So the first one is I need to understand how data contributes to the
monetization of my company, if I’m a public company
or a for-profit company. That’s, I guess, my challenge. But then, there are two other things: I need to give people
access to that data, and I need quality. So I presume Alation
can help me understand what data’s available. I can actually, it kind of
helps with number one as well because like you said, okay,
this is the type of data, this is how the business process works. Feed it. And then the access piece and quality. I guess the quality is really
where Trifacta comes in.>>GianThomas: Yes.>>What about that sequential
flow that I just described? Is that common?>>Yeah>>In your business, your customer base.>>It’s definitely very common. So, kind of going back to
the Munich Re examples, since we’re here in Munich, they’re very focused on
providing better services around risk reduction for their customers. Data that can impact that risk can be of all kinds from
all different places. You kind of have to think
five, ten years ahead of where we are now to see
where it might be coming from. So you’re going to have a ton of data going in to the data lake. Just because you have a lot of data, that does not mean that people
will know how to work with it; they won’t even know that it exists, especially since
the volumes are so high. It doesn’t mean that it’s all coming in in a readily usable format. So Alation comes in to
play in helping you find not only what exists, by automating that process of extraction but also looking at what data
people are actually using. So going back to your
point of how do I know what data’s driving value
for the organization, we can tell you in this schema, this is what’s actually
being used the most. That’s a pretty good
starting point to focus in on what is driving value and
when you do find something, then you can move over
to Trifacta to prepare it and get it ready for analysis.>>So keying on that for a second, so in the example of Munich Re, the value there is my
reduction in expected loss. I’m going to reduce my risk, that puts money in my bottom line. Okay, so you can help me with number one, and then take that Munich
Re example into Trifacta.>>Yes, so the user will be the same user using Alation and Trifacta. So it is an actuary. So as soon as the actuary
identifies the data that is the most relevant
for what they’ll be planning, so the actuaries are
working with terms like development triangles over 20 years. And usually it’s column by column, so they have to pivot the data row by row. They have to associate
that with the paid claims and the new claims coming in, so all this information
is in different formats. Then they have to look at
maybe weather information, or additional third party information where the level of
quality is not well known, so they are bringing data in the lake that is not yet known. And they’re combining all this data. The outcome of that work, that
helps in the Reese modeling so that could be used by, they could use Sass or
our older technology for the risk modeling. But when they’ve done that modeling and building these new data sets. They’re, again, available to the community because Alation would
index that information and explain how it is used. The other things that
we’ve seen with our users is there’s also a very strong, if you think about insurances banks, farmer companies,
there is a lot of regulation. So, as the user, as you
are creating new data, said where the data coming from. Where the data is going, how
is it used in the company? So we’re capturing all that information. Trifecta would have the
rules to transform the data, Alation will see the
overall eye level picture from table to the source
system where the data is come. So super important as well for the team.>>And just one follow up. In that example, the actuary, I know hard core data
scientists hate this term, but the actuaries are the
citizen data scientists. Is that right?>>The actuaries would know, I
would say, statistics, usually. But you get multiple levels of actuaries. You get many actuaries,
they’re Excel users. They have to prepare data. They have to clean up and structure the data to give it to the next actuary who will be doing the pricing model, or the next actuary
who will do risk modeling.>>You guys are hitting on a great formula which is cutting edge, which is why you guys are on the startups. But, Bertrand, I want to talk to you about your experience at Informatica. You were the founder
of Informatica France. And you’re also involved
in some product development in the old, I’d say old days, but like, back in the days when structured data and enterprise data, which
was once a hard problem, deal with metadata, deal with search, you had schemas, all kinds
of stuff to deal with. It was very difficult. You have expertise. I want you to talk about what’s different now in this environment. Because it’s still challenging. But now the world has
got so much fast data, we got so much new IOT data, especially here in Europe.>>Oh yes.>>Where you have an industrialized focus, certainly Germany, like case in point, but it’s pretty smart
mobility going on in Europe. You’ve always had that mobile environment. You’ve got smart cities. A lot of focus on data. What’s the new world like now? How are people dealing with this? What’s your perspective?>>Yes, so we all know about big data, with all this
volume, additional volume and new structures of data. And I would say legacy technology can deal, as you mentioned, with
well structured information. Also you want to give that
information to the masses. Because the people who know the data best, are the business people. They know what to do with the data, but the access of this
data is pretty complicated. So where Trifacta is
really differentiating and has been thinking
through that is to say whatever the structure of the data, IoT, web logs, key-value pairs, JSON, XML, that should be, for an
end user, just a matrix. So that’s the way you understand the data. The next thing: when you play with data, usually you don’t know what
the schema would be at the end. Because you don’t know
what the outcome is. So, as an end user, you are exploring the
data, combining data sets, and the structure is created
as you discover the data. So that is also something
new compared to the old model where an end user would
go to the data engineer to say I need that information, can you give me that information? And engineers would look
at that and say okay. We can access here, what is the schema? There was all this back and forth.>>There was so much
friction in the old way, because the creativity of the user is independent now of all that scaffolding and all the wrangling, pre-processing. So I get that piece of
the Citizen’s Journal, Citizen Analyst. But the key thing here
is you were shrecking with the complexity to get the job done. So the question then comes in, because it’s interesting, the whole theme here at DataWorks Summit in Europe and in the US is that all the big transformative conversations are starting with business people. So this is a business unit, so the front lines if you will, not IT. Although IT now’s got to support that. If that’s the case, the world’s shifting to the business owners. Hence your start up. Am I kind of getting that right?>>I think so. And I think that’s also where
we’re positioning ourselves is you have a data lake, you
can put tons of data in it, but if you don’t find an easy way to make that accessible
to a business user, you’re not going to get value out of it. It’s just going to become a storage place. So really, what we’ve focused on is how do you make that
layer easily accessible? How do you share around and bring some of the common
business practices to that? And make sure that you’re
communicating with IT. So IT shouldn’t be cast aside, but they should have
an ongoing relationship with the business user.>>By the way, I’ll point out that Dave knows I’m not really a big
fan of the data lake concept mainly because they’ve
turned it into data swamps because IT deploys it, we’re done! You know, check the box. But, data’s getting stale
because it’s not being leveraged. You’re not impacting the data
or making it addressable, or discoverable or even wrangleable. If that’s a word. But my point is that’s all complexity.>>Yes, so we call it a
sort of frozen data lake. You build a lake, and then it’s frozen and nobody can go fishing.>>You play hockey on it. (laughs)>>You dig and you’re fishing.>>And you need to have this collaboration ongoing with the IT people, because they own the infrastructure. They can feed the lake with
data with the business. If there is no collaboration, and we’ve seen that multiple times with data lake initiatives, then
we come back one year after and there is no one using the lake; like one or two percent of
the processing power or the data is used. Nobody is going to the lake. So you need to index the
data, catalog the data to know what is available.>>And the psychology
for IT is important here, and I was talking yesterday
with IBM folks, Nevacarti here, but this is important because IT is not
necessarily in a position of doing it because doing
the frozen lake or data swamp because they want to screw
over the business people, they just do their job, but here you’re empowering them because you guys have got some tech that’s enabling IT to do a data lake or data environment that allows them to free up the hassles, but more importantly, satisfy
the business customer.>>GianThomas: Exactly.>>There’s a lot of tech involved. And certainly we’ve talked
to you guys about that. Talk about that dynamic of the psychology because that’s what IT wants. So what’s that DevOps mindset for data, DataOps if you will, or you know, data as code if you will, which is what we’ve constantly been calling it, but that’s now the cloud
ethos hitting the data ethos. Kind of coming together.>>Yes, I think data
catalogs are subtly different in that traditionally they
are more of an IT function, to some extent on the metadata side, whereas on the business side, they tended to be a siloed organization of information that the business itself kept and maintained very manually. So we’ve tried to bring that together. All the different parties
within this process, from the IT side to governance and stewardship all the way down to the
analysts and data scientists can get value out of a data catalog that can help each other
out throughout that process. So if it’s communicating to end users what kind of impact any
change IT will make, that makes their life easier, and have one way to communicate that out and see what’s going to happen. But also understand what
the business is doing for governance or stewardship. You can’t really govern or curate if you don’t know what exists and what matters to the business itself. So bringing those different stages together, helping them help each other, is really what Alation does.>>Tell us about the prospects
that you guys are engaging in from a customer standpoint. What are some of the
conversations with those customers you haven’t gotten yet together? And also give an example of a customer that you guys have, and use cases where
they’ve been successful.>>Absolutely. So typically what we see, is that an organization
is starting up a data lake or they already have
legacy data warehouses. Often it’s both, together. And they just need a unified way of making information
about those environments available to end users. And they want to have
that better relationship. So we’re often seeing IT engaged in trying to develop that relationship along with the business. So that’s typically how we start, and in the process of deploying, we work into that conversation that now that you know what exists, what you might want to work with, you’re often going to have to
do some level of preparation or transformation. And that’s what makes
Trifacta a great fit for us as a partner, in coming to that next step.>>Yeah, an example would be MarketShare, one of our common customers; we have DNSS, also a common customer, eBay, a common customer. So we’ve got already multiple customers, and so some information
about the issue at MarketShare: they have to deal with
their customer information. So the first thing they receive is data, digital information about ads, and so it’s really marketing type of data. They have to assess the
quality of the data. They have to understand the values and combine them
with their existing data to provide back analytics
to their customers. And in that use case, we were
talking to the business users, the people selling MarketShare
to their customers, because the faster they
can onboard their data and qualify the quality of the data, the easier it is to deliver the right level of quality analytics, and also to engage more customers. So it really was about
fast onboarding of customer data and delivering analytics. And what Alation explains is
that they can then analyze all the SQL statements
that the customers, maybe I’ll let you talk about the use case, but it was also the same users looking at the same information, so we engaged with the business users.>>I wonder if we can talk
about the different roles. You hear about the data
scientists obviously, the data engineer, there might be a data quality
professional involved, there’s certainly the
application developer. These guys may or may not even be in IT. And then you’ve got a DBA. Then you may have somebody
who’s a statistician. They might sit in the line of business. Am I overcomplicating it? Do larger organizations
have these different roles? And how do you help bring them together?>>I’d say that those roles are
still in flux in the industry. Sometimes they sit on the IT side, sometimes they sit in the business. I think there’s a lot
of movement happening; there’s not a consistent definition
of those different roles. So I think it comes down
to different functions. Sometimes you find those
functions happening within different places in the company. So stewardship and governance
may happen on the IT side, it might happen on the business side, and it’s almost a maturity scale of how involved the two
sides are within that. So we play with all of
those different groups so it’s sometimes hard to
narrow down exactly who it is. But generally it’s on
the consumption side, whether it’s the analyst
or data scientists, and there’s definitely a
crossover between the two groups, moving up towards the
governance and stewardship that wants to enable those users or document and curate the data for them, all the way to the IT data engineers that operationalize a lot of the work that the data scientists and analysts might be hypothesizing and
working with in their research.>>And you sell to all of those roles? Who’s your primary user
constituency, or advocate?>>We sell both to the analytics groups as well as governance and
they often merge together. But we tend to talk to all
of those constituencies throughout a sales cycle.>>And how prominent in your customer base do you see the role
of the Chief Data Officer? Is it only confined
within regulated industries? Does it seep into
non-regulated industries?>>I’d say for us, it seeps
into non-regulated industries.>>What percent of the
customers, for instance have, just anecdotally, not even customers, just people that you talk to,
have a Chief Data Officer? Formal Chief Data Officer?>>I’d say probably
about 60 to 70 percent.>>That high?>>Yeah, same for us. In regulated industries (mumbles). I think they play a role. The real advantage of a Chief
Data and Analytics Officer is that it’s data and analytics, and they have to look at governance. Governance could be for regulation, because you have to, you’ve
got governance policy, which data can be combined with
which data, there is a lot. And you need to add that. But then, even if you are less regulated, you need to know what data is available, and what data is (mumbles). So you have this requirement as well. We see them a lot. They are more and more powerful, I would say, in the enterprise where
they are able to collaborate with the business to enable the business.>>Thanks so much for coming on the Cube, I really appreciate it. Congratulations on your partnership. Final word I’ll give you guys before we end the segment. Share a story, obviously you
guys have a unique partnership, you’ve been in the business for awhile, breaking into the business with Alation. Hot startups. What observations out there
that people should know about that might not be known
in this data world. Obviously there’s a lot of
false premises out there on what the industry may or may not be, but there’s certainly
a sea change happening. You see AI, it gives a
mental model for people: machine learning, autonomous
Vehicles, Smart Cities, some amazing, kind of
magical things going on. But for the basic business
out there, they’re struggling. And there’s a lot of opportunities
if they get it right, what thing, observation,
data, pattern you’re seeing that people should know
about that may not be known? It could be something anecdotal
or something specific.>>You go first. (laughs)>>So maybe this will be surprising, but Kaiser is a big customer of ours. And you know Kaiser in
California in the US. They have hundreds or
thousands of hospitals. And surprisingly, some of
the supply chain people have been working for years, trying to analyze and
optimize the relationship with their suppliers. Typically they would buy a
staple gun without staples. Stupid. But they see that happening over and over with many products. They were never able to
solve this, because why? There would be one product
and they had to go to IT, they had to work on it, it
would take two months, and then there’s another
supplier, new products. So how to know->>John: They’re chasing their tail!>>Yeah. It’s not super exciting, but they are now able to do that
in a couple of hours. So now they are able,
by going to the data lake, to see what data there is, to see how
this hospital is buying; before, they were not able to do it. So there is nothing magical here, it’s just giving access to the
data to those who know the data best, the analysts.>>So your point is don’t underestimate the innovation; as small as it may seem, or inconsequential, it
could have huge impacts.>>The innovation goes with the process to be more efficient with the data, not so much building new products, just basically being good at what you do, so then you can focus on the
value you bring to the company.>>GianThomas, what are your thoughts?>>So it’s sort of related. I would actually say something
we’ve seen pretty often is that companies of all sizes are struggling with very similar problems in the
data space specifically, so it’s not that big companies
have it all figured out and small companies are
behind trying to catch up, and small companies aren’t
necessarily super agile and able to change
at the drop of a hat. So it’s a journey. It’s a journey and it’s
understanding what your problems are with the data in the company and it’s about figuring out what works best for your
solution, or for your problems. And understanding how that
impacts everyone in the business. So it’s really a learning process to understand what’s going->>What do your friends who aren’t in the tech
business say to you? Hey, what’s this data thing? How do you explain it? The fundamental shift,
how do you explain it? What do you say to them?>>I’m more and more getting people that already have an idea
of what this data thing is. Which five years ago was not the case. Five years ago, it was oh, what’s data? Tell me more about that? Why do you need to know about
what’s in these databases? Now, they actually get
why that’s important. So it’s becoming a concept
that everyone understands. Now it’s just a matter
of moving it into practice and how that actually works.>>Operationalizing it, all the
things you’re talking about. Guys, thanks so much for
bringing the insights. We wrangled it here on the Cube. Live. Congratulations to Trifacta and Alation. Great startups, you guys are doing great. Good to see you guys successful again, and a rising tide floats all boats in this open source world we’re living in, and we’re bringing you more coverage here at DataWorks 2017. I’m John
Furrier with Dave Vellante. Stay with us, more great content coming after this short break. (upbeat music)
