Focus on Data Science: Doc Ligot, Cirrolytix
Data Science is an inescapable buzzword in the Philippines, yet companies continue to struggle to integrate it into their businesses. TechShake hopes to shed some light on the area with our series “Focus on Data Science,” where we speak to data practitioners across verticals, companies, and disciplines. Check out the rest of our Focus on Data Science series here.
Dominic “Doc” Ligot is an entrepreneur and consultant with a background in Banking and FinTech. Before founding Cirrolytix in 2016, he was the Head of Risk Analytics at ANZ after heading Consumer Credit and Analytics at HSBC and was co-founder of P2P lender Uploan.
Here’s our conversation, edited for clarity and flow.
How do you see Data Science in the Philippines?
Back before I started Cirrolytix, data was mostly viewed as a back office thing. Data science sounded sexy, but no one believed data scientists could exist in the Philippines considering the state of education locally. Up to now, people can't agree on what a data scientist should have in terms of credentials — maybe it doesn't really matter at some point.
It was, and maybe still is, a fight on all fronts. You have to teach the client what they need, that they need it, and that they have to pay for it. At the same time you have to find people to train while teaching the government to support data education.
How did you end up starting Cirrolytix?
In 2014, after my stints in banking, I joined Teradata, a company that sold data warehousing solutions. Although I had been running analytics for ANZ, I didn't know enough about the back end. I thought working for an IT company, one of the reputed leaders in data warehousing, would help. It was a sales role, but I was more of an industry expert. And that's where I saw the sad state of startups here.
I saw small and medium enterprises struggling with their data, but all the big vendors (the IBMs, the SASs, the Oracles) were too expensive. The final straw for me came in late 2016, I would say, on Halloween in San Francisco. I was near Union Square and I rode an Uber to my hotel. Back then, Uber had this interesting thing in their app. While you’re riding, the app asks you, “Would you want to play a game?” They show you these mind puzzles, programming puzzles. By the fifth question, the app says, “Hey, it looks like you know a little bit about technology. Would you like a job at Uber?”
I thought: wow, it's gamified and suddenly they're recruiting you. I was a bit teary-eyed when I got out of the car, because it was so far removed from what was happening in the Philippines. In the Philippines, data people couldn’t find good work even though they're badly needed by everyone. I was meeting companies who couldn't afford the 20 to 40 million peso price tag of the typical data warehouse system. So I thought maybe it’s a price issue. Maybe a 1 to 2 million peso price tag would work, right? That's where I got inspired. You have to understand, at the time, I had already been working in corporate for almost 14 years. It was a pretty late start out the front door.
I started gathering everything I knew and came up with an initial offering. There weren't enough startups doing what I was building and offering good rates. Plus, open source wasn't trusted back then. I called the company Cirrolytix because of cirrus, as in the clouds. I thought, we'll buck the trend: we'll offer analytics on the cloud.
What does Cirrolytix do?
At Cirrolytix, we had three ideas. Training was the first, and that space has gotten really crowded. The second is data engineering, which was the missing link in many companies. These companies have reports and models, but they don't have consistent ways of sourcing, storing, and cleansing data, much less data warehouses. So I felt, why don't we just build those? And then the rest will follow.
What surprised me was the third, which is business transformation, culture change, just getting the right mindset. There were many leadership groups teaching digital transformation, but data wasn’t a part of it. I thought, “How can you achieve digital transformation without data?” Digital is not just websites and banner ads and reading guru books. It starts with putting the customer and the data together. So I felt, okay, maybe we'll attack that to drive digital transformation.
How are companies going through digital transformation?
It’s all over the place. Industry-wise, banks have lots of data but the cultures need to change a lot. Telcos are next in line. Their cultures are a little better, but they struggle with big data in its literal form: it's really, really big. You have data centers full of records and no one combing through them. It's also a very IT-centric culture in telcos, with lots of engineers running everything, while for analytics to thrive you need people with a real sense for business. And then everyone else follows.
On the other hand, FMCG, that's fun to be in. Because it's a lot of marketing and product... but the data that they were used to was just surveys from market research. Thankfully now they're starting to ingest point-of-sale (POS) data from their retailer partners. We got a lot of our first successes in that sector talking to the likes of P&G, Unilever, Nestlé.
You’ve shifted a lot of your focus from companies to government, education, and social impact. How did that happen?
Companies are transforming, don’t get me wrong. All those projects, maybe it helped these companies’ bottom lines, maybe not. But from a personal fulfillment perspective, I didn’t feel like I was making an impact.
I was working with AAP to help education but the government education sector was taking too long — another big pivot is needed there. So I got my feet wet, started teaching in a university setting... went to UA&P, I helped them craft their master's degree in business analytics. This was about the same time AIM launched their MSDS program. I felt that we were seeing the next wave come in, with schools starting to get on board the data science trend.
I also felt (and this still remains to be seen) that freelance consulting for analytics is going to get a little harder, because graduates are going to start coming out with some data skills and, local companies being what they are, they will still prefer to hire their own people. Cirrolytix would run out of clients very quickly unless we pivoted, and that's when the call to do more social impact-centric projects started.
How do you approach data science for social impact?
The economics change a lot because your client doesn't pay the bill. You have to find a funder willing to pay on behalf of a beneficiary. But I find the projects in social impact more interesting. We recently bagged the NASA International Space Apps Challenge for a dengue outbreak prediction system... and, in that scenario, while a government agency like the DOH may or may not pay for it, a global health organization like the WHO with a specific mission to end dengue might. And the impact is very real. There's a real dengue epidemic going on today.
There was also the Break the Fake Hackathon for fake news, where my group bagged a win with a solution to digitally map sources of disinformation, a big problem now with Facebook and YouTube intentionally used to distribute propaganda. So there seem to be more interesting use cases in social impact with very real benefits to the public. The challenge is just matching funders with beneficiaries. Looking ahead, another social challenge we’re looking at is traffic: how do we solve traffic? It's such a big issue for commuters in Manila.
But public health is the big one. Polio is coming back. I have a worksheet I made based on DOH stats that shows a dip in polio immunizations a few years ago. The data has been there for years, but no one cared to have a look. No one seems to have seen it. And now we have seven cases. Polio is alive in the Philippines. It's been eradicated everywhere else. I think this is where data science should go: maximum impact on as many lives as possible.
These are big issues. What do you see as government’s role in all this?
One, government is obviously the best (or at least, the most powerful) when it comes to moving money. You just have to contend with the red tape and paperwork. One of the bright spots I've seen is that, in the past 12 months, “Smart Cities” is a bandwagon everyone keeps jumping on. They can't necessarily agree on what a smart city is, but at least it's a good catch-all theme. So there’s funding, and the DOST (Department of Science and Technology) is always available to provide grants and partnerships.
The second is data. The government has a lot of data. It's just not in the right form, and this is data that could be put to great use even in the private sector.
Third, the government is still the biggest employer, so creating a data-driven culture in government could have a massive impact. To my knowledge, data is not part of any government training curriculum. That said, I have been dealing with the Development Academy of the Philippines and they're now open to adding data science to the curriculum.
It sounds like all of these changes are happening, but there are still many blind spots. What should governments, companies, and individuals be aware of with regard to data and data science?
Data ethics. That's a big thing for me because I can see it happening almost immediately. The moment companies become literate about data, they're going to start abusing it. Not even intentionally; they might just accidentally do it. For example, in China, using facial recognition to judge if you're going to be a criminal or not. [Psychologist Michal] Kosinski did a similar thing, but for gender preference. You know, detect if you're gay based on your photograph. This sort of idea, this was a pseudoscience back in the 19th century, phrenology: judging people by appearance. With data, people are going back to phrenology with surprisingly accurate results. Is that even ethical?
There doesn’t seem to be a discussion about data with an ethical lens, which already exists in other practices — something like medical malpractice. Right now, there's only a big emphasis on data privacy. But if you think about it, all the abuses happen after the privacy questions have already been solved. Companies already have your data when the trouble starts.
We haven't even factored in stuff like algorithmic liabilities. Let’s say you've created an automated model that ended up killing someone. The classic example is the self-driving car that runs over a pedestrian. Who do you sue? There are ethical questions that I don't think existing laws are prepared for.
I don't think these things are intentionally ignored. By default, people are going to go with their normal way of doing things. That's why it's still hard to convince people of simple things, like that they can be entrepreneurs, they can freelance, they can be consultants. It's the same reason why it’s hard to convince companies to outsource decision making to algorithms. We're all trained by the second and third industrial revolutions.
The Fourth Industrial Revolution is all about data and how algorithms can make machines act independently, which will force us to change our fundamental thinking about everything. We need to meet these challenges head-on.