Focus on Data Science: Xavier Puspus, FTW Foundation

Data Science is an inescapable buzzword in the Philippines, yet companies continue to struggle to integrate it into their businesses. TechShake hopes to shed some light on the area with our series “Focus on Data Science,” where we speak to data practitioners across verticals, companies, and disciplines. Check out the rest of our Focus on Data Science series here.

Xavier Puspus has worked in data science across fintech, media, and more. Currently a Data Science Manager at FTW Foundation, Xavier shares his views on how companies think about data science as well as what makes a good data scientist and how he translates that into his own teaching style.

Here’s our interview, edited for clarity and flow.

You’ve worked with data across several industries. How do you think companies are adopting Data Science?

Data Science is definitely in demand right now. I see companies wanting data scientists for their teams, creating data science teams... but they don't understand it enough. Maybe because they struggle to articulate their business problems or figure out how data can impact their business.

As an example, ABS-CBN started a digital transformation but they saw data science as just a small part of it. So models would take years to get deployed. I feel businesses think things should be perfect before they “see the wild” but there’s a disconnect there… these models can’t improve or be perfect if they don’t interact with your customers, brands, or businesses.

There needs to be more awareness of how data can and should impact businesses across the company. Not just analytics but data science.

Are there industries that you feel have been able to integrate Data Science well?

My first job was at SN Aboitiz Power. I started as a Market Operations Associate and then became a trader. That was my first stint with data. What we did at Aboitiz was run operations for commercial power: power trading, managing contracts to sell power.

I mention this because a lot of companies are unconsciously competent in data science. At the time, Aboitiz Power was doing a lot of prescriptive and predictive data science work — for example, forecasting our demand and supply. They had deployed many solutions but the people working on those solutions were called production planners, commercial operations analysts, or market analysts.

In actuality, they were data scientists. They were literally using neural networks and these data science techniques — but they weren’t called data scientists. Using that example, data science was intrinsic in the work that they did but overall the industry was iffy about the idea of “Data Science.” 

Why do you think that is?

I think it has something to do with perceived velocity. In the Philippines, power isn’t traded as fast as in other countries; it's traded hourly. This means humans can still trade and power companies don’t have a lot of demand for actual AI or high-frequency trading. But as trading mechanisms transition to faster speeds, even, say, the five-minute mark... humans can't trade that fast. If power companies just understood that data scientists can do that, and that they're actually already doing fundamental data science work, they could transition quickly.

What about industries that are primed to use data well but don’t?

Marketing.

Really? But there's so much AI embedded in marketing already.

Exactly. That’s also one of the industries — functions — that’s unconsciously competent.

My first teaching stint was actually for certified digital marketing, which is pretty funny because marketing has a lot of AI, software services. There's a lot out there and the data is ready. But marketers don't know. Like, "wow, you can do that?" For example, attribution modeling — even Google Analytics can do that for you. 

But marketers struggle to execute. There’s definitely a need for education. 

What do you think the gap is?

It's not the training comprehension. It's not even the trainer. I feel like it comes from confidence. You hear students say, “I hate that. I hate having to do work with code. Tech is coding… APIs and all that.”

It's intimidating. Before we train, we do a pretest where we ask students what they are most scared of. They say, “Oh, I hate the coding. I even hate Excel. I hate numbers.” A lot of people don't want to start because it's scary.

How do you address that when you teach?

You can only solve the confidence issue by doing. What I saw in previous classes was theory covered in class, then students writing code as homework. For me, that's not ideal because I don't get to see students work, which means I can’t fix code or help them think things through more efficiently.

At FTW, every class is always: “Let's deploy this. Let's clean this. Don’t be scared of data. Don’t be scared of code. Have the best model, but stop when it's usable.”

At the end of the day, most data scientists are trying to solve a business problem. Data science is not just understanding the theory or say, making the best AI that can classify a dog as a dog. You need to find that balance between theory and practical use. Most of the time, you start with a business problem, take it down to a data problem, try to solve it using math, and then go back to the business.

It sounds like you focus on a really practical approach.

Yeah, for sure. I’m going to be more data-driven about my response here. 60% of algorithms don't get deployed; it's a global problem. Most algorithms just sit in their Jupyter Notebooks and the business never sees them. Education definitely shapes that. In most classes, it's more about how well you can model or the most advanced model out there as opposed to how a business can use the model.

For me, one of the great things about FTW is that they trusted me to structure the curriculum the way I wanted it — to be much more practical. I met with some Stanford graduates recently and they all said, “We’re big fans of Andrew Ng’s course.” And I’m like, “It’s cool but it's useless.” Because it’s teaching niche theory. Who cares how many trees there are in your model? It’s more… How can I grow my sales?

I want my course to be more practical — meaning the business can interact with your model. The endgame for FTW Class 3 is not just a model score or the best Python Notebook but a deployed web application that anyone can interact with.

I think that's where the data science lifecycle should end... well, it doesn't end — it's a cycle that propagates itself, learns on its own.

How do you think companies can use data science more?

Many companies use data science, but it’s like they’re sprinkling it on top. It's a two-way street. What a data scientist needs is for management to delegate the problem to us, not micromanage the solution. You should bring us in because we're good enough to see things you may not have seen. Management also needs to understand that data science is iterative. You can't just hire us for a day.

The other thing is that, often, data management is not in place. You can’t do great data science without a data engineering and data infrastructure foundation. But you also have companies that are, again, unconsciously competent. They unwittingly have data structures in place. We have one client who kept saying they don’t have data but it turned out they had a good CRM setup.

Data science is very much overhyped, especially in this country. The unsung stars are data engineers, system administrators — they're just as important. If you don't have those people, when we come in, we're just going to be useless.

Any advice for people who want to become data scientists?

Um, don't study too hard. [laughs] I probably wasted too much time just studying models. But then again, it paid off for me because I teach. Do more. Check out Flask and React, frameworks or libraries that are machine learning friendly. Look at it from a web app perspective so you understand how things connect. These are the things you need in order to make the business understand.

Don't hide in the glamorous Python-machine-learning-whatever. Machine learning is just one day out of your 365 days. The rest of the year is you trying to convince business that you did something good, that you’re worth the salary you're being paid.

Also, be a communicator. If your thoughts are stuck in a document, or even in your mind, they don't translate to real execution. Especially in this country, we don't have too many data scientists and data science translators, so you have to be the one to communicate. At some point, you’re going to have to present, and if your slides are all numbers and formulas, it's useless. Nobody cares. Nobody knows what that means. At the C-level, they’ll often just say, “What's your recommendation?” And if you can’t answer, it doesn't help your case.

We’ve covered a lot but I think there are four pillars to being a data scientist: communication, business, math, and coding. If you lack one, it's hard to be a data scientist.