AI in the Real World - The Linux Foundation

Hilary Mason, general manager for machine learning at Cloudera, discussed AI in the real world in her keynote at the recent Open FinTech Forum.

We are living in the future, but it is just unevenly distributed, and it comes with “an outstanding amount of hype and this anthropomorphization of what [AI] technology can actually provide for us,” observed Hilary Mason, general manager for machine learning at Cloudera, who delivered a keynote on “AI in the Real World: Today and Tomorrow” at the recent Open FinTech Forum.

AI has existed as an academic field of research since the mid-1950s, and if the forum had been held 10 years ago, we would have been talking about big data, she said. But today, we have machine learning and feedback loops that allow systems to continue to improve with the introduction of more data.

Machine learning provides a set of techniques that fall under the broad umbrella of data science. AI has returned, from a terminology perspective, Mason said, because of the rise of deep learning, a subset of machine learning techniques based around neural networks that has provided not just more efficient capabilities but the ability to do things we couldn’t do at all five years ago.

Imagine the future

All of this “creates a technical foundation on which we can start to imagine the future,” she said. Her favorite machine learning application is Google Maps. Google gets real-time data from people’s smartphones, then integrates that data with public data sets, so the app can make predictions based on historical data, she noted.

Getting this right, however, is really hard. Mason shared an anecdote about how her name is a “machine learning-edge case.” She shares her name with a British actress who passed away around 2005 after a very successful career.

Late in her career, the actress played the role of an ugly witch, and a search engine from 2009 combined photos with text results. At the time, Mason was working as a professor, and her bio was paired with the actress’s picture in that role. “Here she is, the ugly hag… and the implication here is obvious,” Mason said. “This named entity disambiguation problem is still a problem for us in machine learning in every domain.”
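To make the named entity disambiguation problem concrete, here is a minimal sketch, assuming a toy approach in which each candidate entity has a short description and the candidate whose description shares the most words with the surrounding text wins. The candidate identifiers, descriptions, and bio snippet are hypothetical stand-ins, and this is not the method used by the search engine in the anecdote or by any production system.

```python
import re

# Hypothetical candidates that share one name. The descriptions are
# illustrative stand-ins, not entries from a real knowledge base.
CANDIDATES = {
    "hilary_mason_data_scientist": "data scientist professor machine learning researcher",
    "hilary_mason_actress": "british actress film television roles",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(context, candidates):
    """Score each candidate by word overlap with the context; return the best."""
    context_words = tokenize(context)
    scores = {
        entity_id: len(context_words & tokenize(description))
        for entity_id, description in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical bio snippet standing in for the text next to the photo.
bio = "Hilary Mason is a professor who teaches machine learning and data mining."
best, scores = disambiguate(bio, CANDIDATES)
print(best, scores)
```

A system that matches on the name alone has nothing to separate the two candidates, which is how a bio can end up next to the wrong photo. Even this crude overlap score uses context to break the tie, though real text is rarely this clean.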

This example illustrates that “this technology has a tremendous amount of potential to make our lives more efficient, to build new products. But it also has limitations, and when we have conferences like this, we tend to talk about the potential, but not about the limitations, and not about where things tend to go a bit wrong.”

Machine learning in FinTech

Large companies operating complex businesses have a huge amount of human and technical expertise on where the ROI in machine learning would be, she said. That’s because they also have huge amounts of data, generally created as a result of operating those businesses for some time. Mason’s rule of thumb when she works with companies is to find some clear ROI on a cost savings or process improvement using machine learning.

“Lots of people, in FinTech especially, want to start in security, anti-money laundering, and fraud detection. These are really fruitful areas because a small percentage improvement is very high impact.”

Other areas where machine learning can be useful are understanding your customers, churn analysis and marketing techniques, all of which are pretty easy to get started in, she said.

“But if you only think about the ROI in the terms of cost reduction, you put a boundary on the amount of potential your use of AI will have. Think also about new revenue opportunities, new growth opportunities that can come out of the same technologies. That’s where the real potential is.”

Getting started

The first thing to do, she said, is to “drink coffee, have ideas.” Mason said she visits lots of companies, and when she sees their list of projects, they’re always good ideas. “I get very worried, because you are missing out on a huge amount of opportunity that would likely look like bad ideas on the surface.”

It’s important to “validate against robust criteria” and create a broad sweep of ideas. Then, go through and validate capabilities. Some of the questions to ask include: is there research activity relevant to what you’re doing? Is there work in one domain you can transfer to another domain? Has somebody done something in another industry that you can use or in an academic context that you can use?

Organizations also need to figure out whether systems are becoming commoditized in open source, meaning “you have a robust software and infrastructure you can build on without having to own and create it yourself.” Then, the organization must figure out whether data is available, either within the company or for purchase.

Then it’s time to “progressively explore the risky capabilities. That means have a phased investment plan,” Mason explained. In machine learning, this is done in three phases, starting with validation and exploration: Does the data exist? Can you build a very simple model in a week?

“At each [phase], you have a cost gate to make sure you’re not investing in things that aren’t ready and to make sure that your people are happy, making progress, and not going down little rabbit holes that are technically interesting, but ultimately not tied to the application.”

That said, Mason noted that predicting the future is, of course, very hard, so people write reports on different technologies that are designed to be six months to two years ahead of what they would put in production.

Looking ahead

As progress is made in the development of AI, machine learning and deep learning, there are still things we need to keep in mind, Mason said. “One of the biggest topics in our field right now is how we incorporate ethics, how we comply with expectations of privacy in the practice of data science.”

She gave a plug to a short, free ebook called “Data Driven: Creating a Data Culture,” that she co-authored with DJ Patil, who worked as chief data scientist for President Barack Obama. Their goal, she said, is “to try and get folks who are practicing out in the world of machine learning and data science to think about their tools [and] for them to practice ethics in the context of their work.”

Mason ended her presentation on an optimistic note, observing that “AI will find its way into many fundamental processes of the businesses that we all run. So when I say, ‘Let’s make it boring,’ I actually think that’s what makes it more exciting.”



Esther Shein