Probabilistic Programming - The promising new AI technology from an applied perspective

If you think Deep Learning made for a giant leap in applied AI, then just wait until Probabilistic Programming hits you in the face.

At Paperflow we have used probabilistic programming successfully, and in my mind the technology offers some very useful benefits from a business perspective. For some reason, though, it is not very widespread yet. My guess is that it comes down to not being commonly understood. So to make probabilistic programming better known, I thought it would make sense to go through what it is and what its ups and downs are. All from a business perspective, of course.

Probabilistic Programming definition

Probabilistic programming is a paradigm or methodology that mixes programming frameworks with Bayesian statistical modelling, inference algorithms and elements of machine learning. By comparison, a traditional machine learning or deep learning model is usually one big compiled model that is a black box from end to end. In probabilistic programming you define your own model, so to speak.

The most important difference to understand is that probabilistic programming is based on Bayesian reasoning. I wrote another blog post about the difference between Bayesianism and frequentism, but you don’t have to understand that to see the clear benefits of probabilistic programming. I heard a joke from AI researchers saying that with deep learning we need “more brawn and less brain”. The idea is that with deep learning, having a lot of data and a lot of computing power can give you better results than smart algorithms. The problem is of course that this approach runs into practical limits: getting enough data and enough computing power is often just too expensive.

It might sound like you have to totally change tracks in order to work with probabilistic programming. Luckily you don’t. Deep learning and probabilistic models are easily combined. So you should see this as an added layer rather than an all new way to work. 

The benefits 

So what are the benefits of probabilistic programming from a business perspective? Besides generally better results, in my experience there are three main benefits.

You need less data

In probabilistic programming you can build your domain knowledge into the model and then let the model learn from data as it goes. A standard deep neural network has no straightforward way to do that. This means that you can start off with way less data than you would need in traditional machine learning.

Let’s say you want to build an AI that predicts customer churn. You might not have enough data about your own customers’ churn to get a very useful model with traditional machine learning. But if you know that the average churn in your market is 10% per year, you can build this into your model and let the data adjust it over time. The 10% is what we call a “prior”. Say the churn probability you arrive at after observing data is 12%; that is what we call your “posterior”. You can even choose whether it takes a lot of data or only a little to move the estimate away from the prior. We call those strong and weak priors respectively.
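
To make the prior and posterior idea a bit more concrete, here is a minimal sketch in plain Python. It assumes the simplest possible model, a Beta prior on the yearly churn rate that is updated with counts of churned and retained customers. The names BetaPrior and prior_from_mean, the "strength" parameter and the observed numbers are all hypothetical and only there for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPrior:
    """Beta(alpha, beta) belief about a yearly churn probability."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def update(self, churned: int, stayed: int) -> "BetaPrior":
        # Conjugate Beta-Binomial update: observed counts are added
        # to the prior's pseudo-counts.
        return BetaPrior(self.alpha + churned, self.beta + stayed)


def prior_from_mean(mean: float, strength: float) -> BetaPrior:
    # "strength" is an illustrative knob: roughly how many imaginary
    # customers the prior knowledge is worth.
    return BetaPrior(mean * strength, (1.0 - mean) * strength)


# Market knowledge: 10% churn per year, held weakly or strongly.
weak = prior_from_mean(0.10, strength=10)
strong = prior_from_mean(0.10, strength=1000)

# Hypothetical data: 20 of 100 of our own customers churned this year.
weak_posterior = weak.update(churned=20, stayed=80)
strong_posterior = strong.update(churned=20, stayed=80)

print(f"weak prior   -> posterior mean {weak_posterior.mean:.3f}")
print(f"strong prior -> posterior mean {strong_posterior.mean:.3f}")
```

With the same 100 observations, the weak prior ends up close to the observed 20% churn, while the strong prior stays close to the 10% market average. A real probabilistic programming framework lets you do the same thing for far richer models than this one.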

You know your uncertainty

The other really great benefit is that probabilistic models usually come with uncertainty distributions. So where traditional machine learning returns a single probability of something, you now get a probability distribution. And why is that important? Let’s say we are making a self-driving car. Our AI is 99% sure that there’s a green light ahead. How sure are we that the 99% is a correct estimate? Normally we just don’t know. With probabilistic programming we get a distribution. That means you know how sure you are that the 99% is in fact the correct probability. When driving an autonomous car, that is pretty useful.
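
As a rough illustration of what "a distribution instead of a number" buys you, the sketch below compares two beliefs that both say "99% green light" on average but are backed by very different amounts of evidence. It assumes the belief can be described by a Beta distribution and approximates a 95% credible interval by sampling with Python’s standard library. The function name credible_interval and the two sets of pseudo-counts are made up for the example.

```python
import random


def credible_interval(alpha, beta, level=0.95, draws=20_000, seed=0):
    """Approximate a central credible interval of Beta(alpha, beta) by sampling."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Both beliefs have mean 0.99, i.e. "99% sure the light is green".
few_observations = (99.0, 1.0)        # worth about 100 observations (hypothetical)
many_observations = (9900.0, 100.0)   # worth about 10,000 observations (hypothetical)

few_low, few_high = credible_interval(*few_observations)
many_low, many_high = credible_interval(*many_observations)

print(f"little evidence: 95% interval {few_low:.4f} to {few_high:.4f}")
print(f"much evidence:   95% interval {many_low:.4f} to {many_high:.4f}")
```

Both intervals sit around 0.99, but the one based on little evidence is much wider. That width is exactly the "how sure are we about the 99%" that a single point estimate cannot give you.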

You can explain your algorithm

Explainability in AI is in high demand but often in short supply. As mentioned earlier, many machine learning models are black boxes end to end, and you will not know why the model made a certain decision. In many cases that’s a problem. For example, there can be legislation giving loan applicants the right to know why they were rejected for a loan. Because you define the model yourself, probabilistic programming offers much easier explainability.

The challenges

Of course there are some downsides besides the obvious tongue-twister name. In my opinion they are without a doubt not big enough to be afraid of probabilistic programming, but you should be aware of them.

Still some technological challenges

Since this is a new field, there are still some technological problems that are just not quite solved yet. For example, inference tends to be very slow, which makes training the models very slow as well. The field is developing very fast and new techniques are coming out every day, but so far you will still have to jump through some hoops.
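
To give a feeling for why inference can be slow, here is a minimal sketch of one classic inference technique, a random-walk Metropolis sampler, written in plain Python. It is not how any particular framework does it, and the model (a single churn rate with a Beta prior), the step size and the observed counts are assumptions chosen for illustration. The point is that the sampler has to evaluate the model once per step, thousands of times over.

```python
import math
import random


def log_posterior(p, churned, stayed, prior_alpha=1.0, prior_beta=9.0):
    """Unnormalised log posterior of a churn rate p under a Beta prior."""
    if not 0.0 < p < 1.0:
        return -math.inf
    log_prior = (prior_alpha - 1.0) * math.log(p) + (prior_beta - 1.0) * math.log(1.0 - p)
    log_likelihood = churned * math.log(p) + stayed * math.log(1.0 - p)
    return log_prior + log_likelihood


def metropolis(churned, stayed, steps=20_000, step_size=0.05, seed=0):
    """Random-walk Metropolis; returns the draws and the number of model evaluations."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, churned, stayed)
    evaluations = 1
    draws = []
    for _ in range(steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_posterior(proposal, churned, stayed)
        evaluations += 1
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws, evaluations


# Hypothetical data: 20 of 100 customers churned.
draws, evaluations = metropolis(churned=20, stayed=80)
kept = draws[2_000:]  # drop the first draws as burn-in
estimate = sum(kept) / len(kept)

print(f"estimated churn rate: {estimate:.3f} after {evaluations} model evaluations")
```

Here each evaluation is a couple of logarithms, so the whole thing runs in a moment. In a realistic model with many parameters, where every evaluation has to look at the full data set, the same loop is what makes training take a long time.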

Not your usual programming

So this paradigm combines programming, machine learning and statistical tools. A very useful combination, but also a demanding one for developers. Most developers are not trained in statistics in school and will have to learn some new stuff in this area. That makes it a bit more difficult to find the right people when looking for developers.

So what’s next?

I would recommend that everyone in AI follow the news in probabilistic programming and, if possible, start incorporating it into their AI development. Google’s machine learning framework TensorFlow already offers Bayesian methods through the TensorFlow Probability library. So there’s no excuse not to give it a go.

If you are in Copenhagen I can also recommend joining one of the meetups here:

 https://www.meetup.com/Pioneers-of-Probabilistic-Programming/
