Dutch MMA veteran aiming for a last title shot in the twilight of his career

Alistair Overeem, via Wikimedia.

December 30th, 2011, the night before New Year’s Eve. For a select group of people, that evening is better known as “UFC 141: Lesnar vs. Overeem”. Both men pushed the limits of the heavyweight division: Brock Lesnar weighed in at 266 lbs, Alistair Overeem at 263. Not an ounce of fat on either of them; there may never have been as much muscle mass in the octagon as on that night. The match was promoted as the ultimate ‘wrestler versus striker’ match-up. Could Overeem maintain his footing against the insane speed and power of this colossal pro wrestler? Could Lesnar hold his own against the…


Large and high-dimensional action spaces are often computational bottlenecks in Reinforcement Learning. Formulating your decision problem as a linear program could vastly enhance the range of problems your algorithm can handle.

Photo by Mehmet Turgut Kirkgoz via Pexels

Ask an operations researcher to solve any problem — be it optimizing your stock portfolio, scheduling your delivery routes, or fixing your marital problems — and they are likely to blurt out ‘linear programming’ as their go-to solution. A mathematical method conceived in the aftermath of WWII as a means to automate planning procedures at the US Army Air Forces, linear programming has since evolved into a mature field with widespread applications in transportation, manufacturing, finance, health care and many other domains. Typical implementations handle problems with thousands of decision variables, countless constraints and a multitude of cost- and reward…
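To make the idea concrete, here is a minimal, self-contained sketch of a linear program solved with SciPy’s linprog routine. The cost vector and constraints below are toy numbers invented for illustration; they are not taken from the article.

```python
import numpy as np
from scipy.optimize import linprog

# Toy LP sketch: choose non-negative quantities x1, x2 that maximize
# the reward 3*x1 + 5*x2. linprog minimizes, so we negate the rewards.
c = np.array([-3.0, -5.0])          # negated rewards -> minimization
A_ub = np.array([[1.0, 0.0],        # x1 <= 4
                 [0.0, 2.0],        # 2*x2 <= 12
                 [3.0, 2.0]])       # 3*x1 + 2*x2 <= 18
b_ub = np.array([4.0, 12.0, 18.0])

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(result.x, -result.fun)        # optimal decision (2, 6), reward 36
```

The same pattern scales to thousands of decision variables: only the cost vector and constraint matrices grow, while the solver call stays identical.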


A multi-armed bandit example for training discrete actor networks. With the aid of the GradientTape functionality, the actor network can be trained using only a few lines of code.

Photo by Hello I’m Nik via Unsplash

Training discrete actor networks with TensorFlow 2.0 is easy once you know how to do it, but also rather different from implementations in TensorFlow 1.0. As the 2.0 version was only released in September 2019, most examples that circulate on the web are still designed for TensorFlow 1.0. In a related article — in which we also discuss the mathematics in more detail — we already treated the continuous case. Here, we use a simple multi-armed bandit problem to show how we can implement and update an actor network in the discrete setting [1].

A bit of mathematics

We use the classical policy gradient algorithm…
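The preview breaks off here, but the approach it describes can be sketched in a few lines of TensorFlow 2.x. The four-armed bandit and its payouts below are made up for illustration (and the mean payout is used directly as the reward, for simplicity); the update is the vanilla policy gradient step, computed with GradientTape.

```python
import tensorflow as tf

num_arms = 4
mean_payouts = [1.0, 0.5, 2.0, 1.5]  # hypothetical bandit; arm 2 is best

# A minimal 'actor': trainable logits defining a softmax policy over arms.
logits = tf.Variable(tf.zeros(num_arms))
optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)

for episode in range(500):
    with tf.GradientTape() as tape:
        action = tf.random.categorical(logits[None, :], 1)[0, 0]
        probs = tf.nn.softmax(logits)
        reward = mean_payouts[int(action)]
        # Vanilla policy gradient loss: -log pi(a) * observed reward
        loss = -tf.math.log(probs[action]) * reward
    grads = tape.gradient(loss, [logits])
    optimizer.apply_gradients(zip(grads, [logits]))

print(tf.nn.softmax(logits).numpy())  # mass should concentrate on arm 2
```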


A simple example for training Gaussian actor networks. With a custom loss function and the GradientTape functionality, the actor network can be trained using only a few lines of code.

Photo by Lenin Estrada on Unsplash

At the root of all the sophisticated actor-critic algorithms that are designed and applied these days is the vanilla policy gradient algorithm, which essentially is an actor-only algorithm. Nowadays, the actor that learns the decision-making policy is often represented by a neural network. In continuous control problems, this network outputs the relevant distribution parameters to sample appropriate actions.

With so many deep reinforcement learning algorithms in circulation, you’d expect it to be easy to find abundant plug-and-play TensorFlow implementations for a basic actor network in continuous control, but this is hardly the case. Various reasons may exist for this. First…
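As a taste of what such an implementation can look like, below is a minimal sketch of a Gaussian actor trained with a custom loss and GradientTape. The single-state toy problem and its reward function are invented for illustration, not taken from the article.

```python
import numpy as np
import tensorflow as tf

# Minimal Gaussian actor: the network maps a state to the mean and
# standard deviation of a normal distribution over actions.
inputs = tf.keras.Input(shape=(1,))
hidden = tf.keras.layers.Dense(16, activation="relu")(inputs)
mu = tf.keras.layers.Dense(1)(hidden)
sigma = tf.keras.layers.Dense(1, activation="softplus")(hidden)  # sigma > 0
actor = tf.keras.Model(inputs, [mu, sigma])
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)

state = tf.constant([[1.0]])  # toy single-state problem

for step in range(500):
    with tf.GradientTape() as tape:
        mu_out, sigma_out = actor(state)
        # Sample an action; stop_gradient keeps the sample itself out of
        # the gradient computation, as in the vanilla policy gradient.
        action = tf.stop_gradient(
            mu_out + sigma_out * tf.random.normal(tf.shape(mu_out)))
        reward = -tf.square(action - 3.0)  # toy reward, maximal at action = 3
        # Custom loss: negative Gaussian log-likelihood, weighted by reward
        log_prob = (-0.5 * tf.square((action - mu_out) / sigma_out)
                    - tf.math.log(sigma_out)
                    - 0.5 * np.log(2.0 * np.pi))
        loss = tf.reduce_mean(-log_prob * reward)
    grads = tape.gradient(loss, actor.trainable_variables)
    optimizer.apply_gradients(zip(grads, actor.trainable_variables))

print(actor(state)[0].numpy())  # learned mean should drift toward 3
```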


A basic explanation of the mathematical concept of filtrations and how we may apply them in Reinforcement Learning.

A filtration that is needed to practice reinforcement learning [source: pixel2013, via pixabay.com]

Once in a while, when reading papers in the Reinforcement Learning domain, you may stumble across mysterious-sounding phrases such as ‘we deal with a filtered probability space’, ‘the expected value is conditional on a filtration’ or ‘the decision-making policy is ℱ-measurable’. Without formal training in measure theory [2,3], it may be difficult to grasp what exactly such a filtration entails. Formal definitions look something like this:
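The preview omits the definition itself; the standard textbook version reads roughly as follows.

```latex
% Standard textbook definition of a filtration
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A
\emph{filtration} is a family $\{\mathcal{F}_t\}_{t \ge 0}$ of
sub-$\sigma$-algebras of $\mathcal{F}$ satisfying
\[
  \mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}
  \qquad \text{for all } 0 \le s \le t .
\]
% Interpretation: F_t collects all information revealed up to time t.
% Requiring a decision-making policy to be F_t-measurable formalizes
% that decisions at time t cannot peek at future information.
```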


source: patmikemckane, via pixabay.com (Pixabay license)

Some truths in life are universal. Nobody likes standing in traffic jams. Nobody likes large trucks blocking their favorite shopping street. Nobody likes to fill up their lungs with soot particles. Sadly, given the ever-increasing consumption of goods within city centers, we can only expect more of all that.

Urban trends give plenty of reason to worry. Aside from a world population that continues to grow at an alarming rate, a relatively larger share of people will be living in urban areas as well: by 2050, 84% of the European population is expected to live in urban areas¹. Most goods and…


source: Ichigo121212, via pixabay.com (Pixabay license)

With half of the world placed into various stages of quarantine due to the rapidly spreading coronavirus, chances are your workout routine has been drastically shaken up as well. Gyms are closed, sports classes are cancelled, and maybe you cannot even go outdoors for a run. Perhaps you are compelled to stay at home, perhaps you voluntarily chose to do so. If you also work from home these days, you are pretty much locked up except for the occasional walk or supermarket trip. Nobody can tell right now whether this situation will last for weeks or even months. Unless you are…


Burn more calories than you eat

source: Ella Olson, via pexels.com (Pexels license)

To lose weight, you need to consistently burn off more calories than you consume. In all likelihood, you already knew that. Surprisingly, many people still tend to overlook this absolutely essential rule when attempting to get rid of excess body fat. Amid a myriad of diet gurus, urban myths, science-packed news articles, lists of superfoods and, not to forget, well-meant advice, it can be difficult to keep track of what should be the essence of dieting. …
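As a back-of-the-envelope illustration of that energy-balance rule, here is a tiny calculation using the common rule of thumb of roughly 3,500 kcal per pound of body fat; both the figure and the chosen deficit are illustrative assumptions, not from the article.

```python
# Rule-of-thumb energy balance (assumes ~3,500 kcal per pound of body
# fat, a common simplification; individual results vary).
daily_deficit_kcal = 500          # eat 500 kcal/day less than you burn
kcal_per_lb_fat = 3500

days_to_lose_one_lb = kcal_per_lb_fat / daily_deficit_kcal
print(days_to_lose_one_lb)        # -> 7.0: about one pound per week
```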

Wouter van Heeswijk, PhD

Assistant professor in Financial Engineering and Operations Research. Writing about reinforcement learning, optimization problems, and fitness.
