Tutorial | Justin Dixon

Finite Markov Decision Processes: Chapter 3 IRL

Introduction

In Markov Decision Processes you have:

* Agent: The decision maker / learner. The agent sends an action to the environment.
* Environment: Everything that is not the agent. The environment sends a reward back to the agent.
* Reward: The signal that the agent tries to maximize.

Example GridWorld

Let's say we have a 5x5 grid. There are four possible actions: left, right, up, and down. If you reach the point (1,2) and move in any direction you receive a reward of 10 and are moved to the point (5,2).
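The special transition described above can be sketched as a small step function. This is only an illustration: the function name `grid_step`, the (row, column) reading of the coordinates, and the handling of moves off the grid (stay in place, reward of -1) are assumptions and are not stated in the excerpt.

```r
grid_size <- 5

# Assumed encoding: a state is c(row, col), with row 1 at the top of the grid.
actions <- list(
  left  = c(0, -1),
  right = c(0, 1),
  up    = c(-1, 0),
  down  = c(1, 0)
)

# Hypothetical helper: returns the next state and the reward for one move.
grid_step <- function(state, action) {
  # From the text: any action taken in (1, 2) pays 10 and jumps to (5, 2).
  if (all(state == c(1, 2))) {
    return(list(state = c(5, 2), reward = 10))
  }
  next_state <- state + actions[[action]]
  # Assumption: a move that would leave the grid keeps the agent in place
  # and costs -1; every other move pays 0.
  if (any(next_state < 1) || any(next_state > grid_size)) {
    return(list(state = state, reward = -1))
  }
  list(state = next_state, reward = 0)
}

grid_step(c(1, 2), "left")  # the special state: reward 10, moved to (5, 2)
grid_step(c(3, 3), "up")    # an ordinary move
```

Keeping the transition logic in one function means a value-iteration or policy-evaluation loop can call it for every state and action without knowing about the special cell.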

Multiarmed Bandits: Chapter 2 IRL

Introduction

This is going to be part of a series where I illustrate examples and questions from the brilliant book by Sutton and Barto (1998). You can download the pdf version of the newly updated book online, just google it. I am planning on going through each chapter and illustrating 1 or 2 examples from each chapter.

MultiArmed Bandits

What on earth is a multiarmed bandit? It might be easier to think of it as which pokie you choose to play on down at the local.

XOR Deep Learning Example

The Problem

OpenAI recently released some open research questions. As a beginner in AI I decided to tackle the beginner ‘Warmups’ they have offered. You can view their blog post here:

⭐ Train an LSTM to solve the XOR problem: that is, given a sequence of bits, determine its parity. The LSTM should consume the sequence, one bit at a time, and then output the correct answer at the sequence’s end.
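To make the target concrete, here is a minimal sketch of what "parity" means for a bit sequence. It only computes the label a model would be trained to predict; it is not the LSTM itself, and the function names `parity` and `running_parity` are made up for illustration.

```r
# Parity of a bit sequence: 1 if the number of 1s is odd, 0 otherwise.
parity <- function(bits) sum(bits) %% 2

# The same quantity computed one bit at a time as a running XOR, which
# mirrors how a recurrent model would consume the sequence.
running_parity <- function(bits) {
  Reduce(bitwXor, as.integer(bits), accumulate = TRUE)
}

bits <- c(1, 0, 1, 1)
parity(bits)          # 1, because there are three 1s
running_parity(bits)  # 1 1 0 1; the last element is the final parity
```

The last element of the running XOR always equals the overall parity, which is why the task only asks for an answer at the end of the sequence.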

Random Forests

Introduction

Following on from the previous post about decision trees let us move on to Random Forests. Let us use the Soybean data from the ‘mlbench’ package. There are 35 features and 683 observations with 16 varieties of Soybean.

Why care about Random Forests?

Let us look at how our decision trees predict previously unseen data. First we will load the data in:

    library(mlbench)  # provides the Soybean data set
    library(caret)
    data("Soybean")
    dim(Soybean)

Let us now split the data up into a training and test data set.
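A train/test split along these lines can be sketched in base R. This is not necessarily how the post does it: the helper `split_indices`, the 80/20 ratio, the seed, and the stand-in data frame are all assumptions made for illustration. caret's `createDataPartition` is an alternative that stratifies the split by class.

```r
set.seed(42)  # arbitrary seed, used only so the split is reproducible

# Hypothetical helper: draws the row indices that go into the training set.
split_indices <- function(n, train_fraction = 0.8) {
  sample(seq_len(n), size = floor(train_fraction * n))
}

# Stand-in data frame with 683 rows, matching the observation count above.
df <- data.frame(id = seq_len(683), x = rnorm(683))

train_idx <- split_indices(nrow(df))
train <- df[train_idx, ]
test  <- df[-train_idx, ]  # every row that was not sampled for training

nrow(train)  # 546
nrow(test)   # 137
```

Sampling indices without replacement guarantees that no observation appears in both sets, so the test set really is unseen data.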

Who is the angriest?

Overall sentiments - magnitude

    library(ggplot2)  # plotting
    library(dplyr)    # filter(), mutate(), case_when()

    # Keep only the columns needed for the overall plots.
    overallData <- subset(sentimentData, select = c('file', 'Date', 'magnitude', 'score'))
    p <- ggplot(overallData, aes(x = Date, y = magnitude, colour = file)) +
      geom_line() +
      ggtitle('Overall show sentiment magnitude') +
      xlab('Date') +
      ylab('Magnitude') +
      labs(color = "Shock Jock") +
      theme_bw()
    p
    ggsave('1.png', p)

Overall sentiments - score

    p <- ggplot(overallData, aes(x = Date, y = score, colour = file)) +
      geom_line() +
      ggtitle('Overall show sentiment score') +
      xlab('Date') +
      ylab('Score') +
      labs(color = "Shock Jock") +
      theme_bw()
    p
    ggsave('2.png', p)

Segment Analysis - By Day - 1st August

    dateData <- filter(sentimentData, Date == '2018-08-01')
    # Express each row's position X as a fraction of that show's row count.
    dateData <- mutate(dateData, percentageDone = case_when(
      file == 'Ben Fordham' ~ X / nrow(filter(dateData, file == 'Ben Fordham')),
      file == 'Ray Hadley' ~ X / nrow(filter(dateData, file == 'Ray Hadley')),
      file == 'Chris Smith' ~ X / nrow(filter(dateData, file == 'Chris Smith')),
      file == 'Alan Jones' ~ X / nrow(filter(dateData, file == 'Alan Jones'))
    ))
    p <- ggplot(dateData, aes(x=percentageDone, y = sentiment.