News from TeamPeko
This Spring, while judging the Boston Regional FIRST Robotics Competition (fun for the whole family!) I found myself in need of mathematics.
As judges, we are required to evaluate the effectiveness of individual robots and robotic components. But we don’t have scoring data for individual robots. Instead we’ve got FRC match data that tells us which teams played in each match and the total number of points scored by the alliance. (Imagine you want to know how many points Michael Jordan scored in a game but you only have the final score of the basketball game; what should you do?)
Fundamentally, the question I needed to answer was:
How can I estimate the number of points that I expect an individual robot to score based on the aggregate alliance scores from the previously played matches?
There are many ways to do this and I’m guessing many of the FIRST teams already have better algorithms in place! But for those teams who do not and are looking to quickly generate scouting data, here’s an algorithm that can get you started.
Suppose each robot has a scoring potential represented by Si, which is the number of points we expect that robot will score on average based on the data we have from previous matches. For example, here is the data posted on FRC Spyder for the first match at the Boston regional:
|Team 1||Team 2||Team 3||Score|
In a perfect world, the sum of the Si‘s of the robots in each match would total the actual number of points scored by the alliance. So from match 1,
S3173 + S1511 + S1754 = 95.
But the world is not perfect. In reality, each team will not score exactly Si points every game so there will be some error in our prediction:
S3173 + S1511 + S1754 – 95 = Error.
Which means that the million dollar question now becomes:
How should we choose the Si‘s to minimize this Error for all of the matches played so far in the tournament?
Again there are many ways to do this. In the interest of getting answers as quickly as possible (I only had 12 hours to make something work before the next set of matches started! Sound familiar?), I decided to go with a least squares approach, namely minimize the following quantity:
Σg (Sg,1 + Sg,2 + Sg,3 – Scoreg)2
where Σg is the sum over all games played, Sg,1 is the expected number of points from robot 1 in game g, Sg,2 is the expected number of points from robot 2 in game g, etc., and Scoreg is the actual number of points scored by the alliance in game g. For those of you interested in the details of the implementation, I’ve attached the MATLAB script at the end of this post. The script reads data from an excel spread sheet that I copied from http://www2.usfirst.org/2013comp/Events/MABO/matchresults.html. This script gives me the expected number of points for each robot in the competition, based on Friday’s matches, which are represented by the light blue bars in the plot below. (I’ve labeled a few of the top performers in the competition.)
These numbers seem to be consistent with what we see on the field BUT, as with any algorithm, we need to compare the predictions with actual data so see if the algorithm is worth anything.
And this is where it gets cool.
I presented this data to all of the FIRST teams on Saturday after the alliance selection and Team 2876, the Devil Bots, found me after the finals and gave me the actual data. (We love data!) Their scouting operation includes six individuals from their team who record every point scored by every robot in every match. That data is represented by the red dots in the plot above (also from Friday’s matches).
The first thing I noticed when we looked at the data is that the algorithm predictions are not too bad! In fact they are surprisingly good considering that each robot only played about six times on Friday so we don’t have a lot of statistics to work with. In particular, we seem to do a pretty good job correctly identifying the outstanding performers from that day’s matches.
BUT if we look a little more closely, we find that some of the most interesting features lie in the discrepancies between the algorithm and the real data. The first discrepancy that jumps out is in the total number of points scored by all robots. Note that our algorithm cannot add or subtract from the total number of points scored on Friday. All it does is distribute the actual number of points scored among the most likely robots. Hence, if we sum the expected number of points over all robots, that number should equal the actual average number of points scored by all robots during the Friday matches. i.e.
ΣiSi = ΣiSmeasured,i
However, when we run the numbers, ΣiSi = 843.6 and ΣiSmeasured,i = 699.6. How can this be? The Devil Bots were quick to point out that there is human error associated with the measured numbers, but I believe they are doing a better job tallying points than they think. I suspect that the discrepancy does not come from human error but from fouls. In this particular game, a technical foul costs 20(!) points which can have a significant impact on the score. In my data, the foul points were added to the winning alliance points (which is how the alliance score is reported on the FRC website). Whereas the Devil Bot scouts (correctly in my opinion) subtracted the foul points from the robot committing the foul.
Because of this discrepancy, we expect ΣiSi to be greater than ΣiSmeasured,i (just as we see in the data!). Furthermore, the algorithm is likely to over-predict the number of point scored by robots that frequently draw fouls. This is consistent with the few matches I observed, although I don’t have the foul data to check this rigorously.
So the good news is that the judges now have a fun (albeit imperfect) new tool to play with! I encourage all FIRSTies to try it out and modify the code as you see fit to meet your scouting needs.
clear all; iteams = [23, 69, 88, 97, 125, 126, 229, 246, 383, 501, 529, 846, 1099, 1100, 1153, 1350, 1474, 1511, ... 1721, 1754, 1757, 1761, 1768, 1965, 1973, 2013, 2079, 2084, 2262, 2349, 2423, 2523, ... 2713, 2871, 2876, 2877, 3173, 3236, 3466, 3479, 3780, 3927, 3958, 4048, 4151, 4176, 4311, ... 4474, 4761, 4796]; tmp = size(iteams); NTEAMS = tmp(1,2); teams = zeros(5000,1); for i = 1:NTEAMS teams(iteams(i)) = i; end % Read in data [tmp_data,tmp] = xlsread('data_list.xls'); [M,N] = size(tmp_data); data = zeros(M*2,4); for i=1:M for j=1:3 data(2*i-1,j) = tmp_data(i,j+2); data(2*i,j) = tmp_data(i,j+5); end data(2*i-1,4) = tmp_data(i,9); data(2*i,4) = tmp_data(i,10); end %Fill matrix tmp = size(data); NGAMES = tmp(1,1); A = zeros(NTEAMS,NTEAMS); scores = zeros(NTEAMS,1); for t = 1:NTEAMS for games = 1: NGAMES for i = 1:3 if data(games,i) == iteams(t) for j = 1:3 A(t,teams(data(games,j))) = A(t,teams(data(games,j))) + 1; end scores(t) = scores(t) + data(games,4); end end end end ExpVal = A\scores;