-
Notifications
You must be signed in to change notification settings - Fork 0
/
henry.tex
executable file
·267 lines (205 loc) · 32.1 KB
/
henry.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
% This is a sample document using the University of Minnesota, Morris, Computer Science
% Senior Seminar modification of the ACM sig-alternate style. Much of this content is taken
% directly from the ACM sample document illustrating the use of the sig-alternate class. Certain
% parts that we never use have been removed to simplify the example, and a few additional
% components have been added.
% See https://github.com/UMM-CSci/Senior_seminar_templates for more info and to make
% suggestions and corrections.
\documentclass{sig-alternate}
\usepackage{color}
\usepackage[colorinlistoftodos]{todonotes}
\usepackage{graphicx}
\usepackage[ruled]{algorithm2e}
\usepackage{amssymb}
\definecolor{Teal}{RGB}{2,132,130}
\newcommand{\allcomments}[1]{{#1}}
\newcommand{\hfcomment}[1]{\textcolor{Teal}{\allcomments{Henry: {#1}}}}
%%%%% Uncomment the following line and comment out the previous one
%%%%% to remove all comments
%%%%% NOTE: comments still occupy a line even if invisible;
%%%%% Don't write them as a separate paragraph
%\newcommand{\mycomment}[1]{}
\begin{document}
% --- Author Metadata here ---
%%% REMEMBER TO CHANGE THE SEMESTER AND YEAR AS NEEDED
\conferenceinfo{UMM CSci Senior Seminar Conference, December 2016}{Morris, MN}
\title{Modern Approaches to the Rich Vehicle Routing Problem}
\numberofauthors{1}
\author{
% The command \alignauthor (no curly braces needed) should
% precede each author name, affiliation/snail-mail address and
% e-mail address. Additionally, tag each line of
% affiliation/address with \affaddr, and tag the
% e-mail address with \email
\alignauthor
Henry F. R. Fellows\\
\affaddr{Division of Science and Mathematics}\\
\affaddr{University of Minnesota, Morris}\\
\affaddr{Morris, Minnesota, USA 56267}\\
\email{[email protected]}
}
\maketitle
\begin{abstract}
The Rich Vehicle Routing Problem is a class of problems that revolve around finding the most optimal route for a certain set of deliveries. Obtaining approximate solutions to these computationally hard problems is both economically valuable and academically interesting. This paper describes several recent approaches to popular variants of the Rich Vehicle Routing Problem such as Hybrid Genetic Search with Advanced Diversity Control, Probability Collectives, Distributed Reverse Vickrey Auctions, and Interactive Genetic Routing.
\end{abstract}
\keywords{Vehicle Routing Problem, Black-box optimization, Genetic Algorithms, Probability Collectives, Distributed Systems}
\section{Introduction}
\label{sec:intro}
Routing is a deeply pervasive feature of the modern world. From the point-to-point navigation of everyday driving to the shipping of bulk freight, the task of creating those routes has been increasingly automated. The long term trend towards self-driving vehicles means that computers will soon control the entire process of transportation; aside from selecting a time of departure and destination, humans need no longer be involved. Software to create routes for school buses, mail deliveries and garbage trucks has been in use for decades; everything from Amazon to Delta uses these systems on a daily basis.
The exact impact of improvements in routing is difficult to measure, but in 2011, the external costs of traffic jams in the European Union were 1-2\% of GDP \cite{Caceres-Cruz:2014}. In the United States, 57 tons of goods were moved for each person in 2012 \cite{BOTS:2012}. A small change in routing efficiency has the potential to reduce some of the staggeringly large economic and environmental costs of transportation.
The general case of routing is the traveling salesman problem. The traveling salesman problem asks for the shortest route which passes through each point once. Assuming that each pair of points is connected by a link, the number of potential routes is $\tfrac{1}{2}n!$, which grows extremely quickly, and is significant even for small values. For $n=10$, the number of routes is $1,814,400$, and for $n=15$, it reaches $653,837,184,000$. The traveling salesman problem is in a class of problems known as NP-complete. Although it has not yet been proven, it is considered likely that there is no algorithm for NP-complete problems that allows them to be solved quickly. Fortunately, solutions of NP-complete problems are easy to verify, and this property encourages approximate solutions; routes that satisfy all customers, but may not be the most optimal means of doing so.
Routing in the real world is much more complex than the traveling salesman problem implies. There may be an expectation that the vehicle arrives near a specified time. New stops may be added during the trip, or the exact demand of each customer might not be known ahead of time.
Dealing with these types of problems is the domain of the Rich Vehicle Routing Problem, which extends the Vehicle Routing Problem described in the next section. After that, two major classes of approximate solutions to the Vehicle Routing Problem are discussed along with recent algorithms in these categories: genetic algorithms and agent-based algorithms. Finally, conclusions are drawn about the results of these algorithm.
\section{Rich Vehicle Routing Problem}
\label{sec:VRP}
\begin{figure}
\centering
\includegraphics[width=3in, keepaspectratio]{VRP.png}
\caption{Example of the VRP}
\label{fig:VRPgraph}
\end{figure}
Originally titled the Truck Dispatching Problem, the original formulation of the Vehicle Routing Problem (VRP) was created by G. B. Danzig and J. H. Ramser in 1959 \cite{Danzig:1959}. The premise of the problem is that each vehicle (or truck), has a limited capacity and the fleet must make deliveries to as many customers as possible, starting from a specific location known as a depot as shown in Figure~\ref{fig:VRPgraph}. The vehicles may have to make multiple trips in order to satisfy all of the customers. The primary metric is the number of trips taken to satisfy all customer demand. A measure that is also considered worth minimizing is the maximum tour length - the longest single route taken by one truck. Danzig and Ramser note that if the total capacity of a vehicle is greater than the total demand of the customers, the problem is mathematically identical to the traveling salesman problem, because then a single truck can serve all customers in a single route. There has been a great deal of research done on the original VRP, and recent research focuses on versions of the VRP with different constraints or many constraints simultaneously~\cite{Caceres-Cruz:2014}. These variations of the Vehicle Routing Problem are collectively referred to as the Rich Vehicle Routing Problem (RVRP) . In the following sections of, we define prominent variants of the RVRP.
\subsection{Decentralized Vehicle Routing Problem}
A common problem in rich vehicle routing literature is decentralized control, where each vehicle or depot is modeled as having an independent agent who makes decisions out of self-interest \cite{Caceres-Cruz:2014}. This is in contrast to the behavior of the traditional VRP, which has a single controlling entity that is aware of the entire state of the problem. The scheduling system that Delta Airlines uses to dispatch flights is an example of a centralized system; however, when the system crashed in august 2016, it forced the cancellation of hundreds of flights. A decentralized system is not as vulnerable to the failure of individual entities, which makes it useful in high-availability systems. In some cases, decentralized routing also makes sense when the vehicles are owned by independent owners that do not wish to maintain their own systems for routing.
\begin{figure}[t]
\centering
\includegraphics[width=3.5in, keepaspectratio]{Utility.png}
\caption{Example of poor global utility in DVRP}
\label{fig:DVRPgraph}
\end{figure}
A recurring problem in the DRVP and decentralized systems in general is that pure self-interest can cause decisions that degrade the solutions of others. Imagine a vehicle which finds a route that fulfills a large number of deliveries extremely quickly; other vehicles might find that their routes force them to cross through this area where they cannot make any deliveries along the way. Figure \ref{fig:DVRPgraph} is an example of this, where the blue (solid line) route is highly efficient for the first vehicle, it is less efficient for all other vehicles. The ideal is to pursue optimal routes - both on the local (individual) scale, and on a global scale. Finding good methods of making decisions that allow for both autonomy and global utility is an ongoing research question.
\subsection{VRP with Time Windows}
Another variant of the VRP is the vehicle routing problem with time window (VRPTW), where the customer expects that the deliveries take place within a certain time interval \cite{Caceres-Cruz:2014}. Unlike the estimates provided by shipping services, these constraints are set by the customers. A simple example is the constraint that deliveries are only accepted during business hours. This adds the dimension of time to the problem -- the time it takes the truck to traverse nodes and to complete the delivery must be considered. The VRPTW is one of the more popular extensions of the VRP because deliveries without implicit or explicit time constraints are rare. In practice, the time window constraint is rarely found without other constraints; industrial settings tend to consider the VRPTW to be the base problem for real-world settings.
\subsection{Black Box Optimization}
In general, the approaches used to solve the VRP are approximate methods; they intend to find 'good', not perfect solutions. The algorithms tend to use simple rules to guess solutions, and the process of narrowing down or refining these guesses is known as optimization. Most modern methods of solving the VRP belong to the blackbox style of optimization; blackbox optimization is used when a problem does not have a formal algebraic model, or the model is too computationally expensive \cite{Amaran:2014}. There is no method to solve the VRP without using the computationally expensive exact solution. The natural choice is to use blackbox methods which trade accuracy for speed. The common feature of these methods is the use of stochastic (random) elements, especially in selecting starting states. This makes blackbox methods into a double edged sword; they are often effective at solving otherwise intractable problems, but the solutions frequently fail to produce any insight into the problem. Stochastic decision making makes it hard to determine why the algorithm provides a specific solution.
\section{Genetic Algorithms}
A genetic algorithm (GA) is a type of problem-solving that is inspired by natural selection. It is often used to find solutions for processes that are not well understood, but can be modeled well. In the case of the VRP, the problem is both known and modeled well, but the approximate solution given by a well-made genetic is often nearly as good, and much easier to compute than the exact solution.
The fundamental notion of a genetic algorithm is a \textit{genetic representation}, which is a means of representing an individual. Individuals in VRP tend to be routes or sets of routes. By analogy, the representation is the genome of the individual. A common representation is a fixed-length array of bits, but other representations are possible. These genetic representations are manipulated by genetic operators that simulate various processes. A mutation operator provides the analogue to biological mutation by altering values in the genetic representation; for a fixed length array of bit, a common example is a function that flips bits randomly generator. The crossover operator represents reproduction - it combines traits from two (or more) individuals to produce a new individual.
Genetic algorithms aim to improve their solutions, and they measure how `good' a solution is by using a fitness function. Fitness functions are problem specific; in VRP, they measure if the individual solves the problem, and how good that solution is.
\begin{algorithm}[t]
Initialize population\;
\While{iteration limit not reached}{
Select best individuals via fitness function\;
Generate new individuals using genetic operators\;
Build new population from old \& new individuals\;
}
\Return best solution\;
\caption{Genetic Algorithm Pseudocode\label{GA}}
\end{algorithm}
The general form of a GA begins with initializing the population. A selection of individuals with a random genetic representation is created. As illustrated in algorithm \ref{GA}, the algorithm enters its main loop; the loop is usually terminated when an iteration limit is reached, but other conditions are sometimes used that are specific to the problem. In the loop, the individuals are scored using the fitness function and the best are set aside for the next stage. The crossover operator is then applied to this group, and mutation operator on the resulting individuals. Then, it evaluates each individual and replaces the worst members of the population with the best new individuals. This process is repeated until the termination condition is reached; this is frequently an iteration limit. The individual with the highest fitness in the population is then returned.
\subsection{Interactive Genetic Routing}
\begin{figure}[t]
\centering
\includegraphics[width=3.5in, keepaspectratio]{TAG.png}
\caption{Example of selecting a good region}
\label{fig:Humangraph}
\end{figure}
Humans tend to be very good at solving visual problems, and this extends to the routing if the problem is displayed appropriately. We have a talent for deciding whether a given route is good or bad by simply looking at a diagram. Recently, a number of platforms were created to allow developers to make use of human intelligence in solving computationally complex problems. Amazon's Mechanical Turk, an example of these platforms, allows the exploitation of human intuition by offering an API that makes it possible to integrate into software.
Continuing the theme of using humans to do things humans are good at, S. Ismail, F. Legras, and G. Coppin \cite{Ismail:2012} created a genetic algorithm that uses humans to determine how good a particular solution is.
\begin{algorithm}[t]
Initialize population\;
\While{iteration limit not reached}{
Select best individuals via fitness function\;
Generate new individuals using crossover\;
Humans to tag portions of new individual\;
Mutate individuals based on tagging\;
Build new population from old \& new individuals\;
}
\Return best feasible solution\;
\caption{Interactive Genetic Algorithm\label{HUMGA}}
\end{algorithm}
The interactive genetic algorithm, algorithm \ref{HUMGA}, uses humans to find good sequences of destinations. It begins by initializing the population with a set of random individuals. The individuals are sets of routes in this case. While the iteration cap has not been reached, the algorithm first creates a new group of individuals using a standard crossover operator. Then, humans are used to tag good or bad sections of the routes in each individual as shown in Figure \ref{fig:Humangraph}. A good section should be a segment of the route that is likely to be part of the most optimal route; a bad section is a poor segment of a route. The tags influence the mutation operator; a `good' section is unlikely to be modified, while a bad section is likely to be modified. This encourages preservation of good sections, while bad sections are removed. Conventional genetic algorithms are unable to gather or act on any information about the details of the individual aside from the overall fitness. The human involvement allows semi-intelligent decisions to be made about what parts of the individual should be changed. After the mutation operator, it evaluates each individual and replaces the worst members of the population with the best new individuals. The algorithm repeats until the iteration limit is reached and the individual with the best routes is returned.
Ismail, Legras, and Coppin don't provide any experimental results in the paper, but note that humans are relatively slow. In practice, humans would be best used in very large scale routing problems where the added time isn't as much of a concern, and the tagging procedure may not be called every iteration.
\subsection{Hybrid Genetic Search with Advanced Diversity Control}
Adding time window constraints to customer demand and depot availability poses a significant challenge. Time, distance, and speed are now important to the problem, and all of these factors must be accounted for. The opposing demands of temporal and spatial constraints is a particularly frustrating problem in the VRPTW.
The Hybrid Genetic Search with Advanced Diversity Control (HGSADC) addresses some of these challenges in the VRPTW, especially in route-duration constraints. The primary feature of the HGSADC is a different approach to diversity management in its population. HGSADC adds diversity to its objectives as a term to be optimized, which allows it to avoid dead-end solutions by recalling earlier solutions that do not encounter the same problem. The algorithm is considered the current state of the art in the multi-depot vehicle routing problem with time windows \cite{Vidal:2013}.
\subsubsection{Algorithm and mechanics}
HGSADC is a complex algorithm, and cannot be fully described in this paper. What follows is a summary of Algorithm~\ref{HGSADC} and the most notable features.
\begin{algorithm}[t]
Initialize populations at $4\mu$ size\;
\While{number of interactions without improvement $< lt_{NI}$, and time $< T_{max}$}{
Select parent solutions $P_1$ and $P_2$\;
Create child $C$ from $P_1$ and $P_2$ (crossover)\;
Educate $C$ (local search procedure)\;
\If{$C$ infeasible}{
Insert $C$ into infeasible subpopulation\;
Repair with probability $P_{rep}$\;
}
\If{$C$ feasible}{
Insert $C$ into feasible subpopulation\;
}
\If{maximum subpopulation size, $\mu$, reached}{
Select survivors\;
}
\If{best solution not improved for $It_{div}$ iterations}{
Diversify population\;
}
Adjust penalty parameters for infeasibility\;
\If{number of iterations $modulo$ $It_{dec} = 0$}{
Decompose the master problem\;
Use HGSADC on each subproblem\;
Reconstitute three solutions, and insert them in the
population\;
}
}
\caption{Hybrid Genetic Search with Advanced Diversity Control\label{HGSADC}}
\Return best feasible solution\;
\end{algorithm}
HGSDAC evolves infeasible and feasible solutions as two separate subpopulations. In most genetic algorithms, infeasible (`incorrect') solutions are removed from the population, but in the HGSADC, they are kept in a separate subpopulation. The rationale is that this allows the algorithm to create a reserve of genetic diversity that can be used to get the population out of dead ends. Two random individuals are pulled from the combined population and the best of them is chosen to be a parent. The crossover operator is an applied to two parents, and the resulting child is educated.
The education process is a limited local search that seeks to improve the child; if the child is still infeasible after this procedure, the repair procedure may be called. $P_{rep}$ is the probability that the repair operator is called; repair is a very intensive local search. Finally, the child is placed into an appropriate subpopulation. If a subpopulation exceeds a maximum size, a survivor selection stage is triggered. A number of individuals is removed until the population returns to the nominal population size $\mu$. An individual that is most similar in terms of fitness score and route to another individual is called a clone. Survivor selection iteratively removes a clone until the population size is returned to $\mu$. The purpose of this style of population management is to maintain diversity. A diversification round may be started if there has been $It_{div}$ iterations without improvement. The best third of each population is retained, and adds $4\mu$ randomly created individuals to introduce new genetic material to the populations.
Finally, every $It_{dec}$ iterations, the problem is decomposed into simplified subproblems and the HGSADC is run on those subproblems (without further decomposition). The initial population of subproblem solutions are created from the relevant genetic material of existing individuals in the populations of the original problem. The three best individuals of each subproblem are retained and merged to form three elite individuals who are added to the population of the original problem.
When the algorithm has finally hit the maximum time, the best solution from the feasible population is returned.
\subsubsection{HGSADC Results}
The HGSADC is extremely good at solving VRPTW. Algorithms are tested by measuring how close to optimal is the best solution the algorithm produced on a number of test data sets. Out of 465 instances of the VRPTW, the HGSADC improved or found the previous best solution 397 times. It strictly improved the best known solution 233 times. For certain types of instances, the HGSADC is the best known algorithm in literature. On the remaining instances, the HGSADC generates solutions of comparable quality to an earlier algorithm by Nagata et al. \cite{Nagata:2010}, and is less computationally efficient. However, it does not require as many hand tailored components for each instance, and performs better on problems with tighter time windows and shorter routes. The average standard deviation of the results is $0.2-0.4\%$, which shows that the algorithm is highly stable or consistent.
\section{Agent-based models}
Agents are simply small decision making elements that typically have some internal state and the ability to make decisions based on this internal state and potentially some of the global state. They typically represent and individual in a population. In optimization, a typical goal is for an agent to attempt to achieve some sort of optimal state. Agents (should) take actions to improve their state, and we describe this as the agent acting in its own interest. The function that returns the quantification of this local optimization is termed the local utility function, in contrast to the global, or system utility function. The global utility function describes how optimal the system is, which can be quite different from the local utility. A situation that is locally optimal for an agent, yet globally non-optimal is show in Figure \ref{fig:DVRPgraph}.
Here we discuss two major approaches in modeling individuals for the decentralized VRP. The first is a distributed reverse Vickrey auction, the more traditional approach which tends to avoid sharing information. Probability collectives, the second approach is a recent development from statistical physics and game theory that shares information.
\subsection{Distributed reverse Vickrey auctions}
Saleh et al. \cite{Saleh:2012} examined a version of the decentralized VRP where each depot and customer are treated as agents, as opposed to the more common version that treats vehicles as agents. The authors were interested in designing a system that encouraged near-optimal solutions while still being completely self-interested. They pursued this by building mechanisms that encouraged depots to offer an accurate estimate of minimum costs while still allowing the cost functions of each depot to remain private.
The core feature of the algorithm is a reverse Vickrey auction, which is a sealed bid reverse auction where competitors offer a estimate of the cost of providing a service. The consumer selects the bidder that has the lowest cost, but pays them the amount of the second lowest bid. This encourages bidders to bid an honest estimate of cost; in the case that they win, they will always make a profit over their bid.
The algorithm is played as a number of rounds. In each round, each depot $k$ submits a bid to each unassigned customer $j$ where demand is less than or equal to remaining capacity. Bids consist of the estimated costs of servicing the customer. The customer chooses the lowest bid satisfying their demands and notifies $k^*$. The depot $k^*$ may receive responses from multiple customers. From the given responses, the depot chooses a single customer $j^*$ with the lowest insertion cost (the customer whose addition to the vehicle's route will cause the least change in the routing cost). Once the customer is assigned to a depot, the customer $j^*$ submits payment to depot $k^*$. The payment is equal to the second lowest offer of the round, or the lowest if only one bid is submitted. The completion of these steps is a single auction round, and many rounds may take place until every customer is assigned. Once the auction rounds are complete, the depot can apply other optimization strategies to its routes, or in the case of small routes, an exact solution.
The distributed reverse Vickrey auction is an effective technique of solving the decentralized VRP in a competitive environment where information is not shared. The authors note that the performance of the algorithm is highly dependent on the choice of lower level routing algorithms, but overall, the DRVA is not as existing algorithms.
\subsection{Probability Collectives}
\label{ssec:PC}
Probability Collectives (PC) take a different approach to selecting a good solution. Instead of attempting to evolve an exemplary individual in a population like genetic algorithms, probability collectives selects optimal strategy for each agent \cite{Kulkarni:2008}. A PC agent is a self-interested, learning individual that selects a strategy with the highest probability of optimizing the local and global objectives. In contrast to the Distributed Reverse Vickrey Auction, the agents willingly and freely share information about their strategies, and the cost function is global.
\begin{figure}[h]
\centering
\includegraphics[width=2in, keepaspectratio]{"Probability Collectives Diagram".pdf}
\caption{Probability Collectives Algorithm.}
\label{fig:PCDiagram}
\end{figure}
The general form of PC in the context of VRP is as follows, along with the flowchart below. It is important to note this process is repeated for every agent in the system. Imagine you have $N$ agents, which can be depots or trucks, each having a set $X$ of strategies, of (potentially variable) length $m$. The strategies are most often routes in RVRP applications \cite{Vasirani:2008}. $X$ is sampled from the interval $\psi_i$ with upper and lower bounds $\psi^u$ and $\psi^l$, which is a non-strict subset of all possible strategies. A strategy for agent $i$ is denoted as $X_i^{[r]}$, where $r$ is an identifier for that strategy. The set is represented as $X_i=\{X_i^{[1]}, X_i^{[2]}, ..., X_i^{[m]}\}$.
For each agent, assign a uniform probability, $1/m_i$ to all actions; the resulting probability of that strategy being selected is signified by $q(X_i^{[r]})$. The agents then selects a random action $r$ and a sampling of random actions from other agents. The resulting set, the `combined strategy set', $Y_i^{[r]}=\{X_1^{[?]}, ...,X_i^{[r]}, ...,X_N^{[?]}\}$, represents a guess as to a potential future. Accordingly, for each of the strategy sets $Y_i^{[r]}$, compute the expected local utility using the following measure:
\begin{equation}
\textit{Exp. Utility of Agent } i^r =q_i^r\prod_{(i)}{q(X_{(i)}^{[?]})\cdot G(Y_i^{[r]})}
\end{equation}
Here $q_i^r$ represents the probability of action $r$ for vehicle $i$, $(i)$ is the set of all agents excluding $i$, and $G$ is the function that computes global utility for a given set of strategies. $G$ is problem specific, but in the RVRP it could be a measure of unused capacity, unvisited destinations, maximum tour length or various combinations of measurements.
After computing the local utility, the next step is to update the probability of each action for all agents as follows:
\begin{equation}
q(X_i^{[r]})\leftarrow q(X_i^{[r]})-\alpha_{step}\cdot q(X_i^{[r]})\cdot O_k^r
\end{equation}
where $k$ is the iteration, $\alpha_{step}$ is a constant that controls the amount of change each step.
\begin{equation}
O_k^r = \dfrac{\textit{Contrib. of Agent }i}{T_k}+S_i(q)+ln(q(X_i^{[r]}))
\end{equation}
Here, temperature $T_k$ is a scalar that represethe nts the relative importance of the of the contribution of agent $i$ over time. It is computed as $T_{(k+1)}=T_{k} - \alpha_{T}\cdot T_{k}$, and $T_0$ is the initial temperature. The initial value is unimportant - it must simply be ``sufficently high" \cite{book}. $\alpha_{T}$ controls the rate of temperature change, and the values are often chosen experimentally.
The contribution of Agent $i$ is then:
\begin{equation}
\textit{Contrib. Agent }i = \textit{Utility Agent }i^r - \textit{Global Utility}
\end{equation}
The $S_i(q)$ term is the entropy of the combined strategy set. In information theory, Entropy is the average value of the information in a message. Probability collectives treats the combined strategy set as a message in order to find the set with the highest Entropy. As it increases, the probability distribution more clearly distinguishes the contribution of each strategy toward optimizing the expected global utility. When it reaches a maximum, it represents the set that has the most information about the probabilities of each strategy. Entropy is computed by:
\begin{equation}
S_i(q)=-\sum_{r=1}^{m_i}q(X_i^{[r]})\cdot ln(q(X_i^{[r]})
\end{equation}
which is the sum of every strategy $q(X_i^{[r]})$ times the natural logarithm of itself. At this point, the algorithm checks to see if it will continue updating the global utility or it will terminate. If the probabilities of the combined strategy set have not changed by a constant amount, $\sigma$, or if the number of iterations ($k$) have reached a maximum, the strategy with the highest probabilities will be returned. Otherwise, the algorithm will continue by narrowing and re-centering the sampling interval $\psi_i$. The exact process for adjusting the sampling interval can be found in Kulkarni et al.\cite{book}.After $\psi$ is narrowed then a new strategy set is sampled, and the process of refining those probabilities is started anew. Once every agent has returned a final strategy, each agent can then pursue those strategies.
To summarize, if a strategy $r$ creates a larger contribution to the optimization of the objective than other strategies, the probability associated with $r$ increases by a larger amount. This entire process is repeated until the probability distribution converges or the maximum number of iterations are completed.
\subsubsection{Probability Collectives Results}
There are no specific results comparing Probability Collectives to other DRVP-solving algorithms. Probability Collectives do have some noteworthy trends; they outperform GA in avoiding dead end solutions and in long term optimization \cite{book}. Vasirani and Ossowski \cite{Vasirani:2008} do show that the choice of global utility estimate has a profound effect on how well the algorithm performs. Global utility estimates that do not take the utility of the agent invoking the function are preferred over functions that take all agents into consideration.
\section{Conclusion}
\label{conclusion}
The vehicle routing problem is an important problem in real world logistics, and as in the field of optimization. The most performant algorithm in RVRP is HGSADC, which is dominant within the scope of VRPTW. Adding human intuition or systems emulating it to optimization systems may be important in improving routing systems. In decentralized RVRP variants, the specific constraints of each problem limit the comparability of each algorithm. The distributed reverse Vickrey auction is conservative with the amount of information available to each individual; probability collectives The distributed reverse Vickrey algorithm is within several percent of the best known solutions. Probability collectives is not as good in a straightforward sense, but it handles noisy and imperfect agent behaviors better than other distributed systems. Decentralized means of solving the VRP are not as well-developed as centralized methods, but they show good promise for solving near-future routing problems.
\section*{Acknowledgements}
I'd like to thank caffeine, water, and stress: the raw elements that formed this paper. Harris L. Mayer's 1964 paper, ``Opacity Calculations, Past and Future" was a wonderful source of literary inspiration. Finally, this paper owes a great deal to Nic McPhee and Kirbie Dramdahl for their thoughtful feedback and advice.
\bibliographystyle{abbrv}
\bibliography{annotated_bibliography}
\end{document}