-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/uniform point distribution #618
base: develop
Are you sure you want to change the base?
Feature/uniform point distribution #618
Conversation
…esian 2d only at this time) or convex and a general rejection sampling strategy based on convex hull.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR Tinko!
My main comment here is that you create strategies for random points in geometries and you dispatch w.r.t. the geometry type. Currently, in the library this dispatching is happening in the algorithms side and strategies are dispatching w.r.t. coordinate systems. So all the code that is coordinate system unspecific should be moved from agnostic
strategies to algorithms. Then the code that is going to work only for one coordinate system e.g. cartesian
should constitute one (or more) strategy.
Did I understand your intention correctly, or am I missing something?
|
||
std::default_random_engine generator(1); | ||
polygon poly; | ||
boost::geometry::read_wkt("POLYGON((16 21,17.1226 17.5451,20.7553 17.5451,17.8164 15.4098,18.9389 11.9549,16 14.0902,13.0611 11.9549,14.1836 15.4098,11.2447 17.5451,14.8774 17.5451,16 21))", poly); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please reduce line's length
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g.:
boost::geometry::read_wkt("POLYGON((16 21,17.1226 17.5451,20.7553 17.5451,17.8164 15.4098,"
"18.9389 11.9549,16 14.0902,13.0611 11.9549,14.1836 15.4098,"
"11.2447 17.5451,14.8774 17.5451,16 21))",
poly);
// Boost.Geometry (aka GGL, Generic Geometry Library) | ||
// Unit Test | ||
|
||
// Copyright (c) 2019 Tinko Bartels, Berlin, Germany. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can ack gsoc (see other PR)
typename TriangleStrategy, | ||
typename SideStrategy //Actually, we need a triangle area strategy here. | ||
> | ||
struct uniform_convex_fan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this really a strategy (i.e. coordinate type specific) or an algorithm? To my point of view agnostic strategies are just algorithms
// relative size is smaller than doubles epsilon, it is too unlikely to | ||
// realistically occur in a random sample anyway. | ||
public: | ||
uniform_convex_fan(DomainGeometry const& g) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any clue for choosing that name? As far as I know, in geometry, fan is a collection of cones and thus fan makes no sense. Maybe I am missing some notation or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name was chosen based on the fact that this class implicitly uses the fan triangulation of convex polygons for sampling. Reconsidering that choice, the fact that it uses the fan triangulation may be an implementation detail and alluding to it may be unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so this is probably not the right name. The name should declare what this class is doing, e.g. uniform_convex_polygon_sampler
makes sense?
private: | ||
typedef typename point_type<DomainGeometry>::type domain_point_type; | ||
typedef boost::geometry::model::ring<domain_point_type> ring; | ||
ring hull; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be m_hull
return true; | ||
} | ||
template<typename Gen> | ||
Point apply(Gen& g, DomainGeometry const& d) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is more clear to name it Generator
instead of Gen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also please be consistent; above g
had DomainGeometry
type, here d
is DomainGeometry
and g
type Gen
< | ||
typename Point, | ||
typename DomainGeometry, | ||
typename BoxStrategy = services::default_strategy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if that was an algorithm instead of agnostic strategy, default strategy could have been dispatched as in other bg algorithms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we could leave it to the user to randomize points and then reject samples when they don't meet some criteria like being within a box in this case.
extensions/test/random/random.cpp
Outdated
typedef bg::model::polygon<point2d_cart> polygon; | ||
polygon poly; | ||
bg::read_wkt( | ||
"POLYGON((16 21,17.1226 17.5451,20.7553 17.5451, 17.8164 15.4098,18.9389 11.9549,16 14.0902,13.0611 11.9549, 14.1836 15.4098,11.2447 17.5451,14.8774 17.5451,16 21))", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please reduce length
std::vector<double> accumulated_areas; | ||
// It is hard to see a reason not to use double here. If a triangles | ||
// relative size is smaller than doubles epsilon, it is too unlikely to | ||
// realistically occur in a random sample anyway. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type should probably be taken from area strategy if that's what's being calculated here (not side) .
typename Point, | ||
typename DomainGeometry, | ||
typename TriangleStrategy, | ||
typename SideStrategy //Actually, we need a triangle area strategy here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use already existing AreaStrategy here? Can this agnostic strategy (or algorithm) be used in noncartesian CS? Side strategy would not allow to calculate areas in general case as you noticed.
uniform_convex_fan(DomainGeometry const& g) | ||
{ | ||
accumulated_areas.push_back(0); | ||
for (int i = 2 ; i < g.size() ; ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We put {
in the next line.
DomainGeometry const& r_domain, | ||
uniform_convex_fan const& r_strategy) const | ||
{ | ||
if( l_domain.size() != r_domain.size() ) return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{
return false;
}
Yeah, I know it looks ugly. I personally would put return false;
without brackets in the next line but this is what our guidelines say about formatting.
Btw, @barendgehrels what do you think about allowing single-line returns without brackets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for one line returns.
also in if(
add a space
for (int i = 0; i < l_domain.size(); ++i) { | ||
if( !boost::geometry::equals(*(l_domain.begin() + i), | ||
*(r_domain.begin() + i))) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So like that, though in this case it would make sense to have brakcets because it's hard to see the return false
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the for
's {
should be in new line and a space between if
and (
DomainGeometry const& r_domain, | ||
uniform_convex_fan const& r_strategy) const | ||
{ | ||
if( l_domain.size() != r_domain.size() ) return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is DomainGeometry
? Is it always STL container? Always std::vector
? I'm asking because it's a generic type but you call .size()
. For generic ranges we use Boost.Range's boost::size()
.
uniform_convex_fan const& r_strategy) const | ||
{ | ||
if( l_domain.size() != r_domain.size() ) return false; | ||
for (int i = 0; i < l_domain.size(); ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then here int
should either be DomainGeometry::size_type
, std::vector<...>::size_type
or boost::size_type<DomainGeometry>::type
. Or at least std::size_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am adding a few more comments. I am stopping here my review since I found them many. I will do a second round after addressing / discussing them.
} | ||
bool equals(DomainGeometry const& l_domain, | ||
DomainGeometry const& r_domain, | ||
uniform_convex_fan const& r_strategy) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this strategy is unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have that parameter because the rhs strategy is used in equal-comparison for other strategies and so the code in uniform_point_distribution passes it. Would it be ok, if I just removed the name (r_strategy)? Or is their a way around the necessity for having all strategies take all values that other strategies could need for comparison?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think removing the name r_strategy
would be ok for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alternatively you can use boost::ignore_unused(r_strategy)
Point apply(Gen& g, DomainGeometry const& d) | ||
{ | ||
std::uniform_real_distribution<double> dist(0, 1); | ||
double r = dist(g) * accumulated_areas.back(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot find it in guidelines but bg uses one initialization per variable as
double r = dist(g) * accumulated_areas.back();
double s = dist(g);
@barendgehrels right?
// relative size is smaller than doubles epsilon, it is too unlikely to | ||
// realistically occur in a random sample anyway. | ||
public: | ||
uniform_convex_fan(DomainGeometry const& g) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so this is probably not the right name. The name should declare what this class is doing, e.g. uniform_convex_polygon_sampler
makes sense?
template | ||
< | ||
typename Point, | ||
typename DomainGeometry, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is either a convex ring or convex polygon (they are actually the same but the can have different types)
*(g.begin() + i)))); | ||
} | ||
} | ||
bool equals(DomainGeometry const& l_domain, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see why do you need this function? BG has functions for checking equality of geometries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also you check all the possible linestrings of the convex polygons from a fixed point. What if the geometries are equal and they start at different vertices i.e. domain.begin()
. What am I missing ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The notion of equality as defined by the C++ named requirement "RandomNumberDistribution" on which I based my design of random point distributions considers two distribution instances A, B equal if their parameters are equal and they will produce the same output given the same generators, i.e. from A == B and g1 == g2 it needs to follow that A(g1) == A(g2).
Because of the way I generate samples for convex rings, two distributions will only produce the same output if the convex rings start from the same point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see, I remember also our discussions on this topic. Please add a comment or document this for future reference
Point apply(Gen& g, DomainGeometry const& d) | ||
{ | ||
Point p; | ||
do{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intersection of hull
with d
must be non-empty otherwise you end up in infinite loop. More interestingly, if dim(d) < 2 e.g. d
is a linestring, or point, you end up in a long (if not endless) loop because the probability for hitting the line or point by sampling in an intersecting polygon is very small.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the implementation of uniform_point_distribution which is the user of all those strategies, hull will always be the convex hull of d, hence their intersection must be non-empty if d has a non-empty interior. I can see that this is problematic since the strategy in itself is an exposed interface that could be misused by passing different geometries to constructor and apply-call. I will defer that issue since it may be rendered moot by future refactoring because of the question of what belongs into strategies.
If d (as an areal geometry) has an empty interior (area is zero), I consider it to be invalid input since I am sampling from the interior with the areal types. I understand that this will need to be documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please put (at least) a comment here.
{ | ||
typedef typename point_type<DomainGeometry>::type domain_point; | ||
domain_point p; | ||
do { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly with above comment that could end up in infinite loop
typename TriangleStrategy, | ||
typename SideStrategy //Actually, we need a triangle area strategy here. | ||
> | ||
struct uniform_convex_hull_rejection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any reason to have rejection sampling with convex polygons that are not envelopes (boxes)? Since it is more difficult to generate random points in polygon than boxes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They have different performance characteristics.
Rejection sampling with envelope:
cost of initialization: obtain the envelope
cost per sample: (sample from the envelope (constant) + evaluate within-predicate (linear, I guess) ) * ( area of envelope / area of geometry)
Rejection sampling with convex hull:
cost of initialization: obtain the convex hull
cost per sample: (sample from the convex hull (logarithmic in convex hull size) + evaluate within-predicate (linear, I guess) ) * (area of convex hull / area of geometry)
Now, (area of convex / area of geometry) is usually smaller and at most equal to ( area of envelope / area of geometry) and the other factor should be dominated by the work to check for within. I guess the computation of the convex hull is more expensive than the computation of the envelope. So I'd expect rejection sampling with the convex hull to be slower if few samples are requested but faster if many samples are requested.
I did some benchmarks and was able to confirm that, if you are interested I can create a gist with the code and the results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering why do we need rejection sampling at all? Are there any benefits of this solution vs. forcing the user to reject samples? AFAIU with this strategy it's the same as randomizing points using non-rejection sampling strategy and then manually rejecting samples if they are not within a polygon or a convex hull of a polygon. Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that depends on the interface one wants to provide. E.g. consider uniform_real_distribution. It takes a domain from the user, even though it could just provide random numbers in [0,1], since mapping that to [a,b] is trivial, but for some reason they decided to add those constructor parameters.
My intended design for uniform_point_distribution was such that a user of this facility could just pass some geometry to the constructor and get the expected results. I could remove both rejection strategies and then the uniform_point_distribution would not work with arbitrary polygons. The user would have to call envelope or and use within or, if he wants more performance for a higher number of samples, use convex_polygon. As a related anecdote, until near the end of GSoC I was unaware of the existence of line_interpolate, so user users might also not know all facilities.
Another point is that the advantage of having an interface where the constructor of uniform_point_distribution takes polygons directly may speed up code that uses it that way, once a faster algorithm for sampling in polygons has been implemented. For example, if Boost.Geometry some day gets an algorithm for the triangulation of non-convex polygons then that would allow a strategy that is faster than the rejection-based strategies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tinko92 if you have the code and results already, making them public in a gist will benefit all of us. There are also (at least) two more parameters that play a role in trade-offs between those two rejection samplers, the ratio of area(envelope)/area(polygon) and the size of the hull. But I agree that since there should be a better way of sampling (via non-convex triangulations) spending more time in comparisons between those two could be redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vissarion Here is the gist: https://gist.github.com/tinko92/21d72e4dac4b38f3723511960d3613ca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tinko92 Right, it makes sense. The interface allows to pass a geometry and internally various algorithms can be used during randomization. Whether it is rejection sampling or not doesn't really matter.
domain_point p; | ||
do { | ||
p = m_env_strat.apply(g, m_env); | ||
}while( !::boost::geometry::within(p, d) ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space missing in }while
std::lower_bound(accumulated_areas.begin(), | ||
accumulated_areas.end(), | ||
r)); | ||
return TriangleStrategy::template map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why map
and not sampling
or something that declares sampling from a triangle
with TriangleStrategy
?
@tinko92 one last comment, in gsoc project you also included the documentation of the code. Is it going to be in a different PR, or will be left as future work? |
typename Point, | ||
typename DomainGeometry, | ||
typename TriangleStrategy, | ||
typename SideStrategy //Actually, we need a triangle area strategy here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could existing area strategy be used here?
typename cs_tag<DomainGeometry>::type | ||
>::type | ||
> | ||
struct uniform_linear_single |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also looks like an algorithm. Are there any other strategies defined for single linear geometries?
double, | ||
typename coordinate_type<Point>::type | ||
>::type sample_type; | ||
std::uniform_real_distribution<sample_type> dist(0, length(d)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's bg::length()
right? If that's the case then it needs pt-pt distance strategy.
std::uniform_real_distribution<sample_type> dist(0, length(d)); | ||
sample_type r = dist(g); | ||
Point p; | ||
boost::geometry::line_interpolate(d, r, p); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line_interpolate()
needs a strategy as well.
Would it make sense to make uniform_linear_single
a part of the randomization algorithm (so algorithm dispatched for single linear geometries) and add umbrella strategy (or 3, one for each CS) containing getters for distance
/length
and line_interpolate
strategies?
It's also the case for multi strategy below and in fact AFAIU in both cases (single and multi) the same umbrella strategy could be used because below also length
and line_interpolate
is called.
Thanks for all the comments, I already addressed a couple of them in comments and some with a commit. I think the main question, though, is the matter of strategies and algorithms. I am not yet sure how to best change my code to address the problem that it dispatches strategies w.r.t. geometry tag. Here is what makes it difficult:
So what would be the preferable way to make parts of the class implementation interchangeable and extensible? It seems to be a situation where one could use the strategy pattern, but I understand that strategies should only be used for coordinate_system dependent differences. |
DomainGeometry m_domain; | ||
}; | ||
uniform_point_distribution(DomainGeometry const& domain) | ||
: m_strategy(domain), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be possible to pass a CS-specific strategy into the distribution somehow. If not in the constructor then maybe in the operator(). It's because a strategy could take CS-specific parameters like radius in spherical CS or spheroid in geographic CS. This is another argument backing up the division of CS-specific part and algorithm-specific part of strategies that are currently implemented. EDIT: And by that I mean that e.g. uniform_linear_single
would work in all coordinate systems if the user passed correct length and linear interpolation strategies into it.
@tinko92 I think we should have another level of abstraction between a CS-specific part of what is now a strategy and algorithmic part. Note that currently the CS-specific part is not really exposed, because the algorithms are called with default strategies in random strategies. E.g. in case of I think we need additional layer somewhere between the standard random-generator and uniform-point-distribution which would be at the algorithms/distribution side and then on the strategies side there would be CS-specific part. But I'm not sure what the interface should look like. The easiest thing would be to implement it as various point distributions and then have one default distribution able to take any geometry as a Domain. Another solution would be to take Policy in distribution, or a static tag defining the algorithm used internally. But I don't really like these ideas. I feel that it would be better to divide distribution, distributing/randomizing algorithm, CS-specific-strategy and generator somehow and combine them by either passing in constructors or operator(). I'm not sure how exactly though. |
CsTag | ||
> : public uniform_envelope_rejection<Point, DomainGeometry> { | ||
typedef uniform_envelope_rejection<Point, DomainGeometry> base; | ||
using base::base; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Boost.Geometry services::default_strategy
is not the strategy itself but instead it's a utility defining the type of the default strategy as type
type.
< | ||
Point, | ||
DomainGeometry | ||
> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not how default_strategy
would typically be used. It should define type
member type and be used like this:
typename xxx::services::default_strategy<X, Y, Z>::type
There are 4 parts of the randomization:
Randomizing algorithm has to know the domain/geometry and may need to preprocess data once and store the results for future use. This means it needs geometry and CS-specific strategy during construction (passed into the constructor). Then it has to be called while randomizing and for that it again needs CS-specific strategy and generator. Let's start from the obvious. Generator is std or Boost.Random engine. So this is known and has to be passed into the Then the strategy. I propose to contain the CS-specific parts needed by all randomizing algorithms in one umbrella strategy. So to have 3 strategies, one for each CS, e.g.:
And internally define strategies used by whatever is needed by all of the algorithms: distance, interpolation, convex_hull, envelope, expand, etc. Maybe add a new strategy if there is a specific method of randomization for some CS like Then there is distribution and algorithm. I can think of several possible solutions:
There is also an issue mentioned in one of the points above. That is, what different distributions there might be aside from If only certain algorithms can be used with certain distributions then probably dividing them into 2 parts doesn't make sense. Btw, it's not clear to me. Is What do you think? |
@awulkiew thank you for your elaborate comment. I'll address the easy questions first:
I will think more about the designs you proposed. Some tentative comments though:
I have a question regarding the policy solution: What is the difference between a strategy and a policy (conceptually)? Some sources suggest, it is the same thing. |
I also lean towards the policy solution, since it seems possible that some algorithms could be used to sample from other distributions as well. In particular, rejection sampling can be used to sample from Gaussian (or any other distribution by first sampling in a bounding box). Random walks (that referred above) can also be used to sample from various distributions (depends on the particular random walk though). All in all, I think it is a good idea to have that interface and not integrate the algorithm with the distribution. Regarding policies vs strategies in Boost Geometry strategies are used (or supposed to be used, since there is no official documentation for this) only to separate the CS specific part of the algorithm. We use policies for the other "CS-non-specific" uses. |
This is my suggestion number 2, correct? You differentiate between policies and tags while I considered policies and tags as the same interface, so interface 2. At least from the user's perspective they would look the same. So it seems I was not precise enough and I'd like to make sure we're on the same page. So the difference I was originally thinking about would be something like this:
Note that in this case all policies has to have the same interface:
The only difference is that Tags are used to indirectly define the Policy.
So then in the future if we e.g. need to add a policy taking more types than the other policies we're able to modify the internals like So this is the different between Policies and Tags I had in mind. Are we talking about the same thing? Of course instead of template template parameter we could take the Policy as a type but it'd mean we'd force the user to duplicate the types like that:
|
I am sorry for the long delay. I will resume work on this PR and I will attempt to implement the policy solution in the near future and commit it to this PR. |
Hello @awulkiew , I have reconsidered your solutions 2.1 und 2.2 and I genereally lean towards solution 2.1. I have tried to consider the sensible design space for uniform random point generation and how to properly separate the non-cs/cs concerns into policies and strategies. I came up with the following proposal: For pointlike domains, everything is trivial, no strategy/policy questions. For linestrings and multilinestrings, one needs to obtain the total length L and interpolate.
For areal domains there are more sensible solutions, depending on the situation
The ideas for areal domains should work similarly for volumes as well. My GSoC 2019 does not include implementations for all of this (e.g. I do not have an algorithm to triangulate arbitrary polygons) but a lot of this and I should be able to quickly refactor my code to match this proposed layout. Consequences of these considerations:
I hope these remarks present my proposed refactoring clearly, if not it should become more clear with the next commit. |
d32a500
to
ec6adac
Compare
I have committed and pushed changes that move non-specific code into policies. Please consider this commit WIP. I opted to push this early, so that it can discussed whether this general approach is appropriate with regard to the paradigms of Boost.Geometry and is a sound design, generally. If this approach is deemed correct, I will do one more check of the formatting, unnecessary headers, and so on, and rebase everything, so that it is a single commit that can be applied to the current develop branch. Having said that, I will give a summary of the changes: There are the following policies: There are the following strategies: These are basically for boxes and triangles, what line_interpolate is for segments. They map numbers from [0,1] to the primitive geometries (boxes or triangles) in a uniform way, which is the CS-specific aspect of the uniform point distributions. Here is an illustration of what the strategies do: The implementation that I managed to get working in the end is closer to 2.1, but I take the policy as a type. I am aware that this causes the problem of having to duplicate template parameters that you mentioned in the end, but I don't think the policies can all have the same interface because the number of strategies that they take depends on the policy. |
This is the second of three branches that I created during the GSoC. It does not depend on any of the other two branches. It contains uniform_point_distribution (for now in the extension directory), a class modeled after the random number distribution found in Boost.Random and the STL and designed to generate random geometries; in this case, points uniformly distributed on some domain.
Strategies for domains that are pointlike (CS-agnostic), box (n-d cartesian, 2d/3d spherical), segment (everything supported by line_interpolate), linear (everything supported by line_interpolate) or areal (everything that is cartesian or spherical and supported by envelope) are provided. Tests and examples are included, pictures of how the output may look like are found in this blog post:
https://tinko92.github.io/gsoc2019/2019/09/24/work-product.html (second and fourth picture).
@vissarion @awulkiew This branch slightly differs from the one I submitted to you at the end of GSoC because I replaced some of my implementations for uniform sampling on segments with the usage of line_interpolate. I only noticed the algorithm yesterday which is the reason for this last-minute change.