Skip to content
This repository was archived by the owner on Apr 30, 2025. It is now read-only.
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
67 commits
Select commit Hold shift + click to select a range
93ea66c
added smooth c++ file
andeElliott Jun 1, 2018
d4d7d84
fixed a few annoying bugs in the implementation.
andeElliott Jun 1, 2018
92d2abd
Modifications to add the c code into the function
andeElliott Jun 3, 2018
1dda134
removed 1 if statement out of the loop
andeElliott Jun 3, 2018
5ef89bd
added speed test for smooth version
andeElliott Jun 5, 2018
4496097
removed some of the print messages
andeElliott Jun 18, 2018
80cd6df
added the new methods to the namespace for exports
andeElliott Jun 18, 2018
c0a28e8
commented and removing unneeded code
andeElliott Nov 14, 2019
87bd247
Merge branch 'master' into fastSmoothEMD
andeElliott Nov 14, 2019
3e09482
figuring out what is wrong
andeElliott Nov 14, 2019
87d00b7
removing changes to see if it helps
andeElliott Nov 14, 2019
761898d
Inital working version of fastEmdSmooth
andeElliott Nov 20, 2019
b7a6674
fixed small bug
andeElliott Nov 20, 2019
6b64eb2
missed a few internal paths used for testing
andeElliott Nov 20, 2019
40dddeb
adding the V2 to namespace
andeElliott Nov 20, 2019
b19b6fe
updated versions of wrapper files
andeElliott Nov 20, 2019
15ee86b
stable version
andeElliott Nov 20, 2019
d959a79
updated iupdated
andeElliott Nov 21, 2019
c7b1cc2
updated test
andeElliott Nov 21, 2019
b5bc0a8
updated tests
andeElliott Nov 21, 2019
0cec4ed
Merge branch 'newBranch' into fastSmoothEMD
andeElliott Nov 21, 2019
5ed553e
added old failure case. as a test
andeElliott Nov 22, 2019
cac3bab
first real fast version
andeElliott Nov 22, 2019
deae045
working partially cleaned version
andeElliott Nov 22, 2019
f4c816a
current working version
andeElliott Nov 25, 2019
dc7e53d
updated file naming
andeElliott Nov 25, 2019
9c99088
new fast smooth file
andeElliott Nov 25, 2019
10d7de8
changed name of old version
andeElliott Nov 25, 2019
cc76a3b
removed sourceCpp
andeElliott Nov 25, 2019
aa1f6e8
fixed introduced bug
andeElliott Nov 25, 2019
6f7a515
updated the code
andeElliott Nov 26, 2019
8000fa1
replace missing file
andeElliott Nov 26, 2019
1f5d9d7
Formatting and whitespace
ots22 Apr 17, 2020
91c9561
Formatting
ots22 Apr 17, 2020
9151e52
Local for-loop index
ots22 Apr 17, 2020
b35563e
Combine some conditional cases in get_segment
ots22 Apr 17, 2020
0ca7c89
Combine cases for handling the leftmost segments (NetEmdSmoothV2)
ots22 Apr 17, 2020
c36c3f6
Remove unused function
ots22 Apr 17, 2020
4af238d
Rename loop counters
ots22 Apr 17, 2020
3feb0ae
added tests
andeElliott Apr 17, 2020
a398920
Iterable class to interleave two sequences in the required way
ots22 Apr 20, 2020
c302c43
added fast smooth header
andeElliott Apr 21, 2020
4b57f15
Merge branch 'fastSmoothEMD-review' of https://github.com/alan-turing…
andeElliott Apr 21, 2020
89bba30
updated tests
andeElliott Apr 21, 2020
5436310
remove browser
andeElliott Apr 21, 2020
ce8cbb3
Merge branch 'master' into fastSmoothEMD-review
andeElliott Apr 21, 2020
5c4dd42
Iterate over overlapping intervals
ots22 Apr 22, 2020
77b9338
Move overlapping segment iterator into fastSmoothV2.h
ots22 Apr 22, 2020
72de9ea
Use NumericVectors in tests
ots22 Apr 22, 2020
8e12200
Use overlapping interval iterator in NetEmdSmoothV2
ots22 Apr 22, 2020
857c91c
Small refactor of OverlappingSegments, and use for NetEmdSmoothV2
ots22 Apr 23, 2020
17ef75e
Small formatting changes
ots22 Apr 23, 2020
a67c2ce
Move begin() and end() after iterator class (fix breaking build)
ots22 Apr 23, 2020
a59cc6f
test
leospinaf Jun 23, 2020
1fd6fb7
removed the flag for sse to test if this will work on M1
andeElliott May 18, 2022
66f9f8f
attempt2 to fix the M1 issues
andeElliott May 18, 2022
dcff9a3
adding arg back in to see if it is needed
andeElliott May 18, 2022
e569025
Merge branch 'M1fix' into fastSmoothEMD-review-m1
andeElliott May 18, 2022
0a04e3b
updated Rcpp Exports
andeElliott May 18, 2022
474d895
update namespace to test if functions will export
andeElliott May 18, 2022
7c8055c
Merge branch 'master' into fastSmoothEMD-review
ots22 Jun 7, 2022
f11f068
Regenerated NAMESPACE and RcppExports.R after devtools::document()
ots22 Jun 7, 2022
13eba31
Update to renamed function in test case
ots22 Jun 8, 2022
d3ab662
Use Kahan summation in NetEmdSmoothV2
ots22 Jun 8, 2022
a543607
Remove old unused versions of fastSmooth
ots22 Jun 8, 2022
b0eb0a5
adding tests with previous version rather than integrate
andeElliott Jun 8, 2022
2abf918
Merge branch 'fastSmoothEMD-review' of github.com:alan-turing-institu…
andeElliott Jun 8, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
# Generated by roxygen2: do not edit by hand

export(NetEmdSmoothV2)
export(adaptive_breaks)
export(area_between_dhist_ecmfs)
export(as_smoothed_dhist)
Expand Down Expand Up @@ -48,6 +49,7 @@ export(min_emd)
export(min_emd_exhaustive)
export(min_emd_optimise)
export(min_emd_optimise_fast)
export(netEMDSpeedTestSmooth)
export(netdis)
export(netdis.plot)
export(netdis_centred_graphlet_counts)
Expand Down
10 changes: 10 additions & 0 deletions R/RcppExports.R
Original file line number Diff line number Diff line change
Expand Up @@ -25,3 +25,13 @@ emd_fast_no_smoothing <- function(locations1, values1, locations2, values2) {
.Call(`_netdist_emd_fast_no_smoothing`, locations1, values1, locations2, values2)
}

#' @title
#' Compute EMD
NULL

#'
#' @export
NetEmdSmoothV2 <- function(loc1, val1, binWidth1, loc2, val2, binWidth2) {
.Call(`_netdist_NetEmdSmoothV2`, loc1, val1, binWidth1, loc2, val2, binWidth2)
}

45 changes: 42 additions & 3 deletions R/emd.R
Original file line number Diff line number Diff line change
Expand Up @@ -81,10 +81,49 @@ min_emd_optimise_fast <- function(dhist1, dhist2) {
min_offset <- soln$minimum
return(list(min_emd = min_emd, min_offset = min_offset))
}
else {
# Fall back on other version if either dhist is smoothed
return(min_emd_optimise(dhist1, dhist2))
else #if ((dhist1$smoothing_window_width==1) && (dhist2$smoothing_window_width==1))
{
val1 <- cumsum(dhist1$masses)
val2 <- cumsum(dhist2$masses)
val1 <- val1/val1[length(val1)]
val2 <- val2/val2[length(val2)]
loc1=dhist1$locations
loc2=dhist2$locations
binWidth1=dhist1$smoothing_window_width
binWidth2=dhist2$smoothing_window_width
count=0
# Offset the histograms to make the alignments work
loc1Mod <- loc1 - binWidth1/2
loc2Mod <- loc2 - binWidth2/2
# Determine minimum and maximum offset of range in which histograms overlap
# (based on sliding histogram 1)
min_offset <- min(loc2Mod) - max(loc1Mod) - max(binWidth1,binWidth2)
max_offset <- max(loc2Mod) - min(loc1Mod) + max(binWidth1,binWidth2)
# Set lower and upper range for optimise algorithm to be somewhat wider than
# range defined by the minimum and maximum offset. This guards against a
# couple of issues that arise if the optimise range is exactly min_offset
# to max_offset
# 1) If lower and upper are equal, the optimise method will throw and error
# 2) It seems that optimise is not guaranteed to explore its lower and upper
# bounds, even in the case where one of them is the offset with minimum
# EMD
buffer <- 0.1
min_offset <- min_offset - buffer
max_offset <- max_offset + buffer
# Define a single parameter function to minimise emd as a function of offset
emd_offset <- function(offset) {
temp1<- NetEmdSmoothV2(loc1Mod+offset,val1,binWidth1,loc2Mod,val2,binWidth2)
temp1
}
# Get solution from optimiser
soln <- stats::optimise(emd_offset, lower = min_offset, upper = max_offset,
tol = .Machine$double.eps*1000)
# Return mnimum EMD and associated offset
min_emd <- soln$objective
min_offset <- soln$minimum
return(list(min_emd = min_emd, min_offset = min_offset))
}

}


Expand Down
37 changes: 37 additions & 0 deletions R/net_emd_speed_benchmark.R
Original file line number Diff line number Diff line change
Expand Up @@ -32,3 +32,40 @@ netEMDSpeedTest <- function() {
}
list(gddBuildTime = gddBuildTime, netEMDtime = netEMDtime)
}

#' @export
netEMDSpeedTestSmooth <- function()
{
##load the data
source_dir <- system.file(file.path("extdata", "random"), package = "netdist")
print(source_dir)
edge_format = "ncol"
file_pattern = ""
# source_dir <- system.file(file.path("extdata", "VRPINS"), package = "netdist")
# edge_format = "ncol"
# file_pattern = ".txt"
graphs <- read_simple_graphs(source_dir = source_dir, format = edge_format, pattern = file_pattern)
n1=names(graphs)
lab1=c()
gddBuildTime=c()
netEMDtime=c()
for (i in 1:length(graphs))
{
for (j in 1:(i))
{
g1=graphs[[i]]
g2=graphs[[j]]
lab1=append(lab1,paste(n1[i],n1[j],sep=','))
print(paste(n1[i],n1[j],sep=','))
fulltimeStart=Sys.time()
gdd1=gdd(g1)
gdd2=gdd(g2)
netEMDStart=Sys.time()
net_emd(gdd1,gdd2,smoothing_window_width = 1)
endTime=Sys.time()
gddBuildTime=append(gddBuildTime,as.double(netEMDStart-fulltimeStart))
netEMDtime=append(netEMDtime,as.double(endTime-netEMDStart))
}
}
list(gddBuildTime = gddBuildTime, netEMDtime = netEMDtime)
}
3 changes: 2 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
# Network Comparison
An R package implementing the Netdis and NetEMD alignment-free network comparison measures.
An R package implementing the Netdis and NetEMD alignment-free network comparison measures.


### :warning: BETA: Package under construction (pre-release) :warning:
Until this package hits release 1.0 anything can change with no notice.
Expand Down
2 changes: 1 addition & 1 deletion src/Makevars
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
CXX_STD = CXX11
PKG_CPPFLAGS += -fno-fast-math -msse2 -mfpmath=sse -mstackrealign
PKG_CPPFLAGS += -fno-fast-math -msse2 -mstackrealign
22 changes: 22 additions & 0 deletions src/RcppExports.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,11 @@

using namespace Rcpp;

#ifdef RCPP_USE_GLOBAL_ROSTREAM
Rcpp::Rostream<true>& Rcpp::Rcout = Rcpp::Rcpp_cout_get();
Rcpp::Rostream<false>& Rcpp::Rcerr = Rcpp::Rcpp_cerr_get();
#endif

// counts_from_observations
NumericMatrix counts_from_observations(NumericMatrix features);
RcppExport SEXP _netdist_counts_from_observations(SEXP featuresSEXP) {
Expand All @@ -30,12 +35,29 @@ BEGIN_RCPP
return rcpp_result_gen;
END_RCPP
}
// NetEmdSmoothV2
double NetEmdSmoothV2(NumericVector loc1, NumericVector val1, double binWidth1, NumericVector loc2, NumericVector val2, double binWidth2);
RcppExport SEXP _netdist_NetEmdSmoothV2(SEXP loc1SEXP, SEXP val1SEXP, SEXP binWidth1SEXP, SEXP loc2SEXP, SEXP val2SEXP, SEXP binWidth2SEXP) {
BEGIN_RCPP
Rcpp::RObject rcpp_result_gen;
Rcpp::RNGScope rcpp_rngScope_gen;
Rcpp::traits::input_parameter< NumericVector >::type loc1(loc1SEXP);
Rcpp::traits::input_parameter< NumericVector >::type val1(val1SEXP);
Rcpp::traits::input_parameter< double >::type binWidth1(binWidth1SEXP);
Rcpp::traits::input_parameter< NumericVector >::type loc2(loc2SEXP);
Rcpp::traits::input_parameter< NumericVector >::type val2(val2SEXP);
Rcpp::traits::input_parameter< double >::type binWidth2(binWidth2SEXP);
rcpp_result_gen = Rcpp::wrap(NetEmdSmoothV2(loc1, val1, binWidth1, loc2, val2, binWidth2));
return rcpp_result_gen;
END_RCPP
}

RcppExport SEXP run_testthat_tests();

static const R_CallMethodDef CallEntries[] = {
{"_netdist_counts_from_observations", (DL_FUNC) &_netdist_counts_from_observations, 1},
{"_netdist_emd_fast_no_smoothing", (DL_FUNC) &_netdist_emd_fast_no_smoothing, 4},
{"_netdist_NetEmdSmoothV2", (DL_FUNC) &_netdist_NetEmdSmoothV2, 6},
{"run_testthat_tests", (DL_FUNC) &run_testthat_tests, 0},
{NULL, NULL, 0}
};
Expand Down
197 changes: 197 additions & 0 deletions src/fastSmoothV2.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,197 @@
// Enable C++11
// [[Rcpp::plugins(cpp11)]]
#include <Rcpp.h>
#include <unordered_set>
#include <algorithm>
#include <numeric>
#include <math.h>
#include "emd_fast_no_smoothing.h" // add_element_kahan()
#include "fastSmoothV2.h"

using namespace Rcpp;

double bowtie_area(double length, double val1_start, double val1_end,
double val2_start, double val2_end)
{
double midPoint = (val1_start - val2_start) /
((val2_end - val2_start) - (val1_end - val1_start));

const double midValue = val1_start + midPoint * (val1_end - val1_start);

midPoint = midPoint * length;

double topTriangle = 0.5 * midPoint * (midValue - val1_start);
double topRectangle = midPoint * (val1_start - val2_start);
double bottomTriangle = 0.5 * midPoint * (midValue - val2_start);

double res = topTriangle + topRectangle - bottomTriangle;

topTriangle = 0.5 * (length - midPoint) * (val2_end - midValue);
topRectangle = 0; // midPoint*(val1_start-val2_start);
bottomTriangle = 0.5 * (length - midPoint) * (val1_end - midValue);

res += topTriangle + topRectangle - bottomTriangle;
return res;
}

// Compute the unsigned area between two line segments
// assumes that val1_end > val1_start and val2_end > val2_start
double get_segment(double start, double end, double val1_start,
double val1_end, double val2_start, double val2_end)
{
const double length = end - start;

double topTriangle;
double topRectangle;
double bottomTriangle;
double midPoint;
double midValue;
double res = 0;

bool both_differences_positive = val1_start > val2_start && val1_end >= val2_end;
bool both_differences_negative = val1_start <= val2_start && val1_end <= val2_end;

if (both_differences_positive || both_differences_negative)
{
// They are in the same order: no bowtie
// triangle of seg1
topTriangle = 0.5 * length * (val1_end - val1_start);
// rectangle between seg1 and seg2
topRectangle = length * (val1_start - val2_start);
// triangle of seg2 (to be removed)
bottomTriangle = 0.5 * length * (val2_end - val2_start);

const double sign = both_differences_positive?1.0:-1.0;
return sign * (topTriangle + topRectangle - bottomTriangle);
}
else if (val1_start > val2_start) { // bowtie, first case
return bowtie_area(length, val1_start, val1_end, val2_start, val2_end);
}
else { // bowtie, second case
return bowtie_area(length, val2_start, val2_end, val1_start, val1_end);
}
}

// cut down and compute segment
double get_segment_constrained(double seg1L1, double seg1L2,
double seg2L1, double seg2L2,
double seg1V1, double seg1V2,
double seg2V1, double seg2V2)
{
//We have a valid range
double valStart1, valEnd1, valStart2, valEnd2;
double start,end;
double result;
start = std::max(seg1L1,seg2L1);
end = std::min(seg1L2,seg2L2);
if (start < end) {
valStart1 = seg1V1 + (seg1V2 - seg1V1)*(start - seg1L1)/(seg1L2 - seg1L1);
valEnd1 = seg1V1 + (seg1V2 - seg1V1)*(end - seg1L1)/(seg1L2 - seg1L1);
valStart2 = seg2V1 + (seg2V2 - seg2V1)*(start - seg2L1)/(seg2L2 - seg2L1);
valEnd2 = seg2V1 + (seg2V2 - seg2V1)*(end - seg2L1)/(seg2L2 - seg2L1);
result = get_segment(start, end, valStart1, valEnd1, valStart2, valEnd2);
return result;
}
else {
return 0;
}
}

double get_double_segment_constrained(
double seg1Loc1, double seg1Loc2, double seg1Loc3,
double seg1Val1, double seg1Val2,
double seg2Loc1, double seg2Loc2, double seg2Loc3,
double seg2Val1, double seg2Val2)
{
double res = 0;

// compare the linear section with the linear section
res += get_segment_constrained(seg1Loc1, seg1Loc2, seg2Loc1, seg2Loc2,
seg1Val1, seg1Val2, seg2Val1, seg2Val2);

// compare the linear section with the flat section
// This could be easily special cased (saving ~1 if statements ).
res += get_segment_constrained(seg1Loc1, seg1Loc2, seg2Loc2, seg2Loc3,
seg1Val1, seg1Val2, seg2Val2, seg2Val2);

// compare the flat section with the linear section
// This could be easily special cased (saving ~1 if statements ).
res += get_segment_constrained(seg1Loc2, seg1Loc3, seg2Loc1, seg2Loc2,
seg1Val2, seg1Val2, seg2Val1, seg2Val2);

// compare the flat section with the flat section
// This could be easily special cased (saving ~2 if statements ).
res += get_segment_constrained(seg1Loc2, seg1Loc3, seg2Loc2, seg2Loc3,
seg1Val2, seg1Val2, seg2Val2, seg2Val2);

return res;
}

//' @title
//' Compute EMD
////'
////' @param loc1 numeric vector.
////' @param val1 numeric vector.
////' @param loc2 numeric vector.
////' @param val2 numeric vector.
//'
//' @export
// [[Rcpp::export]]
double NetEmdSmoothV2(NumericVector loc1, NumericVector val1, double binWidth1,
NumericVector loc2, NumericVector val2, double binWidth2)
{
OverlappingSegments segs(loc1, loc2, binWidth1, binWidth2);

double res = 0.0;
double compensation = 0.0;

for (OverlappingSegments::iterator it = segs.begin(), endIt = segs.end();
it != endIt; ++it)
{
// The OverlappingSegments iterator returns pairs of the left
// indices of overlapping endpoints: it->first is the index into
// loc1, it->second is the index into loc2. When the smallest
// element in one of these vectors is larger than the current
// element of the other, an 'index' of -1 is returned.

// Hist 1
// Start of the gradient section in Seg1
double curSeg1Loc1 = segs.loc1_left(it->first);

// End of the gradient section in Seg1
double curSeg1Loc2 = segs.loc1_mid(it->first);

// End of the flat section in Seg1
double curSeg1Loc3 = segs.loc1_right(it->first);

// Start and end values in Seg1: val1 gives the values at *right*
// endpoints of the segments. A value of 0.0 is used before the
// first segment.
double curSeg1Val1 = (it->first > 0) ? val1[it->first - 1] : 0.0;
double curSeg1Val2 = (it->first >= 0) ? val1[it->first] : 0.0;

// Hist 2
// Start of the gradient section in Seg1
double curSeg2Loc1 = segs.loc2_left(it->second);

// End of the gradient section in Seg1
double curSeg2Loc2 = segs.loc2_mid(it->second);

// End of the flat section in Seg1
double curSeg2Loc3 = segs.loc2_right(it->second);

// Start and end values in Seg2: val2 gives the values at *right*
// endpoints of the segments. A value of 0.0 is used before the
// first segment.
double curSeg2Val1 = (it->second > 0) ? val2[it->second - 1] : 0.0;
double curSeg2Val2 = (it->second >= 0) ? val2[it->second] : 0.0;

double element = get_double_segment_constrained(
curSeg1Loc1, curSeg1Loc2, curSeg1Loc3, curSeg1Val1, curSeg1Val2,
curSeg2Loc1, curSeg2Loc2, curSeg2Loc3, curSeg2Val1, curSeg2Val2);

add_element_kahan(res, element, compensation);
}

return res;
}
Loading