Classifier #1357

Merged Jul 1, 2024 (28 commits)
Changes from 23 commits
Commits
96e5b14
the best version so far
matakleo May 22, 2024
33cece3
matche python mostly
matakleo May 22, 2024
c855b6b
before major change, kinda works
matakleo May 29, 2024
ba60af9
long one, all numpy stuff
matakleo May 29, 2024
b3c63ab
classfier v1
matakleo May 31, 2024
84395a1
Seems to be working
matakleo Jun 3, 2024
280ed2a
stlye check
matakleo Jun 3, 2024
95bc1f2
removed some not used stuff
matakleo Jun 3, 2024
ca0541d
add additional convert methods
matakleo Jun 3, 2024
4c947fa
some test fixes
matakleo Jun 3, 2024
d4a163a
rehaul
matakleo Jun 4, 2024
1ed0023
changes from code review comments
matakleo Jun 6, 2024
26eaf71
starting to work
matakleo Jun 18, 2024
2bde111
trying new ncml
matakleo Jun 18, 2024
e723c37
new tests
matakleo Jun 18, 2024
b6fe81c
all seems to work
matakleo Jun 18, 2024
52f6564
all seems to be working!
matakleo Jun 18, 2024
a751329
applied style
matakleo Jun 18, 2024
893a93c
Merge branch 'maint-5.x' into Classifier
matakleo Jun 21, 2024
56ff9dc
final style update
matakleo Jun 21, 2024
a47ebc5
fix the shape of class_specs to 17
matakleo Jun 21, 2024
ece1779
fix pr comments
matakleo Jun 25, 2024
81baa02
fix the flaky test
matakleo Jun 25, 2024
da03532
delete tests, modify constructor
matakleo Jun 26, 2024
87d5aa8
remove unused ncml, add int and float test
matakleo Jun 27, 2024
c2dc1d0
update variable length to correct num
matakleo Jun 27, 2024
5ce0491
Suppress spurious/not relevant jfreechart CVEs (#1358)
tdrwenski Jun 27, 2024
6e36dc7
Merge branch 'maint-5.x' into Classifier
haileyajohnson Jul 1, 2024
2 changes: 2 additions & 0 deletions cdm/core/src/main/java/ucar/nc2/constants/CDM.java
@@ -40,6 +40,8 @@ public class CDM {
public static final String VALID_RANGE = "valid_range";
public static final String VALID_MIN = "valid_min";
public static final String VALID_MAX = "valid_max";
public static final String RANGE_CAT = "range_cat";


// staggering for _Coordinate.Stagger
public static final String ARAKAWA_E = "Arakawa-E";
117 changes: 92 additions & 25 deletions cdm/core/src/main/java/ucar/nc2/filter/Classifier.java
@@ -1,34 +1,62 @@
package ucar.nc2.filter;

import java.io.IOException;
import static ucar.ma2.MAMath.nearlyEquals;
import ucar.ma2.Array;
import ucar.ma2.DataType;
import ucar.ma2.IndexIterator;
import ucar.nc2.dataset.VariableDS;
import ucar.nc2.Variable;
import ucar.nc2.constants.CDM;
import ucar.nc2.Attribute;
import ucar.nc2.util.Misc;
import java.util.ArrayList;
import java.util.List;


public class Classifier implements Enhancement {
private Classifier classifier = null;
private static Classifier emptyClassifier;
private int classifiedVal;
private int[] classifiedArray;

public static Classifier createFromVariable(VariableDS var) {
try {
Array arr = var.read();
// DataType type = var.getDataType();
return emptyClassifier();
} catch (IOException e) {
return emptyClassifier();
}
private List<Attribute> AttCat;
private List<int[]> rules = new ArrayList<>();

// Constructor with no arguments
public Classifier() {
this.AttCat = new ArrayList<>();
this.rules = new ArrayList<>();
}

// Constructor with attributes
public Classifier(List<Attribute> AttCat) {
this.AttCat = AttCat;
this.rules = loadClassificationRules();
}

public static Classifier emptyClassifier() {
emptyClassifier = new Classifier();
return emptyClassifier;
// Factory method to create a Classifier from a Variable
public static Classifier createFromVariable(Variable var) {

List<Attribute> attributes = var.getAttributes();

if (var.attributes().hasAttribute(CDM.RANGE_CAT)) {
return new Classifier(attributes);
} else {
return new Classifier();
}

}

/** Enough of a constructor */
public Classifier() {}

public int[] classifyWithAttributes(Array arr) {
int[] classifiedArray = new int[(int) arr.getSize()];
IndexIterator iterArr = arr.getIndexIterator();
int i = 0;
while (iterArr.hasNext()) {
Number value = (Number) iterArr.getObjectNext();
if (!Double.isNaN(value.doubleValue())) {
classifiedArray[i] = classifyArrayAttribute(value.doubleValue());
} else {
classifiedArray[i] = Integer.MIN_VALUE;
}
i++;
}
return classifiedArray;
}

/** Classify double array */
public int[] classifyDoubleArray(Array arr) {
@@ -46,25 +74,64 @@ public int[] classifyDoubleArray(Array arr) {
return classifiedArray;
}



/** for a single double */
public int classifyArray(double val) {
if (val >= 0) {
Review comment (Member): Is classifyArray still being used? It was the placeholder version, so maybe we can delete it and add the categories to testClassifier.ncml so your original tests keep working. It may also be causing some of the test issues: the original tests may still go through this code path, and you only added the tolerance to the comparison below in classifyArrayAttribute.

classifiedVal = 1;
} else {
classifiedVal = 0;
}

return classifiedVal;
}

public int classifyArrayAttribute(double val) {
for (int[] rule : rules) {
if (val > rule[0] && val <= rule[1] + ucar.nc2.util.Misc.defaultMaxRelativeDiffFloat) {
Review comment (Member): Minor style comment: I would add an import statement for

import ucar.nc2.util.Misc;

so that this can just be Misc.defaultMaxRelativeDiffFloat.

return rule[2]; // Return the matched rule's value
}
}
// Return min possible int if no rule matches
return Integer.MIN_VALUE;
}

// Method to load classification rules from the attributes
private List<int[]> loadClassificationRules() {
for (Attribute attribute : this.AttCat) {
int[] rule = stringToIntArray(attribute.getStringValue());
this.rules.add(rule);
}
return rules;
}

@Override
public double convert(double val) {
return emptyClassifier.classifyArray(val);
return classifyArray(val);
}

public static int[] stringToIntArray(String str) {
String[] stringArray = str.split(" "); // Split the string by spaces
int[] intArray = new int[stringArray.length]; // Create an array to hold the parsed integers

for (int i = 0; i < stringArray.length; i++) {

double value = Double.parseDouble(stringArray[i]); // Parse each string to a double

if (Double.isNaN(value)) {
// Check if the entry is NaN and assign Integer.MIN_VALUE or Integer.MAX_VALUE based on the index
if (i == 0) {
intArray[i] = Integer.MIN_VALUE;
} else if (i == 1) {
intArray[i] = Integer.MAX_VALUE;
} else {
intArray[i] = -99999; // Default value for other indices if needed
}
} else {
intArray[i] = (int) value; // Convert the value to int if it is not NaN
}

}
}

return intArray;
}

}
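
For readers following the diff, the range matching done by classifyArrayAttribute and classifyWithAttributes can be sketched roughly as below. This is an illustration under stated assumptions, not the merged implementation: the class name RangeRuleSketch, the methods classify and classifyAll, and the TOLERANCE constant (standing in for Misc.defaultMaxRelativeDiffFloat, whose value is assumed here) are hypothetical, and a plain double[] replaces ucar.ma2.Array so the block has no netCDF-Java dependency.

```java
import java.util.List;

// Hypothetical stand-alone sketch; not part of netCDF-Java.
class RangeRuleSketch {
  // Assumed tolerance, standing in for Misc.defaultMaxRelativeDiffFloat.
  static final double TOLERANCE = 1.0e-5;

  // Each rule is {lowerExclusive, upperInclusive, category}.
  // Returns the category of the first matching rule,
  // or Integer.MIN_VALUE when no rule matches.
  static int classify(double val, List<int[]> rules) {
    for (int[] rule : rules) {
      if (val > rule[0] && val <= rule[1] + TOLERANCE) {
        return rule[2];
      }
    }
    return Integer.MIN_VALUE;
  }

  // Classifies every element; NaN inputs map to Integer.MIN_VALUE,
  // mirroring classifyWithAttributes in the diff.
  static int[] classifyAll(double[] values, List<int[]> rules) {
    int[] out = new int[values.length];
    for (int i = 0; i < values.length; i++) {
      out[i] = Double.isNaN(values[i]) ? Integer.MIN_VALUE : classify(values[i], rules);
    }
    return out;
  }
}
```

Because the lower bound is exclusive and the upper bound inclusive, a value that sits exactly on a shared boundary belongs to the lower range: with rules {Integer.MIN_VALUE, 0, 0} and {0, 10, 10}, the value 0 is classified as 0, not 10.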
19 changes: 19 additions & 0 deletions cdm/core/src/test/data/ncml/enhance/testAddToClassifier.ncml
@@ -0,0 +1,19 @@
<?xml version="1.0" encoding="UTF-8"?>

<!--
~ Copyright (c) 1998-2020 John Caron and University Corporation for Atmospheric Research/Unidata
~ See LICENSE for license information.
-->

<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" enhance="all">

<variable name="class_specs" shape="17" type="double">
<attribute name="range_cat" type="string" value="NaN 0 0" />
<attribute name="range_cat1" type="string" value="0 10 10" />
<attribute name="range_cat2" type="string" value="10 100 100" />
<attribute name="range_cat3" type="string" value="100 NaN 1000" />
<values>-500000 NaN -10 0 1 2 3 11 25 29 NaN 100 150 121 102 199999 12211</values>
</variable>


</netcdf>
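
Each range_cat attribute value in the NcML above encodes one rule as "lower upper category", with NaN marking an open bound. A minimal sketch of that parsing follows, modeled on stringToIntArray in Classifier.java but not identical to it: the class RangeCatParseSketch and the method parseRule are hypothetical names, and splitting on any run of whitespace plus the three-token check are additions assumed here for robustness.

```java
// Hypothetical stand-alone sketch; not part of netCDF-Java.
class RangeCatParseSketch {
  // Parses "lower upper category" into {lower, upper, category}.
  // NaN as lower bound becomes Integer.MIN_VALUE and NaN as upper
  // bound becomes Integer.MAX_VALUE, i.e. an open-ended range.
  static int[] parseRule(String spec) {
    String[] tokens = spec.trim().split("\\s+");
    if (tokens.length != 3) {
      throw new IllegalArgumentException("expected 'lower upper category', got: " + spec);
    }
    double lower = Double.parseDouble(tokens[0]);
    double upper = Double.parseDouble(tokens[1]);
    double category = Double.parseDouble(tokens[2]);
    return new int[] {
      Double.isNaN(lower) ? Integer.MIN_VALUE : (int) lower,
      Double.isNaN(upper) ? Integer.MAX_VALUE : (int) upper,
      (int) category
    };
  }
}
```

Under these assumptions, the four attribute values of class_specs yield the rules {Integer.MIN_VALUE, 0, 0}, {0, 10, 10}, {10, 100, 100} and {100, Integer.MAX_VALUE, 1000}.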
16 changes: 8 additions & 8 deletions cdm/core/src/test/data/ncml/enhance/testClassifier.ncml
@@ -40,14 +40,14 @@
<values>1 2 3 4 5</values>
</variable>

<variable name="intNegatives" shape="5" type="int">
<attribute name="classify"/>
<values>-1.0 -2.0 -3.0 -4.0 -5.0</values>
</variable>
<variable name="intMix" shape="5" type="int">
<attribute name="classify"/>
<values>1.0 -2.0 0.0 4.0 -5.0</values>
</variable>
<variable name="intNegatives" shape="5" type="int">
<attribute name="classify"/>
<values>-1 -2 -3 -4 -5</values>
</variable>
<variable name="intMix" shape="5" type="int">
<attribute name="classify"/>
<values>1 -2 0 4 -5</values>
</variable>



33 changes: 31 additions & 2 deletions cdm/core/src/test/java/ucar/nc2/ncml/TestEnhanceClassifier.java
@@ -2,14 +2,17 @@

import static com.google.common.truth.Truth.assertThat;
import static ucar.ma2.MAMath.nearlyEquals;

import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import org.junit.Test;
import ucar.ma2.Array;
import ucar.ma2.DataType;
import ucar.nc2.Attribute;
import ucar.nc2.NetcdfFile;
import ucar.nc2.Variable;
import ucar.nc2.dataset.NetcdfDatasets;
import ucar.nc2.filter.Classifier;
import ucar.unidata.util.test.TestDir;

public class TestEnhanceClassifier {
@@ -22,7 +25,9 @@ public class TestEnhanceClassifier {
public static final Array DATA_all_zeroes = Array.makeFromJavaArray(all_zeroes);
public static final int[] mixNumbers = {1, 0, 1, 1, 0};
public static final Array DATA_mixNumbers = Array.makeFromJavaArray(mixNumbers);

public static final int[] Classification_test =
{0, -2147483648, 0, 0, 10, 10, 10, 100, 100, 100, -2147483648, 100, 1000, 1000, 1000, 1000, 1000};
public static final Array CLASSIFICATION_TEST = Array.makeFromJavaArray(Classification_test);

/** test on doubles, all positives, all negatives and a mixed array */
@Test
@@ -78,6 +83,7 @@ public void testEnhanceClassifier_floats() throws IOException {
assertThat(floatMix.getDataType()).isEqualTo(DataType.FLOAT);
assertThat(floatMix.attributes().hasAttribute("classify")).isTrue();
Array datafloatsMix = floatMix.read();
// assertThat(datafloatsMix).isEqualTo(DATA_mixNumbers);
Review comment (Member): This can probably go?

Suggested change (delete this line):
// assertThat(datafloatsMix).isEqualTo(DATA_mixNumbers);

assertThat(nearlyEquals(datafloatsMix, DATA_mixNumbers)).isTrue();

}
@@ -112,4 +118,27 @@ public void testEnhanceClassifier_integers() throws IOException {
}

}

@Test
public void testEnhanceClassifier_classification() throws IOException {

try (NetcdfFile ncfile = NetcdfDatasets.openDataset(dataDir + "testAddToClassifier.ncml", true, null)) {

Variable Classify_Specsx = ncfile.findVariable("class_specs");
Review comment (Member; matakleo marked this conversation as resolved): Minor style issue: I would use lower camel case for the variable name, something like:

Suggested change (replace the first line with the second):
Variable Classify_Specsx = ncfile.findVariable("class_specs");
Variable classifySpecs = ncfile.findVariable("class_specs");

assertThat((Object) Classify_Specsx).isNotNull();
assertThat(!Classify_Specsx.attributes().isEmpty()).isTrue();
Array Data = Classify_Specsx.read();
Classifier classifier = Classifier.createFromVariable(Classify_Specsx);
// List<Attribute> Whatever = Classify_Specsx.getAttributes();
// Classifier classifier = new Classifier(Whatever);
Review comment (Member): This can probably go?

Suggested change (delete these lines):
// List<Attribute> Whatever = Classify_Specsx.getAttributes();
// Classifier classifier = new Classifier(Whatever);

int[] ClassifiedArray = classifier.classifyWithAttributes(Data);
assertThat(nearlyEquals(Array.makeFromJavaArray(ClassifiedArray), CLASSIFICATION_TEST)).isTrue();


} catch (IOException e) {
throw new RuntimeException(e);
}

}

}