@nancycollins

**## Description:

Make the CSV read module use the new CSV parsing routine, which simplifies the code and handles embedded blanks. Add an optional argument to the open routine to pass in a known delimiter. Update the test routines. Make a one-line change to the ARVOR converter to account for the additional optional argument in the open. (Helen told me I can make a pull request to moha's branch, so this is what I'm trying. Let me know if I did it wrong.)

Fixes issue

Addresses issues with pull request NCAR#1009

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Documentation changes needed?

  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.

Tests

The CSV read routines now handle embedded blanks, and the tests all run to completion for the csv reads.

Checklist for merging

  • Updated changelog entry
  • Documentation updated
  • Update conf.py

Checklist for release

  • Merge into main
  • Create release from the main branch with appropriate tag
  • Delete feature-branch

Testing Datasets

  • Dataset needed for testing available upon request
  • Dataset download instructions included
  • No dataset needed**

The parse routine must be told what the delimiter is (generally a comma or
semicolon) and splits the line into fields based on that delimiter. It handles
quotes inside the fields so that the delimiter can be part of the string.
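
The behavior described in this commit message can be sketched roughly as follows. This is a minimal illustration and not the routine added in the pull request: the module name `csv_split_sketch` and the subroutine `split_fields` are hypothetical, and the sketch assumes double quotes are the only quoting character and that backslash-escaped quotes are not supported.

```fortran
module csv_split_sketch
implicit none
private
public :: split_fields

contains

! Hypothetical helper, not the routine from this pull request.
! Split "line" into fields separated by "delim".  A delimiter inside a
! double-quoted field is kept as part of that field, and the quote
! characters themselves are dropped.  Embedded blanks are preserved.
! Fields beyond size(fields) are counted in nfields but not stored.
subroutine split_fields(line, delim, fields, nfields)
character(len=*), intent(in)  :: line
character(len=1), intent(in)  :: delim
character(len=*), intent(out) :: fields(:)
integer,          intent(out) :: nfields

integer :: i, flen
logical :: in_quotes

fields(:) = ' '
nfields   = 1
flen      = 0
in_quotes = .false.

do i = 1, len_trim(line)
   if (line(i:i) == '"') then
      ! entering or leaving a quoted section
      in_quotes = .not. in_quotes
   else if (line(i:i) == delim .and. .not. in_quotes) then
      ! an unquoted delimiter starts the next field
      nfields = nfields + 1
      flen    = 0
   else
      ! ordinary character: append it to the current field if it fits
      flen = flen + 1
      if (nfields <= size(fields) .and. flen <= len(fields)) then
         fields(nfields)(flen:flen) = line(i:i)
      endif
   endif
enddo

end subroutine split_fields

end module csv_split_sketch
```

With this sketch, a call such as `call split_fields('Name,"Smith, John",,42', ',', fields, n)` would treat the comma inside the quotes as data, and two adjacent delimiters yield an empty field.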

Added a test program and test input file.
Removed the routine that adds spaces; the new parse routine is now called
directly. Added an option on open to specify the delimiter, which is
passed through to the detect routine. The test program now uses the
testeverything code. The reader now handles fields with embedded spaces and
alternative delimiters.

@hkershaw-brown left a comment

Hi Nancy,

I've made some comments on your code. The one definite change is to put the csv routines in the same csv module, rather than having one csv routine in parse_args_mod and the rest in read_csv_mod.

At the moment I think your tests are a bit hard to follow: it is not clear what each one is testing or what result is expected. I think they would benefit from descriptive names that say what is being tested, e.g. something like:

```fortran
! Error handling tests
call test_missing_field()          ! Field not in header
call test_malformed_csv()          ! Inconsistent column counts
call test_io_errors()              ! File read failures
call test_size_mismatches()        ! Array size != data size

! Edge case tests
call test_quoted_fields()          ! "Bob,Jr", "Alice Smith"
call test_embedded_delimiters()    ! "Smith, John", "Mary, Jane"
call test_empty_fields()           ! Name,,Date,Total
call test_escaped_characters()     ! "He said \"Hello\""
call test_whitespace_delims()      ! Tab or space-separated - can it cope with this?

! Data conversion tests
call test_real_conversions()       ! 3.14, 1.0e-5, missing values, -.34
call test_missing_data_handling()  ! _EMPTY_ -> MISSING_R8
call test_invalid_numbers()        ! "abc" -> gets MISSING_R8?

! Boundary tests
call test_max_field_length()       ! Fields > 512 chars
call test_max_column_count()       ! > MAX_NUM_FIELDS columns
call test_empty_files()            ! Header only, or completely empty
```

Similarly, there is no documentation for read_csv_mod.f90 describing what tools are available to the user and what the limitations are.

Cheers,
Helen


```fortran
integer :: rc

rc = shell_execute("rm "//trim(fname))
```

The Fortran standard provides EXECUTE_COMMAND_LINE, which could be used here rather than shell_execute.
Also, use \rm so that any alias for rm is bypassed, e.g. if rm is aliased to rm -i.
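
A minimal sketch of what this suggestion might look like is below. The module and function names are hypothetical and are not taken from the pull request. The first function follows the suggestion directly; the second is an alternative that was not proposed in this review, included only for comparison, and it removes the file without going through a shell at all. The first version assumes a Unix-like shell and a file name without blanks or shell metacharacters.

```fortran
module delete_file_sketch
implicit none
private
public :: delete_with_command, delete_with_close

contains

! Run "\rm <fname>" through the standard intrinsic.
! Returns 0 on success and a non-zero value otherwise.
function delete_with_command(fname) result(rc)
character(len=*), intent(in) :: fname
integer :: rc
integer :: cmdstat

rc = 0
! achar(92) is the backslash character; building it this way avoids
! any dependence on how the compiler treats backslashes in literals.
call execute_command_line(achar(92)//"rm "//trim(fname), &
                          exitstat=rc, cmdstat=cmdstat)
! cmdstat is non-zero if the command could not be run at all
if (cmdstat /= 0) rc = cmdstat

end function delete_with_command

! Alternative (not from the review): open the existing file and
! close it with status='delete'.  Returns 0 on success.
function delete_with_close(fname) result(rc)
character(len=*), intent(in) :: fname
integer :: rc
integer :: iunit

open(newunit=iunit, file=trim(fname), status='old', iostat=rc)
if (rc /= 0) return
close(iunit, status='delete', iostat=rc)

end function delete_with_close

end module delete_file_sketch
```

Either function could stand behind a `delete_file(fname)` wrapper like the one the test program calls; which one fits better depends on whether the tests need to run on systems without a Unix-like shell.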

Author

i added the backslash.

```fortran
call test_4(fname)
call delete_file(fname)

! end test
```


I don't think you have a test for a failed read, where the value is given MISSING_R8


Or a csv that is malformed, e.g. wrong number of columns in a row. Would the missing columns get filled with MISSING_R8?

Author

added tests for missing reals and integers. malformed lines give a fatal error so there is a test for those but it's currently commented out.
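
The failed-read case raised in this thread could be exercised with something along these lines. This is only a sketch under stated assumptions: `string_to_real` is a hypothetical helper, not a routine from read_csv_mod, and the `r8` kind and the numeric value given to `MISSING_R8` here are local stand-ins rather than the library's own definitions.

```fortran
module missing_real_sketch
implicit none
private
public :: string_to_real, r8, MISSING_R8

! Local stand-ins; the real kind and the missing value are assumptions.
integer,  parameter :: r8 = selected_real_kind(12)
real(r8), parameter :: MISSING_R8 = -888888.0_r8

contains

! Hypothetical helper: convert one CSV field to a real.
! Empty or unreadable fields come back as MISSING_R8.
function string_to_real(str) result(val)
character(len=*), intent(in) :: str
real(r8) :: val
integer :: ios

val = MISSING_R8
if (len_trim(str) == 0) return

! list-directed read from the string; ios is non-zero on failure
read(str, *, iostat=ios) val
if (ios /= 0) val = MISSING_R8

end function string_to_real

end module missing_real_sketch
```

A test named for what it checks, such as `test_invalid_numbers`, could then compare `string_to_real("abc")` against `MISSING_R8` and report a failure if they differ.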

@nancycollins
Author

i ran the converter on moha's test files after all my changes and it ran correctly.

re-enabled the 2 tests that provoke a (correct) fatal error.
added a set_term_level() routine to the utilities mod.
@mgharamti mgharamti merged commit 06c7977 into mgharamti:insitu_ocean_converters Jan 7, 2026