Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Windows line separation. #949

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

szarpul
Copy link

@szarpul szarpul commented Jun 25, 2024

Bug fix for #595

  • BeanOutputConverter - replace all Windows line separation with \n.
  • BeanOutputConverterTest - explicitly check line separation using \n.
  • Introduce TextBlockAssertion that asserts text blocks with line separation set to \n.

@tzolov
Copy link
Contributor

tzolov commented Jun 25, 2024

@szarpul I'm afraid your fix is not multiplatform? have you tested it on Mac, Linux and Wind?

@szarpul
Copy link
Author

szarpul commented Jun 25, 2024

Hey @tzolov , I tested it only on Windows. Nevertheless, the fix should not affect other platforms as they don't have \r\n line separation.

@nichozhan
Copy link
Contributor

nichozhan commented Jun 25, 2024

Hi @szarpul & @tzolov , How about using System.lineSeparator() as the target of replacement?
I have a similar issue in #944 while asserting multi-line string equation.

@szarpul
Copy link
Author

szarpul commented Jun 25, 2024

@nichozhan I would avoid using System.lineSeparator() at all. The test intention is clear, we want to have just \n as a line separation. On the other hand, System.lineSeparator() produces different results on different OS thus making the test flaky (failing on Windows for instance).

@szarpul
Copy link
Author

szarpul commented Jun 26, 2024

@tzolov I've tested it on Linux using docker image maven:eclipse-temurin-17 with a command clean package and all tests have passed.

@szarpul szarpul force-pushed the fix-windows-line-separation branch from ea2eedd to b1d59fd Compare July 1, 2024 06:45
@markpollack
Copy link
Member

markpollack commented Jul 2, 2024

In the issues that came up around this topic, it seems there are different opinions. Is the intention that everything only has \n no matter what the platform? Is this because the AI models can universally understand \n and the other variations \r\n and \r confuse it? Doesn't that seem unlikely? The other intention I thought was to use the correct newline per platform so that any logging/debugging would appear in an appropriate manner.

If the tests are flaky because they don't use System.lineSeparator() then shouldn't the tests be fixed? My gut feel is that the text should reflect the system line separator, so any replacement if needed would be along the lines of myString.replaceAll("\\r\\n|\\r|\\n", System.lineSeparator())

Thoughts?

@markpollack markpollack self-assigned this Jul 2, 2024
@szarpul
Copy link
Author

szarpul commented Jul 3, 2024

Hey @markpollack thanks for the reply!

If the tests are flaky because they don't use System.lineSeparator() then shouldn't the tests be fixed?

I would say it's the other way around, the tests are flaky because of the use of System.lineSeparator() . Meaning System.lineSeparator() differs between different OS. My change fixes that by defining the desired line separator.

If that's correct that \n is a desired line separator is a different topic I would say. However, right now this is how it's defined, please see the comment in the test:

// validate that output contains \n line endings
and I'm just adjsuting the tests.

* `BeanOutputConverter` - replace all Windows line separation with \n.
* `BeanOutputConverterTest` - explicitly check line separation using \n.
* Introduce `TextBlockAssertion` that asserts text blocks with line separation set to \n.

(cherry picked from commit e20c9d6)
* After the fix done in spring-projects#944, the tests were still not passing.
* Following the fix for spring-projects#944, we should compare text blocks against normalized EOL.
* If our goal is to have normalized EOL, then there is no sens to hardcode a `\n` as the target value in `BeanOutputConverterTest.normalizesLineEndingsClassType` test.
@szarpul szarpul force-pushed the fix-windows-line-separation branch from b1d59fd to 2404f07 Compare July 9, 2024 07:09
@szarpul
Copy link
Author

szarpul commented Jul 9, 2024

Hey @markpollack @tzolov

I have updated the PR.

WDYT?

@markpollack markpollack added this to the 1.0.0-M2 milestone Jul 22, 2024
@markpollack
Copy link
Member

Thanks for hanging in there on this. I'll dive back in and take a look. The way this reads now has me a bit confused. It looks like there isn't portability in text blocks across platforms?

The related issue #595 is about DEFAULT_METADATA_SEPARATOR use in Documents.

I've created a 'cross-platform' label to track / group better

@szarpul
Copy link
Author

szarpul commented Jul 24, 2024

Hey @markpollack

To clarify the situation, it all comes to one thing, somwhere in the production code the System.lineSeparator() is used, which produces different result for each OS:

  • \n UNIX
  • \r\n WINDOWS

BeanOutputConverterTest

In the method BeanOutputConverter.generateSchema the System.lineSeparator() is used:

ObjectWriter objectWriter = new ObjectMapper().writer(new DefaultPrettyPrinter()
.withObjectIndenter(new DefaultIndenter().withLinefeed(System.lineSeparator())));

ContentFormatterTests

This test runs the DefaultContentFormatter with a DEFAULT_METADATA_SEPARATOR set to System.lineSeparator().

private static final String DEFAULT_METADATA_SEPARATOR = System.lineSeparator();

PromptTemplateTest

PromptTemplate class is using StringTemplate library that probably is also using the System.lineSeparator() unde the hood, as the PromptTemplate.render metod is returning the results with Windows specific line separator.


Wrap up

We could either explicitly specify to have the \n in the production code (don't know exactly how to enforce that for the ST library, but lets assume there is a way) or make our test adjust the line separator to the system separator, like in the preposition in this PR (maybe the TextBlockAssertions could be changed to something else).

WDYT?

@markpollack markpollack modified the milestones: 1.0.0-M2, 1.0.0-RC1 Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants