Unpredictable Behavior of ui.textarea with Over a Million Characters #3410

Open
Xtreemrus opened this issue Jul 28, 2024 · 5 comments
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@Xtreemrus

Description

When ui.textarea contains more than a million characters, it starts to behave unpredictably without any visible warnings.

Steps to Reproduce

To demonstrate the issue from different perspectives, I wrote the following code:

from nicegui import ui

@ui.page('/')
def main():
    text_area = ui.textarea('Type something').props('rounded outlined input-class=mx-3') \
                .classes('w-full self-center')

    text_area.value = '123' + 'A'*1_000_000

    len_label = ui.label('Length').classes('mt-2')
    
    def text_area_length():
        print('Length:', len(text_area.value))
        len_label.set_text(f'Length: {len(text_area.value)}')
        print('first_character:', text_area.value[0])

    def text_area_delete_first():
        text_area.value = text_area.value[1:]
        text_area_length()

    ui.button('Calculate', on_click=text_area_length)
    ui.button('Delete first character', on_click=text_area_delete_first)

    text_area_length()

ui.run(title='Text Input Max')

When you run this code, it creates a textarea prefilled with the three digits 123 followed by one million A characters. The leading digits make the behavior easier to observe in the scenarios below.

  1. If you click the Calculate button immediately after launching (without interacting with the ui.textarea), everything is calculated correctly, and the console prints the appropriate values.
  2. If you add or delete a character using the ui.textarea (e.g., delete the digit 1 at the beginning or add a new character), the Calculate button will no longer calculate the new value of text_area correctly. The code executes without errors, and you see the print output, but it shows the wrong number of characters and the first character is also incorrect, as if the connection between the frontend and backend is lost.
  3. To further investigate, I created a button to delete the first character without using the cursor to interact with the ui.textarea. In this case, everything works as expected, the counter decreases, and the first characters are read correctly.

Additional Investigation

To see if this is a Quasar issue, I added character counting to their textarea code:
https://codepen.io/Xtreemrus/pen/XWLNPyY?editors=101

<!--
Forked from:
https://quasar.dev/vue-components/input#example--textarea
-->
<div id="q-app" style="min-height: 100vh;">
  <div class="q-pa-md" style="max-width: 300px">
    <q-input
      v-model="text"
      filled
      type="textarea"
    ></q-input>
    <div class="q-mt-md">
      <p>Calculate len: {{ characterCount }}</p>
      <p>first_character: {{ firstCharacter }}</p>
    </div>
  </div>
</div>

JS part:

const { ref, computed } = Vue

const app = Vue.createApp({
  setup () {
    const text = ref('')

    const characterCount = computed(() => text.value.length)
    
    const firstCharacter = computed(() => {
      if (text.value.length > 0) {
        return text.value[0]
      } else {
        return 'No symbols'
      }
    })

    return {
      text,
      characterCount,
      firstCharacter
    }
  }
})

app.use(Quasar, { config: {} })
app.mount('#q-app')

In this case, working with a million characters is handled correctly.

Impact

A couple of years ago, cases requiring a million characters in a textarea were rare. However, with the development of LLMs, users are inserting increasingly large chunks of information into input fields. One million characters is approximately 250,000 tokens, which is now a feasible input size.

Difficulty

The challenge with this bug is that no visible errors occur on either the user side or the developer side. No exceptions are thrown, which makes it hard to find a workaround.

Environment

  • Browser: Chrome
  • NiceGUI version: 1.4.30
@falkoschindler falkoschindler added the bug Something isn't working label Jul 29, 2024
@falkoschindler
Contributor

Hi @Xtreemrus,

Thanks for reporting this issue! It looks like Socket.IO (internally using Engine.IO) has a default maximum frame size of 1,000,000 bytes. When we change it in

core.sio = sio = socketio.AsyncServer(async_mode='asgi', cors_allowed_origins='*', json=json)

to something like

core.sio = sio = socketio.AsyncServer(async_mode='asgi', cors_allowed_origins='*', json=json,
                                      max_http_buffer_size=10_000_000)

your example works. So we need to think about whether to somehow detect an overflow and warn the user about it, or to provide a configuration option.
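To make the limit concrete, here is a minimal sketch that estimates whether a textarea value alone would fit within the buffer. The helper names `payload_size` and `fits_in_buffer` are hypothetical and not part of NiceGUI or Socket.IO, and the estimate ignores the event envelope that the real message adds, so the actual frame is somewhat larger.

```python
import json

# Default limit discussed above (Engine.IO's max_http_buffer_size), in bytes.
DEFAULT_MAX_HTTP_BUFFER_SIZE = 1_000_000


def payload_size(value: str) -> int:
    """Approximate number of bytes needed to send `value` as a JSON string."""
    return len(json.dumps(value).encode('utf-8'))


def fits_in_buffer(value: str, limit: int = DEFAULT_MAX_HTTP_BUFFER_SIZE) -> bool:
    """Return True if the encoded value alone stays within `limit` bytes."""
    return payload_size(value) <= limit


small = 'A' * 999_000
large = '123' + 'A' * 1_000_000  # the value from the reproduction script

print(fits_in_buffer(small))                    # True
print(fits_in_buffer(large))                    # False
print(fits_in_buffer(large, limit=10_000_000))  # True
```

A check like this only covers the value itself, so it should be read as a lower bound on the frame size rather than an exact figure.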

What do you think?

@falkoschindler falkoschindler added the help wanted Extra attention is needed label Jul 31, 2024
@Xtreemrus
Author

Hi @falkoschindler,

Thank you for your reply. I'm glad that the core issue has been identified.

While I'm not a professional developer, it seems the problem consists of two parts:

  1. Providing developers with a way to catch and identify the issue. An exception should be raised when the buffer overflow occurs.

  2. After notifying the developer about the error, we need to offer a way to fix it.

Possible solutions:
For point 2, it's relatively straightforward and could potentially be addressed by adding an extra parameter to ui.run.

The solution for point 1 is less obvious, at least from my perspective. Since nothing is propagated to the backend during overflow, we could potentially add a JS script to each ui.input field where overflow might occur. This script would check text.value.length, and if it exceeds max_http_buffer_size (which could be passed as a constant to JS at site startup), it would trigger an exception on the Python side.

Simply setting max_http_buffer_size to a very large number and hoping users won't exceed it seems less ideal, as we wouldn't have visibility into how often this event occurs.

In an ideal scenario, it would be great if all input fields where possible had an option to specify the maximum number of characters allowed. If a developer sets an input character count higher than max_http_buffer_size, they would receive a warning during ui.run. This wouldn't be an error, but a caution that it could occur. Additionally, giving developers the ability to call a custom function when buffer overflow happens could help them implement user-friendly messages like "Warning: This field is limited to 1 million characters."

These approaches would provide more robust error handling and improve the developer and end-user experience.
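The two ideas above (a startup warning and an overflow callback) could look roughly like the following sketch. It is written in Python for illustration only, although the runtime check would probably have to live in client-side JavaScript. `check_field_limit`, `guard_value`, and the `on_overflow` callback are hypothetical names, not existing NiceGUI APIs, and the 4-bytes-per-character figure is a worst-case UTF-8 assumption.

```python
import warnings
from typing import Callable

MAX_HTTP_BUFFER_SIZE = 1_000_000  # assumed server setting, in bytes


def check_field_limit(max_chars: int, buffer_size: int = MAX_HTTP_BUFFER_SIZE) -> bool:
    """Warn (without failing) if a field may hold more data than the buffer accepts."""
    worst_case_bytes = max_chars * 4  # UTF-8 uses at most 4 bytes per character
    if worst_case_bytes > buffer_size:
        warnings.warn(
            f'field allows {max_chars} characters (up to {worst_case_bytes} bytes), '
            f'but max_http_buffer_size is {buffer_size} bytes'
        )
        return False
    return True


def guard_value(value: str, buffer_size: int, on_overflow: Callable[[int], None]) -> bool:
    """Call `on_overflow` with the encoded size if `value` exceeds the buffer."""
    size = len(value.encode('utf-8'))
    if size > buffer_size:
        on_overflow(size)
        return False
    return True


# Example: notify instead of failing silently.
guard_value('A' * 2_000_000, MAX_HTTP_BUFFER_SIZE,
            lambda size: print(f'Warning: input of {size} bytes exceeds the limit'))
```

The startup check deliberately warns rather than raises, matching the suggestion that an oversized field limit is a caution and not an error.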

@me21
Contributor

me21 commented Aug 4, 2024

According to the engineio sources, the AsyncSocket class raises an exception if it receives too much data:

    async def _websocket_handler(self, ws):
        """Engine.IO handler for websocket transport."""
        async def websocket_wait():
            data = await ws.wait()
            if data and len(data) > self.server.max_http_buffer_size:
                raise ValueError('packet is too large')
            return data

This exception is silently caught elsewhere.
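As a toy model of that behavior (not engineio's actual control flow, which this sketch does not attempt to reproduce), the following shows how a size check that raises can go unnoticed when the caller catches the exception, and how recording it would make the drop visible. `handle` and the `log` list are hypothetical.

```python
import asyncio

MAX_HTTP_BUFFER_SIZE = 1_000_000  # assumed limit, in bytes


async def websocket_wait(data: bytes) -> bytes:
    """Mimic the size check quoted above on an already received packet."""
    if data and len(data) > MAX_HTTP_BUFFER_SIZE:
        raise ValueError('packet is too large')
    return data


async def handle(data: bytes, log: list) -> bool:
    """Return True if the packet was processed, False if it was dropped."""
    try:
        await websocket_wait(data)
    except ValueError as e:
        # Without this line the drop would be invisible to the developer.
        log.append(str(e))
        return False
    return True


log = []
print(asyncio.run(handle(b'x' * 10, log)))         # True
print(asyncio.run(handle(b'x' * 1_000_001, log)))  # False
print(log)                                         # ['packet is too large']
```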

@me21
Contributor

me21 commented Aug 4, 2024

How about core.sio = sio = socketio.AsyncServer(async_mode='asgi', cors_allowed_origins='*', json=json, max_http_buffer_size=float('inf'))? Would it break something?

@falkoschindler
Contributor

> How about core.sio = sio = socketio.AsyncServer(async_mode='asgi', cors_allowed_origins='*', json=json, max_http_buffer_size=float('inf'))? Would it break something?

I guess this would allow any client to send arbitrarily large packets to the server, which could exhaust the memory and kill the server.
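A small sketch of why an infinite limit removes the protection entirely: a comparison of the form `len(data) > limit`, as in the engineio snippet quoted earlier, can never be true when `limit` is `float('inf')`. The helper `is_too_large` is hypothetical and only mirrors that comparison.

```python
def is_too_large(data: bytes, limit: float) -> bool:
    """Mirror the size comparison from the quoted engineio check."""
    return bool(data) and len(data) > limit


packet = b'x' * 5_000_000

print(is_too_large(packet, 1_000_000))     # True
print(is_too_large(packet, float('inf')))  # False
```

A large but finite value keeps the check meaningful while still raising the ceiling.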

> This exception is silently caught elsewhere.

That's bad. But where could this be? Here is the line emitting the socket message:

await core.sio.emit(message_type, data, room=target_id)

Any exception caught in loop() should be logged to the console. 🤔
