flaky CI tests (maybe term.reset related) #5184

Open
jerch opened this issue Oct 5, 2024 · 2 comments
Labels
area/build (Regarding the build process); type/bug (Something is misbehaving); type/debt (Technical debt that could slow us down in the long run)

Comments

jerch (Member) commented Oct 5, 2024

The CI sometimes produces test failures for no obvious reason. These failures are not reproducible locally and seem to be more likely when the CI machine is under heavy load.

Example:
[image: CI output of the weblinks tests]

That's from the weblinks tests, where all neighboring tests take a moderate ~130 ms, while one test runs into a timeout. These tests poll for a certain change to happen in the DOM, but for some reason that change never happened.

Our Playwright test modules all run in the same page on the same terminal instance, but separate tests with a

  test.beforeEach(async () => {
    await ctx.page.evaluate(`
      window.term.reset();
      window._some_addon?.dispose();
      window._some_addon = new SomeAddon();
      window.term.loadAddon(window._some_addon);
    `);
  });

to reset the terminal and the tested addon to their initial state. This made me suspect that something might be off with the reset handling here. The suspicion is further backed by the fact that introducing a wait after term.reset solves the issue:

  test.beforeEach(async () => {
    await ctx.page.evaluate(`
      window.term.reset();
      window._linkaddon?.dispose();
    `);
    await timeout(10);
    await ctx.page.evaluate(`
      window._linkaddon = new WebLinksAddon();
      window.term.loadAddon(window._linkaddon);
    `);
  });

jerch (Member, Author) commented Oct 5, 2024

Some digging into term.reset reveals that it is indeed not 100% synchronous. Repro:

  • run the demo
  • generate some terminal output, e.g. run ls
  • open the browser console
  • run term.reset(); while (true) {}
    --> the terminal is not cleared before entering the busy loop, so some queued task must be involved

Is it a microtask?

  • do the same as above, but run term.reset(); Promise.resolve().then(() => {while (true) {}}) instead
    --> nope, the terminal is still not cleared

Is it a macrotask?

  • do the same as above, but run term.reset(); setTimeout(() => {while (true) {}},0) instead
    --> yep, the terminal gets cleared before entering the busy loop (Edit: that's still not quite correct; as I found out below, requestAnimationFrame is the real culprit)

So yes, we have the infamous nextTick issue here, which many Node.js devs should know, but with a pending task on the browser's macrotask queue.
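The ordering that the repro steps probe can be illustrated in isolation. This is only a minimal sketch, not code from xterm.js: `recordSchedulingOrder` is a made-up name, and the sketch covers just promise microtasks and setTimeout macrotasks (requestAnimationFrame is left out, since it only exists in a browser).

```typescript
// Records the order in which differently scheduled callbacks run.
// Hypothetical helper for illustration only.
function recordSchedulingOrder(): Promise<string[]> {
  const order: string[] = [];
  return new Promise<string[]>(resolve => {
    // Macrotask: runs in a later turn of the event loop.
    setTimeout(() => {
      order.push('macrotask (setTimeout)');
      resolve(order);
    }, 0);
    // Microtask: runs right after the current synchronous code finishes,
    // before any pending macrotask.
    Promise.resolve().then(() => {
      order.push('microtask (promise)');
    });
    // Plain synchronous code runs first.
    order.push('sync');
  });
}

recordSchedulingOrder().then(order => console.log(order.join(' -> ')));
// prints "sync -> microtask (promise) -> macrotask (setTimeout)"
```

Work that a function defers to a macrotask is therefore invisible to anything that only waits for a microtask, which matches the second repro step.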

So why is this a problem during test execution?
Because of the way we use it, tests get chained on the microtask queue under the hood:

  await beforeEach();
  await test1();
  await beforeEach();
  await test2();
  ...

So term.reset() in beforeEach schedules its cleanup as a macrotask, but that task is not awaited on the microtask queue itself (promises are microtasks, setTimeout callbacks are macrotasks). The microtask queue will therefore happily progress without the output cleanup ever being called.
Adding the timeout above helps here, since it introduces a wait condition as a macrotask by relying on setTimeout:

export async function timeout(ms: number): Promise<void> {
  return new Promise<void>(r => setTimeout(r, ms));
}
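The effect described above can be reproduced with a toy model. This is a sketch under loud assumptions: `FakeTerminal` and `readAfterReset` are invented for illustration, and the deferred cleanup is modeled with setTimeout, which is not necessarily how term.reset schedules its work.

```typescript
// Toy stand-in for a terminal whose reset() defers the visible cleanup.
// Hypothetical class, not the xterm.js implementation.
class FakeTerminal {
  public rendered = 'old output';

  public reset(): void {
    // Assumption: the visible cleanup happens in a later macrotask.
    setTimeout(() => {
      this.rendered = '';
    }, 0);
  }
}

function timeout(ms: number): Promise<void> {
  return new Promise<void>(r => setTimeout(r, ms));
}

// Resets a fresh terminal and reports what is "rendered" afterwards.
// Without waitMs only a microtask is awaited, like a plain promise chain.
async function readAfterReset(waitMs?: number): Promise<string> {
  const term = new FakeTerminal();
  term.reset();
  if (waitMs === undefined) {
    await Promise.resolve(); // microtask only: the cleanup has not run yet
  } else {
    await timeout(waitMs);   // macrotask wait: the cleanup runs first
  }
  return term.rendered;
}

readAfterReset().then(r => console.log(JSON.stringify(r)));   // prints "old output"
readAfterReset(10).then(r => console.log(JSON.stringify(r))); // prints ""
```

In this model the stale output is only observed when nothing in the chain yields to the macrotask queue, which is the situation the timeout in beforeEach works around.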

Solution:
The best solution would be to make term.reset fully synchronous, i.e. to clean up the output with sync code.

The second-best solution is to fix all Playwright tests with a timeout and to stop advertising term.reset as fully synchronous.

jerch (Member, Author) commented Oct 5, 2024

So the issue is in fact a bit more complicated. I added this test snippet to an addon test:

  test.describe.only('buggy', async () => {
    const logs: string[] = [];
    const MAX = 19;
    new Promise<string[]>(r => {
      for (let i = 0; i <= MAX; ++i) {
        test(''+i, async () => {
          const content: string = await ctx.page.evaluate(`document.querySelector('.xterm-rows > div').innerHTML`);
          logs.push(content.slice(0, 50));
          if (i === MAX) r(logs);
        });
      }
    }).then(logs => console.log(logs));
  });

and ran the test with

$> yarn test-integration --workers=50% --suite=addon-image

If resetting were perfectly synchronous, I'd expect this result for every browser:

[
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>'
]

which means that the DOM representation of the terminal buffer contains only one cell for the cursor.

What I actually get from Chromium (and to a much lesser degree from WebKit; Firefox seems fine locally):

[
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '',
  '<span> </span>', '',
  '<span> </span>', '<span> </span>',
  '<span> </span>', '',
  '<span> </span>', '<span> </span>',
  '',               '<span> </span>',
  '<span> </span>', ''
]

So the reset is sometimes not finished on certain browsers. It turns out that it is not setTimeout that fixes it, but requestAnimationFrame as the wait condition in beforeEach:

  test.beforeEach(async () => {
    await ctx.page.evaluate(`
    window.__f = async () => {
      window.term.reset();
      return new Promise(r => requestAnimationFrame(r));
    }
    window.__f();
    `);
  });

This now reliably fixes the reset handling between tests, even under high load (tested locally up to a load of 20).
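
If this wait ends up in every test module, it could be factored into a shared helper. The following is only a sketch: `resetTerminal`, `EvaluatingPage` and `RESET_AND_WAIT_FOR_FRAME` are hypothetical names, and the structural interface is an assumption that stands in for the one part of Playwright's page object used here (evaluate with a string expression, which waits for a returned promise).

```typescript
// Minimal structural type for the part of a Playwright page that is used here.
// Assumption: the real page object is compatible with this shape.
interface EvaluatingPage {
  evaluate(expression: string): Promise<unknown>;
}

// Page-side script: reset the terminal, then resolve on the next animation frame.
const RESET_AND_WAIT_FOR_FRAME = `
  (() => {
    window.term.reset();
    return new Promise(resolve => requestAnimationFrame(resolve));
  })()
`;

// Hypothetical helper: resets the terminal, waits one frame, then optionally
// runs a per-module setup script (e.g. re-creating the tested addon).
async function resetTerminal(page: EvaluatingPage, setupScript: string = ''): Promise<void> {
  await page.evaluate(RESET_AND_WAIT_FOR_FRAME);
  if (setupScript) {
    await page.evaluate(setupScript);
  }
}

// Possible usage in a test module (not executed here):
//   test.beforeEach(async () => {
//     await resetTerminal(ctx.page, `
//       window._linkaddon?.dispose();
//       window._linkaddon = new WebLinksAddon();
//       window.term.loadAddon(window._linkaddon);
//     `);
//   });
```

Keeping the frame wait in one place would make it easy to remove again, should term.reset become fully synchronous.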
