Message ID | 20220613171258.1905715-4-alex.bennee@linaro.org |
---|---|
State | Superseded |
Series | testing/next pre-PR (docker, gitlab, tcg) |
On 6/13/22 10:12, Alex Bennée wrote:
> From: Thomas Huth <thuth@redhat.com>
>
> The job definitions recently got a second "variables:" section by
> accident and thus are failing now if one tries to run them. Merge
> the two sections into one again to fix the issue.
>
> And while we're at it, bump the timeout here (70 minutes are currently
> not enough for the aarch64 job). The jobs are marked as manual anyway,
> so if the user starts them, they want to see their result for sure and
> then it's annoying if the job timeouts too early.
>
> Fixes: e312d1fdbb ("gitlab: convert build/container jobs to .base_job_template")
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> Acked-by: Richard Henderson <richard.henderson@linaro.org>
> Message-Id: <20220603124809.70794-1-thuth@redhat.com>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  .gitlab-ci.d/buildtest.yml | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
>
> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> index 544385f5be..cb7cad44b5 100644
> --- a/.gitlab-ci.d/buildtest.yml
> +++ b/.gitlab-ci.d/buildtest.yml
> @@ -357,16 +357,15 @@ build-cfi-aarch64:
>        --enable-safe-stack --enable-slirp=git
>      TARGETS: aarch64-softmmu
>      MAKE_CHECK_ARGS: check-build
> -  timeout: 70m
> -  artifacts:
> -    expire_in: 2 days
> -    paths:
> -      - build
> -  variables:
>      # FIXME: This job is often failing, likely due to out-of-memory problems in
>      # the constrained containers of the shared runners. Thus this is marked as
>      # skipped until the situation has been solved.
>      QEMU_JOB_SKIPPED: 1
> +  timeout: 90m
> +  artifacts:
> +    expire_in: 2 days
> +    paths:
> +      - build

FWIW, 90 minutes was close, but insufficient:

https://gitlab.com/qemu-project/qemu/-/jobs/2584472225

But certainly, let us fix the job definition:

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~

On 13/06/2022 23.46, Richard Henderson wrote:
> On 6/13/22 10:12, Alex Bennée wrote:
>> From: Thomas Huth <thuth@redhat.com>
>>
>> The job definitions recently got a second "variables:" section by
>> accident and thus are failing now if one tries to run them. Merge
>> the two sections into one again to fix the issue.
>>
>> And while we're at it, bump the timeout here (70 minutes are currently
>> not enough for the aarch64 job). The jobs are marked as manual anyway,
>> so if the user starts them, they want to see their result for sure and
>> then it's annoying if the job timeouts too early.
>>
>> Fixes: e312d1fdbb ("gitlab: convert build/container jobs to
>> .base_job_template")
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> Acked-by: Richard Henderson <richard.henderson@linaro.org>
>> Message-Id: <20220603124809.70794-1-thuth@redhat.com>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>>  .gitlab-ci.d/buildtest.yml | 22 ++++++++++------------
>>  1 file changed, 10 insertions(+), 12 deletions(-)
>>
>> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
>> index 544385f5be..cb7cad44b5 100644
>> --- a/.gitlab-ci.d/buildtest.yml
>> +++ b/.gitlab-ci.d/buildtest.yml
>> @@ -357,16 +357,15 @@ build-cfi-aarch64:
>>        --enable-safe-stack --enable-slirp=git
>>      TARGETS: aarch64-softmmu
>>      MAKE_CHECK_ARGS: check-build
>> -  timeout: 70m
>> -  artifacts:
>> -    expire_in: 2 days
>> -    paths:
>> -      - build
>> -  variables:
>>      # FIXME: This job is often failing, likely due to out-of-memory
>> problems in
>>      # the constrained containers of the shared runners. Thus this is
>> marked as
>>      # skipped until the situation has been solved.
>>      QEMU_JOB_SKIPPED: 1
>> +  timeout: 90m
>> +  artifacts:
>> +    expire_in: 2 days
>> +    paths:
>> +      - build
>
> FWIW, 90 minutes was close, but insufficient:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/2584472225

Hmm, it was working at least once for me while I was working on the patch.
But as I already wrote here:

https://lists.gnu.org/archive/html/qemu-devel/2022-06/msg00463.html

I think nobody really used this build-cfi-aarch64 in month ... so we should
maybe have a try with the 90 min timeout first (maybe the CI servers were
just a little bit overloaded when you tried), but if the test continues to
hit the 90 minutes timeout, I'd say we rather delete it instead of bumping
the timeout even further. 90 minutes are really very close to the pain level
already - at least for me.

> But certainly, let us fix the job definition:
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Thanks!

 Thomas

On Tue, Jun 14, 2022 at 06:30:47AM +0200, Thomas Huth wrote:
> On 13/06/2022 23.46, Richard Henderson wrote:
> > On 6/13/22 10:12, Alex Bennée wrote:
> > > From: Thomas Huth <thuth@redhat.com>
> > >
> > > The job definitions recently got a second "variables:" section by
> > > accident and thus are failing now if one tries to run them. Merge
> > > the two sections into one again to fix the issue.
> > >
> > > And while we're at it, bump the timeout here (70 minutes are currently
> > > not enough for the aarch64 job). The jobs are marked as manual anyway,
> > > so if the user starts them, they want to see their result for sure and
> > > then it's annoying if the job timeouts too early.
> > >
> > > Fixes: e312d1fdbb ("gitlab: convert build/container jobs to
> > > .base_job_template")
> > > Signed-off-by: Thomas Huth <thuth@redhat.com>
> > > Acked-by: Richard Henderson <richard.henderson@linaro.org>
> > > Message-Id: <20220603124809.70794-1-thuth@redhat.com>
> > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> > > ---
> > >  .gitlab-ci.d/buildtest.yml | 22 ++++++++++------------
> > >  1 file changed, 10 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> > > index 544385f5be..cb7cad44b5 100644
> > > --- a/.gitlab-ci.d/buildtest.yml
> > > +++ b/.gitlab-ci.d/buildtest.yml
> > > @@ -357,16 +357,15 @@ build-cfi-aarch64:
> > >        --enable-safe-stack --enable-slirp=git
> > >      TARGETS: aarch64-softmmu
> > >      MAKE_CHECK_ARGS: check-build
> > > -  timeout: 70m
> > > -  artifacts:
> > > -    expire_in: 2 days
> > > -    paths:
> > > -      - build
> > > -  variables:
> > >      # FIXME: This job is often failing, likely due to
> > > out-of-memory problems in
> > >      # the constrained containers of the shared runners. Thus this
> > > is marked as
> > >      # skipped until the situation has been solved.
> > >      QEMU_JOB_SKIPPED: 1
> > > +  timeout: 90m
> > > +  artifacts:
> > > +    expire_in: 2 days
> > > +    paths:
> > > +      - build
> >
> > FWIW, 90 minutes was close, but insufficient:
> >
> > https://gitlab.com/qemu-project/qemu/-/jobs/2584472225
>
> Hmm, it was working at least once for me while I was working on the patch.
> But as I already wrote here:
>
> https://lists.gnu.org/archive/html/qemu-devel/2022-06/msg00463.html
>
> I think nobody really used this build-cfi-aarch64 in month ... so we should
> maybe have a try with the 90 min timeout first (maybe the CI servers were
> just a little bit overloaded when you tried), but if the test continues to
> hit the 90 minutes timeout, I'd say we rather delete it instead of bumping
> the timeout even further. 90 minutes are really very close to the pain level
> already - at least for me.

The CFI jobs seem to massively slow down and timeout waaaaaaay more
often than any other job. I've seen the CFI jobs run successfully in
45 minutes, and yet they frequently take so long that they can't even
complete in double that.

CFI is certainly slower at compile but not in a non-deterministic
manner that would randomly double compilation time.

I would be willing to blame CI overload if all our other jobs were
showing similar magnitude of slow down, but AFAIK, they are not
showing this.

I worry that there are genuine problems with the CFI builds that
result in non-deterministic runtime problems in functional testing.
IOW not merely running slowly, but genuine hang

With regards,
Daniel

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 544385f5be..cb7cad44b5 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -357,16 +357,15 @@ build-cfi-aarch64:
       --enable-safe-stack --enable-slirp=git
     TARGETS: aarch64-softmmu
     MAKE_CHECK_ARGS: check-build
-  timeout: 70m
-  artifacts:
-    expire_in: 2 days
-    paths:
-      - build
-  variables:
     # FIXME: This job is often failing, likely due to out-of-memory problems in
     # the constrained containers of the shared runners. Thus this is marked as
     # skipped until the situation has been solved.
     QEMU_JOB_SKIPPED: 1
+  timeout: 90m
+  artifacts:
+    expire_in: 2 days
+    paths:
+      - build
 
 check-cfi-aarch64:
   extends: .native_test_job_template
@@ -398,16 +397,15 @@ build-cfi-ppc64-s390x:
       --enable-safe-stack --enable-slirp=git
     TARGETS: ppc64-softmmu s390x-softmmu
     MAKE_CHECK_ARGS: check-build
-  timeout: 70m
-  artifacts:
-    expire_in: 2 days
-    paths:
-      - build
-  variables:
     # FIXME: This job is often failing, likely due to out-of-memory problems in
     # the constrained containers of the shared runners. Thus this is marked as
     # skipped until the situation has been solved.
     QEMU_JOB_SKIPPED: 1
+  timeout: 80m
+  artifacts:
+    expire_in: 2 days
+    paths:
+      - build
 
 check-cfi-ppc64-s390x:
   extends: .native_test_job_template
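
A minimal sketch of the failure mode the patch addresses, using hypothetical jobs (the names `example-job-broken` and `example-job-fixed` and their values are illustrative assumptions, not copied from the QEMU tree, and keys such as `script:` are omitted). YAML requires the keys of a mapping to be unique, so a job that carries two `variables:` keys is not a well-formed definition; the commit message reports that the affected jobs failed when started. The second job shows the merged form, which is the shape the diff above restores.

```yaml
# Hypothetical jobs for illustration only; names and values are assumptions,
# not copied from .gitlab-ci.d/buildtest.yml.

# Broken: "variables" appears twice in the same job mapping.
example-job-broken:
  variables:
    TARGETS: aarch64-softmmu
  timeout: 70m
  variables:
    QEMU_JOB_SKIPPED: 1

# Fixed: a single "variables" section holds both entries,
# and "timeout" sits at job level after it.
example-job-fixed:
  variables:
    TARGETS: aarch64-softmmu
    QEMU_JOB_SKIPPED: 1
  timeout: 90m
```

Keeping all variables under one key also keeps the FIXME comment next to `QEMU_JOB_SKIPPED`, which is how the patch lays it out.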