Can we remove llvmbb from IRC?


Hubert Tong via cfe-dev
Hi,

llvmbb's job is to inform people of build breaks. However, it seems to trigger for a big list of bots, and at least one of them seems to always be broken, and the broken bots tend to have cycle times of several hours. So if you're on IRC and you commit something, you get pinged by llvmbb for hours afterwards.

Does anyone think llvmbb is useful?

The best thing I've heard about llvmbb is that it's easy to just "/ignore llvmbb", but if that's what everybody does, then why not just not have it in the first place?

Nico

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev

+cfe-dev again 😊

 

From: Keane, Erich
Sent: Tuesday, September 1, 2020 12:10 PM
To: Nico Weber <[hidden email]>; llvm-dev <[hidden email]>
Subject: RE: [cfe-dev] Can we remove llvmbb from IRC?

 

Check out the #llvm-build channel.  In the main IRC channel, I block llvmbb, then just also listen in for my name in #llvm-build.  The bot has a different name there, so it is still possible to block it in #llvm.

 




Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev
Hi Nico,

On Tue, Sep 01, 2020 at 03:07:37PM -0400, Nico Weber via cfe-dev wrote:

> Hi,
>
> llvmbb's job is to inform people of build breaks. However, it seems to trigger
> for a big list of bots, and at least one of them seems to always be broken, and
> the broken bots tend to have cycle times of several hours. So if you're on IRC
> and you commit something, you get pinged by llvmbb for hours afterwards.
>
> Does anyone think llvmbb is useful?
>
> The best thing about llvmbb I've heard it's easy to just "/ignore llvmbb", but
> if that's what everybody does then why not not have it in the first place?

I find it useful to /ignore llvmbb *and* /join #llvm-build


Re: [llvm-dev] Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev
Hi,

> On Sep 1, 2020, at 20:10, Serge Guelton via llvm-dev <[hidden email]> wrote:
>
> Hi Nico,
>
> On Tue, Sep 01, 2020 at 03:07:37PM -0400, Nico Weber via cfe-dev wrote:
>> Hi,
>>
>> llvmbb's job is to inform people of build breaks. However, it seems to trigger
>> for a big list of bots, and at least one of them seems to always be broken, and
>> the broken bots tend to have cycle times of several hours. So if you're on IRC
>> and you commit something, you get pinged by llvmbb for hours afterwards.
>>
>> Does anyone think llvmbb is useful?
>>
>> The best thing about llvmbb I've heard it's easy to just "/ignore llvmbb", but
>> if that's what everybody does then why not not have it in the first place?
>
> I find it useful to /ignore llvmbb it *and* /join #llvm-build


That’s great, I wasn’t aware of the #llvm-build channel.

If we already have a dedicated #llvm-build channel, IMO it would make sense to remove llvmbb from the main channel, to cut down on the noise. It does not seem to add much because the false-positive rate is high and people who are interested can use #llvm-build.

Cheers,
Florian

Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev
On Tue, Sep 1, 2020 at 12:07 PM Nico Weber via cfe-dev <[hidden email]> wrote:
Hi,

llvmbb's job is to inform people of build breaks. However, it seems to trigger for a big list of bots, and at least one of them seems to always be broken,

If a bot is always broken it shouldn't be sending email/notifications - generally they are configured only to send email on green>red and red>green transitions, so if it's already broken you shouldn't be blamed for it. If you are seeing bot spam or emails from a bot that's already red, please email llvm-dev and the bot maintainer and ask the bot to be reconfigured or disabled.
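
The transition-only behaviour described above can be sketched roughly as follows. This is only an illustration, not the actual buildbot or zorg configuration; the function name and the "green"/"red" status strings are hypothetical.

```python
from typing import Optional


def should_notify(previous: Optional[str], current: str) -> bool:
    """Return True only when a builder's status flips between green and red.

    `previous` is None when there is no earlier build to compare against.
    """
    if previous is None:
        # Assumption: with no history, only a failing first build is reported.
        return current == "red"
    return previous != current


# A builder that stays red should ping once, not on every later build.
history = ["green", "red", "red", "red", "green"]
pings = [
    should_notify(prev, cur)
    for prev, cur in zip([None] + history[:-1], history)
]
print(pings)  # [False, True, False, False, True]
```

Under this rule a bot that is already red stays silent until it goes green again, which is why notifications from an always-red bot point to a misconfiguration.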

If a bot is regularly flakey (& thus sending email/notifications that are false-positives/that no one can act on) please also send email asking for the bot to be reconfigured or disabled. (or, if you want to be a bit more punchy - send a patch to the zorg repository to have the bot disabled & explain why you're proposing that)
 
and the broken bots tend to have cycle times of several hours.

Long cycle times are a real problem, though that might be best left to another discussion about buildbot maintenance. I would be for a policy that says bot windows shouldn't be longer than, say, an hour or maybe less (so, e.g., if you have a bot that's just going to take 5 hours to run, then you need 5 machines that each pick up work every hour, so the blame lists are smaller). This doesn't solve the problem of being notified 5 hours later about a breakage that was caused by someone else who committed a few minutes before or after you. Solving that problem will require a much greater investment in infrastructure to chain buildbots, possibly use built artefacts from one buildbot to another, etc.
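
The arithmetic behind the 5-hour example can be written out as a small sketch. It assumes identical machines that start builds at evenly staggered intervals; both function names are made up for illustration.

```python
import math


def machines_needed(build_hours: float, max_window_hours: float) -> int:
    """Machines required so that a new build starts every `max_window_hours`."""
    return math.ceil(build_hours / max_window_hours)


def worst_case_latency_hours(build_hours: float, window_hours: float) -> float:
    """Longest wait from commit to result: a commit that lands just after a
    build starts waits one full window, then one full build."""
    return window_hours + build_hours


print(machines_needed(5, 1))           # 5 machines for a 5-hour bot
print(worst_case_latency_hours(5, 1))  # still 6 hours until the result
```

Staggering shrinks the blame list to one hour of commits, but the result still arrives hours after the commit, which is the part that needs the larger infrastructure investment.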
 
So if you're on IRC and you commit something, you get pinged by llvmbb for hours afterwards.

Does anyone think llvmbb is useful?

I sometimes find it useful, but happy to move to llvm-build to get those notifications. Other folks might not know to do that, though.
 
The best thing about llvmbb I've heard it's easy to just "/ignore llvmbb", but if that's what everybody does then why not not have it in the first place?

Nico

Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev
On Tue, Sep 1, 2020 at 3:32 PM David Blaikie <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 12:07 PM Nico Weber via cfe-dev <[hidden email]> wrote:
Hi,

llvmbb's job is to inform people of build breaks. However, it seems to trigger for a big list of bots, and at least one of them seems to always be broken,

If a bot is always broken it shouldn't be sending email/notifications - generally they are configured only to send email on green>red and red>green transitions, so if it's already broken you shouldn't be blamed for it. If you are seeing bot spam or emails from a bot that's already red, please email llvm-dev and the bot maintainer and ask the bot to be reconfigured or disabled.

If a bot is regularly flakey (& thus sending email/notifications that are false-positives/that no one can act on) please also send email asking for the bot to be reconfigured or disabled. (or, if you want to be a bit more punchy - send a patch to the zorg repository to have the bot disabled & explain why you're proposing that)

I agree with this in the abstract, but I get pinged completely reliably at least twice after every single one of my commits. This isn't something that sometimes happens, it's something that always happens.
 
 

Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev


On Tue, Sep 1, 2020 at 12:42 PM Nico Weber <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 3:32 PM David Blaikie <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 12:07 PM Nico Weber via cfe-dev <[hidden email]> wrote:
Hi,

llvmbb's job is to inform people of build breaks. However, it seems to trigger for a big list of bots, and at least one of them seems to always be broken,

If a bot is always broken it shouldn't be sending email/notifications - generally they are configured only to send email on green>red and red>green transitions, so if it's already broken you shouldn't be blamed for it. If you are seeing bot spam or emails from a bot that's already red, please email llvm-dev and the bot maintainer and ask the bot to be reconfigured or disabled.

If a bot is regularly flakey (& thus sending email/notifications that are false-positives/that no one can act on) please also send email asking for the bot to be reconfigured or disabled. (or, if you want to be a bit more punchy - send a patch to the zorg repository to have the bot disabled & explain why you're proposing that)

I agree with this in the abstract, but I get pinged completely reliably at least twice after every single of my commits. This isn't something that sometimes happens, it's something that always happens.

Could you point to specific buildbots/email when that comes up to help improve things both on IRC and email/mailing lists, etc?
  

Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev
On Tue, Sep 1, 2020 at 3:57 PM David Blaikie <[hidden email]> wrote:


On Tue, Sep 1, 2020 at 12:42 PM Nico Weber <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 3:32 PM David Blaikie <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 12:07 PM Nico Weber via cfe-dev <[hidden email]> wrote:
Hi,

llvmbb's job is to inform people of build breaks. However, it seems to trigger for a big list of bots, and at least one of them seems to always be broken,

If a bot is always broken it shouldn't be sending email/notifications - generally they are configured only to send email on green>red and red>green transitions, so if it's already broken you shouldn't be blamed for it. If you are seeing bot spam or emails from a bot that's already red, please email llvm-dev and the bot maintainer and ask the bot to be reconfigured or disabled.

If a bot is regularly flakey (& thus sending email/notifications that are false-positives/that no one can act on) please also send email asking for the bot to be reconfigured or disabled. (or, if you want to be a bit more punchy - send a patch to the zorg repository to have the bot disabled & explain why you're proposing that)

I agree with this in the abstract, but I get pinged completely reliably at least twice after every single of my commits. This isn't something that sometimes happens, it's something that always happens.

Could you point to specific buildbots/email when that comes up to help improve things both on IRC and email/mailing lists, etc?

Just land a change :) Or look at IRC scrollback. Given how easy it is to find these problems, it doesn't seem like there's a lot of appetite for improving this. Hence me asking about removing llvmbb (...and so far everyone seems to be in favor).

In this case, from my IRC scrollback (there are more people on the blamelist, spread over several follow-on IRC messages):

build #13975 of clang-ppc64le-linux-multistage is complete: Failure [failed ninja check 1]  Build details are at http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13975  blamelist: LLVM GN Syncbot <[hidden email]>, Nico Weber <[hidden email]>

build #24132 of clang-with-thin-lto-ubuntu is complete: Failure [failed test-stage1-compiler]  Build details are at http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132  blamelist: Nico Weber <[hidden email]>, Matt Arsenault <[hidden email]>, Eric Astor <[hidden email]>, Craig Topper <[hidden email]>, Alina

 build #2255 of lld-x86_64-win is complete: Failure [failed test-check-all]  Build details are at http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2255  blamelist: LLVM GN Syncbot <[hidden email]>, Eric Astor <[hidden email]>, Craig Topper <[hidden email]>, Alina Sbirlea <[hidden email]>, Nico Weber <[hidden email]>, Amara
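
The messages above follow a regular shape, so checking whether a bot was already red can be partly automated. The sketch below is an assumption-laden illustration: the regular expression is inferred only from the three messages quoted here, and `previous_build_url` is a hypothetical helper, not part of any buildbot tooling.

```python
import re

# Pattern inferred from the quoted llvmbb messages; other messages may differ.
MESSAGE_RE = re.compile(
    r"build #(?P<number>\d+) of (?P<builder>\S+) is complete: "
    r"(?P<result>\w+) \[(?P<reason>[^\]]*)\]\s+"
    r"Build details are at (?P<url>\S+)"
)


def previous_build_url(message: str) -> str:
    """Return the URL of the build just before the one named in `message`."""
    match = MESSAGE_RE.search(message)
    if match is None:
        raise ValueError("not an llvmbb build message")
    base, _ = match["url"].rsplit("/", 1)
    return f"{base}/{int(match['number']) - 1}"


msg = (
    "build #13975 of clang-ppc64le-linux-multistage is complete: "
    "Failure [failed ninja check 1]  Build details are at "
    "http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13975"
    "  blamelist: LLVM GN Syncbot <[hidden email]>, Nico Weber <[hidden email]>"
)
print(previous_build_url(msg))
```

Opening the previous build shows whether the failure is a fresh green-to-red transition or a bot that was red all along.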

I also got email with pointers to:

Chances are that there's something genuinely broken somewhere (maybe compiler-rt?), but asking for concrete bots distracts from the point that there's something broken on every single commit, so in practice the bot just lets you know that you committed something in the last few hours.
 
  

Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev
I assume you're getting emails in addition to the chat spam? Or are these bots sending chat spam but not email? If that's the case, yeah, I'd rather have a consistent notification experience, and disable all notifications from a bot if some notifications are disabled (e.g., if it's not good enough to be sending email, then it shouldn't be spamming the IRC channel either).

On Tue, Sep 1, 2020 at 1:20 PM Nico Weber <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 3:57 PM David Blaikie <[hidden email]> wrote:


On Tue, Sep 1, 2020 at 12:42 PM Nico Weber <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 3:32 PM David Blaikie <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 12:07 PM Nico Weber via cfe-dev <[hidden email]> wrote:
Hi,

llvmbb's job is to inform people of build breaks. However, it seems to trigger for a big list of bots, and at least one of them seems to always be broken,

If a bot is always broken it shouldn't be sending email/notifications - generally they are configured only to send email on green>red and red>green transitions, so if it's already broken you shouldn't be blamed for it. If you are seeing bot spam or emails from a bot that's already red, please email llvm-dev and the bot maintainer and ask the bot to be reconfigured or disabled.

If a bot is regularly flakey (& thus sending email/notifications that are false-positives/that no one can act on) please also send email asking for the bot to be reconfigured or disabled. (or, if you want to be a bit more punchy - send a patch to the zorg repository to have the bot disabled & explain why you're proposing that)

I agree with this in the abstract, but I get pinged completely reliably at least twice after every single of my commits. This isn't something that sometimes happens, it's something that always happens.

Could you point to specific buildbots/email when that comes up to help improve things both on IRC and email/mailing lists, etc?

Just land a change :) Or look at IRC scrollback. Given how easy it is to find these problems, it doesn't seem like there's a lot of appetite for improving this.

I think there's appetite for changing it in some way - no one enjoys the current state of things. But often people assume it's not changeable, whereas I think it is - and I think it's important that it be changed because if we silence all the bots, then quality is likely to go down. Silencing the IRC bot may still be good - folks should be getting buildbot fail email which is more targeted and not spamming the channel for people who aren't to blame (heck, the bots could send private messages instead, I guess?).

But improving signal/noise should benefit the email, and the bot spam (whichever channel it's in).
 
Hence me asking about removing llvmbb (...and so far everyone seems to be in favor).

In this case, from my IRC scrollback (there's more people on the blamelist, spread over several follow-on IRC messages):

build #13975 of clang-ppc64le-linux-multistage is complete: Failure [failed ninja check 1]  Build details are at http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13975  blamelist: LLVM GN Syncbot <[hidden email]>, Nico Weber <[hidden email]>

That doesn't look like the "always be broken" case. It was green on the build prior to this one ( http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13974 ) 

Looks like the buildbot triggered correctly, only took the 2 revisions you committed. The test did pass at the prior revision and did fail at that revision - perhaps either the buildbot or the test is flakey? (Interestingly, the test failed in stage 1 at 13975, then failed in stage 2 at 13976, then passed again in 13977. Both failures were for the same reason: "/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/tools/clang/test/Driver/Output/target-override.c.script: line 5: /home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/tools/clang/test/Driver/Output/testbin/i386-clang: No such file or directory" - perhaps some problem with creating the symlink?)

Started an llvm-dev thread to discuss that separately in more detail.
 
build #24132 of clang-with-thin-lto-ubuntu is complete: Failure [failed test-stage1-compiler]  Build details are at http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132  blamelist: Nico Weber <[hidden email]>, Matt Arsenault <[hidden email]>, Eric Astor <[hidden email]>, Craig Topper <[hidden email]>, Alina

Also green on the prior build ( http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24131 ).
Went green again after a revert here: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24140 which matches the commit that made the bot go red - so this looks to be a bot doing what it's meant to do. (varying levels of quality, and 2 hour cycle time isn't ideal by any means, though it found this failure in 5 minutes once it started (but that could be 2 hours after a commit))

What do you think we should do with bots like this? Should long cycle time/long blame list bots (not always the same thing) produce no notifications, and require them to be triaged by the bot owner who then manually sends email/follow-up once a rough guess of blame has been made & checked that it hasn't already been possibly diagnosed, discussed and fixed due to a faster bot or other means?
 
build #2255 of lld-x86_64-win is complete: Failure [failed test-check-all]  Build details are at http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2255  blamelist: LLVM GN Syncbot <[hidden email]>, Eric Astor <[hidden email]>, Craig Topper <[hidden email]>, Alina Sbirlea <[hidden email]>, Nico Weber <[hidden email]>, Amara

Also green on the prior build ( http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2254 ), and went back to green on the following build.
Possibly this was related to the same commit/revert as in the previous bot in this list. It's a fairly fast bot, went red on a build including the revision that committed the xor issue, and green on the next build that included a revert of that patch. I couldn't say for sure, though.


Was red for a few builds then green again here: http://green.lab.llvm.org/green/job/clang-stage1-RA/14183/

Looks like the build that went red and the build that went green (& the fact that the failure was related to libfuzzer) correlates well with this commit: https://github.com/llvm/llvm-project/commit/2665425908e00618074e42155ec922a37f7c9002 and this revert: https://github.com/llvm/llvm-project/commit/7139736261e047e9cca030e2ee5912bf2a16f816
 
Chances are that there's something genuinely broken somewhere (maybe compiler-rt?), but asking for concrete bots distracts from the point that there's something broken on every single commit, which makes the bot just let you know that you committed something in the last few hours.

They also contain information about failures - yeah, they might not be yours, but they are often/usually someone's, not just flakey bot failures. If you're suggesting all the bots are unactionable - then perhaps we should turn off all notifications on all of them? I have certainly considered that - and then only enabling bots that are fast/high signal-to-noise/small blame list. Though I imagine that's a bigger discussion.
   

Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev
On Tue, Sep 1, 2020 at 9:13 PM David Blaikie <[hidden email]> wrote:
I assume you're getting emails in addition to the chat spam? Or are you not/are these bots sending chat spam but not email? If that's the case, yeah, I'd rather have a consistent notification experience - and disable all notifications from a bot if some notifications are disabled (eg: if it's not good enough to be sending email, then it shouldn't be spamming the IRC channel either)

I received a single email for the greendragon bot. The rest was IRC only. (The greendragon bot didn't send an IRC ping I think.)
 

On Tue, Sep 1, 2020 at 1:20 PM Nico Weber <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 3:57 PM David Blaikie <[hidden email]> wrote:


On Tue, Sep 1, 2020 at 12:42 PM Nico Weber <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 3:32 PM David Blaikie <[hidden email]> wrote:
On Tue, Sep 1, 2020 at 12:07 PM Nico Weber via cfe-dev <[hidden email]> wrote:
Hi,

llvmbb's job is to inform people of build breaks. However, it seems to trigger for a big list of bots, and at least one of them seems to always be broken,

If a bot is always broken it shouldn't be sending email/notifications - generally they are configured only to send email on green>red and red>green transitions, so if it's already broken you shouldn't be blamed for it. If you are seeing bot spam or emails from a bot that's already red, please email llvm-dev and the bot maintainer and ask the bot to be reconfigured or disabled.

If a bot is regularly flakey (& thus sending email/notifications that are false-positives/that no one can act on) please also send email asking for the bot to be reconfigured or disabled. (or, if you want to be a bit more punchy - send a patch to the zorg repository to have the bot disabled & explain why you're proposing that)

I agree with this in the abstract, but I get pinged completely reliably at least twice after every single of my commits. This isn't something that sometimes happens, it's something that always happens.

Could you point to specific buildbots/email when that comes up to help improve things both on IRC and email/mailing lists, etc?

Just land a change :) Or look at IRC scrollback. Given how easy it is to find these problems, it doesn't seem like there's a lot of appetite for improving this.

I think there's apetite for changing it in some way - no one enjoys the current state of things. But often people assume it's not changeable, whereas I think it is - and I think it's important that it be changed because if we silence all the bots, then quality is likely to go down. Silencing the IRC bot may still be good - folks should be getting buildbot fail email which is more targeted and not spamming the channel for people who aren't to blame (heck, the bots could send private messages instead, I guess?).

But improving signal/noise should benefit the email, and the bot spam (whichever channel it's in).
 
Hence me asking about removing llvmbb (...and so far everyone seems to be in favor).

In this case, from my IRC scrollback (there's more people on the blamelist, spread over several follow-on IRC messages):

build #13975 of clang-ppc64le-linux-multistage is complete: Failure [failed ninja check 1]  Build details are at http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13975  blamelist: LLVM GN Syncbot <[hidden email]>, Nico Weber <[hidden email]>

That doesn't look like the "always be broken" case. It was green on the build prior to this one ( http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/13974 ) 

Looks like the buildbot triggered correctly, only took the 2 revisions you committed. The test did pass at the prior revision and did fail at that revision - perhaps either the buildbot or the test is flakey? (interestingly the test failed in stage 1 at 13975, then failed in stage 2 at 13976 - then passed again in 13977. Both failures for the same reason "/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/tools/clang/test/Driver/Output/target-override.c.script: line 5: /home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/tools/clang/test/Driver/Output/testbin/i386-clang: No such file or directory" - perhaps some problem with creating the symlink?

Started an llvm-dev thread to discuss that separately in more detail.
 
build #24132 of clang-with-thin-lto-ubuntu is complete: Failure [failed test-stage1-compiler]  Build details are at http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24132  blamelist: Nico Weber <[hidden email]>, Matt Arsenault <[hidden email]>, Eric Astor <[hidden email]>, Craig Topper <[hidden email]>, Alina

Also green on the prior build ( http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24131 ).
Went green again after a revert here: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/24140 which matches the commit that made the bot go red - so this looks to be a bot doing what it's meant to do. (varying levels of quality, and 2 hour cycle time isn't ideal by any means, though it found this failure in 5 minutes once it started (but that could be 2 hours after a commit))

What do you think we should do with bots like this? Should long cycle time/long blame list bots (not always the same thing) produce no notifications, and require them to be triaged by the bot owner who then manually sends email/follow-up once a rough guess of blame has been made & checked that it hasn't already been possibly diagnosed, discussed and fixed due to a faster bot or other means?

My personal opinion is that bots that take more than an hour to cycle shouldn't send any notifications.
 
 
build #2255 of lld-x86_64-win is complete: Failure [failed test-check-all]  Build details are at http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2255  blamelist: LLVM GN Syncbot <[hidden email]>, Eric Astor <[hidden email]>, Craig Topper <[hidden email]>, Alina Sbirlea <[hidden email]>, Nico Weber <[hidden email]>, Amara

Also green on the prior build ( http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/2254 ), and went back to green on the following build.
Possibly this was related to the same commit/revert as in the previous bot in this list. It's a fairly fast bot, went red on a build including the revision that committed the xor issue, and green on the next build that included a revert of that patch. I couldn't say for sure, though.


Was red for a few builds then green again here: http://green.lab.llvm.org/green/job/clang-stage1-RA/14183/

Looks like the build that went red and the build that went green (& the fact that the failure was related to libfuzzer) correlates well with this commit: https://github.com/llvm/llvm-project/commit/2665425908e00618074e42155ec922a37f7c9002 and this revert: https://github.com/llvm/llvm-project/commit/7139736261e047e9cca030e2ee5912bf2a16f816
 
Chances are that there's something genuinely broken somewhere (maybe compiler-rt?), but asking for concrete bots distracts from the point that there's something broken on every single commit, which makes the bot just let you know that you committed something in the last few hours.

They also contain information about failures - yeah, they might not be yours, but they are often/usually someone's, not just flakey bot failures. If you're suggesting all the bots are unactionable - then perhaps we should turn off all notifications on all of them? I have certainly considered that - and then only enabling bots that are fast/high signal-to-noise/small blame list. Though I imagine that's a bigger discussion.
   
and the broken bots tend to have cycle times of several hours.

Long cycle times are a real problem - that might be best left to another discussion about buildbot maintenance - I would be for a policy that says bot windows shouldn't be longer than, say, an hour or maybe less (so, e.g., if you have a bot that's just going to take 5 hours to run, then you need 5 machines that each pick up work every hour, so the blame lists are smaller). This doesn't solve the problem of being notified 5 hours later about a breakage that was caused by someone else who committed a few minutes before or after you. Solving that problem will require a much greater investment in infrastructure to chain buildbots, possibly use built artefacts from one buildbot to another, etc.
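To make the "5 machines that each pick up work every hour" arithmetic concrete, here is a minimal sketch. It is not Buildbot code: the helper names, the steady commit rate, and the 10-hour window are all assumptions chosen for illustration. It assumes each build blames every commit that landed since the previous build started.

```python
from bisect import bisect_right


def blame_lists(commit_times, build_starts):
    """Assign each commit to the first build that starts at or after it."""
    commit_times = sorted(commit_times)
    lists = []
    prev = 0
    for start in sorted(build_starts):
        idx = bisect_right(commit_times, start)
        lists.append(commit_times[prev:idx])
        prev = idx
    return lists


def max_blame(commit_times, build_hours, machines, horizon_hours):
    """Largest blame list when `machines` workers stagger their starts evenly."""
    interval = build_hours / machines  # hours between consecutive build starts
    count = int(horizon_hours / interval)
    starts = [interval * k for k in range(1, count + 1)]
    return max(len(b) for b in blame_lists(commit_times, starts))


# Hypothetical load: one commit every 15 minutes over a 10-hour window.
commits = [0.25 * i for i in range(1, 41)]

one_machine = max_blame(commits, build_hours=5, machines=1, horizon_hours=10)
five_machines = max_blame(commits, build_hours=5, machines=5, horizon_hours=10)
print(one_machine, five_machines)
```

Under these assumptions the single machine blames 20 commits per build and the five staggered machines blame 4 each. The delay before a failure is reported is still the full 5-hour build time in both cases, which is the part staggering does not fix.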
 
So if you're on IRC and you commit something, you get pinged by llvmbb for hours afterwards.

Does anyone think llvmbb is useful?

I sometimes find it useful, but happy to move to llvm-build to get those notifications. Other folks might not know to do that, though.
 
The best thing about llvmbb I've heard it's easy to just "/ignore llvmbb", but if that's what everybody does then why not not have it in the first place?

Nico

Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev
In reply to this post by Hubert Tong via cfe-dev
I'm not on IRC anymore, so my opinion matters less, but I think it's time to shut it down. It is a relic from a different time when there were fewer bots, fewer contributors, and more IRC users. These days it generates too many notifications and the audience isn't as well targeted.


Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev
Fair enough - it is very noisy.

I'm surprised these aren't all producing corresponding emails, though (at least those only go to the people on the blame list - but that seemed to be the case Nico was citing - though the general "I'm not even to blame but this is adding a lot of noise to the channel/making it hard to have conversations" is a broad/broader problem). And still seems important to improve the signal/noise ratio.

Sent https://reviews.llvm.org/D87100 to do that. Also looking into the IRC vs. email notification configuration. Hopefully those configurations can be unified. I don't think it's any more appropriate to send IRC notifications than email notifications. (I guess now that the IRC notifications will be fairly opt-in, maybe - but I still would rather there be less noise to make the signal stand out, so if a bot isn't producing accurate enough info to send mail, then maybe not IRC either.)

On Thu, Sep 3, 2020 at 8:47 AM Reid Kleckner via cfe-dev <[hidden email]> wrote:
I'm not on IRC anymore, so my opinion matters less, but I think it's time to shut it down. It is a relic from a different time when there were fewer bots, fewer contributors, and more IRC users. These days it generates too many notifications and the audience isn't as well targeted.


Re: Can we remove llvmbb from IRC?

Hubert Tong via cfe-dev
Thanks, David, for proposing the patch.

> but I still would rather there be less noise to make the signal stand out, so if a bot isn't producing accurate enough info to send mail, then maybe not IRC either

Fair enough.

The IRC notifier is a bot, so if somebody is interested in notifications from a particular bot they could subscribe to those. And by default we could send notifications to #llvm-build only from faster bots with shorter blame lists. That would discriminate against heavier and slower bots, but it seems those get buried under noise anyway.
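As a rough illustration of that default, the selection could look like the sketch below. The bot names, the numbers, and both thresholds are hypothetical (the one-hour limit echoes the figure floated earlier in the thread; the blame-list limit is purely an assumption), and none of this is taken from the actual buildbot configuration.

```python
from dataclasses import dataclass


@dataclass
class Bot:
    name: str
    cycle_minutes: int       # typical time for one full build
    typical_blame_size: int  # typical number of commits per build


# Hypothetical thresholds for "fast, short blame list".
MAX_CYCLE_MINUTES = 60
MAX_BLAME_SIZE = 10


def notifies_by_default(bot):
    """True if the bot is fast and precise enough to notify the channel."""
    return (bot.cycle_minutes <= MAX_CYCLE_MINUTES
            and bot.typical_blame_size <= MAX_BLAME_SIZE)


# Made-up bots, not real builders.
bots = [
    Bot("fast-bot", cycle_minutes=20, typical_blame_size=3),
    Bot("slow-bot", cycle_minutes=120, typical_blame_size=25),
    Bot("fast-but-batched-bot", cycle_minutes=45, typical_blame_size=30),
]

default_channel = [b.name for b in bots if notifies_by_default(b)]
print(default_channel)
```

Bots that fail the check would still be available to anyone who explicitly subscribes to them; they just would not post to the channel by default.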

David has already said this, but I want to repeat it: the bots report only state changes. If a bot is red, it does not report the failures that follow the first one until it goes green again. If anyone sees otherwise, please let me know so I can troubleshoot.
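For readers unfamiliar with that behaviour, it can be sketched in a few lines. This is only an illustration of the rule described above, not the notifier's actual code; the function name, the "green"/"red" strings, and the assumption that the bot starts out green are all made up for the example.

```python
def state_change_notifications(results, initial="green"):
    """Return (build_index, message) pairs only where the state flips."""
    notifications = []
    previous = initial  # assumption: the bot was green before the first build
    for index, result in enumerate(results):
        if result != previous:
            notifications.append((index, f"{previous} -> {result}"))
        previous = result
    return notifications


history = ["green", "red", "red", "red", "green", "red"]
print(state_change_notifications(history))
# [(1, 'green -> red'), (4, 'red -> green'), (5, 'green -> red')]
```

The two repeated red builds in the middle produce no message, so a long-broken bot should generate one notification when it breaks and one when it recovers.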

Thanks

Galina


On Thu, Sep 3, 2020 at 10:34 AM David Blaikie <[hidden email]> wrote:
Fair enough - it is very noisy.

I'm surprised these aren't all producing corresponding emails, though (at least those only go to the people on the blame list - but that seemed to be the case Nico was citing - though the general "I'm not even to blame but this is adding a lot of noise to the channel/making it hard to have conversations" is a broad/broader problem). And still seems important to improve the signal/noise ratio.

Sent https://reviews.llvm.org/D87100 to do that. Also looking into the IRC vs. email notification configuration. Hopefully those configurations can be unified. I don't think it's any more appropriate to send IRC notifications than email notifications. (I guess now that the IRC notifications will be fairly opt-in, maybe - but I still would rather there be less noise to make the signal stand out, so if a bot isn't producing accurate enough info to send mail, then maybe not IRC either.)
