LLVM Releases: Upstream vs. Downstream / Distros


LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
Folks,

There has been enough discussion about keeping development downstream
and how painful that is. Even more so, I think we all agree, is having
downstream releases while tracking upstream releases, trunk and other
branches (ex. Android).

I have proposed "en passant" a few times already, but now I'm going to
do it to a wider audience:

Shall we sync our upstream release with the bulk of other downstream
ones as well as OS distributions?


This work involves a *lot* of premises that are not encoded yet, so
we'll need a lot of work from all of us. But from the recent problems
with GCC abi_tag and the arduous job of downstream release managers to
know which patches to pick, I think there has been a lot of wasted
effort by everyone, and that generates stress, conflicts, etc.

I'd also like to reinforce the basic idea of software engineering: to
use the same pattern for the same problem. So, if we have one way to
link sequences of patches and merge them upstream, we ought to use the
same pattern (preferably the same scripts) downstream, too. Of course
there will be differences, but we should treat them as the exception,
not the rule.

So, a list of things will need to be solved to get to a low waste
release process:


  1. Timing

Many downstream release managers, as well as distro maintainers have
complained about the timing of our releases, and how unreliable they
are, and how that makes it hard for them to plan their own branches,
cherry-picks and merges. If we release too early, they miss out on
important optimisations; if we release too late, they'll have to branch
"just before" and risk having to back-port late fixes to their own
modified trees.

Products that rely on LLVM already have their own life cycles and we
can't change that. Nor can we make all downstream products align to
our quasi-chaotic release process. However, the importance of the
upstream release for upstream developers is *a lot* lower than for the
downstream maintainers, so unless the latter group puts their weight
(and effort) in the upstream process, little is going to happen to
help them.

A few (random) ideas:

 * Do an average on all product cycles, pick the least costly time to
release. This would marginalise those beyond the first sigma and we'd
make their lives much harder than those within one sigma.
 * Do the same average on the projects that are willing to lend a
serious hand to the upstream release process. This has the same
problem, but it's based on actual effort. It does concentrate bias on
the better funded projects, but it's also easier for low key projects
to change their release schedules.
 * Try to release more often. The current cost of a release is high,
but if we managed to lower it (by having more people, more automation,
shared efforts), then it could be feasible and much fairer than
weighted averages.
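
As a rough illustration of the first two ideas, the sketch below scores
every candidate upstream release day by the total time downstream
projects would then wait for their own next release, and picks the
cheapest day. Everything in it is an assumption made for the example:
the project names, the days of the year, the 30-day integration lead
time, and the idea of weighting projects by how much they contribute.

```python
# Hypothetical sketch: pick the upstream release day that minimises the
# total (optionally weighted) wait until each downstream project's release.
# All project names, days and weights below are invented for illustration.

YEAR = 365


def wait_days(upstream_day, downstream_day, min_lead=30):
    """Days from the upstream release to the next downstream release that
    is at least `min_lead` days later (time needed to integrate)."""
    gap = (downstream_day - upstream_day) % YEAR
    if gap < min_lead:
        gap += YEAR  # too close: downstream has to wait for its next cycle
    return gap


def best_release_day(downstream, weights=None):
    """Return (day, cost) minimising the weighted sum of waits."""
    weights = weights or {name: 1.0 for name in downstream}

    def cost(day):
        return sum(weights[n] * wait_days(day, d)
                   for n, d in downstream.items())

    day = min(range(YEAR), key=cost)
    return day, cost(day)


# Downstream release days expressed as day-of-year (made-up values).
downstream = {"distro_a": 100, "distro_b": 120, "product_c": 300}
day, cost = best_release_day(downstream)
```

Passing a `weights` dictionary gives the second variant, where only
projects that help with the release count (or count more). The outlier
here, product_c, is the one that ends up waiting longest, which is the
marginalisation problem described above.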


  2. Process

Our release process is *very* lean, and that's what makes it
quasi-chaotic. In the beginning, not many people / companies wanted to
help or cared about the releases, so the process was whatever the person
doing the release did. The major release process is now better defined, but the
same happened to the minor releases.

For example, we have no defined date to start, or to end. We have no
assigned people to do the official releases, or test the supported
targets. We still rely on voluntary work from all parties. That's ok
when the release is just "a point in time", but if downstream releases
and OS distributions start relying on our releases, we really should
get a bit more professional.

A few (random) ideas:

 * We should have predictable release times, both for starting it and
finishing it. There will be complications, but we should treat them as
the exception, not the rule.
 * We should have appointed members of the community that would be
responsible for those releases, in the same way we have code owners
(volunteers, but no less responsible), so that we can guarantee a
consistent validation across all relevant targets. This goes beyond
x86/ARM/MIPS/PPC and includes the other targets like AMD, NVidia, BPF,
etc.
 * The upstream release should be, as much as possible, independent of
which OS they run on. OS specific releases should be done in the
distributions themselves, and people interested should raise the
concern in their own communities.
 * Downstream managers should be an integral part of the upstream
release process. Whenever the release manager sends the email, they
should test on their end and reply with GREEN/RED flags.
 * Downstream managers should also propose back-ports that are
important to them in the upstream release. It's likely that a fix is
important to a number of downstream releases but not many people
upstream (since we're all using trunk). So, unless they tell us, we
won't know.
 * OS distribution managers should test on their builds, too. I know
FreeBSD and Mandriva build by default with Clang. I know that Debian
has an experimental build. I know that RedHat and Ubuntu have LLVM
packages that they do care about. All that has to be tested *at least* every
major release, but hopefully on all releases. (those who already do
that, thank you!)
 * A number of upstream groups, or downstream releases that don't
track upstream releases, should *also* test them on their own
workloads. Doing so will bring the upstream release to a much better
quality level, and in turn, allow those projects to use it on their
own internal releases.
 * Every *new* bug found in any of those downstream tests should be
reported in Bugzilla with the appropriate category (critical / major /
minor). All major bugs have to be closed for the release to be out,
etc. (the specific process will have to be agreed and documented).
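
The GREEN/RED replies and the bug-severity rule could be combined into a
simple release gate. The sketch below is only an illustration: the data
layout, the tester names and the choice to treat both critical and major
bugs as blocking are assumptions, and the real rules would have to be
agreed and documented as noted above.

```python
# Hypothetical sketch of a release gate: ship only when every downstream
# tester has replied GREEN and no critical/major bug is still open.
# The data layout and the blocking-severity set are assumptions.

BLOCKING = {"critical", "major"}


def release_blockers(flags, bugs):
    """flags: {tester: "GREEN" | "RED"}.
    bugs: list of dicts with keys 'id', 'severity' and 'open'.
    Returns a list of human-readable reasons not to release yet."""
    blockers = [f"{who} reported RED"
                for who, flag in sorted(flags.items())
                if flag != "GREEN"]
    blockers += [f"bug {b['id']} ({b['severity']}) is still open"
                 for b in bugs
                 if b["open"] and b["severity"] in BLOCKING]
    return blockers
```

An empty list means the release candidate can go out; anything else is
the list to post in reply to the release manager's email.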


  3. Automation

As explained in the timing and process sections, automation is key to
reducing costs for all parties. We should collate the encoded process
we have upstream with the process projects have downstream, and
convert upstream everything that we can / is relevant.

For example, finding which patches revert / fix another one that was
already cherry-picked is a common task that should be identical for
everyone. A script that would sweep the commit logs, looking for
clues, would be useful to everyone.
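
A minimal sketch of such a sweep, assuming commit messages refer to
earlier revisions as "rNNNNNN" and that words like "revert" or "fix" are
the clues worth looking for. The keyword list, the log format and the
revision numbers are all invented for the example.

```python
import re

# Hypothetical sketch: flag trunk commits whose message mentions a revision
# that was already cherry-picked to the release branch. The "rNNNNNN"
# reference style and the keyword list are assumptions, not project policy.

REV_REF = re.compile(r"\br(\d{4,})\b")
KEYWORDS = re.compile(
    r"\b(revert|reverts|reverting|fix|fixes|follow-up)\b", re.I)


def find_followups(trunk_log, cherry_picked):
    """trunk_log: iterable of (revision, message) pairs.
    cherry_picked: set of revision numbers already on the branch.
    Returns a list of (revision, referenced_revisions, reason)."""
    hits = []
    for rev, msg in trunk_log:
        refs = {int(r) for r in REV_REF.findall(msg)} & cherry_picked
        if not refs:
            continue  # does not touch anything we back-ported
        m = KEYWORDS.search(msg)
        reason = m.group(1).lower() if m else "mentions"
        hits.append((rev, sorted(refs), reason))
    return hits


log = [
    (269100, "Add new pass"),
    (269150, "Revert r269001: breaks the ARM bots"),
    (269200, "Fix crash introduced in r269050"),
    (269300, "Refactor; see also r100000"),
]
followups = find_followups(log, cherry_picked={269001, 269050})
```

The output is only a list of candidates for a human to look at; a
message that reverts something without naming the revision would be
missed entirely.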

A few (random) ideas:

 * We should discuss the process, express the consensus on the
official documentation, and encode it in a set of scripts. It's a lot
easier to tell a developer "please do X because it helps our script
back-port your patch" than to say "please do X because it's nice" or
"do X because it's in the 'guideline'".
 * There's no way to force (via git-hook) developers to add a bugzilla
ID or a review number on the commit message (not all commits are
equal), so the scripts that scan commits will have to be smart enough,
but that'll create false-positives, and they can't commit without
human intervention. Showing why a commit wasn't picked up by the
script, or was erroneously picked up, is a good way to educate people.
 * We could have a somewhat-common interface with downstream releases,
so some scripts that they use could be upstreamed, if many of them
used the same entry point for testing, validating, building,
packaging.
 * We could have the scripts that distros use for building their own
packages in our tree, so they could maintain them locally and we'd
know which changes are happening and it would be much easier to warn the
others, common up the interface, etc.
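
To make the "show why a commit wasn't picked up" idea concrete, here is
a small sketch that classifies a commit message and always returns an
explanation along with the decision. It assumes bug IDs are written as
"PR12345" and reviews as a "Differential Revision: .../D12345" line; the
rule that only a bug ID makes a commit a candidate is invented for the
example.

```python
import re

# Hypothetical sketch: explain why a commit was or was not considered a
# back-port candidate. The patterns assume "PR12345" bug references and
# "Differential Revision: .../D12345" review lines; the classification
# rules themselves are invented for illustration.

BUG_ID = re.compile(r"\bPR(\d+)\b")
REVIEW = re.compile(r"Differential Revision:\s*\S*/D(\d+)")


def classify(message):
    """Return (picked, explanation) for one commit message."""
    bug = BUG_ID.search(message)
    review = REVIEW.search(message)
    if bug:
        return True, f"references bug PR{bug.group(1)}"
    if review:
        return False, (f"has review D{review.group(1)} but no bug ID; "
                       "needs a human")
    return False, "no bug ID or review number found"
```

Publishing the explanation next to each decision is what turns the
false positives and false negatives into feedback for the committer.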


In the end, we're a bunch of people from *very* different communities
doing similar work. In the spirit of open source, I'd like to propose
that we share the work and the responsibility of producing high
quality software with minimal waste.

I don't think anyone receiving this email disagrees with the statement
that we can't just take and not give back, and that being part of this
community means we may have to work harder than our employers would
think brings direct profit, so that they can profit even more
indirectly later, and with that, everyone that uses or depends on our
software.

My personal and very humble opinion is that coalescing the release
process will, in the long term, actually *save* us a lot of work, and
the quality will be increased. Even if we don't reach perfection, and
by no means do I think we will, at least we'll have something slightly
better. If anything, at least we tried.

I'd hate to continue doing an inefficient process without even trying
an alternative...

Comments?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
This is a long email :-) I've made some comments inline, but I'll
summarize my thoughts here:

- I like to think that the major releases have been shipped on a
pretty reliable six-month schedule lately. So we have that going for
us :-)

- It seems hard to align our upstream schedule to various downstream
preferences. One way would be to release much more often, but I don't
know if that's really desirable.

- I would absolutely like to see more involvement in the upstream
release processes from downstream folks and distros.

- I think we should use the bug tracker to capture issues that affect
releases. It would be cool if a commit hook could update bugzilla
entries that refer to it.

Cheers,
Hans


On Wed, May 11, 2016 at 7:08 AM, Renato Golin <[hidden email]> wrote:

> Folks,
>
> There has been enough discussion about keeping development downstream
> and how painful that is. Even more so, I think we all agree, is having
> downstream releases while tracking upstream releases, trunk and other
> branches (ex. Android).
>
> I have proposed "en passant" a few times already, but now I'm going to
> do it to a wider audience:
>
> Shall we sync our upstream release with the bulk of other downstream
> ones as well as OS distributions?
>
>
> This work involves a *lot* of premises that are not encoded yet, so
> we'll need a lot of work from all of us. But from the recent problems
> with GCC abi_tag and the arduous job of downstream release managers to
> know which patches to pick, I think there has been a lot of wasted
> effort by everyone, and that generates stress, conflicts, etc.
>
> I'd also like to reinforce the basic idea of software engineering: to
> use the same pattern for the same problem. So, if we have one way to
> link sequences of patches and merge them upstream, we ought to use the
> same pattern (preferably the same scripts) downstream, too. Of course
> there will be differences, but we should treat them as the exception,
> not the rule.
>
> So, a list of things will need to be solved to get to a low waste
> release process:
>
>
>   1. Timing
>
> Many downstream release managers, as well as distro maintainers have
> complained about the timing of our releases, and how unreliable they
> are, and how that makes it hard for them to plan their own branches,
> cherry-picks and merges. If we release too early, they miss out on
> important optimisations; if we release too late, they'll have to branch
> "just before" and risk having to back-port late fixes to their own
> modified trees.
>
> Products that rely on LLVM already have their own life cycles and we
> can't change that. Nor can we make all downstream products align to
> our quasi-chaotic release process. However, the importance of the
> upstream release for upstream developers is *a lot* lower than for the
> downstream maintainers, so unless the latter group puts their weight
> (and effort) in the upstream process, little is going to happen to
> help them.
>
> A few (random) ideas:
>
>  * Do an average on all product cycles, pick the least costly time to
> release. This would marginalise those beyond the first sigma and we'd
> make their lives much harder than those within one sigma.
>  * Do the same average on the projects that are willing to lend a
> serious hand to the upstream release process. This has the same
> problem, but it's based on actual effort. It does concentrate bias on
> the better funded projects, but it's also easier for low key projects
> to change their release schedules.
>  * Try to release more often. The current cost of a release is high,
> but if we managed to lower it (by having more people, more automation,
> shared efforts), then it could be feasible and much fairer than
> weighted averages.

My random thoughts:

At least for the major releases, I think we're doing pretty well on
timing in terms of predictability: since 3.6, we have released every
six months: first week of March and first week of September (+- a few
days). Branching has been similarly predictable: mid-January and
mid-July.
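
Written down as data, that cadence is small enough to publish and to
script against. The sketch below is only an approximation: the 15th and
the 1st are placeholders for "mid-month" and "first week", not official
dates.

```python
from datetime import date

# Hypothetical sketch: approximate the cadence described above. The exact
# days (15th for branching, 1st for release) are placeholders for
# "mid-month" and "first week"; they are not official dates.

MILESTONES = [(1, 15, "branch"), (3, 1, "release"),
              (7, 15, "branch"), (9, 1, "release")]


def next_milestone(today):
    """Return (date, kind) of the first milestone on or after `today`."""
    for year in (today.year, today.year + 1):
        for month, day, kind in MILESTONES:
            when = date(year, month, day)
            if when >= today:
                return when, kind
```

For example, `next_milestone(date(2016, 5, 11))` gives the mid-July
branch point under these placeholder dates.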

If there are many downstream releases for which shifting this schedule
would be useful, I suppose we could do that, but it seems unlikely
that there would be agreement on this, and changing the schedule is
disruptive for those who depend on it.

The only reasonable way I see of aligning upstream releases with
downstream schedules would be to release much more often. This works
well in Chromium where there's a 6-week staged release schedule. This
would mean there's always a branch going for the next release, and
important bug fixes would get merged to that. In Chromium we drive
this from the bug tracker -- it would be very hard to scan each commit
for things to cherry-pick. This kind of process has a high cost
though, there has to be good infrastructure for it (buildbots on the
branch for all targets, for example), developers have to be aware, and
even then it's a lot of work for those doing the releases. I'm not
sure we'd want to take this on. I'm also not sure it would be suitable
for a compiler, where we want the releases to have long life-time.


>   2. Process
>
> Our release process is *very* lean, and that's what makes it
> quasi-chaotic. In the beginning, not many people / companies wanted to
> help or cared about the releases, so the process was whatever the person
> doing the release did. The major release process is now better defined, but the
> same happened to the minor releases.
>
> For example, we have no defined date to start, or to end.

For the major releases, I've tried to do this. We could certainly
formalize it by posting it on the web page though.

> We have no
> assigned people to do the official releases, or test the supported
> targets. We still rely on voluntary work from all parties. That's ok
> when the release is just "a point in time", but if downstream releases
> and OS distributions start relying on our releases, we really should
> get a bit more professional.

Most importantly, those folks should get involved :-)

>
> A few (random) ideas:
>
>  * We should have predictable release times, both for starting it and
> finishing it. There will be complications, but we should treat them as
> the exception, not the rule.

SGTM, we pretty much already have this for major releases.

>  * We should have appointed members of the community that would be
> responsible for those releases, in the same way we have code owners
> (volunteers, but no less responsible), so that we can guarantee a
> consistent validation across all relevant targets. This goes beyond
> x86/ARM/MIPS/PPC and includes the other targets like AMD, NVidia, BPF,
> etc.

In practice, we kind of have this for at least some of the targets.
Maybe we should write this down somewhere instead of me asking for
(the same) volunteers each time the release process starts?

>  * The upstream release should be, as much as possible, independent of
> which OS they run on. OS specific releases should be done in the
> distributions themselves, and people interested should raise the
> concern in their own communities.
>  * Downstream managers should be an integral part of the upstream
> release process. Whenever the release manager sends the email, they
> should test on their end and reply with GREEN/RED flags.
>  * Downstream managers should also propose back-ports that are
> important to them in the upstream release. It's likely that a fix is
> important to a number of downstream releases but not many people
> upstream (since we're all using trunk). So, unless they tell us, we
> won't know.
>  * OS distribution managers should test on their builds, too. I know
> FreeBSD and Mandriva build by default with Clang. I know that Debian
> has an experimental build. I know that RedHat and Ubuntu have LLVM
> packages that they do care about. All that has to be tested *at least* every
> major release, but hopefully on all releases. (those who already do
> that, thank you!)
>  * A number of upstream groups, or downstream releases that don't
> track upstream releases, should *also* test them on their own
> workloads. Doing so will bring the upstream release to a much better
> quality level, and in turn, allow those projects to use it on their
> own internal releases.
>  * Every *new* bug found in any of those downstream tests should be
> reported in Bugzilla with the appropriate category (critical / major /
> minor). All major bugs have to be closed for the release to be out,
> etc. (the specific process will have to be agreed and documented).
>
>
>   3. Automation
>
> As explained in the timing and process sections, automation is key to
> reducing costs for all parties. We should collate the encoded process
> we have upstream with the process projects have downstream, and
> convert upstream everything that we can / is relevant.
>
> For example, finding which patches revert / fix another one that was
> already cherry-picked is a common task that should be identical for
> everyone. A script that would sweep the commit logs, looking for
> clues, would be useful to everyone.
>
> A few (random) ideas:
>
>  * We should discuss the process, express the consensus on the
> official documentation, and encode it in a set of scripts. It's a lot
> easier to tell a developer "please do X because it helps our script
> back-port your patch" than to say "please do X because it's nice" or
> "do X because it's in the 'guideline'".
>  * There's no way to force (via git-hook) developers to add a bugzilla
> ID or a review number on the commit message (not all commits are
> equal), so the scripts that scan commits will have to be smart enough,
> but that'll create false-positives, and they can't commit without
> human intervention. Showing why a commit wasn't picked up by the
> script, or was erroneously picked up, is a good way to educate people.
>  * We could have a somewhat-common interface with downstream releases,
> so some scripts that they use could be upstreamed, if many of them
> used the same entry point for testing, validating, building,
> packaging.
>  * We could have the scripts that distros use for building their own
> packages in our tree, so they could maintain them locally and we'd
> know which changes are happening and it would be much easier to warn the
> others, common up the interface, etc.
>
>
> In the end, we're a bunch of people from *very* different communities
> doing similar work. In the spirit of open source, I'd like to propose
> that we share the work and the responsibility of producing high
> quality software with minimal waste.
>
> I don't think anyone receiving this email disagrees with the statement
> that we can't just take and not give back, and that being part of this
> community means we may have to work harder than our employers would
> think brings direct profit, so that they can profit even more
> indirectly later, and with that, everyone that uses or depends on our
> software.
>
> My personal and very humble opinion is that coalescing the release
> process will, in the long term, actually *save* us a lot of work, and
> the quality will be increased. Even if we don't reach perfection, and
> by no means do I think we will, at least we'll have something slightly
> better. If anything, at least we tried.
>
> I'd hate to continue doing an inefficient process without even trying
> an alternative...
>
> Comments?
>
> cheers,
> --renato

Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
On 11 May 2016 at 17:16, Hans Wennborg <[hidden email]> wrote:
> This is a long email :-) I've made some comments inline, but I'll
> summarize my thoughts here:

Thanks Hans!

I'll respond to them inline, below.


> - I think we should use the bug tracker to capture issues that affect
> releases. It would be cool if a commit hook could update bugzilla
> entries that refer to it.

That seems like a simple hook.


> At least for the major releases, I think we're doing pretty well on
> timing in terms of predictability: since 3.6, we have released every
> six months: first week of March and first week of September (+- a few
> days). Branching has been similarly predictable: mid-January and
> mid-July.

Indeed, we got a lot better more recently (last 2y), and mostly thanks
to you. :)

We used to vary 3 months +-, and now we're down to a few days.
Whatever we decide, I think we should make it official by putting it
out somewhere, so people can rely on that.

Right now, even if you're extra awesome, there's nothing telling the
distros and LLVM-based products that it will be so if someone else
takes over the responsibility, so they can't adapt.

That's what I meant by "quasi-chaotic".


> If there are many downstream releases for which shifting this schedule
> would be useful, I suppose we could do that, but it seems unlikely
> that there would be agreement on this, and changing the schedule is
> disruptive for those who depend on it.

That's the catch. If we want them to participate, the process has to
have some meaning to them. The fact that not many people do tells me
clearly what it means to them.

We also need to know better how many other releases already depend on
the upstream process (not just Chromium, for obvious reasons), to be
able to make an informed choice of dates and frequency.

The more well positioned and frequent we are, the more people will
help, but there's a point where the curve bends down, and the cost is
just too great. We need to find the inflection point, and that will
require some initial investigations and guesses, and a lot of fine
tuning later. But if we're all on the same page, I think we can do
that, even if it takes time.

I'm particularly concerned with Android, because they not only have
their own tree with heavily modified LLVM components (ex.
Compiler-RT), but they also build differently and so their processes are
completely alien to ours. The key reasons why this happened are:

 * They couldn't rely on our releases, as fixing bugs and back-porting
wasn't a thing back then
 * They already had their own release schedule, so aligning with ours
brought no extra benefit
 * We always expected people to work off trunk, and everyone had to
create their own process

I don't want to change how people work, just to add one more valid way
of working, which is most stable for upstream releases. :)



> The only reasonable way I see of aligning upstream releases with
> downstream schedules would be to release much more often. This works
> well in Chromium where there's a 6-week staged release schedule. This
> would mean there's always a branch going for the next release, and
> important bug fixes would get merged to that.

Full validation every 6 weeks is just not possible. But a multiple of
that, say every 3~4 months, could be much easier to work around.



> In Chromium we drive
> this from the bug tracker -- it would be very hard to scan each commit
> for things to cherry-pick. This kind of process has a high cost
> though, there has to be good infrastructure for it (buildbots on the
> branch for all targets, for example), developers have to be aware, and
> even then it's a lot of work for those doing the releases. I'm not
> sure we'd want to take this on. I'm also not sure it would be suitable
> for a compiler, where we want the releases to have long life-time.

This works because you have a closed system. As you say, Chromium is
mostly a final product, not a tool to develop other products, and the
validation is a lot simpler.

With Clang, we'd want to involve external releases into it, and it
simply wouldn't scale.



> For the major releases, I've tried to do this. We could certainly
> formalize it by posting it on the web page though.

I think that'd be the first step, yes. But I wanted to start with a
good number. 2 times a year? Would 3 times improve things that much
for the outsiders? Or just moving the dates would be enough for most
people?

That's why I copied so many outsiders, so they can chime in and let us
know what would be good for *them*.


> Most importantly, those folks should get involved :-)

Indeed!


> In practice, we kind of have this for at least some of the targets.
> Maybe we should write this down somewhere instead of me asking for
> (the same) volunteers each time the release process starts?

I give consent to mark me as the ARM/AArch64 release tester for the
foreseeable future. :)

I can also help Sylvestre, Doko, Ed, Jeff, Bero etc. to test on their
systems running on ARM/AArch64 hardware.

cheers,
--renato

Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
On Wed, May 11, 2016 at 11:10 AM, Renato Golin <[hidden email]> wrote:
> We also need to know better how many other releases already depend on
> the upstream process (not just Chromium, for obvious reasons), to be
> able to do an informed choice of dates and frequency.

Just a small note: Chromium doesn't use the releases, but instead
picks a good revision of the trunk every other week or so.

Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
Hi,
for those who don't know me, I'm an AOSP developer at work and an OpenMandriva developer (including maintainer of its toolchains) outside of work, so I'm looking at this from (at least) 2 different perspectives and may jump around between them.
Replies inline...

On 11 May 2016 at 16:08, Renato Golin <[hidden email]> wrote:
> There has been enough discussion about keeping development downstream
> and how painful that is. Even more so, I think we all agree, is having
> downstream releases while tracking upstream releases, trunk and other
> branches (ex. Android).

In the OpenMandriva world, we usually try to have clang (our primary compiler) as close as possible to the latest upstream stable release.
We're currently following the release_38 branch, and expect to jump on trunk as soon as our distro release has happened (because we expect 3.9 to be ready before we'll make our subsequent release - better to get on the branch we'll be using for the next release early than to suddenly face problems when updating to the next release).

In the AOSP world, we obviously have to (somewhat) follow what Google does, which is typically pick a trunk snapshot and work from there - but we have some work underway to extract their patches so we can apply them on top of a release or snapshot of our choice (current thought is mostly nightly builds for testing).

> This work involves a *lot* of premises that are not encoded yet, so
> we'll need a lot of work from all of us. But from the recent problems
> with GCC abi_tag and the arduous job of downstream release managers to
> know which patches to pick, I think there has been a lot of wasted
> effort by everyone, and that generates stress, conflicts, etc.

From both perspectives, it would be great to have a common set of "known good" and relevant patches like gcc abi_tag, or fixes for bugs commonly encountered.
Ideally, I'd like to see those patches just backported on the release_38 branch to keep the number of external patches low.

gcc abi_tag is a bit of a headache in the OpenMandriva world: while we build just about everything with clang these days, it would of course be good to restore binary compatibility with the bigger distributions (almost all of which are using current gcc with the new abi enabled).

> Many downstream release managers, as well as distro maintainers have
> complained about the timing of our releases,

The timing has been quite predictable lately, but the website still says "TBD" for both 3.8.1 and 3.9.0, so communicating the (likely) plan could use some improvement.
 
>  * Do the same average on the projects that are willing to lend a
> serious hand to the upstream release process.

What can we (this time being OpenMandriva) do? We don't have any great compiler engineers, but we're heavy users. Would it help to run a mass build of all packages for all supported architectures (at this time: i586, x86_64, armv7hnl, aarch64) to detect errors on prerelease builds? We have the infrastructure in place, even showing a fairly nice list of failed builds along with build logs. (But of course there will be false positives caused by e.g. a library update that happened around the same time as the compiler update.)
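
One way to keep such false positives down would be to rebuild the same
package snapshot with both the current stable compiler and the
prerelease, and only report packages whose status changed. The sketch
below assumes build results are available as a simple name-to-status
mapping; the package names and the format are invented for illustration.

```python
# Hypothetical sketch: separate likely compiler regressions from noise in a
# mass rebuild. A package is suspicious only if it built with the current
# stable compiler but fails with the prerelease, on the same package set.
# Package names and the result format are invented for illustration.


def likely_regressions(stable_results, prerelease_results):
    """Each argument maps package name -> True (built) / False (failed).
    Returns sorted names that pass on stable but fail on the prerelease."""
    return sorted(
        pkg for pkg, ok in prerelease_results.items()
        if not ok and stable_results.get(pkg, False)
    )
```

Packages that already fail with the stable compiler, or that have no
stable result at all, are left out, so a library update that breaks a
build under both compilers does not show up as a compiler bug.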

>  * Try to release more often. The current cost of a release is high,
> but if we managed to lower it (by having more people, more automation,
> shared efforts), then it could be feasible and much fairer than
> weighted averages.

That would be a good idea IMO, we've run into "current trunk is much better than the last stable release anyway" situations more than once (in both projects).
 
> For example, we have no defined date to start, or to end. We have no
> assigned people to do the official releases, or test the supported
> targets. We still rely on voluntary work from all parties. That's ok
> when the release is just "a point in time", but if downstream releases
> and OS distributions start relying on our releases, we really should
> get a bit more professional.

Backporting some more fixes to the stable branches would be great too (but of course I realize that's a daunting and not very interesting task).
 
>  * Downstream managers should be an integral part of the upstream
> release process. Whenever the release manager sends the email, they
> should test on their end and reply with GREEN/RED flags.
>  * Downstream managers should also propose back-ports that are
> important to them in the upstream release. It's likely that a fix is
> important to a number of downstream releases but not many people
> upstream (since we're all using trunk). So, unless they tell us, we
> won't know.

Sounds good to me, volunteering to participate in both.

>  * We could have the scripts that distros use for building their own
> packages in our tree, so they could maintain them locally and we'd
> know which changes are happening and it would be much easier to warn the
> others, common up the interface, etc.

While interesting from an upstream perspective, I doubt that will happen reliably -- there are too many people working on the build scripts who would not automatically have write access to the tree etc. and most distro build farms rely on having the build scripts in a common place, so duplication would be unavoidable.

ttyl
bero 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
In reply to this post by Vassil Vassilev via cfe-dev
> -----Original Message-----
> From: Renato Golin [mailto:[hidden email]]
> Sent: Wednesday, May 11, 2016 11:11 AM
> To: Hans Wennborg
> Cc: LLVM Dev; Clang Dev; Quentin Colombet; Tom Stellard; Robinson, Paul;
> Jim Grosbach; Kristof Beyls; Frédéric Richez; Reid Kleckner; Philip
> Reames; Matthias Braun; Bernhard Rosenkränzer; Sylvestre Ledru; Matthias
> Klose; Stephen Hines; Jeff Law; Ed Maste; Behan Webster
> Subject: Re: LLVM Releases: Upstream vs. Downstream / Distros
>
> On 11 May 2016 at 17:16, Hans Wennborg <[hidden email]> wrote:
> > This is a long email :-) I've made some comments inline, but I'll
> > summarize my thoughts here:
>
> Thanks Hans!
>
> I'll respond to them inline, below.
>
>
> > - I think we should use the bug tracker to capture issues that affect
> > releases. It would be cool if a commit hook could update bugzilla
> > entries that refer to it.
>
> That seems like a simple hook.
>
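
To illustrate how small such a hook could be, here is a minimal sketch in Python. It assumes commit messages reference bugs as "PR" followed by the bug number; the function names and the comment format are hypothetical, and actually posting to the bug tracker is deliberately left out:

```python
import re

# Assumption: bugs are referenced in commit messages as "PR<number>".
PR_PATTERN = re.compile(r"\bPR(\d+)\b")


def referenced_bugs(commit_message):
    """Return the sorted, de-duplicated bug numbers mentioned in a commit message."""
    return sorted({int(number) for number in PR_PATTERN.findall(commit_message)})


def tracker_comment(revision, commit_message):
    """Build the text a hook could add to each referenced bug (hypothetical format)."""
    lines = commit_message.strip().splitlines()
    first_line = lines[0] if lines else ""
    return f"Mentioned in {revision}: {first_line}"


if __name__ == "__main__":
    message = "Fix crash in the parser (PR27000)\n\nAlso related to PR26999."
    for bug in referenced_bugs(message):
        print(bug, "<-", tracker_comment("r269000", message))
```

The hook itself would only need to call something like this for every new commit and hand the resulting comment to whatever interface the bug tracker offers.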
>
> > At least for the major releases, I think we're doing pretty well on
> > timing in terms of predictability: since 3.6, we have released every
> > six months: first week of March and first week of September (+- a few
> > days). Branching has been similarly predictable: mid-January and
> > mid-July.
>
> Indeed, we got a lot better more recently (last 2y), and mostly thanks
> to you. :)

Absolutely.  It's enough of a track record to allow reasonable planning.

>
> We used to vary 3 months +-, and now we're down to a few days.
> Whatever we decide, I think we should make it official by putting it
> out somewhere, so people can rely on that.

A public commitment to the future release schedule would be that much
more justification for planning to participate.

>
> Right now, even if you're extra awesome, there's nothing telling the
> distros and LLVM-based products that it will be so if someone else
> takes over the responsibility, so they can't adapt.
>
> That's what I meant by "quasi-chaotic".
>
>
> > If there are many downstream releases for which shifting this schedule
> > would be useful, I suppose we could do that, but it seems unlikely
> > that there would be agreement on this, and changing the schedule is
> > disruptive for those who depend on it.
>
> That's the catch. If we want them to participate, the process has to
> have some meaning to them. The fact that not many people do makes it
> clear to me what that means.
>
> We also need to know better how many other releases already depend on
> the upstream process (not just Chromium, for obvious reasons), to be
> able to make an informed choice of dates and frequency.
>
> The more well positioned and frequent we are, the more people will
> help, but there's a point where the curve bends down, and the cost is
> just too great. We need to find the inflection point, and that will
> require some initial investigations and guesses, and a lot of fine
> tuning later. But if we're all on the same page, I think we can do
> that, even if it takes time.
>
> I'm particularly concerned with Android, because they not only have
> their own tree with heavily modified LLVM components (ex.
> Compiler-RT), but they also build differently and so their processes are
> completely alien to ours. The key reasons why these things happened
> are:
>
>  * They couldn't rely on our releases, as fixing bugs and back-porting
> wasn't a thing back then
>  * They already had their own release schedule, so aligning with ours
> brought no extra benefit
>  * We always expected people to work off trunk, and everyone had to
> create their own process
>
> I don't want to change how people work, just to add one more valid way
> of working, which is most stable for upstream releases. :)
>
>
>
> > The only reasonable way I see of aligning upstream releases with
> > downstream schedules would be to release much more often. This works
> > well in Chromium where there's a 6-week staged release schedule. This
> > would mean there's always a branch going for the next release, and
> > important bug fixes would get merged to that.
>
> Full validation every 6 weeks is just not possible. But a multiple of
> that, say every 3~4 months, could be much easier to work around.
>
>
>
> > In Chromium we drive
> > this from the bug tracker -- it would be very hard to scan each commit
> > for things to cherry-pick. This kind of process has a high cost
> > though, there has to be good infrastructure for it (buildbots on the
> > branch for all targets, for example), developers have to be aware, and
> > even then it's a lot of work for those doing the releases. I'm not
> > sure we'd want to take this on. I'm also not sure it would be suitable
> > for a compiler, where we want the releases to have long life-time.
>
> This works because you have a closed system. As you say, Chromium is
> mostly a final product, not a tool to develop other products, and the
> validation is a lot simpler.
>
> With Clang, we'd want to involve external releases into it, and it
> simply wouldn't scale.
>
>
>
> > For the major releases, I've tried to do this. We could certainly
> > formalize it by posting it on the web page though.
>
> I think that'd be the first step, yes. But I wanted to start with a
> good number. 2 times a year? Would 3 times improve things that much
> for the outsiders? Or just moving the dates would be enough for most
> people?
>
> That's why I copied so many outsiders, so they can chime in and let us
> know what would be good for *them*.

Data point: At Sony we ship our stuff every 6 months and that isn't
going to change.  Our policy has been to base our releases on upstream
releases, and given our current lead times, the current upstream release
schedule is actually not bad, especially if it has stopped drifting.
Having four releases pretty consistently follow the current schedule is
extremely positive, thanks!

It would help our internal planning to publish the schedule for future
releases.  You never know, we might even be able to fork an internal
branch in time to help with the release testing, although that is not
in any way a lightweight process (so no promises).

There has been some talk about moving toward releasing from upstream
trunk, or more precisely to streamlining our internal testing process
in order to allow us to seriously contemplate releasing from upstream
trunk.  I can't argue with the streamlining part, for sure.  Whether
the rest of it works out must remain to be seen.
--paulr

>
>
> > Most importantly, those folks should get involved :-)
>
> Indeed!
>
>
> > In practice, we kind of have this for at least some of the targets.
> > Maybe we should write this down somewhere instead of me asking for
> > (the same) volunteers each time the release process starts?
>
> I give consent to mark me as the ARM/AArch64 release tester for the
> foreseeable future. :)
>
> I can also help Sylvestre, Doko, Ed, Jeff, Bero etc. to test on their
> system running on ARM/AArch64 hardware.
>
> cheers,
> --renato


Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
In reply to this post by Vassil Vassilev via cfe-dev
Thanks Bero! Comments inline.


On 11 May 2016 at 23:47, Bernhard Rosenkränzer
<[hidden email]> wrote:
> In the OpenMandriva world, we usually try to have clang (our primary
> compiler) as close as possible to the latest upstream stable release.

It would be good to know what's stopping you from using a fully
upstream process (like patch trunk, back-port, pull).

I'm sure you're not the only one having the same problems, and even if
the only take-out of this thread is to streamline our back-port
process, all downstream releases will already benefit.



> We're currently following the release_38 branch, and expect to jump on trunk
> as soon as our distro release has happened (because we expect 3.9 to be
> ready before we'll make our subsequent release - better to get on the branch
> we'll be using for the next release early than to suddenly face problems
> when updating to the next release).

That's a very good strategy, indeed. And somewhat independent of how
good the release is.

Early help can also increase downstream participation pre-release, and
will eventually considerably reduce the need for back-ports.



> In the AOSP world, we obviously have to (somewhat) follow what Google does,
> which is typically pick a trunk snapshot and work from there - but we have
> some work underway to extract their patches so we can apply them on top of a
> release or snapshot of our choice (current thought is mostly nightly builds
> for testing).

This sounds awful, and is the very thing I'm trying to minimise. I can
certainly understand the need for local patches, but using different
trunks makes the whole thing very messy.

If the releases are good and timely, and if the back-porting process
is efficient, I hope we'll never have the need to do that.


> From both perspectives, it would be great to have a common set of "known
> good" and relevant patches like gcc abi_tag, or fixes for bugs commonly
> encountered.
> Ideally, I'd like to see those patches just backported on the release_38
> branch to keep the number of external patches low.

Indeed, my point exactly. Downstream releases and distros can easily
share sets of known "good" patches for this or that, but I'd very much
prefer to have as much as possible upstream.


> gcc abi_tag is a bit of a headache in the OpenMandriva world, while we build
> just about everything with clang these days, of course it would be good to
> restore binary compatibility with the bigger distributions (almost all of
> which are using current gcc with the new abi enabled).

Ubuntu LTS has just released with LLVM 3.8, which doesn't have the
abi_tag fixes.

If we don't back-port them to 3.8.1 or 3.8.2, Ubuntu will have to do
that on their own, and that's exactly what I'm trying to solve here.

I also feel like they're not the only one with that problem...



> The timing has been quite predictable lately -- but of course the website
> still says "TBD" for both 3.8.1 and 3.9.0, maybe communicating the (likely)
> plan could use some improvement.

Right, Hans was saying how we should improve in that area, too.

I think that's a very easy consensus to reach, but we still need all
the required people to commit to a more rigorous schedule.


> What can we (this time being OpenMandriva) do? We don't have any great
> compiler engineers, but we're heavy users - would it help to run a mass
> build of all packages for all supported architectures (at this time: i586,
> x86_64, armv7hnl, aarch64) to detect errors on prerelease builds?

YES PLEASE!! :)


> We have
> the infrastructure in place, even showing a fairly nice list of failed
> builds along with build logs. (But of course there will be false
> positives caused by e.g. a library update that happened around the same time
> as the compiler update.)

Can you compare the failures of two different builds?

We don't want to fix *all* the problems during the releases, we just
want to know if the new release breaks more stuff than the previous.

How we deal with the remaining bugs is irrelevant to this discussion...
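
As a rough illustration of that comparison (a minimal sketch, not an existing script; the `compare_builds` helper and the package names are hypothetical), a set difference over the failed-package lists of two mass builds separates new breakage from failures that were already there:

```python
def compare_builds(baseline_failed, candidate_failed):
    """Compare failed-package lists from two mass builds.

    baseline_failed:  packages that failed with the previous release.
    candidate_failed: packages that failed with the release candidate.
    """
    baseline = set(baseline_failed)
    candidate = set(candidate_failed)
    return {
        # Failures that only show up with the new compiler.
        "regressions": sorted(candidate - baseline),
        # Failures that went away with the new compiler.
        "fixed": sorted(baseline - candidate),
        # Failures present in both builds (not release blockers by themselves).
        "still_failing": sorted(candidate & baseline),
    }


if __name__ == "__main__":
    # Hypothetical package names, for illustration only.
    previous = ["pkg-a", "pkg-b", "pkg-c"]
    prerelease = ["pkg-b", "pkg-c", "pkg-d"]
    for kind, packages in compare_builds(previous, prerelease).items():
        print(kind, packages)
```

Only the "regressions" list would need attention during the release. False positives from unrelated library updates would still have to be filtered by hand.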


> That would be a good idea IMO, we've run into "current trunk is much better
> than the last stable release anyway" situations more than once (in both
> projects).

I expect that, if distros chime in during the release process, this
will be a lot less of a problem.

It also seems that the timing is not that bad, so maybe the best
course of action now is to streamline the process and only if the
pressure is still great, we change the timings.


> Backporting some more fixes to the stable branches would be great too (but
> of course I realize that's a daunting and not very interesting task).

I think that's crucial to keeping the releases relevant.


> Sounds good to me, volunteering to participate in both.

Thanks Bero! If you haven't yet, please subscribe to the
[hidden email] mailing list.


> While interesting from an upstream perspective, I doubt that will happen
> reliably -- there's too many people working on the build scripts who would
> not automatically have write access to the tree etc. and most distro build
> farms rely on having the build scripts in a common place, so duplication
> would be unavoidable.

I was sceptical about the shared scripts, too. It was an idea that
came to my mind in the last minute, but I'm not sure how much that
would help anyway.

cheers,
--renato

Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
In reply to this post by Vassil Vassilev via cfe-dev
On 12 May 2016 at 01:10, Robinson, Paul <[hidden email]> wrote:
> A public commitment to the future release schedule would be that much
> more justification for planning to participate.

Indeed, I think we're all in agreement there.

Not only having people committed to validate the release, or having a
"known" ballpark time, but a public commitment: web pages updates,
emails sent to the appropriate lists, deadlines exposed, etc.

For all those interested, [hidden email] is the list
where we discuss the validation and plans. Regardless, the web pages
should also be changed as part of the planning and release processes.


> Data point: At Sony we ship our stuff every 6 months and that isn't
> going to change.  Our policy has been to base our releases on upstream
> releases, and given our current lead times, the current upstream release
> schedule is actually not bad, especially if it has stopped drifting.
> Having four releases pretty consistently follow the current schedule is
> extremely positive, thanks!

Excellent, thanks! It seems that the timing is (so far) less of a
problem than I anticipated. This is good news.


> It would help our internal planning to publish the schedule for future
> releases.  You never know, we might even be able to fork an internal
> branch in time to help with the release testing, although that is not
> in any way a lightweight process (so no promises).

I'm betting on the fact that this will get easier the more often we
do it, and the more of us that do it.

Filtering false positives is a local cost that cannot be avoided, and
only automated to a point, but filtering thousands of real bugs and
reporting them all should be, in no way, the responsibility of any
downstream release.

The more downstream releases and OS distributions we have doing the
testing (like Bero said), the less each one of us will have to track.
And the more of them that say they're affected by a bug, the higher
the priority it should have in the upstream release.
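
One way to picture that prioritisation (a minimal sketch under my own assumptions; the report format, the bug IDs and the downstream names below are hypothetical) is to count how many distinct downstream releases report each bug and sort by that count:

```python
from collections import defaultdict


def rank_bugs(reports):
    """Rank bugs by the number of distinct downstreams affected.

    reports: iterable of (bug_id, downstream_name) pairs.
    Returns a list of (bug_id, affected_count), most affected first;
    ties are broken by bug_id so the order is deterministic.
    """
    affected = defaultdict(set)
    for bug_id, downstream in reports:
        affected[bug_id].add(downstream)
    return sorted(
        ((bug_id, len(names)) for bug_id, names in affected.items()),
        key=lambda item: (-item[1], item[0]),
    )


if __name__ == "__main__":
    # Hypothetical reports, for illustration only.
    reports = [
        ("bug-1", "distro-a"),
        ("bug-2", "distro-a"),
        ("bug-2", "distro-b"),
        ("bug-2", "distro-b"),  # duplicate reports count once
        ("bug-2", "vendor-c"),
    ]
    print(rank_bugs(reports))
```

Counting distinct downstreams rather than raw reports keeps one very vocal reporter from dominating the ranking.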

We've done something similar with the GCC abi tag, but that was a
separate email thread, started by Jeff, and external to our release
process. The number of people involved is staggering, but yet, the
patches are sitting there waiting for review for a very long time.

I think this expresses my point really well. Downstream releases and
OS distros can help us *a lot* with validation, not necessarily
implementing core functionality or even fixing those bugs themselves.
But if we don't fix those bugs or implement the features they need,
they won't have *any* incentive to spend a lot of time and
resources validating the upstream release, and then, all of them will
spend *more* time validating their own.

Increasing the importance of stable releases might get us more work to
do for other people with no real benefit to us, yes. But it'll also
bring us a massive validation network and transform Clang/LLVM into a
production compiler from start (upstream) and that will benefit
everyone, including all users of all downstream and upstream tools.



> There has been some talk about moving toward releasing from upstream
> trunk, or more precisely to streamlining our internal testing process
> in order to allow us to seriously contemplate releasing from upstream
> trunk.  I can't argue with the streamlining part, for sure.  Whether
> the rest of it works out must remain to be seen.

That's what Chromium does, and it's mainly because of the problems
Bero exposed (trunk is often much better than any release).

But also they are using the tool to compile a very small subset of
programs (mainly Chromium/Chrome), so it's *a lot* easier to validate
that.

When you're actually shipping a toolchain, you have to worry not only
about the programs you have, but also about customers' programs you
don't (and can't) have access to.

If the release process (including minor releases) ends up as frequent
as possible, wouldn't that be similar to merging to trunk every other
month?

In that case, the validation process will be minimal (almost the same
code), but you'd have to spend a bit more time sifting through patches
(which I want to automate) instead.

Hope that makes some sense.

cheers,
--renato

Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
In reply to this post by Vassil Vassilev via cfe-dev
FWIW, for our ARM Compiler product, we follow top-of-trunk, not the releases.
Next to picking up new functionality quicker, it also allows us to detect regressions
in LLVM against our in-house testing quickly, not 6 months later. We find that when
we find a regression within 24 to 48 hours of the commit introducing it, it's much
cheaper to get it fixed.

In my opinion, it would be better overall for the LLVM project if top-of-trunk is
tested as much as possible, if testing resources are so scarce that a choice
has to be made between testing top-of-trunk or testing a release branch.

Thanks,

Kristof


Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
On 12 May 2016 at 15:56, Kristof Beyls <[hidden email]> wrote:
> We find that when we find a regression within 24 to 48 hours of the commit introducing it,
> it's much cheaper to get it fixed.

Hi Kristof,

Indeed, that is true.

But there is only so much additional (internal) buildbots can do. If you
have any additional validation steps that you only do when you pick a
candidate for release, and there are failures in that set, you won't
see them during the trunk monitoring.

Also, distributions don't have as many compiler developers as most
groups releasing toolchains, and they have to rely on the public
buildbots (which are by no means comprehensive).

This is not about downstream *development*, but downstream *release*
process, which in most situations, requires additional validation.

Following trunk for your development process is cheaper, as you said,
and most people agree. But if you release your product a few weeks
after the release is out, it may be better to get the release itself,
since it has been validated by many other groups (assuming everyone
joins in), than trunk.


> In my opinion, it would be better overall for the LLVM project if top-of-trunk is
> tested as much as possible, if testing resources are so scarce that a choice
> has to be made between testing top-of-trunk or testing a release branch.

That's the balance I'm trying to get right. :)

There's also the other topic about the community.

LLVM is mostly a tool kit for compilers, yes, and most LLVM developers
are using it as such. But more and more projects are depending on the
actual compiler solution (Clang+LLVM+RT), and by the reaction of most
distros I've spoken to, we could improve our OSS community
interactions quite a lot.

I can certainly understand why most companies are worried about their
own processes, but that's undermining the ability of the OSS release
to achieve its goal, which is to be used to build all kinds of
software in the wild.

We have FreeBSD, Mandriva and Android using it by default, which is a
*big* win. We have Debian, RedHat and Canonical worried about the
integration of LLVM in their packages, that's awesome.

I just want us to look at that picture, and see if we can do anything
better for them, that will in turn, make our processes slightly
cheaper because of the synergy it will create.

cheers,
--renato

Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
In reply to this post by Vassil Vassilev via cfe-dev
Just some relatively quick thoughts, since I am still on vacation. +Luis from ChromeOS toolchain.

On Thu, May 12, 2016 at 7:56 AM, Kristof Beyls <[hidden email]> wrote:
> FWIW, for our ARM Compiler product, we follow top-of-trunk, not the releases.
> Next to picking up new functionality quicker, it also allows us to detect regressions
> in LLVM against our in-house testing quickly, not 6 months later. We find that when
> we find a regression within 24 to 48 hours of the commit introducing it, it's much
> cheaper to get it fixed.
>
> In my opinion, it would be better overall for the LLVM project if top-of-trunk is
> tested as much as possible, if testing resources are so scarce that a choice
> has to be made between testing top-of-trunk or testing a release branch.

I am 100% in agreement here. TOT regressions cost more than people think. I have seen firsthand how this hurts the non-LLVM parts of Android. In Android, more testing resources are devoted to release branches, and thus the development branch picks up bizarre regressions that are harder to track down and fix than they should be. Since I don't update the compiler in the release branch (at least not without good reason), this means that compiler regressions are harder for me to track down, because nobody pays attention to dev results until it is hurting them in the next release. :( I have some ideas to make this better in the future, but nothing to share just yet.
 

I'm particularly concerned with Android, because they not only have
their own tree with heavily modified LLVM components (e.g.
Compiler-RT), but they also build differently and so their processes
are completely alien to ours. Some of the key reasons why these things
happened are:

I am not sure why you think Android's compiler-rt is an example of a "heavily modified" component. As I see it, our compiler-rt matches upstream almost exactly (with one minor mistake from a duplicate merge that results in extra copies of some static functions that we don't even build). We do have 3 cherry-picks for some MIPS ASan patches, but all of those come directly from TOT master.

There are some Android-only patches in LLVM and Clang (thanks to RenderScript, and not being able to easily upstream these - i.e. the RS frontend doesn't live as an LLVM upstream project, so it is hard to write tests for these modifications). None of those patches should impact things greatly. Assuming you sync to the same CL (plus <10 cherry-picks to all projects), any C/C++ regression should be reproducible with a normal TOT build. This has been true since we did upstream our most divergent patch (re: calling conventions for ARM vectors) a few months ago.
 

* They couldn't rely on our releases, as fixing bugs and back-porting
wasn't a thing back then

The real problem is retaining history. Release branches don't make this very nice, and necessitate that we swap out an entire chunk of history for a different chunk of history every time we change releases. That is pretty obnoxious for keeping a good idea of what has happened, and thus following master makes this easier for us (and I assume others too). We did try using a release branch back for LLVM 3.5, but the maintenance cost for then updating things when we wanted to move past 3.5 showed me why I don't want to really look

* They already had their own release schedule, so aligning with ours
brought no extra benefit

It is not 100% clear that Android will want to be dependent on LLVM's release schedule. I think that there are definitely benefits to having everyone do extra validation, but I am unconvinced that it is the "same" validation for everyone that is valuable, hence this might not make things go much smoother/faster for those groups.

Thanks,
Steve

* We always expected people to work off trunk, and everyone had to
create their own process

I don't want to change how people work, just to add one more valid way
of working, which is most stable for upstream releases. :)



The only reasonable way I see of aligning upstream releases with
downstream schedules would be to release much more often. This works
well in Chromium where there's a 6-week staged release schedule. This
would mean there's always a branch going for the next release, and
important bug fixes would get merged to that.

Full validation every 6 weeks is just not possible. But a multiple of
that, say every 3~4 months, could be much easier to work around.



In Chromium we drive
this from the bug tracker -- it would be very hard to scan each commit
for things to cherry-pick. This kind of process has a high cost
though, there has to be good infrastructure for it (buildbots on the
branch for all targets, for example), developers have to be aware, and
even then it's a lot of work for those doing the releases. I'm not
sure we'd want to take this on. I'm also not sure it would be suitable
for a compiler, where we want the releases to have long life-time.

This works because you have a closed system. As you say, Chromium is
mostly a final product, not a tool to develop other products, and the
validation is a lot simpler.

With Clang, we'd want to involve external releases in it, and it
simply wouldn't scale.



For the major releases, I've tried to do this. We could certainly
formalize it by posting it on the web page though.

I think that'd be the first step, yes. But I wanted to start with a
good number. 2 times a year? Would 3 times improve things that much
for the outsiders? Or would just moving the dates be enough for most
people?

That's why I copied so many outsiders, so they can chime in and let us
know what would be good for *them*.

Data point: At Sony we ship our stuff every 6 months and that isn't
going to change.  Our policy has been to base our releases on upstream
releases, and given our current lead times, the current upstream release
schedule is actually not bad, especially if it has stopped drifting.
Having four releases pretty consistently follow the current schedule is
extremely positive, thanks!

It would help our internal planning to publish the schedule for future
releases.  You never know, we might even be able to fork an internal
branch in time to help with the release testing, although that is not
in any way a lightweight process (so no promises).

There has been some talk about moving toward releasing from upstream
trunk, or more precisely to streamlining our internal testing process
in order to allow us to seriously contemplate releasing from upstream
trunk.  I can't argue with the streamlining part, for sure.  Whether
the rest of it works out must remain to be seen.
--paulr



Most importantly, those folks should get involved :-)

Indeed!


In practice, we kind of have this for at least some of the targets.
Maybe we should write this down somewhere instead of me asking for
(the same) volunteers each time the release process starts?

I give consent to mark me as the ARM/AArch64 release tester for the
foreseeable future. :)

I can also help Sylvestre, Doko, Ed, Jeff, Bero etc. to test on their
systems running on ARM/AArch64 hardware.

cheers,
--renato



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
On 12 May 2016 at 16:56, Kristof Beyls <[hidden email]> wrote:
In my opinion, it would be better overall for the LLVM project if top-of-trunk is
tested as much as possible, if testing resources are so scarce that a choice
has to be made between testing top-of-trunk or testing a release branch.

I agree that trunk is more important, with both of my hats on.

But releases are not completely irrelevant - one thing making them important is the fact that there's other projects out there using the LLVM libraries - and as a distro, we have to make sure they all work (so they agree on the same API), preferably without having to ship multiple versions of LLVM and preferably without having to patch external code too much to adjust to API changes.

In OpenMandriva, we have to keep Mesa, creduce, emscripten and the LLVMified Qt moc working (list expected to grow -- ultimately we'd also like to use the system LLVM libraries for the swift compiler).
In AOSP, RenderScript relies on the LLVM API, but there's nothing else using it, so there's currently no need to force a common version of the API between different projects there.

ttyl
bero


Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
On 12 May 2016, at 16:27, Bernhard Rosenkränzer via cfe-dev <[hidden email]> wrote:

>
> On 12 May 2016 at 16:56, Kristof Beyls <[hidden email]> wrote:
>> In my opinion, it would be better overall for the LLVM project if top-of-trunk is
>> tested as much as possible, if testing resources are so scarce that a choice
>> has to be made between testing top-of-trunk or testing a release branch.
>>
> I agree that trunk is more important, with both of my hats on.
>
> But releases are not completely irrelevant - one thing making them important is the fact that there's other projects out there using the LLVM libraries - and as a distro, we have to make sure they all work (so they agree on the same API), preferably without having to ship multiple versions of LLVM and preferably without having to patch external code too much to adjust to API changes.
>
> In OpenMandriva, we have to keep Mesa, creduce, emscripten and the LLVMified Qt moc working (list expected to grow -- ultimately we'd also like to use the system LLVM libraries for the swift compiler).
> In AOSP, RenderScript relies on the LLVM API, but there's nothing else using it, so there's currently no need to force a common version of the API between different projects there.
I think that our API stability policy is really hurting us here.  Downstream users of clang have few problems - clang is basically backwards compatible and so you only need to test for regressions.  We periodically build the entire FreeBSD ports collection (around 25,000 open source packages) with the clang-devel port (which is a periodically updated trunk snapshot) and report regressions.  That’s very easy to do.  It’s easy for me to test for clang regressions by just setting CC=clang-devel CXX=clang++-devel, rebuilding my own projects and running their test suites.
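
For anyone who wants to script that last step, a minimal sketch follows. The compiler names clang-devel and clang++-devel come from the paragraph above; the helper names, and the idea that the project reads CC/CXX from the environment, are assumptions for illustration.

```python
import os
import subprocess


def devel_env(base=None):
    """Copy an environment and point CC/CXX at the trunk-snapshot compilers."""
    env = dict(os.environ if base is None else base)
    env["CC"] = "clang-devel"
    env["CXX"] = "clang++-devel"
    return env


def rebuild_and_test(project_dir, build_cmd, test_cmd):
    """Run the project's own build and test commands with the snapshot compiler.

    Returns True only if both commands exit with status 0.
    """
    env = devel_env()
    for cmd in (build_cmd, test_cmd):
        if subprocess.run(cmd, cwd=project_dir, env=env).returncode != 0:
            return False
    return True
```

The build and test commands are whatever the project already uses, for example rebuild_and_test("/path/to/project", ["make", "clean", "all"], ["make", "check"]) for a hypothetical make-based project.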

In contrast, for projects that I maintain that use LLVM as a library, I generally hop from release to release.  We’ve had several instances of APIs changing multiple times between releases, so updating the code to each one is more effort than just updating to the newest one.  APIs often go in long before documentation (which is often nonexistent or rushed in for the release), so you’re more likely to see useful documentation if you pick a release.  Finally, most distributions package releases so if you depend on LLVM X.Y then your code is easy to package, whereas if you depend on LLVM svn rxzy then it’s basically impossible.  Even between releases, expecting everyone who might use / test your project’s trunk to build the specific LLVM revision that you depend on is too much of a barrier to entry for many potential contributors and so following trunk reduces the number of contributors and the amount of testing.

The end result is that shortly after a release (sometimes every alternate release) is branched a load of downstream projects update to the new APIs, test things, and find a bunch of regressions that have been sitting in the tree for months.  We then have to scrabble to bisect and try to track them down.
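
The bisection itself is mechanical; the expensive part is that every step is a full build and test of one revision. A minimal sketch, assuming linear integer revision numbers as in svn and a caller-supplied is_good check (a hypothetical callback that builds and tests a given revision):

```python
def first_bad_revision(good, bad, is_good):
    """Binary-search the revisions in (good, bad] for the first failing one.

    Assumes `good` passes, `bad` fails, and there is a single pass-to-fail
    transition in between. `is_good` is a caller-supplied check that
    builds and tests a given revision number.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(mid):
            good = mid
        else:
            bad = mid
    return bad
```

Even with a regression that has been sitting in the tree for months, a range of 9,000 revisions needs at most 14 build-and-test cycles, so the real cost is how long each cycle takes.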

TL;DR version: If we want downstream people to test ToT, then we need to make updating LLVM library consumers to ToT far less painful than it is now.

David




Re: LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
On 12 May 2016 at 16:22, Stephen Hines <[hidden email]> wrote:
> I am 100% in agreement here. TOT regressions cost more than people think. I
> have seen firsthand how this hurts the non-LLVM parts of Android.

Hi Steve,

I think we're all in agreement on ToT development and testing. This is
more about releases and upstream users, including OS distributions.


> I am not sure why you think Android's compiler-rt is an example of a
> "heavily modified" component. As I see it, our compiler-rt matches upstream
> almost exactly (with one minor mistake from a duplicate merge that results
> in extra copies of some static functions that we don't even build). We do
> have 3 cherry-picks for some MIPS ASan patches, but all of those come
> directly from TOT master.

Sorry, that was a bit heavy-handed... I meant that it's hard to change
Android's copy of RT because of how it's built.

This patch is an example: https://android-review.googlesource.com/#/c/125910/1

It introduces a set of nice changes that cannot go as is into LLVM's
compiler-rt because of how RT is built in LLVM.

This is not Android's fault per se, but it's an example of how
proliferation of patches can happen if one downstream repo depends on
another, as is the case for AOSP+Android+(anyone else that develops
Android).

That change should have been developed on upstream RT to begin with,
but the merge would be hard to control on the third-generation copy.

That's the reason why I want to bring all downstream repos to only
depend on upstream LLVM.


> The real problem is retaining history. Release branches don't make this very
> nice, and necessitate that we swap out an entire chunk of history for a
> different chunk of history every time we change releases.

That's interesting... I haven't heard that before, so I don't know
exactly what you mean. :)

Can you give an example?


> It is not 100% clear that Android will want to be dependent on LLVM's
> release schedule. I think that there are definitely benefits to having
> everyone do extra validation, but I am unconvinced that it is the "same"
> validation for everyone that is valuable, hence this might not make things
> go much smoother/faster for those groups.

That's a very good point, and one that I was considering while
thinking about this.

Will Sony's validation be at all relevant to Chromium builds? Probably
not. But ARM's extra validation will be relevant to anyone using ARM
targets, and that in turn will bring more adoption to ARM. Sony's
validation will be relevant to the CPUs and GPUs they validate on,
so that's a bonus for anyone who uses the same hardware.

I can't put a concrete value on any of this, but having people chiming
in is the best way to know what kind of things people do, and how
valuable they are to each other.

cheers,
--renato

Re: [llvm-dev] LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev

I'm particularly concerned with Android, because they not only have
their own tree with heavily modified LLVM components (e.g.
Compiler-RT), but they also build differently and so their processes
are completely alien to ours. Some of the key reasons why these things
happened are:


Errr, Stephen has spoken up here, but my folks are in contact with android folks pretty much every week, and I don't think what you are stating is correct on a lot of fronts.

I'm also a little concerned about you speaking for android here, instead of android speaking for android here :P.  I'll be frank: I don't think you know enough details of internal history of android to state, affirmatively, why these things happened for android, and what you are suggesting  is, AFAIK, not correct or accurate.

So if android is your particular concern here, i can pretty much state that android LLVM is on a release process close to the rest of Google, which is 'follow TOT very closely'.

I don't think changing how stable works would change that, for a lot of reasons (mostly around cost of ToT regressions, etc).

So i think you need to find a new motivating example :)


 * They couldn't rely on our releases, as fixing bugs and back-porting
wasn't a thing back then

This is, AFAIK,  not accurate. Renderscript has its own history not worth getting into, but outside of the renderscript version, LLVM in android is very close to TOT. It has been as long as someone really cared about it. 

 
 * They already had their own release schedule, so aligning with ours
brought no extra benefit

This was not a concern. 

 * We always expected people to work off trunk, and everyone had to
create their own process

Android mostly shares a process with the rest of Google these days, in the sense that they rely on  validation we do in other contexts.

I don't want to change how people work, just to add one more valid way
of working, which is most stable for upstream releases. :)


Full validation every 6 weeks is just not possible. But a multiple of
that, say every 3~4 months, could be much easier to work around.

FWIW: Full validation is already done on a faster-than-6 week time schedule, so i'm also going to suggest that your "just not possible" claim is false :)


 


Re: [llvm-dev] LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
I'll bite:

Do you really believe you could ever get these folks to choose consistent llvm versions anyway?

Or will you always have to do the work yourself anyway?

In my talks with a number of these projects, they pretty much don't care what anyone else does, and plan to stick to their own import/etc schedules no matter what LLVM does with stable releases :)

(For reference, Google *ships* the equivalent of about 13-16 linux distributions in products, uses about 5-6x that internally, and we have a single monolithic source repository for the most part.  I have the joy of owning the third party software policies/etc for it, and so  end up responsible for trying to deal with maintaining single versions of llvm for tens to hundreds of packages).


On Thu, May 12, 2016 at 8:27 AM, Bernhard Rosenkränzer <[hidden email]> wrote:
On 12 May 2016 at 16:56, Kristof Beyls <[hidden email]> wrote:
In my opinion, it would be better overall for the LLVM project if top-of-trunk is
tested as much as possible, if testing resources are so scarce that a choice
has to be made between testing top-of-trunk or testing a release branch.

I agree that trunk is more important, with both of my hats on.

But releases are not completely irrelevant - one thing making them important is the fact that there's other projects out there using the LLVM libraries - and as a distro, we have to make sure they all work (so they agree on the same API), preferably without having to ship multiple versions of LLVM and preferably without having to patch external code too much to adjust to API changes.

In OpenMandriva, we have to keep Mesa, creduce, emscripten and the LLVMified Qt moc working (list expected to grow -- ultimately we'd also like to use the system LLVM libraries for the swift compiler).
In AOSP, RenderScript relies on the LLVM API, but there's nothing else using it, so there's currently no need to force a common version of the API between different projects there.

ttyl
bero

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev




Re: [llvm-dev] LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
On 12 May 2016 at 16:57, Daniel Berlin <[hidden email]> wrote:
> Errr, Stephen has spoken up here, but my folks are in contact with android
> folks pretty much every week, and I don't think what you are stating is
> correct on a lot of fronts.

I obviously don't speak for Android and have already apologised to
Steve about my choice of words.


> So if android is your particular concern here, i can pretty much state that
> android LLVM is on a release process close to the rest of Google, which is
> 'follow TOT very closely'.

Isn't this what I said?

Following ToT very closely is only good for groups that have high
involvement in LLVM, like Google and Android.

And for that reason (and others), Android doesn't use the upstream
releases. I was wondering if we could do anything so that they would.

The major benefit wouldn't be, as I explained, specifically for
Google/Android, but for Android users, Linux users, Linux distros,
LLVM library users (including Renderscript), etc.

--renato

Re: [llvm-dev] LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
On 12 May 2016 at 17:07, Daniel Berlin via llvm-dev
<[hidden email]> wrote:
> In my talks with a number of these projects, they pretty much don't care
> what anyone else does, and plan to stick to their own import/etc schedules
> no matter what LLVM does with stable releases :)

Is there anything we can do to make them care?

What I heard from them is that the upstream process wasn't clear
enough with regards to fixes, API stability and process (which were
pretty much echoed in this thread).

Maybe, if we fix most of those problems, they would care more?



> (For reference, Google *ships* the equivalent of about 13-16 linux
> distributions in products, uses about 5-6x that internally, and we have a
> single monolithic source repository for the most part.  I have the joy of
> owning the third party software policies/etc for it, and so  end up
> responsible for trying to deal with maintaining single versions of llvm for
> tens to hundreds of packages).

You sound like the perfect guy to describe a better upstream policy to
please more users.

But I don't want to volunteer you. :)

--renato

Re: [llvm-dev] LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev


On Thu, May 12, 2016 at 9:07 AM, Renato Golin <[hidden email]> wrote:
On 12 May 2016 at 16:57, Daniel Berlin <[hidden email]> wrote:
> Errr, Stephen has spoken up here, but my folks are in contact with android
> folks pretty much every week, and I don't think what you are stating is
> correct on a lot of fronts.

I obviously don't speak for Android and have already apologised to
Steve about my choice of words.


> So if android is your particular concern here, i can pretty much state that
> android LLVM is on a release process close to the rest of Google, which is
> 'follow TOT very closely'.

Isn't this what I said?

But your position seems to be "this is a bad thing for folks", and the position we take is that it's explicitly a good thing.
 

Following ToT very closely is only good for groups that have high
involvement in LLVM, like Google and Android.

And for that reason (and others), Android doesn't use the upstream
releases. I was wondering if we could do anything so that they would.

 
The major benefit wouldn't be, as I explained, specifically for
Google/Android, but for Android users, Linux users, Linux distros,
LLVM library users (including Renderscript), etc.

There is a strong implicit assumption here that the current model they use is better for users than the model LLVM uses, and that aligning these models in *that* direction ends up better for users than aligning models in the other direction.

IE make ToT more appealing to follow, have folks follow that.
Maybe that's true, maybe it's not, but it needs a lot more evidence :)

The evidence i see so far is that they spend time trying to get disparate projects to use a single version of LLVM, but i also have seen no evidence that any of the projects using stable releases would ever align their policies *anyway*, so they still have that problem no matter what you do to stable releases.

If that is the real concern,  i think the entire discussion is misplaced.   Because that problem is solely one of API compatibility between releases.

If there are other concerns, it'd be good to catalogue them :)

 

--renato



Re: [llvm-dev] LLVM Releases: Upstream vs. Downstream / Distros

Vassil Vassilev via cfe-dev
On 12 May 2016 at 17:15, Daniel Berlin <[hidden email]> wrote:
> But your position seems to be "this is a bad thing for folks", and the
> position we take is that it's explicitly a good thing.

Then I apologise again! :)

My point was that following ToT is perfect for developer teams working
*on* LLVM. Everyone should be doing that, and most people are. Check.

But for some people, including library users, LTS distributions and
some downstream releases (citation needed), having an up-to-date and
stable release *may* (citation needed) be the only stable way to
progress into newer LLVM technology.


> IE make ToT more appealing to follow, have folks follow that.
> Maybe that's true, maybe it's not, but it needs a lot more evidence :)

There were responses on this thread that said it's possible and
desirable to test ToT better, rather than only validating releases,
and I think this is great. Mostly because this will ultimately benefit
the releases anyway.

Maybe the solution to the always-too-old-release problem is to get
a better trunk and give up on releases altogether, like Arch Linux
rolling releases (which I use), so I'm ok with it, too.

As long as we make it a clear and simple process, so upstream users
can benefit too, whatever works. :)

cheers,
--renato