On Mon, 06 Jun 2011 17:01:59 -0000, Paul Sokolovsky <email address hidden> wrote:
> But let's first consider the situation we used to have. The fact is that
> upstream git servers can be overloaded or down, even for longer than
> 12hrs. Potentially, during any such outage Google could make a code drop
> (a pretty realistic scenario actually: Google did a code drop and the
> servers got DDoSed). So, would we want, if the upstream server is
> unavailable, to not build anything at all, on the basis that
> there's a possibility that in a place far, far away new code has landed
> that we don't have?
No, I don't know where you get the idea that I am suggesting that. We
need to design a robust system that gives the possibility to have quick
turnaround when needed. We used to have a non-robust system with
quick-turnaround. We now have a robust system with slow turnaround.
> I guess that's a worse alternative than still being able to build what we
> have, especially when what we already have is exactly what we need.
> After all, we added the mirroring service to minimize extra-cloud traffic,
> but it brings extra benefits, such as improving our HA.
>
> Now let's consider what the risks are. Building stale code would be a
> problem for release builds, but release builds should really only be
> built from tags. So, we either have that tag and can build it, or
> don't have it and can't (this relies on a good upstream tagging policy,
> such as not moving tags).
Right, if we can't get the code we need to build, then we shouldn't
build. That much seems obvious to me.
> For daily builds of branches, we'd normally just have a 12hr average
> delay, the same as for the builds themselves. But here's an idea for how
> to improve that: following the previous patch, also add "soft_stale" and
> "hard_stale" settings. An upstream synced less than soft_stale ago
> won't be synced at all. After that, a sync will be attempted, but it's OK
> for it to fail without affecting the build. After hard_stale has passed,
> a failed sync will fail the build. So, for android.git we could set
> soft_stale=2hrs and hard_stale=24hrs and be pretty good.
That sounds like a useful part of a solution to me, provided that the
sync is atomic, so that a failed sync during the soft_stale period
doesn't corrupt the repos.
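To check that I have understood the proposal, here is a rough sketch in
Python of the policy as I read it. The names (sync_action, prepare_mirror,
SyncError, do_sync) are mine for illustration and are not taken from the
patch, and do_sync is assumed to be atomic, leaving the mirror untouched
if it fails.

```python
from enum import Enum


class SyncError(Exception):
    """Raised by the (hypothetical) sync callable when upstream is unreachable."""


class SyncAction(Enum):
    SKIP = "skip"        # mirror is fresh enough, do not contact upstream
    TRY = "try"          # attempt a sync, but a failure is tolerated
    REQUIRE = "require"  # attempt a sync, and a failure fails the build


def sync_action(last_sync, now, soft_stale, hard_stale):
    """Decide what to do based on the age of the mirror, all in seconds."""
    age = now - last_sync
    if age < soft_stale:
        return SyncAction.SKIP
    if age < hard_stale:
        return SyncAction.TRY
    return SyncAction.REQUIRE


def prepare_mirror(last_sync, now, soft_stale, hard_stale, do_sync):
    """Return True if the build may proceed, False if it must fail."""
    action = sync_action(last_sync, now, soft_stale, hard_stale)
    if action is SyncAction.SKIP:
        return True
    try:
        do_sync()  # assumed atomic: on failure the mirror is unchanged
    except SyncError:
        # Between soft_stale and hard_stale we build from what we have.
        return action is not SyncAction.REQUIRE
    return True
```

With soft_stale=2hrs and hard_stale=24hrs, a mirror last synced 5 hours
ago would still build if upstream is down, while one last synced 30 hours
ago would not.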
> Finally, for developers' real-time builds, we could indeed provide
> a script at first, and later a frontend UI, to request an unconditional sync.
This would be a "request sync" step to force a sync?
Why is it a separate step? The developer would then have to request a
sync and then wait until it was complete before submitting their build
to ensure that the build used the code that they wanted.
Could it just be a part of the build config which is translated to extra
info passed to the mirror service to force an override?
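Something along these lines is what I have in mind, as a rough Python
sketch only. The "force_sync" key, the request layout and both function
names are hypothetical; nothing like this exists in the build config or
the mirror service today.

```python
def mirror_request(build_config):
    """Translate a build config into the request sent to the mirror service."""
    return {
        "repo": build_config["repo"],
        # Hypothetical flag: absent means "use the normal staleness rules".
        "force_sync": bool(build_config.get("force_sync", False)),
    }


def needs_sync(request, age_seconds, soft_stale_seconds):
    """A forced request bypasses the soft_stale skip window."""
    return request["force_sync"] or age_seconds >= soft_stale_seconds
```

That way the developer submits one build, and the sync happens as part of
it rather than as a separate step they have to wait on.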
Thanks,
James