In a number of places we now retry counter mutations. This was judged better than failing hard and having the client resend all the data, which would disrupt column families beyond the counter-based CF being incremented. The cost is that a counter increment is not idempotent, so a retried mutation can overcount.
This is acceptable as long as we do two things:
1) Keep an equal number of columns somewhere else in the database. For example, we have DayBuckets and DayBucketsCount, the latter being the length of each row in DayBuckets.
2) Periodically repair the potentially overcounted column families by recomputing the counters from the columns kept in #1.
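The repair in #2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual job: the two column families are modelled as plain in-memory dicts, and the function names repair_deltas and apply_deltas are hypothetical. Since counters are changed by incrementing rather than by setting a value, the sketch computes a signed correction per row instead of a new absolute value.

```python
from typing import Dict


def repair_deltas(
    day_buckets: Dict[str, Dict[str, object]],
    day_buckets_count: Dict[str, int],
) -> Dict[str, int]:
    """Return, per row key, the increment that makes the counter match the row length.

    day_buckets stands in for the DayBuckets CF (row key -> columns) and
    day_buckets_count for the DayBucketsCount CF (row key -> counter value).
    Both are hypothetical in-memory models, not a real client API.
    """
    deltas: Dict[str, int] = {}
    for row_key in set(day_buckets) | set(day_buckets_count):
        actual = len(day_buckets.get(row_key, {}))   # source of truth from #1
        counted = day_buckets_count.get(row_key, 0)  # possibly overcounted
        if counted != actual:
            deltas[row_key] = actual - counted       # negative when overcounted
    return deltas


def apply_deltas(day_buckets_count: Dict[str, int], deltas: Dict[str, int]) -> None:
    """Apply each correction as an increment, mirroring how counters are mutated."""
    for row_key, delta in deltas.items():
        day_buckets_count[row_key] = day_buckets_count.get(row_key, 0) + delta


if __name__ == "__main__":
    buckets = {"2012-01-01": {"a": 1, "b": 1, "c": 1}, "2012-01-02": {"x": 1}}
    counts = {"2012-01-01": 4, "2012-01-02": 1}  # first row was retried once
    apply_deltas(counts, repair_deltas(buckets, counts))
    print(counts)
```

In a live cluster, writes arriving between reading the row and applying the correction would skew the result, so a real repair would presumably need to run against a quiesced time range or tolerate a small error until the next pass.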