Improve 'clean' command

Bug #250645 reported by netmask
Affects: Smart Package Manager
Status: Invalid
Importance: Wishlist
Assigned to: Gustavo Niemeyer
Milestone: (none listed)

Bug Description

Proposal:

Improve 'clean' command by adding two switches:

  -c, --channel-data Clean channel data files
  -p, --package-cache Clean package cache

If no option is provided, clean both.
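
The proposed switches could be parsed roughly as follows. This is only a sketch of the behaviour described above: it uses Python's standard argparse module for illustration, not Smart's own option handling, and the function name parse_clean_args is hypothetical.

```python
import argparse


def parse_clean_args(argv):
    """Parse the proposed 'clean' switches; with no switch given, select both."""
    parser = argparse.ArgumentParser(prog="smart clean")
    parser.add_argument("-c", "--channel-data", action="store_true",
                        help="Clean channel data files")
    parser.add_argument("-p", "--package-cache", action="store_true",
                        help="Clean package cache")
    opts = parser.parse_args(argv)
    # "If no option is provided, clean both."
    if not (opts.channel_data or opts.package_cache):
        opts.channel_data = True
        opts.package_cache = True
    return opts
```

Calling parse_clean_args([]) selects both targets, while parse_clean_args(["-c"]) selects only the channel data.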

Rationale:

The current 'clean' command only cleans broken packages, not broken channel data (the result of incomplete downloads).

netmask (netmask)
Changed in smart:
assignee: nobody → niemeyer
importance: Undecided → Wishlist
status: New → In Progress
netmask (netmask) wrote :

Patch attached with solution proposal.

Rehan Khan (rasker) wrote :

This is a good idea on the whole. However, I think the channel cleaning should be implemented using a function in each channel's code. So, for example, 'smart clean -c' should look at each channel definition and clean just those files. The -c option can then support -c=<channel name> for cleaning individual channels. Doing it this way means that each channel can have fine control over what to clean and what not to.

The clean command should call the 'clean' function in each channel type to do the actual cleaning.

Although this is more work to implement, it is better because the channel-specific code is in the right places.

Finally, there should also be a 'clean --all' which basically deletes everything in channels and packages (and possibly also deletes the 'cache' file).

Gustavo Niemeyer (niemeyer) wrote :

"smart clean" doesn't currently affect the running Smart in any way. It's just removing cached data which is automatically refetched if a command is run.

On the other hand, with your suggested patch, "clean" will break the current knowledge Smart has about channels, and must necessarily be followed by a "smart update".

I also miss the motivation for this option. You say "clean result of incomplete channel downloads", but what happens if the user just executes "update" again?

Changed in smart:
status: In Progress → Incomplete
netmask (netmask) wrote :

The following patch implements the ability to specify channel aliases as an option, as requested by Rehan.

As for Gustavo's comment, the idea is to selectively clean files. After channel data files are cleaned, the channels remain the same in the cache, because the digest is not affected by the clean process; it just helps reduce disk usage by removing unused files.

As for 'incomplete' channel downloads, it helps work around download problems with channel types that do not provide a metadata md5 (like YaST2 channels), where an incomplete download can break the loader.

I hope I am clear enough now. :)

Gustavo Niemeyer (niemeyer) wrote :

I guess I'm still missing something. I don't see how the new patch answers or modifies any of the points I've made in the comment above.

netmask (netmask) wrote :

Erasing cached package metadata does not modify what's already stored in Smart's cache. This means that, in rpm-md channels for example, if you clear the files and then run 'update', it will re-download repomd.xml, confirm the digest hasn't changed, and do nothing else. No other packages will get downloaded, thus storage space is spared.
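
The digest check described above can be sketched like this. It is an illustration only: the thread does not say which hash Smart uses or where the digest is stored, so SHA-256 and the function name needs_refetch are assumptions.

```python
import hashlib


def needs_refetch(index_bytes, stored_digest):
    """Return (refetch, digest); refetch is True only if the index digest changed.

    SHA-256 is an assumption made for this sketch, not Smart's actual choice.
    """
    digest = hashlib.sha256(index_bytes).hexdigest()
    return digest != stored_digest, digest
```

With an unchanged index file the first element is False, so an update would have nothing further to download even if the other cached metadata files were erased.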

OTOH, for YaST2 channels, which do not have an md5sum check on the channel metadata, if the download is corrupt, the channel loader will crash, forcing the user to rm the files in <datadir>/channels and run update again. In that situation, clean is handy because it erases the files without the user needing to know their location.

And lastly, if you want the digest to change when the cached metadata files have been erased, I will have to work a bit more to make that happen.

I hope that answers your questions.

Rehan Khan (rasker) wrote :

Just a side note on this, from some testing I did a while ago with repomd.xml files. If repomd.xml is deleted, then all the files are re-downloaded and the cache is updated. However, if repomd.xml is not deleted but the other files are, then an update just checks the repomd.xml file. I seem to vaguely remember that it actually checks the file date, not the hash, but I could be wrong about this.

The testing wasn't 'scientific' so maybe, netmask, you can confirm this behaviour.

In any case having a 'clean' function defined for each channel type (perhaps using a try/except clause to skip those channels without a clean function defined) will allow the clean function to behave differently/correctly for each channel type. The clean command could call the function for each defined channel.
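
A per-channel clean function with a try/except fallback might look like the sketch below. The classes ExampleChannel and ChannelWithoutClean and the function clean_channels are hypothetical stand-ins and are not taken from Smart's code.

```python
class ExampleChannel:
    """Hypothetical channel type that knows how to clean its own files."""

    def __init__(self, alias):
        self.alias = alias
        self.cleaned = False

    def clean(self):
        # A real channel would remove its own cached files here.
        self.cleaned = True


class ChannelWithoutClean:
    """Hypothetical channel type that defines no clean function."""

    def __init__(self, alias):
        self.alias = alias


def clean_channels(channels, only_alias=None):
    """Call clean() on each channel, skipping types that do not define it."""
    cleaned = []
    for channel in channels:
        if only_alias is not None and channel.alias != only_alias:
            continue
        try:
            clean = channel.clean
        except AttributeError:
            continue  # no clean function for this channel type; skip it
        clean()
        cleaned.append(channel.alias)
    return cleaned
```

Only the attribute lookup sits inside the try block, so an error raised within a channel's own clean() is not silently swallowed. The only_alias argument corresponds to the -c=<channel name> idea from earlier in the thread.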

Gustavo Niemeyer (niemeyer) wrote :

The Smart cache is a cache. This may sound silly, but if you think about
it for a moment, you'll figure out what I mean. By definition, you should be
able to remove a cache file/entry/whatever, and the application shouldn't
change its behavior in any way, except perhaps becoming slower, or trying
to rebuild the cache again.

The channel files are the definitive place where information is stored.
The "clean" command should not remove any information which will prevent
Smart from running correctly.

In other words, this must work:

    smart clean
    smart install somepackage

Even if the cache file doesn't exist at all.
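
The invariant stated here, that removing the cache only costs a rebuild, can be illustrated with a small sketch. It is not Smart's cache code: the JSON file format, load_package_index, and the rebuild callback are all assumptions made for the example.

```python
import json
import os
import tempfile


def load_package_index(cache_path, rebuild):
    """Return the index, rebuilding the cache file if it is missing or unreadable."""
    try:
        with open(cache_path) as f:
            return json.load(f)
    except (OSError, ValueError):
        # Cache is gone or corrupt: rebuild from the definitive source and
        # store the result again. Callers see the same data either way.
        index = rebuild()
        with open(cache_path, "w") as f:
            json.dump(index, f)
        return index
```

Deleting the file at cache_path between two calls changes only how often rebuild runs, not what the caller gets back.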

netmask (netmask) wrote :

As decided in an IRC chat, this request is invalid, and will be closed.

Changed in smart:
status: Incomplete → Invalid