Friday, 16 July 2010

GSoC Report Week 8: Dealing with bad entries in the cache

This week I worked on the implementation for dealing with bad entries in the cache. For some background on how the cache works, see my previous post.

For remote entries, we consider an entry bad if either 1) it is not accessible (it times out every time we try to access it), or 2) it is accessible but the repository doesn't exist.

We consider a local entry bad if it points to a repository that doesn't exist.

At the beginning, the idea was to remove those entries from the cache automatically. But as I mentioned in the previous post, sometimes there are external factors that prevent me from reaching an entry. So the approach taken was simply to stop using those entries for the rest of the command, and at the end to notify the user and allow her to delete the bad entries interactively, something like:

$ There seems to be a problem with the following repos: repo1, repo2, repo3
Would you like to delete them from the cache? Y/N
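Since this part is not implemented yet, the following is only a rough sketch of how such a prompt could look. All the names (promptMessage, parseAnswer, askDelete) are hypothetical and not taken from the Darcs code base.

```haskell
import Data.Char (toLower)
import Data.List (intercalate)

-- Hypothetical helper: build the two-line message shown to the user.
promptMessage :: [String] -> String
promptMessage repos = unlines
  [ "There seems to be a problem with the following repos: "
      ++ intercalate ", " repos
  , "Would you like to delete them from the cache? Y/N"
  ]

-- Hypothetical helper: accept "y" or "n" in any case, ignoring
-- surrounding whitespace. Anything else is not a valid answer.
parseAnswer :: String -> Maybe Bool
parseAnswer s = case map (map toLower) (words s) of
  ["y"] -> Just True
  ["n"] -> Just False
  _     -> Nothing

-- Ask until the user gives a valid answer; True means "delete".
askDelete :: [String] -> IO Bool
askDelete repos = do
  putStr (promptMessage repos)
  answer <- getLine
  case parseAnswer answer of
    Just decision -> return decision
    Nothing       -> askDelete repos
```

Keeping the message and the answer parsing as pure functions makes them easy to test without a terminal.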

I haven't yet implemented the part that notifies the user about the bad entries, but I did introduce the changes to stop using an entry once we have a problem with it. Remote repositories that time out are added immediately to the list of bad caches, and we don't try to fetch patches from them again.

If a remote entry doesn't time out but throws an error, the cause could be that a) the requested file wasn't there, or b) the repository doesn't exist. So when I get an error of this type, I check what caused it: if the repository doesn't exist, the entry is added to the list of bad caches; otherwise it is added to a list of "good entries". If a good entry later fails again because a file doesn't exist, we don't need to repeat any of these checks.
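The bookkeeping can be sketched roughly as follows. This is a simplified illustration, not the actual Darcs code: every name in it is hypothetical, and the repository check is passed in as a pure function (repoExists) where the real thing would need IO.

```haskell
-- All names here are hypothetical; this is not the actual Darcs code.
type Entry = String

-- The two kinds of failure distinguished in the text.
data FetchError = TimedOut | OtherError
  deriving (Eq, Show)

data CacheStatus = CacheStatus
  { badEntries  :: [Entry]  -- skipped for the rest of the command
  , goodEntries :: [Entry]  -- repository already confirmed to exist
  } deriving (Eq, Show)

emptyStatus :: CacheStatus
emptyStatus = CacheStatus [] []

-- An entry may be used as long as it is not on the bad list.
usable :: CacheStatus -> Entry -> Bool
usable st e = e `notElem` badEntries st

-- Record a failed fetch. The repoExists argument stands in for the
-- check that verifies whether the repository is really there; it is
-- only consulted for non-timeout errors on entries not yet known good.
recordFailure :: (Entry -> Bool) -> FetchError -> Entry
              -> CacheStatus -> CacheStatus
recordFailure repoExists err e st
  | e `elem` badEntries st  = st       -- already marked bad
  | err == TimedOut         = markBad  -- timeouts are bad immediately
  | e `elem` goodEntries st = st       -- known good: just a missing file
  | repoExists e            = st { goodEntries = e : goodEntries st }
  | otherwise               = markBad  -- repository doesn't exist
  where
    markBad = st { badEntries = e : badEntries st }
```

A caller would filter the cache entries with usable before each fetch and thread the updated CacheStatus through the rest of the command.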

A similar approach is taken for local entries that fail.

After introducing the changes, I did a lazy get of Tahoe-LAFS, then added a bogus entry to the cache for the timeout case and compared how long the command darcs changes takes with my version and with the version on Hackage.

With 2.4.4 (release), which is the version on Hackage, it took a bit more than 13 minutes:

real 13m38.415s
user 0m4.148s
sys 0m1.060s

After rebuilding with my changes, the results were really good, with the command taking less than 1 minute to fetch the changes:

real 0m55.679s
user 0m1.092s
sys 0m0.240s

Before my changes, Darcs would try to establish a connection with each of the entries in the cache for each patch (waiting for the timeout every time). With the changes, once it finds a bad entry it doesn't try to use it for the rest of the command, which makes operations faster since we don't waste time trying to use bad resources.

I also implemented an environment variable, "DARCS_CONNECTION_TIMEOUT", which sets the waiting time for a request.
When we are using libcurl, I set the CURLOPT_TIMEOUT option; when we are using Network.HTTP instead, I wrap the simpleHTTP call in the timeout function from System.Timeout. On Linux it works perfectly with both libcurl and simpleHTTP, but on Windows I have a problem with the 'timeout' function from System.Timeout: it doesn't behave as expected and seems to be ignored.
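The System.Timeout side of this can be sketched as below. It is a minimal illustration under several assumptions: the variable is taken to hold a number of seconds (CURLOPT_TIMEOUT is in seconds, but the unit of the variable isn't stated here), the default of 30 seconds is made up, and simulatedRequest is a stand-in for the simpleHTTP call so that the example only needs base. The helper names are hypothetical.

```haskell
import Control.Concurrent (threadDelay)
import System.Environment (lookupEnv)
import System.Timeout (timeout)
import Text.Read (readMaybe)

-- Assumed default in seconds; the real default may differ.
defaultTimeoutSecs :: Int
defaultTimeoutSecs = 30

-- Interpret the variable's value, falling back to the default when it
-- is unset, not a number, or not positive.
parseTimeout :: Maybe String -> Int
parseTimeout mv = case mv >>= readMaybe of
  Just n | n > 0 -> n
  _              -> defaultTimeoutSecs

-- Read DARCS_CONNECTION_TIMEOUT from the environment.
connectionTimeoutSecs :: IO Int
connectionTimeoutSecs =
  parseTimeout <$> lookupEnv "DARCS_CONNECTION_TIMEOUT"

-- Run an action, giving up after the given number of seconds.
-- System.Timeout.timeout expects microseconds and returns Nothing
-- if the action did not finish in time.
runWithTimeout :: Int -> IO a -> IO (Maybe a)
runWithTimeout secs = timeout (secs * 1000000)

-- Stand-in for a network request that takes the given number of
-- seconds; in the real code this would be the simpleHTTP call.
simulatedRequest :: Int -> IO String
simulatedRequest secs = do
  threadDelay (secs * 1000000)
  return "response"

-- Wrap a request in the timeout configured through the environment.
fetchWithTimeout :: IO a -> IO (Maybe a)
fetchWithTimeout request = do
  secs <- connectionTimeoutSecs
  runWithTimeout secs request
```

One thing to keep in mind is that timeout works by throwing an asynchronous exception at the running action, so, as its documentation notes, it cannot interrupt a thread that is blocked inside a foreign call. Whether that is what happens on Windows is not confirmed here.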

Next week I will focus on finishing this, extending the Haddock documentation of the cache, and adding tests. Remember that you can always check my progress on the wiki.
