Last week I mentioned that the implementation to deal with bad entries in the cache was almost ready. During the week I finally sent a first version of the patch, got some review, and am now working to improve it.
One of the things I will change is the way we determine which error occurred when trying to use one of the entries in the cache. In my patch I was matching on the error string (only because at the moment we don't have an ADT for some errors, for example the ones thrown from libcurl), so I will implement a new data type which will help us determine, in a safer way, which error we got.
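To make the idea concrete, here is a minimal sketch of what such a type could look like; the CacheError name and its constructors are my own placeholders, not what will actually land in Darcs:

-- Hypothetical error type for cache entries; constructor names are
-- illustrative only, not the ones from the actual patch.
data CacheError
  = ConnectionTimeout    -- the host didn't answer within the time limit
  | RepositoryNotFound   -- the entry points to a repository that is gone
  | FileNotFound         -- the repository exists but lacks the requested file
  | OtherError String    -- anything else, keeping the original message
  deriving (Eq, Show)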
Another thing I will rewrite is the way we determine whether an ssh entry is bad or not. Again we have the problem of the error type with ssh (we can't infer it), so what I was doing was making a request to the server and checking whether it was reachable. Then I realized that isn't correct: having an ssh server listening on port 22 doesn't necessarily mean the machine is serving anything on port 80 too.
I also have a first draft of the high-level documentation, which aims to explain how the cache system works. I asked for feedback on the darcs-users mailing list, without much success, so I would really appreciate it if you could take a look and give me some feedback.
During the coming weeks I will finish this patch and work on documentation and testing. More information about my progress can be found in the wiki.
Friday, 16 July 2010
GSoC Report Week 8: Dealing with bad entries in the cache
During this week I have been working on the implementation to deal with bad entries in the cache; for a bit of background on how the cache works, read my previous post.
For remote entries, we consider problematic the ones that 1) are not accessible (time out each time we try to reach them) or 2) are accessible but point to a repository that doesn't exist.
A bad local entry is one which points to a repository that doesn't exist.
At the beginning the idea was to automatically remove those entries from the cache, but as I mentioned in the previous post, sometimes external factors prevent us from reaching an entry. So the approach taken was simply to stop using those entries during the rest of the command, and at the end notify the user and let her delete the bad entries interactively, something like:
$ There seems to be a problem with the following repos: repo1, repo2, repo3
Would you like to delete them from the cache? Y/N
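As a rough illustration, this is what such a prompt could look like in Haskell; promptDeleteBadEntries and deleteFromCache are hypothetical names, a sketch of the plan rather than code from the patch:

import Control.Monad (when)
import Data.Char (toLower)
import Data.List (intercalate)
import System.IO (hFlush, stdout)

-- Sketch of the planned interactive prompt; the actual removal from the
-- cache is left as a placeholder.
promptDeleteBadEntries :: [String] -> IO ()
promptDeleteBadEntries []    = return ()
promptDeleteBadEntries repos = do
  putStrLn ("There seems to be a problem with the following repos: "
            ++ intercalate ", " repos)
  putStr "Would you like to delete them from the cache? [y/N] "
  hFlush stdout
  answer <- getLine
  when (map toLower answer `elem` ["y", "yes"]) $
    mapM_ deleteFromCache repos
  where
    deleteFromCache _ = return ()  -- placeholder for the real cache removal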
I haven't implemented the part that notifies the user about the bad entries yet, but I have introduced the changes to stop using an entry once we have a problem with it. Remote repositories which time out get added immediately to the list of bad caches, and we don't try to fetch patches from them. If a remote entry doesn't time out but throws an error, it could be because a) the requested file wasn't there, or b) the repository doesn't exist. So when I get an error of this type, I check what caused it: if the reason is that the repository doesn't exist, the entry is added to the list of bad caches; if not, it is added to a list of "good entries", so that if we later get another error because a file doesn't exist, we don't need to re-check any of the conditions mentioned before.
A similar approach is taken for local entries which fail, as the sketch below shows.
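In Haskell terms, the decision boils down to something like the following; the FetchFailure type and isBadEntry are names I made up for illustration, and the real code currently has to inspect the error messages themselves:

-- Hypothetical failure type; the current patch classifies errors by their
-- message strings since there is no proper ADT for them yet.
data FetchFailure = TimedOut | RepoMissing | FileMissing

-- True means "add to the list of bad caches and skip it for the rest of
-- the command".
isBadEntry :: FetchFailure -> Bool
isBadEntry TimedOut    = True   -- unreachable entry: don't keep waiting on it
isBadEntry RepoMissing = True   -- the repository is gone: nothing to fetch from it
isBadEntry FileMissing = False  -- the repo is fine, it just lacks this file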
After introducing the changes, I did a lazy get of Tahoe-LAFS, added a bogus entry to trigger the timeout case, and measured the time taken by the command darcs changes, both with my version and with the version on Hackage.
With 2.4.4 (release), the version on Hackage, it took a bit more than 13 minutes:
real 13m38.415s
user 0m4.148s
sys 0m1.060s
After rebuilding with my changes, the results were really good, taking less than 1 minute to fetch the changes:
real 0m55.679s
user 0m1.092s
sys 0m0.240s
Before my changes, Darcs would try to establish a connection with each of the entries in the cache for each patch (waiting for a timeout every time). Now, once it finds a bad entry, it doesn't try to use it for the rest of the command, which makes operations faster since we don't waste time on bad resources.
I also implemented an environment variable, DARCS_CONNECTION_TIMEOUT, which sets the waiting time for a request.
To implement this, when we are using libcurl I set the CURLOPT_TIMEOUT option; when we are using Network.HTTP instead, I wrapped the simpleHTTP call in the timeout function from System.Timeout. On Linux it works perfectly with both libcurl and simpleHTTP, but on Windows I have a problem with the timeout function from System.Timeout: it doesn't behave as expected and seems to get ignored.
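For the Network.HTTP side, the wrapping looks roughly like this; fetchWithTimeout and the 30-second fallback are my own assumptions, not necessarily what the patch uses:

import Network.HTTP (Response, getRequest, simpleHTTP)
import Network.Stream (Result)
import System.Environment (lookupEnv)
import System.Timeout (timeout)

-- Hypothetical helper: read DARCS_CONNECTION_TIMEOUT (in seconds, assuming
-- a default of 30) and give up on the request once the limit is exceeded.
-- A Nothing result means the request timed out.
fetchWithTimeout :: String -> IO (Maybe (Result (Response String)))
fetchWithTimeout url = do
  secs <- maybe 30 read <$> lookupEnv "DARCS_CONNECTION_TIMEOUT"
  timeout (secs * 1000000) (simpleHTTP (getRequest url))  -- timeout wants microseconds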
Next week I will focus on finishing this, extending the Haddock documentation of the cache, and adding tests. Remember you can always check my progress in the wiki.