Wednesday 11 August 2010

GSoC Week 12

Wow, this is my last GSoC-related entry. I'm publishing it early because I won't be around during the weekend.

I mentioned last week that we now have a better mechanism to handle bad caches. This week I worked mainly on adding documentation, extending the user manual, and finally sending the patch that adds the environment variable DARCS_CONNECTION_TIMEOUT.

Overall my experience with the people from Darcs was really good, given that this was my first experience contributing to an open source project. Sometimes I had moments when I felt really awkward, but my mentor and the rest of the people on IRC would kindly help me understand the things I didn't know.

With Eric (my mentor) things went pretty smoothly. We would meet weekly and discuss what I had done during the previous week, what I had accomplished, my doubts, and what I would work on next. He always tried to keep a culture of getting things done, and at the beginning, when I wasn't very familiar with Darcs, he would help me get more familiar with it through some "quiz" questions, which would lead me to an "aha!" moment and to finding the answer by myself.

What's next? I plan to continue contributing to Darcs. For me the most difficult part of an open source project is getting started. I think I have got past that, and I want to keep the momentum, so I will try to keep contributing as much as I can.

I would also like to say thanks to the following people on IRC who helped me when I appeared there asking questions: kowey, mornfall, lispy, Heffalump, sm.

Thanks to the Darcs team and I hope to continue having fun and learning lots with you.

All the documentation of my project is in the wiki.

Sunday 8 August 2010

GSoC Week 11

In my last entry I mentioned I had already sent a patch that would let Darcs handle bad caches better. I'm happy to announce that my patch is now in the current head :).

Now we have better handling of bad entries in the cache. One of the main benefits of this fix is that some operations that used to take more than 10 minutes now take less than 1 minute; the time varies depending on the number of patches in the repository (see my report of week 8 for a better idea of what was happening).

I also mentioned I was having some issues trying to use the timeout function from System.Timeout on Windows. I wrote to haskell-cafe, and one person said that it worked for him on Windows 7, while another reported the same behaviour I was seeing on Windows XP. Then I tried on a friend's computer, and it worked intermittently. So my conclusion is that this is an issue related to Windows systems and not to GHC, as I thought at the beginning. I would really appreciate it if someone with more experience with Haskell and Windows systems could point out what is really happening.

For the coming week, I will focus on extending the documentation, sending my implementation of the timeout flag, and completing the "Future work" section in the high level document.

Sunday 25 July 2010

GSoC Week 9

Last week I mentioned that the implementation to deal with bad entries in the cache was almost ready. During the week I finally sent a first version of the patch; I got some review and now I'm working to improve it.

One of the things I will change is the way we determine which error we got when trying to use one of the entries in the cache. In my patch I was using the error string (because at the moment we don't have an ADT for some errors, for example the ones thrown from libcurl), so I will implement a new data type that will help us determine, in a safer way, which error we got.
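To make the idea concrete, here is a minimal sketch of what such a data type could look like. All the names (FetchError, its constructors, isFatalForEntry) are made up for illustration; they are not the ones in the actual patch, and the set of error cases is only a guess at what might be useful.

```haskell
-- Hypothetical error type; the names are illustrative, not Darcs's own.
data FetchError
  = ConnectionTimeout        -- the server did not answer in time
  | CouldNotResolveHost      -- the host name could not be resolved
  | FileNotFound             -- the server answered, but the file is missing
  | OtherFetchError String   -- anything we do not classify yet
  deriving (Eq, Show)

-- | Decide whether an error is a reason to stop using a cache entry.
-- Pattern matching on constructors replaces comparing error strings,
-- so the compiler can warn us when a case is forgotten.
isFatalForEntry :: FetchError -> Bool
isFatalForEntry ConnectionTimeout   = True
isFatalForEntry CouldNotResolveHost = True
isFatalForEntry FileNotFound        = False
isFatalForEntry (OtherFetchError _) = False
```

The point of the design is that a typo in an error message can no longer silently change which branch is taken.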

Another thing I will rewrite is the way we determine whether an ssh entry is bad or not. Again we have the problem of the error type with ssh (which we couldn't infer), so what I was doing was sending a request to the server to check whether it was reachable. But then I realized that this wasn't correct: having an ssh server listening on port 22 doesn't necessarily mean that the host is listening on port 80 too.

I also have a first draft of the high level documentation, which aims to explain how the cache system works. I made a call for feedback on the darcs-users mailing list, which wasn't very successful. I would really appreciate it if you could take a look at it and give me some feedback.

During the coming weeks I will finish this patch and work on documentation and testing. More information on my progress can be found in the wiki.

Friday 16 July 2010

GSoC Report Week 8: Dealing with bad entries in the cache

This week I have been working on the implementation to deal with bad entries in the cache. For a bit of background on how the cache works, read my previous post.

For remote entries, we consider bad the ones that 1) are not accessible (they time out each time we try to access them) or 2) are accessible but the repository doesn't exist.

We consider a local entry bad when it points to a repository that doesn't exist.

At the beginning the idea was to automatically remove those entries from the cache but, as I mentioned in the previous post, sometimes there are external factors that prevent me from reaching an entry. So the approach taken was simply to stop using those entries for the rest of the command, and at the end notify the user and allow her to delete the bad entries interactively, something like:

$ There seems to be a problem with the following repos: repo1, repo2, repo3
Would you like to delete them from the cache ? Y/N

I haven't implemented the part that notifies the user about the bad entries, but I introduced the changes to stop using an entry if we have a problem with it. Remote repositories that time out get added immediately to the list of bad caches, and we don't try to fetch patches from them. If a remote entry doesn't time out but throws an error, it could be because a) the requested file wasn't there or b) the repository doesn't exist. So when I get an error of this type I verify what caused it: if the reason is that the repository doesn't exist, the entry is added to the list of bad caches; if not, it is added to a list of "good entries". If we later get another error from that entry because a file doesn't exist, we won't need to check any of the conditions mentioned before.

An approach similar to the one for remote repositories is taken for local entries that fail.
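The bookkeeping described above can be sketched roughly like this. It is a simplified model, not the code in the patch: the names (Known, Outcome, recordFailure, usable) are invented for the example, entries are plain strings, and the "does the repository exist?" check is passed in as a pure function, whereas in reality it would be an IO action.

```haskell
type Entry = String

-- | How a fetch from an entry failed (hypothetical classification).
data Outcome
  = TimedOut   -- the request timed out
  | Failed     -- the request returned an error without timing out
  deriving (Eq, Show)

-- | What we have learnt about the entries during the current command.
data Known = Known
  { badEntries  :: [Entry]   -- never try these again in this command
  , goodEntries :: [Entry]   -- repository verified to exist
  } deriving (Eq, Show)

emptyKnown :: Known
emptyKnown = Known [] []

-- | Should we still try to fetch from this entry?
usable :: Known -> Entry -> Bool
usable k e = e `notElem` badEntries k

-- | Record a failed fetch. The first argument stands in for the
-- "does the repository exist?" check.
recordFailure :: (Entry -> Bool) -> Outcome -> Entry -> Known -> Known
recordFailure repoExists outcome e k
  | outcome == TimedOut      = markBad   -- time-outs are bad right away
  | e `elem` goodEntries k   = k         -- already verified: only a missing file
  | repoExists e             = k { goodEntries = e : goodEntries k }
  | otherwise                = markBad   -- the repository itself is gone
  where
    markBad = k { badEntries = e : badEntries k }
```

Before each fetch the caller would ask `usable` and skip the entry if it returns False, which is what saves the repeated waiting.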


After I introduced the changes, I did a lazy get of Tahoe-LAFS, then added a bogus entry for the timeout case and compared the time taken by the command darcs changes, both with my version and with the version on Hackage.

For 2.4.4 (release), which is the version on Hackage, it took a bit more than 13 minutes.

real 13m38.415s
user 0m4.148s
sys 0m1.060s

Then, after I rebuilt with my changes, the results were really good, taking less than 1 minute to fetch the changes:

real 0m55.679s
user 0m1.092s
sys 0m0.240s

Before my changes Darcs would try to establish a connection with each of the entries in the cache for each patch (waiting for the timeout each time). Now, with the changes, if it finds a bad entry it doesn't try to use it for the rest of the command, which makes operations faster since we don't waste time trying to use bad resources.



I also implemented an environment variable, "DARCS_CONNECTION_TIMEOUT", which sets the waiting time for a request.
To implement this, if we are using libcurl I set the CURLOPT_TIMEOUT option; if we are not using libcurl but Network.HTTP, I wrap the simpleHTTP operation in the timeout function from System.Timeout. On Linux it works perfectly with both libcurl and simpleHTTP, but on Windows I have a problem with the 'timeout' function from System.Timeout: it doesn't behave as expected and seems to be ignored.
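For the non-libcurl case, the wrapping looks roughly like the sketch below. This is not the patch itself: the helper names (parseTimeoutSecs, withConnectionTimeout) are invented here, and I am assuming for the example that the variable holds a whole number of seconds and that a missing or invalid value means "no timeout". The action passed in stands for the simpleHTTP call.

```haskell
import System.Environment (lookupEnv)
import System.Timeout (timeout)
import Text.Read (readMaybe)

-- | Interpret the value of the environment variable.
-- Assumption: a positive whole number of seconds; anything else is ignored.
parseTimeoutSecs :: Maybe String -> Maybe Int
parseTimeoutSecs mval = do
  raw <- mval
  n   <- readMaybe raw
  if n > 0 then Just n else Nothing

-- | Run a request, giving up after DARCS_CONNECTION_TIMEOUT seconds.
-- Returns Nothing if the time limit was hit.
withConnectionTimeout :: IO a -> IO (Maybe a)
withConnectionTimeout action = do
  secs <- parseTimeoutSecs <$> lookupEnv "DARCS_CONNECTION_TIMEOUT"
  case secs of
    Nothing -> Just <$> action
    -- System.Timeout.timeout expects microseconds.
    Just s  -> timeout (s * 1000000) action
```

One thing worth knowing is that the System.Timeout documentation says timeout relies on asynchronous exceptions and cannot interrupt foreign function calls. I don't know whether that is related to what I see on Windows.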


For the next week I will focus on finishing this and on extending the haddock documentation of the cache and the tests. Remember you can always check my progress in the wiki.

Saturday 26 June 2010

GSoC Report: Week 5

In my last post I said I had completed the warm-up phase and that I would start to think about how to handle caches which are no longer available.

The caching mechanism relies on the files _darcs/prefs/sources and ~/.darcs/sources. Basically, the content of those files is used to generate the cache entries; each entry indicates an alternative source to get files from. If we want to specify global caches we put them in ~/.darcs/sources, but if we want alternative repositories to pull from, we specify them in the repository's sources file, _darcs/prefs/sources. Also, each time we pull from an external repository it is added to the sources file automatically.

The problem of expiring caches arises because repositories that were available can become unavailable. For example, if I had pulled from 3 different repositories and 2 of them stop being available, it could take up to 2 minutes to get each patch, because Darcs could try to fetch every patch it needs from those 2 repositories that are no longer available: it tries to establish a connection, and then waits for a time-out or a bad response code. Given this problem, the idea is to design a mechanism that helps Darcs establish which entries should be expired.


We can split the cache entries into two groups: local and remote. Dealing with local repositories that are no longer reachable is not a big deal: if we don't find a local entry we can assume it doesn't exist, drop it from the cache, and stop trying to fetch files from it. Remote repositories are trickier. For example, I can't eliminate an entry just because it times out when Darcs tries to establish a connection with it; there are other external factors which could interfere with that particular entry at a given moment (firewalls).

So, as we have seen, this is not an easy task. Handling remote repositories is out of our hands: we don't have control over the external sources, over the network configuration, and so on. A first approach is to mark the entries which are not working and ignore them for the rest of the pull, since we don't want to try to establish a connection with an entry which we know is not available. If we try to establish a connection and it fails, we can mark the entry as bad. But I also think it could be awkward to wait for a 60-second time-out, so something we could implement is a default time to wait for a connection to succeed (10-15 seconds maybe); if it doesn't succeed within that time we can skip the entry, mark it as bad, and not try it for the rest of the patches. Another approach suggested in the bug tracker was to try to establish a connection with all of the entries and use the one that responds first, but then what if all the entries are bad? I have to think more about it. I will discuss it on IRC and the mailing list, and then, with a clearer idea, I will start to code a patch to solve the issue.

I also sent a first version of a failing test for the case of unreachable entries, but I need to amend it, as there are some missing cases.

More of my progress can be found in the wiki.

Sunday 20 June 2010

GSoC Report: Week 4

I have completed my phase of warm-up issues, which was meant to let me get familiar with the Darcs system, specifically the cache part. I have sent the patches and they have been applied.

During the week I worked on finishing issue 1176; continued with the documentation, more specifically describing at a higher level how a patch is fetched given its hash, to make it easier for someone with a non-technical background to understand what is happening; fixed a test which was failing on Windows when it shouldn't; and found out why the IO operations over hashedRepos were put in a different module.

To fix issue 1176 I modified some code I had already sent, which was about keeping the caches sorted by locality. The initial idea was that anything local should come first in the list, followed by the remote sources, but we realized that among the remotes there is also a "wanted" hierarchy: basically, we would prefer to access http repos before ssh ones. So the new sorting keeps all the locals first, http in the middle, and ssh repos at the end. One of the problems this solves is the weird behaviour of darcs trying to establish an ssh connection when pulling from an http or local repository.
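The ordering can be illustrated with a small sketch. It is not the code from the patch: the Locality type and the classify function are invented for the example, and the way classify recognises an ssh source (an "ssh://" prefix or an '@' in the string) is a deliberately crude assumption just to keep the example short.

```haskell
import Data.List (isPrefixOf, sortOn)

-- | The constructor order gives the preferred order: local, http, ssh.
data Locality = Local | Http | Ssh
  deriving (Eq, Ord, Show)

-- | Guess where a source lives from its textual form.
-- Assumption: a simplified heuristic, only for this example.
classify :: String -> Locality
classify src
  | any (`isPrefixOf` src) ["http://", "https://"] = Http
  | "ssh://" `isPrefixOf` src || '@' `elem` src    = Ssh
  | otherwise                                      = Local

-- | Sort sources so that locals come first, then http, then ssh.
-- sortOn is stable, so entries of the same kind keep their relative order.
sortSources :: [String] -> [String]
sortSources = sortOn classify
```

For example, `sortSources ["user@host:repo", "http://example.org/repo", "/home/me/repo"]` puts the local path first, the http source second, and the ssh source last.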


While I was working on the failing test we redefined what gets saved in _darcs/prefs/sources. The global caches were one of those things: it wasn't necessary to have them there, because that is what the global cache configuration file (~/.darcs/sources) exists for. So basically we just drop anything related to global caches before saving the sources file of a repository.


For the next week I will start to work on the problem of how to deal with the unused cache; the idea is to have a work plan and a test case written by the end of the week.

You can check out more of my progress in the wiki.

Saturday 5 June 2010

GSoC Week 2

After last week's meeting with Eric (my mentor) we got a better timeline written down, and now I have clear goals for each week until the midterm evaluation. For this week my goals were:
* Complete issue1503
* Complete issue1210
* Description of cache usage for each module in Darcs.Repository

I'm happy to say that issue 1503 was closed, and I sent the patch for issue1210, but it is waiting for review. I also noticed that I'm more familiar with the darcs structure (at least the part in which I'm working) and I've started to know where to look each time I need a particular piece of functionality.

When I first sent the patch for issue 1503, Eric and Petr made some comments about design and style which led me to rewrite the original patch. Thanks to them I have learnt to push myself to think about the bigger picture each time I plan to introduce changes somewhere. Sometimes you just develop bad practices which you don't see unless someone else points them out to you. So I feel more convinced that a good way to learn and become a better programmer is to contribute to this kind of project, where you interact with other people who comment on your code and make you think harder about what you are doing. All this is something you won't learn at university, but only by getting involved in something in the real world.

Another thing I learnt was that I had a wrong idea of the concept of tests. I thought the test case for a certain issue would be one that used to fail before applying the changes, but I hadn't thought about how the test should behave when it fails. Here thanks again to Eric for explaining it to me :).

For the next week I plan to work on issue1176, start to draw up a test plan for issue 1599, write the test cases for issues 1503 and 1210, and continue my work on the document about the darcs cache.


You can always check my progress and learn more about my project in the darcs wiki, where we are documenting everything.