Tuesday, October 5, 2010

Summary of CUFP 2010

This year was my first attending CUFP and I had a great time. I was pleasantly surprised at how strong a showing the OCaml community had. I knew Jane Street would be there but I ran into several other people working in OCaml. The star of the show was definitely F#, in my opinion. The weakest part of the conference was the lack of outlets. My laptop battery ran out by the second session of the first day and it was really quite difficult to find an outlet to charge it.

Day 1

The first day was broken into two sessions, each in a tutorial style. For the first session I was in Building Robust Servers Using Erlang, presented by Martin Logan from Orbitz. This stumbled a bit at the beginning; I think Martin was hoping people would be more familiar with Erlang as a language so he could delve into how to build a robust server. It picked up in the end though and I think he successfully drove his message home. The people I talked to after the session expected it to be a basic description of how to write Erlang but were impressed by the power of OTP, especially the supervisor model. A few people remarked that Erlang seemed great for anything that needed to be long running, so I think Martin was successful.

I jumped between all of the presentations in the second session.

F# - This was interesting, I hadn't seen F# much before. The presenter was teaching it through an ant simulation and had a contest with prizes.

Camlp4 and Template Haskell - I was a bit let down by what I saw of this one. It didn't seem like the presenters really gave a good introduction to templating languages. They presented a problem and let everyone work on it and would go around answering any questions. I wish my laptop battery was working so I could have taken a shot at playing with Camlp4. To their credit they were very helpful when asked but the initial presentation seemed lacking to me. Perhaps it was just too far over my head at this time.

Scala and Lift - This was the presentation that I had the least interest in but I think was the most well done. David's presentation was interactive and had no slides. He simply wrote code with you and explained what it did and I think that worked well. Everyone I talked to after seemed impressed by what Lift was capable of accomplishing so easily.

Day 2

Day 2 was all talks done in serial. I enjoyed most of the talks quite a bit. Yaron Minsky from Jane Street started out by saying something I think was important and easy to forget if you are heavily in the FP community. Despite the clear progress FP seems to be making (F# in Visual Studio, Real World Haskell, FP in several big companies), we really aren't growing like we'd like to think. For most people management either says no to a functional language or it has to be snuck in through the back door. That is why they chose the keynote to be about F#. Microsoft including it in Visual Studio is a big leap and probably the biggest news in terms of FP going mainstream. But is it enough? We'll find out in the coming years.

F# - This was the keynote presented by Luke Hoban from Microsoft and he painted a really great picture of F#. His talk spanned how they introduce F# to non-functional programmers, a demo of F#, and some experiences in productizing it. The integration with Visual Studio was top-notch. Luke showed off how easy it is to create a GUI, handle events, and run asynchronous code. It looked so nice it almost made me wish I was running Windows. The power of F#, to me, was making GUIs. The language looked like it had to be weakened a bit in order to successfully exist in the .Net ecosystem but if I ever find myself working on Windows I will gladly use F#. How much will F# be adopted by mainstream programmers? Who knows, I'm hoping quite a bit though.

Scaling Scala at Twitter - I knew Twitter used Scala but I did not realize they were such a large Scala shop. Scala was another language that seemed to have good representation at CUFP. I am still not sold on it but people seem to be doing great things with it. This talk was mostly about experiences in building the geolocation in Twitter. It was impressive that geolocation was built very quickly by two engineers who had no Scala or Java experience. There were two takeaways from this talk. The first is that the data center is the new computer. When you are designing a distributed application you really need to think differently about it than you would a non-distributed application. This should not be a surprise if you really think about it but the emphasis seemed to be that in many cases people don't realize there is a difference. The second was that we should be honest about GC and realize it is a leaky abstraction. It would be nice if the application could get information back from the GC. The application really knows best how to handle working under heavy load and it would be nice if it could query the GC to figure out what kind of load it is under. I am not quite sure how much I buy the second one: couldn't the application monitor itself based on some metric relevant to its operations and modify its behavior based on that?
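To make my question about self-monitoring concrete, here is a rough sketch of what I have in mind. This is not anything Twitter showed; the LoadMonitor class, the window size, and the latency threshold are all names and numbers I made up for illustration. The application records its own request latencies and backs off when the recent average gets too high, without asking the GC anything.

```python
from collections import deque


class LoadMonitor:
    """Tracks recent request latencies (hypothetical helper, not a real library)."""

    def __init__(self, window=100, threshold_ms=250.0):
        # Keep only the most recent `window` samples.
        self._samples = deque(maxlen=window)
        self._threshold_ms = threshold_ms

    def record(self, latency_ms):
        self._samples.append(latency_ms)

    def overloaded(self):
        # No data yet means no evidence of overload.
        if not self._samples:
            return False
        average = sum(self._samples) / len(self._samples)
        return average > self._threshold_ms


monitor = LoadMonitor(window=3, threshold_ms=100.0)
for latency in (50.0, 60.0, 70.0):
    monitor.record(latency)
print(monitor.overloaded())  # the average is well under the threshold here
```

Whether a metric like this reacts fast enough compared to a signal straight from the GC is exactly the part I am unsure about.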

Cryptol, a DSL for Cryptographic Algorithms - This was from the people at Galois. I don't know much about Galois other than dons works there and they do Haskell, but it looks like they get nice government contracts too. I had no idea how complex the world of cryptology is. I knew the algorithms were sophisticated but not the rest of it. Cryptol seems powerful but much of the talk was over my head.

Naïveté vs. Experience - or, How We Thought We Could Use Scala and Clojure, and How We Actually Did - This talk was by Michael Fogus and was my favorite. Michael was entertaining and insightful. Most of this talk was about Scala and it included why they moved to Scala from Java, what they expected to use in Scala, what they actually did use in Scala, and the problems with Scala. Michael talked a lot about how he convinced his team to move to Scala as well. The experiences were positive but it did take a lot of convincing. The slides to his talk can be found here.

Reactive Extensions (Rx): Curing Your Asynchronous Programming Blues - Sadly Erik Meijer was unable to present this. I forgot to write down the name of who did present it, Wes something, but he did a great job. Rx looks really cool. I don't know how it scales up in writing an application but Wes was able to throw together some interesting programs very quickly using Rx. Rx is a Reactive Programming library for .Net. In short, it treats events like a collection and you simply iterate over the collection to get events (you can even use LINQ). This makes event-driven software easier to think about and makes it easier to compose events together. All of his examples were in C# but, because Rx is on .Net, it can be used seamlessly with F# (or at least that is the impression I got).
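To give a feel for the "events as a collection" idea, here is a toy sketch in Python. It is not the Rx API and not what Wes showed (his examples were C#); EventStream and its methods are names I invented. The point is only that filter and map on a stream of events compose the same way they would on a list.

```python
class EventStream:
    """A tiny push-based stream (hypothetical, for illustration only)."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def emit(self, event):
        # Push the event to every subscriber.
        for callback in list(self._subscribers):
            callback(event)

    def filter(self, predicate):
        # Derived stream that only forwards events matching the predicate.
        out = EventStream()
        self.subscribe(lambda e: out.emit(e) if predicate(e) else None)
        return out

    def map(self, fn):
        # Derived stream that forwards transformed events.
        out = EventStream()
        self.subscribe(lambda e: out.emit(fn(e)))
        return out


clicks = EventStream()
seen = []

# Reads like a collection query: keep left clicks, project to the x coordinate.
clicks.filter(lambda c: c["button"] == "left").map(lambda c: c["x"]).subscribe(seen.append)

clicks.emit({"button": "left", "x": 10})
clicks.emit({"button": "right", "x": 20})
clicks.emit({"button": "left", "x": 30})
print(seen)  # [10, 30]
```

The real library adds a lot on top of this (error and completion signals, scheduling, unsubscription), none of which this sketch attempts.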

Eden: An F#/WPF framework for building GUI tools - Eden is built by the Credit Suisse guys so they were unable to actually show Eden, however a subset of its functionality was built for the talk. This showed off more of how pretty GUIs can be created with F#. WPF has really great graphics and looked easy to produce. The portion of Eden shown was using a graph-based layout to calculate output on demand. It is difficult to explain succinctly but this talk showed off GUIs in F# as well as how easy it is to create asynchronous code. F#'s two strongest points seem to be OCaml's two weakest points.

Functional Language Compiler Experiences at Intel - The speaker couldn't talk too much about what they were working on (apparently Intel is making a functional language designed to be used for their processors with many many cores) but they did have some interesting meta-things to say. The first was, even in the FP world, sometimes you just want impurity. The second was, if your FP language is going to allow you to write code imperatively, don't make the syntax terrible. In this case they were writing SML. Finally, it is harder to teach FP to someone who has programming experience than to someone completely fresh. In their case they were looking at 8 - 12 months before really getting a return from the people they were training.

Riak Core: Building Distributed Applications Without Shared State - This talk was great. Rusty from Basho gave a great look at the important functionality in Riak. Riak is broken into three components: Riak Core - a core library for building robust distributed applications in Erlang, Riak KV - a key-value store using Riak Core, and Riak Search - a full text search engine using Riak Core. The message here, again, was the data center is the computer. I thought Riak's usage of virtual nodes was interesting too, and it seemed obvious in hindsight. Rather than break your distributed application up by physical nodes, create a ton of virtual nodes (more than you'll ever have physical nodes) and then map those to physical nodes. Take sharding, for example: if you map directly to physical nodes, once you add a new physical node you'll have to repartition your shards all over again. But if you have a few hundred virtual nodes, adding a physical node just means you have to move some data to it and point some of the virtual nodes at it, and your upstream code doesn't need to change at all. Riak Core helps take care of the virtual node mapping for you as well as how to push data around when you add or take away physical nodes.
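Here is a small sketch of the virtual node idea as I understood it. This is not Riak Core's code or API (Riak Core is Erlang); the function names, the vnode count of 64, and the rebalancing rule are my own assumptions. Keys hash to a fixed set of virtual nodes, and a separate table maps virtual nodes to physical nodes, so adding a machine only changes the table.

```python
import hashlib
from collections import Counter

NUM_VNODES = 64  # fixed up front; assumed to exceed any future physical node count


def vnode_for(key):
    """Hash a key to a virtual node. This never changes when machines come and go."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_VNODES


def assign(physical_nodes):
    """Initial table: spread virtual nodes round-robin over the physical nodes."""
    return {v: physical_nodes[v % len(physical_nodes)] for v in range(NUM_VNODES)}


def add_node(table, new_node):
    """Give new_node its fair share by taking vnodes from the most loaded owners.

    Assumes new_node is not already in the table.
    Returns the new table and the list of vnodes whose data must move.
    """
    table = dict(table)
    target = NUM_VNODES // (len(set(table.values())) + 1)
    moved = []
    for _ in range(target):
        load = Counter(owner for owner in table.values() if owner != new_node)
        donor = load.most_common(1)[0][0]
        vnode = min(v for v, owner in table.items() if owner == donor)
        table[vnode] = new_node
        moved.append(vnode)
    return table, moved


before = assign(["a", "b", "c"])
after, moved = add_node(before, "d")
print(len(moved), "of", NUM_VNODES, "virtual nodes moved")
```

Upstream code only ever calls vnode_for, which is why it does not need to change when the table does.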

Functional Programming at Freebase - I was excited for this talk but sadly let down. This involved rewriting Freebase's query language parser and executor from Python to OCaml. It looked more like mental masturbation to me though. Several times I found myself simply wondering why some choices were made. Many of the choices came off as wishing he were writing Haskell. In the end the speaker got a 10x speed up, which was pointed out to not be very good, and it looked like they had to go through a lot of headaches along the way.

ACL2: Eating One's Own Dogfood - I was unable to attend this talk.

I enjoyed CUFP quite a bit. It was great to meet the people I read papers from or see on videos about my favorite languages. In terms of being mainstream, Scala seemed to be making the fastest gains, most likely because it is so close to Java that it is an easy switch. In many ways I felt like we are all slowly catching up to Haskell. Many of the technical ideas presented here have already existed in Haskell for quite a while and I could almost see the frustration on the faces of the Haskell people wondering why the rest of us haven't figured out that we should be writing it. I'm hoping that next year the number of companies adopting functional languages continues to grow so we can see more examples of FP in industry at the next CUFP.

Wednesday, March 31, 2010

Learn You Some Erlang renamed Learn You Some Scala

I'm excited to announce that Frederic Trottier-Hebert has decided to change the name of Learn You Some Erlang to Learn You Some Scala! This should come as no surprise to most of us. As I demonstrated here Scala and Erlang are really the same language. With the growing popularity of Scala it only makes sense to target the Scala audience (whom we can thank for Erlang's actors). I got the chance to talk to Frederic about the change. When asked what finally prompted the change he said:

<MononcQc> Well yeah, I mean I was there when you first were talking with Virding about the migration of Erlang to the JVM. I'm quoted in that blog post

<MononcQc> that discovery was pretty much a shock to me too, and so it's why I've pondered this and discussed the whole issue over #erlang on the course of the last few weeks

<MononcQc> I picked up one of the many great books about Scala and realized that 'damn, they're the same stuff!'

<MononcQc> Scala being bigger with the JVM being stress tested in production environment (sometimes claiming 9 nines of uptime)

<MononcQc> I decided to do the switch.

<MononcQc> So LYSE becomes LYSS

<MononcQc> It's much more marketable anyway

Some of the changes he has told me are upcoming:

OTP In Scala - How to work with some of the Scala specific OTP libraries to get better soft real time guarantees and performance

Mnesia and Scala - Mnesia is written in Erlang/Scala so moving your databases should Just Work. There should be a pretty big performance increase due to the JIT too (performance improvements have been shown to be about 20%-25.4%). I'm pretty excited about this one.

JVM Performance tuning - When to use -client and when to use -server will play a big part in this chapter. Frederic plans on really covering the nitty-gritty details of JVM tuning. Frederic admits that he hasn't done much work with the JVM but, given the similarity to beam, doesn't foresee that being a problem.

Java interop - No more need to use jinterface, Java interop is much easier when running on the JVM!

What does Frederic have to say about possible backlash from the Erlang community about the name change? "I see none. I'm moving for the best". There you have it folks. Frederic said the rebranding is still a work in progress but he hopes to have the entire book moved over to Scala terminology in a few weeks.

Wednesday, March 10, 2010

How Much Has Scala Influenced Erlang?

This question was recently overheard at qcon. On the surface it seems like a rather silly question but after some digging I think the answer might surprise you.

Scala and Erlang have actually gone back and forth, sharing ideas, for quite a while now. While many people don't realize it, Erlang was first implemented on the JVM. This is back in the 1.x days when Java was using green threads. Joe had seen that this was a great platform for distributed, low latency, high throughput, concurrent applications and wrote a quick language implementation that quickly grew into Erlang. Martin Odersky has stated several times that Erlang was his influence for implementing Scala on the JVM. It was at this point that Scala introduced actors. While not a new concept, Scala was one of the first major implementations of actors. Erlang originally used a dataflow concept similar to what Mozart/Oz would later implement.

Why, you may ask, did Erlang finally move off the JVM? Such a great platform, you would expect them to stay. The answer is tail calls. The JVM has always lacked tail call optimization and Erlang needed it. Trampolining just was not sufficient for how Joe wanted to structure programs. So they ditched the JVM and implemented their own VM (known as beam). At this point, since Erlang was off the JVM, Sun (now Oracle) decided to drop the green threads for a more traditional threading implementation. While this transition may seem weird, it actually gave birth to what many people consider the most powerful part of Erlang: fault tolerance. Because Joe wanted to move off the JVM as quickly as possible he had people use his unfinished VM, which was very buggy. A lot of the fault tolerance aspects of Erlang come from people trying to work around Joe's buggy VM, which thankfully is quite stable now.

Richard Virding was nice enough to give us a little rundown on what it was like in the early days of moving from the JVM to beam (from #erlang on freenode):

< MononcQc> Did the need for distribution have anything to do with your move from the JVM?

< rvirding> No not at all, Terracotta implemented everything we wanted, and better, it was really the tail calls that pushed it.

< rvirding> I can remember one night late with Joe we were trying to get our own distribution layer worker and we were having trouble with the security model.

< rvirding> Joe had just happened to be playing with Netscape that day and saw this idea they introduced called 'cookies' and he said "Richard, I've got it!"

All of this makes a lot of sense when you think about it though. Given the popularity of Scala and that it was a forerunner to Erlang's style of programming you can see why there are so few Erlang books. It also becomes clear why Erlang has such a hard time penetrating the market. You can pretty much pick up Erlang just from a Scala book and at that point, why not just use Scala?

Thursday, January 28, 2010

EC2 Acting Up? Instances not showing up in describe after being launched

There have been a bunch of posts lately about EC2 being oversaturated. The latest issue I have run into: I launch an instance via ec2-run-instances, the call returns successfully, but no new instance shows up in ec2-describe-instances. The instance eventually shows up in ec2-describe-instances about 10 minutes later, up and running. Unfortunately I cannot reproduce this easily and the actual code which handles this is buried deep in some layers so I don't know what ec2-run-instances is returning (it appears to be returning something valid though). I also don't feel like trying to bring up instances until one fails unless Amazon wants to reimburse me. Has anyone else noticed this?
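If you hit the same thing, one workaround is to poll until the instance becomes visible instead of trusting the first describe call. Below is a minimal sketch, not my actual code. list_instance_ids is a hypothetical callable you would supply yourself (for example, a wrapper that runs ec2-describe-instances and parses out the instance IDs); the timeout and interval defaults are guesses based on the roughly 10 minute delay I saw.

```python
import time


def wait_until_visible(instance_id, list_instance_ids,
                       timeout=900.0, interval=15.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until instance_id appears in list_instance_ids() or timeout elapses.

    list_instance_ids is a caller-supplied function returning the currently
    visible instance IDs (hypothetical; it is not part of any EC2 tool).
    Returns True if the instance showed up, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if instance_id in list_instance_ids():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Example with a fake listing that only "sees" the instance on the third call.
# The instance ID is made up.
calls = {"count": 0}


def fake_listing():
    calls["count"] += 1
    return ["i-example"] if calls["count"] >= 3 else []


print(wait_until_visible("i-example", fake_listing, interval=0.0))
```

This only papers over the delay; it does not explain why the instance is missing from the listing in the first place.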