Monday, November 20, 2006

Cryptogram solver

Newspaper puzzles beware. I have been working on solving newspaper cryptograms for the past few days in my free time. I think it is a more interesting problem to solve than sudoku. The algorithm I've implemented is nothing altogether impressive, and it relies on a good dictionary file to get solutions.

The algorithm works like so:
First you need an index, which is a map of abstract words to lists of words.
An abstract word is a word turned into a pattern. For instance, 'cat' has the pattern '123', as do 'hat' and 'fat'. 'mom' has the pattern '121'.
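
As a rough sketch of such an index (this is not the downloadable code), here is one way to compute patterns and build the map in Erlang. The module name cryptogram_index and both function names are made up for illustration, patterns are lists of integers rather than strings like '123', and it assumes an Erlang/OTP release with maps (19 or later for maps:update_with/4):

```erlang
-module(cryptogram_index).
-export([pattern/1, build/1]).

%% Number each distinct letter by order of first appearance:
%% "cat" gives [1,2,3] and "mom" gives [1,2,1].
pattern(Word) ->
    {Pattern, _Seen} =
        lists:mapfoldl(
          fun(Ch, Seen) ->
                  case maps:find(Ch, Seen) of
                      {ok, N} ->
                          {N, Seen};
                      error ->
                          Next = maps:size(Seen) + 1,
                          {Next, Seen#{Ch => Next}}
                  end
          end, #{}, Word),
    Pattern.

%% Group dictionary words under their pattern (hypothetical helper).
build(Words) ->
    lists:foldl(
      fun(Word, Index) ->
              maps:update_with(pattern(Word),
                               fun(Ws) -> [Word | Ws] end,
                               [Word], Index)
      end, #{}, Words).
```

Calling cryptogram_index:build(["cat", "hat", "fat", "mom"]) puts the first three words under one key and "mom" under another.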

Then it takes the sentence you have given it and splits it on spaces.
It pops the first word off the list, looks up the list of possible words it could be in the index, and iterates over that list. For each candidate it generates a map of letters to letters and recurses on the rest of the sentence. Once it has reached the end of the sentence, it puts the map it has generated into a list of possible solutions. If the map generated for a word conflicts with the current map, that candidate is not a valid solution and it moves on to the next one.
The solve function returns a list of solution maps that can be applied to any sentence to get the result.
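
The recursion can be sketched roughly like this. It is an illustration under assumptions, not the downloadable code: the module name cryptogram_sketch and all function names are invented, the index is assumed to be an Erlang map from integer-list patterns such as [1,2,1] to word lists, and the extra check that two cipher letters never decode to the same plain letter is my own addition.

```erlang
-module(cryptogram_sketch).
-export([solve/2, solve/3, decode/2]).

%% Index is assumed to map a pattern such as [1,2,1] to a list of words.
solve(Sentence, Index) ->
    solve(Sentence, Index, #{}).

%% Hint is an initial map of cipher letter to plain letter.
solve(Sentence, Index, Hint) ->
    solve_words(string:lexemes(Sentence, " "), Index, Hint).

%% End of sentence: the current map is one complete solution.
solve_words([], _Index, Map) ->
    [Map];
solve_words([Cipher | Rest], Index, Map) ->
    Candidates = maps:get(pattern(Cipher), Index, []),
    %% Candidates whose letters conflict with Map are skipped
    %% because 'conflict' does not match {ok, NewMap}.
    lists:append([solve_words(Rest, Index, NewMap)
                  || Plain <- Candidates,
                     {ok, NewMap} <- [extend(Cipher, Plain, Map)]]).

%% Add the letter pairs of one word to Map, or report a conflict.
extend([], [], Map) ->
    {ok, Map};
extend([C | Cs], [P | Ps], Map) ->
    case maps:find(C, Map) of
        {ok, P} -> extend(Cs, Ps, Map);       % already mapped the same way
        {ok, _Other} -> conflict;             % mapped to a different letter
        error ->
            case lists:member(P, maps:values(Map)) of
                true -> conflict;             % plain letter already taken
                false -> extend(Cs, Ps, Map#{C => P})
            end
    end.

%% Apply a solution map; unmapped characters (spaces) pass through.
decode(Sentence, Map) ->
    [maps:get(Ch, Map, Ch) || Ch <- Sentence].

%% Same letter-numbering pattern as described above.
pattern(Word) ->
    {Pattern, _} =
        lists:mapfoldl(
          fun(Ch, Seen) ->
                  case maps:find(Ch, Seen) of
                      {ok, N} -> {N, Seen};
                      error ->
                          Next = maps:size(Seen) + 1,
                          {Next, Seen#{Ch => Next}}
                  end
          end, #{}, Word),
    Pattern.
```

Passing a non-empty map as the third argument of solve/3 plays the role of a hint: any candidate word that disagrees with it is rejected by extend/3.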

Little things that make it helpful include being able to give an initial map: if you are sure some letters map to other letters, you can give that as a hint. It can also remove any words whose pattern does not appear in the map.

Todo:
Solve only those subsets of words that, if solved, will result in the entire alphabet being solved. It should do this for all possible combinations of words in the sentence that will result in this. This is not an optimization; it should probably make it take longer to solve, actually. However, it allows it to solve sentences with words that might not exist in the dictionary but could come about as a result of solving the other words. I have a few ideas of how to do this but haven't had time to implement it yet.

The current code can be downloaded here.

Tuesday, November 14, 2006

Sudoku

I've got sudoku fever! Actually, no, I don't. I don't really like sudoku at all, but I decided to write a solver in Erlang. It was pretty trivial. The basic algorithm is as follows:
1) Create a list of every blank square and the possible values that can go into that square
2) Find the square with the fewest possible values
3) Iterate over the list of possible numbers that can be in that square, create a new board with that value in it and recurse on the new board
4) Continue until there are no more possible values (in which case it failed) or the puzzle is solved.

The code is not the prettiest stuff in the world (specifically the remove_* functions) but it appears to solve the problem.

Usage looks like:

Eshell V5.5.1 (abort with ^G)
1> {solved, Res} = sudoku:solve([
1> [9, 5, b, b, b, 6, 4, 7, b],
1> [4, b, 8, 7, b, 2, b, b, b],
1> [6, 2, b, 4, b, b, b, 5, b],
1> [5, b, 2, b, 6, b, 3, b, b],
1> [b, b, b, 2, b, 7, b, b, b],
1> [b, b, 4, b, 1, b, 2, b, 8],
1> [b, 7, b, b, b, 9, b, 3, 4],
1> [b, b, b, 1, b, 3, 7, b, 5],
1> [b, 4, 3, 5, b, b, b, 2, 9]]).
...
2> sudoku:print(Res).
9 5 1 8 3 6 4 7 2
4 3 8 7 5 2 9 6 1
6 2 7 4 9 1 8 5 3
5 8 2 9 6 4 3 1 7
3 1 9 2 8 7 5 4 6
7 6 4 3 1 5 2 9 8
8 7 5 6 2 9 1 3 4
2 9 6 1 4 3 7 8 5
1 4 3 5 7 8 6 2 9
ok


Code (can be downloaded here):

-module(sudoku).

-export([solve/1, print/1]).

solve(Puzzle) when is_list(Puzzle) ->
    solve_puzzle(dict_from_list(Puzzle)).

print(Puzzle) ->
    lists:foreach(fun(X) ->
                          lists:foreach(fun(Y) ->
                                                io:format("~w ", [dict:fetch({X, Y}, Puzzle)])
                                        end, lists:seq(0, 8)),
                          io:format("~n", [])
                  end, lists:seq(0, 8)).

dict_from_list(List) ->
    element(2, lists:foldl(fun(Elm, {X, Dict}) ->
                                   {_, DDict} = lists:foldl(fun(Elem, {Y, NDict}) ->
                                                                    {Y + 1, dict:store({X, Y}, Elem, NDict)}
                                                            end, {0, Dict}, Elm),
                                   {X + 1, DDict}
                           end, {0, dict:new()}, List)).

solve_puzzle(Puzzle) ->
    case generate_open_spots(Puzzle) of
        [{{X, Y}, Set} | _] ->
            try_value({X, Y}, Set, Puzzle);
        [] ->
            {solved, Puzzle}
    end.

try_value(_, [], Puzzle) ->
    print(Puzzle),
    io:format("~n", []),
    failed;
try_value({X, Y}, [H | R], Puzzle) ->
    case solve_puzzle(dict:store({X, Y}, H, Puzzle)) of
        {solved, RPuzzle} ->
            {solved, RPuzzle};
        failed ->
            try_value({X, Y}, R, Puzzle)
    end.

generate_open_spots(Puzzle) ->
    OpenSquareList = dict:fold(fun(Key, b, Acc) ->
                                       [Key | Acc];
                                  (_Key, _Value, Acc) ->
                                       Acc
                               end, [], Puzzle),
    lists:sort(fun({{_X1, _Y1}, E1}, {{_X2, _Y2}, E2}) when length(E1) < length(E2) ->
                       true;
                  (_E1, _E2) ->
                       false
               end, generate_open_values(OpenSquareList, Puzzle)).

generate_open_values(List, Puzzle) ->
    generate_open_values(List, [], Puzzle).

generate_open_values([], Acc, _Puzzle) ->
    Acc;
generate_open_values([{X, Y} | R], Acc, Puzzle) ->
    generate_open_values(R, [{{X, Y}, remove_region_vals({X, Y},
                                          remove_x_vals(Y,
                                              remove_y_vals(X, lists:seq(1, 9),
                                                            Puzzle),
                                              Puzzle),
                                          Puzzle)} | Acc],
                         Puzzle).

remove_x_vals(Y, List, Puzzle) ->
    lists:foldl(fun(Idx, Acc) ->
                        case dict:fetch({Idx, Y}, Puzzle) of
                            b ->
                                Acc;
                            E ->
                                lists:delete(E, Acc)
                        end
                end,
                List, lists:seq(0, 8)).

remove_y_vals(X, List, Puzzle) ->
    lists:foldl(fun(Idx, Acc) ->
                        case dict:fetch({X, Idx}, Puzzle) of
                            b ->
                                Acc;
                            E ->
                                lists:delete(E, Acc)
                        end
                end,
                List, lists:seq(0, 8)).

remove_region_vals({X, Y}, List, Puzzle) ->
    {RX, RY} = find_region(X, Y),
    lists:foldl(fun(IX, AccX) ->
                        lists:foldl(fun(IY, AccY) ->
                                            case dict:fetch({IX, IY}, Puzzle) of
                                                b ->
                                                    AccY;
                                                E ->
                                                    lists:delete(E, AccY)
                                            end
                                    end, AccX, lists:seq(RY, RY + 2))
                end, List, lists:seq(RX, RX + 2)).

find_region(X, Y) ->
    {find_region(X), find_region(Y)}.

find_region(V) when V >= 0, V < 3 ->
    0;
find_region(V) when V >= 3, V < 6 ->
    3;
find_region(V) when V >= 6, V < 9 ->
    6.

Wednesday, July 19, 2006

Lack of activity

I have been busy looking for a job so I have not had much time to do anything Erlang related. This, unfortunately, means I have not had much time to blog. I am hoping this will change by Sept. If anyone has any job offers and wants a resume, I am currently looking in the New York City area for Financial Programming work or Bioinformatics work. Send an email to orbitz AT ortdotlove.net (that is really ortdotlove.net, not fancy spelling for ort.love.net). I'll gladly mail you back my resume.

I'm looking forward to getting back to programming for fun soon.

Tuesday, May 30, 2006

Shell

The other day, araujo suggested Erlang might make a good shell. Shell as in, bash. Hrm, I thought, maybe...
I don't think Erlang would be flexible enough to make a decent shell. Take something like this:


for i in *; do echo $i; done


What would the Erlang equivalent look like?


lists:map(fun(X) -> print([X]) end, ls())
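
For comparison, a version that actually runs is not much longer. This is only a sketch: the module and function name (shell_loop:echo_all/1) are made up, and it uses file:list_dir/1 from the standard library in place of the imaginary ls(). Entries starting with a dot are skipped to mimic what * expands to in bash.

```erlang
-module(shell_loop).
-export([echo_all/1]).

%% Print every non-hidden entry of Dir, one per line, roughly like
%% `for i in *; do echo $i; done` run inside that directory.
echo_all(Dir) ->
    {ok, Names} = file:list_dir(Dir),
    Visible = [Name || Name <- Names, hd(Name) =/= $.],
    lists:foreach(fun(Name) -> io:format("~ts~n", [Name]) end,
                  lists:sort(Visible)).
```

From the Erlang shell this would be called as shell_loop:echo_all(".").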


I just don't see it? I should hope it doesn't look like that but I don't know. On the other hand, I seem to have it pretty set in my mind that Common Lisp would make an excellent shell language. A few people have told me that CL would probably be too verbose and not work out well but, I don't know, it just seems so succinct and powerful. You could make a lot of macros to make your life easier. I sent an email to the Lisp At Light Speed blog dude to get his opinion but I figured I'd post it here too to see if anyone has any thoughts.


What I have in mind is basically an interactive shell, using Common
Lisp as its programming language. For instance something like:

for i in *; do echo $i; done

Might look like:

(loop for i in (ls) do (print i))

The second thing I would love to see is Monad/Powershell. Rather
than using text to pass data around, uses lists/objects/etc. (ls)
for instance gives a list of file objects to iterate over, and you
can extract things like the size and permissions and what not.

This raises a question though. In order to do this, just about every
standard program becomes obsolete. For instance, why use 'cut' when
you can simply split a string with a CL function. However some
things would still be very useful, such as ls, or grep, but these
now need to communicate via objects rather than just text, so those
will have to be rewritten. In which case it seems like they could
simply be packages and loaded into the shell, meaning you don't
have to deal with program startup times or what not. Is this a lot
of work to rewrite all of these programs, and does it mean that we
cannot usefully use third party programs that people write? You
can always drop down to using strings as communication I suppose.

That is the basic idea. So far a few people have suggested that
sexprs really aren't any better for this sort of thing. sexpr's
won't be any better than say...bash. Someone suggested that Factor
might be a better language for this. But it seems to me, with
Common Lisp's powerful macro system, you can easily make most
operations incredibly simple? The most useful aspect would be using
objects to communicate. But even if this is a lot of work to
accomplish, it seems to me that simply using CL as a shell programming
language seems like it would be nice. Am I wrong here? Am I missing
something?


If CL would make a poor language, what would make a good language? Riastrach has suggested Factor. I don't know enough about Factor to say, but I think I like CL better for this.

All I know is:

This is not the answer

Saturday, May 6, 2006

Pastebin design - Mnesia tables

I have started to design the tables for this. The main goal is simple: make it easily extendable. That is to say, I want to be able to develop it incrementally and add things to it that I didn't think of before. But that should be obvious to any developer. Given the spec of my previous post, here is the beginning of the database.

Table:
paste - This is the table that will hold the actual pastes.
pid | text | annotation | language | date

pid - unique id to identify a paste
text - the actual text
annotation - any annotation the paster wants to include
language - Plaintext/C++/Erlang, etc
date - when they pasted it


highlight - cache of pastes that have been put through the source highlighter
pid | text | last_viewed

pid - the same as paste.pid
text - result of being put through the highlighter
last_viewed - to keep track of when the entry should be removed

threads - paste threads
tid | [pid]

tid - the unique id for a thread
[pid] - list of pid's
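
As a sketch, these three tables could be declared as Erlang records and created with mnesia:create_table/2. The module name pastebin_tables is made up, and with no storage option given the tables are RAM-only on the local node, so a real setup would presumably add disc_copies.

```erlang
-module(pastebin_tables).
-export([create/0]).

%% One record per table described above; mnesia uses the first
%% field of each record as the key.
-record(paste, {pid, text, annotation, language, date}).
-record(highlight, {pid, text, last_viewed}).
-record(threads, {tid, pids = []}).

%% Create the three tables. Mnesia must already be running.
create() ->
    Tables = [{paste, record_info(fields, paste)},
              {highlight, record_info(fields, highlight)},
              {threads, record_info(fields, threads)}],
    lists:foreach(
      fun({Name, Fields}) ->
              {atomic, ok} = mnesia:create_table(Name, [{attributes, Fields}])
      end, Tables).
```

After mnesia:start(), calling pastebin_tables:create() should leave three empty tables that can be inspected with mnesia:table_info/2.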


I decided to have the thread table just contain an id and list so you don't have to do a bunch of queries to get every paste in a thread. This way a paste can be part of multiple threads (although I don't see when this will ever happen). The paste table contains, I think, the minimal amount of data needed to be useful. Using a highlight table means we can store, if we want, every single paste with a highlighted copy of the text, or just a few and use it as a cache style system. Later on we can also add the ability to associate a paste with an irc channel or a user if we wish to track that sort of information.

I'm not sure how to generate unique ids in mnesia. In a SQL database I would use a serial data type but I am unsure if mnesia supports this functionality. If not I suppose I could use newref maybe? I need to be able to convert it to a string to make URLs, and it also needs to be valid between restarts.

As I think about it, I guess serials are just implemented as a table that stores the integer, so I can implement that in mnesia I suppose. It just needs to have some sort of get_and_increment functionality so two processes don't get the same id. If anyone has any ideas on how to implement this let me know. Bear in mind I have not looked at the mnesia documentation yet and am just brainstorming, so it is quite possible the solution is incredibly simple.
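
One possible way to get that get_and_increment behaviour is mnesia:dirty_update_counter/3, which increments an integer stored in a table and returns the new value without two callers ever seeing the same number. The sketch below assumes a dedicated table; the module name paste_id and the table name counter are made up for illustration.

```erlang
-module(paste_id).
-export([init/1, next/1]).

-record(counter, {name, value}).

%% Storage is ram_copies or disc_copies. disc_copies needs an on-disk
%% schema (see mnesia:create_schema/1) and is the variant that keeps
%% the counter between restarts.
init(Storage) ->
    mnesia:create_table(counter,
                        [{Storage, [node()]},
                         {attributes, record_info(fields, counter)}]).

%% Add 1 to the named counter and return the new value.
%% A counter that does not exist yet starts at 1.
next(Name) ->
    mnesia:dirty_update_counter(counter, Name, 1).
```

The integer can be turned into a string for a URL with integer_to_list/1.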

Next step - determine what pages will be needed.

Thursday, May 4, 2006

Erlang Pastebin

I have always wanted to make a pastebin for some reason. When I first started looking at nevow, I wanted to make one in that however mg beat me to it (and made a kickass one at that). So now I have been looking at yaws and think it would be a good project. Perhaps it can be packaged along with the wiki and chat client. I think my pastebin is a bit ambitious, but I would like to design it in such a way that it can all be added incrementally so I can have a functional pastebin and add onto it.

For starters, I would like a nice clean layout. I am going to do my best to use CSS and valid HTML for all of this. I am hoping petekaz can possibly help out with this. I am going to write it with everything in div tags with proper ids so it should be easy to give this thing a skin eventually.

Technical Features:
Syntax highlighting - As with any pastebin this will most likely be used for code 90% of the time. For all of those languages we need syntax highlighting so they are easy to read. I have found a program, I believe it is just called 'highlight'. It handles 100+ languages, including all of the ones I have an interest in supporting, so I can simply interface to that.

Documentation links - People seem to want this. I am not sure how easy it is to support, especially if you want to support a lot of languages. It's a thought but not on the top of my list.

Annotation - Instead of putting supplementary information in comments of the paste, just having a section for such information would be nice.

Threads for pastes - lisppaste does this I believe. A paste should be like a discussion. "Here this doesn't compile" "You have a mistake here, this fix will work" "Ok, but now I have a problem here" "Ok do this". Each post in a paste thread will have the paste + annotation. To reply, since you are generally going to simply be making small changes to the previous post, the paste section should be populated with that already.

Compile code for your language - I am not sure how to do this safely, if at all. It would be nice to provide a gcc interface and an erlc interface and a ghc, yadda yadda. But I am not going to put my system at risk just for this feature that shouldn't be needed all too much in the first place.

Interface to any Erlang app - By providing a node + a process name, this should be able to send a message that a paste is ready to any Erlang node. The obvious first usage of this would be an IRC bot. The main problem I see with this is that, depending on what application the data is sent to, the look of the pastebin will be affected. For example, if the events are sent to an IRC bot, you want to be able to select what channel the notification gets sent to. Should I just hardcode this into the pastebin or provide some way for the IRC bot to register information with the pastebin and somehow have the pastebin display the information on the web app? This sounds a bit harder, but works better with any application (but what other applications would even want notification of a paste?). Perhaps I will hard code it at first and then move towards a more dynamic system as I figure out how.
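
The notification itself could be as small as a send to a registered name on a node. In this sketch the module name paste_notify and the message shape {paste_ready, PasteId} are assumptions, nothing more:

```erlang
-module(paste_notify).
-export([notify/3]).

%% Send a "paste is ready" message to the process registered as Name
%% on Node. The message format is a made-up example.
notify(Node, Name, PasteId) ->
    {Name, Node} ! {paste_ready, PasteId},
    ok.
```

An IRC bot registered as, say, irc_bot on ort_bot@blong would then receive {paste_ready, Id} in its mailbox after paste_notify:notify('ort_bot@blong', irc_bot, Id).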

Mnesia configuration - I like to use mnesia for my configuration. I have a config module which provides functions to be used in setting/getting values from the mnesia config database. I also want to use mnesia to store all of the information for the actual pastes. I hear mnesia falls apart after storing a lot of data, and I can imagine some of these pastes will grow to be a fair size, so I am considering using a cache for the syntax highlighted pastes. Running the application to highlight the text on every hit sounds inefficient, and storing it for every paste sounds like a waste of space, so a cache is probably a good in-between. Right now I would like to store as many pastes as I can but will consider deleting those that are too old.

Download paste - Being able to download the paste is always very helpful. Providing a nice filename that ends in the proper extension for the language would be nice.

Browsing recent pastes - You paste to my site, you lose all privacy, go figure.

File upload - Sometimes it is easier to just upload a file rather than pasting.

This sounds like a lot but I don't think it will be too bad. It seems like I should be able to do most of it in a fairly modular way.

Step one - Come up with mnesia tables, do some research on mnesia in terms of foreign keys and possibly how to do decent QLC queries. I think learning QLC will be important, especially when a lot of pastes get put into this thing.

Step two - Get the basic form for uploading going

Step three - Come up with step three when I get there.

UPDATE
Some obvious ideas were brought to my attention

Indent - Running various languages through astyle and friends would be very helpful, some people just can't indent properly.

Customize with cookies - Store various color information in cookies so people can keep colors they enjoy. This probably won't be implemented until much later.

Differences - Highlight differences between pastes so you can show what changes have been made. This sounds kind of difficult, especially if I am outsourcing the highlighting to a third party.

Line numbers - This should be obvious

Non GUI Browsers - Yes, some people use these. The download as text option should be helpful for these people, but they probably also want line numbers, so a specific non-GUI version of the code that includes line numbers might be nice.

RSS - This certainly isn't a need but might be nice, especially as the maintainer I might want to keep track of who's pasting what.

Intelligent Mouseovers - Showing the balanced paren when the mouse is over one

Tuesday, April 4, 2006

Yaws 1.58 true_nozip bug

Yaws 1.58 has a bit of a bug. If you specify dir_listings = true_nozip, it poops out. This is because the if condition is written to only handle the situation where it is true or false, so true_nozip falls through without a matching branch.

Here is my little bug fix, the problem file is src/yaws_ls.erl


47c47,54
< if DoAllZip == true -> allzip() end,
---
> case DoAllZip of
> true ->
> allzip();
> true_nozip ->
> [];
> false ->
> []
> end,
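
To see the difference in isolation, here is a tiny stand-alone sketch. It is not yaws code: the module name if_clause_demo is made up and the atom allzip stands in for the real allzip() call. An if with no matching branch raises an if_clause error, while the case lists every expected value.

```erlang
-module(if_clause_demo).
-export([with_if/1, with_case/1]).

%% Mirrors the shape of the original line: only 'true' has a branch.
with_if(DoAllZip) ->
    if DoAllZip == true -> allzip end.

%% Mirrors the shape of the fix: every expected value has a branch.
with_case(DoAllZip) ->
    case DoAllZip of
        true -> allzip;
        true_nozip -> [];
        false -> []
    end.
```

In the shell, if_clause_demo:with_if(true_nozip) fails with an if_clause exception, while if_clause_demo:with_case(true_nozip) returns [].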


klacke says he also has a fix but, as usual, sourceforge is down so he has not been able to commit it.

Friday, March 31, 2006

Fun with processes (Updated)

I really need to go through all of my Erlang documentation so people like petekaz can stop making me feel small.
I had an issue with some poor Erlang code I was running where it performed a file:open but not the appropriate file:close. Not wanting to restart my application I wanted to know how I could fix this problem without shutting it down. Erlang, of course, provides a solution.
One thing that made this workable in the first place was the fact that I had no other open files besides those that needed to be closed.

So imagine for a second you have opened a number of files and lost the reference to them. For example:

(ort_bot@blong)21> [file:open("/tmp/c2.log", read) || X <- lists:seq(1, 10)].
[{ok,<0.8974.7>},
{ok,<0.8975.7>},
{ok,<0.8976.7>},
{ok,<0.8977.7>},
{ok,<0.8978.7>},
{ok,<0.8979.7>},
{ok,<0.8980.7>},
{ok,<0.8981.7>},
{ok,<0.8982.7>},
{ok,<0.8983.7>}]


file:open causes a process to be created (this is Erlang after all).
In the shell you can simply type i(). and get output like:


(ort_bot@blong)22> i().
Pid Initial Call Heap Reds Msgs
Registered Current Function Stack
<0.0.0> otp_ring0:start/2 377 6569 0
init init:loop/1 2
<0.2.0> erlang:apply/2 4181 179350 0
erl_prim_loader erl_prim_loader:loop/3 5
<0.4.0> gen_event:init_it/6 1597 1114 0
error_logger gen_event:loop/4 10
<0.5.0> erlang:apply/2 6765 8325 0
application_controlle gen_server:loop/6 7
<0.7.0> application_master:init/4 377 45 0
application_master:main_loop/2 8
<0.8.0> application_master:start_it/4 233 90 0
application_master:loop_it/4 5
<0.9.0> supervisor:kernel/1 610 1373 0
kernel_sup gen_server:loop/6 12
<0.10.0> rpc:init/1 610 17456475 0
rex gen_server:loop/6 12
<0.11.0> global:init/1 233 1246 0
global_name_server gen_server:loop/6 12
<0.12.0> erlang:apply/2 233 275 0
global:loop_the_locker/1 4
<0.13.0> erlang:apply/2 233 4 0
global:collect_deletions/2 6
<0.14.0> inet_db:init/1 233 137 0
inet_db gen_server:loop/6 12
<0.16.0> supervisor:erl_distribution/1 377 307 0
net_sup gen_server:loop/6 12
<0.17.0> erl_epmd:init/1 233 137 0
erl_epmd gen_server:loop/6 12
<0.18.0> auth:init/1 233 77 0
auth gen_server:loop/6 12
<0.19.0> net_kernel:init/1 233 3272 0
net_kernel gen_server:loop/6 12
<0.20.0> inet_tcp_dist:accept_loop/2 233 178 0
prim_inet:accept0/2 10


This list goes on for awhile longer, depending on what you have done. As you scan through this output you should see something like:

<0.8974.7> erlang:apply/2 233 135 0
file_io_server:server_loop/1 3
<0.8975.7> erlang:apply/2 233 135 0
file_io_server:server_loop/1 3
<0.8976.7> erlang:apply/2 233 135 0
file_io_server:server_loop/1 3
<0.8977.7> erlang:apply/2 233 135 0
file_io_server:server_loop/1 3
<0.8978.7> erlang:apply/2 233 135 0
file_io_server:server_loop/1 3
<0.8979.7> erlang:apply/2 233 135 0
file_io_server:server_loop/1 3
<0.8980.7> erlang:apply/2 233 135 0
file_io_server:server_loop/1 3
<0.8981.7> erlang:apply/2 233 135 0
file_io_server:server_loop/1 3
<0.8982.7> erlang:apply/2 233 135 0
file_io_server:server_loop/1 3
<0.8983.7> erlang:apply/2 233 135 0
file_io_server:server_loop/1 3


All of those file_io_servers are the open files, and the Pid associated with each one is the file handle. So to close all these files all you have to do is call file:close on those pids. You can create a pid given 3 numbers in the shell by using the pid function:


(ort_bot@blong)23> file:close(pid(0, 8974, 7)).
ok
(ort_bot@blong)24> file:close(pid(0, 8975, 7)).
ok
(ort_bot@blong)25> file:close(pid(0, 8976, 7)).
ok
(ort_bot@blong)26> file:close(pid(0, 8977, 7)).
ok
(ort_bot@blong)27> file:close(pid(0, 8978, 7)).
ok
(ort_bot@blong)28> file:close(pid(0, 8979, 7)).
ok
(ort_bot@blong)29> file:close(pid(0, 8980, 7)).
ok
(ort_bot@blong)30> file:close(pid(0, 8981, 7)).
ok
(ort_bot@blong)31> file:close(pid(0, 8982, 7)).
ok
(ort_bot@blong)32> file:close(pid(0, 8983, 7)).
ok


Now, call i() again and you'll notice all of those pesky files have been closed. You can also do all of this from a remote machine if you have run your erl properly. For example my remote application is run on a machine called 'blong' and the node name is 'ort_bot'. My local node is 'ooter' on a machine called 'osx'. So from 'osx' I can do this:


osx:~/projects/Erlang orbitz$ erl -sname ooter -setcookie cookiemonster
Erlang (BEAM) emulator version 5.4.13 [source] [threads:0]

Eshell V5.4.13 (abort with ^G)
(ooter@osx)1>
User switch command
--> r ort_bot@blong
--> j
1 {shell,start,[init]}
2* {ort_bot@blong,shell,start,[]}
--> c 2
Eshell V5.4.12 (abort with ^G)
(ort_bot@blong)1> i().
Pid Initial Call Heap Reds Msgs
Registered Current Function Stack
<0.0.0> otp_ring0:start/2 377 6579 0
init init:loop/1 2
<0.2.0> erlang:apply/2 610 161416 0
erl_prim_loader erl_prim_loader:loop/3 5
<0.4.0> gen_event:init_it/6 610 557 0
error_logger gen_event:loop/4 10
<0.5.0> erlang:apply/2 6765 8289 0
application_controlle gen_server:loop/6 7


Connecting to a remote shell is easy, and we now have a shell on that node and can do anything we want from here (in fact I've done halt(). quite a few times by accident, not thinking. Oi!).

Alternatively, you can also use pman, which provides an interface much like i() but does more interesting stuff too. For instance you can see linked processes and kill various processes through pman. In order to use pman, on your local node simply run: pman:start().
This provides you with a window that looks like:

[Screenshot: the pman window]


This looks just like i(). And from the 'node' menu you can switch nodes to view the processes on. As you can see from this screenshot I am viewing the processes on ort_bot@blong, which is a remote node.

As you can see, Erlang provides a very powerful set of tools to keep your application stable and perform various operations. It is things like this that make me glad to run code in a shell rather than a standalone application. For situations like this, the power of the Erlang shell is unparalleled.

Update:
Well, in his infinite wisdom, after a bit of playing around petekaz came up with two functions that make solving this particular problem all the easier. My goal was mainly to show how to interact with the shell to get useful information about processes, but petekaz has shown me how to actually close a specific file. It seems file:open drops some useful information into ets.


[ file:close(P) || P <- processes(), {ok, "/tmp/test.erl"} == file:pid2name(P) ].


Where "/tmp/test.erl" is the file you are trying to close.
This makes use of two useful functions. processes, for starters, evaluates to a list of all processes currently running in the node, and file:pid2name takes a pid and tells you what file it is associated with. If the pid is not a file it returns undefined.
So it seems Erlang has a complete solution to this problem.
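
Wrapped up as a function, the same idea might look like the sketch below. The module name file_cleanup is made up; the only library calls are processes/0, file:pid2name/1 and file:close/1, and it only finds files opened through the file server (not ones opened with the raw option).

```erlang
-module(file_cleanup).
-export([close_by_name/1]).

%% Close every file handle in this node that was opened under Name
%% and return how many handles were closed.
close_by_name(Name) ->
    Results = [file:close(P) || P <- processes(),
                                file:pid2name(P) =:= {ok, Name}],
    length(Results).
```

Name has to be the same string that was originally passed to file:open, for example file_cleanup:close_by_name("/tmp/test.erl").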

Thanks petekaz for figuring this one out completely.

Thursday, March 23, 2006

Oort

Well, it was pretty painful, but I ripped out most of my terrible code and replaced it with pretty gen_servers and the like. The code is still in a state of flux and needs to be polished a bit more, but it's a good start. You can gawk at it here:

svn://ortdotlove.net/Erlang/trunk/p1

Ignore the manderlbot source tree in there.

Monday, March 6, 2006

MEUG

MEUG is Massachusetts Erlang User Group.
For some reason, there seem to be a lot of Erlang users in MA. I would like to get us all together semi-regularly. We can enjoy some good food, some drinks, and talk about Erlang, or whatever else comes up. Concurrency in general is an interesting topic that there is much to talk about.
I will keep people updated. If there are any MA based Erlangers interested that I am not yet aware of feel free to email me at:
orbitz AT ortdotlove.net (That is really ortdotlove) or post a comment.

Monday, January 23, 2006

Re: Re: Re: Ruby Array class sucks

I have been meaning to respond to this post on the factor blog for awhile. Factor has grown a lot, and this blog tracks the development of it. Factor is the result of Forth and Lisp having babies. But the post I'm interested in doesn't have so much to do with factor but with ruby. It is in response to another post (URL at the factor site) about a complaint with Ruby's Array class. The complaint centers mainly around the Array object having far more methods than it should and quite a few that don't belong there. Slava's (factor creator) response suggests that the methods do deserve to be there and from a functional standpoint most of them make plenty of sense.

For a number of the methods in Array, I do agree with Slava that they should be present. For instance, the author's complaint about Array.new being able to preset an array to a specific value is silly to me. I often want some sort of initial value to work from, so I would imagine many other people are in the same boat.

Around Array.transpose, I start to disagree though. transpose takes an Array and treats it like a matrix. Slava's response is:

It might surprise the author to learn that mathematically, a matrix is a two-dimensional array. In Factor, you can perform mathematical operations on vectors, such as vector addition, dot product, and so on.


Ok, so basically the argument is: a matrix resembles an array in some way and Factor allows these sorts of manipulations, thus Ruby should. Well ok, let's say, yes, a matrix is a two-dimensional array, that's nice. But what does that make an Array then? An Array is also a two-dimensional array? That sounds a bit circular. Why should a method in Array depend on Array being thought of as a specific type of object? If we want to do matrix work, I suggest abstracting a matrix type, even if it's as simple as Matrix = Array. But really, it sounds like transpose should be in some sort of Matrix object.

Array.rassoc also seems a bit out of place to me. Slava says:

Yes, and there's nothing wrong with that. Association lists, while they suffer from linear lookup time, are occasionally useful where you want to have a mapping together with a certain intrinsic order. For example, Factor's generic word code takes the hashtable of methods defined on a generic, converts this to an association list, then sorts the association list according to the partial ordering of classes. Finally, the sorted association list is used to generate code.

Again, the argument of "This is ok because factor does it" does not quite fly with me. Now, when I write code and I'm not quite sure what data type I really want, but I need some sort of key->value association, I will often just use a list or array type to start out with, abstracted into some interface so when I decide what I finally need I can simply change the backend. Now, is this a good reason to put a function that treats a list like a map inside a list object? In my opinion, No.

I would use some sort of generic sequence object that has various methods that act on arrays and lists and other sequence objects, allowing you to perform some of these manipulations. In most languages, I would probably go as far as to make something like rassoc just a plain function, with no need to make it part of an object. But I have a feeling a lot of Ruby programmers don't feel content unless they have methods in some class.

This Array object seems to be similar to various classes in Python. Back in the day, Python developers could add methods to classes without any sort of vote. This is why the Python standard lib has lots of random modules that one would not normally think should be in a standard library, and also why Python has been making such an effort to clean up a lot of the standard classes and provide a more consistent interface to them.

Sorry this post does not actually contain anything on erlang. My next one will, I promise. Feel free to flame me in response to my Ruby thoughts, just be civil.