functional orbitz: Fun with processes (Updated)

I really need to go through all of my Erlang documentation so people like petekaz can stop making me feel small.
I had an issue with some poor Erlang code I was running: it performed a file:open but never the matching file:close. I did not want to restart my application, so I wanted to know how to fix the problem without shutting it down. Erlang, of course, provides a solution.
One thing that made this workable in the first place was that I had no other open files besides those that needed to be closed.

So imagine for a second you have opened a number of files and lost the reference to them. For example:


(ort_bot@blong)21> [file:open("/tmp/c2.log", read) || X <- lists:seq(1, 10)].
[{ok,<0.8974.7>},
 {ok,<0.8975.7>},
 {ok,<0.8976.7>},
 {ok,<0.8977.7>},
 {ok,<0.8978.7>},
 {ok,<0.8979.7>},
 {ok,<0.8980.7>},
 {ok,<0.8981.7>},
 {ok,<0.8982.7>},
 {ok,<0.8983.7>}]

file:open causes a process to be created (this is Erlang, after all).
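To see that for yourself, here is a minimal sketch. The path /tmp/handle_demo.log is just a made-up scratch file, and it assumes the file is opened without the raw option (raw files are not backed by a process):

```erlang
%% Sketch only: "/tmp/handle_demo.log" is a hypothetical scratch file.
{ok, Fd} = file:open("/tmp/handle_demo.log", [write]),
%% Without the raw option, the "file handle" is an ordinary pid.
true = is_pid(Fd),
%% Ask the VM what that process is currently doing.
Info = process_info(Fd, current_function),
ok = file:close(Fd),
Info.
```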
In the shell you can simply type i(). and get output like:


(ort_bot@blong)22> i().
Pid                   Initial Call                          Heap     Reds Msgs
Registered            Current Function                     Stack              
<0.0.0>               otp_ring0:start/2                      377     6569    0
init                  init:loop/1                              2              
<0.2.0>               erlang:apply/2                        4181   179350    0
erl_prim_loader       erl_prim_loader:loop/3                   5              
<0.4.0>               gen_event:init_it/6                   1597     1114    0
error_logger          gen_event:loop/4                        10              
<0.5.0>               erlang:apply/2                        6765     8325    0
application_controlle gen_server:loop/6                        7              
<0.7.0>               application_master:init/4              377       45    0
                      application_master:main_loop/2           8              
<0.8.0>               application_master:start_it/4          233       90    0
                      application_master:loop_it/4             5              
<0.9.0>               supervisor:kernel/1                    610     1373    0
kernel_sup            gen_server:loop/6                       12              
<0.10.0>              rpc:init/1                             610 17456475    0
rex                   gen_server:loop/6                       12              
<0.11.0>              global:init/1                          233     1246    0
global_name_server    gen_server:loop/6                       12              
<0.12.0>              erlang:apply/2                         233      275    0
                      global:loop_the_locker/1                 4              
<0.13.0>              erlang:apply/2                         233        4    0
                      global:collect_deletions/2               6              
<0.14.0>              inet_db:init/1                         233      137    0
inet_db               gen_server:loop/6                       12              
<0.16.0>              supervisor:erl_distribution/1          377      307    0
net_sup               gen_server:loop/6                       12              
<0.17.0>              erl_epmd:init/1                        233      137    0
erl_epmd              gen_server:loop/6                       12              
<0.18.0>              auth:init/1                            233       77    0
auth                  gen_server:loop/6                       12              
<0.19.0>              net_kernel:init/1                      233     3272    0
net_kernel            gen_server:loop/6                       12              
<0.20.0>              inet_tcp_dist:accept_loop/2            233      178    0
                      prim_inet:accept0/2                     10

This list goes on for a while longer, depending on what you have done. As you scan through this output you should see something like:


<0.8974.7>            erlang:apply/2                         233      135    0
                      file_io_server:server_loop/1             3              
<0.8975.7>            erlang:apply/2                         233      135    0
                      file_io_server:server_loop/1             3              
<0.8976.7>            erlang:apply/2                         233      135    0
                      file_io_server:server_loop/1             3              
<0.8977.7>            erlang:apply/2                         233      135    0
                      file_io_server:server_loop/1             3              
<0.8978.7>            erlang:apply/2                         233      135    0
                      file_io_server:server_loop/1             3              
<0.8979.7>            erlang:apply/2                         233      135    0
                      file_io_server:server_loop/1             3              
<0.8980.7>            erlang:apply/2                         233      135    0
                      file_io_server:server_loop/1             3              
<0.8981.7>            erlang:apply/2                         233      135    0
                      file_io_server:server_loop/1             3              
<0.8982.7>            erlang:apply/2                         233      135    0
                      file_io_server:server_loop/1             3              
<0.8983.7>            erlang:apply/2                         233      135    0
                      file_io_server:server_loop/1             3

All of those file_io_server processes are the open files, and the pid associated with each one is the file handle. So to close all these files, all you have to do is call file:close on those pids. You can create a pid from three numbers in the shell by using the pid function:


(ort_bot@blong)23> file:close(pid(0, 8974, 7)).                              
ok
(ort_bot@blong)24> file:close(pid(0, 8975, 7)).
ok
(ort_bot@blong)25> file:close(pid(0, 8976, 7)).
ok
(ort_bot@blong)26> file:close(pid(0, 8977, 7)).
ok
(ort_bot@blong)27> file:close(pid(0, 8978, 7)).
ok
(ort_bot@blong)28> file:close(pid(0, 8979, 7)).
ok
(ort_bot@blong)29> file:close(pid(0, 8980, 7)).
ok
(ort_bot@blong)30> file:close(pid(0, 8981, 7)).
ok
(ort_bot@blong)31> file:close(pid(0, 8982, 7)).
ok
(ort_bot@blong)32> file:close(pid(0, 8983, 7)).
ok
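Typing ten file:close calls by hand gets old fast. A rough sketch of doing it in one go is below. It assumes the leaked files all show file_io_server:server_loop/1 as their current function, as in the i() output above; that is an internal detail of OTP and may differ between releases. It also closes every such file on the node, so it is only safe when nothing else has files open. CloseLeaked is just a name made up for this example.

```erlang
%% Sketch only: CloseLeaked is a hypothetical name, and matching on
%% file_io_server:server_loop/1 relies on an OTP internal.
CloseLeaked = fun() ->
    Leaked = [P || P <- processes(),
                   process_info(P, current_function) =:=
                       {current_function, {file_io_server, server_loop, 1}}],
    %% Returns one {Pid, Result} pair per file it tried to close.
    [{P, file:close(P)} || P <- Leaked]
end,
CloseLeaked().
```

If you already know the pid range, [file:close(pid(0, N, 7)) || N <- lists:seq(8974, 8983)]. is a shorter way to write the ten calls above.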

Now call i() again and you'll notice all of those pesky files have been closed. You can also do all of this from a remote machine if you have started erl properly (each node needs a name, and both must share the same cookie). For example, my remote application runs on a machine called 'blong' and the node name is 'ort_bot'. I am on a node called 'ooter' on a machine called 'osx'. So from 'osx' I can do this:


osx:~/projects/Erlang orbitz$ erl -sname ooter -setcookie cookiemonster
Erlang (BEAM) emulator version 5.4.13 [source] [threads:0]

Eshell V5.4.13  (abort with ^G)
(ooter@osx)1> 
User switch command
 --> r ort_bot@blong
 --> j
   1  {shell,start,[init]}
   2* {ort_bot@blong,shell,start,[]}
 --> c 2
Eshell V5.4.12  (abort with ^G)
(ort_bot@blong)1> i().
Pid                   Initial Call                          Heap     Reds Msgs
Registered            Current Function                     Stack              
<0.0.0>               otp_ring0:start/2                      377     6579    0
init                  init:loop/1                              2              
<0.2.0>               erlang:apply/2                         610   161416    0
erl_prim_loader       erl_prim_loader:loop/3                   5              
<0.4.0>               gen_event:init_it/6                    610      557    0
error_logger          gen_event:loop/4                        10              
<0.5.0>               erlang:apply/2                        6765     8289    0
application_controlle gen_server:loop/6                        7

Connecting to a remote shell is easy: press ^G for the user switch command, start a remote shell with r, and connect to it with c. We now have a shell on that node and can do anything we want from here (in fact, I've done halt(). quite a few times by accident, not thinking, oi!).
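
If you only need one answer from the remote node, you do not have to switch shells at all. A minimal sketch using rpc:call from the local node follows. RemoteOpenFiles is a hypothetical helper name, and the match on file_io_server:server_loop/1 is the same OTP-internal assumption taken from the i() output earlier.

```erlang
%% Sketch only: RemoteOpenFiles is a hypothetical helper.
RemoteOpenFiles = fun(Node) ->
    case rpc:call(Node, erlang, processes, []) of
        {badrpc, Reason} ->
            %% Node unreachable, wrong cookie, and so on.
            {error, Reason};
        Pids ->
            %% process_info must run on the node that owns the pid,
            %% so it goes through rpc:call as well.
            [P || P <- Pids,
                  rpc:call(Node, erlang, process_info,
                           [P, current_function]) =:=
                      {current_function, {file_io_server, server_loop, 1}}]
    end
end,
RemoteOpenFiles('ort_bot@blong').
```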

Alternatively, you can use pman, which provides an interface much like i() but does more interesting stuff too. For instance, you can see linked processes and kill various processes through pman. To use pman, simply run on your local node: pman:start().
This provides you with a window that looks like:

[Screenshot: the pman window listing the processes on ort_bot@blong]

This looks just like i(). From the 'node' menu you can switch which node's processes you are viewing. As you can see from this screenshot, I am viewing the processes on ort_bot@blong, which is a remote node.

As you can see, Erlang provides a very powerful set of tools for keeping your application stable and performing various operations. It is things like this that make me glad to run code in a shell rather than as a standalone application. For situations like this, the power of the Erlang shell is unparalleled.

Update:
Well, in his infinite wisdom, after a bit of playing around, petekaz came up with two functions that make solving this particular problem all the easier. My goal was mainly to show how to interact with the shell to get useful information about processes, but petekaz has shown me how to actually close a specific file. It seems file:open drops some useful information into ets.


[ file:close(P) || P <- processes(), {ok, "/tmp/test.erl"} == file:pid2name(P) ].

Where "/tmp/test.erl" is the file you are trying to close.
This makes use of two useful functions. processes, for starters, evaluates to a list of all processes currently running on the node, and file:pid2name takes a pid and tells you what file it is associated with. If the pid is not a file it returns undefined.
So it seems Erlang has a complete solution to this problem.
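
The same trick works for a quick inventory of what is open. A small sketch, using only processes/0 and file:pid2name/1; ListOpenFiles is a made-up name:

```erlang
%% Sketch only: ListOpenFiles is a hypothetical name.
ListOpenFiles = fun() ->
    %% The {ok, Name} generator pattern silently skips every pid for
    %% which file:pid2name/1 returns undefined.
    [{P, Name} || P <- processes(),
                  {ok, Name} <- [file:pid2name(P)]]
end,
ListOpenFiles().
```

Each element of the result pairs a file handle pid with the name it was opened under, so you can look before you close anything.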

Thanks petekaz for figuring this one out completely.

3 comments:

Anonymous, April 18, 2006 at 4:16 AM
A good post, but it makes me think about how best to ensure that a file that I open in Erlang will be closed, something like a try/catch/finally block in Java.
Anonymous, March 15, 2007 at 10:00 AM
It's called
try/catch/after in erlang

RTFM
Daniel Goertzen, October 9, 2008 at 5:41 PM
Thanks for the post, you taught me a lot about the shell.

Another option for cleaning up the files is to just terminate your process, or from the shell type "1=2." All the linked io processes go away.

That might be a good idiom to prevent file leaks; do all your file handling work from a temporary subprocess.
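
For what it's worth, the try ... after idiom mentioned in the comments above can be sketched like this. WithFile is a hypothetical helper, not something that ships with OTP:

```erlang
%% Sketch only: WithFile is a hypothetical helper.
WithFile = fun(Path, Modes, Fun) ->
    case file:open(Path, Modes) of
        {ok, Fd} ->
            %% The after clause runs whether Fun returns or throws,
            %% so the file is closed either way.
            try Fun(Fd)
            after file:close(Fd)
            end;
        {error, _} = Error ->
            Error
    end
end,
WithFile("/tmp/c2.log", [read], fun(Fd) -> file:read_line(Fd) end).
```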

Friday, March 31, 2006
