I have always wanted to make a pastebin for some reason. When I first started looking at nevow, I wanted to make one in that, but mg beat me to it (and made a kickass one at that). Now that I have been looking at yaws, I think it would be a good project. Perhaps it can be packaged along with the wiki and chat client. I think my pastebin is a bit ambitious, but I would like to design it so that features can be added incrementally: get a functional pastebin first, then add onto it.
For starters, I would like a nice clean layout. I am going to do my best to use CSS and valid HTML for all of this. I am hoping petekaz can help out with this. I am going to put everything in div tags with proper ids, so it should be easy to give this thing a skin eventually.
Technical Features:
Syntax highlighting - As with any pastebin, this will most likely be used for code 90% of the time. All of those languages need syntax highlighting so they are easy to read. I have found a program, I believe it is just called 'highlight'. It handles 100+ languages, including all of the ones I have an interest in supporting, so I can simply interface to that.
Documentation links - People seem to want this. I am not sure how easy it is to support, especially if you want to support a lot of languages. It's a thought but not on the top of my list.
Annotation - Instead of putting supplementary information in comments inside the paste, it would be nice to have a separate section for it.
Threads for pastes - lisppaste does this, I believe. A paste should be like a discussion: "Here, this doesn't compile." "You have a mistake here, this fix will work." "Ok, but now I have a problem here." "Ok, do this." Each post in a paste thread will have the paste plus an annotation. Since a reply is generally just a small change to the previous post, the paste section of the reply form should be pre-populated with the previous post's code.
Compile code for your language - I am not sure how to do this safely, if at all. It would be nice to provide a gcc interface and an erlc interface and a ghc, yadda yadda. But I am not going to put my system at risk just for this feature that shouldn't be needed all too much in the first place.
Interface to any erlang app - Given a node and a process name, the pastebin should be able to send a message to any erlang node saying that a paste is ready. The obvious first use of this would be an IRC bot. The main problem I see is that the application receiving the data affects the look of the pastebin. For example, if the events are sent to an IRC bot, you want to be able to select which channel the notification gets sent to. Should I just hardcode this into the pastebin, or provide some way for the IRC bot to register information with the pastebin and have the pastebin display it in the web app? The second option sounds a bit harder, but works better with any application (but what other applications would even want notification of a paste?). Perhaps I will hardcode it at first and then move towards a more dynamic system as I figure out how.
Mnesia configuration - I like to use mnesia for my configuration. I have a config module which provides functions for setting and getting values in the mnesia config database. I also want to use mnesia to store all of the information for the actual pastes. I hear mnesia falls apart after storing a lot of data, and I can imagine some of these pastes will grow to a fair size, so I am considering using a cache for the syntax highlighted pastes. Running the highlighter on every hit sounds inefficient, and storing the highlighted output for every paste sounds like a waste of space, so a cache is probably a good in-between. Right now I would like to store as many pastes as I can, but I will consider deleting those that are too old.
Download paste - Being able to download the paste is always very helpful. Providing a nice filename that ends in the proper extension for the language would be nice.
Browsing recent pastes - You paste to my site, you lose all privacy, go figure.
File upload - Sometimes it is easier to just upload a file rather than pasting.
This sounds like a lot, but I don't think it will be too bad. It seems like I should be able to do most of it in a fairly modular way.
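To pin down the highlight cache idea from the mnesia item above, here is a rough sketch in Python (just for illustration, the real thing would live in the erlang app). It is a small least-recently-used cache. Everything named here is an assumption: `HighlightCache`, `fake_render`, and the choice to key on paste id plus language are made up for the example, and `fake_render` only stands in for shelling out to the real highlighter.

```python
from collections import OrderedDict
from typing import Callable, Tuple


class HighlightCache:
    """Small LRU cache mapping (paste_id, language) to highlighted HTML."""

    def __init__(self, render: Callable[[str, str], str], max_entries: int = 128):
        self._render = render            # hypothetical wrapper around the highlighter
        self._max_entries = max_entries
        self._entries: "OrderedDict[Tuple[int, str], str]" = OrderedDict()
        self.misses = 0                  # how many times the highlighter actually ran

    def get(self, paste_id: int, language: str, source: str) -> str:
        key = (paste_id, language)
        if key in self._entries:
            self._entries.move_to_end(key)       # mark as most recently used
            return self._entries[key]
        self.misses += 1
        html = self._render(source, language)
        self._entries[key] = html
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)    # evict the least recently used entry
        return html


def fake_render(source: str, language: str) -> str:
    # Stand-in for the external highlighter; no real highlighting happens here.
    return '<pre class="%s">%s</pre>' % (language, source)
```

This assumes a post never changes once it is stored (a reply is a new post), so a cached entry never goes stale. If pastes become editable, the key would need to include something like a revision number.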
Step one - Come up with mnesia tables, do some research on mnesia in terms of foreign keys and possibly how to do decent QLC queries. I think learning QLC will be important, especially when a lot of pastes get put into this thing.
Step two - Get the basic form for uploading going.
Step three - Come up with step three when I get there.
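Before touching mnesia for step one, it helps me to write down what a post in a paste thread might look like. This is only a sketch in Python, not a table definition: the field names, `reply_to`, and `posts_in_thread` are all hypothetical. The `thread_id` field plays the role of the foreign key, and the list comprehension stands in for the kind of query I expect to write in QLC.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Post:
    post_id: int
    thread_id: int                    # "foreign key" to the thread this post belongs to
    language: str
    code: str
    annotation: str = ""
    parent_id: Optional[int] = None   # the post being replied to, if any


def posts_in_thread(posts: List[Post], thread_id: int) -> List[Post]:
    # Select every post of one thread, oldest first (assuming ids grow over time).
    found = [p for p in posts if p.thread_id == thread_id]
    return sorted(found, key=lambda p: p.post_id)


def reply_to(parent: Post, new_id: int) -> Post:
    # A reply starts out with the parent's code, so only small edits are needed.
    return Post(post_id=new_id, thread_id=parent.thread_id,
                language=parent.language, code=parent.code,
                parent_id=parent.post_id)
```

The annotation is deliberately left empty in a reply, since the comment about the new version is the part the replier has to write.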
UPDATE
Some obvious ideas were brought to my attention:
Indent - Running various languages through astyle and friends would be very helpful. Some people just can't indent properly.
Customize with cookies - Store various color information in cookies so people can keep colors they enjoy. This probably won't be implemented until much later.
Differences - Highlight differences between pastes so you can show what changes have been made. This sounds kind of difficult, especially if I am outsourcing the highlighting to a third party.
Line numbers - This should be obvious.
Non GUI Browsers - Yes, some people use these. The download as text option should be helpful for these people, but they probably also want line numbers, so a specific non GUI version of the paste that includes line numbers might be nice.
RSS - This certainly isn't a need but might be nice, especially since as the maintainer I might want to keep track of who's pasting what.
Intelligent Mouseovers - Showing the matching paren when the mouse is over one.
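One way the differences idea might stay manageable, even with the highlighting outsourced, is to diff the raw text of two pastes and only remember which line numbers changed. The sketch below is in Python and uses the standard `difflib` module; `changed_lines` and `numbered` are names I made up for the example. It also shows the plain-text rendering with line numbers that a non GUI browser could use.

```python
import difflib
from typing import List, Set


def changed_lines(old: str, new: str) -> Set[int]:
    """Return 1-based line numbers in `new` that were added or changed."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changed: Set[int] = set()
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            changed.update(range(j1 + 1, j2 + 1))
    return changed


def numbered(text: str, marked: Set[int]) -> List[str]:
    """Plain-text rendering with line numbers; marked lines get a '+' prefix."""
    out = []
    for number, line in enumerate(text.splitlines(), start=1):
        flag = "+" if number in marked else " "
        out.append("%s%4d  %s" % (flag, number, line))
    return out
```

Lines that were only deleted do not show up, since they have no line number in the new paste. Showing those would need a second pass over the old text.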