Code friendly themes…

January 8, 2009

Being new to the blog writing world (if it wasn’t super-obvious), I’m having a hard time finding a theme where code doesn’t get cut off.  This one seems to be the only one.

If there’s a better one on WordPress, I’m open to suggestions.

Anonymous Functions in Erlang

January 8, 2009

Ok, so the first example will be the continuation from this post I mentioned before.

Here’s a (semi) real world example I had to build today for a real project.  This is a simpler form, but the purpose is still the same.

The problem I was trying to solve was that I wanted to use epgsql to access a Postgres database. I could have used Erlang’s ODBC support, but it’s incredibly difficult to get ODBC to function properly on some flavors of Linux (near impossible, it seems, on CentOS 5).

Once I gave up on ODBC in a fit of frustration, I decided to build a Java/JDBC -> Erlang bridge (remember, I’m a Java guy by nature), but before I finished typing my “public static void main(String[] args){}”, an email showed up on the Erlang mailing list from Will Glozer announcing epgsql.

Well, gee, that saved me a bunch of time.

It’s an excellent library, and I’m using Postgres.  So, problem solved.

The way this driver works is you can either give it a SQL statement and it returns a list of rows, or you can create a prepared statement, bind inputs (if any) and repeatedly call back to get a list of N rows at a time.

So basically, it’s a streaming result set driver.  Works very much like JDBC with the added benefit of being able to bulk process rows (rather than call resultSet.next() a million times).

In my case, I’m selecting around 15k rows, so I don’t really want them all at once.  So I needed a general means of iterating over the rows performing a transform on each row.

So I made a little utility module that wraps epgsql to give me a streaming iteration mechanism.

Here’s a simplified version of the code (the real code was a bit more complex because it iterates over 100-row blocks):

-module(funs).
-include_lib("eunit/include/eunit.hrl").
-export([lfold/3, map/2, each/2]).
-export([start/0]). 

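%% Applies Fun to every streamed value. Each result is prepended to the
%% accumulator, so the returned list is in reverse stream order.
map(Server,Fun) ->
    lfold(Server,[],fun(Val,Acc) -> [Fun(Val) | Acc] end).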

map_test() ->
    ?assertEqual([2,3,4,5,6,7,8,9,10,11],
                  map(start(),fun(Val) -> Val+1 end)).

each(Server,Fun) ->
    {ok,lfold(Server,0,fun(Val,Counter) -> Fun(Val), Counter+1 end)}.

each_test() ->
    ?assertEqual({ok,10},
                 each(start(),fun(Val) -> io:format("Each: ~p~n",[Val]) end)).

lfold(Server,InitialVal,Fun) ->
    lfold_loop(Server,Fun,InitialVal,get_next(Server)).

lfold_test() -> % test for 10+9+8+7+6+5+4+3+2+1
    ?assertEqual(55,
                 lfold(start(),0,fun(V,Acc) -> V+Acc end)).

%% Actual stream loop, pulls values and calls the supplied fun() with
%% each value and the result of the last call.
lfold_loop(_Server,_Fun,Val,done) ->
    Val;
lfold_loop(Server,Fun,Val,Next) ->
    lfold_loop(Server,Fun,Fun(Next,Val),get_next(Server)).

%% Useless part just for the example
%% creates a "server" that will hand out values until it gets to zero
%% where it will return "done"
%%
%% It's here just to emulate a streaming service, like TCP/IP communications
%% More specifically, I'm using 'epgsql' to query a PostgreSQL database for a
%% real world project.
%% epgsql lets you setup a cursor and you repeatedly call back in to it
%% to get more rows. (they do this via a gen_fsm server talking to postgres)
start() ->
    spawn_link(fun() -> simple_server(10) end).

simple_server(0) -> % nothing left to return, signal 'done' on next query
    receive
        {From,next} ->
            From ! {self(),done}
    end;

simple_server(Count) -> % return next value and loop
    receive
        {From,next} ->
            From ! {self(),Count},
            simple_server(Count-1)
    end.

get_next(Server) -> % convenience function to get next val
    Server ! {self(),next},
    receive
        {Server,Val} ->
            Val
    end.

Due to the power of funs, you really only have to implement a left fold.   Basically, a fold is an iteration over a list of items where for each item you call a supplied function passing in the current item and the result of the last call.  You kick it off with a starting value.

So, if I want to add up the cost of all the widgets, I could do so by folding over the widget list, passing in an initial value of zero and a function that takes the current widget’s value + last value (which starts at zero).  Once I’ve completed the iteration, I have a total.

If I have just a list of values, e.g. [1,2,3], I could also do this with lists:sum(MyList). But a fold is a more generalized form that lets me call any arbitrary code.
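As a minimal sketch of the widget example, here is the same idea over a plain in-memory list using the standard lists:foldl/3. The module name and the {Name, Cost} tuple shape are assumptions made up for illustration; they are not from the real project.

```erlang
-module(widget_total).
-export([total_cost/1]).

%% Widgets are assumed (for this sketch only) to be {Name, Cost} tuples.
%% The fun receives the current widget and the running total, and returns
%% the new running total. The fold starts from 0.
total_cost(Widgets) ->
    lists:foldl(fun({_Name, Cost}, Sum) -> Cost + Sum end, 0, Widgets).
```

Calling widget_total:total_cost([{gear,5},{cog,7},{sprocket,3}]) returns 15, and an empty list returns the starting value, 0.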

Erlang’s lists module also offers fold operations (lists:foldl/3 and lists:foldr/3), but for this case I couldn’t use them since I don’t have all my data in one big list.
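The real code pulls rows in blocks rather than one value at a time. It isn't shown in this post, so the following is only a rough sketch of what a block-wise fold could look like. The Fetch fun and its return values ({partial, Rows, NextFetch} and {ok, Rows}) are hypothetical stand-ins and are not the epgsql API; list_fetcher/2 is a helper invented here so the example can run without a database.

```erlang
-module(block_fold).
-export([fold_blocks/3, list_fetcher/2]).

%% Fetch is a hypothetical zero-argument fun meaning "give me the next
%% block of rows". It returns {partial, Rows, NextFetch} while more rows
%% remain, and {ok, Rows} for the final block.
fold_blocks(Fetch, Acc0, Fun) ->
    case Fetch() of
        {partial, Rows, NextFetch} ->
            %% Fold this block, then carry the accumulator into the next one.
            fold_blocks(NextFetch, lists:foldl(Fun, Acc0, Rows), Fun);
        {ok, Rows} ->
            lists:foldl(Fun, Acc0, Rows)
    end.

%% Helper for the example: serves an in-memory list in blocks of at most
%% N items, imitating a cursor that hands back rows a block at a time.
list_fetcher(List, N) when N > 0 ->
    fun() ->
        case length(List) =< N of
            true ->
                {ok, List};
            false ->
                {Block, Rest} = lists:split(N, List),
                {partial, Block, list_fetcher(Rest, N)}
        end
    end.
```

With those assumptions, block_fold:fold_blocks(block_fold:list_fetcher(lists:seq(1,10), 3), 0, fun(V,Acc) -> V+Acc end) returns 55, the same total the single-value version produces. Only one block is held in memory at a time.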

Once you have a generic left fold function that takes a fun, you can easily implement map: call fold with a fun that wraps the supplied fun and builds a list from its results.

map(List, Fun) -> lfold(List,[],fun(Item,Acc) -> [Fun(Item) | Acc] end).

This takes each item, runs Fun(Item) and collects the result into a list. Because each result is prepended, the end result is a backwards list. If the order is important (often it isn’t), then a small change could be made:

map(List,Fun) -> lists:reverse(…from above…)

Which would give you a result list in the same order as the original input.
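Spelled out in full, the order-preserving variant might look like the sketch below. To keep it runnable on its own, a plain list and lists:foldl/3 stand in for the stream-backed lfold; the module name and that stand-in are assumptions for the example only.

```erlang
-module(ordered_map).
-export([map/2]).

%% Stand-in for the stream-based lfold: a left fold over a plain list.
lfold(List, Init, Fun) ->
    lists:foldl(Fun, Init, List).

%% Prepending builds the results backwards, so reverse once at the end
%% to get them back in input order.
map(List, Fun) ->
    lists:reverse(lfold(List, [], fun(Item, Acc) -> [Fun(Item) | Acc] end)).
```

For example, ordered_map:map([1,2,3], fun(X) -> X * 2 end) returns [2,4,6]. Prepending and reversing once at the end is the usual Erlang idiom, since appending to the tail of a list on every step would copy the list each time.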

And if you are not interested in the result, but are iterating over your list (or stream in this case) for the side effects (say, printing them out, or sending them to a different stream), the same goes for building a for each function.

each(List,Fun) -> lfold(List,ok,fun(Item,Ignored) -> Fun(Item), Ignored end).

Like the map example, this uses lfold as well, but instead of building a result list, it discards the results of Fun(Item) and just returns the atom ‘ok’ in this case (could be anything).

My example above actually returns {ok,Count} just to be helpful in knowing how many iterations it did.  It’s entirely unnecessary.

So in 10 lines of actual code, I have a general abstraction that I can use to iterate over a stream of input data to perform various functions. Building something similar in Java would be far harder: many more lines of code, several files (unless you made everything inner classes), and nowhere near as generic.

Hopefully this helps someone.

hello() -> world.

January 6, 2009

I’ve been thinking about trying to put down my experiences with learning a new (to me) paradigm in programming.  Lately, I’ve seen a lot of people struggle with the same foreign concepts that I did.  So I suppose it’s as good a time as any.  The proverbial “straw” for me was this post with this followup.  I thought that it may be more useful for folks if I share whatever I’ve learned with a larger audience than my trolling around and doing drive-by comments on people’s blogs.

This blog will cover my interests in various languages, most of them functional and the ones that aren’t really functional languages I still like for the functional aspects (anon-functions / closures / etc).

Hopefully I can make that transition, or learning experience, a little easier on my dys-functional brethren.

Note, this post clocks in at around 1400 words, feel free to skip the rest of it. The purpose of the rest of the post is to shed some light on my background so you understand where my posts are coming from or, equally likely, decide to ignore the blog in its entirety due to my lack of formal training.

So here goes…

I’m a self-taught software developer: no formal training in CS, didn’t attend college and did rather poorly in school starting at 3rd grade. I’m not so good at math, and failed algebra 1 twice for not doing the homework. This kept me from taking “advanced” programming in high school, which was Pascal. That may have actually been a benefit, not sure since I never actually learned Pascal. Ultimately, I took the California proficiency exam and left school at the beginning of my junior year.

I started my computer career around 9 years old with BASIC on an Atari 800, eventually upgraded to an Apple IIe where I had my first “paid” programming job at the age of 12.

It wasn’t a real job, mind you, it was a project my father had contracted for with a university that he subcontracted to me (for I think $3 an hour, big money at that age).  What was it?  The first version of a project called KidNet that used MCI Mail to send weather data around between elementary schools across the U.S. so they could see what the weather is like everywhere.  (this was pre-public internet)

It worked…on my machine.  We never got it working on the school’s computers.

So I suppose I started out my career by failing at 12.  But hey, at 12 years old, I really didn’t care if it worked for them.  I was more interested in playing games, running my ASCIIExpress download site and Star Wars.

After I got my first IBM PC w/expansion chassis, I was able to start “real” programming (QuickBASIC, heh).

Eventually, this turned into Windows 3.11 and Visual Basic, and then, all of a sudden, it happened…

At around 17, a friend loaned me a video tape (VHS) from Borland by a guy named David Intersimone (Thank you David, I firmly believe that you helped kickstart my path towards software development as a career rather than a hobby).

Here’s the 3 minute intro to it. I’m hoping to find the whole thing, both for nostalgia reasons and because I truly believe it was the best OO programming “class” ever created.

I watched that video and it clicked, object oriented programming (C++ in particular) was the way to go.  Basic was all of a sudden lame.

Based on my new super-duper-C++ skills, I wrote “dcd” (David’s Change Directory) which was a straight ripoff of the concept behind “lcd” (Led’s Change Directory by Keith Ledbetter (Thank you Keith, ‘lcd’ was the inspiration for me to actually start writing useful code)). Both were basically helper apps for a console based graphical (think curses) directory changer for DOS. I wrote mine as more of an experiment and because lcd didn’t do exactly what I wanted. It was released as freeware, not sure if anyone but me ever used it (but I used it a lot). If you have a DOS machine and are interested, my inspiration “lcd” is available here.

After leaving school, I entered the wonderful world of QA.  Spent about 8 years doing that before becoming a real live paid professional programmer (Visual C++ / Windows NT) about 13 years ago.

Around that same time, I got my first new project assignment, in a new language called Java (it was version 1.02 at the time, I believe). That project was to build a dynamic code generation / compile / reload system. Basically, it was the hot deployment you get with JSP / J2EE but well before that was a common practice.

I was on a flight to Korea for my company to work on a C++ project, and I took my printout of the language spec / tutorials from Sun with me. By the time I landed, I knew Java. It was such an easy conversion from C++ since a lot of the concepts were the same; it was basically just C++ minus a bunch of syntax and plus a ton of convenience libraries. I mostly abandoned C++ at that time because I found Java to be a far more time efficient language to program in. Basically, I could build more stuff in less time. Performance was a non-issue for my type of apps, so that wasn’t a factor in choice of language at the time.

So for the last 12 or so years, I’ve been strictly working in Java.  Both professionally as well as contributing small bits and pieces to JBoss and XDoclet + random bugfix patches to other open source applications.

As someone who constantly strives to learn something new (be it technique, library, language, whatever), I eventually got bored.  Don’t get me wrong, I really like Java, and I’m extremely good at building systems with it.  But at some point, there’s nothing else to learn, other than the current flavor of the month library which is almost assuredly drowning in XML / Annotation based configuration.

At my current company, I found myself looking for a better way to deal with system configuration and the ability to rapidly release changes in emergency scenarios. So I thought that an embeddable scripting language for Java to build “live” configuration files would be a good approach. Basically, I wanted a system that knew the context it was executing in and self adjusted configuration appropriately.

Annotation configs aren’t flexible, and XML files require either having multiple environment files or a super-complicated and usually fragile XML config generation system.

It turned out that Groovy was the best fit at the time: it has a near-Java syntax with the added benefit of List/Map literals. It was also my first experience with a dynamically typed language other than Perl (which I never got and still don’t like). The learning curve was zero and it gave me an executable configuration file.

The most interesting thing about Groovy (for me) was anon functions/closures (officially called closures in groovy land, which depending on which flame fest you read online, are or are not closures.  I get the difference, and even if I don’t use my enclosing scope’s free variables, I’m still going to call it a closure).

I found the style of Groovy code liberating, and the possibilities of making “simple” (relative term, you have to understand the syntax before it’s simple) code are much greater. For a C++/Java guy, it was amazing to be able to do a one liner like: someList.findAll { it % 2 == 0 }.each { println "Even number: ${it}" }

At any rate, the closure flame fest I mentioned above is what actually turned me on to functional programming.  People kept throwing out Haskell this, ML that, blah, blah, blah.  So, it got me to start looking in to these other languages.  Haskell, of course, immediately turned me off to functional programming.  The Haskell barrier to entry is way too high for someone without a CS background that has been doing C++/Java most of their life.

While still trolling around a bit on functional languages, I found out about Scala and Erlang.  That’s when I fell in love with the concepts and simplicity of functional programming.

Which brings me back to why I’m starting this blog:

It’s for the “mainstream” developers (i.e. C/C++/Java/C#) that are either looking over at the functional side of the fence wondering what all the hubbub is about, or people that just feel there has to be a better way to do things, but may still be struggling with the concepts.