On Abstraction

Finally: a post about computer programming!  It’s taken a few weeks to get around to writing something new for this blog, but it’s been a busy few weeks full of programming.  So let me jot down my thoughts about it while they’re still fresh and I have something worthwhile to share on the matter.

Abstractions are how computers work.  Or rather, they’re how computers do any real work.  Digital logic is how they physically work.  One of the more important abstractions in the last 30 years is the desktop metaphor.  You keep all your tools (pens, pencils, whatever) in a drawer.  Start menu.  You leave little sticky notes all over for temporary bits of info you want to keep handy but probably won’t save.  Notepad.  There’s a phone.  Skype.  There’s a virtual analog to anything in the real world you might keep near your desk to get work done.  Of course nowadays, probably the only thing sitting on your desktop is your desktop computer.  Well, that and a coffee cup.  We can’t virtualize that, thank God.

Large programs only get written because we can abstract away the minuscule details.  Languages are developed to give the programmer (still a human being!) a means of expressing what the computer is meant to do.  Computers are dumb.  They follow a script of instructions, whether those instructions lead to the expected result or not.  To keep errors in logic to a minimum, we develop high-level languages that translate our ideas into machine language, something the computer can follow.  Let’s use an example to show why we need abstractions to get most things working.

Say I’m figuring out if our department used all of its budget for this year, or if we’ve got a surplus we need to quickly spend on a kick-ass Christmas party.  Pretty important business stuff, right?  We know how much we are allowed to spend (our budget) but we don’t know exactly what we’ve already spent (our expenses).  We’ve kept records throughout the year of purchases made and saved them in a database.  Sounds pretty easy: we’ll just add up all our expenses, then look at the difference between that and our budget.

How do we get the computer to do these tasks?  We have to write out what we want in some type of language.  Let’s invent a new language for this purpose!  Say our language deals specifically with queries against this one database, just to keep things simple for now.  We write a statement:

$budget - (sum of each $expense) = $party

Easy enough.  But these words mean nothing to a computer.  We need an interpreter or a compiler that can turn this into something the computer runs.  We’ll go with an interpreter, which has to break this statement down into its components.  Let’s say it looks at everything left to right.
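
To make that "break it down into components" step a bit more concrete, here is a rough sketch in Python.  The tokenize function and its pattern are my own invention for this post, and they assume our made-up language only ever contains $variables, one parenthesized expression, and the symbols - and =.  A real interpreter would be far more careful.

```python
import re

# Hypothetical token pattern for our made-up language: a parenthesized
# expression, a $variable, or one of the symbols - and =.
TOKEN_PATTERN = re.compile(r"\([^)]*\)|\$\w+|[-=]")

def tokenize(statement):
    """Split a statement into its components, reading left to right."""
    return TOKEN_PATTERN.findall(statement)

tokens = tokenize("$budget - (sum of each $expense) = $party")
print(tokens)
# ['$budget', '-', '(sum of each $expense)', '=', '$party']
```

With the statement split up like this, the interpreter can take the pieces one at a time.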

$budget

This one’s easy.  $budget is what we call a variable.  It holds a number, any number.  Because our budget is (sadly) not defined by us, our interpreter knows where to get it.  It puts in the value 1,000 because that’s all our meager department gets.  The next part:

-

Ok, we have some math!  But what are we subtracting?  We’ll have to look at the next term to know.

(sum of each $expense)

This is where things get tricky.  Here we have an expression, and it’s up to our interpreter to figure out what it means.  Ultimately we need to get back a single quantity.  Our interpreter turns our expression into something like:

find all expense records
start with a total of zero
add the value of this expense to our total
is there another record?
yes, repeat
no, return the value we totaled
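
For comparison, here are those same steps written as a short Python function.  This is only a sketch: sum_expenses is a name I made up, and it assumes the "find all expense records" step has already handed us a plain list of amounts.

```python
def sum_expenses(expenses):
    """Add up a list of expense amounts, one record at a time."""
    total = 0                  # start with a total of zero
    for amount in expenses:    # is there another record? yes, repeat
        total += amount        # add the value of this expense to our total
    return total               # no more records: return the value we totaled

# Made-up amounts, purely for illustration.
print(sum_expenses([120, 45, 30]))  # 195
```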

Now here, the interpreter runs some code that is part of its library to handle some of these tasks.  A library is a set of common routines that can be shared among many programs.  Finding all the expense records is a huge endeavor, and it probably relies on many libraries of its own.  We’ll abstract that away (like we’ve been doing) since we don’t care about details, as long as we get the result we wanted.

Let’s look deeper at this algorithm.  We’re accumulating our expenses, so we have to have some memory stashed away to remember our total.  We’re also looking at each expense, so we need to know how to find the next one.  A lot of this depends on the actual data structure used to hold each record, but we can imagine that this sequence looks something like:

    xor r1, r1 ; zero out accumulator
    mov r0, rec_ptr ; pointer to first record (zero if there are none)
loop:
    and r0, r0 ; check for end of list
    jz done
    add r1, [r0+2] ; amount in dollars, rounded up
    mov r0, [r0+4] ; pointer to next record
    jmp loop
done:
    ret ; total is left in r1

This is all hypothetical.  In real programs, there would be checks to make sure our data is valid and we aren’t stepping into memory regions where we don’t belong.  But here you see our algorithm translated into assembly language, which is still a very human-readable format but much closer to the instruction set that the computer itself uses.  Assembly language itself is encoded into instructions that contain all the information the computer needs to read and write memory, add numbers, compare quantities, and jump to a new point of execution.
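
If assembly isn’t your thing, the same walk over a chain of records can be sketched in Python.  Everything here is an assumption on my part: the Record class, its amount and next fields, and the three sample amounts are invented to mirror the intent of the listing above, not taken from any real database.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Record:
    amount: int                       # amount in dollars, rounded up
    next: Optional["Record"] = None   # next record, or None at the end of the list

def total_expenses(first):
    """Follow the chain of records and accumulate their amounts."""
    total = 0                     # zero out accumulator
    record = first
    while record is not None:     # check for end of list
        total += record.amount    # add this record's amount
        record = record.next      # move on to the next record
    return total

# A hypothetical three-record list.
records = Record(400, Record(350, Record(248)))
print(total_expenses(records))  # 998
```

The while loop plays the part of the jz and jmp pair, and the total variable plays the part of the accumulator register.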

Eventually, our routine runs and we find that our expenses total $998.  The interpreter uses this answer to finish the subtraction operation from before.  $1000 – $998 = $2.
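
Putting the pieces together, the whole statement can be sketched in a few lines of Python.  The environment dictionary and run_statement are names I invented for illustration.  The budget of 1,000 and the total of 998 come from our example, but the individual expense amounts are made up.

```python
def run_statement(environment, expenses):
    """Evaluate: $budget - (sum of each $expense) = $party."""
    result = environment["$budget"] - sum(expenses)
    environment["$party"] = result   # store the answer in the $party variable
    return result

# $budget is handed to us from outside; we don't get to pick it.
environment = {"$budget": 1000}
party = run_statement(environment, [400, 350, 248])
print(party)  # 2
```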

Therefore, we have a measly $2 to spend on our holiday party and Christmas is cancelled this year.
