CATEGORY: technology

July 13, 2005

neologism nausea

file under: technology

i've been having some issues with a few neologisms that have hit the internet and blogosphere (ahem) lately.

it's not so much that new words bother me (although some, like nucular, most definitely do). i came to the realization today that it's their origins that can bother me.

take AJAX as an example.

i'm not talking about the popular cleaning product that guarantees it will get your sink and tub pearly white with little to no elbow grease involved. no, i'm talking about the term coined by a notable person at a design firm in san francisco (name and link withheld to avoid unnecessary page rank bloat).

it was coined in an effort to describe a collection of technologies that have been around for a long time (in Internet years, at least). people have been using these technologies for a variety of things (google maps, for example; or even microsoft outlook web access), but they've done so without the comfort of a name to say what it was precisely that they were doing.

and so this design agency author made one up. it's an acronym, although he avoided the dreaded TLA (three–letter acronym) that is the focus of so many consultant jokes. it's also memorable. kinda catchy. almost sounds like marketing.

and that's what it is. marketing.

ever since i first heard the term, it was bugging me. it bothered me that they made up a new term for something that already existed. it bugged me that it wasn't, strictly speaking, technically correct. it bugged me in general, but i couldn't figure out just why. and then i realized why: because they stood to benefit financially from the creation of a new term, something that could become a meme in the internet world. something that everyone would pick up and say, "AJAX? oh yeah! that firm in ___ ___ invented it!"

bzzzzt. wrong. they didn't invent it. they just knew how to market it and their ability to explain it — intelligently and in a ready–for–publication way. i don't fault them for their insight regarding the patterns of usage of this particular technology combination. what i do fault them for is shameless self–promotion. one might say that they were just pointing something out for the benefit of the internet community, humanizing a technology to help it be better understood. i've been in this business long enough to know that's about as likely as a beautiful snowflake in the molten pits of hell.

the other term that caused me to get all twitchy was folksonomy. this is a conflation of "folks" and "taxonomy," meaning a classification system created by normal people (e.g., not librarians or those prone to organizing their socks by color, then texture, then projected lifetime). think of it as the dewey decimal system for crackers (a harsh and not–wholly–accurate analogy, but work with me here).

the idea is a very important one, but the term is just silly. just call it tagging and be done with it, ok? why was there a need to come up with a cutesy term?

***

if you're gonna come up with a new word for something, make sure your motives are pure. do it because there really needs to be a new word, in my opinion. otherwise, you just wind up looking like a linguistic poseur, and we all know how much everyone hates linguistic poseurs.

Posted by docrpm on 07.13.05 at 12:50 AM | Comments (3) | TrackBack (0)

February 4, 2005

death by digital proxy

file under: technology

a few weeks ago, i ceased to exist.

...

at least, that's what one of my friends thought. like other content junkies who want maximum information with minimum distraction, he uses a syndicated content aggregator (like bloglines or feeddemon or netnewswire) to chow as many headlines as his brain can handle. (for a refresher, see my synopsis below: the wonderful world of blog syndication)

this is all fine and well, even good — possibly great. it's a useful strategy for handling information overload, while steering clear of many of the landmines on the info superhighway (e.g., spam from mailing lists you never read, web sites full of advertising you don't care about, etc.). the problem comes when the content oil stops coming down the syndicated pipeline, as it were.

let's take my site as an example. suppose you subscribed to my site. you go to your aggregator every day to check the latest from the blogosphere (ack), and you notice after a while that i'm just not writing any more. it's been six months, and not a single post. hmm. interesting. looks like ryan stopped writing; i wonder why?

the problem with our brave new world of mediated experience is that we draw conclusions from unreliable digital proxies. if an RSS aggregator says i'm not writing, then to a lot of people, i'm not writing. maybe i moved to alaska and fell off the grid. who knows? a phone call or an email or a trip to my web site would clarify, but in a world where the sands of time are coated with teflon, it's just too much effort.

and so, from a limited digital perspective, i ceased to exist.

cause of death? carelessness

in my case, my feed died due to sheer carelessness: in my headlong rush to redesign, and to clean up the architectural mess that was causing me to lose sleep, i altered the directory structure on my site. two things resulted: (1) a wonderful simplification in the way my blog files were organized (which no one but me cares about), and (2) a dead RSS feed. dead simply because the file containing my feed moved from one place to another.

sorry...[geek shudder]...my bad.

for reference, here are the proper URLs for the syndicated version(s) of this site (pick your XML format of choice):

if there's any solution to the problem of moved and dead feeds, i couldn't find it. at present, it looks like a pretty messy problem (see technobabble discussion below).

the emergent properties of technology–mediated experience

this might seem like a problem that will affect only the weeniest of the techno weenies. i don't think it's that simple. mediated experience is giving birth to unexpected things; it will affect more and more people as time passes.

the Wikipedia defines an emergent property as follows:

An emergent behaviour or emergent property is shown when a number of simple entities (agents) operate in an environment, forming more complex behaviours as a collective. A system made of several things can host properties which the things themselves do not have...[snip]...Emergent properties arise when a complex system reaches a combined threshold of diversity, organisation, and connectivity. The property itself is often unpredictable and unprecedented, and represents a new level of the system's evolution. The complex behaviour or properties are not a property of any single such entity, nor can they easily be predicted or deduced from behaviour in the lower-level entities.

the internet and everything digital attached to it (e.g., browsers, blogging applications, and RSS aggregators to name just a few) can be viewed as a system of relatively simple (ahem) entities. unpredictable things are starting to happen as we combine and recombine all of the parts of this system, and as we use them in ways no one could have imagined. this is obvious. what i think is less obvious is that our human (social) experiences are becoming a part of this system, and they are being affected as a result.

as we rely more and more on technologies to mediate our experience, we subject ourselves to the vicissitudes of digital systems, and more importantly to what their agents tell us. quotidian changes (like moved files or dead servers) can have broad consequences (both visible and invisible). my RSS feed dies due to a moved file, a friend concludes i am no longer writing, and we lose touch for four months. how would our lives have been different if that hadn't happened? maybe he would have read a particular entry in my blog that sparked a thought that led to an action that caused an event that changed the course of his life (even in the simplest way). this wasn't possible, though, because his RSS aggregator led him to a wrong conclusion.

we suffer and benefit from our reliance on digital proxies. we suffer for their inaccuracies; we suffer because we can't always interpret what they're saying; we suffer for the laziness they engender. at the same time, they enable communication and interaction that wouldn't otherwise be possible; we are richer because of them.

whether or not someone reads my blog is a small, immaterial thing. how many of these small things does it take, though, to have broader social consequences? after all, great events may shape the world, but not without the million small events that make them.

where do we go from here?

we create the proxies; this isn't the matrix and there's no malevolent AI running around trying to do us in through addiction to technology. we are the ones actively mediating our experience. maybe we do it because the benefits seem to outweigh the costs; maybe it's just a matter of laziness. in either case, it seems we would be wise to really think about what we're doing, because at some point, the costs will be too high, and there will be no going back.

...

the wonderful world of blog syndication

a synopsis of syndication
syndication is a method of providing content on a periodic basis to a set of interested readers (or other content providers, who subsequently redistribute). this is usually done with news, but it translates quite nicely to other things. the application of syndication to blogs is simple — anybody can "subscribe" to this blog and get quick access to all the latest headlines (and maybe more).

how syndication is done with blogs
any syndicated blog provides one or more feeds. each feed is really just a Web link to a text file that contains various information about the blog in question (latest headlines, excerpts, author, etc.). every time the blog gets updated, so does the feed. all you need is the address (URL) of the feed, and something that knows how to read the feed, and you're in business, reading blog headlines and digesting the blogosphere like so much digital chicken.

really simple, right? that's why the most popular data format for feeds is called RSS (which stands for Really Simple Syndication, among other things). a few different formats exist (each with benefits and drawbacks); by the way people argue about this stuff, you'd think they were talking about religion. developers regularly get pissy about RSS 0.93 vs. RSS 1.0 vs. RSS 2.0 vs. Atom vs. snerd feebler's really cool syndication format (SFRCSF). you can safely ignore this discussion for the most part (i do; i'm hoping the people smarter than me eventually sort it out and then share their wisdom quietly in the form of stuff that just works).
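
to make "a text file that contains various information about the blog" a bit more concrete, here's a minimal sketch of what a feed reader does with an RSS 2.0 feed. the sample feed, its titles, and its example.com URLs are all made up for illustration (they are not this site's real feed), and `read_headlines` is a hypothetical helper name; a real aggregator would also have to cope with Atom, RSS 1.0, and malformed XML.

```python
import xml.etree.ElementTree as ET

# a made-up RSS 2.0 document; titles and URLs are placeholders
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>example blog</title>
    <link>http://example.com/</link>
    <description>a hypothetical blog</description>
    <item>
      <title>neologism nausea</title>
      <link>http://example.com/archives/neologism.html</link>
    </item>
    <item>
      <title>death by digital proxy</title>
      <link>http://example.com/archives/proxy.html</link>
    </item>
  </channel>
</rss>"""


def read_headlines(feed_text):
    """return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_text)  # root is the <rss> element
    headlines = []
    for item in root.findall("./channel/item"):
        headlines.append((item.findtext("title"), item.findtext("link")))
    return headlines


for title, link in read_headlines(SAMPLE_FEED):
    print(title, "->", link)
```

everything the reader knows about the blog comes from that one file, which is why moving or deleting it makes the blog look dead.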

why bother? who cares?
syndication and feeds make it easy to cover a lot of ground on the web. if you subscribe to 100 feeds (from 100 Web sites), you can pretty easily scan the headlines from all of those sites in 10-15 minutes, depending on how much they publish (this doesn't include any time you might spend reading complete articles you find along the way). so, getting lots of content is one benefit (although cable TV is a clear counterexample to the more–is–better way of thinking). the other nice thing about syndication is that (at this point) it contains no advertising; it also doesn't require you to share your email address to get the syndicated content (it's a pull technology, where you grab what you want, rather than a push technology, where content is pushed to you via email, for example).

technobabble about dead feeds

you'd think that this problem of dead feeds wouldn't be a big deal. we're smart; why not build a better RSS mousetrap so that when a feed disappears, the feed reader figures out if it just moved, or if it's actually dead? good question. there seem to be at least two technical problems:

problem 1: RSS auto-discovery is hard
it's not easy to automagically figure out what the feed is for any given Web domain. in some cases, like yahoo! news, there's more than one feed, which makes it pretty much impossible without human intervention to say which feed disappeared. as a result, if a feed disappears, there's just no simple, automated way to look around its parent domain to see if it moved or if it's indeed dead as a digital doornail. (jeremy zawodny has a good summary of RSS auto-discovery issues.)
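
one common auto-discovery convention is for a page to advertise its feeds with `<link rel="alternate">` tags in the HTML head. the sketch below, using only the python standard library, shows both the idea and the ambiguity: when a page lists more than one feed, nothing in the markup says which one replaced the feed that vanished. the sample page and its example.com URLs are invented, and `FeedLinkFinder` and `discover_feeds` are hypothetical names.

```python
from html.parser import HTMLParser

# MIME types conventionally used to advertise feeds in <link> tags
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

# a made-up page that advertises two different feeds
SAMPLE_PAGE = """<html><head>
<title>example news</title>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="alternate" type="application/rss+xml" href="http://example.com/top.xml">
<link rel="alternate" type="application/atom+xml" href="http://example.com/world.atom" />
</head><body></body></html>"""


class FeedLinkFinder(HTMLParser):
    """collect the href of every <link> tag that looks like a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)  # attribute values may be None
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        if "alternate" in rel and mime in FEED_TYPES and attr.get("href"):
            self.feeds.append(attr["href"])


def discover_feeds(html_text):
    """return the feed URLs advertised by a page, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html_text)
    return finder.feeds


print(discover_feeds(SAMPLE_PAGE))
```

two results come back, and a reader that lost a third, unlisted feed has no way to pick between them. pages that advertise nothing at all return an empty list, which is the other half of the problem.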

problem 2: if you move your feed, telling everyone is hard
with a Web site, it's easy to put up the digital equivalent of a "We've moved!" sign. i wasn't able to locate a universal method for doing this for an RSS feed (read this discussion on feed redirection to see just how nuts this whole thing gets).
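
the closest thing to a "We've moved!" sign at the protocol level is the HTTP status code: 301 means moved permanently (with the new address in the Location header) and 410 means gone for good. the sketch below shows how a feed reader might interpret those codes. it's an assumption-laden illustration, not a description of how any particular aggregator behaves; `classify_feed_response` and its labels are made up, and it only helps when the publisher actually sets up the redirect.

```python
def classify_feed_response(status, location=None):
    """guess what happened to a feed from an HTTP status code.

    returns a (verdict, new_url) pair; new_url is None unless the
    server supplied a Location header with a redirect.
    """
    if status == 200:
        return ("ok", None)
    if status == 301 and location:
        # permanent move: a reader could update the stored subscription URL
        return ("moved", location)
    if status in (302, 307) and location:
        # temporary redirect: follow it, but keep the original URL on file
        return ("temporarily elsewhere", location)
    if status == 410:
        # the publisher explicitly says the feed is gone
        return ("dead", None)
    # 404 and everything else: could be a moved file, a typo, or an outage
    return ("unknown", None)


print(classify_feed_response(301, "http://example.com/new/index.xml"))
print(classify_feed_response(404))
```

a plain 404 lands in "unknown," which is exactly the situation a moved file with no redirect creates.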

Posted by docrpm on 02.04.05 at 11:38 PM | Comments (1) | TrackBack (0)

November 1, 2004

the pain of cleaning computer slates

file under: technology

my new-ish computer (dual processor 1GHz Mac G4, 1.2GB RAM, two internal HDs, external HD, peripherals out the kazoo, yadda yadda yadda) started getting wobbly a few months after i brought it home. after an agonizing search for obvious hardware or software boojums, i finally concluded i had reached the point of last resort: complete system reinstall.

these words are enough to strike terror into most people who rely on magical boxes for their livelihood. i procrastinated for days, often sucking my thumb in the corner, rocking gently back and forth, before i summoned the courage to do it.

here, i recount the process of what was actually involved, if only so that i can remember everything i did if i ever have to do it again <shiver>.

...

computer manufacturers will often tout the simplicity of their respective systems, how easy they are to install and maintain, blah blah blah. as my friend john would say, bollocks. reinstalling any of these systems is a monumental pain in the proverbial behind.

i have my system set up to do some things that your average user would never do, which makes the reinstall that much more protracted and byzantine. keep that in mind IF you decide to wade through the laundry list below...

TASK 1: back to hard disk basics
given that i didn't know what software was corrupt or how its dirty little fingers were fouling my system, i decided to completely wipe my primary hard disk to start with a clean slate.

  1. spend about three hours backing up all crucial information to external hard disk, including data for all user accounts, system-level application support, web server data, and MySQL tables
  2. double-check to make sure you got everything
  3. shut down and reboot using upgrade install disks
  4. forget to eject external Firewire hard disk
  5. remember that disk can be ejected from install-disk disk utility software, and eject it
  6. encounter problem with installer disk, which says it can't install over an old version of the OS
  7. make the brilliant decision to wipe the disk so that it can install on clean partition
  8. realize while disk is being erased that the boot disk is an upgrade installer that will NOT install on an empty partition
  9. panic
  10. call ryan
  11. realize that i had another non-upgrade installer disk
  12. figure out how to manually eject a CD on reboot, which then interrupts boot cycle and allows me to insert the proper install disc
  13. install disc 2 is not recognized
  14. panic
  15. re-insert install disc, at which point it is recognized
  16. customize installation to bypass installing unnecessary fonts and language options
  17. begin installation
  18. restart halfway through at installer's request
  19. complete installation with annoying registration wizard
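
step 2 ("double-check to make sure you got everything") is the one worth automating before wiping anything. here's a minimal sketch of one way to do it: walk the source folder and report any file that is missing from the backup or has a different size. the function name and the demo file names are hypothetical, and comparing sizes is only a cheap sanity check; it won't catch a file that was corrupted without changing length.

```python
import os
import tempfile


def missing_from_backup(source_dir, backup_dir):
    """list files under source_dir that are absent from backup_dir
    or present with a different size (paths relative to source_dir)."""
    problems = []
    for folder, _subdirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(folder, name)
            rel = os.path.relpath(src, source_dir)
            dst = os.path.join(backup_dir, rel)
            if not os.path.isfile(dst):
                problems.append(rel)
            elif os.path.getsize(src) != os.path.getsize(dst):
                problems.append(rel)
    return sorted(problems)


# tiny demo with throwaway folders: one file was "forgotten"
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name in ("mail.mbox", "site.sql"):
        with open(os.path.join(src, name), "w") as f:
            f.write("data")
    with open(os.path.join(dst, "mail.mbox"), "w") as f:
        f.write("data")
    print(missing_from_backup(src, dst))  # ['site.sql']
```

an empty list means every file in the source has a same-sized twin in the backup, which is the moment it's safe to move on to step 3.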

TASK 2: get up to date
apple releases numerous system patches and updates for every major rev of their OS. every time you reinstall, your install disks are no doubt out of date, which requires reinstalling all system updates...

  1. install base system updates (package updater + a few odds and ends)
  2. restart
  3. run update tool again to install more system updates that couldn't be installed until the OS patches were up to date (chicken and egg problem)
  4. restart

TASK 3: reinstall software
ok...now the operating system has been installed, and you've got a nice clean system with no user accounts, applications, or anything useful. you've got to reinstall all of your software.

  1. using a combination of install disks and about 20 different saved installers (backed up on external disk), reinstall all the stuff that you can
  2. spend at least an hour trying to dig up registration and serial numbers for all of the apps you just installed; many of these will be located in old email messages since you purchased things online...
  3. Reconfigure email to look up old email that has registration information
    • Copy email preferences to pull all POP server information for all email accounts
    • forget the password for most accounts – hey, why is it asking for this?
    • remember that all Apple passwords are stored in a keychain
    • copy keychain back to user account
    • verify email now works again....it does.
  4. enter all of your serial numbers so you can use what you just installed
  5. don't forget to install the developer tools, which you will need later to install super tech weenie stuff

TASK 4: reconstitute user accounts
there were three user accounts on my old system, at least one of which was no longer valid. gotta get everything back to normal here, too.

  1. copy core data back to my account, assiduously avoiding all those damn preference files that might be corrupt
  2. copy a few preference files that are necessary (e.g., terminal)
  3. copy necessary application support materials (iCal calendars, address books, cookies, bookmarks for Safari, etc)
  4. create new user account for elaine, and copy her data wholesale (since her account was not corrupt)

TASK 5: the devil is in the details
there's all sorts of stuff under the hood that you tend to forget about. you set it up once and you never look at it again. you will need to remember when you reinstall...

  1. set up desktop preferences, general settings, and all the other stuff i didn't copy over because i was paranoid that my system preferences were corrupt
  2. use NetInfo manager to enable root access; create root password
  3. as i use applications, copy over old preference files where necessary to restore things to a usable state
  4. turn on the Web server by enabling web sharing
  5. do about 27 other little things that i'm surely forgetting

TASK 6: don't forget about your peripherals
did you forget about your printer? your scanner? of course you did.

  1. reinstall printer drivers
  2. reinstall scanner drivers
  3. configure printer as default
  4. plug in scanner and discover that it's not being recognized after install of drivers
  5. restart
  6. scanner now recognized, and peripherals seem up to date...

TASK 7: oh yeah, super tech weenie odds and ends
this is the stuff that i always dread, because it usually involves getting down to the command line and reinstalling/reconfiguring a bunch of things that require lots of tweaks to get just right. most of this is stuff that would only be used if you needed to run a database-driven Web site on your machine with some kind of middle-tier scripting language (in my case, PHP).

  1. install MySQL 4.1.7
    • install base package
    • install startup item modifications
    • re-initialize grant tables, permissions, etc.
    • migrate backed up MySQL data
  2. install fink
  3. install libraries for PHP 5.0.2 (the latest and greatest)
    • libjpeg (with fink)
    • libpng (with fink)
    • libtiff (with fink)
    • zlib (by hand)
  4. install PHP
    • create configure script
    • twiddle knobs on configuration parameters so that make recognizes locations of libjpeg and libpng
    • make and encounter numerous mysterious errors related to libxml
    • perform Google search
    • upgrade to libxml2 (with fink)
    • make and make install for PHP
    • copy php.ini to /usr/local/lib
    • modify httpd.conf
    • restart apache server to verify integrity of PHP install

***

boring boring boring. like i said, this entry was mostly for my benefit. i can never remember to do all of this stuff, and so each time i reinstall (and it's happened at least twice in the last 2 years due to system upgrades), i go through the same discovery process again. normally i like discovery, but in the case of OS installs, i prefer a complete lack of mystery.

Posted by docrpm on 11.01.04 at 10:26 AM | Comments (2) | TrackBack (0)

October 26, 2004

brain surgery made easy!

file under: technology , thoughts about things

"Yes! You too can be a brain surgeon, with the new Brain-O-Rama surgeon's helper, a revolutionary new tool from the makers of the incredible Gung-Ho knife!! For just $49.45, you get the Brain-O-Rama scalpel, a rubberized dummy to learn your way around the skull, and complete instructions with helpful anatomical diagrams. You'll be taking care of tumors in 30-days or less, or your money back!!!!!"

it seems like i'm being ridiculous. i am. and so are half the people trying to sell the latest [insert noun here] made easy products or books or tools or 12-day-tutorial-magic-or-your-money-back courses.

just because i know where your prefrontal cortex is, or because i've heard of broca's area, you wouldn't want me cutting into your brain with the best scalpel in the world. it wouldn't make any difference, even if i had read the Dummies book and had seen "Extreme Autopsies" on FOX last week.

and yet people keep talking about making hard things easy, and others keep falling for it. books keep selling that demystify the mystical and show how, gosh, well, it turns out that brain surgery is easy after all, and we were just foolin' ya so we could keep the money for ourselves (ha!).

i could make jokes all day long, but i believe this kind of behavior, and the thinking behind it, has consequences. it devalues the effort required to create things of value or utility, or to provide important services. in turn, it reduces the perceived value of the fruits of these labors. it cheapens the world and destroys our appreciation of people and the beauty they often create.

...

i saw a Web site this morning advertising a software product with the tagline, "Web application development made easy" (company and product name withheld, since i'm sure it's a fine product made by nice people). the use of the word "easy" implies that anyone could do it, even my Grandma. if they had used the term "easier," this would have implied that it might actually be hard in the first place, and their tool was here to help, by gum.

even though it was probably just a marketing decision to position their product as they did, it struck me that people often think that things should be easy, could be easy. well, sometimes they are and can be, and we make them harder than we should. sometimes, however, they aren't (easy) and we can't (make them easy), regardless of how we might try.

some people really need to face the music – a lot of things in life are hard and require effort. there are no shortcuts. the people who do these hard things have usually arrived at their skill after taking a long, bumpy road full of toll booths that don't make change. architects, craftspeople, engineers, doctors, teachers – they're all professionals who worked to get where they are (maybe even struggled). society benefits from their skills, and they in turn reap the rewards. they shouldn't give it away for free, because it's worth something.

on the flip side, there shouldn't be an expectation that anybody can pick up a book and suddenly wield the equivalent of a scalpel – it insults the craftspeople or engineers or doctors who do it for a living, and puts the scalpel–wielder in a pretty awkward position.

no one would claim, of course, that "Surgery for Dummies" would ever be a best-seller, and yet the thinking seems to be different when it comes to the digital world. somehow, because it's not tangible or because it's new or because your kids seem pretty good at it, it's something that anyone could just pick up and learn and Presto!, instant Web designer.

i keep working for clients who are under the mistaken impression that building Web sites is easy. while it's my job to disabuse them of this notion, to help them understand the bits and bytes, as it were, there are times when the process becomes frustrating. through it all, the "idea of ease" seems implicit in the hearts of many businesspeople — it's really quite straightforward and will just sort of "work out" in the end. 50-page web site in one week with two developers, one of whom is actually a technical writer in the marketing department? no problem!!!

it is a problem.

and yet companies do this, over and over and over. anyone in marketing who has ever surfed Google is suddenly an expert in online advertising strategy. ever heard of Dreamweaver? hellooooo, Web developer! ever cropped a picture in Photoshop? good – you're our graphic designer. budgets are stretched, and people are forced to "step up," which is just a corporate euphemism for doing a job for which you aren't qualified or trained.

i'm exaggerating slightly, but the scenario i've painted above isn't far from the truth in much of corporate America. people seem to think the Web is different, that it's easy, that no rules apply. wrong – Web design and development are crafts and skills like any other.

the problem is driving away much of the talent from the Web, maybe in the same way that the craftsmen of old were driven away by mass production of (lower quality) goods. based on discussions with friends in the business, the business-view of Web design and development is gradually crushing people under its profit-driven wheels. many people got into the business because they felt the excitement and the potential, because they loved designing and creating new things they believed were useful or cool or interesting. some people were in it for the money, too, but that doesn't negate other motives.

these days it seems that building web sites, in most cases, has very little to do with creativity. it has everything to do with cold, hard business reality, and the incomprehensible short-sightedness that often goes with it.

fine — it's a job. get over it, you say.

you're right, of course. it is just a job. as my friend Gene says, we're not saving lives here.

what we are doing, in my opinion, by falling for the "ideas of ease" described above, is thoroughly commoditizing the process of Web design and development, along with a lot of other things. people are squandering much of the Web's potential and reducing its ultimate value, instead aiming for what's perceived as good enough (the 40% solution, in most cases).

good design (in all of its forms) will hopefully never go away, as long as there are people passionate about practicing it. the stage on which good design plays, however, seems to be getting much, much smaller, on the Web and elsewhere.

Posted by docrpm on 10.26.04 at 7:57 AM | Comments (0) | TrackBack (0)

October 13, 2004

look ma, no tables!

file under: technology

exclamations of the general form, "look ma, no [insert noun here]" are invariably followed by disasters of one variety or another (e.g., broken limbs, scraped knees, poked out eyes, hindenburg-style vapor cloud explosions). they indicate a certain hubris on the part of the utterer, and mother nature is not one to let these sorts of things slip by unchecked.

...

yesterday, i said "look ma, no tables!" after i had built a nice, standards-compliant web page without the use of tables. today i experienced the concomitant disaster (although digital disasters are usually not as nasty as broken arms, at least not for the coder).

as my template was reviewed by those who needed to use it (read: the client), it became obvious that the tool with which it was going to be modified, Dreamweaver MX (read: lousy piece of @#$*), has a bad rendering engine (read: an old version of Opera) that fails to properly parse a lot of CSS. tricksy rendering engines - we hates them! we hates them all!!

and so, tail between legs and nice separation of structure and content out the window, i proceeded to soil my pretty page with tables to create the proper layout in DW MX so that it could be modified by the client's overworked, understaffed, laterally skilled web development collective.

my apologies to jeffrey zeldman. i didn't have the time (or software) required to debug my page in a "browser" that's not even a browser.

i can hardly wait for the day when i can say, "look ma, no browser!," but i know mother nature will be waiting for me...

Posted by docrpm on 10.13.04 at 5:58 PM | Comments (0) | TrackBack (0)

September 27, 2004

arcodology

file under: technology , thoughts about things

arcodology (n.): the black art of code examination and analysis, performed during software upgrades and/or web site refreshes. arcodologists sift through tangled code fragments, often (but not always) of unknown origin and authorship, in search of meaning, enlightenment, or any shred of code that can actually be re-used. See also frustration, laziness, and cruft.

...

i spend a fair amount of time writing code, and often have to re-write stuff that someone else has written. it keeps me awake at night, thinking about all the terrible code out there (including my own). it reminds me of one of my favorite computer geek quotes:

If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.
   – Weinberg's second law

the code you can't touch is the worst
here's the scenario – you're working on a web site redesign, and you've got a week until launch. you're only supposed to touch these pages (not those), and don't change any of the nav framework, ok?

ok. no problem. only a few pages to code? easy. until you look under the hood and see HTML riddled with more font tags than Microsoft FrontPage from 1996 <shudder>. and let's not even talk about using single-column, single-row tables to do god-knows-what.

c'mon, people. it's 2004 (almost 2005). let's at least get rid of the font tags. please.

or is stuff you can change worse?
the code you can change might even be worse, because if you're an anal retentive code snob like me (i can see the comments already...), you just have to change it so you can sleep. time never permits, of course, so you struggle through the night, tossing and turning, thinking about those crufty CSS files still sitting around that you just didn't have time to fix <shudder, again>.

so what about this site, mr. code weenie?
the HTML for this site sucks. so does the CSS. i should know – i wrote it. it's not standards-compliant, it doesn't validate, the CSS is inelegant, and a lot of it is just a plain HTML–table–hack job. i want to rewrite it, now that i've read Zeldman's standards book. he has inspired me to make the time to do it.

it takes time. it takes effort. it's worth it.

Posted by docrpm on 09.27.04 at 5:51 PM | Comments (1) | TrackBack (0)

being the hydrant for technology's dog

file under: technology

some days you're the dog, some days you're the hydrant.

this piece of wisdom was passed on to me some time ago, and i've found it a useful mantra. it helps remind me about life's little ups and downs.

for the past few days, it's been technology that's the dog, and i've been the hydrant. so i'm just gonna vent the old spleen a bit, and move on to smaller and better things.

...

angst part I: the woes of software upgrades, aka "it used to work"
in my job as web [insert vague, new economy adjective here], i use a lot of software. sometimes it uses me, too, but we won't go into that. one tool that i use all the time is BBEdit, a great piece of text editing software if ever there was...

now, bbedit upgraded recently. they added some really essential, neato stuff (like a document drawer to mimic tabbed browsing), so i plunked down my upgrade dollars and bought the new version (my second paid upgrade this year, i might add). with a flutter in my heart and bumble bees in my fingers, i installed and started bbediting, and boy was i happy.

until i realized it doesn't quite work the way it used to. in fact, it's sort of broken. ok, let's call it the way it is – at least one major feature appears almost completely broken, downgrading me from a sledgehammer to a spoon.

i bought it last week, and there's already an update (read: patch). i suspect there will be several others. fortunately, these will be free for a while until we get beyond the we-rushed-to-release-a-product-and-didn't-QA-enough stage.

it just makes me mad. i know software development is hard, but don't break stuff that already worked...

ADDENDUM: i submitted a bug report to Bare Bones Software, and not only did they respond promptly, but they fixed the bug, came out with a new maintenance release (8.0.2), and then notified me. now that's what i call service!!

angst part II - the saga of the sick mac
i upgraded my mac recently. it's pretty swell, except for when it crashes, or when all my Apple applications stop working for no apparent reason.

the culprit? i have no idea about the crashes. i'll take that as a total mystery and just live with it (for now). as far as the apps, i have yet to discover the true problem, but i know what fixes it – dumping my font cache. now there's a logical connection if ever there was one: applications won't open? clean your font cache!! grrr.

deal with it, or....??
that's the only resolution i can see: deal with it.

i work with computers all day, every day, and can deal, for the most part (present rant = part of dealing with it, thank you). but what about people who don't? it's no wonder my mom won't get a computer – she's terrified of doing something wrong. look at your computer sideways and over the edge to meltdown it goes.

computers have come an amazing distance in the last 20 years...unbelievable, actually. one might think that over time, things will get more stable, more solid, and nothing will go wrong. one might be right, but i suspect one is more likely wrong.

computers (and home-computer environments, in general) seem to be systems with emergent properties. it's impossible to imagine (and test for) all possible permutations of computer hardware and software (built with very little quality control or standards). let's not even talk about all the possibilities for user behavior. as a result, crazy things happen that no one can predict, and this often means that things crash, applications stop working, bad (and often expensive) sh*t happens.

programmers at the application layer can try to build fault-tolerant apps that can handle exceptions and errors, but they can't control anything that happens with the hardware. similarly, some guy at Intel designing a chip can only do so much when it comes to stopping programmers from writing bad code (another post coming about this soon).

my personal prediction is that computers will remain fragile as they get more and more complex, and as we ask them to be more powerful, more central, multifaceted tools. it's pretty hard to imagine don norman's invisible computer, or the time when we come to think of computers like we do the telephone.

don't get me wrong – computers are transformative tools, and our lives are enriched because of them. but they still really piss me off sometimes.

i think i'll go read a book.

Posted by docrpm on 09.27.04 at 5:30 PM | Comments (0) | TrackBack (0)

June 10, 2004

browser compatibility - theory and practice

file under: technology

the following entry is a (sanitized and expanded) version of an email i recently sent to a client explaining some of the issues surrounding browser compatibility and web development. it's amazing these issues persist after years of slowly grinding towards a world of web standards...someday, i hope these ideas will seem quaint: "oh, how cute! they used to have to worry about those things..."

...



<BEGIN EMAIL>

browser compatibility is a complex issue. it involves balancing the desired user experience with the realities of development timelines (often short), testing facilities (often unavailable), designer desires (usually extravagant), and the wild, wild west of web technology (where there are no rules or standards that anyone has to adhere to).

A definition of browser support
"Browser support" means that a user coming to the site will have the desired experience: the look and feel of the site will be as intended, and all elements of the site will be functional. (See note on expanding the definition of browser support)

It's important to limit the scope of supported browsers
Why do we limit the scope of browser support? Why not just support everyone and everything? The answer to this question is simple - it's difficult, time-consuming, expensive and occasionally impossible to support everything (given the definition above) (see note on draconian branch strategies). In some cases, it's not even possible (for a given browser benchmark) to achieve support for a combination of visual design and functionality. Browsers use technology to render visual and interaction design over the Web, and this technology acts as a constraint on that design, which produces a feedback loop, causing browsers to change design. In addition, in many cases, supporting extended browser compatibility dramatically increases the complexity and scope of a Web site (e.g., different code for different browsers and platforms, etc.).

User experience will vary depending on browser type
There are a few different types of experiences people will have when visiting [insert site here], depending on their browser.

  • Explicitly "allowed" browsers: The site should look, feel, and function as desired. The browsers we currently allow explicitly are:
    • Safari, Internet Explorer 5+, Netscape 6+
    • Firefox, Camino, or any browser based on the Gecko rendering engine
    • Any browser that supports the modern Document Object Model (DOM) [NOTE: in some cases, support for the modern DOM does not absolutely guarantee a perfect user experience. nonetheless, we elected to try to support all 'modern' browsers even though we don't/can't explicitly test for all of them]
  • Unsupported browsers: Users are directed to an upgrade page which requires them to upgrade their browser before entering the site. Users with unsupported browsers cannot gain access to the site (see note on draconian branch strategies). At present, we explicitly redirect users with the following browsers:
    • Internet Explorer: Versions 3 and 4 (Current version: 6)
    • Netscape Navigator: Versions 2, 3, and 4 (Current version: 7.1)
    • AOL: Versions 3, 4, and 5 (Current version: 9)
    • Opera: Versions 2, 3 and 4 (Current version: 7.5)
  • User agents not explicitly excluded or allowed: In many cases, it's difficult to know exactly what browser (or to be technically correct, user agent) is visiting the site (see note below: User agent identification). In these cases, the user agent is allowed to visit the site. They may or may not have an optimal user experience, depending on the type of agent. For example, Opera 7.5 on the Macintosh is allowed, but it has a bug in the Flash plug-in that causes the home page animation to appear incorrectly.
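
The three categories above can be sketched as a small branch function. This is only an illustration, not the code running on the site: the function name, the table names, and the is_gecko / has_modern_dom flags are hypothetical, and it assumes the browser name and major version have already been pulled out of the user agent string (which is the hard part).

```python
# Hypothetical sketch of the three-way branch described above.
# Assumes the browser name and major version were already extracted
# from the user agent string; that extraction is not shown here.

# Explicitly allowed browsers: name -> minimum major version (0 = any version)
ALLOWED_MIN_VERSION = {
    "safari": 0,
    "internet explorer": 5,
    "netscape": 6,
    "firefox": 0,
    "camino": 0,
}

# Explicitly redirected browsers: name -> major versions sent to the upgrade page
REDIRECTED_VERSIONS = {
    "internet explorer": {3, 4},
    "netscape": {2, 3, 4},
    "aol": {3, 4, 5},
    "opera": {2, 3, 4},
}


def classify(name, major, is_gecko=False, has_modern_dom=False):
    """Return 'redirect', 'allowed', or 'unknown' for one visitor."""
    name = name.lower()
    # Explicit exclusions win over everything else.
    if major in REDIRECTED_VERSIONS.get(name, set()):
        return "redirect"
    # Gecko-based browsers and anything with a modern DOM are allowed.
    if is_gecko or has_modern_dom:
        return "allowed"
    if name in ALLOWED_MIN_VERSION and major >= ALLOWED_MIN_VERSION[name]:
        return "allowed"
    # Not explicitly excluded or allowed: let the visitor through anyway.
    return "unknown"


if __name__ == "__main__":
    for visitor in [("Internet Explorer", 4), ("Internet Explorer", 6), ("Opera", 7)]:
        print(visitor, "->", classify(*visitor))
```

Checking the exclusions first is a deliberate choice in this sketch: a browser we know to be broken is redirected even if it claims modern DOM support.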

Defining an acceptable level of support
As described above, a certain fraction of users visiting the site will either be redirected to an upgrade page, or will have a sub-optimal user experience for one reason or another. The most difficult question to answer is, how many visitors fall into this category and what is acceptable??? The technical benchmark stated above was selected to minimize the number of users with unsupported browsers (< 1-2% of site visitors). It's really a business decision as to the order of magnitude of this number. As a side note, establishing a rigid metric (e.g., 1.2%) is dangerous, because statistics regarding the number of users with a given browser should be taken with a grain of salt (see note on user agent identification below).

Can we increase the level of browser compatibility and support?
Yes, but there are cost and development implications. This is ultimately a business decision related to the ROI associated with increasing the level of support. How much do you gain by consuming valuable development resources to support a small fraction of users? Does the importance of those users outweigh the cost?

Can we change the way we deal with unsupported browsers?
Yes. We could alter the way we handle unsupported browsers. For example, we could simply display a message that says "This site optimized for viewing in X, Y and Z browsers." While this may seem an attractive option, it could have unintended consequences (i.e., a page is broken so badly in a given browser that the user either doesn't see the message or just decides never to come back to this 'unprofessional' web site). We could also redirect to an upgrade page that then provides both links to upgrade and a link that says "Show me the site anyway", which then allows users into the site with an old or broken browser. In many ways, browser support is a "pick your poison" problem - most solutions have benefits and drawbacks.

A final note about testing
We tested [insert site here] as much as we could, given the twin constraints of time and equipment. Based on my experience, there is a theory and a practice associated with testing. In theory, one establishes a list of things to test, and then tests them. In practice, testing can be very costly, difficult, and time-consuming. In some cases it requires capital expenditures that may not be consistent with project budgets (e.g., to buy equipment, to build a QA lab to perform testing, or to outsource testing to another QA provider). The current [insert site here] was built very rapidly, with very little time (or budget) for testing, and with no universally accessible, dedicated testing equipment. We did not have the machines necessary to test against all possible combinations of supported environments (e.g., no Linux box, no Windows 2000 box). In fact, testing user environments is extremely difficult to do given the "combinatorial explosion" that occurs when trying to test every possible permutation of browser, platform and OS version, installed plug-ins, etc.
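
To give a rough feel for that combinatorial explosion, here is a minimal sketch that counts test environments. The lists are made-up examples rather than our actual test plan, and the count is an upper bound, since some combinations (Safari on Windows, say) don't exist.

```python
from itertools import product

# Illustrative lists only; a real test plan would differ.
browsers = ["Internet Explorer", "Netscape", "Safari", "Firefox", "Opera"]
platforms = ["Windows 98", "Windows 2000", "Windows XP", "Mac OS 9", "Mac OS X", "Linux"]
plugins = ["no Flash", "Flash 5", "Flash 6", "Flash 7"]

# Every (browser, platform, plug-in) triple is one environment to test.
matrix = list(product(browsers, platforms, plugins))
print("environments:", len(matrix))

# Each new dimension multiplies the total rather than adding to it.
resolutions = ["800x600", "1024x768", "1280x1024"]
bigger_matrix = list(product(browsers, platforms, plugins, resolutions))
print("with screen resolutions:", len(bigger_matrix))
```

Even with these short lists the count runs into the hundreds, before browser versions are considered at all.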

***

I realize this is a long description of what seems like a simple issue. Unfortunately, the Web technology environment is chaotic, and the issue of browser compatibility lives at the heart of this chaos.

best regards,
rPm

ADDITIONAL NOTES

user agent identification
It is common to write code that examines the "user agent string" for the browser and then tries to determine what the user agent is based on this identifier. Because these identifiers are unregulated, many user agents masquerade as something else (e.g., WebTV "looks like" Netscape Navigator to many browser detection scripts). It requires a careful analysis of the user agent string to make an accurate determination. there are literally thousands of user agent strings for hundreds of user agents on the Web. for a sample of what user agent strings look like, look at this list of user agent strings for mozilla, which shows only the user agent strings for "Mozilla" browsers (a terribly misleading umbrella term that subsumes Mozilla, Firebird, Camino, and Firefox, all of which use a similar rendering engine developed as an open-source project by Netscape).

as you can see, the list is ridiculously long. there would be no way to test for all of these combinations, so we wind up looking at these strings for commonalities. in some cases, when looking for a commonality, one can overlook the part of the string that differentiates two user agents. for example, many browser detection scripts misidentify Firefox as Netscape Navigator 6 (which actually identifies itself as Mozilla/5.0 - netscape skipped version 5 browsers, for some reason, yet included this in their user agent strings). because Navigator 6 was a browser with severe flaws in its rendering engine, this can often lead to problems...
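
here's a minimal sketch of how that misidentification happens, and one way around it. the two user agent strings are illustrative examples in the usual Gecko format, not values pulled from our logs, and both function names are made up for this example.

```python
# illustrative user agent strings in the usual Gecko format (assumed examples)
FIREFOX_UA = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040206 Firefox/0.8"
NETSCAPE6_UA = "Mozilla/5.0 (Windows; U; Win98; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1"


def naive_detect(ua):
    # keys only on what the strings have in common,
    # so every Gecko browser ends up labeled as Netscape 6
    if ua.startswith("Mozilla/5.0") and "Gecko/" in ua:
        return "Netscape 6"
    return "unknown"


def careful_detect(ua):
    # look for the differentiating product token first,
    # and fall back to the shared Gecko token only at the end
    specific_tokens = (
        ("Firefox/", "Firefox"),
        ("Camino/", "Camino"),
        ("Netscape6/", "Netscape 6"),
        ("Netscape/", "Netscape 7+"),
    )
    for token, label in specific_tokens:
        if token in ua:
            return label
    if "Gecko/" in ua:
        return "other Gecko browser"
    return "unknown"


if __name__ == "__main__":
    for ua in (FIREFOX_UA, NETSCAPE6_UA):
        print(naive_detect(ua), "|", careful_detect(ua))
```

the order of the checks is the whole point: test for the most specific token first, and treat the shared prefix as a last resort.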

<END EMAIL>

EXTENDED NOTES

Expanding the scope of browser support
it is possible to expand the definition of support to include people who can use a site for its intended purpose, while not necessarily seeing it exactly as it was intended. for example, a bug in Flash transparency in OmniWeb on the mac means that the home page looks funny. other than that, the site is perfectly usable and looks as intended. in a worst-case scenario (e.g., Netscape 2 running on Windows 95), the site might look atrocious, but if the navigation is all functional, and the content is readable, this could be considered semantic support.

many companies are loath to adopt this approach, because it means that people could see a site that's not an adequate representation of the company brand. it is for this reason that it is sometimes more desirable to adopt what might be perceived as draconian branch strategies (see next note).

draconian branch strategies
some might say you shouldn't be so draconian, forcing people to an 'upgrade your browser' page. some might further argue that every site on the Web should support every user agent, with support defined on a sliding scale. for example, if you've got a really old browser, you can still see the content, and ideally you can access all key information, but you may not be able to experience the site as desired by either its parent company or designers.

this is fine - in principle. however, the situation changes when your client's CEO gets a call from technical support saying, "some high-powered VC using the original beta release of joe's web browser couldn't see the home page, and now he's going to poison our funding well." when said CEO or CTO hears something like this, their immediate reaction is to "fix it." in a heartbeat, the whole sliding-scale approach to browser support goes out the window. this is just the way it is. in addition, when you're being told the site has to launch next week, you may not have the time or ability to even guarantee semantic compatibility across the spectrum of browsers and user agents.

another, more subversive, reason lies behind adopting a more draconian branch strategy - getting people to upgrade their browser increases the rate of convergence towards a standards-compliant world. it may seem like an attitude insensitive to those people with older browsers, but:

  • browsers are FREE
  • the only way to get to a world where standards mean anything is to let go of the chaos of the past
  • people can hold on to the past and use old technologies, but they have to take responsibility for making this decision. an automotive analogy might help. consider someone who wants to drive a Model T Ford. this is a conscious decision to use an old technology in a world that has clearly moved on. said Model T driver wouldn't be able to drive on a modern freeway because their car is too slow. do they have a right to ask for a reduction in the speed limit? of course not. no one would consider this reasonable because there are laws set to govern the use of the roads, and the laws (supposedly) favor operational norms in society. there are no such laws on the Internet; there are only standards (which are really unenforceable guidelines).

what about accessibility? this is another large issue that would consume a whole entry in and of itself...of course accessibility is important. a site should (in principle) be capable of being rendered in a text-only browser. the constraints that come into play (again) are the realities of corporate Web development in a world where there is limited time and a limited budget. it really sucks that some people get the short end of the reality stick; i wish it wasn't this way, but sometimes it is when it comes to building businesses on the Web.

but but but
there are a thousand "but" arguments that you could throw at everything i've said. my arguments aren't all airtight; these are largely my opinions after having spent the last 6 years building Web sites. in fact, i don't believe there are any airtight arguments about browser compatibility. ultimately, browser compatibility comes down to the reality of what it takes to build a Web site. it's all theory and practice:

in theory, there's no difference between theory and practice. in practice, there is.
  - Yogi Berra

that's what it's like to build a Web site. everyone wants the theory, but all you've got, at the end of the web developer day, is practice.

Posted by docrpm on 06.10.04 at 10:43 AM | Comments (1) | TrackBack (0)

March 25, 2004

robbery and the net as amplifier

file under: technology

i've been following a thread about a fairly well-known blogger who was recently robbed. she has been chronicling some of her feelings (see why my robbery matters), and also pursuing the perpetrators, online.

it makes me wonder about a few things, and reminds me of when i got robbed...

...

net as amplifier
social networks can be used as a means to willfully (or unintentionally) destroy or enhance reputation. i believe this to be true independent of the internet; the net just makes it easier and faster for those with the connections.

the aforementioned blogger posted pictures of her robbers on her blog. i am certain that she did this with the best of intentions, but there could be unintended consequences to her action, and the social network amplifies them (both the good and the bad). one could imagine scenarios where one or more of the men in those photos were drunk, irresponsible, yet innocent, bystanders. i am not saying this is the case, but rather pointing it out as a possibility in other situations.

in devious hands, this process of net amplification could be used to smear and distort people's actions and events. in other hands, it could accelerate the wheels of justice to good end.

the net is no different than other media channels in this sense - it is a neutral vehicle that is ambivalent about the quality or accuracy of the content it carries.

i got robbed once
fwiw, i experienced something similar...i had my entire bank account drained in a matter of days after a debit card theft.

through serendipity and deduction, and with the help of the police and others, we were able to apprehend the "thug." our perp turned out to be none other than my next-door-neighbor's girlfriend. she was a blond, white, 26-year-old woman, who also happened to be a closet sociopath / bulimic / cocaine addict who had been robbing friends and family blind and pilfering identities wherever she went; she finally got caught, but left quite a wake of wreckage behind her.

after the sting in which she was caught, i met with a us attorney to provide a statement. this woman stole a total of roughly $70,000 and caused pain to dozens of people spread across several states. how long could she go to jail? the us attorney consulted a table (plot depth of crime vs. type of crime vs. how many offenses) -->2-3 years, maximum (it's all a formula).

i went to her sentencing hearing. justice was served, such as it was. she cried at her hearing, where she was sentenced to 1000 hours of community service, a year in a halfway house with an electronic tracking bracelet, and drug rehab programs. she also had to pay back all she stole (mostly to financial institutions who had already reimbursed their clients).

when i walked out of the hall of justice, i felt no satisfaction. i felt hollow and sad for the desperate hell of that woman's life, one that i couldn't really understand. she committed a crime and needed to face consequences, but prison wouldn't have helped her. who knows what happened to her after a year spent wearing that bracelet.

i drew my own mixed conclusions from those experiences, and they're harder to articulate than i would have imagined. after all was said and done, i'd say forgiveness provided me the most lasting and meaningful resolution.

Posted by docrpm on 03.25.04 at 12:05 AM | Comments (0) | TrackBack (0)

March 22, 2004

ready-fire-aim in social network services

file under: technology

with more social networking tools than you can shake a mouse at (see Judith Meskill's list for proof), there are bound to be some real losers. so far, i'm not sure if there are any winners, but that remains to be seen...the night is still young.

one of the main problems here is the "build it and they will come" mentality (as opposed to finding out what people need first, then building something to meet that need). danah boyd articulates the issue well.

in addition to her other arguments, she asks, what problem do we have that social network [tools] give us insight into? insight is important, and i think the failures of the current crop of applications do give us insight. among other things, they make it abundantly clear how difficult it is to model human relationships with things like ontologies or controlled vocabularies (Clay Shirky has made this point recently). this wouldn't come as a surprise to many people.

in addition to insight, the question of value is central, in my opinion...if social networking tools solve a problem that's meaningful to people, then they deliver value.

if one agrees that there is a horse-cart inversion going on here, there's another question that follows: why are so many intelligent people building things with questionable (or unknown) value, flawed logic, or just plain silly assumptions?

money is the first and most obvious answer. a lot of the visible activity is probably just bandwagon jumping because social networking software is the coolest thing since internet incubators or selling pet food on the web.

another explanation is more satisfying to me...people have an intuitive feeling that social networking applications are new and exciting and can offer something valuable beyond just making friends or getting dates.

social network applications are exciting candidates for systems with emergent properties. the simplest way to discover these properties is to build first, watch things emerge, and then refine and rebuild once you have a better idea of how these things are really useful. granted, this may not be the best way to do things, but in the absence of other approaches, it's the occam's razor solution.

in essence, developers are building sociological laboratories on the net, turning people loose, and watching the results. this is implied in eric schmidt's statement about google and social networking apps: "Social networks will get better as we figure out what problem they're intended to solve."

ok, he probably means social networking tools, but even so...there is an assumption being made here that social networking tools are necessary, that they are intended to solve any problem.

time will tell. after all, human beings have been doing reasonably well without social software for thousands of years (modulo things like war, of course). so relationship software exposes the thousands of connections each of us shares with other people...it makes us see how we're hyperconnected.

does it follow that social network software makes us better? or does it just make some things a little easier? is it evolution or revolution?

the next time someone starts frothing at the mouth about friendster or orkut or whatever, ask them that question...i'd be interested in their response.

Posted by docrpm on 03.22.04 at 3:05 PM | Comments (0) | TrackBack (0)

March 15, 2004

anti-social software ... buy now!

file under: technology , thoughts about things

ok...this cartoon is really, really funny (if you're a geek or someone who's a little fed up with all that email from friendster or orkut or YASNS).

Posted by docrpm on 03.15.04 at 12:06 PM | Comments (0) | TrackBack (0)

March 12, 2004

orkut's velvet rope

file under: technology , thoughts about things

social networking is all the rage, or at least it seems to be. i keep hearing about it everywhere i turn (NPR, friends, blogs, san francisco magazine, the checker at the grocery store). i have my doubts about most of these players, but there is a recent entrant that pushes a different set of buttons: orkut

...

orkut was one service with which i was not familiar (negative web geek points). it's a "google-affiliated" social networking application that, from what i read, is a combination of friendster and tribe and ryze and [insert other social networking app here]. there was recently a party to celebrate the launch of orkut, and reading the descriptions made me slightly queasy.

bubbly anyone?
the social networking craze definitely seems a bit "bubblish". whenever i hear about launch parties with people slapping each other on the back for being www-celebrities, and launching a service with no viable business model or clear value proposition, alarm bells go off and i feel like puking over the side of the boat. commentators and reporters have been noting the hype factor for some time now, and don't take this quite as seriously as insiders do...

still, one gets the sense that VCs are swimming these waters like sharks, and people are thinking they can build the first money-printing machines of the 21st century web. i have no doubt that money will be made, but only by relevant applications that offer something beyond vague, and possibly undesirable, promises of expanding your network of friends.

the net's velvet rope
the thing that troubles me about orkut is the thing that some are claiming is so cool: it's invitation only. for me, this feels like the net equivalent of the velvet rope at some too-cool-for-school metropolitan club. no one likes those either, except for the people who get behind the rope and feel more socially relevant as a result. eventually these ropes become more fit for the gallows, in my opinion - exclude and die.

but it's about the community
forgive me if i'm being impolitic, but...bollocks.

the orkut web site proclaims the following:

"We'd love to immediately include everyone who wants to participate; however, we're also trying to ensure that orkut remains a close-knit community. Over the next few weeks, hopefully, the network will grow to a point where everyone who wants to join has the opportunity to do so."

one source indicates that orkut has around 130,000 users at present. in what sense is this a close-knit community? studies have shown that the social channel capacity of humans is about 150 people (see Malcolm Gladwell's Tipping Point, p.179 and references cited therein)...that's to say that 150 people is about the maximum number of individuals with whom we can have genuinely social relationships.

a service like orkut was destined to grow rapidly. it follows from basic six-degrees-of-separation arguments (see The Small World Experiment and related links for more information, if you're unfamiliar with the concept). the people who built the system had probably watched friendster, and knew what they were up against, especially with the name of google floating in the background.

my point is that it's ludicrous to say that a web service (a la orkut) is invitation-only in order to maintain a close-knit community. there is no such thing in a social networking application like orkut, one that has the power of google behind it and that grows to hundreds of thousands of users in the space of a few months. many communities will exist within the orkut meta-community, defined by the relationships between its members, but there is no orkut community per se - a city, maybe, but no community with close social relationships amongst all of its members.
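
to put rough numbers on that, here's a back-of-the-envelope sketch. it uses only the two figures quoted above (about 130,000 users, about 150 relationships per person); everything else is plain arithmetic, and the variable names are mine.

```python
# back-of-the-envelope check, using only the two numbers quoted above
members = 130_000        # reported orkut user count
channel_capacity = 150   # approximate limit on genuinely social relationships

# share of the other members any one person could actually "know"
share_known = channel_capacity / (members - 1)

# all possible pairs of members vs. the most pairs that could be real relationships
possible_pairs = members * (members - 1) // 2
max_real_pairs = members * channel_capacity // 2

print(f"share of members one person can know: {share_known:.3%}")
print(f"possible pairs of members: {possible_pairs:,}")
print(f"upper bound on real relationships: {max_real_pairs:,}")
```

in other words, even a maximally social member could know only a bit over a tenth of a percent of the network, which is hard to square with "close-knit."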

exclusion, hype generation, and business obfuscation
the invitation-only requirement on orkut simultaneously creates a sense of exclusion (for those not invited) and exclusivity (for those who are). i can come up with a few possible reasons for the invitation only policy:

  • limit the service to the technorati elite: this doesn't make much sense...even with an invitation only policy, exponential growth through densely connected social networks would guarantee that many members would not be a part of this ill-defined group
  • limit the service to those who really care about networking, as opposed to tourists: this is equally unlikely. in fact, by making it exclusive, they have probably generated more interest among people likely to be tourists. i can just imagine the party conversations..."are you on orkut?" [pause] "orkut? what's that?" [pause] "oh...you don't know about it?" [person A drops person B's hipster quotient, person B wants to get on orkut to regain status]
  • generate buzz and differentiation from the anyone-can-join melee (and concomitant growth issues) happening at friendster: this seems much more likely...after all, everyone is on friendster, right? who would want to join such chaos? [you wouldn't, probably, but perhaps for another set of reasons beyond the scope of this entry.] orkut is a little late to the party, and they want to generate interest. it just seems to me that this is an insulting way to do it, one that sends the wrong message to people interested in being a part of new online communities.
  • manage the (pedestrian) technology problem of scaling: this is possible, but depending on how deep the google affiliation goes, it's hard to imagine. google has to have an infrastructure that could easily support something like this without a great burden on their systems. maybe i'm wrong...

the real reason for the invitation only policy is unimportant. what matters in my mind is that the stated reason is so blatantly false...this immediately leads to the conclusion that there is some other reason that's not being stated, and that it probably has something to do with manipulative marketing, generation of (possibly undeserved) hype, and the obfuscation of true motives. businesses are by no means required to state all of their objectives to the marketplace, but at least come up with more convincing lies (or don't say anything at all).

the power and peril of open doors
in my partially informed opinion, social networking applications should allow anyone to join, but should transparently protect users from scale issues and other problems associated with mob dynamics (see Shirky's a group is its own worst enemy). if you want to facilitate more close-knit communities, then allow users of the system to create invitation-only groups. what they do in those groups, and what value they derive from them, is their business (provided that said groups don't violate other laws etc. etc.).

the problem with any open door policy is that sometimes the "wrong" people come through the doors. maybe they're not cool. maybe they don't have the interests that organizers were hoping for. maybe they're disruptive or crude or insulting or generally ill-mannered. but that's ok - sufficiently stable, open communities have ways of dealing with people who actively disrupt and act to the detriment of the community (e.g., charters, rules of conduct, etc.). they also welcome and benefit from diversity and openness; this is their strength.

closed-door groups deal with this problem in a more proactive way - they set up a barrier to keep others from getting inside the walls in the first place. their community is defined more narrowly. in some cases, for very small groups of individuals with a focused interest, this makes sense (e.g., ex-employees of the Acme Widget Design Company of San Francisco). in other cases, closed-door groups act more like country clubs - the requirements for entry have little to do with shared interests, and everything to do with things like economics or social status.

the (un)egalitarian web
one of the things i always liked about the internet was the sense of egalitarianism (perhaps it's illusory, but the spirit seems there). everyone had access to about the same content and interaction potential, with exceptions for places where public/private distinctions make clear sense. blogs and social networking applications have been expanding and enhancing these ideas in many ways, by increasing the number of genuine voices on the web and establishing communities with new and unusual contexts.

applications like orkut, or rather the exclusive policies associated with the application, are a step backwards and to the right. in my opinion, they establish a bad precedent, one that i suspect others may start to follow. get ready for the country-club web...

a side note about orkut's T&Cs
independent of my feelings about their entrance policy, orkut are also doing something that, in the words of one commentator, is unconscionable. take a look at orkut's terms and conditions before you sign up:

By submitting, posting or displaying any Materials on or through the orkut.com service, you automatically grant to us a worldwide, non-exclusive, sublicenseable, transferable, royalty-free, perpetual, irrevocable right to copy, distribute, create derivative works of, publicly perform and display such Materials.

in other words, whatever content you create or post on orkut, they can copy, rework, and hand off to others however they like, forever, as if they owned it. period. that just plain sucks.

Posted by docrpm on 03.12.04 at 11:22 AM

March 11, 2004

link reciprocity

file under: technology

hyperlinks are becoming a currency in the digital age, but one limited to those with the power to create and destroy them.

i went to a friend's web site recently (URL withheld to protect the "innocent"), and noticed that the link to my home page had dropped off his blogroll. what??? dropped me from the secondary navigation? had i slighted this friend in some unbeknownst way? was some kind of digital payback going on? granted, it was kind of petty of me to care in the first place, but hey, links matter if you want people to read your site. and why would i write on the web if i didn't also hope that people would read?

it then occurred to me that links have become a form of currency. i'm probably not the first to say this. in fact, i'm probably about the 10,000th. but links matter to people.

link reciprocity is a term i'll use to describe the you-link-to-me-i'll-link-to-you phenomenon. i'm sure someone else has thought of that one too [pause---google search---ok, yeah, here is another blog about the exact same damn thing].

that guy i just linked already thought about it and wrote a lot on this topic. i'm not gonna write any more. you get the point. ;-)

ps: do you think he'll link to my site because i just cited him?

pps: he mentions in his intro that this phenomenon is not new. it has been happening in academia for years. of course, i should have realized this, since it used to happen to me all the time when i was in that world.

Posted by docrpm on 03.11.04 at 7:12 PM | Comments (2) | TrackBack (0)

February 19, 2004

spamwars

file under: technology

a few weeks ago, i posted some of the interesting subject lines found in my spambox.

i thought about it a bit more, and wondered, why would spammers send mail with subjects that scream "DELETE ME I AM SPAM!!!"? it turns out there is a very good reason (actually, two reasons - one good one involving technology, and one messy one involving human curiosity and/or naivete). in the course of digging this info up, i spent a little while studying the weird world of spam.

for those interested in some of the hows and whys, read on for the tip of the proverbial spamberg.

...

SPAMOUFLAGE AND SPAMJECTS
so, let's get the easy question out of the way first - why do so many spam messages have weird-yet-compelling subject lines?

well, i can't shed any light on why they're compelling, but i can say what they're trying to do...they're trying to defeat best-of-breed, automated, anti-spam filters by disguising themselves as potentially interesting messages. those subject lines are "spamouflage." (i wish i could lay claim to this term - Wired beat me to it.)

this probably comes as no surprise. some of you might even have said, 'duh' when you read the explanation above.

the better question is, why do these subject lines confuse spam filters?

it turns out that a lot of the war on spam is fought in statistical trenches, as it were. specifically, many best-of-breed anti-spam filters today use what are called Bayesian statistical filters in order to spot spam and weed it out. i'll skip the statistics and more in-depth mathematics - you can read Paul Graham's excellent anti-spam treatise to get more info.

the idea is simple - use the language of spam to identify it. after all, spam is always selling something, and the list of things being sold is pretty short (viagra, porn, low-interest mortgage loans, Nigerian letter scams, etc.). as a result, the language used in spam should be fundamentally different from the language used in everyday mail you receive from friends or co-workers (unless your friends try to sell you things on a regular basis). with this idea in mind, email software that uses Bayesian filtering does something like this:

  1. Build up a dictionary of words contained in your email (spam and non-spam); next to each word, indicate how likely it is that the associated word is found in a message that you have called spam. (this is done during the 'training' period in which your email software is figuring out what your typical messages look like).

Once the dictionary has been built, try to identify spam automatically as follows:

  1. When a new message arrives, break it up into "words" (tokens, in computer geek lingo).
  2. Look up each token in your spam-dictionary, and find the likelihood that this word belongs to a spam message. if the word isn't in the dictionary, assign it an arbitrary, fixed spam probability (usually something around 40%, i.e., 4 times out of 10 this word would appear in spam).
  3. Use Bayes' theorem to compute the probability that the entire message is spam, based on an analysis of the most interesting words (i.e., those that are most likely to be in spam, and those that are not).
  4. For messages that cross some threshold of spam probability (e.g., 90% probability the given message is spam), mark them as spam; let everything else pass through.
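
for the code-inclined, the steps above can be sketched in a few lines of python. treat this as a rough illustration under stated assumptions, not what any real mail program ships: the function names, the tokenizer, the 0.01/0.99 clamping, and the cutoff of 15 "most interesting" words are all assumptions made for the example (the last two are loosely modeled on Graham's essay). only the 40% default for unknown words and the 90% threshold come from the description above.

```python
import re
from collections import Counter

UNKNOWN_PROB = 0.4      # fixed probability for words missing from the dictionary
SPAM_THRESHOLD = 0.9    # messages scoring above this are marked as spam
MAX_INTERESTING = 15    # assumption: how many "most interesting" words to keep


def tokenize(text):
    """Break a message into lowercase word-like tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


def train(spam_messages, ham_messages):
    """Build the dictionary: token -> probability that it shows up in spam."""
    spam_counts = Counter(t for m in spam_messages for t in set(tokenize(m)))
    ham_counts = Counter(t for m in ham_messages for t in set(tokenize(m)))
    n_spam = max(len(spam_messages), 1)
    n_ham = max(len(ham_messages), 1)
    dictionary = {}
    for token in set(spam_counts) | set(ham_counts):
        spam_freq = spam_counts[token] / n_spam
        ham_freq = ham_counts[token] / n_ham
        p = spam_freq / (spam_freq + ham_freq)
        # clamp so that no single word is ever treated as absolute proof
        dictionary[token] = min(0.99, max(0.01, p))
    return dictionary


def spam_probability(message, dictionary):
    """Combine per-word probabilities into one score for the whole message."""
    tokens = sorted(set(tokenize(message)))
    probs = [dictionary.get(t, UNKNOWN_PROB) for t in tokens]
    # "most interesting" = furthest from a neutral 0.5, in either direction
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    spam_score = 1.0
    ham_score = 1.0
    for p in probs[:MAX_INTERESTING]:
        spam_score *= p
        ham_score *= 1.0 - p
    return spam_score / (spam_score + ham_score)


def is_spam(message, dictionary):
    """Apply the threshold from step 4."""
    return spam_probability(message, dictionary) > SPAM_THRESHOLD
```

you'd call train() once with mail you've already sorted by hand, then run is_spam() on each new arrival. a real filter would also keep updating the dictionary as you correct its mistakes, which this sketch leaves out.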

simple, right?

ok, maybe not. but it does explain why you get all of these weird words in spam...actually, they're not weird per se - they're just uncommon. as a result, they are often not in the dictionary your software uses to compute the probability that a message is spam. this also goes for words that are hacks (like vi^gr^), which will often be missing from your spam dictionary. the software doesn't know what to do with these words and assigns them a middle-of-the-road spam probability. this increases the likelihood that your software will think the message is valid.
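
to see how much damage a handful of unknown words can do, here is a toy calculation. it assumes the filter multiplies word probabilities together in the usual naive-Bayes way and counts every word in a short subject line; the 0.9 value for the one "known" spammy word and the number of padding words are made up for the example.

```python
from math import prod

UNKNOWN_PROB = 0.4  # the middle-of-the-road value given to never-seen words


def combine(probs):
    """Naive-Bayes combination of per-word spam probabilities."""
    spam_score = prod(probs)
    ham_score = prod(1.0 - p for p in probs)
    return spam_score / (spam_score + ham_score)


# one word the filter already distrusts (hypothetical probability)
known = [0.9]
# the same word, padded with six words the filter has never seen
padded = known + [UNKNOWN_PROB] * 6

print(f"spammy word alone:       {combine(known):.2f}")
print(f"plus six unknown words:  {combine(padded):.2f}")
```

each unknown word at 40% multiplies the spam odds by two-thirds, so six of them drag a 90% score down to roughly 44%, comfortably under a 90% threshold. filters that only look at the most interesting words blunt this trick when a message is full of strongly spammy terms, but a short subject line doesn't give them much else to work with.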

so, the mystery of the odd subject lines has been explained. the same answer goes for those large paragraphs of text you sometimes see at the bottom of spam emails - they try to confuse the filter as well (since filters use the subject, headers, and message text to determine if something is spam). spammers actually use a whole host of other tricks (see the Fieldguide to Spam for an interesting list).

NOTE 1: false positives and false negatives
one thing to be aware of as far as Bayesian filters go...they were designed to give NO false positives (i.e., marking legitimate mail as spam). this is essentially a requirement, since no one ever really bothers to look through that spambox containing the 437 spams they got today...they just delete them and hope for the best. false negatives, on the other hand, are tolerable (but kept to a minimum). after all, if five spams get through out of 1000 messages, that's manageable.

NOTE 2: Bayesian filters evolve and are tailored to you as an individual user.
these two facts are important, even critical, since spam changes over time, and since the average language content of every person's email is different. for example, i might receive email from friends containing words like PHP and blog, whereas others might not. as the spammers try to adapt their messages to spoof the filters, the filters are constantly evolving themselves to meet the new onslaught. in order for spammers to get their messages through, they have to insert content in their message that is basically indistinguishable from your personal email.

WHY SPAMMERS BOTHER
economics, pure and simple. it costs about $250 to send 1 million spams ($0.00025 per message), and response rates are about 1 in 1000. paper bulk mail costs about $0.25 per message with a 1 in 20 success rate. do the math - per message, spam is about 1,000 times cheaper than paper bulk mail, and even per response it's about 20 times cheaper ($0.25 versus $5); it's all about volume.
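
if you'd rather let the computer do the math, here is a quick sketch that uses only the four figures quoted above ($250 per million spams, 1 in 1000 responses, $0.25 per paper mailing, 1 in 20 responses). the variable names are just for illustration, and the input numbers are rough estimates to begin with.

```python
# figures quoted in the text (all approximate)
spam_cost_per_million = 250.0
spam_response_rate = 1 / 1000
paper_cost_per_message = 0.25
paper_response_rate = 1 / 20

# cost to send a single message
spam_cost_per_message = spam_cost_per_million / 1_000_000

# cost to get a single response = cost per message / response rate
spam_cost_per_response = spam_cost_per_message / spam_response_rate
paper_cost_per_response = paper_cost_per_message / paper_response_rate

# how many times cheaper spam is, by each measure
per_message_ratio = paper_cost_per_message / spam_cost_per_message
per_response_ratio = paper_cost_per_response / spam_cost_per_response

print(f"spam:  ${spam_cost_per_message:.5f} per message, "
      f"${spam_cost_per_response:.2f} per response")
print(f"paper: ${paper_cost_per_message:.2f} per message, "
      f"${paper_cost_per_response:.2f} per response")
print(f"spam is about {per_message_ratio:.0f}x cheaper per message "
      f"and about {per_response_ratio:.0f}x cheaper per response")
```

the per-message gap is huge, but the per-response gap is the number a spammer actually cares about, and it still favors spam by a wide margin.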

HOW DEEP THE RABBIT HOLE GOES
once i read about Bayesian filters, my curiosity was piqued. i decided to learn more. spam spam spam and more spam! what i discovered is that the anti-spammers are working at least as hard as the spammers, and that the lines between good and evil are not as clearly drawn as one might think. there's definitely a war going on, but there are more than two sides, and there's lots of collateral damage (e.g., people unfairly nailed by so-called blacklists that can block an entire range of IP addresses from getting through corporate mail filters).

i was going to write about all of the different stuff going on, but you'd click away well before i got through the introduction...there's WAY too much info. hopefully, the links below will lead to some interesting information, should you wish to go further down the rabbit hole.

SPAM REFERENCES

Wired articles (just a few...there are many more)

Paul Graham's Writings

CAN SPAM Act

Miscellaneous Tidbits

NOTE
The post above is not meant as an endorsement of any of the legislation, authors, or organizations listed. i leave it to you, the reader, to make up your own mind regarding the participants and casualties in the spamwars.

Posted by docrpm on 02.19.04 at 2:17 PM | Comments (2) | TrackBack (0)

January 16, 2004

melville is loquacious

file under: technology

my spam filters have been getting progressively more sieve-like. the spammers are getting smarter at disguising their drivel. at least the email titles provide some entertainment value. for your sampling, here is a list of recent subjects:

  • melville loquacious january moliere
  • afghanistan down
  • draftsman marvelous maudlin gar
  • bolshevik hereof cone
  • Re: MDM, grunya! what's this
  • Re: HDIIKINY, the procurator understood
  • beauregard actinium roof
  • hackneyed every michaelangelo
  • gallberry formatted ceil foot
  • expletive haunt maul exclaim osaka
  • hydrometer messy nitrogenous hartley adultery
  • bitwise narcosis gelatin mart
  • academy andorra influenza

it's hard to pick my favorites, although any spam email that includes an element from the periodic table in its subject is pretty cool in my book. go actinium! (Z = 89)

Posted by docrpm on 01.16.04 at 11:48 AM | Comments (7) | TrackBack (0)

August 25, 2003

gross stereotypes

file under: technology

i just read an article entitled, can programmers do interaction design?

it would be an understatement to say that i disagree with the hypothesis put forward by this author (she believes the answer to the question above is no - programmers cannot and should not do interaction design). in my opinion, her arguments reflect a set of gross stereotypes that have been floating around the web business for a while (i.e., code monkeys should know their place at the bottom of the web design food chain).

her thesis (as i read it) also reflects the more general idea that any person who can do job A is not capable of doing job B, if jobs A and B seem to involve differing skillsets. it places people in bland, easily understandable categories (i.e., you = your job) with rigid boundaries. taken to its logical extreme, it would imply that since i am an interaction designer (and programmer), i cannot also be a good father or gardener or beautician (these tasks involve different skills, after all).

if this person were some crank speaking off the cuff in a marginal discussion forum, i could easily dismiss her ideas. however, she is a VP of design at a respected interaction design firm. by association, her views are given some degree of validity - her professional status puts a stamp of authority on a notion that is short-sighted and brutally stereotypical.

i will write this woman personally to express my displeasure. if you find her words as troubling as i do (whether or not you're in the business), i would appreciate it if you did as well.

Posted by docrpm on 08.25.03 at 8:22 PM | Comments (2) | TrackBack (0)

July 17, 2003

into the fires

file under: technology

from "Dynamic HTML: The Definitive Reference (2nd Edition)" by Danny Goodman:

"...[Netscape] Navigator 4 has become a dead-end development platform, whose installed base will only decrease over time.

Despite the fact that some organizations have continued to standardize on Navigator 4 while waiting to migrate to a more modern browser platform, this edition of 'Dynamic HTML: The Definitive Reference' cuts the cord with the Navigator 4 past. If you need assistance and examples of scripting