Coding assistants musing

I love me my Cline, Claude Code and company. But there’s a major thing I found missing from them: I want my assistant to be able to step through a debugger with me and examine variables and the call stack. Somehow this doesn’t exist. It would be helpful for figuring out the flow of an unfamiliar program, for example.

Now, JetBrains MCP Server Plugin gets some of the way there, but… It can set breakpoints, but because of the way it analyzes code text it often gets confused. For example, when asked to set a breakpoint on the first line of a method, it would set it on the method signature or an annotation.

And it doesn’t do anything in terms of examining the code state at a breakpoint.

So I decided to build on top of it, see JetBrains-Voitta plugin (based on a Demo Plugin). It:

  • Uses IntelliJ PSI API to provide more meaningful code structure to the LLM (as AST)
    • This helps with properly setting breakpoints from verbal instructions
    • Hopefully this should also prevent some hallucinations about methods that do not exist (educated guess).
  • Adds more debugging capability, such as inspecting the call stack and variables at a given breakpoint.

    Here are a couple of example debug sessions:

Much better.

And completely vibe-coded.

Maybe do something with Cline next?

Athena Federated Queries: Azure Data Lake Storage, part II

In our previous installment, we learned that Athena does not support ADLS directly (without Synapse). I decided to try to rectify the situation. Initial draft here: https://github.com/debedb/athena-azure-adls

It totally sucks because it’s not useful performance-wise, too slow. But at least it’s got a connection…

But then again Dremio seems to be real good about it. It appears to work well with blob storage (ADLS on Azure, GCS on GCP, S3 on AWS). Even, in some cases, better than Athena with all the blobs in S3.

I may add benchmarks if I can.

To be continued…

It’s 2024, and…

This is a special kind of rant, so I’m starting a new tag in addition to it. It’ll be updated next year, I’m sure.

The state of yak shaving in today’s computing world is insane.

Here we go.

It’s 2024, and…

  • …and I can’t get CloudWatch agent to work to get memory monitoring (also, why is this extra step needed, why can’t memory monitoring be part of default metrics? Nobody cares about memory?) Screwing around with IAM roles and access keys keeps giving me:

    ****** processing amazon-cloudwatch-agent ******
    2024/04/05 20:51:36 E! Please make sure the credentials and region set correctly on your hosts.

    At which point I give up and just do this:

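    #!/bin/bash

    # Publish used memory (in MB) to CloudWatch as a custom metric, once a minute.
    inst_id=$(ec2metadata --instance-id)
    while :
    do
        # Line 2 of `free` output is the "Mem:" row; column 3 is used memory.
        used_megs=$(free --mega | awk 'NR==2 {print $3}')
        aws cloudwatch put-metric-data \
            --namespace gg \
            --metric-name mem4 \
            --dimensions "InstanceId=${inst_id}" \
            --unit Megabytes \
            --value "$used_megs"
        sleep 60
    done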


    Finally it works. Add it to my custom dashboard. Nice.

    Wait, what’s that? Saving metrics to my custom dashboard from the EC2 instance overrides what I just added. I have to manually edit the JSON source for the dashboard.

    It’s 2024.
  • …and we apparently still have no HTTP standard for determining a user’s time zone, at least per our overlords. We are reduced to a ridiculous set of workarounds and explanations for this total fucking bullshit, like “ask the user”. Yet we do have the Accept-Language header, and we’ve had it since RFC 1945 (that’s since 1996, which is longer than the people claiming this is an answer have been sentient).

    It’s 2024.
  • …and we have a shit-ton of JavaScript frameworks, and yet some very popular ones couldn’t give a shit about a basic thing like environment variables (yeah, yeah, I know how that particular sausage is made; screw your sausage, you put wood shavings in it anyway).


  • …and because I’m starting this rant rubric, we still are on the tabs-vs-spaces (perkeleen vittupä!) and CR-vs-LF-vs-CRLF. WTF, people. This is why I am not getting anything smart, be it a car, a refrigerator, or whatever. I know that sausage. It’s reverse Polish sausage.

    It’s 2024.

Some Postman rants and tips and tricks

I like Postman in general. But some things are annoying, so there…

APIs and Collections and Environments

APIs are great, and equally great is their integration with GitHub, and ability to generate Collections from API definitions and have them be updated when API definition changes. Nice. Except… those Collections cannot be used to create Monitors or Mock servers, you need to create standalone Collections (or copy those you generated from under APIs). But now those don’t integrate with GitHub. There is a fork-and-merge mechanism that kind of takes care of the collaboration, but that those modes are different is annoying. Ditto Environments. What’s up with that?

Some more random notes

  • Completely agree with @mipsytipsy here:

    I am an extremely literal person, and literally speaking, nobody can be a “full stack” engineer. The idea is a ridiculous one. There’s too much stack! But that’s not what people mean when they say it. They mean, “I’m not just a frontend or backend engineer. I span boundaries.”
  • Yeah, this blog is for bragging.
  • What is it with fillable PDFs on some gov’t websites (I know, I know; that’s a post for a different day) that can sometimes be saved but not printed?
  • TFW, about 16 years after your colleague writes an impassioned call to “Tear down that GIL!” (take that, Mr. Gorbachev!), the GIL is finally torn down.
  • I was wondering what the Go team was smoking when they came up with the reference date concept and can I have some of that?
  • What is it that causes Medium to suck so much? Is it all the useless “content creators” writing things pre-GPT that just rephrase stuff from the Internet with nary a value added (“Here’s 10 reasons to learn Python”, and here’s how to write “Hello, world” in C, did you know?)? Is it that now probably thousands more are using generative AI — kinda indistinguishable? Or is it their idiotic subscription model which cannot deal with some logins? (I should really devote time to figure out that one but why — is that platform really worth anything at all?)

Adventures with Golang dependency injection

Just some notes as I am learning this… There aren’t good answers here, mostly questions (am I doing it right?). All these examples are part of a (not very well) organized GitHub repo here.

Structure injection

Having once hated the magic of Spring’s DI, I’ve grown cautiously accustomed to the whole @Autowired stuff. When it comes to Go, I’ve come across Uber’s Fx framework which looks great, but I haven’t been able to figure out just how to automagically inject fields whose values are being Provided into other structs.

An attempt to ask our overlords yielded something not very clear.

I finally broke down and asked a stupid question. Then I found the answer — do not use constructors, just use fx.In in combination with fx.Populate(). Finally this works. But doesn’t seem ideal in all cases…

Avoiding boilerplate duplication

This is all well and good, but not always. For example, consider this example in addition to the above:

package dependencies

import "go.uber.org/fx"

type Foo string

type Bar string

type Baz string

type DependenciesType struct {
	fx.In

	Foo Foo
	Bar Bar
	Baz Baz
}

func NewFoo() Foo {
	return "foo"
}

func NewBar() Bar {
	return "bar"
}

func NewBaz() Baz {
	return "baz"
}

var Dependencies DependenciesType

var DependenciesModule = fx.Options(
	fx.Provide(NewFoo),
	fx.Provide(NewBar),
	fx.Provide(NewBaz),
)

If I use it as dependencies.Dependencies, it’s OK (as above). But suppose I’d rather get rid of this var and use constructors instead. I don’t like the proliferation of parameters into constructors. I can use Parameter objects, but I’d like to avoid the boilerplate of copying fields from the Parameter object into the struct being returned, so I’d like to use reflection like so (generics are nice):

package utils

import "reflect"

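// Construct allocates a new T and copies into it every field of params
// that has the same name and type. PT is inferred as *T.
func Construct[P any, T any, PT interface{ *T }](params P) PT {
	p := PT(new(T))
	construct0(params, p)
	return p
}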

func construct0(params interface{}, retval interface{}) {
	// Check if retval is a pointer
	rv := reflect.ValueOf(retval)
	if rv.Kind() != reflect.Ptr {
		panic("retval is not a pointer")
	}

	// Dereference the pointer to get the underlying value
	rv = rv.Elem()

	// Check if the dereferenced value is a struct
	if rv.Kind() != reflect.Struct {
		panic("retval is not a pointer to a struct")
	}

	// Now, get the value of params
	rp := reflect.ValueOf(params)
	if rp.Kind() != reflect.Struct {
		panic("params is not a struct")
	}

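	// Iterate over the fields of params and copy to retval.
	// Skip fields that are missing, differently typed, or not settable
	// (e.g. unexported), since Set would panic on those.
	for i := 0; i < rp.NumField(); i++ {
		name := rp.Type().Field(i).Name
		field, ok := rv.Type().FieldByName(name)
		target := rv.FieldByName(name)
		if ok && field.Type == rp.Field(i).Type() && target.CanSet() {
			target.Set(rp.Field(i))
		}
	}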

}

So then I can use it as follows:

package dependencies

import (
	"example.com/fxtest/utils"
	"go.uber.org/fx"
)

type Foo *string

type Bar *string

type Baz *string

type DependenciesType struct {
	Foo Foo
	Bar Bar
	Baz Baz
}

type DependenciesParams struct {
	fx.In
	Foo Foo
	Bar Bar
	Baz Baz
}

func NewFoo() Foo {
	s := "foo"
	return &s
}

func NewBar() Bar {
	s := "bar"
	return &s
}

func NewBaz() Baz {
	s := "baz"
	return &s
}

func NewDependencies(params DependenciesParams) *DependenciesType {
	retval := utils.Construct[DependenciesParams, DependenciesType](params)
	return retval
}

var DependenciesModule = fx.Module("dependencies",
	fx.Provide(NewFoo),
	fx.Provide(NewBar),
	fx.Provide(NewBaz),

	fx.Provide(NewDependencies),
)

But while this takes care of proliferating parameters in the constructor as well as the boilerplate step of copying, I still cannot avoid duplicating the fields between DependenciesType and DependenciesParams, running into various problems.

Looks like this is still TBD on the library side; I’ll see if I can get further.

Conditional Provide

When using constructors, I would have a construct such as:

type X struct {
   field *FieldType
}

func NewX() *X {
   x := &X{}
   if os.Getenv("FOO") == "BAR" {
     x.field = NewFieldType(...)
   }
   return x
}

In other words, I wanted field to only be initialized if some environment variable is set. In transitioning from using constructors to fx.Provide(), I wanted to keep the same functionality, so I came up with this:

type XType struct {
   fx.In

   // fx can only inject exported fields; optional lets this stay nil
   // when nothing provides a *FieldType.
   Field *FieldType `optional:"true"`
}

var X XType

var XModule = fx.Module("x",
	func() fx.Option {
		// Register the provider only when the environment asks for it.
		if os.Getenv("FOO") == "BAR" {
			return fx.Options(
				fx.Provide(NewFieldType),
			)
		}
		return fx.Options()
	}(),
	fx.Populate(&X),
)

Works fine. But is it the right way?


Putting on my marketing hat: Random MailJet hack

(Yes, I do indeed wear multiple hats — marketing FTW, or is it WTF?)

Really wanted to use MailJet (BTW, guys, what’s with support and not being able to edit a campaign after launch? Get what I pay for?) to send users a list of items, dynamically. For example, say I have the following list of items, say, “forgotten” in a shopping cart:

User,Items
Alice,"bat, ball"
Bob,"racquet, shuttlecock"


And I want to send something like (notice I’d also like links there):

Hey Alice, did you forget this in your cart:

Ball
Bat

Turns out, loop construct doesn’t work. Aside: this is despite ChatGPT’s valiant attempt to suggest that something like this could work:

{{
{% set items = data.items|split(',') %}
{% for item in items %}
{{ item.strip() }}
{% endfor %}
}}


The answer is hacky, but it works. If you remember that SGML is OK with single quotes, then construct your contacts list like so:

User,Items
Alice,"<a href='https://www.amazon.com/BB-W-Wooden-baseball-bat-size/dp/B0039NKEZQ/'>bat</a>,<a href='https://www.amazon.com/Rawlings-Official-Recreational-Baseballs-OLB3BBOX3/dp/B00AWVNPMM/'>ball</a>"
Bob,"<a href='https://www.amazon.com/YONEX-Graphite-Badminton-Racquet-Tension/dp/B08X2SXQHR/'>racquet</a>, <a href='https://www.amazon.com/White-Badminton-Birdies-Bedminton-Shuttlecocks/dp/B0B9FPRHBF'>shuttlecock</a>"




And make the HTML be just

Hey [[data:user:""]],

Did you forget this in your cart?
[[data:items:""]]


Works!

P.S. Links provided here are just whatever I found on Google, no affiliate marketing.

More random notes

  • Yeah, no-code/low-code is great (wave, OpenAI). Especially for growth-hacking, right (hello, Butcher)? But here’s your no-code platform — Google Ads. Gawd… I’d rather write code.
  • Why does FastAPILoggingHandler seem to ignore my formatter? I don’t know; but the fact that someone else also spends time figuring out the inane things that should just work is quite frustrating.
  • How many yaks have you shaved today?
  • O, GCP, how convenient: in the env var YAML sent to gcloud run you helpfully interpret things like “on”/“off”, “true”/“false”, “yes”/“no” as booleans, eh? And then you crash with:

    ERROR: gcloud crashed (ValidationError): Expected type <class 'str'> for field value, found True (type <class 'bool'>)

    Because of course you do.
  • “Overriding a number of default settings is key to shaving off unnecessary spend”. Yep.

Random notes for January 2023

Not enough for any singular entry, but enough to write a bunch of annoyed points. Because I hate Twitter threads and this is the reverse: unconnected entries jammed together.

  • GoLang: Looks like the answers to my questions are nicely written up here.
  • Technology and Society: Ok, I promise I’ll get to geekery here. So, PeopleCDC folks seem upset about the New Yorker article. But now, I am surprised — and maybe it is an oversight — at the lack of inclusion of IT people in the form. Artists, yeah, to carry the message — but if the goal is to slow the spread, why no consideration given to automation of various things (look at how pathetic most government websites are for things that are routine).

    Not expecting to hear back, really.
  • Google Ads and API Management: Every time you think you get used to all the various entities in Google Ads, you realize there’s of course a sunsetting of UA … Of course! Of course this is where I pause and let Steve Yegge on with his rant:

    Dear RECIPIENT,

    Fuck yooooouuuuuuuu. Fuck you, fuck you, Fuck You. Drop whatever you are doing because it’s not important. What is important is OUR time. It’s costing us time and money to support our shit, and we’re tired of it, so we’re not going to support it anymore. So drop your fucking plans and go start digging through our shitty documentation, begging for scraps on forums, and oh by the way, our new shit is COMPLETELY different from the old shit, because well, we fucked that design up pretty bad, heh, but hey, that’s YOUR problem, not our problem.

    We remain committed as always to ensuring everything you write will be unusable within 1 year.

  • API Management, ListHub: First, I learned there’s a standards body. Unsure what to make of it (I mean, I suppose I’ve gotten good results from IAB, and standardization of FinOps is somewhat ongoing, so, er, maybe not all bureaucracy is awful horrible crap).

    But that’s kind of a side note.

Refreshing Golang

Goroutines — I feel dumb

Doing a Golang refresher. Realize I still do not understand how exactly a new thread is spun when a syscall happens, or what happens to M and P when we are waiting on a channel? What does it mean that “Every M must be able to execute any runnable G” — that is, what does the word “execute” mean here? This document says so below again: “When an M is willing to start executing Go code, it must pop a P form the list. When an M ends executing Go code, it pushes the P to the list.” What is “ends executing”?

Similarly here, what does it mean “M will skip the G”? How does it “skip it” if thread M is running G’s instructions now? Doesn’t it block with the blocking G? What am I missing?

OK, so let’s say in case of I/O, it’s due to netpoller magic:

Whenever you open or accept a connection in Go, the file descriptor that backs it is set to non-blocking mode. This means that if you try to do I/O on it and the file descriptor isn’t ready, it will return an error code saying so. Whenever a goroutine tries to read or write to a connection, the networking code will do the operation until it receives such an error, then call into the netpoller, telling it to notify the goroutine when it is ready to perform I/O again. The goroutine is then scheduled out of the thread it’s running on and another goroutine is run in its place.

When the netpoller receives notification from the OS that it can perform I/O on a file descriptor, it will look through its internal data structure, see if there are any goroutines that are blocked on that file and notify them if there are any. The goroutine can then retry the I/O operation that caused it to block and succeed in doing so.

But still unclear: is it the netpoller itself that schedules G out of M? How does M stop running G and start running some other G’? And what about blocking on a channel operation?

Per this post, this “scheduling out” is done by runtime:

When M executes a certain G, if a syscall or other blocking operations occur, M will block. If there are some Gs currently executing, the runtime will remove the thread M from P, and then create a new thread.

In Go: Goroutine, OS Thread and CPU Management, Vincent describes this as

Go optimizes the system calls — whatever it is blocking or not — by wrapping them up in the runtime. This wrapper will automatically dissociate the P from the thread M and allow another thread to run on it.

The “wrapper” seems to be the netpoller (see above). OK, I suppose all of this connects, in a handwavy enough way, that I’m almost satisfied. I feel like a couple of dots are still unconnected, though… I suppose we can stipulate syscalls, but how are channel blocks handled? Is it the same mechanism, but driven by channels rather than the netpoller?
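
A small experiment helps with the channel question. As far as I understand the runtime, a channel operation that cannot proceed never involves the netpoller or an OS-level block: the runtime parks the goroutine (queues it on the channel) and the thread simply picks another runnable goroutine. The sketch below is my own illustration, not something from the posts quoted above; parkAndRelease is a made-up helper. It pins the program to a single P, blocks thousands of goroutines on one channel, and shows that other Go code keeps running in the meantime.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// parkAndRelease starts n goroutines that all block on one channel, does
// other work while they are parked, then closes the channel to wake them.
// It returns how many work steps ran and how many goroutines were released.
func parkAndRelease(n int) (workDone int, released int64) {
	gate := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-gate // blocks: the goroutine is parked, the thread moves on
			atomic.AddInt64(&released, 1)
		}()
	}
	for i := 0; i < 1000; i++ {
		workDone++
		runtime.Gosched() // yield so the new goroutines can start and block
	}
	close(gate) // closing the channel unblocks every receiver
	wg.Wait()
	return workDone, atomic.LoadInt64(&released)
}

func main() {
	runtime.GOMAXPROCS(1) // a single P: Go code runs on one thread at a time
	work, released := parkAndRelease(10000)
	fmt.Println(work, released) // 1000 10000
}
```

If blocking on a channel tied up a thread, ten thousand blocked receivers on one P would be a problem; here they cost a small stack each and nothing else.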

Some good deeper-than-usual resources

Some refresher examples for myself

As I was refreshing my Golang, I made a bunch of snippets as a kind of study cards. Nothing sophisticated here, just basics. Yes, it’s like Go By Example, but writing them myself is better for remembering. (Kinda like lecture notes, except, well, you can’t do these with pen and paper…)

Patterns: descriptivism vs prescriptivism

This is going to be so short, it requires this sentence to say so so it appears a bit longer.

It seems that there are two ways of looking at them. Prescriptive: “When faced with a problem of class X, use pattern A”. Or descriptive, “When faced with a problem of class X, a lot of times engineers use approaches Alpha, Beta, Gamma that have a particular pattern in common; let’s extract it and call it A so we have a common terminology.”

The “prescriptive” part really should be a “strong suggestion” added weight to by the fact that it is widespread enough to get a name, but nothing beyond that. (See also “Thinking outside the box“).

What prompted this? Well, TIL that exercises such as Ad hoc querying on AWS have a name: “lakehouse“, and that I’ve apparently been thinking about how best to do “Reverse ETL” without thinking “Reverse ETL”. Well, I guess that’s open source marketing.

This post is not making any prescriptions.

Few gists

Just a few gists to park here for later reference.

Signing OAuth 1.0 request

To work with Twitter Ads API, need to use OAuth 1.0. There’s a nice little snippet of Java here, but there’s an issue with it. After chasing some red herrings due to Postman collections, the problem is that query string is not properly encoded. Fixed code to do that for query parameters (while still missing params from request body because I don’t need them now) and adding nonce generation at this gist.
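
To show where the encoding matters, here is a sketch in Go (not the Java from the gist) of the OAuth 1.0 HMAC-SHA1 signing steps as described in RFC 5849. All names, the URL and the secrets are made up for illustration. It is simplified: a map cannot hold repeated parameter names, body parameters are left out, and a real request also needs the other oauth_* parameters (consumer key, timestamp, and so on).

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/base64"
	"fmt"
	"sort"
	"strings"
)

// percentEncode applies RFC 3986 encoding: only unreserved characters
// (letters, digits, '-', '.', '_', '~') are left alone.
func percentEncode(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		if ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z') || ('0' <= c && c <= '9') ||
			c == '-' || c == '.' || c == '_' || c == '~' {
			b.WriteByte(c)
		} else {
			fmt.Fprintf(&b, "%%%02X", c)
		}
	}
	return b.String()
}

// signatureBaseString encodes each name and value, sorts the pairs by name
// then value, joins them, and encodes the joined string once more.
func signatureBaseString(method, baseURL string, params map[string]string) string {
	type kv struct{ k, v string }
	pairs := make([]kv, 0, len(params))
	for k, v := range params {
		pairs = append(pairs, kv{percentEncode(k), percentEncode(v)})
	}
	sort.Slice(pairs, func(i, j int) bool {
		if pairs[i].k != pairs[j].k {
			return pairs[i].k < pairs[j].k
		}
		return pairs[i].v < pairs[j].v
	})
	joined := make([]string, len(pairs))
	for i, p := range pairs {
		joined[i] = p.k + "=" + p.v
	}
	return strings.ToUpper(method) + "&" + percentEncode(baseURL) + "&" +
		percentEncode(strings.Join(joined, "&"))
}

// sign computes the base64-encoded HMAC-SHA1 of the base string.
func sign(baseString, consumerSecret, tokenSecret string) string {
	key := percentEncode(consumerSecret) + "&" + percentEncode(tokenSecret)
	mac := hmac.New(sha1.New, []byte(key))
	mac.Write([]byte(baseString))
	return base64.StdEncoding.EncodeToString(mac.Sum(nil))
}

func main() {
	// Hypothetical values throughout.
	params := map[string]string{"oauth_nonce": "n1", "q": "a b"}
	base := signatureBaseString("get", "https://api.example.com/r", params)
	fmt.Println(base)
	// GET&https%3A%2F%2Fapi.example.com%2Fr&oauth_nonce%3Dn1%26q%3Da%2520b
	fmt.Println(sign(base, "consumer-secret", "token-secret"))
}
```

Note the space in the query value: it becomes %20 first and %2520 in the base string. Getting that double encoding wrong (or encoding the space as “+”) is enough to produce a signature the server rejects.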

Maven dependencies diff

Having run into problems with mismatched dependencies, here is a pom-compare.py script that compares two pom.xml files and reports the difference in dependencies. Given two files (the current one that may have a problem, and one from a known good project), this script will show which dependencies in the problematic file may be older than needed, or are entirely missing.
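
The idea is simple enough to sketch. The following is not the pom-compare.py script itself but a rough Go rendition of the same comparison, with made-up sample POMs. It is deliberately simplified: it only looks at the top-level dependencies section, and it reports versions that differ rather than trying to decide which one is older.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"sort"
)

type dependency struct {
	GroupID    string `xml:"groupId"`
	ArtifactID string `xml:"artifactId"`
	Version    string `xml:"version"`
}

type pom struct {
	Dependencies []dependency `xml:"dependencies>dependency"`
}

// parseDeps maps "groupId:artifactId" to the declared version.
func parseDeps(pomXML string) (map[string]string, error) {
	var p pom
	if err := xml.Unmarshal([]byte(pomXML), &p); err != nil {
		return nil, err
	}
	deps := make(map[string]string, len(p.Dependencies))
	for _, d := range p.Dependencies {
		deps[d.GroupID+":"+d.ArtifactID] = d.Version
	}
	return deps, nil
}

// diffDeps lists dependencies of the known-good POM that are missing from
// the current one or declared there with a different version.
func diffDeps(current, good map[string]string) []string {
	var out []string
	for name, want := range good {
		got, ok := current[name]
		switch {
		case !ok:
			out = append(out, fmt.Sprintf("missing %s (%s)", name, want))
		case got != want:
			out = append(out, fmt.Sprintf("differs %s: %s vs %s", name, got, want))
		}
	}
	sort.Strings(out)
	return out
}

// Hypothetical sample files.
const currentPOM = `<project><dependencies>
<dependency><groupId>com.example</groupId><artifactId>alpha</artifactId><version>1.0</version></dependency>
</dependencies></project>`

const goodPOM = `<project><dependencies>
<dependency><groupId>com.example</groupId><artifactId>alpha</artifactId><version>1.2</version></dependency>
<dependency><groupId>com.example</groupId><artifactId>beta</artifactId><version>2.0</version></dependency>
</dependencies></project>`

func main() {
	current, err := parseDeps(currentPOM)
	if err != nil {
		panic(err)
	}
	good, err := parseDeps(goodPOM)
	if err != nil {
		panic(err)
	}
	for _, line := range diffDeps(current, good) {
		fmt.Println(line)
	}
	// differs com.example:alpha: 1.0 vs 1.2
	// missing com.example:beta (2.0)
}
```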

Generic code for Google Ads API query

Using Google Ads API involves a lot of code that follows certain patterns (mutates, operations, builders, etc). As a fan of all things meta, including reflection, I just had to make a generic example of doing that. So now a parameterized invocation can be used in place of, for example, both campaign or ad group creation code. A similar approach can be done for update, of course.

Mocking DB calls

Using Mockito, it is quite easy to mock up a set of DB calls. For example, here’s a mock ResultSet backed by a Map.

Jar diff

Who hasn’t needed to diff JARs? Thanks to procyon, this is easy.

FinOps

Some time ago a discussion about CIO vs CMO as it comes to ad tech started, and as I see it, it still continues. As a technical professional in ad tech space, I followed it with interest.

As I was building ad tech in the cloud (which usually involves large scale; think many millions of QPS), the business naturally became quite cost-conscious. It was then that I, meditating on the above CIO-CMO dichotomy, thought that perhaps the next thing is the CIO (or the CTO) vs the CFO, or together with the CFO.

What if whether to commit cloud resources (and what kind of resources to commit) to a given business problem is dictated not purely by technology but by financial analysis? E.g., a report is worth it if we can accomplish it using spot instances mostly; if it goes beyond certain cost, it is not worth it. Etc.

These are all very abstract and vague thoughts, but why not?

Recently I learned of an effort that seems to more or less agree with that thought — the FinOps foundation, so I am checking it out currently.

Sounds interesting and promising so far.

And nice badge too.

FinOps-Foundation-Community-Member-Badge

Content assist

It looks like you are researching razors. I think you are about to go off on a yak-shaving endeavor, and I cannot let you do that, Dave.

What I would really like my DWIM agent to do. That, and to stop calling me Dave.

Being lazy and impatient, I like the idea of an IDE. The ease of things like autocompletion, refactoring, code search, and graphical debugging with evaluation is, for lack of a better word, good.

I like Eclipse in particular — force of habit/finger memory; after all, neurons that pray together stay together. Just like all happy families are alike, all emacs users remember the key sequence to GTFO vi (:q!) and all vi users remember the same thing for emacs (C-x C-c n) – so they can get into their favorite editor and not have to “remember”.

So, recently I thought that it would be good for a particular DSL I am using to have an auto-completion feature (because why should I remember?). So I thought, great, I’ll maybe write an Eclipse plugin for that… Because, hey, I’ve made one before, how bad could it be?

Well, obviously I would only be solving the problem for Eclipse users of the DSL in question. And I have a suspicion I am pretty much the only one in that group. Moreover, even I would like to use some other text editor occasionally, and get the same benefit.

It seems obvious that it should be a separation of concerns, so to speak:

  • Provider-side: A language/platform may expose a service for context-based auto-completion, and
  • Consumer-side: An editor or shell may have a plugin system exposed to take advantage of this.

Then a little gluing is all that is required. (OK, I don’t like the “provider/consumer” terminology, but I cannot come up with anything better. I almost named them “supply-side” and “demand-side”, but that evokes so much association with AdTech that it’s even worse.)

And indeed, there are already examples of this.

There is a focus on an IDE paradigm of using external programs for building, code completion, and any others sorts of language semantic functionality. Most of MelnormeEclipse infrastructure is UI infrastructure, the core of a concrete IDE’s engine functionality is usually driven by language-specific external programs. (This is not a requirement though — using internal tools is easily supported as well).

  • Atom defines its own API

And so I thought – wouldn’t it be good to standardize on some sort of interaction between the two in a more generic way?

And just as I thought this, I learned that the effort already exists: Language-server protocol by Microsoft.

I actually like it when an idea is validated and someone else is doing the hard work of making an OSS project out of it…

Rethinking data gravity

At some point I remember having a short chat with Werner Vogels about taking spot instances to the extreme: a genuine market in which compute power can be traded. His response was “what about data gravity?”, to which my counter was: but by making data transfer into S3 free (and, later, making true the adage about not underestimating the bandwidth of a truck full of tape) you, while understanding the gravity idea, also provide incentives to not make it an issue. As in: why don’t I make things redundant? Why don’t I just push data to multiple S3 regions and have my compute follow the sun in terms of cost? Sure, it doesn’t work at huge scale, but it may work perfectly fine at some medium scale, and this is what we used for implementing our DMP at OpenDSP.

Later on, I sort of dabbled in something in the arbitrage of cost space. I still think compute cost arbitrage will be a thing; 6fusion did some interesting work there; ClusterK got acquired by Amazon for their ability to save cost even when running heavy data-gravity workload such as EMR, and ultimately isn’t compute arbitrage just an arbitrage of electricity? But I digress. Or do I? Oh yes.

In a way, this is not really anything new — it is just another way to surface the same idea as Hadoop.

Of course you have an API!

The following is a dramatization of actual events.

“I need access to these reports.”

“Well, here they are, in the UI.”

“But I need programmatic access.”

“We don’t have an API yet.”

“Fine, I’ll scrape this… Wait… This is Flex. Wait, let me just run Charles… Flex is talking to the back-end using AMF. So what do you mean you don’t have an API? Of course you do — it is AMF. A little PyAMF script will do the trick.”

“Please don’t show it to anyone!”

P. S. That little script was still running (in “stealth production”) months, if not years, later.

Development philosophy

This is one of those posts that will continue to get updated periodically.

I’ve been asked to describe my software development philosophy (and variations thereof) often, so I’ll just keep this here as a list.

  • The right tool for the right job. A “PHP programmer”, to me, is like a “screwdriver plumber” or a “hammer carpenter”. First, figure out the problem you are trying to solve, then, pick the tool.
  • Do not reinvent the wheel.
    • It is likely that others solved a similar problem. There may be solutions out there already, in the form of libraries, SaaS, FOSS or commercial offerings, etc. Those are likely to have gone through extensive testing in real life. Use them. Your case is not unique, nor are you that smart.
    • You are not that smart.
    • Consider buying (borrowing, cloning, licensing) rather than rolling your own
  • Engineering is an art of tradeoffs. Time for space, technical debt for time to market, infrastructure costs for customer acquisition, etc.
  • Abstractions leak.
  • The following things are hard. However, they have been solved and tested and worked for years, if not decades. Learn to use them:
    • Calendars and timezones
    • Character encodings, Unicode, etc.
    • L10n & I18n in general
    • Relational databases
    • Networking (as in OSI)
    • Operating systems

Reporting: you’re doing it wrong

I’ve often said that there are certain things average application programmers just do not get. And those are:

  • Calendars and timezones
  • Character encodings, Unicode, etc.
  • L10n & I18n in general
  • Relational databases
  • Networking (as in OSI)
  • Operating systems

And by “do not get” I do not mean “are not experts in”. I mean, they don’t know what they don’t know. Time and time again I see evidence of this. Recently I saw one that was so bad it was good — and that, I think, necessitates a meta-like amendment to this list, in the spirit of Dunning-Kruger. As in:

You are probably not the first person to have this problem. It is very likely that smarter (yes, snowflake) people already solved this problem in some tool or library that has endured for years, if not decades. USE IT!

Here is the incident, worthy of The Daily WTF.

There is a monthly report that business runs (it doesn’t really matter what kind of report; some numbers and dollars and stuff). How is the report being generated? (Leaving aside for now the question of why one would roll one’s own reports rather than using a ready-made BI/reporting solution.) Using the brilliant algorithm that Donald Knuth forever regrets not including in his TAOCP series:

  1. Set current_time to the chosen start date, at midnight.
  2. Output the relevant data from day specified by current_time.
  3. If current_time is greater than the chosen end date, exit.
  4. Increment current_time by 86400 (because 86400 is a universal constant).

What could possibly go wrong?

Nothing. Except when you hit the “fall back” time (end of DST). Once you go past that date, the system will subtract an hour, and you end up at 11pm the previous day, not midnight of the day you wanted. And because you’re stupid, you have no idea why all your business users are waiting for those reports forever.

Java-to-Python converter

“Anything that can be done, could be done ‘meta’” (© Charles Simonyi) is right up there with “Laziness, impatience and hubris” (© Larry Wall) as a pithy description of my development philosophy. Also, unfortunately, there’s another one: “Once it’s clear how to proceed, why bother to proceed” (or something like that). So, with that in mind…

I wanted a Python client library for GData (thankfully, they released one last week, so this is moot; good!), so I thought of automagically converting the Java library to Python. I tried Java2Python, but it’s based on an ANTLR grammar for Java 1.4, and the library, of course, is in Java 5. As I was relearning ANTLR and writing all these actions by hand (the pain!), I took a break and found a Java 1.5 parser with AST generation and visitor support by Julio Gesser (no relation, I presume?) and Sreenivasa Viswanadha, based on JavaCC. Aha! Much easier… But then, of course, Google released the Python version of the library I needed in the first place, so I didn’t bother wrapping this project up… Here it is for whoever wants it: http://code.google.com/p/j2p/.

The King, the Jedi and the Prodigal Son walk into a bar…

So, earlier I tried to switch to Blogger briefly, because my LiveJournal was messing up javablogs feeds (and I wanted something trackback-like).

But then I missed this tag/label/category functionality thingie, so I had a brief affair with Movable Type, but then, voila — The New Version of Blogger. Good, I don’t have to host the stupid thing then…


Peter Kriens has been working too much: “Today an interesting project proposal drew my attention: Corona. Ok, the name is a bad start. The Apache model of names without a cause is becoming a trend.” Eh? I was with you until the last sentence — but it’s not an Apache model of names without a cause, it’s a model of — aw, geez, there must be a pithier term for it — names for things associated with main product that are in some ways puns on the original name (JavaBeans, Jakarta, etc.) Get it? Sun – Eclipse, Eclipse – Corona? (Things will really get out of hand — with horses! — when a Corona-associated product will be called Dos Equis).

Evaluating expressions in PyDev (Eclipse plug-in for Python)

I use PyDev because, probably like many, I am used to Eclipse for Java development. What I find useful there is highlighting a snippet (expression) in a debug session and pressing Ctrl+Shift+D to evaluate it, and I miss this in PyDev. A crude workaround is to add the expression to the Watch list, but that grows the Watch list and is not convenient: I not only have to right-click, choose Watch and then look in the Watch list, but may also need to scroll that list, remove things, etc. That’s not what I am used to. So I threw together a crude implementation of it.

The change is in the org.python.pydev.debug project:

  1. Added EvalExpressionAction class to org.python.pydev.debug.ui.actions package.
  2. Changed the plugin.xml
  3. The MANIFEST.MF thus includes two additional bundles in the Require-Bundle: field: org.eclipse.core.expressions and org.eclipse.jdt.debug.ui. (Well, the second one is only for the second keystroke – “persisting” the value in the Display view, and only because I was lazy at this point. But also, since this thing relies on other org.eclipse.jdt stuff, I figured it’s not a big deal).

    Another problem here is that I couldn’t figure out how to do Ctrl+Shift+D the second time for persisting; so Ctrl+Shift+D works to display in a popup, and Ctrl+Shift+S does the persisting. (I chose “S” because when I press Ctrl+Shift+D my index finger is on D, so it’s easy and fast to use the middle finger to press S immediately :) But that is still close to what I am used to pressing blindly. People get used to all sorts of weird keystrokes and go out of their way to reproduce them in their new environment, just witness viPlugin for Eclipse.

Of course, as I went to announce this on the list, I saw that PyDev already has a slightly different mechanism for that. O well, at least this way still saves me some keystrokes and I learned that the Console view is also a Python shell. (That’s cause I never RTFM)… But at least I was not the only one

So anyway, this seems to work in my environment; just unzip into the Eclipse folder – and do so at your own risk…

Just say no to Holub


Boo-hoo! You had me, and then you lost me!

Frank Sinatra

What does the pigeon (голубь) have to do with it?

Report from the First Spring Olympic Games

Yeah, yeah, we do want to “Just say ‘No’ to XML”. Amen. And +1 to Mr. Holub for noting that “…many so-called programmers just don’t know how to build a compiler. I really don’t have much patience for this sort of thing.” But it’s all downhill from there:

  • -0.1 for describing Ant as a “scripting language” (it really is declarative…)

  • -0.4 for picking on Ant, of all things, in the first place. Some people can write a compiler and still manage to subject “every one of [their] users to many hours of needless grappling with”, oh, I don’t know… make???

  • -0.5 for plugging his book at the end

  • -10 for doing the above with an innocent “By the way”. (+10 if this “innocence” is tongue-in-cheek, Lt. Columbo-“Oh, and just one more thing”-like. But “architects, consultants and instructors in C/C++, Java and OO design” don’t do this kind of subtlety.)

In all, Mr. Holub is 10 in the hole for this round…

A classic case of how a perfectly defensible thesis is ruined by the examples…

More WIBNIs

 


P.S. And on the lighter side…

GMail WIBNIs

So I noticed that when I got an e-mail about an appointment, GMail helpfully (no, I mean it!) included a conspicuous link for entering this appointment into my Google calendar. Which leads me to a couple of WIBNIs:

  1. When I get a bounce, I should get a similar link allowing me to remove this address from my contact list. (Parse the email, come on, I know you already do, so it’s not that big an invasion of my non-existent privacy to see that this email came from a MAILER-DAEMON or something)…
  2. More or less ditto for locations mentioned in emails.
  3. When I do “Report Spam”, I don’t really give a flying spaghetti monster what the underlying algorithm is, but is it too much to expect never to see a message from that particular address in my inbox?
  4. In general, perhaps there’s a way to allow people to create solutions for similar WIBNIs, immediately adding this functionality to their own account and also contributing them to some central repository of solutions, thus enhancing Google’s hegemony further, if that’s even possible.
  5. I’ll be having more to say…

P.S. A couple of days after discussing with BOBHYTAPb the silliness of Google’s attitude toward “mail sent to yourself will not appear in your inbox as you expect because it’s a feature and you’re gonna like it and we don’t give a shit that that’s what you expect, ’cause your expectations are due to bad upbringing”, I noticed that this changed.

IOException: OutputStreamOfConsciousness is not accepting any more output

Given my “penchant” for using character names from French adventure narratives, I have decided to give Dbdb project the code name “Bragelonne” (the link is for… you know…). It is, after all, ten years since, you know… Which is all the more fitting (ironically) as I am about to give it up for adoption…

WIBNI…

 

    1. A “Debugging Eliza” idea from BOBHYTAPb

      Here is a bomb of an idea: A Debugging Eliza.

      After a long and fruitless session of debugging, a programmer reaches a certain dead end, where he has already glossed over the problem and has not found it, but has noted subconsciously that that avenue has been checked, or simply didn’t think about it. At this point he needs to talk to someone about this problem: a sort of psychiatrist, who doesn’t even have to be human. It could be a slightly tweaked version of Eliza.

      “What are you doing?”

      “I am debugging Blah-Blah Industrial Application.”

      “What was the last thing you tried?”

      “I checked that the configuration file corresponds to the Blah-Blah…”

      “And how is the Blah-Blah?”

      “It’s perfectly fine.”

      “And how does that make you feel?”

      “It means the problem is somewhere else.”

      “Where else could it be?”

      etc.

      Obviously, this is where a lot of problems are found — when you are asking someone for help, and in the process of explaining the problem realize your error.

  • Eclipse, for all its cool pluggable architecture, lacks a basic thing — macros, which should be easy given the above. That is, a way to record (or write by hand, fine) a series of steps to instruct the Eclipse workbench to do something, and then play it back. Where’s AppleScript  when you need it?

    For example, a macro could be used instead of creating a walkthrough. Yes, part of the pain in this particular case can be solved by, for example, checking dot-files into source control, and then telling everyone to “Import existing projects into a workspace” after checking out the tree. But I can’t do that: there are dot-files of a “competing” approach checked into the repository, which suit some of us fine but lack the things others want. But that’s just this particular example; I cannot come up with another case right now, but trust me, they exist.

     

 

YODL

Once upon a time, BOBHYTAPb, Shmumer, others and yours truly thought that a short-term LARP-like online game could be interesting. (Nothing came of it, of course.) One of the problems cited at the time was that computer games were lacking in modeling of reality in general (duh!). In particular, the thought went, the problem is with OOP itself. So YODL was conceived. (Did I mention that nothing came out of it?) Shortly thereafter I discovered Subject-Oriented Programming articles… A while later, I found notes about YODL which I reproduce here in their incoherent entirety without any hopes that anyone cares, using this LJ as my personal repository of stuff to refer to, maybe…


interface to other language/objects/functions?

1. Interceptable actions

The most limiting feature of this scheme is the finality of all actions. In MUDs, one active object can intercept another object’s action and veto it. Here, once an action is initiated it is performed (see the caveat below). Other objects can only react to it later on. Example: in a MUD, you can place a closed chest and a guardian over it. If you try to open the chest, the guardian stops you – you have to kill him first. In this scheme, if you are close enough to the chest to open it – you open it, guardian or not.
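
To make the contrast concrete, here is a rough Java sketch of the MUD-style behavior that this scheme lacks. Everything in it (`Interceptor`, `Chest`, `tryOpen`) is a made-up name for illustration, not anything from the YODL notes: any registered interceptor may veto the action before it happens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of MUD-style interception: any registered
// interceptor may veto an action before it is performed.
interface Interceptor {
    boolean allows(String actor, String action);
}

final class Chest {
    private final List<Interceptor> interceptors = new ArrayList<>();
    private boolean open = false;

    void addInterceptor(Interceptor interceptor) {
        interceptors.add(interceptor);
    }

    // Returns true if the chest was opened, false if someone vetoed it.
    boolean tryOpen(String actor) {
        for (Interceptor interceptor : interceptors) {
            if (!interceptor.allows(actor, "open")) {
                return false;
            }
        }
        open = true;
        return true;
    }

    boolean isOpen() {
        return open;
    }
}
```

A guardian would be an `Interceptor` that refuses "open" for as long as it is alive; in the scheme described in these notes there is no such hook, so the chest simply opens.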

2. Environment as a privileged interceptor

Broadcasting messages to those that are interested and are eligible
cf 1?

3. yodl abstract

YODL is a mark-up language used to rapidly create new game worlds by placing objects in them. Objects can be existing ones, taken from the library, or newly created (with YODL as well!) on the basis of other objects. YODL supports inheritance, with every object inheriting its properties at least from some ideal object(s) (instances of which cannot be created), or from other functional objects. As is standard for such inheritance, properties can be overridden, added or erased.

An object can be created by inheriting from two objects, yielding compound objects (e.g., a rifle with laser targeting can be created out of a stock rifle and a stock laser pointer).

As much as this seems like a use of the standard OO (object-oriented) approach, YODL presents an important innovation over it:

We strive to make the worlds we create believable. To do that, ideally, the user must be able to do with a given object what he can do with it in the real world. That is impossible under the standard OO paradigm.

In the OO paradigm, the designer must specify the behavior that each object is capable of. If a certain behaviour is not specified, the object cannot perform it. This is a great disadvantage. It is impossible to think of all the things it is possible to do with, for example, a cup. What if a user would like to try to hammer nails with it?

YODL provides for that and other behavior by NOT providing specifically for a behavior in an object. Instead, YODL allows designers to specify a set of actions generally available (hitting, throwing, heating up an object…). Then the object acted upon executes that action upon itself, and the action, based on the object’s properties, decides on the consequences. For example, consider a metal cup and a ceramic one. The designer did not specify whether either cup can be hit. However, an action of being hit is in the system, and if a cup is hit, the action will decide, based on the cup’s properties, whether it breaks (ceramic) or bends (metal).
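
A rough sketch of the cup example, in Java rather than YODL (whose syntax was never finished). All names here (`Material`, `Thing`, `HitAction`) are invented for illustration. The point is only that the object declares properties and nothing else, while the action decides the outcome.

```java
import java.util.Map;

// Hypothetical names throughout; this is Java standing in for YODL.
enum Material { CERAMIC, METAL }

// An object only declares properties; it has no hit() method of its own.
final class Thing {
    final String name;
    final Map<String, Object> properties;

    Thing(String name, Map<String, Object> properties) {
        this.name = name;
        this.properties = properties;
    }
}

// The action owns the consequences and reads the target's properties.
final class HitAction {
    static String applyTo(Thing target) {
        Object material = target.properties.get("material");
        if (material == Material.CERAMIC) {
            return target.name + " breaks";
        }
        if (material == Material.METAL) {
            return target.name + " bends";
        }
        // Property not set: the outcome is unknown rather than an error.
        return "nothing obvious happens to " + target.name;
    }
}
```

Nobody had to anticipate "hitting a cup" when defining the cup; a new kind of object gets a sensible reaction to being hit as soon as it declares a material.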

In contrast to OO, this can be termed AO, the action-oriented paradigm. This is a misnomer, however, since YODL does not give preference to actions (verbs) over objects (nouns). Without getting into linguistic debates: if we need both to better describe our world, we will have both.

Other concepts introduced in YODL are related. To go into details, we would need to provide the full YODL specification, which we can’t right now. Of immediate interest, however, are the following concepts.

  • Action inheritance – to ease the work of designers, actions can be inherited just as objects can be.
  • Faces – actions that can be inflicted upon the object can be calculated automatically, some being discarded (e.g., if an action of “break” cannot possibly be inflicted upon an object, it is discarded).

Remaining actions represent a “face” of an object. This is useful to the user, who can then be presented with a list of actions he can do upon the object, as a menu. More importantly, however, this can provide differentiating “cognitive portraits” of user’s characters, forcing the character to see an object in some way. For example, a character that has never heard of a gun will be able to “press the trigger” but not to “shoot” the gun – and definitely not to load it.
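
One way to picture a face, as a minimal Java sketch: treat it as the subset of an object's possible actions that a given character knows about. The `Faces` class and its `faceFor` method are hypothetical, and reducing "perception" to a set of known action names is my simplification, not something the notes specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a "face" is the subset of an object's possible
// actions that a particular actor knows how to perform.
final class Faces {
    static List<String> faceFor(List<String> possibleActions, Set<String> actorKnows) {
        List<String> face = new ArrayList<>();
        for (String action : possibleActions) {
            if (actorKnows.contains(action)) {
                face.add(action);
            }
        }
        return face;
    }
}
```

For a gun whose possible actions are "press the trigger", "shoot" and "load", a character who only knows "press the trigger" gets a one-item face, which is also exactly the menu the user would be shown.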

Context-aware programming, in that respect like AOP.

An ACTOR is a performer of actions. An OBJECT is something the actions are performed on.

An OBJECT IMPLEMENTS a collection of INTERFACES.

An INTERFACE REQUIRES PROPERTY REFERENCES. When implementing the INTERFACE, the OBJECT must PROVIDE those to the INTERFACE. If the PROPERTY is not set by the OBJECT, it may be deemed UNKNOWN. The INTERFACE must specifically allow a PROPERTY to have an unknown state.

Properties have a state of Unknown. If an Object’s Interface Requires a PR for the Object, and the state of the Property is currently Unknown, the Object never returns that interface as a member of the Face. (or what of optional)

A subset of a collection of INTERFACES that an OBJECT implements is a FACE of the OBJECT. An OBJECT thus has many FACES. (set of methods???)

When the ACTOR interacts with the OBJECT, the ACTOR constructs an INSTANCE of his PERCEPTION INTERFACE (parameterized both by LOCAL_ENVIRONMENT) and passes it to the OBJECT. Based on the received INSTANCE of the PERCEPTION interface, the OBJECT returns to the Actor a Face, that is, a subset of its collection of Interfaces. What the Actor sees then is a particular Face of the Object, parameterized by the current Actor’s Perception and the current state of Local_Environment.

(Addendum: The Object doesn’t choose shit. It just passes the PERCEPTION to the Interfaces, and they decide which Subinterfaces make up the face).

TRIGGERS

A Role is a collection of Interfaces (CF. Role theory – GG2001).

Multiple Inheritance Resolution: User Intervention. If an Actor invokes a Method appearing in more than one Interface, the Actor is asked to specify which Interface he had in mind. Or, the designer provides a default order?

Deconstruction Interface:

BASIC Interface

A tool for script creation — just record the proceedings

Environment Triggers — implicit, e.g., an air balloon’s trigger on local pressure.

PROPERTIES — not just numbers, but instead objects with access methods!

Properties implying interfaces requiring them? — Properties only imply Passive; ACTIVE require knowledge and must be explicitly declared.

Every Interface comes in two Complementary types: Active and Passive. The Passive interface contains handlers for the Active interface.

The Actor passes the collection of his Active interfaces to the Object (with the Perception module). The Object returns a Collection of the Actor’s subinterfaces, corresponding to what the Object can handle given the current state of the Environment and of the Actor. This could involve Fallback on some of the active interfaces of the actor to their ancestors, as best as the object can handle for that particular interface.

Environment and Actor are special cases of Objects. They also provide Faces to the Actor. Actor’s Face includes Inventory, for example.

Case Study: Actor contains Active “Run” and Passive “Run”. P-RUN changes the actor’s coordinates.

What goes into Environment? Walls treated as Objects? Is Environment any different from a specialized collection of Objects we don’t want to treat as Objects? Philosophically?

A Face includes all the Representation stuff: graphics, audio, smells, whatever. In fact, these should depend upon the Actor’s Perception and not only on the state of the Object.

PH: An Interface can be separated into Effect and Implementation.

Flying is an effect. Winged Flying, Propeller Flying, Jet Flying – implementations of that effect. This mimics Templates — but not completely.

Properties imply Passive interfaces: Passive Interfaces are written using fixed property names. Ergo, Interface designers must communicate heavily.


 Keyword unknown 
 Actions {
 
 Hit (subject object) where 
 
 Object has , is
 
 Subject has 
 
 {
 
 
 
 }
 
 } 
 
 
 Actor me {
 
 Knows hit 
 
 } 
 
 
 class chair implements matter {
 
 state = solid;
 
 // weight not here - automatically unknown
 
 } 
 
 
 class Neanderthal knows wood {
 
 } 
 
 
 class Bird knows wood 
 
 
 interface matter{
 
 property state : {solid, liquid, gas};
 
 optional property weight;
 
 } 
 
 
 interface wood implements matter {
 
 property hardness : 3;
 
 } 
 
 
 class blade implements iron {
 
 property edge: .1; 
 
 
 cut (matter m) {
 
 if (m.state=solid) {
 
 if ()~ 
 
 
 }
 
 } 
 
 
 Neanderthal N; 
 
 
 Bird b; 
 
 
 Wood w;
 
 Knife k;
 
 b.use(k.cut(w)); 
 
 

Some random links jotted in these notes:

RSS WIBNI

So, I broke down and got a paid account just so I could syndicate (oh, and ). Does this even work? We’ll see…

So, while I am at it, here’s an RSS WIBNI: a weighted RSS. So that, for example, occasional entries from stay on top, rather than being beaten by frequent spewage from something like /. (I won’t even link to that den of iniquity, but I read it for the articles…)
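
What “weighted” could mean, as a small Java sketch. The scheme is purely my assumption (no reader I know of does this): scale each entry's age by how chatty its feed is, so that entries from a feed posting 48 times a day sink 48 times faster than entries from a feed posting once a day. `FeedEntry` and `WeightedFeed` are made-up names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of one syndicated entry.
final class FeedEntry {
    final String feed;
    final double ageHours;        // time since publication
    final double feedPostsPerDay; // how chatty the source feed is

    FeedEntry(String feed, double ageHours, double feedPostsPerDay) {
        this.feed = feed;
        this.ageHours = ageHours;
        this.feedPostsPerDay = feedPostsPerDay;
    }

    // Effective age: entries from chatty feeds "age" faster.
    double weightedAge() {
        return ageHours * feedPostsPerDay;
    }
}

final class WeightedFeed {
    // Smallest weighted age first, i.e. shown on top.
    static List<FeedEntry> rank(List<FeedEntry> entries) {
        List<FeedEntry> ranked = new ArrayList<>(entries);
        ranked.sort(Comparator.comparingDouble(FeedEntry::weightedAge));
        return ranked;
    }
}
```

With this weighting, a day-old entry from a feed that posts every other day still outranks an hour-old entry from a feed that posts 48 times a day.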

Rant

The debate holy war on the topic of software engineering vs “real” engineering seems as endless as GWOT. I am too lazy to do an extensive search, but I do remember one of the pithy definitions to claim the use of differential equations as a necessary condition…

DISCLAIMER/DIGRESSION
I don’t really care, but “engineer” does sound cooler than “programmer”, which doesn’t have a sci-fi ring to it anymore, or “developer”, ’cause Donald Trump is also one — not that he isn’t cool

But I thought I’d throw just one more difference into the mix. Software engineers — at least those that work in application development — have to use knowledge of other domains — those, for which software is written (e.g., finance, etc.)

As far as I am concerned, these domains tend to be boring… I like technology for technology’s sake… Does that make me more of an engineer?

Discuss

I wonder whether Michael Swaine weighed/will weigh in on it…

P.S. Please…

BOOK REVIEW: “Eclipse: Building Commercial-Quality Plug-ins”

I suppose this is more of a praise of Eclipse plug-in architecture and available documentation than a review of the book per se, but I did not get from Eclipse: Building Commercial-Quality Plug-ins anything I could not by scanning online docs and playing with Eclipse myself. I was up and running with my plug-in project in a very short time without opening this book, and once I did, I did not find anything I have not already learned or known where to turn for more info…

It may be easy to say that many such books are just a rehash of the wealth of online information already freely available, but sometimes the books do have added value, say, by presenting the material for faster learning and/or reference. In this case, there can be no such added advantage – again, because the Eclipse project’s own design and documentation is very clear and thorough…

I realized all that before getting the book; in buying it, I was looking for another advantage – hidden tips and tricks, kind of like Covert Java. For example, how do I debug a plug-in project that depends on a non-plugin one?

No such luck.

I’ll be returning this book to the store now, and maybe trying to see if Contributing to Eclipse: Principles, Patterns, and Plugins is closer to what I want…


Who debugs the debuggers, part III


…See also Part II

I suppose the Javadt approach ran out of steam. For some reason, it now takes a horribly long time to invoke the request on an ObjectReference that represents a java.sql.Connection. (A horribly long time is time enough to have a smoke, and then to come back, see it’s still not done and go surf the web enough to leave the zone.) So I decide to bite the bullet and look into creating an Eclipse plugin…

…which turns out to be not too hard. And, while I am at it, I will use Java 6, and undo the horrific crap I did to get around the lack of the MethodExitEvent.returnValue() feature…

However, here’s a little but symptomatic discovery (duh!). Javadt does not like a null EventSet: it just does not check for nulls (which is ok, I suppose, for a throwaway reference implementation). So I was returning an empty EventSet to it all the while. But Eclipse will indiscriminately call resume() on it, which is not what I want. So I am back to returning null. Fine. But how many such little things would render this “framework” not really a framework… Or should this all be configurable?

JDBC notes

  1. Executing multi-line statements

    Apparently, Oracle’s JDBC driver doesn’t like CR/LF endings. LF itself is ok. So this was needed:


    sql = sql.replaceAll("\r", "");

    See also:

    http://forum.java.sun.com/thread.jspa?threadID=669282&messageID=3914430

    http://groups.google.com/group/comp.lang.java.databases/browse_frm/thread/ea6e14e596db1546/83f97ffd119eedb2

  2. “Due to a restriction in the OCI layer, the JDBC drivers do not support the passing of Boolean parameters to PL/SQL stored procedures…”
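
The line-ending fix from item 1, wrapped up as a tiny helper. This is just a sketch: `SqlText` is a made-up name, and I use `replace()` instead of `replaceAll()` since no regex is needed to drop a literal carriage return.

```java
// Minimal sketch: normalize line endings before handing a multi-line
// statement to the driver. SqlText is a made-up helper name.
final class SqlText {
    static String stripCarriageReturns(String sql) {
        // replace() works on literal text, so no regex is involved
        return sql.replace("\r", "");
    }
}
```

The result is what you would then pass to the statement, with LF-only line endings.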

I love it when…

…I spend time working on something under a [reasonable] assumption that I can do X, spend some more time realizing that I actually cannot, lots more on cranking out convoluted code for working around that limitation, and then find out that this X has in fact been implemented in a later release than the one I have…

Here, the feature X is being able, upon exit from a method, to get the value it returned. This feature is there in JDK 1.6. I don’t need it anymore for now though… Maybe I will…

Oracle and JPDA

I believed that I had to go through the pain to bridge DBMS_DEBUG to JDWP. I’ve already started to look into it, using GNU Classpath’s implementation. But it turns out that Oracle already supports debugging stored procedures with JPDA.

But all that does is save work on Dbdb; it does not make it irrelevant. While adapting another debugger to JPDA is useful (and I may yet do it for something else), it is not the primary value of this project. That value is in the unified call stack.

And David Alpern of Oracle claims they already have something like this, but it’s nowhere to be found. JDeveloper allows debugging stored procedures, but the single call stack, which is what I think is the ultimate value of Dbdb, is not there…

Who debugs the debuggers, part II


…See also Part I

The JDB approach is kind of painful. Perhaps another already existing debugger can be used to try my approach? So far, I am intimidated by Eclipse (or NetBeans), and want something easier. The reason is that at this point my idea of integrating with a debugger is modifying it to supply my own implementation of a Connector; thus, I need a project whose code is easier to beat into an Eclipse project and modify.

As for more flexible integration with any debugger (for when this project is “mature”) that would not require modifying the source of said debuggers, I am considering the following. Since every good debugger has a feature to “attach” to an executing JVM, I will implement a Java-based “tunnel” for JDWP. Looks like GNU Classpath has done all of the annoying work of implementing the spec.

After examining several alternatives, I have decided, for now, to first use a modified Trace and, when that runs out of steam, Javadt.

I noticed I had previously reinvented the wheel: when I read about JPDA, I did not notice the existence of Trace! In an effort to track down a culprit in an execution of an application, I had replaced the JVM called with FoljersCristals, my homegrown version of Trace. Do I feel silly now…

Who debugs the debuggers

Digression

The subject really must be in Latin, n’est-ce pas? While I have no formal instruction in Latin, I should be able to come up with one, what with Latin’s pretty formal structure, my general understanding of syntax and “feel” for languages, my finishing a Natural Language Processing (6.863J) incomplete 10 years later, Vocabula computatralia, Mike McLarnon’s conjugation applet and Verbix.

Should it be “Quis emendabit ipsos emendatra”?

I should probably have asked someone to translate it, which reminds me of a recursive acknowledgment Littlewood describes in A Mathematician’s Miscellany. He talks about a translated paper that had three notes at the end of it:

  1. I wish to thank NN for translating this article
  2. I wish to thank NN for translating the above note
  3. I wish to thank NN for translating the above note

And that, of course, is where it ends, for, though the author did not know the target language, he was perfectly capable of writing note #3: by copying the second note…

So, to start with, I decided to go with JDB. First question is, how
best to use it in development:

Then came home and figured out that I have to:

  • put tools.jar into the JRE’s home (as distinct from JAVA_HOME, which, apparently, is assumed to be the JRE’s home; to wit, if you have the JDK installed, it’s, e.g., D:\jdk1.5.0\jre rather than D:\jdk1.5.0). In other words, dropping tools.jar into the jre/lib/ext folder, in addition to its rightful place in <JDK_INSTALL_DIR>\lib, did the trick...

  • properly override name() of the Connector you're implementing, for diagnostics (so that you're not confused by the output of jdb -listconnectors), but that's a minor thing...

Monkey business


After reading this (http://discuss.fogcreek.com/joelonsoftware/default.asp?cmd=show&ixPost=75607&ixReplies=51), I thought I’d put in my couple of bucks… This is a RANT!

It seems that the orientation in business is towards “monkey”
programmers — those who do not think, but do as they are
told. This is because management, apparently (and justifiably),
believes that at any given time it is easier to hire a hundred
monkeys (those are trained ones, that do not type randomly,
and so less than a million and less than infinite time will suffice,
but this is not a good analogy anyway), than a Shakespeare – or even Dumas
(with his own monkeys, so that’s another bad analogy, woe is me!)

As a result, there are (the list is by no means exhaustive; Java is
the language unless otherwise specified — I think Java has produced
more monkeys who think they are software engineers than anything
else — at least VB does not lend one an air of superiority):

  • …monkeys who would rather sharpen the
    carpal-syndrome-inducing skills of cutting and pasting the same
    thing over and over again, rather than learn something like sed or Perl or a
    similar tool —
    or, indeed, spend some effort finding out about the existence
    of such tools and their availability on the monkey platform of
    choice (read: Windows) — or even finding out what plugins are
    available for their lovely IDE.

    IBM, for example, provides a framework called EAD4J, Enterprise Application Development for Java (it is only available with purchase of IBM IGS consulting services). It includes components similar to Struts, log4j, etc. The framework is well designed, but here is a catch: because of its design, adding or changing a service requires changes to about 8 files. There are abstract processes, process factories, interfaces, factories, XML files with queries, files containing constants to look up these queries, etc., etc. It would really be nice if there was a simple way to manage it, plugging in your logic while some IDE plugin or script does the, well, monkey job. Otherwise it’s overdesigned.

    Now, there are simple plugins for the current IDE of choice, WSAD, that at least allow generating these standard files (if not managing them, which is also important: change one signature, and you have to change several files). These plugins are provided by IGS. But nooo, the monkeys here prefer to create all of this by hand. It’s a painful sight.

  • …macaques who cannot fathom how one
    could write a client-server application that does not communicate through
    XML requests embedded in HTTP, but – o, horror! – actually has its own
    application layer protocol.

  • …baboons who think that patterns are not merely possible (albeit very good) approaches to problems (and indeed generalizations of good approaches to common problems that have arisen), but the only way to solve problems, and that they must be copied out of the book, or else it wouldn’t work. They wouldn’t know a pattern they haven’t read about if it bit them on that place their head is forever hidden in. If GoF didn’t write about it, it ain’t a pattern.

  • Ok, I am tired of enumerating primate species. I’ll
    just give an anecdote.

    I wrote a module used by several teams. Because of the ever-changing requirements, some methods and classes became useless. I gave a fair warning by email, then I gave a second one by marking them deprecated in the code. I noticed that the deprecated tags were periodically removed. I sent mail about this, and marked them deprecated again. And again. And again.

    A monkey who was the team leader of another team came complaining that
    I should remove it, because he cannot perform a build. Everyone else
    can,
    but he can’t, and so I should remove the single tag (that is probably
    more useful to the whole project than anything he’s ever
    produced). He cannot be bothered to find out how to make
    it work? Why can everyone else make it work? Oh, he’s using some Ant
    scripts? What? That’s an excuse? What the hell does that
    have to do with anything? Oh, he didn’t write those
    scripts. Well, write your own, or take them from those
    people for whom they work. Oh, you don’t have time? Well,
    I don’t have time to keep giving you warnings you just
    ignore, you twit.

    Screw you, I finally thought, the warning has been there for some
    time. I’ll just remove this stuff altogether.
    His build promptly crashed. “Not my problem – we talked about this
    over 5 weeks ago!”, I gloated, producing the emails from my
    appropriately named CYA folder.

    As Butch said, “that’s what he gets for fucking up my sport.”

In short, they are not
Joel’s kind of programmers,
to put it mildly. Monkeys see and monkeys do. They do not think. They
have been taught a way to do things, and it is beyond them to figure
out that there could be another way. I honestly do not think they
understand what a boolean is (I submit that in their mind there is an
if statement, and then there’s a boolean type)
when they write:


if (thingie.isOk()) {

    return true;

} else {

    return false;

}

Then someone they blindly trust (it must be an established authority: a manager, an instructor at a paid course, or a book/magazine, and then only one approved by an already established authority, because monkeys do not further their education on their own) tells them about the ternary operator. Now they write:


return thingie.isOk() ? true : false;

The above two examples are from actual production code.
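
For the record: since isOk() already yields a boolean, both versions collapse to a single return. A sketch, with `Thingie` and `Checks` as stand-in names for whatever the production code actually had:

```java
// Thingie is a stand-in for whatever object the original code checked.
interface Thingie {
    boolean isOk();
}

final class Checks {
    // Equivalent to both versions above: isOk() already yields a boolean.
    static boolean check(Thingie thingie) {
        return thingie.isOk();
    }
}
```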

Further, because monkeys do not think, they often reinvent the
wheel, badly. Which is also ironic, because they have been imbued with
all the right (and wrong) buzzwords, including “reuse”. I hesitate
to hazard a guess as to whether there is some meaning in their heads
they associate with this word, or is it just something they cry out
when playing free associations with their shrinks (“OO –
Encapsulation! Polymorphism! Reuse!”).

Here are some more anecdotes.

  • One programmer on a project wrote his own utilities to convert things
    from/to hex numbers, for crying out loud. Here is Java, the only thing
    he knows at all, and he can’t be bothered to think that maybe,
    just maybe, such a thing is a part of standard API.

  • This same monkey took several weeks to write a parser (for a very simple grammar, containing only certain expressions and operators such as ANDs and ORs). When I asked him why he didn’t use a parser generator (such as ANTLR, CUP or JavaCC), he replied that he didn’t know any of them. Now, it is not a crime not to know a particular technology, but surely a programmer must a) be aware that there are such things as parser generators, and b) be able to learn how to use one. Whether he lacked the understanding or the desire to learn, is this the kind of developer you want?

  • Background: we needed to create some scripts to export data from
    the database. The export was to be done under specific
    conditions, which were to be specified in the queries
    (for example, export dependent tables only if their parent
    tables are eligible for export). All the logic was in the
    SQL queries; the rest was just scripts passing these
    queries to the DB2 command line and logging everything.
    All of those were written by hand, with 80% of the time
    spent copying and pasting things and then looking for
    places where the pasted things needed to be changed a bit.
    For example, some things are exported several times into
    different IXF files, because they depend on different
    things. These files need to be numbered sequentially, so
    that one does not overwrite another. What do monkeys do?
    Number them by hand. Great.

    When I suggested automating things, in fact automating
    from the very first step, using the metadata to generate
    the queries themselves instead of writing them ourselves,
    I was looked at as if I had just escaped from a mental
    asylum.

    Monkey But you cannot just rely on metadata, there are also
    functional links which are not foreign keys.

    Me Why are they not foreign keys in the first place?

    Monkey Because they are functional.

    Me Stop using that word. Tell me, why are they not foreign keys?

    Monkey Because they are nullable.

    Me A foreign key can be nullable! Why is it not a foreign key?
    OK, whatever, that’s our DBA’s problem… But there’s a convention for
    functional keys anyway (they all start with SFK_). I’ll use that.

    Two days pass. My script works. A week later, they have problems
    with their original scripts. My approach works,
    demonstrably. But OK, they want to keep doing it their
    way, fine. They ask for help with their way: those scripts wrapping
    hand-made SQL queries (which are already being automatically
    generated, but I’ll hold off on that for now…)

    Monkey What are you doing?

    Me Writing a Perl script.

    Monkey But there is no Perl on Windows.

    Me See, I am sitting at a Windows machine and I have Perl.

    Monkey What is it for? I thought Perl was only for the Web?

    Me I am writing a script to generate your silly scripts
    from a small set of user inputs. The resulting files, which you
    are now doing BY HAND, are cluttered with repetitive stuff, such
    as error-handling code and file numbering, and it’s error-prone to do
    search and replace manually. So we’ll generate all these scripts
    using my script.

    Monkey But they don’t have Perl on their Windows.

    Me Who are “they”?

    Monkey The client?

    Me First of all, this is for the AIX machine. Second, this is
    not for them; we will just deliver the generated shell scripts. The
    Perl script is for us only.

    It takes several iterations for Monkey to get it.

    A day passes…

    Me Hey, where’s my Perl script I wrote to generate the import
    scripts?

    Monkey We have to have only shell scripts.

    Me Yes, I used that one to create those shell scripts, dammit!

    Monkey (sits writing these shell scripts again by hand; at the
    moment, manually changing some upper-case strings to lower-case)
    I removed it from CVS. They only want shell scripts on their machine.

    Me It wasn’t going on their machine! It’s only for us!!!

    Monkey Here, I changed these files already, you change the
    rest.

    Me (giving up) OK.

    Monkey Oh, and they have to be K-shell. Change them all to
    .ksh.

    Me Why do they have to be ksh? What’s wrong with sh? They are
    all very simple anyway, just call db2 import, check error status,
    that’s it.

    Monkey They have to be K shell. That’s what the DBA said.

    Me What the hell does the DBA have to do with it?

    Monkey He wants to be able to change them, and he doesn’t know sh,
    only ksh.

    Me Ok, fine. I suppose you’re right, echo is
    different in K-shell.

    Monkey misses the seething sarcasm. Of course.

    Monkey Right here.

    Me I don’t see them. What is this OAD_0035.ksh? Is that it?

    Monkey Yes.

    Me What does this mean? What do these numbers mean?

    Monkey That’s what they said they should be called.

    Me Who are “they”???

    Silence.

    Me OK, you have a script called OAD_0035.ksh calling
    OAD_0038.ksh, which in turn calls OAD_0038_1.ksh, OAD_0039_2.ksh, etc.
    Why are they called this? It’s hard to remember which one is which.

    Monkey Why do you want to know what it means?

    Me Because if I don’t know what it means, it’s much harder for
    me to look at the file and see what is supposed to be inside. Ah, I
    see you added the insightful comment inside each file with its
    meaningful name. Ah, I see also, you use that stupid name inside of it
    over and over again, to write to the logs, instead of just using $0.
    (deep breath). I’ll just create some symbolic links to them with
    meaningful names, so I know what’s going on…

    An hour later

    Me Where are my links?

    Monkey They only wanted files there that are named like
    OAD_0035, etc.

    Me What the hell do these numbers mean???

    Monkey I don’t know. For security.

    Me (pause) Who told you to do this?

    Monkey The client.

    Me The client is a company. Who have you met from the company?

    Monkey I don’t know. They said the client wants this.

    Me Who said? Where? When?

    Now I’m really curious. I turn with this question to
    others. Finally I come to the last monkey who knows.

    Monkey 5 Uh, the client has guidelines on what they are
    supposed to be called. It should start with OAD, and
    then underscore, then four characters.

    Me Why four characters?

    Monkey 5 For normalization.

    Me What normalization?! What can you possibly change in a
    luminous egg, I mean, normalize in a 20-line shell script?

    Monkey 5 So they can keep them consistent and do some things to
    all of them, regardless of what they are called.

    Me What can they possibly want to do with shell scripts? Rename
    them to some other numeric pattern? There isn’t even any method to
    how they are named; it’s not as if a certain number
    pattern means one script depends on another. You just
    named them randomly…

    Curtain

    But hey, fire one, and the replacement is easy to find. That’s true.
    I suppose Henry Ford would be proud, but isn’t this a backward
    approach? You don’t need monkeys at all; most of this work can
    be automated.

    Maybe I need another line of work. 🙂
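
A footnote to the hex anecdote above: the conversions in question are one-liners in java.lang. A minimal sketch (the class name HexDemo and the wrapper methods toHex and fromHex are mine, purely for illustration; the hand-written utilities from the project are not reproduced here):

```java
public class HexDemo {
    // Integer.toHexString is part of java.lang; no hand-written utility needed.
    static String toHex(int value) {
        return Integer.toHexString(value);
    }

    // Integer.parseInt with radix 16 does the reverse conversion.
    static int fromHex(String hex) {
        return Integer.parseInt(hex, 16);
    }

    public static void main(String[] args) {
        System.out.println(toHex(255));     // ff
        System.out.println(fromHex("ff"));  // 255
    }
}
```

The wrappers add nothing over the library calls; they are there only so the round trip is easy to see.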