ORM Considered Harmful

I don’t buy object-relational mapping. It’s not because JPA or its implementations aren’t well written, but because the concept itself robs capability in exchange for conveniences we don’t really need.

Relational databases have a lot to offer. SQL is a rich language that can perform complex queries, bulk data updates, computations, summaries, and more. It can address not just rows but individual columns, and it can join tables even if they weren’t formally set up to be joined. By contrast, ORM takes a limited view of the database as a series of objects to be manipulated individually and atomically. Create, read, update, delete.

When I interview a Java developer with an ORM background, I ask them to explain how they’d do some specific things:

  • Update one property of an entity, or a series of entities that meet a criteria. I’m looking for a solution that does not involve reading the entire entity from the database, updating the property in memory, and writing the entire entity back.
  • Produce a report that involves grouping and summarizing data (for example, total sales per region for a given quarter). I prefer solutions that don’t involve creating a report-specific Java class for every such report.
  • Read a row within a transaction in such a way that its data is guaranteed not to change for the duration of the transaction. What I want is the equivalent of the SELECT … FOR UPDATE statement in SQL. I’m still not sure if this is strictly possible using just ORM features.

Most ORM developers struggle with these tasks. But, what else can we use if not ORM? Surely we can’t just go back to using SQL queries and plain old JDBC, right?

Taming JDBC

JDBC is ridiculous. You get a connection, create a statement, apply parameters, execute it, and iterate over a result set. You pull properties from the result one at a time and do something with them. Everything throws an exception, and everything must be carefully closed to avoid resource leaks. You end up with 20 or 30 lines of code to do one SQL query. If we did that for every query, we’d have an endless morass of code on our hands.

But our job as programmers is to simplify, refactor, and reuse. In a production application, we can hide the details of JDBC in our own functions. We can use generics, callbacks, reflection, and other language facilities to reduce the amount of JDBC overhead per query. I have done just this on many projects.

I won’t present my solution in detail because what I wrote is specific to my projects. It is intended to be a design pattern rather than a reusable library. But I’ll show some of the end results in a moment.

Result Mapping

Building objects from database results is a mainstay of Enterprise Java. We assume that without an ORM, we’d be reduced to writing mountains of tedious, error-prone code like this:

customerID = resultSet.getInt("customer_id");
customerName = resultSet.getString("customer_name");
customerEmail = resultSet.getString("customer_email");

In fact, it’s pretty easy to automate these assignments using Java reflection. You can infer the property names from the column names, or write your own annotations to customize them. I typically put the functionality into the base class of my model objects, and access it via a constructor that takes a ResultSet.

Yes, it’s some effort, but it’s the kind of thing you can do once and reuse for an entire project, or even an entire organization. You can make it work exactly the way you want, and you can easily override the default behavior and process ResultSets manually when necessary.

Putting It All Together

Here’s an example of what a data access function might look like in my code:

public List<Appointment> getAppointments(int patientId) throws SQLException {
    String sql = "SELECT * FROM appointment WHERE patient_id=?";
    return this.queryForList(sql, Appointment.class, patientId);
}

There’s nothing here that isn’t straightforwardly related to the task at hand. It works because the “queryForList” function encapsulates the complexity of JDBC: it knows how to create a Statement out of the SQL string, how to apply the paramters (passed as Varags so more than one is possible), and how to create instances of the indicated class from a ResultSet. It does all of this using code I wrote myself, without relying on an ORM tool.

What if the result doesn’t map one for one with a Java object? In these cases, I use a different function that has a callback. It gives me access to the ResultSet without having to manage any of the other JDBC objects:

public Map<Integer,BigDecimal> getTotalsByProduct() throws SQLException {
    String sql = "SELECT product_id, SUM(amount) as amount FROM receipts GROUP BY product_id";
    Map<Integer,BigDecimal> totals = new HashMap<>();
    this.queryWithPopulator(sql, new Populator() {
	@Override
	public void nextResult(ResultSet rs) throws Exception {
	    totals.put(rs.getInt("product_id"), rs.getBigDecimal("amount"));
	}
    });
    return totals;
}

I could have solved the problem by creating a Java object with “productId” and “amount” properties, but I didn’t really need any such object, so I chose to put the results directly into a HashMap. I use the same technique to handle one-to-many relationships (inner JOINs) by writing code that creates a new object only when the ID of the outer table changes.

Think For Yourself, Write Your Own Code

The methods outlined here are not theoretical. I’ve used them sucessfully in large, professional projects. I also work with ORM when I’m on teams where ORM was chosen by another architect, so I appreciate what it can do. But I feel more comfortable knowing that I can write any SQL statement at any time, and process its ResultSet any way I want. Programming is about writing code, and sometimes that means building up our own code libraries that best suit our projects. We should not shy away from it.