Archive for the “Java” Category

A few weeks ago Steve Yegge posted an article about code base size and its negative effect on projects.  While I agree that his example of a 500k LOC project is horrid, I’d split some hairs here and say that lines of code (or bloat, as he refers to it) isn’t the problem.  The problem is a horrible application architecture, and lines of code is just one of the symptoms.  Other symptoms may include slow execution, high memory consumption, lack of encapsulation, security vulnerabilities or a host of other issues.  I’m curious as to what political/managerial/architectural situation arose that allowed a single application code base to grow so large.

Do you have an application architect? Why didn’t the application get carved up into distinct loosely coupled systems?
Do you have sufficient business representation that prevents the “just add it to application X” problems?  How does the scope of an application grow so much with no oversight?

The application bloat I’ve seen typically takes three forms:

  1. Very little code reuse is taking place.  How many separate logging infrastructures do you have?  Using multiple ORM tools in the same project?  How long does it take a developer to understand all that?
  2. Reinventing the proverbial wheel.  This is loosely related to #1 but I’ll call it out here because it’s so important.  If you wrote any of the following from scratch for your project, you need to seriously justify your decision IMHO: ORM, logging, web framework. If your language of choice doesn’t already include sufficient choices for these utilities then you need to reconsider your platform choices.  Avoid the “not built here so I don’t trust it” syndrome.
  3. Feature-itis.  Knowing when to say “no” is a very valuable skill.  Knowing when to say “yes” conditionally is also important.  It’s ok to add that new feature but require some refactoring time so it doesn’t just add to the collective mess.  Think of architectural trade-offs as a karma-based system.  In other words, for every shortcut you take now you make things harder on yourself later.  For example: I can add your feature in 10 days with 40% code duplication (+ 20 days of cleanup later, and every other coding task has to slow down to work around this ugliness) or I can add it in 20 days with 0% code duplication (and little to none of the other slowdowns).  It’s acceptable to tell the business that the feature, if done properly, will take 20 days to implement.  In fact it’s your duty to recommend the slower approach.  If you have a boss who doesn’t understand those trade-offs you need to start looking for another job.
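
To make the arithmetic in the feature-itis example concrete, here is a small Java sketch.  It uses only the figures from that example (10 days plus 20 days of cleanup, versus 20 days with no cleanup).  The class and method names are made up for illustration, and the ongoing slowdown on other coding tasks is deliberately left out because the example gives no number for it.

```java
// Hypothetical illustration of the shortcut-versus-clean trade-off.
final class FeatureCost {
    // Total days = days to ship the feature + days of cleanup owed afterwards.
    static int totalDays(int buildDays, int cleanupDays) {
        return buildDays + cleanupDays;
    }

    public static void main(String[] args) {
        int shortcut = totalDays(10, 20); // fast now, duplication to clean up later
        int clean = totalDays(20, 0);     // slower now, nothing owed later
        System.out.println("Shortcut: " + shortcut + " days, clean: " + clean + " days");
    }
}
```

Even before counting the drag on every other task, the "fast" option comes out 10 days more expensive.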

I’ve seen (and worked with) all three kinds of mistakes and they all contribute to code bloat, but that isn’t their biggest problem.  Bringing new talented developers up to speed on the whole system is a monstrous adventure.  No single developer understands enough of the entire codebase to effect significant change.  The systems become so brittle over time that making small changes involves enormous regression testing challenges.

You are a developer.  It’s your duty to protect the integrity of the code base you are working on.  That’s part of what your company is paying you to do.  Yes, there will be pressures to deliver faster.  Yes, there will be pressures to do things that violate your developer principles.  That’s part of life.  Accept it.  Now that you can take a deep breath, what are you doing to protect your code base?  Do you have a solid architectural plan that you can show to your PHB that lets him know how this feature fits into the greater system?  Do you have at least some high-level design documents that allow you to justify your position?  Are you prepared to defend your position to your stakeholders?  “While it’s technically possible to add feature X in 10 days, that will have a negative net impact on the whole project and here’s why…”.
Also, don’t think that merely switching to or from a waterfall, agile or any other kind of shop will magically fix these issues.  It won’t.  I’ve seen both kinds of development shops make all of these mistakes.

The Colorado Software Summit this year was a blast. Bryan, Lisa and I learned enough to make our heads spin. We drove through a bit of snow and ice from Denver to Keystone but once we arrived we were graced with very beautiful and cold weather. I took a few pictures as usual. I wasn’t sure whether there would be an underlying theme like last year’s (SOA), but I would probably pin scalability and interoperability as the general topics of the event this year. Lots of people were talking about virtualization, concurrency in today’s multi-core environments, general scalability, and interoperability of web services, data formats etc.
Dan Pritchett from eBay had to be one of the more impressive speakers this year in my opinion. In many ways (but certainly not all) we share a lot of scale issues with eBay so hearing Dan talk about seemingly odd architecture decisions (and the slaying of various sacred cows) really resonates with me.  Just to pick a few oddball concepts they use in various forms at eBay:

  • No client-side database transactions.
  • Database partitioning and sharding.
  • 100% stateless application tier.
  • No two-phase commits. Don’t worry about the order or failures, clean them up later.
  • Only use stored procedures for ETL-like routines.
  • Use lots of small soldiers instead of large soldiers. (commodity hardware, not expensive branded stuff)
  • Virtualize wherever possible so horizontal scaling is easier.
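
As a rough illustration of the partitioning and stateless-tier ideas in the list above, the sketch below routes each key to a shard by hashing it. This is a generic technique, not eBay’s actual scheme (no code was shown in the talk); the class name, shard names and key format are all hypothetical.

```java
import java.util.List;

// Hypothetical shard router. It holds no per-request state, so any
// application server with the same shard list gives the same answer.
final class ShardRouter {
    private final List<String> shards;

    ShardRouter(List<String> shards) {
        if (shards.isEmpty()) {
            throw new IllegalArgumentException("need at least one shard");
        }
        this.shards = List.copyOf(shards);
    }

    // The same key always maps to the same shard.
    String shardFor(String key) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(key.hashCode(), shards.size());
        return shards.get(index);
    }

    public static void main(String[] args) {
        ShardRouter router =
                new ShardRouter(List.of("items-db-0", "items-db-1", "items-db-2"));
        System.out.println(router.shardFor("item:12345"));
    }
}
```

One caveat with plain modulo routing is that changing the number of shards remaps most keys, which is why larger systems tend to use a lookup table or consistent hashing instead.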

Dan sat with us at lunch one day and he’s a simple yet captivating speaker. I talked to him for a while after the Thursday night BoF he held on the eBay architecture. We talked about monitoring and I learned a few fascinating things that have really made me think differently about (or reaffirmed some existing hunches about) some of the architecture decisions I’ve made on many of the applications I have managed or currently manage.

Want a few crazy facts about the eBay architecture?

  • 1 Billion photos
  • > 1 Billion page views / day
  • 26 Billion SQL queries / day
  • 100 Million items for sale
  • 99.94% availability
  • 1 terabyte of log files generated per day
  • around 100 code branches are active at any time
  • 15,000 application servers + 300 database servers
  • 300 million front-end searches per day (plus tens of millions of internal searches) == more than Google

I also learned a bit more about Continuum, ActiveMQ and Groovy.

Matt Raible (as usual) gave a couple of great talks about Java web application frameworks. During Scott Davis’s talk on ATOM and REST, Bruce Snyder made a comment that he used JMS as a SOAP transport. Fascinating. Why don’t more people do that? All the message reliability you’d expect out of a message broker (guaranteed delivery, indemnity etc) with the contract of SOAP. Great idea. AMQP also seems like a great idea for message bus architectures. If you’re at all having to deal with bus interoperability, bus vendor lock-in or anything similar, give AMQP a shot.
I went to another Scott Davis talk on Grails. Grails is a new web application framework using Groovy that has borrowed a lot of rapid development techniques from Rails. The awesome thing about Grails is that it uses all of the leading Java tools like Spring and Hibernate. I have been playing around with it on and off for the last few months and I really like it so far. I’m really interested in seeing it mature more though.

Speaking of ATOM and REST, I like the philosophy of REST but I like that WSDL has basically standardized contracts for SOAP services. I’m not saying that I think WSDL is perfect by any means, but it works. I don’t usually need to read it because it’s automatically generated and automatically consumed by most implementations I manage. REST offers no such standard right now for contracts. In order to integrate with a RESTful service I either have to read the spec documents (assuming they exist) or I have to guess at the contract. Should I ask for “/San_Antonio” or “/San+Antonio”? Read the GData spec. It’s ATOM but it still prints out to dozens of pages. Then what data type is returned? I’d have to make a request, inspect the returned XML and build a parser. Hopefully the service supplies a schema. What a pain. And this is supposed to be better than SOAP? If a SOAP service provides a WSDL I just point my wsdl2java client to it (or the equivalent in whatever language or toolkit I’m using) and I’m literally integrating against that interface within a few seconds. The WSDL provides all the contract I need to work.
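
The “/San_Antonio” versus “/San+Antonio” question can be shown in a few lines of Java. The standard URLEncoder produces form-style encoding, where a space becomes “+”; whether a given RESTful service wants that, “%20” or an underscore is exactly the part I’d have to look up in its documentation. The class name and the three candidate styles below are only illustrative.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing three plausible spellings of one resource name.
final class PathGuess {
    // Form-style encoding (application/x-www-form-urlencoded): space -> "+".
    static String formStyle(String name) {
        return URLEncoder.encode(name, StandardCharsets.UTF_8);
    }

    // Percent-style encoding often seen in URL paths: space -> "%20".
    // A literal "+" in the input is already "%2B" here, so the replace is safe.
    static String percentStyle(String name) {
        return formStyle(name).replace("+", "%20");
    }

    // Underscore convention: space -> "_".
    static String underscoreStyle(String name) {
        return name.replace(' ', '_');
    }

    public static void main(String[] args) {
        String city = "San Antonio";
        System.out.println("/" + formStyle(city));       // /San+Antonio
        System.out.println("/" + percentStyle(city));    // /San%20Antonio
        System.out.println("/" + underscoreStyle(city)); // /San_Antonio
    }
}
```

All three are reasonable guesses, and without a published contract nothing tells the client which one the service expects.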

I also went to a talk by Gregor Hohpe of Google (and author of the awesome Enterprise Integration Patterns book) where he demonstrated a really cool and simple integration of the Google Mashup Editor and Yahoo Pipes. I’ve been using Yahoo Pipes for several months and as a few coworkers can attest, I really love it. Think of it as all the fundamental UNIX utilities (pipes, sed/awk, grep, tee, sort, split etc) but for RSS/ATOM feeds. I have it take a handful of very chatty RSS/ATOM feeds, combine them, filter for certain topics and spit them out as a single feed. That mini-app took all of 45 seconds to create. Gregor’s example took a Google Calendar feed, fed it into Yahoo Pipes, did a little massaging of the format, added Geo Location data and spit the result back out to a feed. That feed he sucked into the Google Mashup Editor (new service, I just got my account today) and integrated it with a Google Map so he could see the location of his calendar events. It was a really slick lightweight integration technique.
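
The combine, filter and re-emit pipeline I described maps fairly naturally onto Java streams. The sketch below is only an analogy for what Yahoo Pipes does in its visual editor: it does no network or XML handling, feeds are reduced to lists of titles, and the class name, feed contents and keyword are invented for illustration.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical stand-in for a feed pipeline.
final class FeedPipe {
    // Merge several feeds, keep entries that mention the topic,
    // drop duplicates and sort: roughly cat | grep -i | sort -u.
    static List<String> combineAndFilter(String topic, List<List<String>> feeds) {
        String needle = topic.toLowerCase(Locale.ROOT);
        return feeds.stream()
                .flatMap(List::stream)
                .filter(title -> title.toLowerCase(Locale.ROOT).contains(needle))
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> feedA = List.of("Groovy news", "Snow report");
        List<String> feedB = List.of("Grails and Groovy tips", "Groovy news");
        // Prints [Grails and Groovy tips, Groovy news]
        System.out.println(combineAndFilter("groovy", List.of(feedA, feedB)));
    }
}
```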

Speaking of Google, the Google Gears demo was very cool. I know of a few people who are already getting interested in this project.

I’m headed to the Colorado Software Summit again this year. If I learn half as much as last year I’d still think it was worthwhile. I have been to a number of other software development and techie conferences and this one is hands down my favorite. Here are a few quick reasons why:

  1. No vendor booths. No pressure, no commercialism, no spin. Just tech.
  2. Every session is worth taking notes in. They are very techie focused. I bring my laptop and between sessions find myself playing with the stuff I just learned about.
  3. Several of the speakers wrote books that I own or software that I use. Perhaps that speaks to the quality of the presenters or maybe to my taste in books and software.
  4. Keystone, Colorado is incredibly beautiful. We received 18 inches of snow during the conference last year.

Last year many of the session topics covered Service Oriented Architectures and various tools often used to facilitate them. Message busses, web services, mashups etc. This year there looks to be a bunch of topics on web frameworks, architecture (SOA, scaling, monitoring, grids etc) and data. Good stuff.

Here are a few links from last year:

So Bryan and I had a 6:34am flight out of San Antonio to Denver this morning. Let’s just say, we were supposed to have a 6:34am flight. We get to the airport and are in the ticket line at roughly 5:35am. Cool, an hour before the flight leaves and a very early flight, no problem. Except that the only airline with a really long line…yep you guessed it, United. At just after 6:00am we finally get to the counter and the electronic kiosk won’t accept our check-in. The lady working the counter says that we can’t check in because we weren’t at the counter 45 minutes before the flight. ?? Um, we were in line an hour before. Too late, they were understaffed and it would take too long for the TSA guys to screen our bags and we couldn’t fly without our bags (TSA regulations), so we were being bumped. Only problem, the next flight only had one seat available. So I took the 8:40am flight to Denver while Bryan took the only seat on the 12:45 flight. His flight, however, was delayed out of L.A. and he didn’t leave San Antonio until 3pm or so. What a day.

The nice girl at Enterprise felt so sorry for us that she “upgraded” us to a P.T. Cruiser. What a strange and underpowered car.

Keystone is incredible though. There’s about a foot of snow on the ground and it’s about 27° outside. I took a few pictures but it was too late by the time we got here to get much of the resort. I’ll try to take more in the morning.

I can’t wait for the conference tomorrow. There are so many sessions that I think Bryan and I are going to split up and cover as much as we can.