Random Ideas

This page was migrated from the old wiki:

no more files



Let's get rid of the file abstraction. In my opinion, the file is not something every application needs, and in a lot of cases it gets in our way. Take Eclipse, for example. Eclipse does a lot of things amazingly well. It loads the semantic model of programming objects into a form that it can reason with, and does cool things with it like dynamic compiling, refactoring, hints, etc. But why is Eclipse slow sometimes? Because it always has to synchronize with the filesystem. Whenever you make major modifications to the filesystem (deleting, editing or moving a large number of files), Eclipse takes a long time to adjust to the change, because it basically has to clear out all the knowledge it has about the code and reload from the new file structure. If Eclipse didn't have to sync with the filesystem it would be much faster, and its underlying programming model might be simpler too.

Basically, I think a simpler model would be to give the user access only to what the user interface allows him to do: "immerse in the user interface". The filesystem is nothing but a legacy system, something that came with the operating system. This is how it already works with web-based systems like Gmail. The user doesn't know which objects are files, and he doesn't need to. The underlying system may be persisted using files or, more likely, a relational database, but none of that really matters to the user. All he wants is an easy-to-use and efficient user interface that lets him do what he wants.

Now, I am not saying not to use files. I am just saying not to use the file abstraction. In other words, you could be using files for persistence, but your user doesn't need to know about it. In the web application realm this doesn't really apply, since most web applications use relational databases instead of files, but in the desktop realm this may be a new way of looking at applications. - toby 11:30, 19 February 2006 (CST)

code generation? how about no-generation-needed code generation?



I am of the opinion that if you have to write code generators to write programs for you because the programming tasks are too tedious and repetitive, it's probably because your programming language isn't expressive enough to let you factor out the common aspects of your code. For example, Java made a big leap when it introduced generics. Let's take a real-world example. I was asked to write a data access layer, full of DAO objects. Adhering to the Spring Framework frame of mind, there was to be a DAO interface for each entity. An example of one of the DAOs would be:
public interface CustomerDao {
    
    public List findAll();

    public void save(Collection customers);

    public void save(Customer customer);
    
    public void delete(Collection customers);

    public void delete(Customer customer);
    
    public void deleteAll();
    
    public int count();
}

Now, every DAO interface looks very much like this one. The only difference is the entity type, in this case Customer. The same situation applies to the DAO implementations as well as the unit tests. There were about 10 entity types in total. Yes, I could have just put Object in place of Customer and gotten myself generic DAOs, and even generic implementations. But that wouldn't be nice, because this is an API, and I want the exposed signatures to be type-meaningful. True, I could have just typed them all out, but normally even 3 is too many for me. So I wrote a very small and extremely simple template engine to generate these classes for me (later I found out about Velocity and FreeMarker; maybe I'll use those next time). Were I allowed to use Java 5 syntax, however, I could have done better. I could have written:
public interface Dao<T> {

    public List<T> findAll();

    public void save(Collection<T> objs);

    public void save(T obj);

    public void delete(Collection<T> objs);

    public void delete(T obj);

    public void deleteAll();

    public int count();
}

This looks much better, and there only needs to be one file (instead of 10). I can do this for the implementation too! There is one thing this doesn't give you directly, though: what if you wanted the CustomerDao to be special and have one extra findXXX method? Then the only thing you can do in Java is add another interface that extends Dao<Customer>. But anyway, the point is that one cool language feature could reduce your redundant code ten-fold. I think the way of the future is that, instead of there being lots of cool code generation tools, there should be much cooler languages that eliminate the need to do any code generation at all. - toby 15:47, 22 February 2006 (CST)
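To make the generic DAO idea concrete, here is a minimal sketch assuming Java 5 or later. The Customer class, the list-backed InMemoryDao, and the findByName method are all hypothetical names made up for illustration; they are not part of the original data access layer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class DaoSketch {

    /** Generic DAO contract: one interface serves every entity type T. */
    interface Dao<T> {
        List<T> findAll();
        void save(Collection<T> objs);
        void save(T obj);
        void delete(Collection<T> objs);
        void delete(T obj);
        void deleteAll();
        int count();
    }

    /** Hypothetical entity, present only so the sketch compiles. */
    static class Customer {
        final String name;
        Customer(String name) { this.name = name; }
    }

    /** Hypothetical list-backed implementation, written once for all entities. */
    static class InMemoryDao<T> implements Dao<T> {
        protected final List<T> rows = new ArrayList<>();
        public List<T> findAll() { return new ArrayList<>(rows); }
        public void save(Collection<T> objs) { rows.addAll(objs); }
        public void save(T obj) { rows.add(obj); }
        public void delete(Collection<T> objs) { rows.removeAll(objs); }
        public void delete(T obj) { rows.remove(obj); }
        public void deleteAll() { rows.clear(); }
        public int count() { return rows.size(); }
    }

    /** The "special" DAO: extends the generic contract and adds one finder. */
    interface CustomerDao extends Dao<Customer> {
        Customer findByName(String name); // hypothetical findXXX method
    }

    /** Reuses the generic implementation; only the extra finder is new code. */
    static class InMemoryCustomerDao extends InMemoryDao<Customer> implements CustomerDao {
        public Customer findByName(String name) {
            for (Customer c : rows) {
                if (c.name.equals(name)) {
                    return c;
                }
            }
            return null; // not found
        }
    }

    public static void main(String[] args) {
        CustomerDao dao = new InMemoryCustomerDao();
        dao.save(new Customer("Ada"));
        dao.save(Arrays.asList(new Customer("Bob"), new Customer("Cy")));
        System.out.println(dao.count());                // prints 3
        System.out.println(dao.findByName("Bob").name); // prints Bob
    }
}
```

The per-entity cost drops to one small interface and one small class, and only for the entities that actually need an extra method.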

Another example: Ruby on Rails. The reason I was so impressed by Rails isn't its code generation (which it does have) but rather its lack of code generation. In Rails, you use ActiveRecord to map objects directly to database tables, but you don't see one line of generated code devoted to listing table columns or the object properties they are mapped to. The relationships are all inferred at runtime. Now, if you were using Hibernate, you'd probably have to update 4 or 5 artifacts every time you add a column or change a column name. You may use code generation tools so that you only have to change one of the artifacts and then sync the rest, but it's still the same problem. In Rails, you don't have to do a thing: you just add the column to the physical database, and that's it. You can start accessing the new column in your code. Now that is some cool non-code-generation. - toby 15:58, 22 February 2006 (CST)
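As a rough Java analogue of mapping without a hand-maintained column list, the sketch below derives a table name and column names from a class by reflection at runtime. This is not how ActiveRecord works (it reads the schema from the database, which is the opposite direction); the naming rules, the Customer class, and the selectSql helper are assumptions made up for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

public class InferredMappingSketch {

    /** Hypothetical entity: adding a field here is the only change ever needed. */
    static class Customer {
        String firstName;
        String email;
    }

    /** Infers snake_case column names from instance fields, sorted for stable output. */
    static SortedSet<String> columnsOf(Class<?> type) {
        SortedSet<String> columns = new TreeSet<>();
        for (Field f : type.getDeclaredFields()) {
            if (f.isSynthetic() || Modifier.isStatic(f.getModifiers())) {
                continue; // skip compiler-generated and static fields
            }
            String snake = f.getName().replaceAll("([a-z])([A-Z])", "$1_$2");
            columns.add(snake.toLowerCase(Locale.ROOT));
        }
        return columns;
    }

    /** Builds a SELECT using a naive "lowercase class name plus s" table rule. */
    static String selectSql(Class<?> type) {
        String table = type.getSimpleName().toLowerCase(Locale.ROOT) + "s";
        return "SELECT " + String.join(", ", columnsOf(type)) + " FROM " + table;
    }

    public static void main(String[] args) {
        // prints: SELECT email, first_name FROM customers
        System.out.println(selectSql(Customer.class));
    }
}
```

Nothing outside the entity class lists its properties, so there is no second artifact to keep in sync.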

Components across all layers



With all this hype about annotations - which are basically just XDoclet integrated into the language - what is the problem they are really trying to solve? I think they came about because of the rise of the layered architecture in today's applications. Even before you start a project, it is broken down into the data layer, the business layer, and the presentation layer, and the layers have strict boundaries as to what code is allowed to reference what code. For example, usually only the business layer can reference the data layer, not vice versa, and only the presentation layer can reference the business layer, not vice versa. But sometimes components need to exist across all layers, such as DOs, or data objects. Most of the time, to reduce redundancy, the designers allow the presentation layer to reference the data objects too, even though they live in the data layer. The problem comes when the presentation layer needs more information about the data objects than is available from the data layer: for example, which attributes of the DOs need to be displayed to the client, or how to display a particular attribute of a particular DO. Because of the peculiar nature of most requirements, there are usually many special cases like these that the presentation layer needs to handle. So what people sometimes do is create a configuration object or XML file in the presentation layer that maps each attribute of each DO to what needs to be done to display it. What is bad about this is that you now have 2 separate listings of the same DOs and the same attributes, one in the data layer and another in the presentation layer, and if the business layer happens to have anything it needs to tag on, you would have 3 separate listings. So now, each time you add or modify a DO, you also have to modify 2 other configuration files. In my opinion, things that are logically together should stay physically together. 
XDoclet is designed to handle this kind of scenario for you. You put all of this information into the DOs themselves, but the business and presentation information goes inside comments in the code. Then, with some tool, you read in the DO source files, parse out the extra information in the comments, and generate the configuration files you need in the business and presentation layers. The big trick (or hack) is that because the extra information is in comments, the DOs, which are in the data layer, do not have to make any real reference to the business or presentation layers. When you really think about it, the reference is still made; it's just not made in a way that the compiler can pick up, and therefore "It's Okay", i.e. it does not violate the strict boundaries of the layers. This is a hack. Yes, in my opinion, XDoclet and annotations are both hacks. But they do highlight a weakness in the Java language and/or the architectural model we employ. As I said, things that are conceptually together should stay together, and that is what XDoclet and annotations try to achieve, but can't we do it in a way that is not... a hack? What alternatives do we have? Well, one of them, of course, is to simply get rid of the boundaries. Isn't the DO a component that's used across all layers of the application? Then the DOs should not be restricted to any single layer. Put all the presentation-related information into the DOs themselves, not as annotations or XDoclet tags but as real code, meta information, etc. The downside is that your DOs may become harder to reuse because of all the extra stuff they depend on. For example, if you use the display tag library, you may want to map each DO to the decorator you want to use with it. If you put that into the DO, and the next project you start is not even a web project, your DOs are either unusable as is or carry extra info that you don't care about. 
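For comparison, the annotation variant of the same trick might look like the sketch below: the presentation hint sits on the DO's fields, and presentation code reads it by reflection. The @Display annotation, the Customer fields, and the render helper are hypothetical names made up for illustration, not part of any framework mentioned here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class DisplaySketch {

    /** Hypothetical presentation hint, kept available at runtime for reflection. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Display {
        String label();
    }

    /** Hypothetical data object: fields without @Display are never shown. */
    static class Customer {
        @Display(label = "Name") String name = "Ada";
        @Display(label = "E-mail") String email = "ada@example.com";
        String passwordHash = "secret";
    }

    /** Presentation-layer code: builds "label: value" lines for annotated fields. */
    static List<String> render(Object dataObject) {
        List<String> lines = new ArrayList<>();
        for (Field f : dataObject.getClass().getDeclaredFields()) {
            Display hint = f.getAnnotation(Display.class);
            if (hint == null) {
                continue; // no presentation info: not displayed
            }
            try {
                f.setAccessible(true);
                lines.add(hint.label() + ": " + f.get(dataObject));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read field " + f.getName(), e);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Prints one line per annotated field; passwordHash is skipped.
        for (String line : render(new Customer())) {
            System.out.println(line);
        }
    }
}
```

Note that Customer still needs the Display type to compile, so the data layer's dependency on presentation concepts is real here too; it is only expressed declaratively.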
There's a cool feature in Ruby, which I guess is sort of old school in that I think it was originally a feature in C++. The feature: you can define things for the same class in different files, even if they are part of different libraries. This is cool because now we can define different aspects of the same DO in different layers (libraries) without interfering with other layers. But then we've also separated the info into separate files again =). What are we going to do? Well, one thought is to put these declarations into the same file, with markers indicating which declarations belong to which library (of course, we are now talking about a language that doesn't exist yet, as far as I know). But this is kludgy too. I mean, yes, you can filter out the parts that you don't need to use, but you still see them because they are in the same file. I think this is where the "no more files" idea comes into play (see the earlier entry). In an environment that doesn't let the file abstraction leak through, the extra information you don't need can be filtered out before you see it. You don't need to care about how the code is stored on disk, just about how it is organized for you. With that in mind, I think the BeanInfo stuff in Java would be better served by a language construct that allows you to define bean info inside the same file, rather than in a separate file. - toby 18:10, 26 March 2006 (CST)
