I have long been a fan of object-oriented programming. From the first day I borrowed a book on C++ from the local library, when I was about 10 or 11 years old, I was hooked. Objects were natural and convenient, and they kept code organized. Objects could be related to other objects, objects could contain lists of other objects, and object graphs were born. For a very long time, objects were king: the fundamental component of software. Even today, objects remain a fundamental aspect of software development. Objects and object graphs solve a lot of problems.
Object graphs don’t solve every problem, however. Sometimes, a rich, hierarchical object model with rich relationships between its participants can get in the way. Simple needs can become hugely complex when object graphs are involved. Resources such as CPU, memory, and bandwidth are needlessly consumed when object graphs are created and passed around even though only a subset or aggregation of the information they contain is actually required. With the complexity of modern applications, and the need to develop them cheaply with minimal developer resources, keeping things as simple and efficient as possible is becoming ever more important.
Data for Modification
When it comes to editing data, objects are still king. Creation, modification, or deletion of data is a fundamental part of just about every application. Such operations require detailed knowledge of the “true” structure of the data being edited. Knowing the hierarchical and relational associations between the entities in a graph is critical to safely and successfully creating, modifying, or deleting the data those entities represent while maintaining the integrity of that data. Rich object graphs are the ideal mechanism to store and represent data that needs to be modified.
Data for Display
Despite the value of objects for modification, the richness and detail of an object graph is a poor fit when data simply needs to be displayed. Data is most often stored in a relational structure: tables, keys, and references to keys of other tables. Pieces of information are isolated into discrete locations, aggregated into related sets, and stored in the most efficient manner possible to save space and maximize retrieval speed.
When we display data, however, we tend to display it in ways that make the most sense to humans, and those ways don’t often mirror the ways we store data or edit data. Data from multiple tables are joined, aggregated, and reduced into a flat set that represents exactly what humans are interested in. These flat result sets are best represented, not with a structured, composite object graph, but with a simple table, or a collection of simple non-related, non-structural objects. Simple, flat structures with minimal relationships are compact, efficient structures that may be quickly transferred from one tier to another in an application.
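As a rough illustration of that joining and aggregating step, here is a minimal sketch in Python using the standard sqlite3 module. The category and product tables, their columns, and the sample rows are hypothetical and exist only for this example. The point is that the database does the join and the aggregation, and only a small, flat set of rows comes back.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE product (
            id INTEGER PRIMARY KEY,
            category_id INTEGER NOT NULL REFERENCES category(id),
            name TEXT NOT NULL,
            price REAL NOT NULL
        );
        INSERT INTO category (id, name) VALUES (1, 'Books');
        INSERT INTO category (id, name) VALUES (2, 'Games');
        INSERT INTO product (category_id, name, price) VALUES (1, 'Novel', 10.0);
        INSERT INTO product (category_id, name, price) VALUES (1, 'Atlas', 30.0);
        INSERT INTO product (category_id, name, price) VALUES (2, 'Chess', 25.0);
    """)
    return conn


def category_summary(conn):
    """Return one flat row per category: (name, product count, total price)."""
    return conn.execute("""
        SELECT c.name, COUNT(p.id), SUM(p.price)
        FROM category AS c
        JOIN product AS p ON p.category_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name
    """).fetchall()


summary_rows = category_summary(build_demo_db())
for row in summary_rows:
    print(row)
```

No Category or Product objects are materialized here. The result is just a list of plain tuples, which is all a summary screen would need.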
Separating Display and Modification Concerns
The differences in the needs of data display vs. data modification dictate that they be approached differently when developing software. In a modern n-Tier application, the same data will often be transferred across a wire several times: Database to Business Layer, Business Layer to Client, Service to Service, etc. Transferring a collection of rich object graphs when only pieces of information from those graphs need to be displayed wastes resources across all the tiers of an application. Transferring a rich object graph to a view for editing is a different matter. It is essential, and therefore not wasteful, since all the details of the information being edited are required.
Older software development paradigms, such as ADO.NET with DataSets, natively support this separation of concerns. A simple DataTable may be retrieved with exactly the information that needs to be displayed. A DataSet containing the full relational set of information required for editing may be retrieved for edit. A DataSet will also track changes and make bulk persistence of all changes simple when it is sent back to the business layer. However, DataTables and DataSets are weakly typed, simple structures that don’t fully represent the domain of an application.
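To get a feel for the difference in what crosses the wire, the following sketch serializes a small, made-up object graph to JSON and compares its size with a flat display payload. The Product and Category classes, the sample data, and the choice of JSON as the wire format are all assumptions for illustration. In a real tier the flat rows would come from a query rather than from an in-memory graph.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Product:
    id: int
    name: str
    description: str
    price: float


@dataclass
class Category:
    id: int
    name: str
    products: list = field(default_factory=list)


def graph_payload(categories):
    """Serialize the full graph: every category with every product."""
    return json.dumps([asdict(c) for c in categories])


def display_payload(categories):
    """Serialize only what a category list screen would show."""
    rows = [{"category": c.name, "product_count": len(c.products)}
            for c in categories]
    return json.dumps(rows)


# Hypothetical sample data: one category holding fifty products.
catalog = [
    Category(1, "Books", [
        Product(i, f"Book {i}", "A fairly long description. " * 5, 9.99)
        for i in range(50)
    ]),
]

graph_bytes = len(graph_payload(catalog).encode("utf-8"))
display_bytes = len(display_payload(catalog).encode("utf-8"))
print(graph_bytes, display_bytes)
```

The exact numbers depend entirely on the data, but the graph payload grows with every product while the display payload grows only with the number of categories.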
Display vs. Modification with LINQ to SQL & Entity Framework
Modern software development paradigms involve rich domain models that accurately reflect a problem domain. Product objects represent products, Category objects represent categories, Categories contain collections of products, etc. Object graphs in a domain layer can be quite rich and complex. Things aren’t just sets of data; they are functional entities, replete with behavior, continuity, integrity and relationships. The complexity of these rich object graphs has given rise to the use of O/R Mappers, which serve as intelligent data access layers capable of dynamically generating SQL statements and materializing object graphs with minimal effort on the part of the developer.
LINQ to SQL is a basic, SQL Server-only O/R mapper from Microsoft. While it does not provide super rich modeling capabilities, cross-database support, or broad support for service-oriented applications, it does provide support for retrieving flat, non-structured data sets. L2S offers great support for “custom projections”, where specific pieces of information from various related entities in a domain model may be selected into an anonymous result set, or into a strongly typed result set using a custom class.
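The idea behind a custom projection can be sketched outside of .NET as well. The example below is Python with sqlite3, not LINQ to SQL, and the schema, the ProductListItem class, and the sample rows are hypothetical. It selects a few columns from two related tables, first as plain tuples (roughly the "anonymous" case) and then into a small named type (roughly the "custom class" case).

```python
import sqlite3
from typing import NamedTuple


class ProductListItem(NamedTuple):
    """Hypothetical display type holding only the projected columns."""
    product_name: str
    category_name: str
    price: float


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        category_id INTEGER NOT NULL REFERENCES category(id),
        name TEXT NOT NULL,
        description TEXT NOT NULL,
        price REAL NOT NULL
    );
    INSERT INTO category (id, name) VALUES (1, 'Books');
    INSERT INTO category (id, name) VALUES (2, 'Games');
    INSERT INTO product (category_id, name, description, price)
        VALUES (1, 'Novel', 'Not needed for the list view', 10.0);
    INSERT INTO product (category_id, name, description, price)
        VALUES (2, 'Chess', 'Not needed for the list view', 25.0);
""")

# Only three columns are selected; the description never leaves the database.
PROJECTION_SQL = """
    SELECT p.name, c.name, p.price
    FROM product AS p
    JOIN category AS c ON c.id = p.category_id
    ORDER BY p.name
"""

# "Anonymous" shape: plain tuples with positional fields.
anonymous_rows = conn.execute(PROJECTION_SQL).fetchall()

# Named shape: the same rows wrapped in a small, named type.
typed_rows = [ProductListItem(*row) for row in anonymous_rows]

for item in typed_rows:
    print(item.product_name, item.category_name, item.price)
```

Either shape carries only what the display needs. The named type simply makes the fields self-describing for the code that consumes them.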
Entity Framework is a rich, fully featured, cross-database O/R mapper from Microsoft with broad support for service-oriented applications. Entity Framework supports a wide range of entity to table mappings, giving it a degree of mapping flexibility that L2S just can’t approach. Unlike L2S, however, Entity Framework does not efficiently support custom projections. The current SQL statement generation pipeline makes the assumption that you are always querying for full entities or graphs of entities. The SQL generated for custom projections by EF can be quite inefficient, and at times, it can be incorrect and cause massive amounts of data to be processed by the database engine.
Right Tool for the Job
If you are attempting to simplify your projects, reduce overhead, and improve efficiency for your applications, make sure you pick the right tool for the job. Use object graphs when they provide a real benefit, and use simple, flat result sets when you don’t need the full richness of an object graph. Find and use tools that will support these patterns in an efficient way. LINQ to SQL may not offer rich mapping like Entity Framework, but it does support flexible querying and allows developers to retrieve the information they need with minimal resources. Methodologies like DDD encapsulate and isolate data access from the rest of an application through the use of Repositories. Some of the additional benefits of EF could be emulated internally in Repository classes, allowing L2S to be used in the interim until EF matures and offers better support for data display concerns.
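One way a Repository might keep the two concerns apart is to expose separate methods for editing and for display. The sketch below is a language-neutral illustration written in Python. The CategoryRepository interface, its method names, and the in-memory implementation are all hypothetical, and a real implementation would sit on top of an O/R mapper or hand-written queries rather than a dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, NamedTuple, Protocol


@dataclass
class Product:
    id: int
    name: str
    price: float


@dataclass
class Category:
    id: int
    name: str
    products: List[Product] = field(default_factory=list)


class CategoryListRow(NamedTuple):
    """Flat, display-only shape: no child collection attached."""
    category_id: int
    name: str
    product_count: int


class CategoryRepository(Protocol):
    """Hypothetical interface separating edit and display access."""

    def get_for_edit(self, category_id: int) -> Category: ...
    def list_for_display(self) -> List[CategoryListRow]: ...
    def save(self, category: Category) -> None: ...


class InMemoryCategoryRepository:
    """Stand-in implementation; a real one would query a database."""

    def __init__(self) -> None:
        self._store: Dict[int, Category] = {}

    def save(self, category: Category) -> None:
        self._store[category.id] = category

    def get_for_edit(self, category_id: int) -> Category:
        # Editing needs the full graph, so the whole entity is returned.
        return self._store[category_id]

    def list_for_display(self) -> List[CategoryListRow]:
        # Display needs only a few values per category.
        ordered = sorted(self._store.values(), key=lambda c: c.id)
        return [CategoryListRow(c.id, c.name, len(c.products)) for c in ordered]


repo = InMemoryCategoryRepository()
repo.save(Category(1, "Books", [Product(1, "Novel", 10.0),
                                Product(2, "Atlas", 30.0)]))

display_rows = repo.list_for_display()
editable = repo.get_for_edit(1)
print(display_rows)
print(editable.products[0].name)
```

Because callers depend only on the interface, the implementation behind it could be swapped later without touching the rest of the application.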