From NHibernate to Entity Framework 6 - Part 1: the mapping story

This is the first part of an n-part series on using Entity Framework 6 coming from an NHibernate background. Although this series is primarily aimed at NHibernate veterans making the transition to Entity Framework 6, any developer new to EF will probably find most of this information helpful too.

Introduction

After having used NHibernate quite extensively for the past few years, I've just completed my first 6 months in the Entity Framework world.

I had actually already given Entity Framework a try back in the EF4 days. I abandoned it as it very quickly became clear that it wasn't ready for prime time. But I have to say that Entity Framework has come a long way since then. EF6 has become not just a usable but a very capable ORM. It's definitely now a very credible alternative to NHibernate.

The transition from NH to EF6 was fairly smooth as they are conceptually very similar. However, I had to spend quite some time researching many of the finer points before starting to feel confident with EF and happy to let it manage my data. Entity Framework's rather poor documentation didn't help much.

In the NHibernate world, you know that no matter what your question is, Oren Eini (who goes by the pseudonym Ayende Rahien) will have already written a clear, precise and concise blog post addressing it. And you know that it will show up as the first result when you google it. There is unfortunately no Oren Eini equivalent in the Entity Framework world. When you google an EF-related question, all you get are Stack Overflow questions. And to be completely honest, the answers there rarely surprise by their brilliance. As soon as you venture out of trivial territory, SO answers to EF-related questions tend to be tentative, incomplete, superficial, out-of-date or just plain wrong.

This series attempts to fill some of that void. If you consider yourself an NHibernate veteran and are taking on your first EF project, this should save you some time.

Database First, Model First, Code First, oh my

With Entity Framework, just like with every other ORM, your first task is to map your object model to your relational model. With EF, it's also the first hurdle: should you adopt a Database First, Model First or Code First mapping workflow?

Let's go through a very quick refresher on how NHibernate deals with this issue and see how it compares with Entity Framework.

Object-to-relational mapping the NHibernate way

With NHibernate, mapping your object model to your relational model is a fairly simple and logical process and one that you will see implemented in a very similar manner in just about every NHibernate-based application.

You start off by defining your domain model using plain C# classes (or POCO rather since it doesn't have to be C#). This part is unrelated to NHibernate and there should be nothing NHibernate-related in your domain model classes.

You then define the mappings between your domain model objects and their relational representations. You can do it either using XML (if you enjoy pain), using the built-in but undocumented fluent API or using the excellent fluent API provided by the third-party Fluent NHibernate library.

Once your mappings are defined, you can then just create a Configuration instance, add your mappings to this configuration object and call its BuildSessionFactory() method to create a SessionFactory instance. This is typically done once on application startup.

The SessionFactory instance in NHibernate is responsible for holding onto the compiled mappings and for creating new Session instances on demand. It's thread-safe and typically used as a singleton. If you were working against multiple databases, you'd have one SessionFactory instance per database, each containing the mappings for that database and able to create new Session instances that know how to generate and execute SQL queries against that database.

If needed, you can also create a new SchemaExport instance, passing in your Configuration object and call its Execute() method to generate a SQL script creating your database schema based on the mappings you defined and / or execute that script to create the database on-the-fly for you. This can be handy if you're working on a new application where the database hasn't been created yet or to keep an up-to-date database creation script under source control. E.g.:

// parameters: script = echo the generated DDL to the console. export = execute it against the database. justDrop = only run the DROP statements.
// The file set via SetOutputFile() is written either way.
new SchemaExport(configuration)
	.SetOutputFile(scriptPath)
	.Execute(script: true, export: false, justDrop: false);

Object-to-relational mapping the Entity Framework way: Code First

The NHibernate way of object-to-relational mapping is supported in Entity Framework and is what is referred to as "Code First". Code First mapping has been supported in Entity Framework since version 4.1.

In practice, and unless you have a really good reason to choose one of the other workflows, Code First will be the mapping workflow you'll use with EF. Just like with NHibernate's mapping model, Code First is simple, logical, flexible and it doesn't rely on Visual Studio-specific designers or on auto-generated code.

Creating your domain model

With Code First, you start off by defining your domain model using plain C# classes. Just like with NHibernate, your domain model should of course contain nothing EF-related.

Defining your object-to-relational mappings

Once your domain model is created, you can use Entity Framework's fluent mapping API to create your mappings. This API is very similar to that of Fluent NHibernate. So Fluent NHibernate users will feel right at home.

For each domain model class, you simply create a mapping class deriving from EntityTypeConfiguration<TEntityType> and add the mapping in the class' constructor. Here is an example mapping of a stereotypical User entity:

// Domain model
public class User
{
	public long Id { get; set; }
	public string Name { get; set; }
	public string Email { get; set; }
	public byte[] PasswordHash { get; set; }
	public byte[] Salt { get; set; }
}

// Entity Framework mapping
public class UserMapping : EntityTypeConfiguration<User>
{
	public UserMapping()
	{
		ToTable("Users");
		HasKey(m => m.Id);
		Property(m => m.Id).HasDatabaseGeneratedOption(DatabaseGeneratedOption.Identity);
		Property(m => m.Name);
		Property(m => m.Email).IsRequired().HasMaxLength(256);
		Property(m => m.PasswordHash).IsRequired().HasMaxLength(32);
		Property(m => m.Salt).IsRequired().HasMaxLength(32);
	}
}

Note that, unlike what we would have had to do with NHibernate, we didn't have to declare the properties in our domain model class as virtual. We'll get back to this later.

And as we'll see later, most of this mapping code is in fact unnecessary as Entity Framework uses convention-based mappings by default (but it doesn't hurt to specify it manually as we've done here if you prefer).

Initializing Entity Framework with your mappings

Configuring EF with your mappings is one of the points where I personally think EF gets confusing and has been poorly designed.

Entity Framework's equivalent of NHibernate's Session is DbContext. Both NH's Session and EF's DbContext have the same purpose, are used in very much the same way and share many of the same properties. We'll take a more detailed look at the similarities and differences between Session and DbContext in a future post but, for now, whenever you see DbContext, think Session.

Entity Framework however doesn't have any equivalent for NHibernate's SessionFactory. Instead, the responsibilities of both Session and SessionFactory are held by DbContext in the EF world.

So, like SessionFactory, DbContext is responsible for creating the compiled object-to-relational mappings, caching them for the duration of the application and providing them in a thread-safe manner to all DbContext instances that have to work against the database for which those mappings were created.

And like Session, DbContext is also responsible for providing a single-threaded, short-lived unit-of-work that manages the database connection and database queries for the duration of a single business transaction.

These are two very different responsibilities with completely different lifecycles and constraints that would really have been best implemented in two different classes. But so be it.

So how does it work?

With EF, you never instantiate DbContext directly. Instead, after having created your domain model and mapping classes, you must create a new class deriving from DbContext which will be configured with your mappings. It's then this DbContext-derived class that you will use as you would have used Session in NHibernate. Quick example that creates a DbContext configured with the User mapping we created earlier:

public class MyDbContext : DbContext
{
	public MyDbContext() {}

	public MyDbContext(string nameOrConnectionString) : base(nameOrConnectionString) {}

	protected override void OnModelCreating(DbModelBuilder modelBuilder)
	{
		base.OnModelCreating(modelBuilder);
		modelBuilder.Configurations.Add(new UserMapping());
	}
}

There are different ways to configure EF with your mappings but the method above is the most common method and the recommended one. Override the DbContext.OnModelCreating() method and add your custom mappings and other configuration values there.

Here we've specified our mapping class (UserMapping) explicitly. In practice, you'll most likely want to use the ConfigurationRegistrar.AddFromAssembly() method instead to register all the mapping classes found in a given assembly.
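As a minimal sketch of that approach, assuming EF 6 (where ConfigurationRegistrar.AddFromAssembly() is available), a reference to the EntityFramework package and mapping classes that live in the same assembly as the context, the OnModelCreating() override could look like this. The cut-down User and UserMapping classes are only there to keep the sample self-contained:

```csharp
using System.Data.Entity;
using System.Data.Entity.ModelConfiguration;

// Cut-down domain model, for illustration only
public class User
{
	public long Id { get; set; }
	public string Email { get; set; }
}

// Cut-down mapping class, for illustration only
public class UserMapping : EntityTypeConfiguration<User>
{
	public UserMapping()
	{
		Property(m => m.Email).IsRequired().HasMaxLength(256);
	}
}

public class MyDbContext : DbContext
{
	public MyDbContext(string nameOrConnectionString) : base(nameOrConnectionString) {}

	protected override void OnModelCreating(DbModelBuilder modelBuilder)
	{
		base.OnModelCreating(modelBuilder);

		// Registers every EntityTypeConfiguration<T> and ComplexTypeConfiguration<T>
		// class found in the assembly that contains MyDbContext,
		// so new mapping classes get picked up without touching this method.
		modelBuilder.Configurations.AddFromAssembly(typeof(MyDbContext).Assembly);
	}
}
```

If your mapping classes live in a separate assembly, pass that assembly instead, for example typeof(UserMapping).Assembly.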

What happens on DbContext instantiation

You're all done. You can now create an instance of your DbContext-derived class, passing in your database connection string or the name of the connection string to use in your app.config / web.config file and start using it to query your database:

using (var context = new MyDbContext("Server=localhost;Database=MyDatabase;Trusted_Connection=true"))
{
	var johns = context.Set<User>().Where(u => u.Name == "John").ToList();
}

which would be the equivalent of the following NHibernate code:

using (var session = sessionFactory.OpenSession())
{
	var johns = session.QueryOver<User>().Where(u => u.Name == "John").List();
}

The first time you create and use an instance of your DbContext-derived class, EF will automatically compile and cache your mappings. As part of this process, it will call the OnModelCreating() method. Strictly speaking, this work is deferred until the context is first used (for example by its first query) rather than done in the constructor itself. Either way, bringing up the first DbContext of its type within an app domain is a very expensive operation that will most likely involve opening a database connection. It's the equivalent of the Configuration.BuildSessionFactory() call in NHibernate.

Any subsequent instantiations will be very cheap operations as the cached compiled mapping will get re-used. In particular, the OnModelCreating() method won't get called and no attempt to connect to the database will be made (until you start querying the database that is). So subsequent instantiations of your DbContext-derived class are doing the equivalent of the SessionFactory.OpenSession() call in NHibernate. It's therefore perfectly acceptable to create instances of your DbContext eagerly, for example at the start of every web request in a web application, as you would do with NHibernate's Session.

Convention-based mapping

In NHibernate, convention-based mapping of the object model to the relational model isn't a common sight. The third-party Fluent NHibernate library does offer a convention-based mapping feature but it's not part of NHibernate itself, it's not enabled by default and it can be quite cumbersome to customize to suit your particular model.

Entity Framework on the other hand fully embraces convention-based mapping. It's built-in, it's enabled by default and, on the whole, it's rather nice.

In the example above, we explicitly mapped our User domain model object to its relational representation. But we didn't have to. Had we not specified an explicit mapping, EF would have mapped it automatically using its default conventions. Entity Framework's documentation contains an overview of the mapping conventions it uses.

This of course now raises the question: how does Entity Framework know which classes should be mapped? It does it by looking for DbSet<TEntityType> properties on your DbContext-derived class.

So in order to have a domain model class automatically mapped by EF, simply add a DbSet<TEntityType> property for that class to your custom DbContext. In our previous example, if we'd wanted to have our User model automatically mapped by EF, we would have used the following implementation for our custom DbContext:

public class MyDbContext : DbContext
{
	// Tell EF that the User class needs to be mapped
	public DbSet<User> Users { get; set; }

	public MyDbContext() {}

	public MyDbContext(string nameOrConnectionString) : base(nameOrConnectionString) {}

	// No need for the OnModelCreating() override anymore
	// as we're using convention-based mapping.
}

The first time that your custom DbContext class is instantiated, Entity Framework will use reflection to find all the DbSet<TEntityType> properties declared on your context and will automatically generate mappings for these domain model classes. If these classes contain references to other domain model classes, they will get mapped as well.

So if you happen to use DDD, you can limit yourself to declaring your aggregate roots as DbSet<TEntityType> properties on your DbContext. The child entities will be included in the mappings via the aggregate roots. Otherwise, declare all your domain model classes as DbSet<TEntityType> properties to ensure that they all get mapped.
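To illustrate, here is a small sketch with a hypothetical Order aggregate root and its OrderLine child entity (neither is part of the User example used in this post). Only Order is declared as a DbSet<TEntityType> property. OrderLine should still end up in the model because EF's conventions discover it through the Order.Lines collection:

```csharp
using System.Collections.Generic;
using System.Data.Entity;

// Aggregate root (hypothetical): exposed as a DbSet on the context.
public class Order
{
	public long Id { get; set; }
	public string Reference { get; set; }
	public ICollection<OrderLine> Lines { get; set; }
}

// Child entity (hypothetical): not exposed as a DbSet,
// but reachable from the aggregate root through Order.Lines.
public class OrderLine
{
	public long Id { get; set; }
	public string Product { get; set; }
	public int Quantity { get; set; }
}

public class OrdersContext : DbContext
{
	public OrdersContext(string nameOrConnectionString) : base(nameOrConnectionString) {}

	// The only entry point EF needs in order to find both classes.
	public DbSet<Order> Orders { get; set; }
}
```

Should you ever need to query the child entity directly, context.Set<OrderLine>() is still available even though no DbSet property was declared for it.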

Ad-hoc overrides of the convention-based mappings are as trivial as it gets. Simply create a mapping class that only contains the mappings for the properties you want to map explicitly. For example, say that we don't want to store the PasswordHash value of our User entity in the database for security reasons (we'd store it somewhere else instead) and that we still want to limit the width of its Email and Salt columns as we did in our explicit mapping earlier. We could just use this cut-down mapping class that only declares the mappings that differ from the default mappings:

public class UserMapping : EntityTypeConfiguration<User>
{
	public UserMapping()
	{
		Property(m => m.Email).IsRequired().HasMaxLength(256);
		Property(m => m.Salt).IsRequired().HasMaxLength(32);

		Ignore(m => m.PasswordHash);
	}
}

And then configure our DbContext with it:

public class MyDbContext : DbContext
{
	// Tell EF that the User class needs to be mapped
	public DbSet<User> Users { get; set; }

	public MyDbContext() {}

	public MyDbContext(string nameOrConnectionString) : base(nameOrConnectionString) {}

	protected override void OnModelCreating(DbModelBuilder modelBuilder)
	{
		base.OnModelCreating(modelBuilder);
		// Mapping overrides
		modelBuilder.Configurations.Add(new UserMapping());
	}
}

With this setup, the Id and Name properties of our User class will get mapped automatically, its Email and Salt properties are explicitly mapped and its PasswordHash property is explicitly excluded from the mapping.

Lazy loading and virtual properties in domain model classes

One thing that you quickly learn the hard way when working with NHibernate is that you must either declare all the properties in a domain model class virtual or disable lazy-loading for that domain model altogether. There's no middle ground.

Entity Framework is more lenient in this respect. You still need to declare properties that you want lazy-loaded (navigation properties or collections) as virtual of course so that EF can generate a dynamic proxy class that overrides those properties in order to provide the lazy-loading behaviour. But you can leave all the other properties as non-virtual if you wish.
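As a rough sketch of what this looks like in a domain model, here is a variation of our User class with a hypothetical Order entity added purely for illustration. Only the navigation properties are virtual:

```csharp
using System.Collections.Generic;

// Hypothetical entity, added only to give User something to navigate to
public class Order
{
	public long Id { get; set; }
	public string Reference { get; set; }

	// Reference navigation property: virtual so that EF's proxy can lazy-load it
	public virtual User User { get; set; }
}

public class User
{
	// Scalar properties can stay non-virtual with EF
	public long Id { get; set; }
	public string Name { get; set; }

	// Collection navigation property: virtual so that EF's proxy can lazy-load it
	public virtual ICollection<Order> Orders { get; set; }
}
```

Note that the classes themselves contain nothing EF-related. Also keep in mind that EF can only generate a proxy for classes that are public, not sealed and that have a public or protected parameterless constructor.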

There are more subtleties involved with virtual properties and dynamic proxy classes in Entity Framework. If you're curious, the blog of Arthur Vickers, one of Entity Framework's developers, is a good place to start.

Database creation script

Once you've created an instance of your custom DbContext, you can get it to generate a database creation script for you:

var sqlScript = ((IObjectContextAdapter)context).ObjectContext.CreateDatabaseScript();

As you can see in the DbContext source code, DbContext implements IObjectContextAdapter.ObjectContext explicitly, hence the ugly but necessary and safe cast. It's not clear why the EF developers felt the need to reduce the discoverability of this property.

By default, Entity Framework will automatically create the database for you the first time your DbContext is used if it doesn't already exist.
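This automatic creation is driven by the database initializer registered for your context type. The sketch below, which assumes EF 6 and uses a hypothetical EfStartup class that you'd call once on application startup, spells out the default initializer and shows in a comment how to switch automatic creation off, for example when the schema is managed by hand-written scripts:

```csharp
using System.Data.Entity;

public class MyDbContext : DbContext
{
	public MyDbContext(string nameOrConnectionString) : base(nameOrConnectionString) {}
}

// Hypothetical startup helper, to be called once before the first context is used
public static class EfStartup
{
	public static void Configure()
	{
		// The default Code First behaviour, spelled out:
		// create the database on first use if it doesn't exist yet.
		Database.SetInitializer(new CreateDatabaseIfNotExists<MyDbContext>());

		// Alternative: never let EF create or alter the database for this context.
		// Database.SetInitializer<MyDbContext>(null);
	}
}
```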


So that was Code First. I'll quickly go over Database First and Model First for the sake of completeness but you're unlikely to want to use them.

Object-to-relational mapping the Entity Framework way: Database First

Another way to map your object model to its relational representation in EF is Database First. It was the only mapping workflow available in the first version of Entity Framework.

With Database First, you first create your database. You then add an .edmx file to your project. .edmx files are very similar to NHibernate's XML mapping files. They contain your object-to-relational mapping in XML format. You can write the mappings by hand if you really want to.

Alternatively, you can use Visual Studio to create this file. VS will bring you through a wizard allowing you to connect to your database and add the relevant database tables.

VS will then generate the source code of your domain model classes based on the database tables you added. It will also generate the code of a custom DbContext that you can use to query your database. Finally, it will add a new connection string to your app.config / web.config file that points to the .edmx file. It's necessary for the generated DbContext to locate the XML mappings.

You can now just create an instance of the generated DbContext class and use it normally.

There's nothing fundamentally different between Code First and Database First. The only difference is that with the latter, the code for your domain model classes and DbContext will be automatically generated and the mappings will be defined in XML instead of in code using a fluent API. None of these things is a good thing in my book so I really wouldn't recommend using Database First.

Object-to-relational mapping the Entity Framework way: Model First

Model First was introduced in Entity Framework 4, which was the second version of EF.

Model First is identical to Database First, with the exception that you don't have to create a database first. Instead, you start by adding an empty .edmx file to your project. You can then use the Visual Studio designer to visually "design" your domain model.

Once you're done, VS will generate the code of your domain model classes and DbContext for you just like it would do with Database First. Also not a great approach in my book.

How to tell if an existing code base is using Database First, Model First or Code First

As we've seen earlier, there are no fundamental differences between the three mapping workflows in Entity Framework. In all three cases, you will end up with a set of domain model classes (auto-generated by Visual Studio if using Database First / Model First or written by you if using Code First) and with a DbContext-derived class that you will use to query the database (auto-generated by Visual Studio if using Database First / Model First or written by you if using Code First).

The main difference is the way the object-to-relational mappings are defined. Applications using Database First or Model First will have these mappings defined in XML in an .edmx file. Applications using Code First will either have no mappings defined (if the application relies entirely on convention-based mappings) or will have mappings defined in C# using Entity Framework's fluent mapping API.

So how does Entity Framework know which mapping workflow the application is using?

It's all down to the connection string the application provides when instantiating the DbContext-derived class.

If the application uses a standard database connection string such as:

Server=SERVER_NAME;Database=DATABASE_NAME;Trusted_Connection=true

...DbContext will start in Code First mode. In this mode, it will use reflection when first instantiated as we've seen earlier to discover any property of type DbSet<TEntityType> you've defined on the class and automatically create mappings for these entity classes. It will also call its OnModelCreating() method to register any explicit mapping the application may have specified.

If on the other hand the application specifies a connection string containing information about an .edmx file, such as:

metadata=res://*/Model1.csdl|res://*/Model1.ssdl|res://*/Model1.msl;provider=System.Data.SqlClient;provider connection string="data source=SERVER_NAME;initial catalog=DATABASE_NAME;integrated security=True;MultipleActiveResultSets=True;App=EntityFramework"

...DbContext will start in Database First / Model First mode. In this mode, it will not do any convention-based mappings and it won't call its OnModelCreating() method. It will instead rely entirely on the XML mappings defined in the .edmx file.

If you're interested, the EF documentation has more details on connection strings with Entity Framework.

So if you're wondering what mapping workflow an application you've inherited uses, look for the connection string.
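If you'd rather check this in code than by eye, the rule above boils down to looking for a metadata keyword in the connection string. The helper below is a hypothetical sketch of that heuristic (the class and method names are made up for this post). It is not how EF itself makes the decision internally and it expects a full connection string, not the name of one:

```csharp
using System.Data.Common;

// Hypothetical helper, not part of Entity Framework
public static class MappingWorkflowSniffer
{
	// Returns true if the connection string carries EDMX metadata
	// (Database First / Model First), false for a plain database
	// connection string (Code First).
	public static bool UsesEdmxMetadata(string connectionString)
	{
		// DbConnectionStringBuilder parses generic key=value pairs,
		// including quoted values such as "provider connection string".
		var builder = new DbConnectionStringBuilder { ConnectionString = connectionString };

		// Keyword lookups are case-insensitive.
		return builder.ContainsKey("metadata");
	}
}
```

Called with the two example connection strings shown above, it should return false for the first one and true for the second one.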

A complete example

This is the full source code of a small console application that uses Entity Framework Code First to query the stereotypical User model we've been using throughout this post and that demonstrates all the points we've covered above:

using System;
using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using System.Data.Entity.ModelConfiguration;
using System.IO;
using System.Linq;

namespace EntityFramework
{
	class Program
	{
		private const string ConnectionString = "Server=localhost;Database=EFExample;Trusted_Connection=true";
		private static readonly string DbCreationScriptPath = Path.GetFullPath("schema.sql");

		static void Main(string[] args)
		{
			// Get EF to generate a database schema creation script for us
			using (var context = new MyDbContext(ConnectionString))
			{
				File.WriteAllText(DbCreationScriptPath, ((IObjectContextAdapter)context).ObjectContext.CreateDatabaseScript());
				Console.WriteLine("Wrote database schema creation script to {0}.", DbCreationScriptPath);
			}

			Console.WriteLine();

			// Write to database
			using (var context = new MyDbContext(ConnectionString))
			{
				var user1 = new User()
				{
					Name = "User 1",
					Email = "email1",
					PasswordHash = new byte[0],
					Salt = new byte[0]
				};

				var user2 = new User()
				{
					Name = "User 2",
					Email = "email2",
					PasswordHash = new byte[0],
					Salt = new byte[0]
				};

				context.Set<User>().AddRange(new [] {user1, user2});
				context.SaveChanges();

				Console.WriteLine("Inserted User Id {0} in the database.", user1.Id);
				Console.WriteLine("Inserted User Id {0} in the database.", user2.Id);
			}

			Console.WriteLine();

			// Read from database
			using (var context = new MyDbContext(ConnectionString))
			{
				var count = context.Set<User>().Count();
				Console.WriteLine("Found {0} User records in the database.", count);
			}

			Console.WriteLine("Done. Press Enter to exit.");
			Console.ReadLine();
		}

		public class MyDbContext : DbContext
		{
			// Tell EF that the User class needs to be mapped using the default conventions
			public DbSet<User> Users { get; set; }

			public MyDbContext()
			{}

			public MyDbContext(string nameOrConnectionString)
				: base(nameOrConnectionString) {}

			protected override void OnModelCreating(DbModelBuilder modelBuilder)
			{
				base.OnModelCreating(modelBuilder);

				// Specify mapping overrides
				modelBuilder.Configurations.Add(new UserMapping());
			}
		}

		//-- Domain Model
		public class User
		{
			public long Id { get; set; }
			public string Name { get; set; }
			public string Email { get; set; }
			public byte[] PasswordHash { get; set; }
			public byte[] Salt { get; set; }
		}

		//-- Mapping overrides
		public class UserMapping : EntityTypeConfiguration<User>
		{
			public UserMapping()
			{
				Property(m => m.Email).IsRequired().HasMaxLength(256);
				Property(m => m.Salt).IsRequired().HasMaxLength(32);

				Ignore(m => m.PasswordHash);
			}
		}
	}
}

EF will create the database for you if needed the first time you run the application. If you want to create it beforehand, this is the schema:

CREATE TABLE [dbo].[Users]
    (
      [Id] [bigint] NOT NULL IDENTITY ,
      [Name] [nvarchar](MAX) NULL ,
      [Email] [nvarchar](256) NOT NULL ,
      [Salt] [varbinary](32) NOT NULL ,
      PRIMARY KEY ( [Id] )
    );

For comparison purposes, here is the full source code of the same application implemented with NHibernate and Fluent NHibernate:

using System;
using System.IO;
using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using FluentNHibernate.Mapping;
using NHibernate.Tool.hbm2ddl;

namespace NHibernate
{
	class Program
	{
		private const string ConnectionString = "Server=localhost;Database=EFExample;Trusted_Connection=true";
		private static readonly string DbCreationScriptPath = Path.GetFullPath("schema.sql");

		static void Main(string[] args)
		{
			var configuration = Fluently.Configure()
			                            .Database(MsSqlConfiguration.MsSql2012.ConnectionString(ConnectionString))
			                            .Mappings(m => m.FluentMappings.Add<UserMapping>())
			                            .BuildConfiguration();

			// Get NHibernate to generate a database schema creation script for us
			new SchemaExport(configuration)
					.SetOutputFile(DbCreationScriptPath)
					.Execute(script: true, export: false, justDrop: false);
			Console.WriteLine("Wrote database schema creation script to {0}.", DbCreationScriptPath);
			Console.WriteLine();

			var sessionFactory = configuration.BuildSessionFactory();

			// Write to database
			using (var session = sessionFactory.OpenSession())
			{
				var user1 = new User()
				{
					Name = "User 1",
					Email = "user1",
					PasswordHash = new byte[0],
					Salt = new byte[0]
				};

				var user2 = new User()
				{
					Name = "User 2",
					Email = "user2",
					PasswordHash = new byte[0],
					Salt = new byte[0]
				};

				using (var transaction = session.BeginTransaction())
				{
					session.Save(user1);
					session.Save(user2);
					transaction.Commit();
				}

				Console.WriteLine("Inserted User Id {0} in the database.", user1.Id);
				Console.WriteLine("Inserted User Id {0} in the database.", user2.Id);
			}

			Console.WriteLine();

			// Read from database
			using (var session = sessionFactory.OpenSession())
			{
				var count = session.QueryOver<User>().RowCount();
				Console.WriteLine("Found {0} User records in the database.", count);
			}

			Console.WriteLine("Done. Press Enter to exit.");
			Console.ReadLine();
		}


		//-- Domain Model
		public class User
		{
			public virtual long Id { get; set; }
			public virtual string Name { get; set; }
			public virtual string Email { get; set; }
			public virtual byte[] PasswordHash { get; set; }
			public virtual byte[] Salt { get; set; }
		}

		//-- Mappings
		public class UserMapping : ClassMap<User>
		{
			public UserMapping()
			{
				Table("Users");
				Id(m => m.Id).GeneratedBy.Identity();
				Map(m => m.Name).Length(10000); // i.e. force NVARCHAR(MAX)
				Map(m => m.Email).Not.Nullable().Length(256);
				Map(m => m.Salt).Not.Nullable().Length(32);

				// Intentionally not mapping the PasswordHash property
				// as we don't want it stored in the database.
			}
		}
	}
}

Note the single big difference between the NHibernate and the Entity Framework versions: in the NHibernate version, we started an explicit database transaction when inserting new records in the database to ensure that all the inserts were done in a single transaction.

We didn't have to do this with Entity Framework as SaveChanges() wraps all the pending changes in a transaction for us. This is one of the major differences in behaviour between NHibernate's Session and Entity Framework's DbContext. We'll cover this in more detail in a future post.
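As a quick preview, here is a sketch of the two situations, assuming EF 6 (where Database.BeginTransaction() is available). The UserImport class and its methods are hypothetical and the User class is cut down to keep the sample short. A single SaveChanges() call needs no explicit transaction. You only need one when several SaveChanges() calls must succeed or fail together:

```csharp
using System.Data.Entity;

public class User
{
	public long Id { get; set; }
	public string Name { get; set; }
}

public class MyDbContext : DbContext
{
	public MyDbContext(string nameOrConnectionString) : base(nameOrConnectionString) {}

	public DbSet<User> Users { get; set; }
}

// Hypothetical class, for illustration only
public static class UserImport
{
	public static void AddTwoUsers(string connectionString)
	{
		using (var context = new MyDbContext(connectionString))
		{
			context.Users.Add(new User { Name = "User 1" });
			context.Users.Add(new User { Name = "User 2" });

			// One SaveChanges() call: both INSERTs run inside a transaction
			// that EF opens and commits itself.
			context.SaveChanges();
		}
	}

	public static void AddTwoUsersInTwoSteps(string connectionString)
	{
		using (var context = new MyDbContext(connectionString))
		using (var transaction = context.Database.BeginTransaction())
		{
			context.Users.Add(new User { Name = "User 1" });
			context.SaveChanges();

			context.Users.Add(new User { Name = "User 2" });
			context.SaveChanges();

			// Both SaveChanges() calls become permanent together.
			// Disposing the transaction without committing rolls them back.
			transaction.Commit();
		}
	}
}
```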


In the next post in this series, we'll take a look at one point of confusion you'll hit early on when working with Entity Framework: the difference between ObjectContext and DbContext.