Efficient Magento code – database (Flat vs EAV). Part 1.

Care to share?

Creating stores in Magento is a constant strive for performance. It can be improved thanks to caching tools (e.g. Varnish, Redis – on the caching layer), server load balancing or smart database management. Still, the best performance is achieved with a well-written code.

Headless eCommerce - Download FREE Book

In this article I am going to focus on the database. In Magento, we use two configurations of ORM for database communication, that is the flat (flat table structure) and EAV (Entity Attribute Value). Frequently, the developers perceive no difference between the two types. The truth is, however, that if it is not understood correctly, the code is exposed to significant efficiency reduction.

The article shall discuss both configurations, specifying the advantages and disadvantages of each. I will also include the SQL functions executions during the standard queries to the database which should solidify your view of the more useful configuration in a specific case.

ORM – Object Relational Mapping

ORM is a software development technique used for converting various data types into objects and vice versa. In Magento, ORM takes place by means of models in two configurations:

  • Flat – flat table structure
  • EAV (Entity Attribute Value)

Data retrieval or saving in both ORM types is either identical or very similar in Magento which may make you think there is no difference in the access or committing the data. The differences are, however, significant as you will surely notice further on.

Even the inexperienced developer knows what flat table structure is. But what is EAV?

EAV (Entity Attribute Value)

EAV is a development pattern, saving data depending on the value type. The data distribution is presented in the diagram below:

EAV is a development pattern, saving data depending on the value type. The data distribution is presented in the diagram below:

Source: http://www.magentocommerce.com/knowledge-base/entry/magento-for-dev-part-7-advanced-orm-entity-attribute-value

The diagram depicts EAV table configuration describing goods in the store.

The first table contains a set of product entities (entity_id) that is base product indexes with the type (entity_type_id) indicating the entity type in EAV structure, together with base keys (e.g. for products this can be their SKUs).

The middle table contains all the attributes found in Magento. By filtering the tables by the entity type (entity_type_id) we can learn all the possible product attributes (e.g. price, dimensions, colour etc.).

The tables to the right present the final values of a given attribute. The column where a given value is saved depends on the value type. That is when you wish to record “colour” product attribute, you should record it in catalog_product_entity_varachar table as we know that the “colour” attribute values are of the string type.

To describe a product in the flat table structure, we could convert the above diagram into one table with the number of columns corresponding to the number of attributes required. So why EAV?

Flat vs EAV

EAV possesses something the flat table is deprived of, that is flexibility. At any time of the store development we can add any number of new attributes which will be new records in eav_attribute table. In the flat table, you would have add a new column to the table.

When you look at all that from the Magento perspective, EAV is an excellent choice as various stores or applications may have different characteristics and thanks to EAV you can play with the attributes with no need to redesign the database.

What is more, flat tables have a specific maximum number of columns (e.g. inoDB can have max. 1000 columns) which limits the maximum number of attributes. In EAV, this is not a problem.

EAV, however, has another significant drawback, that is performance. When you wish to get to a specific attribute, you must get data from several tables which is connected with the need to make a couple of SQL queries in Magento. When we also reach for additional attributes with another data type, new JOINs will appear in the queries. When we add an inexperienced developer who retrieves much data in batches via a load in a loop, the linear capacity fall is certain.

Let’s look at both types in practice.

ORM – flat table structure

ORM – flat table structure

Let’s create a flat table containing information on the hotel guests:

The table contains basic information on the guests and we will use only two types of data, that is INT and VARCHAR.

INT and VARCHAR

Next, let’s fill in the table with data:

and retrieve the data for the second guest:

divante_training/residents

The above code refers to divante_training/residents model and retrieves the data for the second guest using the load method. The result of executing the above Magento code is displaying the name and surname of the guests:

ORM

ORM has done everything for us. Let’s us not trust it and check what queries were executed:

ORM – EAV

As we see, ORM created a simply database query, nothing surprising.

ORM – EAV

Now, let’s create an analogous table structure presenting hotel guests but this time in the EAV context.

analogous table structure
analogous table structure
analogous table structure

As we see above, this time I have had to create three tables: The first will contain guest’s entities (indexes) while the second and third will contain the values depending on the data type (INT → residents_entity_int, and VARCHAR → residents_entity_varchar).

An alert reader should have noticed we miss one more table with attributes (eav_attribute – described while discussing EAV theory). That was, however, created earlier together with the entire Magento database as all EAV Magento table types use it (products, orders etc.).

Let’s fill in the tables with the data analogous to the ones in the flat table.

EAV Magento table
EAV Magento table
EAV Magento table

and let’s retrieve the same guest:

flat table

This time we used divante_barcamp/residents model and retrieved the data for 14th guest:

flat table

As we see, both the reference context and the result is identical but what ORM did is different:

flat table

Retrieving data, ORM executed several queries and in the last of them we see the operation of combining the results for two queries by means of SQL UNION ALL command.

It could not have been different as ORM had to retrieve EAV entity type from tabeli eav_entity_type table, then data from our major residents_entity table, information on the attributes we need (eav_attribute), and finally the values of those attributes (residents_entity_int and residents_entity_varchar).

At this stage, the performance decrease can be seen for EAV with an identical number of database queries.

In the next part, I’ll make the situation even more complicated, by retrieving multiple data and committing the gravest of Magento developer’s sins. Can you feel the chills already?

Go to Efficient Magento code – database (Flat vs EAV). Part 2.

Published December 3, 2014