Drupal: Get Node Term Count Grouped By Parent Term in a Hierarchal Vocabulary

hierarchal_taxonomy One of the sites my company manages reviews media products for teachers. The review content type has a term reference to describe which educational subjects the product addresses. It’s a two level hierarchal vocabulary, as you can see from the screenshot of the taxonomy admin page.

Recently the stakeholders for this content type asked my PM if we could add a new field to select the top level subject that was considered the primary area that the product addresses. Adding a field seemed like overkill so I researched my options. My first thought was to allow content editors to sort the terms per node so that the first result was the primary subject. There are several modules out there, like this one that do this. But this would add another step of work for them. Instead, I had the idea that wouldn’t it be most likely that the top level term that had the most child terms selected would be the primary subject? For example, if we have the top level terms of Science, English, Math and they have 3, 6, 5 child terms selected respectively, then English would be the primary subject. I ran this idea by my PM and he thought it was worth exploring.

What this meant was that I needed to generate a report of each node’s top level subject terms and how many child terms that subject had selected. Much easier to do this in SQL than in a view! This query returns the nid, node title, top level subject name, and count of child terms for that top level subject. The field name for the term reference is field_term_subjects, as you can see from the query.

SELECT 
  f.entity_id, n.title, td_parent.name, COUNT(td.tid) 
FROM 
  field_data_field_term_subjects f 
  JOIN taxonomy_term_data td ON td.tid = f.field_term_subjects_tid 
  JOIN taxonomy_term_hierarchy h ON h.tid = td.tid AND h.parent != 0 
  JOIN taxonomy_term_data td_parent ON td_parent.tid = h.parent 
  JOIN node n ON n.nid = f.entity_id 
GROUP BY 
  f.entity_id, h.parent;