Weird bug in 6.6.2 with field value groups

We have data that matches some of the predefined labels that are groups in the system, but when we import the data it creates a new FIELD_VALUE instead of using the predefined one. Nothing breaks until I try and group things. When I tried to group F as Female it made an entry in the FIELD_VALUE_GROUP table with 51981 as the PARENT_ID instead of 48 as the PARENT_ID. Manually changing this in the DB fixes the problem.

I don’t know if the problem is that our import with Female is creating a new FIELD_VALUE (51981) instead of using the existing FIELD_VALUE (48), or if the problem is with the Grouping feature picking the wrong PARENT_ID of the the two entries with a label of Female.

I can recreate this if anyone wants to see it.

MariaDB [emerse]> select *, cast(is_group as int) from FIELD_VALUE where FIELD_ID = 8;
±------±---------±---------±-------------±--------±----------------------+
| ID | FIELD_ID | IS_GROUP | LABEL | COUNT | cast(is_group as int) |
±------±---------±---------±-------------±--------±----------------------+
| 51985 | 8 | | *Unspecified | 103 | 0 |
| 45 | 8 | | F | 0 | 0 |
| 51981 | 8 | | Female | 6289184 | 0 |
| 46 | 8 | | M | 0 | 0 |
| 51982 | 8 | | Male | 5198652 | 0 |
| 51984 | 8 | | Nonbinary | 422 | 0 |
| 47 | 8 | | U | 0 | 0 |
| 51983 | 8 | | Unknown | 7329 | 0 |
| 48 | 8 | | Female | 0 | 1 |
| 49 | 8 | | Male | 0 | 1 |
| 50 | 8 | | Unknown | 0 | 1 |
±------±---------±---------±-------------±--------±----------------------+

I’m actually wondering, would it be safe to stop tomcat, delete all the rows from FIELD_VALUE and FIELD_VALUE_GROUP and then turn Tomcat back on and update caches, etc. so that it’s all fresh data sourced from our system?

Ok, more info. The UI allows (and can even encourage depending on your data) two different FIELD_VALUE entries to have the same LABEL when one is a group and one isn’t, and this leads to a world of pain.

Yeah, I can confirm this is a bug. Looks like we don’t specify that the group is a group in the DB, and the value is a value in the DB. I can release this as 6.6.4. I look around and see if there are any similar bugs too. Thanks for catching this!!

I released this as 6.6.4.

As far as clean-up goes, there’s nothing particularly simple that you can do. You could clear out the FIELD_VALUE and FIELD_VALUE_GROUPING tables, and then go to each field you want and re-populate the values from Solr by hitting the update button. However, if you have any groupings, you’ll have to manually make those again, which might be pretty easy if you still have the old TSVs.

You can tell if there’s an issue in the mapping if the PARENT_ID of any mapping maps to a FIELD_VALUE row which doesn’t have IS_GROUP set. Basically, the invarients the code expects for those tables are:

  1. Rows referenced by FIELD_VALUE_GROUPING.PARENT_ID have IS_GROUP = true
  2. Rows referenced by FIELD_VALUE_GROUPING.CHILD_ID have IS_GROUP = false
  3. Rows referenced by FIELD_VALUE_GROUPING are in the same field (same FIELD_ID)
  4. For each row of FIELD_VALUE where IS_GROUP = true, there is at least one FIELD_VALUE_GROUPING with PARENT_ID = ID in that row. In other words, no groups are empty.

So, if you find violations of these rules, then you probably have a problem where EMERSE may not work.

With our data we don’t really need groupings. I removed those and everything is working really well now.

Thanks,

Eric