Verify hash function changes after update to Dynamics 365 Finance 2020 release wave 2
Jason Stone - Software Engineer, Dynamics 365 Finance
In Dynamics 365 2020 release wave 2, the hash function for the dimension framework in Dynamics 365 Finance has changed. This change may cause breaks in your production and test code in some scenarios. This blog post explains how to inspect your code for impacted patterns and provides guidance about how to validate this functionality after you upgrade, to help you avoid potential issues.
One of the changes in 2020 release wave 2 is an update to the hash functionality of the tables used by financial dimensions. The SHA1 hash function is being deprecated in all Microsoft products because researchers have shown it to be susceptible to collision attacks.
The financial dimensions framework in Microsoft Dynamics 365 Finance has always used SHA1 to generate hash keys. This was not for cryptographic purposes, but for quicker hash key lookups. Because SHA1 has been proven to allow collisions, we need to move away from it to a more reliable algorithm.
To find a new algorithm, we analyzed many non-cryptographic and cryptographic scenarios. We considered the following factors:
- Amount of data that can be compressed: The data stored in a hash key is composed in a hash message. The hash message for DimensionAttributeValueCombination is built by combining the DimensionHierarchy hash key and the hash keys for each DimensionAttributeValue record. DimensionAttributeSet hash keys can be as large as 12 segments, so the resulting message can be very large. SpookyHash can take in an input string of any length and compute a hash key from it, so it is able to handle very large hash message inputs. This is one reason we chose it.
- Storage size: We were interested in storage size because we already had a fixed field of 160 bits where the existing SHA1 hash key is stored. To lessen the impact on customers, we wanted to be sure that no data upgrade was needed to move older data into a new field. Because the output of SpookyHash fits into the existing hash key fields, this was a good match.
- Performance of hash generation: The performance of hash generation was also an important factor. We did not want to degrade performance by switching from SHA1. We found that SHA256 was 15% slower than SpookyHash. SHA1 was between 2.9 and 3.7 times slower.
- Non-cryptographic function: Another major consideration was that we prefer to use a non-cryptographic hash function so that it will never be confused with a cryptographic scenario.
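The storage-size factor above can be illustrated with a minimal Python sketch. It is not the Dynamics 365 implementation: `build_hash_message`, `store_in_field`, the zero-padding scheme, and the sample keys are all hypothetical, and because SpookyHash is not in the Python standard library, a 16-byte BLAKE2b digest is used purely as a stand-in for a 128-bit hash (the width of SpookyHash as originally published).

```python
import hashlib

FIELD_BYTES = 20  # the existing fixed 160-bit hash key field


def build_hash_message(hierarchy_key: bytes, value_keys: list) -> bytes:
    """Hypothetical composition: hierarchy hash key followed by each value hash key."""
    return hierarchy_key + b"".join(value_keys)


def old_hash(message: bytes) -> bytes:
    """SHA1 digest: 20 bytes, exactly the size of the existing field."""
    return hashlib.sha1(message).digest()


def new_hash_stand_in(message: bytes) -> bytes:
    """Stand-in only: 16-byte BLAKE2b, NOT SpookyHash (assumed 128-bit output)."""
    return hashlib.blake2b(message, digest_size=16).digest()


def store_in_field(digest: bytes) -> bytes:
    """Fit a digest into the fixed field; zero-padding is an assumption of this sketch."""
    if len(digest) > FIELD_BYTES:
        raise ValueError("digest does not fit in the existing 160-bit field")
    return digest.ljust(FIELD_BYTES, b"\x00")


# Hypothetical input: one hierarchy key plus 12 segment value keys.
hierarchy_key = hashlib.sha1(b"hierarchy").digest()
value_keys = [hashlib.sha1(f"value-{i}".encode()).digest() for i in range(12)]
message = build_hash_message(hierarchy_key, value_keys)

sha1_field = store_in_field(old_hash(message))
new_field = store_in_field(new_hash_stand_in(message))
```

In this sketch both digests fit the existing field, while a 32-byte SHA256 digest would be rejected by `store_in_field`, which is the kind of schema change the storage-size factor was meant to avoid.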
Patterns in your code
After your code is updated to the new release, it will generate Ledger dimensions, Default dimensions, and Dimension enum sets with a new hash key. Data from before the upgrade will be computed with SHA1 and data after the upgrade will be computed with SpookyHash. This means that when the Dimension APIs calculate a hash key for an older dimension combination or set, the new key will not match the stored one, and the APIs will create a duplicate combination or set. This duplicate will have a different record ID and a hash value that differs from the original. The two dimensions will look the same according to the display value. Over time, all new combinations will be created in the new format. In the meantime, you may encounter the following issues in your production and test code.
- Direct SQL generation of hash keys. There is no easy way to generate a SpookyHash key in SQL Server. The easiest way to generate one is through X++. Any code that generates a hash key from SQL Server will no longer generate the correct hash keys. Dimension copy/self referencing dimensions is a set of code that relied on this. We moved that code to call the same hash functions as the rest of the dimensions framework. If your code uses SQL Server hash key generation, it needs to be converted to use our hash APIs.
- Code that expects records to be returned in a certain order. The order in which records are returned can change. We found areas in production and test code that relied on a specific, implied order being returned from tables such as GeneralJournalAccountEntry when joined to the DimensionAttributeValueCombination table. If your code has a dependency on implied ordering, you may need to adjust it to ensure that you get the expected records.
- Direct manipulation of dimensions tables. Access to dimension framework tables should only be done through approved APIs and you should never directly manipulate a dimension framework table. Tables that start with DimensionAttribute are meant to be immutable and should only be maintained by the dimensions framework. If you have code in tests or production that violates this, you need to change or remove this code as soon as possible. Tests should use dimensions APIs properly to ensure they do not break with changes. This also ensures that you are properly testing what you intend to test.
- Code that queries ledger dimensions by display value. This is common in tests and potentially in custom partner code, where records returned may be the wrong dimension. With the new hash function, the older, outdated record might be returned instead of finding the new hashed record. As a best practice, it is never safe to look up ledger dimensions by display value because duplicate DisplayValues may exist. If your code needs to use this method, be sure to properly look up each record ID of each backing entity value and use the denormalized fields on DimensionAttributeValueCombination to do this lookup. This is because a main account could exist in multiple charts of accounts, or a dimension value such as customer could exist in more than one ledger and thus have different records. Because the data in DimensionAttributeValueCombination is not removed by company, but the values within are, it is important that your code finds the correct records for the companies you’re in. Failure to do this will result in corrupted dimension data.
- Code that expects RecIds of dimensions or default dimensions to match. Any code that expects RecIds of dimensions or default dimensions to match will fail, because the RecId of a record created with the old hash may not match the RecId of a record created with the new hash, even if the dimensions are the same. This is common in tests. To address this for LedgerDimensions, we added a helper in LedgerDimensionFacade called AreEqual(). This will allow your code to check if two dimensions are the same, even if they have different RecIds due to the hash change.
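Several of the patterns above come from the same effect, which the following toy model in Python tries to make concrete. It is a sketch under stated assumptions, not framework code: `Combination`, `CombinationStore`, `same_dimension`, and the sample segment values are hypothetical, BLAKE2b again stands in for SpookyHash, and `same_dimension` only mimics the idea of comparing dimensions by content. It does not describe how AreEqual() in LedgerDimensionFacade is implemented.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Combination:
    rec_id: int
    display_value: str
    segments: tuple   # backing values, in segment order
    hash_key: bytes


class CombinationStore:
    """Hypothetical find-or-create table keyed by hash key."""

    def __init__(self, hash_fn):
        self.hash_fn = hash_fn
        self.by_hash = {}
        self.next_rec_id = 1

    def find_or_create(self, segments):
        key = self.hash_fn("|".join(segments).encode())
        if key not in self.by_hash:
            # No record with this hash key exists, so a new one is created.
            self.by_hash[key] = Combination(
                self.next_rec_id, "-".join(segments), tuple(segments), key)
            self.next_rec_id += 1
        return self.by_hash[key]


def sha1_key(message):
    return hashlib.sha1(message).digest()


def new_key_stand_in(message):
    # Stand-in for the new hash function; NOT SpookyHash.
    return hashlib.blake2b(message, digest_size=16).digest()


def same_dimension(a, b):
    """Compare by content rather than by record ID (illustrative only)."""
    return a.segments == b.segments


store = CombinationStore(sha1_key)
old = store.find_or_create(["110110", "001", "Sales"])   # created before the upgrade
store.hash_fn = new_key_stand_in                          # simulate the upgrade
new = store.find_or_create(["110110", "001", "Sales"])   # same values, new hash key

# A lookup by display value now matches both records, so sort explicitly
# instead of relying on whatever order the store happens to return.
matches = sorted(
    (c for c in store.by_hash.values() if c.display_value == old.display_value),
    key=lambda c: c.rec_id,
)
```

In this model, `old` and `new` share a display value and the same segments but carry different record IDs, so a record ID comparison fails while `same_dimension(old, new)` succeeds, and a display-value lookup returns two candidates instead of one.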
In conclusion, the hash function for the dimension framework has changed in Dynamics 365 2020 release wave 2, potentially causing breaks in your production and test code in the above scenarios. Carefully inspect your code for the patterns discussed in this blog post. Also spend time validating this functionality after you upgrade, to avoid potential issues.
If something is not covered here or if you have questions, leave us a comment. We will keep this post updated when new patterns are discovered.