Currently, almost all UW-Madison ancillary systems receive data from the Human Resources System (HRS) as ASCII (American Standard Code for Information Interchange). ASCII supports only a small set of characters: the unaccented letters A-Z (upper and lower case), the digits 0-9, and some special characters. It is limited to the English language and does not support international characters, most symbols, or the diacritical marks in words borrowed from other languages, which limits accessibility and the use of other languages. For example, ASCII cannot represent the é in résumé. Data encoded in ASCII will be referred to as formatted data in Person Hub. Under ASCII, 1 byte = 1 character.
Workday will make data available as UTF8 (Unicode Transformation Format – 8-bit). UTF8 can represent ASCII text as well as international characters, such as Chinese characters, and accented letters like the e-acute in the name Beyoncé. UTF8 data may also be referred to as unformatted data. Under UTF8, 1 character does not always use 1 byte: a single character can take from 1 to 4 bytes.
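To make the difference between characters and bytes concrete, the following is a minimal Python sketch. The helper name `describe` is hypothetical and is not part of Workday, HRS, or IAM tooling; it only uses the standard `str.encode` and `str.isascii` methods.

```python
def describe(text: str) -> dict:
    """Report character count, UTF-8 byte count, and whether the text is pure ASCII.

    Note: len() counts Unicode code points, which matches what most people
    call "characters" for the precomposed letters used in these samples.
    """
    return {
        "characters": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "is_ascii": text.isascii(),
    }


if __name__ == "__main__":
    # Plain ASCII text uses 1 byte per character; accented and Chinese
    # characters use more than 1 byte each in UTF-8.
    for sample in ["resume", "résumé", "Beyoncé", "中文"]:
        print(sample, describe(sample))
```

A value is safe for an ASCII-only system only when `is_ascii` is true; for everything else the byte count can exceed the character count.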
Data Elements Impacted
Unformatted data/UTF8 is most likely to impact name data, including legal name and name in use (formerly preferred name). Any other data field from Workday could also be impacted, but this is less likely. Address data is the only data field where unformatted data/UTF8 will not be accepted.
Because unformatted data/UTF8 will have the greatest impact on name, there will be 4 versions of name made available from Workday via IAM integrations:
- Legal name with unformatted/UTF8 characters and 1,024-byte limit
- Legal name with formatted/ASCII characters and 30-character limit
- Name in use (formerly preferred name) with unformatted/UTF8 characters and 1,024-byte limit
- Name in use (formerly preferred name) with formatted/ASCII characters and 30-character limit
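When building test data for the four name versions, it can help to check sample values against the stated limits. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes the limits are applied exactly as listed above (bytes for the unformatted versions, characters for the formatted versions). How the integrations themselves enforce these limits is not described here.

```python
UNFORMATTED_MAX_BYTES = 1024  # limit listed for the unformatted/UTF8 name versions
FORMATTED_MAX_CHARS = 30      # limit listed for the formatted/ASCII name versions


def fits_unformatted(name: str) -> bool:
    """True if the name's UTF-8 encoding is within the 1,024-byte limit."""
    return len(name.encode("utf-8")) <= UNFORMATTED_MAX_BYTES


def fits_formatted(name: str) -> bool:
    """True if the name is pure ASCII and within the 30-character limit."""
    return name.isascii() and len(name) <= FORMATTED_MAX_CHARS


if __name__ == "__main__":
    for name in ["Desiree", "Désirée", "A" * 31]:
        print(repr(name), fits_unformatted(name), fits_formatted(name))
```

Note that the two limits use different units, so a name that fits one version does not automatically fit the other.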
Start Testing Now
Ancillary system owners need to start testing now:
- Determine whether your systems can ingest unformatted/UTF8 data.
- If the system can ingest unformatted/UTF8 data, define a strategy for how users will search for data within your systems after the implementation of unformatted/UTF8 data.
- Consider whether the fields in your system have enough room (character or byte limit) for the new data.
- Example: Désirée is 7 characters long, but requires 9 bytes to store because each é character requires 2 bytes to represent in UTF8.
- If possible, set up or configure a test environment for your ancillary system.
- A test environment will be invaluable to test new data from Workday.
- Ancillary systems with a final disposition of “N/A” that consume identity data from IAM infrastructure are impacted by these changes and will need to test.
- Update your system’s test plans (or create a test plan) to take into consideration the known data changes.
- Ancillary system owners are responsible for creating and executing test plans for each ancillary system. Manifest/Grouper group owners are also responsible for creating and executing test plans for their groups. IAM will make the data available, but IAM does not know how each ancillary system works.
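Two of the checks in the list above, field capacity and search behavior, can be prototyped in a few lines. The following Python sketch is one possible approach, not a recommended or official one. The function names are hypothetical, and it assumes your storage limit is measured in bytes (some databases measure in characters instead, so confirm this for your system).

```python
import unicodedata


def fits_in_field(value: str, max_bytes: int) -> bool:
    """True if the UTF-8 encoding of value fits in a field of max_bytes bytes."""
    return len(value.encode("utf-8")) <= max_bytes


def search_key(text: str) -> str:
    """Build an accent-insensitive, case-insensitive key for matching.

    The text is decomposed (NFKD) so that an accented letter becomes a base
    letter followed by a combining mark; the combining marks are then dropped
    and the result is case-folded.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


if __name__ == "__main__":
    # The Désirée example: 7 characters, 9 bytes in UTF-8.
    print(fits_in_field("Désirée", 7))  # a 7-byte field is too small
    print(fits_in_field("Désirée", 9))  # a 9-byte field is large enough

    # A user typing "desiree" can still find the stored name "Désirée".
    print(search_key("desiree") == search_key("Désirée"))
```

Storing a key like this alongside the original value lets users search without typing accents while the display value keeps its UTF8 characters. This folding only removes combining marks; letters without a decomposition (for example ø) and non-Latin scripts pass through unchanged, so whether it is adequate depends on your users and data.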
IAM is actively working on updates to infrastructure to allow for ancillary system owner testing. IAM will:
- Communicate with system owners directly when testing options and data become available
- Update IAM test environments with unformatted/UTF8 data
- Ask ancillary system owners if their system can consume unformatted/UTF8 data and support those needs moving forward