Friday, June 5, 2009

Uncovering Myths about Globalization Testing - Input Validation Testing 2

This post continues my previous post on the same topic and is based on myths about globalization testing that I have run into in practice.
In my previous post, I talked about how the byte count of a character varies across the different Unicode encodings. Any tester reading this may have a few further questions:
- What is the right approach to coming up with localized test data?
- Once I have the test data, how do I know how many bytes each localized character occupies, given that I know which Unicode encoding is in use (UTF-8, UTF-16, UTF-32, etc.)?
The answer to the first question is a very broad topic in itself, and I plan to cover it in future posts.
The second question, i.e. working out how many bytes a character occupies in a given encoding, is equally interesting. I have found String Decoder, one of the general testing tools created by Bj Rollison, quite useful for this.
Bj Rollison's website has a good amount of detail, including user guides, about this utility for internationalization testing.
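For a quick sanity check when no tool is at hand, the same question can be answered with a few lines of script. The following is only a minimal Python sketch and has nothing to do with how String Decoder works internally; the function name byte_counts and the sample characters are illustrative assumptions. It uses the little-endian codec names so that Python does not add a byte order mark to the UTF-16 and UTF-32 counts.

```python
def byte_counts(text, encodings=("utf-8", "utf-16-le", "utf-32-le")):
    """Map each distinct character in text to its byte length per encoding.

    The "-le" codecs are used so no byte order mark inflates the counts.
    """
    return {
        ch: {enc: len(ch.encode(enc)) for enc in encodings}
        for ch in dict.fromkeys(text)  # de-duplicate, keep first-seen order
    }


# Hypothetical sample: Latin 'a', e-acute (U+00E9), a CJK ideograph (U+4E2D),
# and an emoji outside the Basic Multilingual Plane (U+1F600).
sample = "a\u00e9\u4e2d\U0001F600"

for ch, counts in byte_counts(sample).items():
    # Print the code point rather than the glyph so the output is console-safe.
    print(f"U+{ord(ch):04X}", counts)
```

Note that the emoji takes 4 bytes in UTF-16 because it is stored as a surrogate pair, which is exactly the kind of character worth including in input validation test data for length limits.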