On the Introduction of Grayscale Testing (Grayscale Release)
Editor’s Lead: Product managers and R&D teams can use grayscale testing (grayscale release) to let a selected group of people use a product or application before it is officially released to everyone, find possible problems early, and avoid hurting the user experience. So how is grayscale testing (grayscale release) conducted? In this article, the author introduces how grayscale testing (grayscale release) is done and how it differs from A/B testing. Let us take a look together.
The tip text below the input box on WeChat's Change WeChat ID page reads “WeChat ID is a unique certificate for your account. It can only be set once.” When a friend told me that the WeChat ID can be modified, I checked and found that different phones displayed different tip texts. The text on my phone says “It can only be set once,” while the WeChat account on my friend's phone shows “WeChat ID is a unique certificate for your account. It can only be changed once per year.”
What causes this difference? This article answers the question by explaining grayscale testing (grayscale release).
1. What Is Grayscale Testing (Grayscale Release)?
Grayscale testing (grayscale release) means selecting a specific group of people to try a product or application before it is officially released, and then gradually expanding the number of trial users, so that problems can be discovered and corrected in time.
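One common way to "select a group and gradually expand it" is to hash each user ID into a stable bucket and compare the bucket against the current rollout percentage. The sketch below is only an illustration under that assumption; the function name `in_grayscale`, the `salt` value, and the user IDs are hypothetical and do not describe how WeChat or any other product actually does it.

```python
import hashlib


def in_grayscale(user_id: str, rollout_percent: int, salt: str = "new-feature") -> bool:
    """Return True if this user falls inside the current grayscale percentage."""
    # Hash the salted user ID so the same user always lands in the same bucket.
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < rollout_percent


# Hypothetical user IDs, used only to show how the trial group grows.
users = [f"user-{i}" for i in range(1000)]
for percent in (1, 10, 50, 100):
    selected = [u for u in users if in_grayscale(u, percent)]
    print(f"{percent}% rollout -> {len(selected)} of {len(users)} users")
```

Because the bucket depends only on the user ID and the salt, raising the percentage only adds users; anyone already in the trial group stays in it.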
2. Why Do We Need to Conduct Grayscale Testing?
When I first came across grayscale testing, my first question was what it is. After searching Baidu, I understood the concept. My second question was why products should be grayscale tested at all.
Let me first give a general overview of the product development process, and then answer my question of why products should be grayscale tested and what the purpose is.
Before a product is launched, it goes through product testing. This raises the question of why grayscale testing is needed when product testing already exists.
In fact, the goal of testing is to find and fix bugs, and testing does not start only after development is complete. From the moment the team first takes on a project, the colleagues in charge of testing need to think about what to test and how, and communicate repeatedly with the R&D personnel and the product manager.
Once this communication is complete, QA writes the test points, process, and test methods down as “test cases,” and then confirms them with the product manager and R&D personnel, so that bugs that should have been found are not missed because of misunderstandings or omissions in the requirements. After the testers find a bug, the R&D staff fix it.
After the bugs are fixed, the product manager confirms that the product behaves as intended. Once the product is confirmed to have no known bugs and to meet expectations, it can be launched.
But QA is not omnipotent, and it is impossible to test every situation. This is why grayscale testing is still needed. As the book mentions, some bugs appear only on phone models that very few users own, and others are triggered by conditions such as phones heating up in summer or running too slowly.
All in all, the purpose of grayscale testing is to serve as a backup that helps tired, hard-pressed product managers and R&D teams find problems quickly and correct them before the official full-scale rollout, so that a poor user experience does not turn users against the product. If that happened, the efforts of the R&D team would be in vain, which is why grayscale testing is so important.
3. How to Conduct Grayscale Testing (Grayscale Release)?
Now that we know why grayscale testing is needed, the next question is how to do it. Product managers do not need to run grayscale tests in person, but having thought about why it is needed, we will naturally want to know how it is done. Reading books and searching Baidu is a quick way to find out.
First, we need to select users. The users for a grayscale test should be chosen on the principle of randomness, and they fall into two categories.
• The first category is classic (typical) users. The defining feature of this type of user is that they have no unusual characteristics. For example, Xiaomi “enthusiasts” are typical Xiaomi users and can represent the group that uses Xiaomi phones the most.
• The second category is extreme users. We choose this category because we need to detect extreme situations. For example, a pair of sneakers is designed for jogging, but someone wears them to climb mountains. People who push certain functions of the product to the extreme are extreme users.
Second, we should choose an appropriate comparison method. There are two ways to compare data: chronological comparison (the same metric before and after the change) and comparison between different user groups.
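The two comparison methods use the same arithmetic and differ only in which data is fed in. The sketch below illustrates this with a simple success rate; the function names and the sample counts are hypothetical and chosen only for illustration.

```python
from typing import Tuple

Counts = Tuple[int, int]  # (successful operations, total operations)


def rate(successes: int, total: int) -> float:
    """Share of successful operations; 0.0 when there is no traffic."""
    return successes / total if total else 0.0


def chronological_change(before: Counts, after: Counts) -> float:
    """Same users, same metric, measured before and after the change."""
    return rate(*after) - rate(*before)


def group_difference(control: Counts, grayscale: Counts) -> float:
    """Same period, same metric, measured on two different user groups."""
    return rate(*grayscale) - rate(*control)


# Hypothetical counts for illustration only.
print(chronological_change(before=(950, 1000), after=(970, 1000)))
print(group_difference(control=(9000, 10000), grayscale=(180, 200)))
```

A positive result means the metric improved after the change (or in the grayscale group); a negative result means it got worse.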
4. Difference Between Grayscale Testing and A/B Testing
Definition of A/B testing: an A/B test creates two (A/B) or more (A/B/n) versions of a web or app page or process, lets visitor groups of the same (or similar) composition (the target groups) access these versions at random over the same period, collects user experience data and business data for each group, and finally analyzes the results to identify the best version, which is then officially adopted.
Grayscale testing (grayscale release) means that after the system test passes, the test version is released to the online environment, replacing the code on part of the online servers for pre-testing. After the grayscale test is over, the online version is unified.
Essentially, it is a pre-launch test to collect user feedback.
Example: Zhihu has upgraded the playback format of its video resources but does not know whether the new version has problems. Zhihu can use configuration distribution to make a portion of Zhihu app installations play videos in the new format, then monitor the playback success rate and freeze rate, and roll back immediately if there is a problem.
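The "monitor, then roll back if there is a problem" step of this example can be sketched as a simple threshold check. This is a minimal illustration, not Zhihu's actual monitoring: the `PlaybackStats` shape, the function `should_roll_back`, the thresholds, and the sample numbers are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PlaybackStats:
    """Counters collected by monitoring for one video format (hypothetical shape)."""
    plays: int
    successes: int
    freezes: int

    @property
    def success_rate(self) -> float:
        return self.successes / self.plays if self.plays else 0.0

    @property
    def freeze_rate(self) -> float:
        return self.freezes / self.plays if self.plays else 0.0


def should_roll_back(old: PlaybackStats, new: PlaybackStats,
                     max_success_drop: float = 0.01,
                     max_freeze_rise: float = 0.01) -> bool:
    """Roll back when the new format is clearly worse than the old one."""
    success_drop = old.success_rate - new.success_rate
    freeze_rise = new.freeze_rate - old.freeze_rate
    return success_drop > max_success_drop or freeze_rise > max_freeze_rise


# Hypothetical monitoring data: the new format succeeds less and freezes more.
old_format = PlaybackStats(plays=10_000, successes=9_900, freezes=100)
new_format = PlaybackStats(plays=1_000, successes=950, freezes=40)
print(should_roll_back(old_format, new_format))
```

Comparing against the old format rather than a fixed target keeps the check meaningful even when overall playback quality varies with network conditions.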
A/B testing means that after the system test is passed and the product is released, different users of the same feature see different implementations, and feedback is collected from each group of users.
Essentially, it is a post-launch test to collect user feedback.
Example: Zhihu changes its skin during the Spring Festival. We can put the two candidate skins into the market and observe which one gets the most user clicks and the longest user stay time.
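The skin example boils down to randomly assigning each visitor a variant and then comparing clicks and stay time per variant. The sketch below assumes a hypothetical `Visit` record and made-up numbers; it only shows the bookkeeping, not a statistical significance test.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Visit:
    """One recorded visit (hypothetical shape)."""
    variant: str          # "A" or "B"
    clicks: int
    stay_seconds: float


def assign_variant(rng: random.Random) -> str:
    """Randomly assign a visitor to skin A or skin B."""
    return rng.choice(["A", "B"])


def summarize(visits: List[Visit]) -> Dict[str, Dict[str, float]]:
    """Average clicks and stay time per variant."""
    grouped: Dict[str, List[Visit]] = {}
    for visit in visits:
        grouped.setdefault(visit.variant, []).append(visit)
    return {
        variant: {
            "clicks_per_visit": sum(v.clicks for v in group) / len(group),
            "avg_stay_seconds": sum(v.stay_seconds for v in group) / len(group),
        }
        for variant, group in grouped.items()
    }


# Made-up visits for illustration only.
visits = [
    Visit("A", clicks=2, stay_seconds=30.0),
    Visit("A", clicks=4, stay_seconds=50.0),
    Visit("B", clicks=1, stay_seconds=20.0),
    Visit("B", clicks=1, stay_seconds=40.0),
]
print(assign_variant(random.Random(0)))
print(summarize(visits))
```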
Summary: in A/B testing, both versions are usable, and there is no difference between the user groups they are delivered to; users in effect choose the more popular version. The outcome is therefore not fixed in advance: in the end either A or B may go online.
In grayscale testing, it is not yet known whether the grayscale version is usable or whether it has serious bugs. The target user group may be selected and restricted (for example, releasing only to low-end Android phones), and monitoring determines whether there is a problem. The goal is clear: as long as the grayscale version is fine, its share keeps increasing until it reaches full volume.
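The "restrict the target group, then keep increasing the volume while monitoring is healthy" flow can be sketched as follows. The ramp steps, the rule that treats 4 GB of RAM or less as "low-end", and the function names are all assumptions made for illustration.

```python
RAMP_STEPS = [1, 5, 20, 50, 100]  # hypothetical rollout percentages


def is_target_device(device: dict) -> bool:
    """Hypothetical targeting rule: only low-end Android phones."""
    return device.get("os") == "android" and device.get("ram_gb", 0) <= 4


def next_percent(current: int, healthy: bool) -> int:
    """Move to the next ramp step while healthy; drop to 0 (roll back) otherwise."""
    if not healthy:
        return 0
    for step in RAMP_STEPS:
        if step > current:
            return step
    return 100  # already at full volume


# Walk through a rollout in which monitoring stays healthy at every step.
percent = 0
while percent < 100:
    percent = next_percent(percent, healthy=True)
    print(f"grayscale version now serves {percent}% of target devices")
```

If monitoring reports a problem at any step, `next_percent` returns 0, which corresponds to rolling the grayscale version back.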
You may be wondering: isn't the WeChat ID a small thing in WeChat? Why spend so much time on grayscale testing? Let's look at it differently. WeChat has hundreds of millions of users, and no two of them may have the same WeChat ID. If the feature went online without grayscale testing, problems such as duplicate WeChat IDs and failed modification requests could occur.
When we deliver products to users on a large scale, grayscale testing is indispensable. Unlike the usual project process, a grayscale release requires us to prepare for improvements in advance: we find points to improve through data analysis and log analysis, and quickly locate and solve problems.