How to Playtest: An outline

I’m a user experience researcher who has worked at a video game studio for the last five years. I have planned, conducted, and analyzed over 60 multi-day playtests, some remote and many in person. This guide will take you through each step, from ideation to analysis.

Laurel Brown
Mar 6, 2022

This is just an outline of my user testing workflow. It is not an exhaustive guide to methods or analysis.

First, assess stakeholder needs. Then define objectives and create a framework. From the objectives, build a measurement guide and translate its measurements into surveys, observation guides, telemetry events to capture, and so on. Recruit participants, do any necessary technical setup, then conduct the test, analyze the data, and review the report with stakeholders.

A drawing of the playtest cycle: Plan, Conduct, Analyze, Review, Repeat

1. Assess stakeholder needs

Maintaining relationships with stakeholders is key!

This is the first step in the collaborative research process. Maintaining working relationships with stakeholders can sometimes be difficult. I have experienced resistance from stakeholders who aren’t fully bought into the research process or would rather gather data themselves. I cannot emphasize enough how important it is to build trust and rapport with your teams.

I recommend creating a flexible roadmap for your project of what and when you plan on testing. Ideally, stakeholders should feel comfortable coming to researchers at any stage of development with questions and ideas they have about testing their content. Keep a list of all the possible research questions as they come in.

At the beginning of the playtest cycle, I make sure to check in with key stakeholders on what they’ve been working on, what content we should test, what their goals are for this milestone, and any hypotheses they may have about the content.

  • What has changed since the last test?
  • Is there new content that needs to be tested?
  • What are the goals for this milestone?
  • What are designers curious about the most? Do they have hypotheses about the content?

Prioritizing project needs is really important, because often you will be inundated with requests. In addition to building strong relationships with stakeholders, researchers must also hold strong boundaries and have a realistic idea of the scope of each test. This is particularly important for user research teams of one, or those with very few resources. Stakeholders need to invest in user research if it’s something they value. And they should.

black and white drawing of a lit candlestick in a holder
Step 1: Enlightenment

2. Define objectives and create a framework

From the discussions you had with stakeholders, you’ll develop your objectives and create a framework for the playtest.

Objectives: After talking with various stakeholders about their questions, ideas, concerns, and hypotheses, I create objectives in the form of questions. For example, perhaps I discussed with a mission designer how they’d like to assess the continuity of a specific section of a mission. They also think this part of the mission might be too challenging due to some of the enemies or other variables. From this conversation I would define the following objectives:

  • Do players know where to go and what to do in this area?
  • Do players understand who they are fighting in this area?
  • How challenging do players find this section? How satisfied are they with the challenge?

Framework: The playtest framework is a document which includes key dates for the test, the overall timeline, the defined objectives, methods, target participants, and technical information. This is shared with stakeholders as early as possible to gather any additions or changes before continuing to the next step of the process: creating the measurement guide.

3. Create a measurement guide

The measurement guide is a spreadsheet that contains every objective, and how that objective will be measured.

Example measurement guide

Now we need to decide on the best way to measure each objective.

  • Should we include a survey question?
  • Is this something that would be better answered with an interview?
  • Should we rely on telemetry data?
  • Is it something we can observe?
  • Does it require eyetracking?
  • Perhaps a mixture of the above? That is likely the case!

Once we have designed measurements for each objective, we can create our testing materials.
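One way to keep the guide honest is to treat it as structured data rather than free text. A minimal sketch in Python, with hypothetical objectives and field names (a spreadsheet works just as well):

```python
# A minimal sketch of a measurement guide as structured data.
# Objectives, methods, and field names are illustrative, not a fixed schema.

measurement_guide = [
    {
        "objective": "Do players know where to go and what to do in this area?",
        "methods": ["observation", "telemetry"],
        "measurement": "Count wrong turns and time spent off the critical path.",
    },
    {
        "objective": "How challenging do players find this section?",
        "methods": ["survey"],
        "measurement": "7-point difficulty rating after the mission segment.",
    },
]

def methods_for(objective_keyword):
    """Look up which methods cover objectives matching a keyword."""
    return [
        row["methods"]
        for row in measurement_guide
        if objective_keyword.lower() in row["objective"].lower()
    ]
```

Keeping one row per objective makes it easy to spot objectives that have no measurement yet, and measurements that map to no objective.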

4. Translate measurements

Build your surveys and observation guides based on the measurements you chose in step 3.

Surveys: I have used LimeSurvey, a free, open-source survey hosting tool. I have also used SurveyMonkey and Qualtrics, which are paid options. Google Forms also works for simple questionnaires.

The following are just a few of the survey best practices I like to share with design teams to encourage curiosity and shed light on the research process.

  • Avoid double-barreled questions
  • Ensure appropriate response options are available
  • Choose appropriate rating labels
  • Be as clear and concise as possible
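To make the first two points concrete, here is an illustrative sketch in Python: a double-barreled item split into two clear items, each with a balanced, labeled 5-point scale. Wording and labels are made up for the example:

```python
# Illustrative only: one double-barreled item split into two clear items,
# each with a balanced, fully labeled 5-point agreement scale.
AGREEMENT_SCALE = [
    "Strongly disagree",
    "Somewhat disagree",
    "Neither agree nor disagree",
    "Somewhat agree",
    "Strongly agree",
]

# Double-barreled (avoid): "The mission was fun and challenging."
# Split into two items so each answer means one thing:
survey_items = [
    {"text": "The mission was fun.", "options": AGREEMENT_SCALE},
    {"text": "The mission was challenging.", "options": AGREEMENT_SCALE},
]

def looks_balanced(options):
    """A quick sanity check: an odd number of labeled points gives a clear midpoint."""
    return len(options) % 2 == 1

all_balanced = all(looks_balanced(item["options"]) for item in survey_items)
```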

Eyetracking: If any of your objectives require eyetracking or other biometrics as a measurement, decide now how it will be used. For example, if designers want to know whether players notice a piece of the UI meant to teach them something, the eyetracking measurement would be to watch the gaze overlay on that UI element. This also informs the test setup: if you know you will need the gaze overlay, make it part of your technical setup.
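If you later want to quantify gaze data rather than just watch the overlay, the core check is simple: does a gaze sample fall inside the UI element’s bounding box? A toy Python sketch with made-up coordinates:

```python
# Did the player's gaze land on a UI element? A toy check against a
# rectangular bounding box. Coordinates and samples are made up.
UI_BOX = {"x": 50, "y": 20, "w": 200, "h": 40}  # e.g. a tutorial prompt region

def gaze_hits(samples, box):
    """Count gaze samples (x, y) that fall inside a rectangular region."""
    return sum(
        box["x"] <= x <= box["x"] + box["w"]
        and box["y"] <= y <= box["y"] + box["h"]
        for x, y in samples
    )

samples = [(60, 30), (500, 400), (120, 55), (10, 10)]
hits = gaze_hits(samples, UI_BOX)  # two of the four samples land on the UI
```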

Observation: If there are specific things you will need to observe during the playtest, make an observation guide to keep track per player. For example, if you want to know how many players go to a certain location and how often, include those as rows in a spreadsheet with a column for each player. The observation guide makes it easy to tally the number of times each player goes to that location. Use observation guides specifically for data you can’t gather from telemetry.
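The spreadsheet version works fine, but the tally logic is simple enough to sketch in Python, with hypothetical player IDs and locations:

```python
from collections import defaultdict

# One tally sheet per player: how many times each player reached
# each location of interest. Player IDs and locations are made up.
tallies = defaultdict(lambda: defaultdict(int))

def record_visit(player_id, location):
    """Add one tick to a player's tally for a location."""
    tallies[player_id][location] += 1

# During the session, the observer logs each sighting:
record_visit("P01", "hidden_cave")
record_visit("P01", "hidden_cave")
record_visit("P02", "hidden_cave")

# Afterwards, summarize visits to one location across players:
visits = {player: locs["hidden_cave"] for player, locs in tallies.items()}
```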

Telemetry: Gathering telemetry data is a great way to get quantitative data about user behavior. Tag specific elements to be tracked such as difficulty settings, location mapping, and selections made. Telemetry data is often the best tool to double check survey data or avoid the need for strict observations.
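What a telemetry event actually looks like varies by engine and pipeline. A minimal Python sketch of flat, timestamped event records, with hypothetical field and event names:

```python
import json
import time

def make_event(player_id, event_name, **properties):
    """Build one telemetry event as a flat, timestamped record."""
    return {
        "player_id": player_id,
        "event": event_name,
        "timestamp": time.time(),
        **properties,
    }

# Hypothetical events for the kinds of things mentioned above:
events = [
    make_event("P01", "difficulty_changed", setting="hard"),
    make_event("P01", "location_entered", zone="canyon_bridge"),
    make_event("P01", "dialogue_choice", option="spare_enemy"),
]

# Events are typically serialized and shipped to a collector:
payload = json.dumps(events)
```

Tagging every event with the player ID up front is what lets you later cross-check telemetry against that player’s survey answers.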

Interviews: Sometimes you will have objectives that you could cover with surveys, but you might want to go deeper with participants’ thoughts on a subject or experience. Interviews often take more time than surveys, but give you the option to ask for clarity or more context to a question if respondents don’t immediately open up. These should generally be done one-on-one but I have also had great experiences conducting group interviews in specific cases. Good interviewers are able to balance standardized questions with natural conversational fluidity to get the richest insights from participants while maintaining research integrity.

You’re also going to need to create a recruitment survey for the next step of the process: Recruitment.

5. Recruit participants

Build and maintain a player database

Make flyers and post them around town! Put up a Craigslist ad! Spread the word and get people to sign up for your playtest program. If you are able to build a large database, create a randomization query for choosing potential participants to send surveys to. Avoid inviting participants who have tested your content before, unless the test specifically calls for returning players.
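The randomization query can be as simple as filtering out past testers and sampling at random. A Python sketch against a made-up database:

```python
import random

# Hypothetical player database rows; in practice this comes from your
# signup form or database.
database = [
    {"email": "a@example.com", "tested_before": False},
    {"email": "b@example.com", "tested_before": True},
    {"email": "c@example.com", "tested_before": False},
    {"email": "d@example.com", "tested_before": False},
]

def draw_candidates(db, n, allow_repeat_testers=False):
    """Randomly sample n candidates, excluding past testers by default."""
    pool = [p for p in db if allow_repeat_testers or not p["tested_before"]]
    return random.sample(pool, min(n, len(pool)))

candidates = draw_candidates(database, 2)
```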

Send out a test-specific recruitment survey

Make sure to include all recruitment criteria previously decided on with stakeholders. This usually consists of age requirements, competitor titles played, preferred input devices and platforms, and genre preferences. Wait for the responses to roll in, then choose your best candidates! You’ll likely want a diverse group of people within your target requirements. Avoid including unnecessary requirements.

Are commonly asked demographic questions such as gender or ethnicity really necessary? They might be! But ask yourself why you are asking these questions. Don’t ask them by default.

The number of participants will also depend on your lab size, budget, and scope. I generally shoot for between 8 and 16 participants per test. Confirm their participation either via phone or email and move on to the technical setup.

6. Technical setup

Review the content & prepare machines

Review the test content in conditions as close to the test scenario as possible, and note any workarounds you’ll need to avoid bugs or to launch the content. Make sure the build is available to players, whether remotely via beta access or downloaded to the playtest PCs. Set PC environment variables to identify each player’s telemetry events, and delete any previous players’ files.
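The per-machine prep can be scripted. A hedged Python sketch, assuming a hypothetical environment variable name and save directory (adjust both to your build):

```python
import os
import shutil
import tempfile
from pathlib import Path

# Hypothetical names; adjust to however your build tags telemetry
# and wherever it stores local player files.
PLAYER_ID_VAR = "PLAYTEST_PLAYER_ID"
SAVE_DIR = Path(tempfile.gettempdir()) / "playtest_saves_demo"

def prepare_station(player_id):
    """Tag this machine's telemetry with a player ID and wipe old save files."""
    os.environ[PLAYER_ID_VAR] = player_id
    if SAVE_DIR.exists():
        shutil.rmtree(SAVE_DIR)  # remove the previous player's files
    SAVE_DIR.mkdir(parents=True)  # start the new player with a clean slate

prepare_station("P07")
```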

Recording software

I use OBS Studio to stream and record gameplay videos. I use Snaz for a timestamp, NohBoard and Gamepad Capture overlays for keyboard and controller input, and text elements indicating the game, milestone, and player ID.

Biometrics

Pre-test your hardware and software, including eyetracking or other biometric measures. Be sure the gaze overlay works in sync with your recording software. You’ll need to calibrate each participant’s eyes when they arrive at the test.

Instructions

Write instructions and include survey links. Make these available on the PC desktop for players to easily access. Have Discord open for communication during the test, and make sure previous conversations are archived.

7. Moderate test sessions

Observe and take notes

Be available, take notes, pay attention to your observed measurements, and be ready to troubleshoot bugs. In my experience, instability can be a huge hurdle to overcome during testing. You’ll need a solid understanding of the content and how to debug.

Players may have navigation or onboarding issues that you have to let them struggle through — this is the point! We want to see where participants struggle and how they learn. Avoid the instinct to help when players don’t understand something right away! Patience and good note taking are key.

I find that a good observation workspace helps immensely. I suggest a monitor or screen large enough to display every player’s gameplay stream, an additional screen for Discord to communicate with players, and a separate channel to the designers, who are ideally also watching the streams. I have found group chats incredibly helpful for identifying problem areas and brainstorming solutions.

8. Analyze data

I have mainly used RStudio to analyze playtest data and create .html R Markdown reports. Using R can be intimidating, but it’s incredibly flexible. DataCamp and LinkedIn Learning have some really helpful courses, and there is a ton of information online about how to use R. Use packages like ggplot2 to create custom figures from your playtest data, and customizable markdown templates for your reports. I particularly like the Tufte handout template for playtest reports; the sidenotes are a great way to include stakeholder comments from the review meeting. Add custom CSS to style your reports, include tabs or buttons, and embed GIFs and photos.
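The author’s workflow is R plus R Markdown, but the core of most playtest summaries (one statistic per objective) is the same in any language. An illustrative Python sketch with made-up ratings:

```python
from statistics import mean, median

# Hypothetical 7-point challenge ratings from an 8-player test.
challenge_ratings = [4, 5, 6, 3, 5, 7, 4, 5]

summary = {
    "n": len(challenge_ratings),
    "mean": round(mean(challenge_ratings), 2),
    "median": median(challenge_ratings),
    "top_box": sum(r >= 6 for r in challenge_ratings),  # rated 6 or 7
}
```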

I like to keep my reports clear and concise. Focus on the objectives, highlight the issues, and make it easily readable. Deliver your report to stakeholders and plan a meeting to review it together.

9. Review report

Once you have delivered the findings of the study to stakeholders, make sure you have a review of the results with them. Present findings and get their notes.

  • What do they plan on changing based on this feedback?
  • What else would they like to look into moving forward?

This final step of the process can and should help inform the next test cycle. Add the notes to an updated version of the report for record keeping. Keep reports organized for easy reflection later.

Once all the steps above have been completed, go over what went well during this test, what didn’t go well, and ideas or improvements for next time. Reflecting on your work is one of the best ways to continue growing and improving, and it’s even better if you have a team of researchers to collaborate with. User testing should be an iterative process, and each test session is a learning opportunity.

This is the process I have used for testing various video games at different stages of development over the past five years. I hope there is something you can take from my experience whether you work in games user research, software development, or some other industry! Be sure to let me know if you do.

Laurel Brown

I’m a user experience researcher and artist who is interested in feelings, memories, regenerative agriculture, and vintage clothing.