Faster Tools for Planning Cities
Sidewalk Labs created Commonspace, a mobile app to help city planners record human activity in public places. Users of a feature for categorizing and counting people could not enter data quickly enough to keep up with what they observed. To address this problem, I created and tested multiple prototypes to determine whether a UI redesign could solve it.
I had one goal for this project: redesign the UI to improve the speed and accuracy of data capture.
- Company
- Sidewalk Labs
- Product
- Mobile app (iOS/Android)
- Users
- Participants deployed in city planning research
- Team
- A product manager and a system architect
- Timeline
- Two months in 2019
- Outcome
- Pandemic halted development
Summary #
Commonspace was a mobile app developed by Sidewalk Labs, a unit within Google, to help city planners, urban designers, and community groups record human activity in public places. One of the features Commonspace offered its users was People Moving Count (PMC).
PMC is a method that measures how many people move through an area and by what means. A city planning department may conduct a PMC study to measure the effectiveness of a new crosswalk, the popularity of an event, or engagement with a park. Results from the PMC would then influence future urban planning policy or decisions.
Low-fidelity tools and laborious processes #
Yet the tools used to conduct PMC studies limited what could be measured. City planning departments would deploy volunteers to the study site with hand clickers to count people and paper worksheets to record results. The clicker allowed for fast and accurate counting, but it only captured total throughput, not the demographic details the planning departments also sought, such as age, gender, or means of movement. Moreover, transcribing the scribbly paper notes from volunteers into spreadsheets was time-consuming for the city planners responsible for reporting results.
High-fidelity capture and automation #
This is where Commonspace’s PMC feature came in. In its default configuration, it could capture counts of people along with their mode of movement, gender, and age. Furthermore, the app eliminated the task of paper-to-spreadsheet transcription for city planning departments. In other words, Commonspace allowed for richer data capture and saved hours by automating collation.
Problems with high-density measurement #
While Commonspace offered huge improvements for PMC data collection, it also had shortcomings. Volunteers could not keep up with entering data into the app if, for example, large clusters of people with different modes, genders, and ages passed through an observation area. In other words, the UI was too slow to capture data, resulting in missed counts for certain high-density PMC situations.
Design goal: improve speed and reduce errors #
After engaging with Commonspace’s product manager at Sidewalk Labs, I set out to redesign the app’s UI with one goal: to improve the speed of data capture by at least 25% while keeping the miss rate at or below 5%. I defined my metrics as follows:
- “Speed” meant how many people could be counted in a minute
- “Miss rate” meant what percent of people in the observation space went uncounted as they passed
Redesigning the Commonspace UI #
I started my process assuming that iterating on the Commonspace UI would be enough to meet my goal. I focused on visual hierarchy, the hit-box sizes of UI elements, and reducing the amount of eye scanning and tapping needed to input data. To compare performance, I created interactive prototypes with the following variations:
- The same visual hierarchy, larger affordances
- A new grid hierarchy for data input
- A video capture and playback feature
Then I compared the performance (i.e., speed and miss rate) of these prototypes using a combination of labeled cards, stock photographs, and recorded videos of pedestrians in urban settings.
Results #
Results from my testing showed the grid layout was 70% faster at counting people than the layout used in Commonspace’s UI. The grid variant also maintained a lower overall miss rate. Yet only the video prototype could maintain a miss rate lower than 5%, regardless of pedestrian density.
Deliverables #
For deliverables, I provided recommendations for app development based on the potential impact on PMC studies compared to their relative implementation costs. I also created higher-fidelity interactive prototypes of all the variants I tested.
Design Process #
Commonspace couldn’t keep up in real-time #
Before I started this project, I had first-hand experience capturing PMC data as a volunteer for three public life studies conducted in San Francisco, CA. From that experience, I learned how brilliant the hand clickers were for counting people: fast, tactile feedback let me keep my eyes continuously on the street for observation.
Using the Commonspace app was a different story. I was nowhere near hand-clicker performance after trying to count pedestrians with the app in downtown Oakland, CA. The reasons seemed clear: I spent too much time taking my eyes off the street and looking at the UI to enter data.
Design goal: speed and accuracy #
The benefit of capturing more demographic details with an app was compelling. Yet if Commonspace couldn’t keep up with the upper limits of pedestrian density, it wouldn’t be an effective tool for PMC studies. Performance needed to improve.
I set out to redesign the app’s UI with one goal: to improve data input speed while maintaining accuracy. I thought a 25% improvement in speed and a miss rate at or below 5% would be a success. I defined my metrics as follows:
- “Speed” meant how many people could be counted in one minute
- “Miss rate” meant what percent of people in the observation space went uncounted as they passed
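Both metrics reduce to simple ratios. A minimal sketch of how they could be computed, using hypothetical trial numbers (the functions and figures below are illustrative, not from the actual study):

```python
def speed(counted: int, minutes: float) -> float:
    """People counted per minute."""
    return counted / minutes

def miss_rate(counted: int, passed: int) -> float:
    """Fraction of people who passed through the observation
    space but went uncounted."""
    return (passed - counted) / passed

# Hypothetical one-minute trial: 48 people passed, 45 were counted.
print(speed(45, 1.0))      # 45.0 people per minute
print(miss_rate(45, 48))   # 0.0625, i.e. a 6.25% miss rate (above the 5% target)
```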
To achieve my goal, I created interactive prototypes to test against these metrics and compared the results.
Reducing visual scanning and UI taps #
While using Commonspace to count people, I repeated two actions: visually scanning the screen for the inputs that matched the person I wanted to count and then tapping those inputs.
For the redesign, I made three assumptions based on my audit of the Commonspace UI and my previous experience volunteering for PMC studies:
- Reducing visual scanning and tapping before inputting data would speed up the people counting process and reduce the miss rate
- Users would visually cluster people by gender and age before counting them (a behavior known as subitizing).
- Limiting the testing to pedestrian counts would reduce the overall testing needed while providing an accurate performance comparison. So I excluded bicyclist and “Other” modes (stroller, rollerskate, wheelchair, skateboards, etc.) from testing.
Based on these assumptions, I sketched out designs that would lead to less searching and tapping while counting combinations of gender and age descriptors for pedestrians.
Sketching iterations led me to table-based layouts that combined gender and age, then to a three-column design with a single tap-toggle since it required the simplest visual scanning and the least amount of tapping from users.
Designing interactive prototypes for comparison testing #
Based on my sketches, I created prototypes from a single-screen mobile UI in Figma, which I imported into Framer to add interactivity. I refer to this prototype as the ‘grid layout’ prototype for this case study.
Since I viewed this step as an investigation of layout and hierarchy, I wanted to eliminate the possibility that Commonspace’s small non-standard buttons would confound my results. Hence for comparison testing, I also created an interactive prototype of the Commonspace UI, but with larger controls. I refer to this prototype as the ‘same layout larger affordances’ prototype for this case study.
After completing the interactive prototypes, I ran a quick visual comparison by recording myself entering data for a list of 11 pedestrians of mixed gender and age. My first impression was that the grid prototype was a little easier — my eyes and fingers didn’t have to travel around the screen as much. Yet I set a goal to improve speed by 25%, so I needed tests that could be more objective in showing me any improvements with the UI.
Time trials with cards and photographs #
For my first speed test, I wanted to compare raw counting speed between the same-layout and grid-layout prototypes. So I created a grid of cards in random order covering all the gender-age combinations from the app. I then timed myself counting these cards in 1-minute trials using each prototype (ten time trials, five for each prototype).
In my next test, I made collages of stock photographs with similar densities of pedestrians in profile, then performed ten more 1-minute time trials (five with each prototype).
Reviewing the data, I found that counting speed increased by 70% using the grid layout across all time trials. Given my project target was a 25% gain in speed, I considered this a success.
| Time Trial Totals | Same-layout (count) | Grid-layout (count) | % increase (grid) |
| --- | --- | --- | --- |
| Cards 1-5 | 129 | 222 | 72% |
| Photos 1-5 | 107 | 179 | 67% |
| Average | - | - | 70% |
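The percentages in the table follow directly from the trial totals; a quick sketch recomputing them:

```python
# Raw counts from the ten 1-minute trials, totaled per prototype.
same_layout = {"cards": 129, "photos": 107}
grid_layout = {"cards": 222, "photos": 179}

for trial in same_layout:
    pct = (grid_layout[trial] - same_layout[trial]) / same_layout[trial] * 100
    print(f"{trial}: {pct:.0f}% increase")   # cards: 72%, photos: 67%

# Overall increase across all ten trials combined.
overall = (sum(grid_layout.values()) - sum(same_layout.values())) \
          / sum(same_layout.values()) * 100
print(f"overall: {overall:.0f}%")            # overall: 70%
```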
Testing miss rate with recorded video #
To test the miss rate, I recorded video of pedestrians passing through subway station entrances in downtown Oakland and San Francisco, CA, during rush hour. The entrances had regular pulses of high-density pedestrian traffic, which made for consistent comparison testing.
I created 1-minute clips from the video footage, then recorded how many pedestrians I counted and missed by playing back the footage. As expected, the miss rate increased with both prototypes as the pedestrian density increased (density = pedestrians passing per minute). However, as density increased, misses with the grid-prototype started later and stayed lower than misses with the same-layout prototype.
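The miss rate per clip follows from the counted/missed tallies. A sketch with hypothetical per-clip numbers (the real per-clip data isn't reproduced in this write-up), showing how miss rate climbs with density:

```python
# Hypothetical 1-minute clips: (pedestrians_per_minute, counted, missed).
clips = [
    (20, 20, 0),
    (35, 33, 2),
    (50, 44, 6),
]

for density, counted, missed in clips:
    passed = counted + missed
    rate = missed / passed * 100
    print(f"{density}/min: {rate:.0f}% miss rate")
```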
Whether I met my project goal of a 5% or lower miss rate was a bit of a grey area. The grid-prototype only missed 4% of people across all tests but had a 12% miss rate at 50+ pedestrians per minute. Yet the data implied that as the pedestrian density keeps increasing, there might be a threshold where no human-UI interaction could keep the miss rate lower than 5%. Indeed, keeping the miss rate low across any situation used for a PMC study probably meant observers would have to slow down time itself.
Slower, more accurate, and more pleasant with video playback #
One way observers could slow time down to count people in high-density PMC studies is by watching videos they could play and pause rather than count in real-time. To demonstrate how counting people might work in a mobile app, I created an interactive prototype that added video and playback controls to the same-layout UI I had tested previously. I used Figma to create the layout and imported the design into Framer for interactivity.
I did not test this video prototype against other designs but made some subjective conclusions after using it. While the overall process of counting people was slower with the additional interactions to play and pause the video, I didn’t miss a single person.
However, the biggest subjective standout was that all the stress of counting people disappeared. Counting people quickly by gender and age in real-time with the other prototypes was an intense experience requiring 100% of my focus. It was far from pleasant and not an experience that I think volunteers would seek when they sign up to help a city perform a PMC study.
On the other hand, introducing video playback into the app was a considerable scope change where feasibility might be out of reach for Sidewalk Labs. Furthermore, would the benefits of recorded video outweigh the potential entanglements with privacy laws? Rather than further investigate the performance payoff of the video-based variant, I decided to wrap up the project and prepare deliverables and recommendations.
Final Deliverables and Recommendations #
For my final deliverables, I created higher-fidelity interactive mockups from each prototype (again using Figma for the layouts and Framer for interactivity). I chose colors and layout based on what the team was already using while adding colors and modifying elements based on mobile accessibility standards.
For recommendations, I positioned the prototypes inside a matrix based on cost and impact. In this case, “impact” was a catch-all term I used for speed, accuracy, and subjective experience using the app for PMC studies. I intended the matrix to show how the grid-layout prototype was nicely positioned in the best quadrant of cost vs. impact. It offered a boost in speed and accuracy above the current design while being far easier to build and ship than a video-based solution.
Outcome #
Pandemic Halted Development #
A convergence of forces worked against making further progress with Commonspace. After I completed the project, I got the news that the team was postponing development, citing resource constraints. Then, only months later, Sidewalk Labs shut down operations in the wake of the COVID-19 pandemic.
Discussion #
Will PMC continue to be a job humans perform? #
While my job focuses on improving people’s experience interacting with user interfaces, sometimes involving a UI is the problem with the experience. My biggest takeaway from this project was that categorizing and counting many people with a mobile app in real-time was stressful, error-prone, and perhaps not the best long-term solution for planning cities. Yes, planning departments needed to move past the limitations of hand-clickers and paper worksheets to gather the data to help them plan and build better environments for their citizens. Yet giving humans a more descriptive clicker to collect data was a solution that ignored the elephant in the room: this was a job for machines, not people.
Another way to say, “let’s use a machine to count people” is “let’s use computer vision technology.” Before I started this project, computer vision for tracking and classifying objects was already a hot topic. The technology to identify a person as a person passing through a public space existed. Theoretically, employing this technology in cities could remove real-time human input limitations. However, machine learning models had not been trained on the data sets that urban planning departments sought, so computer vision was not a viable solution yet.
Study volunteers as videographers: a means to artificial intelligence #
Realizing the potential of machine learning technology for public life studies probably requires changing the role of the volunteers who sign up to help city planning departments. Instead of quantitative people counters, these volunteers could be deployed as videographers. Some public life studies I participated in had nearly a hundred volunteers gathering data. If every one of those volunteers instead recorded videos that could later be tagged with demographics, city planning departments could build the data set required to train a machine learning model that would automate the process in the future. Whether cities seek such solutions in a post-pandemic world remains to be seen.