Functional Specification

Transcending Linear Navigation


"[The frequency of web accessibility issues that I encounter] makes me feel like a trapped lion that won’t leave even if the cage is left open."
– Alex, a high-confidence screen reader user

Image of different heading levels.

Project Description

The vast majority of websites are not built with screen reader users in mind, so assistive technology attempts to approximate the experience of sighted users by informing them of the HTML’s structure. This approach tries to ensure that blind users can access all available information and affordances on a given website. However, most developers code websites to optimize the visual experience for sighted users and don’t design the screen reader experience for blind individuals. This translates to unintuitive element orderings, inconsistent controls, and many accessibility challenges. As a result, blind users experience an internet that is unreliable, that requires constant relearning, and that, in some cases, excludes them from access altogether.

Ecommerce websites present an interesting opportunity, because designers have solidly converged on an overarching structure (which isn’t generally true across other website types). Our team is proposing an online shopping experience designed specifically for blind users that capitalizes on this structure. We have identified four key actions that are instrumental in navigation: 1) browsing through search results, 2) accessing filters, 3) finding the “my cart” button, and 4) finding the search bar.

Our team envisions a shopping experience where users can skip from one search result title to the next without having to key past details like ratings. Some ecommerce websites already give each search result a heading, but many don’t, which substantially increases the time it takes to find the right item. Filters could also benefit from this functionality, and making the distinction between “search result headings” and “filter headings” will allow users to switch between the two dynamically, in a predictable way.

Impact

In our research, we asked users “if you could snap your fingers and have one type of website become accessible to you today, which type would you choose?” We discovered that a plurality of 45% of users chose shopping websites as their answer, revealing shopping as a high-priority target. This makes sense because people need access to essentials like groceries, clothing, and household items. 

Online shopping is particularly attractive for people who are blind because they encounter challenges in the physical world that make it hard to get themselves to a store. In an interview our team facilitated, we heard from a blind web user named Lena who said “this [ecommerce sites] is an option that should open doors but is shutting me out even more.”

This project focuses on the browsing portion of the shopping experience. When discussing headings on shopping sites, users organically verbalized that they want a heading tag on every search result, because it is frustrating and overwhelming to hear details on items that they aren’t interested in. The ability to skip from result to result will save users a lot of time, and our team believes that, if users know that they always have access to this functionality, they will feel more confident shopping online. 

In the most developed version of this project, users will be able to find the search bar, search results, filters, and the “my cart” button in a consistent way across all ecommerce sites. Standardizing how users interact with all these features will create a reliable experience that minimizes mental strain and allows users to jump between tasks easily. 

Our team acknowledges that the web is always changing, and standards for interaction patterns morph over time. We believe that this project will be resilient to shifting trends because the overall layout of shopping sites is consistent across the ecommerce industry, and because major players like Amazon have used this format since 2011.

User Scenarios

This section gives examples of situations users currently experience on the web relating to task-specific headings. 

Example 1: Lack of Search Result Headings

Cara has a dog named Loki, whom she loves very much. Loki has recently been diagnosed with a medical condition that requires him to eat special food. Cara notices that she is almost out of the initial bag of dog food she got from the vet and goes to Petco.com to order more. 

Cara is fully blind and uses a screen reader to navigate the web. She is an experienced screen reader user and is generally adept at handling whatever challenges are thrown her way. 

She remembers that the brand is BlueBuffalo, but can’t remember the exact wording of the food, so she searches “BlueBuffalo dog food.” She opts to navigate by heading and tabs down to hear “heading level 1: dog food” and then “heading level 2: you may also like.” Cara is experienced enough with shopping sites to know that this means individual search results do not have headings on this site and that her focus has skipped to the bottom of the page.

Cara recognizes that she will have to key over item names, ratings, quantities, and prices for all 187 search results as opposed to just hearing the item names. She is disappointed because finding Loki his food will take up to 4 times longer than it would have if search results had headings. 

Analysis 

If this website had headings for each search result, Cara could quickly parse her search results, and could have avoided frustration. 

Example 2: Inability to Switch from Headings to Search Results Easily

Daniel’s couch has been on its last legs for some time, and he finally makes the decision to buy a new one. Daniel goes to Wayfair.com because he knows they have high-quality furniture, and he has a 15% off coupon. As a high-confidence screen reader user, he is determined to find what he needs on this site so he can take advantage of the deal.

Daniel has measured his space and decides that he would like a couch that is about 8 feet long and 4 feet deep, and his roommate would really like for it to be black. He searches “8 ft black couch” and there are no results. So, he searches “couch” and hopes to use filters to see only relevant options. Through some page exploration, he discovers that there is a size filter (which lets him set length, width, and height) and a color filter.

However, the filters do not have headings, so Daniel must go through every filter option to find the filters he cares about. He must click the back arrow 48 times from the results heading to get to the length filter. He would really like to set a price range, and then try different combinations of dimension inputs to find the best sofa for his space and his wallet. 

Daniel switches from filters to search results several times but concludes that it’s not worth his time, since it takes so long to find the filter he wants and his input may not yield good results anyway. He searches “couch” again and resorts to keying through all search results.

Analysis

Without the ability to easily switch between search results and filters, it is a natural conclusion for many users who are blind that filters aren’t worth their time. Our team heard from many users that they do not generally use filters, and we suspect this is the root cause. This navigation shortcoming effectively robs blind users of existing functionality.

Example 3: Finding the “My Cart” Button on an Inaccessible Website

Riccardo has a lot of experience repairing things, and his vacuum cleaner has just broken. He has already figured out which part needs to be replaced, and he goes to a site that sells parts for his make and model of vacuum. 

Riccardo is a high-confidence screen reader user and quickly finds the part he needs. He adds it to his cart and begins looking for the “my cart” button to check out. Riccardo navigates to the part of the page he suspects is the main navigation and keys around. As he parses his options, he hears “button,” “button,” “button,” “button,” “button.” Riccardo realizes that many buttons on this site do not indicate what they do. He clicks several buttons to see where they lead, and none of them take him to his cart. After about an hour of troubleshooting these ambiguous controls, Riccardo gives up and decides he will just have to wait and ask one of his sighted friends.

He is frustrated that he wasted so much time on this simple task and experiences a blow to his independence. 

Analysis

Ambiguous buttons and links are one of many sources of inaccessibility on the web. Ideally, all buttons and links would indicate both that they are clickable and what they are for. A designed experience like the one described in this project ensures that users can at least find the most important buttons on shopping sites, such as “my cart,” because ML is being used to determine their locations.

Example 4: Finding the “My Cart” Button on an Accessible Website 

Mason has heard good things about Boxed.com, a grocery delivery site, and wants to try it out as a possible replacement for his current service. He is a high-confidence screen reader user, and his philosophy is that if an ecommerce website is too hard to use, then it has lost his business.

He successfully finds the items he wants to buy this week and would like to view his cart. He looks for the first heading to get to the top of the page. He then keys backwards until he finds the main navigation section, and keys around there until he finds the “my cart” button. 

He clicks the button and feels satisfied with his experience. 

Analysis

This example shows a positive experience of a user trying to figure out how to complete a key task. However, it’s worth noting that even this positive experience still takes far more effort for blind users than it does for sighted users, who simply look in the top right corner and click. A designed system that allows screen reader users to opt for “my cart” at any time significantly raises usability toward parity with the experience of sighted users.

Notes on Low-Confidence Users

The examples above concentrate on high-confidence users to show that even people who have spent years learning assistive tech have trouble in these scenarios.

The tasks outlined above that took high-confidence users a long time to figure out are most often completely blocking for users who are just starting out. Over time, users learn paradigms about the internet, like that “you may also like” means they are at the bottom of a shopping page. They also learn resilience that increases their willingness to explore ambiguities, such as buttons that don’t say what they do. Currently, users must build up a mental model of how ecommerce sites work in order to go looking for the task they are trying to complete. So, users (especially those who are new to the web) would benefit from an experience that tells them the primary tasks of a site upfront.

Clear Definition of Scope

This project deals specifically with the shopping experience while users are browsing for the items they want. The scope excludes the checkout process and item detail pages. 

We define ecommerce websites as sites where users can buy a physical item. This definition excludes travel websites and service-based websites (like lawn care) because they do not tend to follow the same design patterns as websites that sell physical goods. 

While our team has limited the scope to the browsing portion of ecommerce websites, we believe this model could be used to design many other experiences for users who are blind. A shift to designing screen reader experiences that extract data from existing websites may have major implications for the future of accessibility because 1) this approach incorporates usability into the screen reader experience and 2) it could allow blind users to virtually experience a site as if it fully complied with W3C standards, while reducing the associated labor on the part of developers.

Levels of Implementation

The minimum viable product for this project is to ensure that each individual search result has a heading. The screen reader should scrape the website to determine what heading levels have already been used, and assign search results the next highest heading level (if results are not already headings).
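As a minimal sketch of this step (in JavaScript, assuming search result titles have already been identified, here via a hypothetical “.result-title” selector), a screen reader or browser extension could compute the level and expose the titles as virtual headings. Using role="heading" and aria-level is one possible way to express this; a screen reader acting as a buffer could equally track it internally.

  // Minimal sketch: compute the next heading level after the highest
  // one the page already uses (e.g., a page using H1 and H2 yields H3).
  function nextHeadingLevel(doc) {
    let highest = 0;
    for (let level = 1; level <= 6; level++) {
      if (doc.querySelector("h" + level) !== null) {
        highest = level;
      }
    }
    return Math.min(highest + 1, 6); // H6 is the deepest HTML heading
  }

  // Expose each search result title as a virtual heading. The
  // ".result-title" selector is hypothetical; in practice it would be
  // the id/class discovered by the ML step described later.
  function assignResultHeadings(doc) {
    const level = nextHeadingLevel(doc);
    for (const title of doc.querySelectorAll(".result-title")) {
      title.setAttribute("role", "heading");
      title.setAttribute("aria-level", String(level));
    }
  }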

The next project extension would be to ensure that each filter has its own heading, so users can switch between options like size and color without keying past red, green, blue, etc. The screen reader would assign filters the next highest heading level, so that users can go between search results and filters.

From there, the project could continue by assigning the next highest heading levels to the “my cart” button and the search bar. Headings are typically assigned to text or links, but because the screen reader is acting as a buffer between the site and the user, we propose virtually assigning these elements heading tags. That way, users can remain in the same mode as they switch between functionalities. Other elements that might also be treated in this way include: 1) the “sort by” button and 2) the main navigation button that allows users to explore by categories. At this point, it becomes important to incorporate customizability so that users can choose which options are visible to them.
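As a loose sketch of what “virtually assigning” heading tags could mean on the screen reader side, without altering the page’s own semantics (all names here are hypothetical):

  // A screen reader-side registry of virtual headings, so interactive
  // elements like the "my cart" button can be reached through heading
  // navigation without changing their role in the page itself.
  const virtualHeadings = new Map(); // element -> heading level

  function addVirtualHeading(element, level) {
    virtualHeadings.set(element, level);
  }

  // e.g., after the "my cart" button has been located:
  // addVirtualHeading(myCartButton, nextLevel);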

We fear that labeling interactive elements as headings will confuse users because this breaks their notion of how headings are generally used. So, as the final extension, we recommend building out a shopping experience that moves beyond the notion of headings and leaves users with a simple, intuitive experience that presents each task in its own right. Users could switch between “search results,” “filters,” “my cart,” and so on without having to figure out which heading level corresponds to which task on every site. The controls should parallel those used to navigate by heading so that users don’t have to learn new navigation methods.

The final extension described above leans into the idea of designing an experience specifically for blind users. It standardizes all ecommerce sites so that users can start to build reliable mental models. Users continually expressed a feeling of rolling a die: that it was up to chance whether they could find an element or interact with a component. For the browsing phase of shopping, this project ensures that users can find elements belonging to the key actions of shopping sites.

Related Research

OCR

In an approach that uses CV, Optical Character Recognition (OCR) may be useful in identifying element locations. These resources outline the basics of OCR, including how images are preprocessed before OCR begins.

Segmentation in OCR -  https://towardsdatascience.com/segmentation-in-ocr-10de176cf373

Current state of OCR -  https://ieeexplore.ieee.org/document/9183326

HTML Mapping

A CV approach will also require the ability to map a coordinate from an image to the corresponding point in the website’s code. 
Documentation of elementsFromPoint method -  https://developer.mozilla.org/en-US/docs/Web/API/Document/elementsFromPoint  

Related Work

The ecommerce-focused study listed below recommends that ecommerce sites “reorganize the site with headers,” specifically calling out search results as a high priority. Their findings bolster our analysis that headings specific to ecommerce sites are a high priority. 

Evaluation of e-commerce websites accessibility and usability: an e-commerce platform analysis with the inclusion of blind users -  https://link.springer.com/article/10.1007/s10209-017-0557-5#Sec18

Our team identified a project called Revamp that takes on the challenge of communicating item details on shopping websites. Further research into this application could reveal possible approaches for our own technical implementation, and possible pitfalls. Revamp dovetails beautifully into the extended functionality of this project, and the two could be combined to create a more complete shopping experience for users. 

Revamp: Enhancing Accessible Information Seeking Experience of Online Shopping for Blind or Low Vision Users -   http://arxiv.org/abs/2102.00576  

From a technical perspective, a technology called SaIL takes websites, identifies ARIA landmarks, and injects them into website code. Their methodology may be useful in creating an ML solution. Studies associated with SaIL also revealed that a key advantage of landmarks is that they don’t have to differentiate between links and buttons, which is also true of our proposition.

SaIL: saliency-driven injection of ARIA landmarks -  https://doi.org/10.1145/3377325.3377540

ARIA landmarks are tags that developers can include in their code to assist screen reader users. Our understanding is that these cover universal website landmarks like banners and main navigation bars. The search bar is also included in possible landmarks, but otherwise more granular, site-specific landmarks do not exist. This suggests that the research above does not overlap with our proposed project. It is also worth noting that our project could be integrated as landmarks rather than headings. 

ARIA Landmarks -  https://accessibility.oit.ncsu.edu/it-accessibility-at-nc-state/developers/accessibility-handbook/aria-landmarks/

Minimum Viable Product Solution Overview: Search Result Headings

From users, our team heard a consistent desire for individual search results to be labeled with headings. So, the MVP of this project is to ensure this is the case from the user’s perspective. Virtually adding heading tags ensures that users can leverage the same controls they are accustomed to while gaining navigation optimized for parsing search results.

Minimum Viable Product Workflow

This workflow details how a user might interact with MVP functionality, and how the system responds at each stage. Actions taken by users are highlighted in bold.

  1. The user navigates to an ecommerce website and searches for an item.
  2. The user manually turns on “shopping mode.”
    1. Ideally, users should not have to turn on shopping mode. If a method can be developed that automatically triggers the ML/CV solution at an appropriate time, this would substantially improve the usability of this tool.
  3. The system uses ML and CV to find the title of the first search result.
  4. The system notes the id/class of the first search result title and uses this information to treat all divs with the same id/class as search result titles going forward.
  5. The system evaluates the code to determine the highest heading level that is not currently in use (for example, the page uses H1 and H2, so the result here would be H3).
  6. The system assigns all search result titles with that heading level.
  7. The user chooses to navigate by heading.
  8. The user explores the page and discovers which heading level corresponds to search results.
  9. The user can continue to navigate by headings, skimming past listings they aren’t interested in and drilling down on details such as price when they want to.

Locating the First Search Result Title

Our team has identified two possible ways of locating key components on shopping websites using machine learning. The first is to take a screenshot and use segmentation to find the bounding box around the component. Those coordinates can be used to find an ID or class in the HTML. The second is to look directly at the publicly available code and find the ID or class directly. Below we outline these possibilities and consider possible limitations. 

Computer vision can be used to find the first search result, and within it, the location of the search result title. Below are screenshots from highly visited ecommerce sites illustrating the variety of ways a card title can appear. 

Sample layout of search result cards for Amazon, Walmart, Etsy, Home Depot, Wayfair, Instacart, Boxed, and Best Buy. Each has an image with an item title directly below it, with the exception of Instacart, which puts the price before the title, and Best Buy, which puts the title to the right of the image.

The title is typically located directly below the card’s image, although there is some variation here. Machine learning will need to be implemented to account for the nuances in how sites display search result cards.

The output of this CV/ML algorithm should be the x, y coordinates of the card title. The JavaScript method document.elementsFromPoint(x, y) can then be used to pinpoint the card heading in the website’s code.
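A sketch of that mapping, assuming the CV step returns a bounding box as two corner points in viewport coordinates (the sample coordinates mirror those in Appendix 1):

  // Probe the center of a CV bounding box [[x1, y1], [x2, y2]] and
  // return the topmost element there. Note that elementsFromPoint
  // expects viewport coordinates, so screenshot coordinates may first
  // need to be adjusted for the page's scroll position.
  function elementForBox([[x1, y1], [x2, y2]]) {
    const stack = document.elementsFromPoint((x1 + x2) / 2, (y1 + y2) / 2);
    return stack.length > 0 ? stack[0] : null;
  }

  const title = elementForBox([[365, 755], [633, 835]]);
  if (title !== null) {
    console.log(title.className); // e.g. "ProductCard-name"
  }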

Alternatively, it may be possible to create a ML algorithm that finds the search result title by looking only at the code. This approach circumvents several failure modes that may arise in using CV. 

Failure modes that may arise if CV is used are discussed under “Possible Failure Modes” below.

Once the first search result’s title has been located, the system can use information from the code to extrapolate all other search result titles. For example, the title might be a div with ID “itemName,” which suggests that all other elements with ID “itemName” are also search result titles. So, the system will only have to use ML to find a result title once.
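A sketch of that extrapolation step, preferring a shared class but falling back to the repeated-id pattern described above:

  // Given the first title element, find every other element that
  // shares its class (or, failing that, its id; ids should be unique,
  // but the "itemName" example shows some sites repeat them).
  function findAllResultTitles(firstTitle) {
    if (firstTitle.className) {
      const selector =
        "." + firstTitle.className.trim().split(/\s+/).join(".");
      return Array.from(document.querySelectorAll(selector));
    }
    return Array.from(
      document.querySelectorAll('[id="' + firstTitle.id + '"]')
    );
  }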

Expanded Functionality Overview: Creating a Designed User Experience 

In addition to ensuring that users can jump from one search result to the next, the expanded project will also enhance navigation between filters, the search bar, and the “my cart” button.

Expanded Functionality Workflow

This is an example workflow that shows all available functionalities in the expanded project. Users may complete these tasks in any given order. Actions taken by users are highlighted in bold. 

  1. The user navigates to an ecommerce website and turns on “shopping mode” (if at all possible, the user should not have to manually turn on “shopping mode”).
  2. The user selects the “search” option.
  3. The system queries the current page to determine the location of the search bar and shifts focus there.
  4. The system notes the programmatic location of the search bar and uses that marker going forward whenever the user chooses the “search” option.
  5. The user searches for an item they are interested in buying.
  6. The user selects the “search results” option from within “shopping mode” (see possible failure modes below for alternate workflows).
  7. The system queries the current page for the first search result title and shifts focus there. After the first time a user opts to see “search results” for a new search, the system should give them the option to go to the first result or pick up where they left off.
  8. The user skips past many result titles looking for the item that fits their needs.
  9. The user finds an item of interest and drills down on details like price, quantity, and ratings.
  10. The user decides to refine their search with filters and clicks the “filter” option.
  11. The system queries the current page for the first filter or a button to reveal filters.
  12. The system finds a button to reveal filters, shifts focus there, and notifies the user that they must click.
  13. The user clicks the filter button.
  14. The system queries the current page again to find the first available filter.
  15. The user switches between filters and search results several times looking for an item that meets all their needs.
  16. The user finds an item and adds it to their cart using the native functionality of the website.
  17. The user clicks the “my cart” option from within “shopping mode.”
  18. The system finds the “my cart” button and shifts focus there.
  19. The user clicks the “my cart” button. (The system could potentially find the “my cart” button and click it automatically, cutting out this step. However, that breaks the mental model of only shifting focus and would require further user testing to see if it confused users.)
  20. The user proceeds through checkout using the website’s native functionality.

Possible Failure Modes

It is possible that the system cannot find any search results if:

  1. the results appear off page (the user must scroll before they become visible),
  2. the current page is not a search results page, or
  3. the search returned zero results.

For each of these failure modes, the user should be informed of what has gone wrong so that they can remedy the problem. This requires the ML algorithm to determine why it cannot find a search result title. This should be straightforward to distinguish, since these cases look very different from one another.
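As a sketch of how these errors might surface to the user (the error codes below are hypothetical stand-ins for the ML outputs tabulated in a later section):

  // Map hypothetical ML error codes to messages a screen reader could
  // announce, so the user knows how to remedy the problem.
  const ERROR_MESSAGES = {
    "results-off-page":
      "Results appear off screen. Scroll down and try shopping mode again.",
    "element-not-applicable":
      "This page does not appear to contain search results.",
    "zero-elements": "This search returned zero results. Try another term.",
  };

  function announceFailure(errorCode, speak) {
    speak(
      ERROR_MESSAGES[errorCode] ||
        "Shopping mode could not process this page."
    );
  }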

Locating Each Shopping Task with Machine Learning

Search Results

Search results will be handled in the same way as the MVP described above. Either:

CV is used to determine the x, y coordinates of the search result title, and that is used to identify its ID or class in the HTML.

OR 

The website’s publicly available code acts as the input, and the output is the ID or class of the first search result title.

Filters

Most ecommerce websites have their filter options visible from the start, but some have a button that reveals the filters. In the first case, an algorithm will find the first filter and output its ID or class, just like search results.

In the second case, the algorithm will find the location of the filter button so that screen reader focus can shift there. Once a user clicks the button, the algorithm should run again to find the first filter in the newly revealed list of filters. 
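One way to sketch that second pass, assuming a hypothetical detectFirstFilter function stands in for the ML step (and returns null until the filters appear):

  // After the user clicks a "show filters" button, watch for the
  // revealed filter list, then shift focus to the first filter.
  function refocusAfterReveal(detectFirstFilter) {
    const observer = new MutationObserver(() => {
      const firstFilter = detectFirstFilter(document);
      if (firstFilter !== null) {
        observer.disconnect();
        firstFilter.setAttribute("tabindex", "-1"); // make it focusable
        firstFilter.focus();
      }
    });
    observer.observe(document.body, { childList: true, subtree: true });
  }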

Search results page of Home Depot, which has a left sidebar of filters visible. Search results page of Frontier Co-Op, which has a button that reads “Show Filters” in the top left corner.

My Cart

For this task, a machine learning algorithm will locate a shopping cart icon, a shopping bag icon, and/or the word “cart” using whichever discovery method seems most viable. The output will be a marker in the code that a screen reader could use to shift focus to the button.

Screenshot of a variety of “my cart” buttons, each with a cart icon and several with the word “Cart” near the icon. The icons range in style, but all have a trapezoidal shape as the basket and two circles below for wheels.

Search 

Like filters, search bars tend to be visible at all times, but may also be represented by an icon that expands into a search bar. So, the algorithm must either find the search bar or the button that expands it. In the second case, the algorithm will run again to find the newly revealed search bar.

Home Depot's home page with a large search bar at the top of the page in the center

Frontier Co-Op's homepage, with a search icon in the top right corner


Outputs for Each Key Shopping Task

Possible ML outputs                         | Search results | Filters | My cart | Search
ID or class of a repeating element          |       ✓        |    ✓    |         |
Markers of interactive element              |       ✓        |    ✓    |    ✓    |   ✓
Error: results appear off page              |       ✓        |    ✓    |         |
Error: element does not apply to this page  |       ✓        |    ✓    |         |
Error: there are zero elements of this type |       ✓        |    ✓    |         |

To see examples of each of these ML categories, see Appendix 1. 

Additional Tasks

More functionalities that could augment the experience include: 

Screen Reader Implementation 

Once the data and algorithm have been developed to find elements associated with the key tasks of shopping, screen reader developers may create controls that allow users to easily access this information. For example, in VoiceOver, the rotor allows users to rotate through different modes of navigation, and the modes available depend on the context a user is in. So, in “shopping mode,” the rotor might include “search results,” “filters,” “my cart,” and “search.” Although “my cart” is a single button, this model would allow users to find it easily, and a quick exploration of the rotor would inform the user of all their options upfront.
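A minimal sketch of such a mode, with option names taken from this spec (the cycling control itself would be whatever gesture or keystroke a given screen reader already uses for its rotor or equivalent):

  // Cycle through shopping-mode options the way a rotor cycles through
  // navigation modes; the returned name would be announced to the user.
  const SHOPPING_OPTIONS = ["search results", "filters", "my cart", "search"];
  let currentOption = 0;

  function nextShoppingOption() {
    currentOption = (currentOption + 1) % SHOPPING_OPTIONS.length;
    return SHOPPING_OPTIONS[currentOption];
  }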

Screen reader developers should decide what controls make sense for their screen reader, so they can minimize relearning on the part of the user. In terms of usability, the implementation should include the following:

Data Sources 

In order to accurately represent all the variation across ecommerce sites, the project should pull from the structures of a wide range of website types and sizes. We identified a wide variety of website types that come up in people’s lives, from grocery delivery to pet supply stores to home improvement stores.

To account for the most-used websites, we drew from our survey responses. We also looked into which sites appeared first on a Google search and augmented those findings with the following sources:

 

Sample of Excel spreadsheet of ecommerce websites

To find small retailers that represent the “long tail,” our team randomly generated US towns, searched the given store type in that area, and recorded the first unique listing that had a website. 

To access the full version of the Excel spreadsheet, reference EcommerceWebsites.xlsx.

Project Risks & Assumptions

To help ensure that this project will have the desired output, outcomes, and impact specified in this specification, we have documented some key assumptions to check at each phase of the project, from dataset creation to screen reader integration and user adoption. The following assumptions are not necessarily comprehensive at this moment in time, so more may be added as the project owner sees fit. It is crucial to track and test assumptions in order to mitigate project risks and ensure that the project meets the needs of the blind web users we’re designing for.

Dataset Creation

  1. Screen reader users could use task-level headings as a solution to high-priority pain points.

If a project is misaligned with end-user needs, the developed solutions risk not having their desired impact. This assumption is important because it will inform whether the rest of this project is worth the time and resources required to implement it. Our user research in this space (see the User Scenarios and Impact sections) indicates that e-commerce is crucial and that differentiating task headings is a widespread challenge. Headings are generally more of a time-consuming problem than a fully progress-blocking one, but they can be such a frequent problem that they lead to users giving up.

  2. The four key tasks we’ve identified as instrumental to e-commerce navigation are universal and significant. 

These actions are: 1) browsing through search results, 2) accessing filters, 3) finding the “my cart” button, and 4) finding the search bar. This assumption will inform whether the proposed solution is addressing the right user experiences. If it proves false, then the dataset labelling categories will need to be adjusted. Our team has validated that these tasks are crucial to the shopping process through a survey with 42 participants and in-depth user interviews.

  3. The site structure of shopping websites will remain consistent enough to be identifiable by this dataset far into the future.

Due to the rapid evolution of web user interfaces, every long-term usability-oriented project runs the risk of becoming antiquated in 2-5 years. This assumption is difficult to validate due to the nature of innovation and the speed at which digital technology can adopt revolutionary changes. We recommend monitoring the evolution of shopping website interfaces and adding to the dataset in the future to accommodate new UI trends if need be.

  4. Machine learning is the right solution for this problem. 

Since the process of creating and implementing a machine learning solution is time and resource-intensive, it is crucial to confirm that the user issues identified in this proposal cannot be addressed using simpler, more efficient methods. The competitive landscape research section in this specification serves as a start for further investigation.

  5. Effective machine learning algorithms can be developed from the proposed dataset, given today’s state of technology.

If an effective machine learning algorithm is not currently possible, then the dataset runs the risk of becoming irrelevant before the machine learning can catch up to it.

  6. The proposed dataset does not already exist.

If the dataset proposed already exists, then there is no need to scrape or collect new images. If screenshots exist without the labels specified here, then there could still be reason to label these images with the proper labels. Our preliminary research has indicated that such a dataset does not exist, but this is worth further investigation regardless. 

  7. The existence of the proposed open-source dataset will not cause harm when used by third party developers.

The main risks we have identified related to the collection of data are regarding privacy of personally identifiable data. As long as all images are anonymized and all potential input data is fake, we don’t see much risk associated with the creation of this dataset.

Screen Reader Integration

  1. It is technically feasible for screen reader developers to incorporate this solution into their products, given limited computational power and other constraints.
  2. Screen reader designers will choose to incorporate this solution into their products.
  3. A “browsing mode” option is a good medium for the proposed functionality. It should smoothly integrate with the rest of a screen reader’s features and be intuitive, lightweight, and reliable.

User Adoption

  1. Task-specific heading identification functionality would save enough navigation time for blind users to overcome the learning curve of understanding how to use this new “browsing mode”.
  2. Screen reader users would use “browsing mode” frequently, and become more efficient, successful, and confident in their ability to independently shop online.
  3. The proposed solution will be more beneficial to blind web users than other potential solutions.

Metrics for Success

In order to validate the impact of this project, it could be helpful to collect more survey data on task-specific headings to affirm that they currently pose a significant problem to blind web navigators, and that the four key tasks identified capture the steps in the shopping process where headings would be most helpful. Another potentially helpful indicator of impact could be screen reader or shopping website complaint data. If there are a significant number of complaints from blind users about locating the search results, filters, shopping cart, or search bar on a page, then that would indicate a pressing need for this solution.

Algorithm performance could be evaluated by comparing the output labels of the algorithm to their expected labels, with both test images and new images not included in the dataset. Some user testing or research on the patience of screen reader users will need to be done in order to determine how low an error rate needs to be for users to deem this solution worth their time.
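A sketch of that comparison, assuming a held-out test set of predicted/expected label pairs:

  // Compute simple label accuracy over { predicted, expected } pairs.
  function labelAccuracy(pairs) {
    if (pairs.length === 0) return 0;
    const correct = pairs.filter((p) => p.predicted === p.expected).length;
    return correct / pairs.length;
  }

  // e.g. labelAccuracy([
  //   { predicted: "ProductCard-name", expected: "ProductCard-name" },
  //   { predicted: "results-off-page", expected: "zero-elements" },
  // ]) === 0.5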

After a sufficient algorithm exists, during the screen reader adoption phase, we recommend beta testing the “browsing mode” feature before releasing. User analytics might come in handy for monitoring how frequently beta testers engage with “browsing mode”, and what they primarily use it for. It would likely also be helpful to do user testing at this stage, in order to ensure that the audio output that screen reader users are hearing is giving them the right information, with customizable options if different people prefer different types of output.

Once the “browsing mode” feature is released and screen reader users have begun to use it, a usability testing report would be a great way to document the impact of this solution. This report could include comparison tests that measure the time it takes a user to find what they are looking for on a shopping website with and without “browsing mode” functionality. If there were time for a longitudinal study, it could monitor the daily shopping behavior of an A/B user study group, where half uses “browsing mode” and the other half does not. This could keep track of the number of terminated tasks (abandoned shopping experiences) and the number of purchases made successfully per day. Any significant experience improvement trends observed in this study could serve as validation for the efficacy of the solution.

In the far future, an indicator of the full success of this project would be seeing published project-related studies widely cited, and seeing further task-specific headings research adopted by other groups, either to deepen what has already been started or to adapt the dataset to other accessibility-related applications.

 

Appendix 1

Examples of each possible ML output:

ID or class of a repeating element using CV

Input : Image with a visible search result

Wayfair's search results page showing 4 fully visible search result cards

Output : [365, 755], [633, 835] - the bounding box around the title

JavaScript can then be used to find the ID or class associated with the title, in this case 

class=ProductCard-name

ID or class of a repeating element using publicly available code

Input : The site’s HTML, CSS, and JavaScript

Output:  class= ProductCard-name

 

Markers of interactive element using CV

Input : Image with a visible search bar

Wayfair's search result page, which has a fully visible search bar at the top in the center

Output : [485, 192], [1302, 238] - the bounding box around the search bar

JavaScript can then be used to find the ID or class associated with the search bar, in this case 

class=SearchBar-form

Markers of interactive element using publicly available code

Input : The site’s HTML, CSS, and JavaScript

Output:  class=SearchBar-form

 

Error: results appear off page

Input : Image of a screen after a user searches

Petco's search results page, where the search result cards run off the page and require the user to scroll for item information to be visible

Output : Results appear off page

 

Error: element does not apply to this page

Input : Image of a page with no search results

Wayfair's home page, with no search result cards

Output : Element does not apply to this page