Our Approach to Video AI in Mental Healthcare

6 min read · Jun 13, 2024

by Nathaniel Hundt and Eric Brown. Graphics and illustrations by Gerre Mae Barcebal, Ellen Duda, and Anna Ruch.

More than 1 in 5 adults in the United States experience mental illness annually. The American Academy of Pediatrics (AAP), the American Academy of Child and Adolescent Psychiatry (AACAP) and the Children’s Hospital Association (CHA) have all declared a national state of emergency in children’s mental health. The severe shortage of mental health clinicians and increased rates of burnout make this situation all the more dire.

Can Video and AI Help?

The pandemic opened our eyes to the benefits of telemedicine. We have also been inspired by solutions that leverage AI to improve access to mental health tools and wellness resources. Mental Health America’s virtual assistant, developed in collaboration with researchers at the University of Washington, is one such example: the tool helps people learn how to reframe distorted thinking, challenge thoughts and consider relevant, alternative ways of seeing a problem. We need more positive psychology tools like this woven into clinical practice, as well as continuing medical education.

Video and AI working in concert (“video AI”) can transform workflows such as patient intake and, in doing so, create more access for people with different abilities. We see video AI as a tool that helps patients communicate through self-recorded video, in response to thought-provoking questions that are easier to understand and available in multiple languages. Video AI can ask patients to clarify their responses, converse, and summarize information so that a clinician can pick up the conversation where the patient left off, without missing a beat.

At Dr. Katz, this AI-powered future excites us. We’ve organized our product vision into three delivery phases: 1) boosting internal productivity, 2) assisting with intelligent decision support, and 3) guiding interactive coaching and tutoring sessions.

The American Medical Association (AMA) has found most clinicians are optimistic and open to using AI in their work. Of course, in order to ensure that this technology is used for good, a responsible approach is needed.

What Makes AI Responsible?

Responsible AI is a set of principles used to guide the design, development, and deployment of solutions in ethical, equitable, and transparent ways. In January 2024, the Organisation for Economic Co-operation and Development (OECD) published Collective Action for Responsible AI in Health, a report that suggested responsible AI should be “trustworthy, ethical, and [minimize] risks while respecting human rights and democratic values.”

Our commitment to responsible AI development includes the following three practices:

Protect Trust: Steward end-user and customer data with robust privacy, security, and safety measures. Govern data and model usage in accordance with HIPAA-compliant policies and procedures. Acknowledge and credit trusted knowledge sources.
Promote Equity: Include a broad range of customers, users, and people with lived experience in research, design, development, and implementation phases to ensure our technology solutions meet diverse needs and actively reduce bias.
Prioritize Transparency: Clearly communicate where and when AI is used on our platform, ensuring explanations are easy to understand.

At Dr. Katz, we are at the forefront of implementing these practices, which affords us an opportunity to share examples of how they drive development decisions. Last fall we transparently shared how we were using video AI to generate metadata, such as suggested topic tags and titles, as well as paragraph descriptions of videos uploaded by content creators. Today, the addition of video AI is saving content creators precious time and streamlining publishing operations, allowing our platform to serve as a trusted source for ever more valuable content.

In order to generate recommended titles and descriptions with video AI, we leverage best-in-class large language models (LLMs), such as Claude 3, without the need to train on customer or end-user data — this decision stemmed from our regard for privacy and safety. We fine-tune these language models with our domain expertise and best practices understood from decades working in instructional design.

At the application level, we always keep the creator in the loop. People are empowered by our technology and can see recommendations, while retaining the right of editorial decision-making.
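The creator-in-the-loop idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of the actual platform code: the names `Suggestion`, `resolve_field`, and `build_metadata`, and the shape of the `decisions` mapping, are hypothetical. The point it shows is that an AI recommendation is published only when the creator explicitly accepts it, and that the creator's own edit always takes precedence.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Suggestion:
    """An AI-generated recommendation for one metadata field (hypothetical shape)."""
    field: str   # e.g. "title", "description", "tags"
    value: str


def resolve_field(suggestion: Suggestion,
                  creator_edit: Optional[str],
                  accepted: bool) -> Optional[str]:
    """Return the value to publish for a field, or None to leave it unset.

    The creator's own edit always wins; a suggestion is used only if the
    creator explicitly accepted it.
    """
    if creator_edit is not None:
        return creator_edit
    if accepted:
        return suggestion.value
    return None


def build_metadata(suggestions: List[Suggestion],
                   decisions: Dict[str, Tuple[Optional[str], bool]]) -> Dict[str, str]:
    """Combine AI suggestions with the creator's decisions.

    `decisions` maps a field name to (creator_edit, accepted). A field with
    no recorded decision is treated as not accepted, so nothing is published
    by default.
    """
    published: Dict[str, str] = {}
    for s in suggestions:
        creator_edit, accepted = decisions.get(s.field, (None, False))
        value = resolve_field(s, creator_edit, accepted)
        if value is not None:
            published[s.field] = value
    return published
```

In this sketch, a suggested title that the creator accepts is published as suggested, a description the creator rewrites is published in the creator's words, and a tag suggestion the creator never reviews is simply left out.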

Promoting Equity

One of the biggest imperatives we see in healthcare is reducing disparity in treatment and outcomes.

We know that marginalized people face persistent, systemic injustice and bias when they seek healthcare. One thing we have learned from patient focus groups and extensive research with users is that video, as a communication medium, is radically easier to use than text-based forms of communication, and that goes for older adults, too. As we continue to develop video AI solutions, we are committed to listening to and including people with lived experience in our research and design process. We believe video AI can break down stigma, improve health literacy, and reinforce culturally competent behavior.

Our approach puts people at the center of decision-making.

Applying Lessons from Workday

In 2018, Nathaniel co-founded Workday for People with Disabilities, an employee belonging council that advocates for accommodations. This council built a large tent, representing neurodivergent people, such as himself, and coworkers with more visible disabilities.

When the council was asked to provide feedback on the company’s new headquarters in Pleasanton, California, we observed that what was good for us (improved office design through accommodations such as bariatric chairs, wide hallways, accessible ramps, elevators, and a variety of workspace settings) was best for all coworkers.

We experienced positive change at Workday. At Dr. Katz, we’re applying the curb-cut effect, the idea that accommodations designed for people with disabilities end up benefiting everyone, to mental healthcare. Many times, those who need a provider most cannot find one due to burdensome onboarding processes, which involve answering the same questions on long, tiring, and difficult-to-complete forms. We are confident that our approach, which harnesses the power of video AI, will eliminate many obstacles associated with finding and receiving high-quality mental health care for those who need it most, benefiting everyone.

The Future of Conversational UI and Intake

We also believe that the next frontier combining video and AI will power clinical apps that feel lifelike and express empathy. This new category of apps will be engaging, understanding and personalized — far more dynamic than the rather robotic experiences we currently have when interacting with automated customer service agents.

*Each technology wave has changed the way we consume information, create, and communicate. This evolution has paved the way for new standards and elevated expectations for customer service.*

For educational purposes, we’ll be able to simulate therapeutic scenarios and use affective computing techniques to make interactions more understanding — this means that training simulations for residents and trainees will be much more immersive. Biomarkers from recorded video interviews will also contain stronger signals than audio or text-based signs of depression or anxiety. Clinicians seeing patients virtually will use assistants that are more efficient than current scribes.

We see video AI as a catalyst for reimagining outdated processes, beginning with patient intake for clinical care and research trial recruitment. We have seen how TikTok and Instagram have popularized short video as a medium for storytelling, self-expression, and communication. Integrating this practice into intake will be a paradigm shift for mental healthcare; we foresee a new normal in which patients share their own videos describing their struggles and express their feelings in a way that allows them to feel both heard and seen from the very beginning of their care journey. Video allows for a richer and more nuanced form of self-expression, capturing emotions and non-verbal cues that text or audio alone might miss.

We envision a world in which clinicians are able to collect information more efficiently, review data-rich patient-recorded videos and recruit people for clinical trials with increased focus on equity.

We see video AI as a profound tool that can assist both patients and clinicians in achieving a mutual goal: to be understood and to gain a deeper understanding of ourselves in all our complexity.