How Can Collaborative Tools Improve Online Learning with VR Video?

Virtual Reality (VR) has long been touted as a revolutionary technology, offering a unique and immersive learning experience that can transport students to far-flung locations and bring abstract concepts to life. However, one of the biggest challenges for VR adoption has been the high cost of creating VR content. Instructors often have to seek help from VR developers or 3D model designers, because it’s hard to find or create content that fits their classes.

With the proliferation of inexpensive consumer panoramic video cameras and various types of video editing software, using 360-degree videos in VR has attracted more attention as an alternative way for instructors to build a realistic and immersive environment. It is a “more user-friendly, realistic and affordable” way to create a digital experience compared to developing a simulated VR environment.

Pedagogically, collaborative learning is better than individual learning in many scenarios. Yet there is a research gap in the development and empirical investigation of collaborative VR video learning environments.

Our work designed two modes to investigate the roles of collaborative tools and shared video control, and compared them with a plain video player (see our demo in the following video). Each mode contains a video viewing system and an after-video platform for further discussion and collaboration. Basic mode uses a conventional VR video viewing system together with an existing, widely available online platform. Non-sync mode includes a collaborative video viewing system with individual control and video timeline and an in-VR platform for after-video discussion. Sync mode contains the same in-VR after-video platform, but students have shared video control.

The study aimed to answer two research questions: 

RQ1: How does VR video delivery via existing technology (Basic mode) compare to collaborative VR video delivery (Sync and Non-Sync mode) on measures of knowledge acquisition, collaboration, social presence, cognitive load and satisfaction?
RQ2: How does individual VR video control (Non-sync mode) compare with shared video control (Sync mode) on measures of knowledge acquisition, collaboration, social presence, cognitive load, and satisfaction?

To examine the influence of different types of collaborative technology on the perceptions and experiences of online learning, we conducted a three-condition within-subjects experiment with 54 participants (18 groups of three). We collected quantitative data from knowledge assessments, self-reported questionnaires, and log files, then triangulated these validated measures with qualitative data from semi-structured interviews.

Figure 1. Study protocol

For RQ1, we found that both collaborative VR-based systems achieved statistically significantly higher scores on the measures of visual knowledge acquisition, collaboration, social presence, and satisfaction, compared to the baseline system. In the qualitative results, participants reported potential reasons for the lower scores of Basic mode on collaboration and satisfaction, such as the lack of shared context and current technical obstacles (e.g., echoes). They also appreciated the in-VR platform’s power to transmit and display visuals for after-video discussion, which may explain the lower scores of Basic mode on visual knowledge acquisition.

For RQ2, the shared control in Sync mode significantly increased the ease of collaboration and sense of social presence. In particular, shared control significantly increased view similarity (where the team was watching the same view of the video) and discussion time during the video. Based on the qualitative results, collaboration experiences were better with shared control in Sync mode due to better communication comfort. There was a tension between communication comfort and learning pace flexibility, and the control method influenced the perceived usefulness of collaborative tools.

Our work provides implications for design and research on collaborative VR video viewing. One important implication is balancing the trade-off between learning pace flexibility and communication comfort based on teaching needs. The expectations for time flexibility and collaboration experience might differ for diverse educational activities and learning scenarios. Therefore, VR collaborative applications should decide whether or not to use shared control based on specific purposes.

Finally, takeaways from this paper:

  • Collaborative VR video viewing systems can improve visual knowledge acquisition, collaboration, social presence, and satisfaction compared to a conventional system.
  • Shared video control in VR video viewing can enhance collaboration experiences by increasing communication comfort, but may also reduce learning pace flexibility.
  • In-VR platforms for after-video discussion can enhance visual transmission and engagement, and improve the overall learning experience in collaborative VR video environments.

Find more information in our paper here –– coming to CHI 2023! 

Cite this paper:

Qiao Jin*, Yu Liu*, Ruixuan Sun, Chen Chen, Puqi Zhou, Bo Han, Feng Qian, and Svetlana Yarosh. 2023. Collaborative Online Learning with VR Video: Roles of Collaborative Tools and Shared Video Control. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23). https://doi.org/10.1145/3544548.3581395

Towards Practices for Human-Centered Machine Learning

People are excited about human-centered AI and machine learning to make AI more ethical and socially appropriate. AI has captured the popular zeitgeist with promises of generalized artificial intelligence that can solve many complex human problems. These promises of ML, however, have had negative consequences, with both ridiculous and catastrophic failures. They rack up so fast that colleagues are keeping AI Incidents databases, reports of AI ethics failures, and more to boot.

How will ML researchers and engineers avoid these problems and move towards more compassionate and responsible ML? There aren’t many concrete guidelines on what it looks like to do human-centered machine learning in practice. And while there are some pragmatic guides, they often lack the connection between technical and social/cultural/ethical focus.

In my recently published CACM article, I argue that there is a gap in building human-centered systems: the gap between values we hold but lack actionable methods for, and technical methods that don’t align with our values. The paper argues for practices that bridge ever-significant values and ever-practical methods.

This paper synthesizes my CS and Critical Media Studies background in thinking about how we should DO HCML. It also builds on my decade of experience in human-centered research in a challenging area: predicting and acting on dangerous mental health behaviors discussed on social media. It builds on classical definitions of human-centeredness in defining HCML and lays out five practices for researchers and practitioners. These practices ask us to prioritize technical advancements EQUAL TO our commitments to social realities. In doing this, we can make genuinely impactful technical systems that meet people and communities where they’re at.

Here are the five big takeaways from the paper and the practices you can implement immediately.

  1. Ask if machine learning is the appropriate approach to take 
  2. Acknowledge that ML decisions are “political”
  3. Consider more than just a single “user” of an ML system
  4. Recognize other fields’ contributions to HCML 
  5. Think about ML failures as a point of interest, not something to be afraid of

Let’s dig into one of these: considering more than just a single “user” of an ML system. When considering who “uses” a system, we often only consider the person commissioning or building the system. Even in HCI, we talk about “users” of systems and (if lucky) the people whose data goes into the model. However, many systems have much larger constellations of people “involved” in the ML model. For example, the “user” may be a government or business in facial recognition technology. But the people whose faces are in that system are also “users” of the technology. Likewise, if that facial recognition system is used in an airport to screen passengers for flight identification, everyone who walks by may ambiently interact with it. The existing ML system also meaningfully impacts a person who chooses NOT to interact with it, if opting out means they must spend more time in airport security or have their identity scrutinized more closely. Both examples make it clear that many stakeholders are involved in an ML model, and we should consider all of them, including those whose data goes into creating the model.

I aim for these principles to inspire action – to encourage more profound research, empirical evaluations, and new ML methods. I also hope the practices make human-centered activities more tractable for researchers AND practitioners. I hope this inspires you and your colleagues to ask hard questions that may mean making bold decisions, taking action, and balancing these competing priorities in our work. 

You can read more about this paper in the recently published Featured Article in the Communications of the ACM here.

Siblings in the Digital Divide: Navigating Communication Challenges and Opportunities for Large Age Gap Relationships

Figure 1. Author and her brother

This photo captured a bittersweet moment in my life. It was taken on my first day at university, when I was brimming with excitement and anticipation for the journey ahead. At that time, I was 18 years old, and my brother, 12 years younger than me, was only six. After that day, I started my own life at Shandong University in Jinan, a city about 300 miles away from my hometown. During the following years, while we lived apart, my brother and I had to rely on video and audio calls to stay in touch.

But it was still not easy to maintain a close sibling relationship. As the years went by, the distance between us grew, and we missed out on so many precious moments that we could never get back. My parents also tried to help my brother and me stay in touch, for example by handing the phone to my brother. But somehow we still had nothing to talk about. Many times, after one or two minutes of small talk, my brother handed the phone back to my parents. That was really frustrating: so many technologies and tools have made it easier for us to stay connected, so why could we still not feel close to each other?

This experience led me to wonder: why are “large gap” sibling relationships particularly difficult to support? Before we dive into this question, I am going to cover some important background.

  • Why are sibling relationships critically important?
  • Why are sibling relationships different from other types of family connections?

Why are sibling relationships critically important?

There are a number of reasons why siblings matter. First, sibling relationships are an important aspect of child development. Although we tend to focus more on relationships with parents, prior work indicates that sibling relationships also significantly affect how children develop, particularly socially and emotionally. Second, sibling relationships last an extremely long time: almost all adults maintain contact with their siblings throughout their lives. Third, sibling relationships are pervasive. Most of us have brothers and sisters; one study estimated that 80 to 90% of individuals grow up with a sibling.

Why are sibling relationships different from other types of family connections?

Unlike parents or primary caregivers, who generally act as a secure base, siblings are thought to fulfill the social needs of children and are more often sought out for fun and playful interactions than for support and comfort. The sibling relationship also tends to be more equal than relationships with family members of other generations. It differs from peer relationships as well: because siblings have more co-constructed experiences and more frequent contact, shared experience plays an important role in sibling learning and communication.

Figure 2. Why are large-gap sibling relationships particularly important and difficult to support?

OK, here comes our key question:

Why are “large gap” sibling relationships particularly difficult to support?

I know many older brothers and sisters have problems similar to mine. As we mature, life changes such as going to university give us a completely different social circle and schedule, so this distancing is not surprising. For children, audio or video calls are also not engaging enough to sustain a long-distance relationship. So even though sibling attachment bonds remain important for each party, becoming an adult brings a decrease in contact and proximity. This makes it challenging to connect an older sibling who is an adult with a younger sibling who is a child.

The good news is that we now have more options to connect with remote family members. Many technologies have emerged in both industry and academia that offer at least a partial solution to the problem of long-distance families. I am not going to spend time on all of these existing tools, some of which you may already be very familiar with. Many designs that aim to connect remote families have also emerged from HCI research.

Figure 3. Examples of commercial tools that help families connect
Figure 4. Examples of research that helps families connect

However, although there is growing interest in distant family communication, the sibling relationship has not received much attention in the literature on designing for remote families, even though some systems are designed for the whole family, including siblings. Despite the special characteristics of sibling relationships mentioned above, few prior works have examined how technology might influence them, and none has explicitly investigated the specific communication challenges in large-gap sibling relationships.

To understand the intricacies of communication between siblings separated by a significant age gap, I used a mixed-method approach: a two-week diary study with older siblings and remote, semi-structured interviews with both siblings and one parent. The data were analyzed through a thematic analysis that involved open coding and clustering codes into themes. We recruited families with at least two siblings whose age difference was more than 5 years. The younger sibling was between 6 and 14 years old, and the older sibling had lived separately from the family for more than half a year. The siblings needed to have experience living together and to have regular direct or indirect communication.

Figure 5. Methods used in our study

The results of this study revealed the unique needs and challenges faced by stakeholders involved in remote communication between large age gap siblings. Specifically, we found that the relationships between large age-gap siblings consist of older-to-younger companionship and care, with older siblings also taking on a pseudo-parental role. At the same time, there is a younger-to-older rivalry that can create tension between siblings and reduce the quality of family communication.

Our findings also highlighted the role of older siblings in initiating communication, engaging younger siblings, and providing technical support. Meanwhile, parents help to enrich siblings’ communication and provide logistical facilitation. However, there are challenges in managing conflicting values between parents and older siblings, promoting child-led conversations, and navigating technology obstructions.

To address these challenges, we identified three design opportunities for technology to better support the needs and practices of different stakeholders in remote sibling communication. First, technology can support co-present involvement for different stakeholders’ requirements and needs in remote settings. Second, it can scaffold child-led conversations under asymmetric relationship expectations. Lastly, technology can help negotiate value conflicts between older siblings and parents, which affect siblings’ communication and their relationships.

As we move further into the digital age, the importance of sibling relationships remains as critical as ever. However, as our research has shown, maintaining strong connections between large age-gap siblings can be challenging. By leveraging the power of technology and designing solutions that address the unique needs and practices of different stakeholders, we can bridge the gap between remote siblings and create more meaningful connections.

Find more information in our paper here –– coming to CHI 2023! 

Cite this paper:

Qiao Jin, Ye Yuan, and Svetlana Yarosh. 2023. Socio-technical Opportunities in Long-Distance Communication Between the Siblings with a Large Age Difference. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23). https://doi.org/10.1145/3544548.3580720

References

  • (Toet et al., 2021) Toet, Alexander, et al. “Augmented reality-based remote family visits in nursing homes.” ACM International Conference on Interactive Media Experiences. 2021.
  • (Shakeri and Neustaedter, 2021) Shakeri, Hanieh, and Carman Neustaedter. “Painting Portals: connecting homes through live paintings.” Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems. 2021.
  • (Inkpen et al., 2013) Inkpen, Kori, et al. “Experiences2Go: sharing kids’ activities outside the home with remote family members.” Proceedings of the 2013 conference on Computer supported cooperative work. 2013.
  • (Nunez et al., 2019) Nunez, Eleuda, et al. “Effect on Social Connectedness and Stress Levels by Using a Huggable Interface in Remote Communication.” 2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE, 2019.
  • (Jarusriboonchai et al., 2020) Jarusriboonchai, Pradthana, et al. “Always with Me: Exploring Wearable Displays as a Lightweight Intimate Communication Channel.” 
  • (Judge et al., 2011) Judge, Tejinder K., et al. “Family portals: connecting families through a multifamily media space.” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2011.

Stakeholders, Rationales, and Challenges of Virtual Reality in Education: How Will VR Enter Classrooms?

We are living in a thrilling age where commercial VR headsets are no longer a luxury but an affordable reality. The ability of virtual reality to transform education has been a hot topic in recent times, with a wealth of articles, studies, videos, applications, and books dedicated to the subject. The possibilities of VR as an educational tool have captured the imagination of many, with some claiming it will have a profound impact on how we learn and educate. However, this raises the question: if the prospect of learning in VR is so exhilarating, then why isn’t it more prevalent (or even present!) in higher education? Who is putting the brakes on this exciting new learning tool? Are there hidden challenges beyond what we see in published research or studies, and do stakeholders beyond instructors and students influence the decision to embrace VR in education?

Figure 1. Who influences technology adoption in higher education?

It’s time to delve deeper into the complex world of virtual reality in education and explore the untold stories behind its adoption. For larger organizations or complex contexts such as universities, there is usually more than one type of stakeholder who works together to guide the technology’s adoption decisions. For example, prior work has identified a group of stakeholders (e.g., technology staff, financial staff and administrators) in higher education who will interact with each other to affect the strategies and decisions of a university. With this in mind, we pose three research questions: 

  • Who are the stakeholders we need to consider for using VR in the classroom?
  • What is the rationale for VR use in higher education?
  • What challenges do major stakeholders face in using VR technology in educational activities?
Figure 2. Three research questions we posed in our study

In order to get a more holistic view to answer the research questions, this study applied a multi-method approach with semi-structured interviews followed by two participatory design workshops with university students and instructors. We followed up with another round of interviews with other major stakeholders identified by the workshops. Then, we chose to have a data-driven process to analyze our data from the interviews and workshops.

Figure 3. Methods used in our study

Who are the stakeholders we need to consider for using VR in the classroom?

Through our first round of interviews, it became apparent that there are more people, beyond instructors and students, that we should consider as stakeholders when integrating VR in higher education. The university can be seen as an educational ecosystem, where instructors may be collaborating with other types of experts or services to facilitate their courses. Stakeholders identified by our participants under university systems include co-teaching instructors, TAs, teaching support staff, classroom designers, IT staff, and so on. There were also some stakeholders beyond the campus, including VR content creators/developers, funding providers, and industrial companies. 

We found that different stakeholders at higher education institutions have the power to accelerate the integration of VR technology into traditional classrooms. Most notably, institutional support can promote sustainability and maximize efficiency in many aspects in the long term, including but not limited to management, deployment, and content creation. 

Figure 4. Stakeholders who may influence VR adoption

What is the rationale for VR use in higher education?

Our data revealed five reasons why people choose to use VR in higher education. 

  • Increasing Social Presence
  • Accessing Otherwise Inaccessible Learning Contexts
  • Understanding and Remembering Visual and Spatial Knowledge
  • Supporting Embodied Learning
  • Attracting Students through Novelty

I am going to talk about the first one: social presence. Our work points out the importance of the collaborative social experiences that VR can provide in students’ learning process. Most participants identified the ability to create a realistic social environment that supports collaboration as one key benefit of VR. Compared with other benefits of VR, such as the engagement and interest that come from its novelty and eventually fade away, social presence is a long-lasting benefit because it derives from the nature of virtual reality. As one of our participants commented, “Virtual avatars and environment made it easy to get social cues, from facial expressions to body language, without worrying about privacy leaking like showing surroundings in the background on the video.”

What challenges do major stakeholders face in using VR technology in educational activities?

We also identified several challenges of using VR in higher education:

  • Course design investments. 
  • Financial consideration.
  • Learning curve. 
  • Technology management (e.g., storage, maintenance, distribution, and in-class management).
  • Health concerns. 

The optimistic predictions about introducing immersive VR into the classroom are based on the fact that the hardware is now much better and cheaper. Health issues, however, are one of the most important challenges, and they are relevant to all disciplines. Motion sickness or cybersickness, eye strain, and headache were the most frequently mentioned health concerns in the interviews. As our participants mentioned, it is extremely important to create an inclusive class and make VR accessible to people with different conditions or capabilities.

Our findings demonstrate that no matter how excited people are about using immersive VR in the classroom now, in most situations instructors can only include it as a small optional experience because of fundamental barriers to equity. For example, if one student experienced severe sickness, most instructors in our study would choose to no longer use VR. More importantly, the situation becomes more serious when these issues are not randomly distributed in the population. Take gender differences as an example: earlier studies have shown that men have an advantage over women with regard to cybersickness in VR. We can imagine how using VR could hurt gender equity, especially in already male-dominated fields such as computer science.

Takeaways from this article

  • Collaboration experience is critical for educational VR
  • Ensuring that VR is accessible is the most important challenge to adoption
  • It’s not about just instructors, it’s about the whole community

Find more information in our paper here

Cite this paper: 

Qiao Jin, Yu Liu, Svetlana Yarosh, Bo Han, and Feng Qian. 2022. How Will VR Enter University Classrooms? Multi-stakeholders Investigation of VR in Higher Education. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 563, 1–17. https://doi.org/10.1145/3491102.3517542

Identifying Outreach Windows in Online Opioid Recovery

Note: Rudy Berry completed a Research Experiences for Undergraduates (REU) program at the University of Minnesota with Professor Stevie Chancellor in the summer of 2022. This blog post summarizes his project outcomes. Way to go, Rudy!

Summary: Identifying when relapse has occurred is a key factor to consider when determining how to reach out to individuals with Opioid Use Disorder. Information like time elapsed since a previous relapse influences the type of resources and language that should be presented. In this project, I wrote a script that successfully identifies the date of incidence of relapse from a relapse disclosure post in an opioid addiction recovery community on Reddit. With this information, we were able to determine the amount of time that passed between an individual’s self-disclosed relapse and the time they reported it to the community. The ability to extract this kind of information from recovery posts may be a valuable tool for the future development of context-sensitive outreach systems.

Overview: Opioid Use Disorder (OUD), colloquially known as opioid addiction, is a highly stigmatized health issue that has fueled the growing opioid crisis in the United States for over two decades. The CDC reports that opioids were involved in about 75% of all U.S. drug overdose deaths in 2020. Opioids have been linked to over 500,000 deaths since 1999 (CDC, 2021). In response to this crisis and the difficulty of finding support, there has been growing engagement in online recovery forums for substance abuse. These communities give members an anonymous space to seek advice, share success stories, and vent frustrations. Members of online addiction recovery communities frequently share feelings of shame and guilt (Mudry et al., 2012), so the ability to detach oneself from a real-world identity is a major draw of these forums. The popular discussion website Reddit is home to a large online recovery community: r/opiatesrecovery.

In this project, our research goal was to identify the date that someone had relapsed in their OUD recovery journey. Identifying when relapse has occurred is key to aiding in the recovery process because advice is dependent on when someone has relapsed. If an individual has relapsed very recently, it is important to direct them to resources that can provide more urgent forms of harm reduction in the moment. If a relapse occurred in the distant past, it may be more appropriate to provide them with resources focused on long-term sobriety tips or maintenance care. The existence of online recovery communities presents a unique opportunity in HCI for researchers to develop technology that could provide additional support and resources to individuals with OUD beyond what community members already provide.

Therefore, the primary goal of this project was to write a script that could identify the date of incidence of relapse from the context of a relapse disclosure post. The project focused on two specific datasets: a set of posts and a set of comments, all gathered from r/opiatesrecovery on Reddit. The ability to extract this kind of contextual information from recovery posts would allow outreach systems to provide more context-sensitive resources and messaging to individuals in OUD recovery based on an estimated date of relapse. We also wanted to determine the average window size between the incidence date of relapse and the post date across all relapse posts and comments on the subreddit.
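The window between a relapse and its disclosure is simple date arithmetic. Below is a minimal sketch in Python, assuming each post has already been reduced to a pair of (estimated relapse date, post date); the function names and the sample dates are hypothetical and are not taken from the project's script or data.

```python
from datetime import date
from statistics import mean

def days_between(relapse_date, post_date):
    """Number of days from the estimated relapse to the disclosure post."""
    return (post_date - relapse_date).days

def average_window(pairs):
    """Mean window size in days over (relapse_date, post_date) pairs."""
    return mean(days_between(relapse, post) for relapse, post in pairs)

# Made-up example pairs, for illustration only.
example_pairs = [
    (date(2022, 6, 1), date(2022, 6, 8)),  # disclosed a week after relapsing
    (date(2022, 6, 7), date(2022, 6, 8)),  # disclosed the next day
]
print(average_window(example_pairs))
```

A per-post window like this is what would let an outreach system separate very recent relapses from ones in the distant past.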

What We Did: The first step we took was identifying posts where an individual had disclosed the occurrence of a relapse. Working with another team member, we created a regular expression that matches phrases indicating relapse, like “I relapsed” or “I just relapsed”. This was done in collaboration with another ongoing project in the lab that identifies people who disclose a relapse. This allowed us to create reduced datasets of relapse posts and comments from a larger general dataset from across the subreddit.
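As a rough illustration of this filtering step, here is a small Python sketch. The pattern is a hypothetical stand-in that covers only the two example phrases quoted above; the lab's actual regular expression is not given in this post and is likely broader.

```python
import re

# Hypothetical pattern: matches "I relapsed" and "I just relapsed" only.
RELAPSE_PATTERN = re.compile(r"\bI\s+(?:just\s+)?relapsed\b", re.IGNORECASE)

def find_relapse_mention(text):
    """Return the (start, end) character span of the first relapse phrase, or None."""
    match = RELAPSE_PATTERN.search(text)
    return match.span() if match else None

def filter_relapse_posts(posts):
    """Keep only the posts that contain a relapse disclosure phrase."""
    return [post for post in posts if RELAPSE_PATTERN.search(post)]

posts = ["Day 30 clean and feeling strong", "I just relapsed last night"]
print(filter_relapse_posts(posts))
```

Keeping the character span of the match is useful later, because the temporal expressions of interest are the ones close to the relapse phrase.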

Once we collected the relapse posts, the next step was to identify nearby temporal expressions that describe the relapse time frame, such as “yesterday” or “a week ago”. To do so we employed the SUTime library, a tool from the Stanford CoreNLP pipeline. SUTime is a powerful temporal tagging library that identifies temporal expressions by tokenizing text. It provides tags for four categories of temporal expressions: “Time”, “Duration”, “Set”, and “Interval”. When SUTime identifies a temporal expression, it returns the expression text, its type, a date resolved relative to a passed-in reference date or the system date, and the start and end position of the expression in the string of text.

For this project, we were particularly interested in the text of the type “Time” since this allowed for the extraction of the most specific dates. However, we realized that a handful of posts in our dataset were matching the type “Duration”. This included posts with phrases like “I relapsed for a week” or “I relapsed for 5 days”. These phrases were typically found in longer posts with many details and much more context to consider. We took this into account in our validation process and included durations to establish the limitations of our system. We wanted to know whether a human reader could identify a relapse date from the context surrounding a duration. To analyze this, we took a sample of twenty posts where relapse dates were identified and a sample where none were identified and replicated this with and without durations. We then hand annotated the text to identify false positive and negative identifications. 
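The choice between keeping and dropping durations amounts to filtering the tagger output by type. The sketch below assumes each tag is a plain dictionary with the fields described above (text, type, start, end); the exact field names and the type labels "TIME" and "DURATION" are placeholders modeled on this post's description, and real SUTime output may label things differently.

```python
def keep_tags_of_type(tags, allowed_types=("TIME",)):
    """Drop every tag whose type is not in allowed_types (e.g. durations)."""
    allowed = {label.upper() for label in allowed_types}
    return [tag for tag in tags if tag["type"].upper() in allowed]

# Hand-written example tags for "I relapsed yesterday after being clean for 5 days".
example_tags = [
    {"text": "yesterday", "type": "TIME", "start": 11, "end": 20},
    {"text": "5 days", "type": "DURATION", "start": 43, "end": 49},
]

time_only = keep_tags_of_type(example_tags)
with_durations = keep_tags_of_type(example_tags, allowed_types=("TIME", "DURATION"))
print([tag["text"] for tag in time_only])
print([tag["text"] for tag in with_durations])
```

Running both configurations over the same sample is one way to reproduce the with-durations and without-durations comparison described above.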

The second part of our validation process involved experimenting with the size of the character window around the relapse phrase to effectively identify relevant time words. We picked three different window sizes and evaluated the accuracy of each on the entire post dataset. We wanted to know, for each character count, for how many posts our script could accurately identify the day, week, or month of relapse.

*The number of posts at each step in the identification process

Findings:

The first part of the validation process revealed that the time tagging system was much more accurate when excluding duration temporal types. In a negative sample (posts with no relapse dates identified) of twenty posts with durations included, there was only one post where a human reader would be able to establish a relapse date. The system correctly identified that no relapse date was discernible from the other nineteen posts. When excluding durations, our system correctly identified that no relapse date could be determined for all twenty posts in the negative sample.

Within the positive sample (posts with relapse dates identified), the inclusion of durations had a more dramatic effect on the results. With durations included, there were eleven posts where the system correctly identified that a relapse date could be determined from the text. However, there were nine posts where the system incorrectly identified the beginning of a duration as a possible relapse date. So, for almost half the sample the script would identify a relapse date where a human reader would not be able to. This can be attributed to the fact that durations were typical of posts with more complexity to consider. For instance, in an example like “I got out of rehab then relapsed for five months”, the system would incorrectly identify the relapse date as five months prior to the post date. In this case a human reader would have to analyze the entire post to make a more accurate approximation of the relapse date. The results for the positive sample without durations were better, with only five posts incorrectly labeled as posts where a relapse date could be determined. Based on this outcome, we decided to work only with “Time” temporal types and exclude durations.
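The counts above can be turned into agreement rates with the human annotation. The sketch below assumes each of the four samples contained twenty posts; the figure of fifteen correct posts for the positive sample without durations is derived as twenty minus the five incorrect ones.

```python
SAMPLE_SIZE = 20  # assumption: every sample contained twenty posts

# Posts where the system's decision agreed with the human annotation.
correct = {
    ("negative", "with durations"): 19,
    ("negative", "without durations"): 20,
    ("positive", "with durations"): 11,
    ("positive", "without durations"): 15,  # derived: 20 - 5 incorrect
}


def agreement_rates(correct, n=SAMPLE_SIZE):
    """Fraction of posts in each sample where system and annotator agree."""
    return {key: count / n for key, count in correct.items()}


for (sample, setting), rate in agreement_rates(correct).items():
    print(f"{sample:8s} {setting:18s} {rate:.0%}")
```

Under these assumptions, excluding durations raises agreement on the positive sample from 55% to 75% and on the negative sample from 95% to 100%.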

During the second part of our validation process, we selected character counts of 100, 150, and 200 around our regular expression. The best performance was at 100 characters, with an accuracy of 73.4% for the entire dataset of posts. This was verified by reading each post and identifying the correct relapse date. The issue with wider character windows was that they included many temporal expressions, and our script is written to return the first expression it finds. In text like “I started my recovery journey a year ago and today I relapsed”, the relapse date would be incorrectly identified as a year ago. Alternatively, in a phrase like “Starting all over again today after I started relapsing again last month”, the relapse date would be incorrectly identified as the post date, or “today”. A window size of 100 still fails in both of these cases, and instances like these become more frequent with windows wider than 100 characters. Further testing is necessary to determine the best way for the script to choose between multiple time expressions.
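The two failure cases can be reproduced with a small sketch. `TIME_RE` is a toy stand-in for the temporal tagger, not SUTime, and `RELAPSE_RE` is a hypothetical relapse pattern. `first_expression` mimics the first-match behavior described above, while `nearest_expression` is one untested alternative (choosing the expression closest to the relapse mention), shown only as a possible direction.

```python
import re

RELAPSE_RE = re.compile(r"\brelaps\w*", re.IGNORECASE)  # hypothetical pattern
# Toy stand-in for a temporal tagger; it covers only a few phrases.
TIME_RE = re.compile(
    r"\b(?:today|yesterday|last (?:week|month|year)"
    r"|an? (?:day|week|month|year) ago)\b",
    re.IGNORECASE,
)


def first_expression(text):
    """Return the first temporal expression found, as the script does."""
    match = TIME_RE.search(text)
    return match.group(0) if match else None


def nearest_expression(text):
    """Return the temporal expression closest to the relapse mention.

    An untested heuristic, shown only as one possible alternative.
    """
    relapse = RELAPSE_RE.search(text)
    candidates = list(TIME_RE.finditer(text))
    if relapse is None or not candidates:
        return None

    def gap(match):
        # Characters between the expression and the relapse mention.
        if match.end() <= relapse.start():
            return relapse.start() - match.end()
        return max(0, match.start() - relapse.end())

    return min(candidates, key=gap).group(0)


examples = [
    "I started my recovery journey a year ago and today I relapsed",
    "Starting all over again today after I started relapsing again last month",
]
for text in examples:
    print(first_expression(text), "|", nearest_expression(text))
```

On these two sentences the first-match rule returns “a year ago” and “today”, while the nearest-match rule returns “today” and “last month”. Two examples are not evidence that the heuristic generalizes, so it would need the same validation as the window sizes.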

*This histogram shows the number of comments corresponding to certain window sizes in the dataset. For instance, the first bar shows that there were over 200 posts where relapse was disclosed to the subreddit within 0-10 days of occurrence.
This histogram shows the number of posts corresponding to certain window sizes in the dataset.
This histogram shows the portion of the comment histogram from 10-200 days.

The histogram data we collected reveals spikes in relapse disclosure within the first ten days of relapse as well as at the one-month, two-month, one-year, and two-year marks. The post dataset had a mean window size of 64.6 days with a median of 7.0 days. The comment dataset had a mean window size of 177.8 days with a median of 30.0 days.
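Here the window size is the number of days between the relapse and its disclosure. A minimal sketch of that calculation follows, using hypothetical date pairs rather than data from the study. It also illustrates why the mean can sit far above the median: a few disclosures made long after the relapse pull the mean upward.

```python
from datetime import date
from statistics import mean, median


def window_days(relapse_date, post_date):
    """Days between the extracted relapse date and the post date."""
    return (post_date - relapse_date).days


# Hypothetical (relapse date, post date) pairs, not taken from the dataset.
pairs = [
    (date(2021, 6, 14), date(2021, 6, 15)),
    (date(2021, 6, 12), date(2021, 6, 15)),
    (date(2021, 6, 8), date(2021, 6, 15)),
    (date(2021, 5, 15), date(2021, 6, 15)),
    (date(2020, 6, 15), date(2021, 6, 15)),
]

windows = [window_days(relapse, post) for relapse, post in pairs]
print("windows:", windows)
print("mean:", mean(windows), "median:", median(windows))
```

In this toy data a single one-year window is enough to push the mean well above the median, the same pattern seen in the post and comment datasets.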

Overall, the script we created can extract information about relapse incidence dates and could be easily replicated and improved for an outreach system. This system could use the window size in conjunction with other information such as sentiment and prior relapse disclosures to send an individual a message with context-sensitive resources and word choice. 

One finding from the identifier I found particularly interesting was how many people reached out to online communities to disclose relapse so soon after it had occurred. This highlights a need for these systems to focus on how to support individuals during the immediate aftermath of a relapse. In the future, further modifications could be made to address the contextual limitations of durations and multiple time expressions. Through this project I learned a lot about the benefits of anonymity in online spaces. It was interesting to see people being open about their setbacks and experiences in real-time. This work has made me more curious about the role that anonymous online communities play in de-stigmatizing OUD as well as mental health risks like anxiety and depression, and the types of systems that can safely facilitate them. 

https://journals.sagepub.com/doi/full/10.1177/1049732312468296

https://www.cdc.gov/drugoverdose/epidemic/index.html#:~:text=The%20number%20of%20drug%20overdose,rates%20increased%20by%20over%206%25.