Field Test: Can a Zikabot Best an Article on Zika?

I’ve always been a fan of science journalism.  Popular science tomes like Evolution for Everyone (David Sloan Wilson), The Immortal Life of Henrietta Lacks (Rebecca Skloot), and The Dynamic Dance (Barbara J. King) have expanded my sphere of knowledge and influenced the way I view the world around me.  The best science journalism seeks to both inspire and educate, turning difficult concepts into easy-to-digest ones that even laypeople can take something away from.

When we were tasked with coming up with an emerging media field test, I knew that I wanted to explore the use of technology in science journalism.  #SciComm, as it’s popularly referred to on platforms like Twitter and Instagram, is almost inseparable from the words that comprise it, as writers seek to elucidate difficult topics and make them palatable for a broader audience.  In recent years, however, brands like Discovery have branched out into first traditional video, then 360 video, and recently even Virtual Reality (with the advent of Discovery VR).  Scicomm and science journalism are clearly starting to embrace emerging technologies.  Could chatbots be one of those options?  

The hypothesis

For my field test, I used the emerging technology of a chatbot and compared it to the traditional media format of a written article, to see which delivery method would cause users to retain more information about a given subject.  Since the main goal of science journalism is to educate, rather than entertain, I thought a measure of knowledge retention would be appropriate.  It was my hypothesis that the traditional article format would provide a richer experience, and thus allow users to retain a certain set of facts better than if they had talked to a chatbot about the same topic.  

My reasoning for this was simple: a chatbot, I thought, might be a great way to get specific answers people were looking for, but the number of questions you could ask it is seemingly endless.  If my goal was to have people retain specific information, the freeform nature of chatbots might mean that while people could get any information they desired, they might not think to inquire about the information I wanted them to learn, a case of “you don’t know what you don’t know.”  On the other hand, a traditional article lays out everything the journalist wants you to learn within the bounds of the word count.  For this reason, I thought the article would be more effective at conveying specific information, but I was certainly willing to test the new chatbot technology to find out if this was actually the case!

The audience

The next step was to define an appropriate target audience on which to run this experiment.  I settled on a demographic of people 20-35 years of age, who could be reached via Facebook and email.  In particular, the chatbot would be distributed and advertised to people on Facebook, assuring that they were already familiar with the technology and platform (since the chatbot I chose to use operates through Facebook Messenger).  To my mind, this meant that my participants would be tech-savvy and unbiased against the “emerging technology” status of a chatbot.  The article, by contrast, would be delivered via email to a separate group within the same 20-35 age range, so that article readers would not overlap with people who had already perused the chatbot, but proper comparisons could still be made.

Why Zika?

I chose to focus my field test on Zika after following several science journalism sites for months (chief among them, STATnews), and seeing the Zika virus come up again and again as a journalistic topic in 2016.  The virus and 2016’s subsequent Public Health Emergency were front of mind for many this year.  I wanted to test out both emerging tech and #scicomm trends, and an important subject like Zika seemed just the way to do it.  The facts included in my chatbot and article were based on the top-asked questions in the Centers for Disease Control and Prevention’s (CDC) FAQ about Zika.

The experiment

The outline of this experiment was simple: all I had to do was build a chatbot to address key questions about Zika virus, write an article that covered the same basic questions and facts, and then have user/readers interact with the two different formats (only one type per person) before taking a quiz to assess how much they had learned from the materials in question.  Then I could compare the two formats based on their average quiz score to see which was more effective at transmitting information.

The technology

The implementation was anything but easy, however.  After deciding to use ChatFuel as my WYSIWYG editor for this field test, I connected my personal Facebook account to the service and was off to the races.  Or so I thought.  The test started out smoothly, with the creation of a welcome message and default answer.  These two messages even responded to me appropriately when I tested out the bot on messenger.com.  

I quickly created a few content cards and set up a few Artificial Intelligence (AI) rules, thinking I would be up and running in no time.  The AI rules were slightly tricky because I had to guess at what users might want to know in real time, and then apply as many different variations of those words to each content card as possible.

This meant that questions around things like pregnancy had to be able to be triggered by any combination of odd words – “pregnancy” and “pregnant,” to be sure, but also colloquialisms like “preggo” and initialisms like “ttc” (Trying to Conceive).  Thinking of these was difficult, because many variants were likely unknown to me, but I tried to be as comprehensive as I could with each topic I would be covering.
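To make the idea concrete, here is a minimal sketch in Python of how this kind of keyword-variant matching works.  The card names and variant lists mirror my bot, but the matching logic is my own approximation of what a rules engine like ChatFuel does behind the scenes, not its actual implementation.

```python
# A minimal sketch of keyword-variant triggers: each content card fires on any
# of a hand-curated list of word variants found anywhere in the user's message.

TRIGGERS = {
    "pregnancy_card": ["pregnancy", "pregnant", "preggo", "ttc", "trying to conceive"],
    "symptoms_card": ["symptom", "symptoms", "fever", "rash"],
}

def match_card(message):
    """Return the first card whose trigger words appear anywhere in the message."""
    text = message.lower()
    for card, variants in TRIGGERS.items():
        if any(variant in text for variant in variants):
            return card
    return None  # no match: fall back to the default answer

print(match_card("Is Zika dangerous if I'm preggo?"))  # -> pregnancy_card
print(match_card("hello"))                             # -> None
```

The hard part, as noted above, is not the matching itself but anticipating every variant a real user might type.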

You can see some of my AI rules below:

[Screenshot: AI rules, part 1]

[Screenshot: AI rules, part 2]

That was when I started running into trouble.  For starters, I couldn’t understand why the ChatFuel “buttons” I was adding wouldn’t appear when I triggered the responses that contained them.  I wasn’t creating them for users – I planned to do so down the line, but for starters, I just wanted to see what they did, because ChatFuel’s tutorial documentation was less than helpful.

Then I noticed that my welcome message – or anything else – was no longer being triggered by my AI rules.  It wasn’t for lack of trying, as you can see:

[Screenshot: the bot failing to respond to “hi”]

Frustrated, I left the bot behind for 45 minutes and when I returned, I chatted it “hi” again, just to see what would happen.  I didn’t expect much, but this time, my workflow came back to me exactly as I had planned it:

[Screenshot: the AI rules working as planned]

Excellent!  With that, I continued to flesh out each of the topics I wanted to cover, and assigned AI rules that would trigger each.  Lesson learned (as I’ve learned again and again with technology!) – sometimes it just needs time.

I ran into trouble again later, however, when trying to set my AI rules for potential questions users might ask about Zika virus and its symptoms.  Specifically, I thought that users might ask things like “can Zika kill me” or “can I die from Zika.”  I set my AI rules to trigger off words like die/death/kill/sick/illness/disease.  However, when I tested the bot, these words would not trigger my desired response (to my symptoms card) or in fact any response at all.

I explored the help documentation, community forums and more, but could not find a reason that my queries with the trigger words “die” or “kill” were not triggering my AI rules to display the symptoms card.  Interestingly enough, however, I noticed that just typing the words “die” or “kill” DID trigger my rules.  It was when they were included in a full question (“can I die from Zika?”  “Can Zika kill me?”) that the experiment went off the rails.
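With no definitive answer in the documentation, my best guess is a matching quirk.  The sketch below is an assumption on my part, not ChatFuel’s documented behavior, but it illustrates how a rule that matches the whole message behaves very differently from one that matches individual words – which would produce exactly the symptoms I saw:

```python
# Two plausible matching strategies for the same trigger list.

TRIGGERS = ["die", "death", "kill", "sick", "illness", "disease"]

def whole_message_match(message):
    # Fires only when the entire message is a trigger word.
    return message.lower().strip(" ?!.") in TRIGGERS

def per_word_match(message):
    # Fires when any individual word in the message is a trigger word.
    words = [w.strip("?!.,") for w in message.lower().split()]
    return any(w in TRIGGERS for w in words)

for text in ["die", "Can I die from Zika?"]:
    print(text, whole_message_match(text), per_word_match(text))
# die True True
# Can I die from Zika? False True
```

If ChatFuel was effectively doing something like the first function for those rules, a bare “die” would work while a full question would fall through to nothing – which is what I observed.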

Worried that this might be an issue for my users as well, I set out to create a “back-up” system by which they could navigate.  I decided on a sort of “click-through” menu option, as shown below:

[Screenshot: the click-through menu]

After reading the prompts, Zikabot users could then select from any of the following to learn more about that topic: Cause of Zika, Symptoms of Zika, Treatment of Zika, Location of Zika, and Zika & Pregnancy.  If one of these menu buttons was chosen, the bot would issue a message confirming the topic, then provide 1-2 relevant facts from the Centers for Disease Control and Prevention, a link to learn more about the topic, and then a link to return to the main menu where users could choose another topic to learn about.

[Screenshot: the symptoms card]

Again, this method was only in place so that the chatbot would continue to provide content and answers if users ran into problems with the Artificial Intelligence, as I had during my trials.  But as you’ll see later, it certainly came in handy!
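For readers curious how such a back-up menu hangs together, here is a rough sketch of the structure in Python.  The topic names mirror my click-through menu; the fact text and the links are paraphrased placeholders rather than the exact copy and URLs used in the live bot.

```python
# A rough sketch of the click-through fallback: each menu choice maps to a card
# with a confirmation, 1-2 facts, a learn-more link, and a route back to the menu.

MENU = {
    "Symptoms of Zika": {
        "facts": ["The most common symptoms are fever, rash, joint pain, and red eyes."],
        "learn_more": "https://www.cdc.gov/zika/symptoms/",  # placeholder link
    },
    "Zika & Pregnancy": {
        "facts": ["Zika infection during pregnancy can cause certain birth defects, including microcephaly."],
        "learn_more": "https://www.cdc.gov/zika/pregnancy/",  # placeholder link
    },
    # ...the Cause, Treatment, and Location cards follow the same shape
}

def show_topic(choice):
    card = MENU[choice]
    lines = [f"You chose: {choice}"]                    # confirmation message
    lines.extend(card["facts"])                         # 1-2 CDC facts
    lines.append(f"Learn more: {card['learn_more']}")   # link out
    lines.append("Type 'menu' to pick another topic.")  # route back to the menu
    return "\n".join(lines)

print(show_topic("Symptoms of Zika"))
```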

The article

The counterpart to the Zikabot chatbot was a short article written with the express purpose of conveying to readers all the same facts that the Zikabot offered to users who explored it to its full capabilities.  Again, these included information on cause, symptoms, treatment, location, and pregnancy concerns surrounding Zika virus.

The content of the article was as follows:

The Basics of Zika Virus

2016 was the year of public concern for a virus known as Zika virus.  While the virus was first identified in humans in 1952, it was not until February 1st, 2016 that the World Health Organization (WHO) declared a Public Health Emergency of International Concern regarding the recent association of Zika infection with clusters of microcephaly and other neurological disorders.  

The remainder of the year saw the spread of Zika virus into the United States, with active Zika virus transmission confirmed in Florida and Texas, and laboratory-confirmed Zika virus disease cases in 49 U.S. states and the District of Columbia as of December 7th, 2016.

Zika Virus Disease is caused by infection with the virus itself, which is primarily spread to humans through the bite of an infected mosquito.  The two species of mosquito known to spread the virus are Aedes aegypti and Aedes albopictus, which bite during both the day and the night.  

While transmission via mosquito is most common, Zika virus can also be spread during sex, from a pregnant woman to her fetus, and likely by blood transfusions (though this method has not been confirmed in the United States at this time, according to the Centers for Disease Control and Prevention).

Many people infected with the virus will be unaware that they have been infected, displaying either no symptoms or just a mild case.  The most common symptoms an infected person is likely to experience include fever, rash, joint pain, and conjunctivitis (red eyes).  Muscle pain and headache are also known to occur.

At the time of this writing, there is no specific medicine or vaccine to treat Zika virus infection, according to the CDC.  The symptoms can be managed but the virus itself cannot be “cured.”

This represents a problem for pregnant women, who are one of the most vulnerable populations in infected areas.  Infection during pregnancy can cause certain birth defects in the fetus, including microcephaly and clubfoot.  For this reason, the CDC has issued travel notices for areas currently dealing with Zika epidemics and guidelines for those living in or who must travel to infected areas.

For further information on Zika virus, please visit the Centers for Disease Control and Prevention or World Health Organization Zika websites.

The quiz

After users/readers interacted with the chatbot and the article, I wanted them to take a simple quiz to test fact retention.  I built the quiz using Google Forms and sent it to my participants after they had done step one (with either the chatbot or the article) so that they wouldn’t be compelled to cheat on the quiz.  The quiz asked a clarifying question so I could situate participants in my records, and then launched into five questions about the facts from the CDC that had hopefully been learned from the chatbot or article.  

The questions were as follows (with the correct answer marked for each):

  • Did you interact with the Zikabot (chatbot) or read the article?

    1. I interacted with the Zikabot (chatbot)
    2. I read the article

  • Which of the following is one type of mosquito known to spread the Zika virus?

    1. Aedes aegean
    2. Aedes albopictus (correct)
    3. Aedes albian
    4. Aedes apanman

  • Which of the following is a common symptom of Zika Virus Disease?

    1. Vomiting
    2. Sore throat
    3. Joint pain (correct)
    4. Diarrhea

  • Which of the following birth defects can be caused by Zika virus infection?

    1. Clubfoot (correct)
    2. Cleft palate
    3. Down Syndrome
    4. Congenital heart defects

  • As of December 7th, 2016, how many U.S. states (not including the District of Columbia) had seen laboratory-confirmed Zika virus disease cases?

    1. 43
    2. 22
    3. 7
    4. 49 (correct)

  • Which drug can be used to treat Zika virus infection?

    1. Acetaminophen
    2. Avermectin
    3. Qualaquin
    4. There is no specific medicine or vaccine for Zika virus (correct)
Since Zika virus has been so much in the news this year and my participants might know some of this information already, I specifically wrote slightly harder questions.  For example, there are two types of mosquitoes known to spread Zika virus, Aedes aegypti and Aedes albopictus, but only one of them is widely reported on in the media (Aedes aegypti).  Similarly, the birth defect of microcephaly is widely known to be affiliated with Zika infection in pregnancy, but other birth defects are less commonly reported on.  

In cases like this, I purposefully selected the lesser-known option so that I could more properly assess whether my participants had learned the information from the chatbot or article rather than from prior knowledge.

The results

I sent both the chatbot and the article to 25 people each, and collected as many quizzes back from those groups as I could.  In the end, I received 20 completed quizzes from the chatbot subgroup and 22 completed quizzes from the article subgroup.  

Those who interacted with the chatbot got 68 out of 100 questions correct, for an average score of 68%.  Those who read the article got 86 out of 110 questions correct, for an average score of 78.18%.

Learning Format   Participants   Correct Answers   Total Questions   Average Group Score
Chatbot           20             68                100               68%
Article           22             86                110               78.18%


At more than ten percentage points higher on average, the article group was the clear winner for fact retention.  In addition to their higher average score, the article group had a higher percentage of scores in the 80-100% correct range, with about 73% (16/22) scoring in that range.  By contrast, only 55% of the chatbot group scored in the 80-100% correct range.

Learning Format   Score: 1/5   Score: 2/5   Score: 3/5   Score: 4/5   Score: 5/5
Chatbot           2            2            5            8            3
Article           1            1            4            9            7
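For anyone who wants to check the arithmetic, the summary figures above can be recomputed directly from the score distributions in the second table (a quick Python sketch; the distributions are exactly the ones reported there):

```python
# Rebuild each group's scores from the distribution table and recompute the
# summary figures: total correct, average score, and share in the 80-100% range.

distributions = {
    "Chatbot": {1: 2, 2: 2, 3: 5, 4: 8, 5: 3},   # score out of 5 -> number of participants
    "Article": {1: 1, 2: 1, 3: 4, 4: 9, 5: 7},
}

for group, dist in distributions.items():
    scores = [score for score, count in dist.items() for _ in range(count)]
    correct = sum(scores)
    possible = 5 * len(scores)
    high_scorers = sum(1 for s in scores if s >= 4)   # 4/5 or 5/5, i.e. the 80-100% range
    print(f"{group}: {correct}/{possible} correct, "
          f"average {correct / possible:.2%}, "
          f"{high_scorers}/{len(scores)} ({high_scorers / len(scores):.0%}) scored 80-100%")

# Chatbot: 68/100 correct, average 68.00%, 11/20 (55%) scored 80-100%
# Article: 86/110 correct, average 78.18%, 16/22 (73%) scored 80-100%
```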


Based on these numbers, it becomes apparent that, in this particular instance, the article format was able to convey a specific set of facts to readers better than the chatbot format was.  This met my expectations and supported my hypothesis.

Interviews + improvements

I spoke with a number of my chatbot respondents to get their take on how the chatbot experience went, and how I might improve it in the future.

Right away, one respondent pointed out the issues with relying on a chatbot to teach you specific information, saying “I only asked 2 questions and got thorough answers and links to the CDC for more information (very helpful). However I did not go through all of the options provided by the chat box and I didn’t initially understand that I had to ask a very specific question. For example, I first asked ‘how long after contracting Zika should I wait until getting pregnant?’ But that wasn’t one of the options. I also didn’t see it explicitly answered in the general pregnancy option.”

This illustrates one of the main reasons my hypothesis leaned toward the article as a preferred source of specific information.  While a chatbot is great for answering any number of questions, the technology as it stands today (if you want to rely on the Artificial Intelligence responses rather than the click-through back-up menu that I instituted) requires a user to know what they want to ask and what they want to learn from it.  This makes it difficult to learn entirely new topics because, again, you don’t know what you don’t know.    

However, a different respondent in the chatbot group actually saw benefit to the open-ended nature of the bot, specifically since there was a menu in place to guide her questions: “I liked the options for topics a lot so I could pick things out I might not have thought of.”

This informs how I might create a chatbot in the future, including a menu and perhaps surveying my target audience in advance for suggestions of what they might want to know on a particular topic.  In this case, I relied on the top questions being asked about Zika virus on the CDC’s website, allowing those topics to inform what my bot could talk about.

When questioned, users also offered ways to improve the chatbot itself.  One pointed out that “the answers could have a little more detail from the articles. Like within location, what is [sic] the most common places. If I’m using a bot, I’m probably not going to click into an article (‘cause I’m lazy).”  Another mentioned that “an ‘is there zika near me’ feature in the chat would be nice.”

The latter could certainly be set up with AI rules based on zip codes or state name/abbreviation trigger words, and is one improvement I would make for the future.  
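As a sketch of what that might look like, the snippet below matches a state name or abbreviation in the user’s message and answers based on where active local transmission had been confirmed as of late 2016 (Florida and Texas, per the article above).  The truncated state list and the response wording are illustrative assumptions, not real CDC data and not anything ChatFuel provides out of the box.

```python
# A hedged sketch of an "is there Zika near me" rule keyed on state names/abbreviations.
import re

ACTIVE_LOCAL_TRANSMISSION = {"FL", "TX"}  # late-2016 picture described in the article
STATES = {"florida": "FL", "texas": "TX", "new york": "NY", "ohio": "OH"}  # truncated for brevity

def zika_near_me(message):
    text = message.lower()
    for name, abbr in STATES.items():
        if name in text or re.search(rf"\b{abbr.lower()}\b", text):
            if abbr in ACTIVE_LOCAL_TRANSMISSION:
                return f"Active local Zika transmission has been confirmed in {abbr}."
            return (f"No active local transmission has been confirmed in {abbr}, "
                    "but travel-related cases have been reported in most states.")
    return "Tell me a state and I can check it for you."

print(zika_near_me("Is there Zika near me? I'm in Florida."))
print(zika_near_me("Any Zika in Ohio?"))
```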

Many users were as frustrated with the imperfect AI rules as I was, noting “the bot gets stuck if it encounters a non-predefined question” and “I had a little trouble when I asked it longer questions.”

One respondent really dug deep to figure out what wasn’t working: “It was really choosing from menus, rather than chatting. (Which is fine, chatting is hard and the menus and links worked well.) I then tried to see if it would respond with the information if I asked the question a little differently from the listed version, ‘Do we know how Zika is treated?’ to see if it could reference its topic ‘How is Zika treated’ but it didn’t respond at all. So then I typed the exact question, ‘How is Zika treated?’ (from the list of topics it said it covered) and it gave me the standard, ‘Here are some of the topics I can cover…’”

In this response, we see one of the main issues with implementation of the chatbot itself.  I wrote the bot to include a list of topics users might ask, listing them as helpful suggestions, but then failed to put the exact phrasing of those topics into those AI rules.  So rather than users being able to mimic me and ask “How is Zika treated?,” they had to use one of my AI rule triggers like “treatment,” “treated,” or “treat” and hope that the AI played nice.

To combat this in the future, I would add AI rules that trigger on the full phrasing of every question the bot’s default answer says users can ask – entire trigger phrases like “What causes Zika?,” “What are the symptoms of Zika?,” “How is Zika treated?,” “Where is Zika currently found in the United States?,” and “Are there any special considerations for Zika during pregnancy?”

Compare this to my existing AI rules (in the “the technology” section above) and you’ll see where the issue arises!  With unreliable AI technology in ChatFuel, users were left high and dry if they didn’t phrase their question exactly right.
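Sketched below, that fix amounts to registering the exact wording of each suggested question as its own trigger, alongside the single-word variants, so users can quote the menu back to the bot verbatim.  The phrasings mirror my list above; the matching logic is again my own approximation, not ChatFuel’s.

```python
# Map the exact suggested questions to cards; keyword rules still handle rephrasings.

PHRASE_TRIGGERS = {
    "what causes zika": "cause_card",
    "what are the symptoms of zika": "symptoms_card",
    "how is zika treated": "treatment_card",
    "where is zika currently found in the united states": "location_card",
    "are there any special considerations for zika during pregnancy": "pregnancy_card",
}

def match_phrase(message):
    cleaned = message.lower().strip(" ?!.")
    return PHRASE_TRIGGERS.get(cleaned)  # None if the user rephrased the question

print(match_phrase("How is Zika treated?"))             # -> treatment_card
print(match_phrase("Do we know how Zika is treated?"))  # -> None (keyword rules still needed)
```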

One chatbot user interviewed also pointed out that the newness of the technology may have been a sticking point, affecting her quiz score.  “I got caught up a little in the tech and missed some of the info,” she said, “but I think once I got more used to it, I would have got more out of it.”

In sum, however, users were fairly pleased with their experience with this emerging technology, with one calling it a “very interesting way to present information. Might be an interesting way to collect data and gauge people’s primary concerns based on the questions they asked or topics they selected.”  Another respondent said, “I thought the chatbot pointed to well-researched and accurate information. I think this could be a good [tool] for people seeking good information on Zika virus.”  Still another appreciated the references as well, stating, “it seems like a great idea and I appreciate how it connects you with CDC info for reference.”

Based on interviews, the tech seemed well-received, and I was left with a number of ideas for improving it on the next go-round, including:

  • Finding a chatbot service with more consistent AI technology
  • Adding location-based queries to my AI rules (is Zika in X zip code or X state?)
  • Setting trigger rules for exactly the phrasing used in my menu of topic options

The future of chatbots

In this case, my hypothesis proved correct, demonstrating that, for purposes of specific fact retention, a chatbot was not the best way to present research.  Rather, the traditional journalistic article served readers better in this sense.  Reflecting back on this experience, I think the experiment was worthwhile, but perhaps I wasn’t asking the technology the right question or putting it to its best use.

With the open-ended nature of chatbots, I don’t believe they’re best for conveying specific facts, but rather act as an interactive FAQ, allowing users to shape their own questioning experience.  Put another way, chatbots aren’t ideal for learning and then acing a test; they’re right for people who have specific questions in their own minds, and want the answers to those questions right away (without wading through a journalist’s viewpoint or filler words).

I think of chatbots almost like a live customer service account, where you ask a question and immediately get an informed answer.  This could be because this is how I’m used to “chat” being used, but moreover, the medium lends itself to rapid response for specific questions, not telling a story with a beginning, middle, and end and key takeaways for readers.

In the future, I see chatbots being used in place of live customer service agents.  I don’t know that they’re the right fit for long-form journalism, but as a delivery method for discrete facts and answers (the way I served up CDC fact sheets with my chatbot, for example), they may be ideal.

Chatbots are also great for breaking news…with the caveat that they would be the “second news breakers,” as they would have to be set up as soon as information came in but could not be the first ones on the scene.  As a way to inform the public about breaking news after the first source has broken it, however, they could be invaluable.

Chatbots are just starting to come into public consciousness as a resource for delivering breaking news and answers to specific reader queries.  With buy-in from science journalism outlets, I could see chatbots being used for these purposes in that arena, as well – but I don’t think a back-and-forth session with a chatbot will be replacing a well-constructed science journalism article any time soon!
