Voyant: A Reflection and Guide

Voyant, “a web-based tool” for “reading and analysis” of text, allows the user to discover the frequency of words in each document and in a corpus, which is a collection of documents (Voyant Tools). It displays visual components that help the user examine and analyze words for text mining and topic modeling. In addition to the word cloud (Cirrus), Voyant offers several other tools, such as Summary, Reader, Trends, and Contexts. As an interactive tool, Voyant allows the user to select a word in Cirrus and view how many times that word occurs. The Summary provides information about the texts, such as how many times a word appears in each document and which words are distinctive in certain documents. The Reader displays the documents or corpus in which the selected word or words appear and highlights each occurrence, which gives a visual reference for the word’s frequency within the corpus. Trends displays a line graph that gives another visual representation of the frequency of the word or words in a document or corpus. Contexts is an interesting tool that shows where the word or words appear in various parts of the document or corpus. This tool “shows the surrounding text of the selected word/words” (Voyant Tools). All five tools in Voyant provide a visual reference for the frequency or occurrence of words in a document and corpus, and together they help the user further analyze the frequency and distinctiveness of words. Voyant encourages the user to think beyond the surface of word counts; it raises questions about the impact of the words, depending on location, time, and other factors.
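To make the counting behind these tools more concrete, here is a minimal sketch in Python of two of the ideas described above: a word-frequency count (the kind of number that sizes the words in Cirrus) and a keyword-in-context listing similar in spirit to Contexts. This is not Voyant’s code. The function names, the window size, and the sample sentence are all my own assumptions for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how many times each word occurs in the text."""
    return Counter(tokenize(text))


def keyword_in_context(text, keyword, window=3):
    """Return (left, keyword, right) rows with up to `window` words on each side."""
    tokens = tokenize(text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            rows.append((left, tok, right))
    return rows


# Invented sample sentence, not taken from any corpus.
sample = "One day the river rose. No one came to the river that day."

freqs = word_frequencies(sample)
print(freqs.most_common(3))

for left, kw, right in keyword_in_context(sample, "one", window=2):
    print(f"{left:>15} | {kw} | {right}")
```

A real tool also has to decide how to handle stop words, punctuation, and spelling variants, and those choices change the counts.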

For my Voyant activity, I used the text files provided by Dr. Roberston. They are a dataset from the WPA Slave Narratives collection, which includes over a thousand interviews with former slaves from seventeen different states, conducted from 1936 to 1938. I copied the .txt files of the transcriptions, which come from Project Gutenberg, and pasted them into Voyant. After I selected the “Reveal” button, the next page presented a plethora of visual displays for examination and analysis. What caught my attention the most was Voyant’s ability to provide different visual results, which allowed me to view the words in the selected text files in a different way. The words were no longer just words; instead, they became visually significant. The Cirrus tool, a visual word cloud, displayed the words in different colors, and the size of each word indicated its frequency. I also exported the word cloud as a single view so that I could study it in more depth.

For the first activity, I selected two words, “come” and “one,” that had a high frequency in Cirrus. Then, I viewed the Trends graph to see the visual display of each word within a certain state and document. I was able to view the word “one,” but Voyant would not allow me to scale the second word, “come”: the Trends graph went blank, and the Reader highlighted another word. By examining the Trends graph for the first word, “one,” I noticed that it appeared less frequently within the documents that I selected for Texas and Tennessee. However, its frequencies were higher in documents from other states, such as Oklahoma and Missouri. It might be that the word was used less by either the interviewer or the interviewee in those states. It is also possible that the transcribers in one state used the word more frequently than those in other states. The frequencies of the word can be subjective because the context in which it was recorded may reflect someone else’s point of view and not necessarily the interviewee’s.

When I added more text from the list provided, I noticed a change in the word cloud. For this activity, I selected “dey” and “dat” as the two most common words in the corpus, based on the Cirrus word cloud. For each state whose documents are selected (Texas and Mississippi), the Trends graph displays the frequencies of the word for that state. However, when the Trends graph is exported for the second set of documents for each word, the graph looks similar to the first graph. I do not know how to fix this error in Voyant. When I look back at the Trends graphs (before export) for each word and each state, I can see the changes in frequency of the selected word. Depending on the document segments, the frequencies of “dey” and “dat” differ for each state. This suggests that the interviewees in one state used the word more frequently than the interviewees from the other. For example, “dat” appears to be used more frequently in Texas than in Mississippi: the Trends graph shows higher frequencies of the word in the Texas documents. There is a lower frequency of “dat” in document 4 for Texas. For Mississippi, the lowest frequency of “dat” is in document 12. Another example is the frequency of “dey” in the documents from Texas and Mississippi. The Trends graphs show high frequencies of the word for both states. However, the word appears less frequently in document 4 for both states. For Mississippi, there is another low frequency of “dey” in document 12. Both states show a lower frequency of “dey” in document 17. The Trends graphs for the frequencies of “dey” and “dat” in Texas and Mississippi show that word usage differs between states in the South. Also, context and dialect play a role in how frequently the interviewers recorded the words.
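The per-document comparison described above can be sketched in a few lines of Python. I am assuming here that the quantity being plotted is a relative frequency (occurrences of the word divided by the total number of words in the document), which is how I read the Trends graph; the document names and the short sample texts are invented placeholders, not quotations from the narratives.

```python
import re


def relative_frequency(text, word):
    """Share of the tokens in `text` that are exactly `word`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return tokens.count(word) / len(tokens)


def trend(documents, word):
    """Return (document name, relative frequency) pairs in document order."""
    return [(name, relative_frequency(text, word)) for name, text in documents]


# Invented placeholder documents, kept short so the numbers are easy to check.
documents = [
    ("doc_a", "dey went to town and dey come back"),
    ("doc_b", "dat was a long time ago"),
    ("doc_c", "dey say dat dey was free"),
]

for name, share in trend(documents, "dey"):
    print(name, round(share, 3))
```

Dividing by document length matters because a long interview will contain more of every word; raw counts alone would make long documents look as if they favored the word.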

For the distinctive word activity, I relied on Summary, Trends, and Contexts more than Cirrus. The distinctive words suggest to me that the way they were recorded and transcribed is subjective. A distinctive word can be used in different ways, and its meaning depends on the context. The word “ta” is distinctive of Missouri, where it is used as a preposition in place of “to.” Compared with Missouri, the other states rarely use “ta”; therefore, “ta” is distinctive of Missouri. The second word, “hoo,” is used more frequently in Kentucky, and the graph shows that it is also rarely used in other states. Therefore, “ta” and “hoo” are distinctive of Missouri and Kentucky, respectively.
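One common way to score distinctiveness is TF-IDF, which rewards a word for being frequent in one document and rare across the rest of the collection. As far as I understand, Voyant’s Summary uses a score of this kind for its distinctive words, but the sketch below is a textbook version and not Voyant’s implementation. The function names and the three short sample texts are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(documents, top_n=1):
    """Rank each document's words by a simple TF-IDF score and keep the top ones."""
    counts = {name: Counter(tokenize(text)) for name, text in documents.items()}
    n_docs = len(counts)

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())

    result = {}
    for name, c in counts.items():
        total = sum(c.values())
        scores = {
            w: (n / total) * math.log(n_docs / doc_freq[w])
            for w, n in c.items()
        }
        # Sort by score (highest first), breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[name] = ranked[:top_n]
    return result


# Invented sample texts, not quotations from the narratives.
documents = {
    "missouri": "i went ta town and ta the mill",
    "kentucky": "hoo called across the field and hoo answered",
    "texas": "i went to town and to the field",
}

result = distinctive_words(documents)
print(result)
```

A word that appears in every document gets a score of zero under this formula, which is why common words like “the” and “and” never come out as distinctive.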

Voyant is an engaging digital tool for examining and analyzing text from a document or a collection of documents (corpus). It provides a visual reference for looking at text from a different perspective, with several tools that show the frequency of words visually. Voyant is not perfect, but it is a great tool for visualizing text in different ways. Despite the minor setbacks (mostly due to technical issues and glitches), Voyant goes beyond the visualization of words. It allows the user to think about the words in a document or corpus and begin to question how they affect certain social, political, and cultural aspects of humanity.