
<!DOCTYPE html>
<html >

<head>

  <meta charset="UTF-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <title>overview.utf8</title>
  <meta name="description" content="">
  <meta name="generator" content="bookdown <!--bookdown:version--> and GitBook 2.6.7">

  <meta property="og:title" content="overview.utf8" />
  <meta property="og:type" content="book" />
  
  
  <meta name="twitter:card" content="summary" />
  <meta name="twitter:title" content="overview.utf8" />
  
  
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <meta name="apple-mobile-web-app-capable" content="yes">
  <meta name="apple-mobile-web-app-status-bar-style" content="black">
  
  
<!--bookdown:link_prev-->
<!--bookdown:link_next-->
<script src="libs/jquery-2.2.3/jquery.min.js"></script>
<link href="libs/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<script src="libs/gitbook-2.6.7/js/app.min.js"></script>
<script src="libs/gitbook-2.6.7/js/lunr.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-search.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-sharing.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-fontsettings.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-bookdown.js"></script>
<script src="libs/gitbook-2.6.7/js/jquery.highlight.js"></script>


<link rel="stylesheet" href="style.css" type="text/css" />
</head>

<body>


<!--bookdown:title:start-->
<!--bookdown:title:end-->

<!--bookdown:toc:start-->
  <div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">

    <div class="book-summary">
      <nav role="navigation">
<!--bookdown:toc2:start-->
<ul>
<li><a href="#chap:overview"><span class="toc-section-number">1</span> An overview of the EMU-SDMS </a></li>
<li><a href="#extract-symbolic-information-we-are-interessted-in"><span class="toc-section-number">2</span> extract symbolic information we are interested in</a></li>
<li><a href="#extract-the-according-sample-values"><span class="toc-section-number">3</span> extract the corresponding sample values</a></li>
</ul>
<!--bookdown:toc2:end-->
      </nav>
    </div>

    <div class="book-body">
      <div class="body-inner">
        <div class="book-header" role="navigation">
          <h1>
            <i class="fa fa-circle-o-notch fa-spin"></i><a href="./"></a>
          </h1>
        </div>

        <div class="page-wrapper" tabindex="-1" role="main">
          <div class="page-inner">

            <section class="normal" id="section-">
<!--bookdown:toc:end-->
<!--bookdown:body:start-->
<div id="chap:overview" class="section level1">
<h1><span class="header-section-number">1</span> An overview of the EMU-SDMS <a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a></h1>
<p><img src="pics/EMU-webAppIcon-roundCorners.png" width="512" /></p>
<p>The EMU Speech Database Management System (EMU-SDMS) is a collection of software tools that aims to be as close as possible to an all-in-one solution for generating, manipulating, querying, analyzing and managing speech databases. It was developed to fill a gap in the landscape of software tools for the speech sciences by providing an integrated system centered around the R language and environment for statistical computing and graphics. This manual contains the documentation for the three software components wrassp, emuR and the EMU-webApp. In addition, it provides an in-depth description of the emuDB database format, which is also considered an integral part of the new system. These four components comprise the EMU-SDMS and benefit the speech sciences and spoken language research by providing an integrated system for answering research questions.</p>
<p>This manual is targeted at new EMU-SDMS users as well as users familiar with the legacy EMU system. In addition, it is aimed at people who are interested in technical details such as data structures, formats and implementation strategies, be it for reimplementation purposes or simply for a better understanding of the inner workings of the new system. To accommodate these different target groups, after initially giving an overview of the system, this manual presents a usage tutorial that walks the user through the entire process of answering a research question. This tutorial starts with a set of audio files and Praat annotation files and ends with a statistical analysis that addresses the hypothesis posed by the research question. The following part of this documentation is separated into six chapters that give an in-depth explanation of the various components that comprise the EMU-SDMS and of the integral concepts of the new system. These chapters provide a tutorial-like overview with multiple examples. To give the reader a synopsis of the main functions and central objects provided by emuR, the main R package of the EMU-SDMS, an overview of these functions is presented in a separate part. A further part focuses on the actual implementation of the components and is geared towards people interested in the technical details. Further examples and file format descriptions are available in various appendices. This structure enables the novice EMU-SDMS user to skip the technical details and still get an in-depth overview of how to work with the new system and discover what it is capable of.</p>
<p>A prerequisite that is presumed throughout this document is the reader’s familiarity with basic terminology in the speech sciences (e.g., how speech is annotated at coarse- and fine-grained levels). Further, we assume the reader has a grasp of the basic concepts of the R language and environment for statistical computing and graphics. For readers new to R, there are multiple, freely available R tutorials online. R also has a set of very detailed manuals and tutorials that come preinstalled with R. To access R’s own “An Introduction to R” tutorial, simply type <code>help.start()</code> into the R console and click on the link to the tutorial.</p>

<p>The EMU-SDMS has a number of predecessors that have been continuously developed over a number of years. The components presented here are the completely rewritten and newly designed next incarnation of the EMU system, which we will refer to as the EMU Speech Database Management System (EMU-SDMS). The EMU-SDMS keeps most of the core concepts of the previous system, which we will refer to as the legacy system, in place while improving on aspects such as usability, maintainability, scalability, stability and speed. We feel the redesign and reimplementation elevates the system into a modern set of speech and language tools that enables a workflow adapted to the challenges confronting speech scientists and the ever growing size of speech databases. The redesign has enabled us to implement several components of the new EMU-SDMS so that they can be used independently of the EMU-SDMS for tasks such as web-based collaborative annotation efforts and speech signal processing in a statistical programming environment. Nevertheless, the main goal of the redesign and reimplementation was to provide a modern set of tools that reduces the complexity of the tool chain needed to answer spoken language research questions down to a few interoperable tools. The tools the EMU-SDMS provides are designed to streamline the process of obtaining usable data, all from within an environment that can also be used to analyze, visualize and statistically evaluate the data.</p>
<p>When developing the new system, rather than starting completely from scratch, it seemed more appropriate to partially reuse the concepts of the legacy system in order to achieve our goals. A major observation at the time was that the R language and environment for statistical computing and graphics was gaining more and more traction for statistical and data visualization purposes in the speech and spoken language research community. However, R was mostly used only towards the end of the data analysis chain, where the data had usually been pre-converted into comma-separated values or an equivalent file format by the user, using other tools to calculate, extract and pre-process the data. While designing the new EMU-SDMS, we brought R to the front of the tool chain, to the point just beyond data acquisition. This allows the entire data annotation, data extraction and analysis process to be completed in R, while keeping the key user requirements in mind. From our experience of using the legacy system for research purposes and in various undergraduate courses, we learned that the key user requirements were data and database portability, a simple installation process, a simplified and streamlined user experience, and cross-platform availability. Supplying all of the core functionality of the EMU-SDMS in the form of R packages that do not rely on external software at runtime seemed to meet all of these requirements.</p>
<p>As the early incarnations of the legacy EMU system and its predecessors were conceived either at a time that predated the R system or during the infancy of R’s package ecosystem, the legacy system was implemented as a modular yet composite standalone program with a communication and data exchange interface to the R/Splus systems. Recent developments in R’s package ecosystem have made R an attractive sole target platform for the EMU-SDMS. The packages that are now available provide additional functional power that enabled the core functionality of the EMU-SDMS to be implemented in the form of R packages. The availability of certain R packages had a large impact on the architectural design decisions that we made for the new system.</p>
<p>The R example below shows the simple installation process that we were able to achieve thanks to the R package infrastructure. Compared to the legacy EMU and other systems, the installation process of the entire system has been reduced to a single R command. Throughout this documentation we will try to highlight how the EMU-SDMS is also able to meet the rest of the above key user requirements.</p>
<pre><code class="r"># install the entire EMU-SDMS
# by installing the emuR package
install.packages("emuR")</code></pre>
<p>It is worth noting that throughout this manual code snippets are given in the form of R examples like the one above. These examples represent working R code that allows the reader to follow along in a hands-on manner and gives a feel for what it is like to work with the new EMU-SDMS.</p>

<p>As was previously mentioned, the new EMU-SDMS is made up of four main components: the emuDB format; the R packages wrassp and emuR; and the web application, the EMU-webApp, which is the new GUI component of the EMU-SDMS. Within the architecture of the system, the emuR package plays a central role, as it is the only component that interacts with all of the other components of the EMU-SDMS. It performs file and DB handling for the files that comprise an emuDB; it uses the wrassp package for signal processing purposes; and it can serve emuDBs to the EMU-webApp.</p>

<p>Although the system is made up of four main components, the user largely interacts directly only with the EMU-webApp and the emuR package. The default workflow illustrating these interactions is summarized below.</p>

<p>Initially the user creates a reference to an emuDB by loading it into their current R session using the <code>load_emuDB()</code> function (step 1). This database reference can then be used either to serve (<code>serve()</code>) the database to the EMU-webApp or to query (<code>query()</code>) the annotations of the emuDB (steps 2 and 3). The result of a query can then be used either to perform one or more so-called requeries or to extract signal values that correspond to the result of a query or requery (step 4). Finally, the signal data can undergo further preparation (e.g., correction of outliers) and visual inspection before further analysis and statistical processing is carried out (steps 5, 6 and 7). Although the R packages provided by the EMU-SDMS do provide functions for steps 4, 5 and 6, it is worth noting that the plethora of packages in the R package ecosystem can and should be used to perform these duties. The resulting objects of most of the above functions are derived from standard R object classes such as data frames, which can be used as inputs for hundreds if not thousands of other R functions.</p>
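<p>The workflow described above can be sketched in a few lines of R. This is a minimal sketch rather than a definitive recipe: it assumes that the emuR package is installed, it uses the demo database that emuR can create via <code>create_emuRdemoData()</code>, and the query string and the variable names (<code>demo_dir</code>, <code>sl</code>, <code>sl_parents</code>, <code>td</code>) are chosen purely for illustration.</p>
<pre><code class="r">library(emuR)

# create the emuR demo data once (assumption: it may already exist in tempdir())
demo_dir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demo_dir)) create_emuRdemoData(dir = tempdir())

# step 1: load the emuDB into the current R session
ae = load_emuDB(file.path(demo_dir, "ae_emuDB"), verbose = FALSE)

# step 2 (interactive, therefore commented out): serve the emuDB to the EMU-webApp
# serve(ae)

# step 3: query the annotations (illustrative query for all "n" segments)
sl = query(ae, "Phonetic == n")

# step 4: requery the parents of the result and extract matching signal values
sl_parents = requery_hier(ae, sl, level = "Phoneme")
td = get_trackdata(ae, sl, "MEDIAFILE_SAMPLES",
                   resultType = "emuRtrackdata", verbose = FALSE)

# steps 5-7: the result can be handed to any other R function
summary(td$T1)</code></pre>
<p>Because the extracted values end up in an ordinary data frame, the remaining steps of the workflow do not depend on emuR at all.</p>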

<p>Besides providing a fully integrated system, the EMU-SDMS has several unique features that set it apart from other current, widely used systems. To our knowledge, the EMU-SDMS is the only system that allows users to model their annotation structures based on a hybrid model of time-based annotations (such as those offered by Praat’s tier-based annotation mechanics) and hierarchical timeless annotations. An example of such a hybrid annotation structure is produced by the plotting code below. These hybrid annotations benefit the user in multiple ways, as they reduce data redundancy and explicitly allow relationships to be expressed across annotation levels. Hierarchical annotations and how to query these annotation structures are covered in more detail in later chapters.</p>
<p>Figure: Example of a hybrid annotation combining time-based (<em>Phonetic</em> level) and hierarchical (<em>Phoneme</em>, <em>Syllable</em>, <em>Text</em> levels including the inter-level links) annotations.</p>
<pre><code class="r">library(emuR)
library(ggplot2)
# assumes the emuR demo data has already been created in tempdir(),
# e.g. via create_emuRdemoData(dir = tempdir())
ae = load_emuDB(file.path(tempdir(), "emuR_demoData", "ae_emuDB"), verbose = FALSE)</code></pre>
</div>
<div id="extract-symbolic-information-we-are-interessted-in" class="section level1">
<h1><span class="header-section-number">2</span> extract symbolic information we are interested in</h1>
<pre><code class="r">sl_text = query(ae, "Text == friends")
sl_syllable = query(ae, "[Syllable =~ .* ^ Text == friends]")
sl_syllable_parents = requery_hier(ae, sl_syllable, level = "Text")
sl_phoneme = query(ae, "[Phoneme =~ .* ^ Text == friends]")
sl_phoneme_parents = requery_hier(ae, sl_phoneme, level = "Syllable")
sl_phonetic = query(ae, "[Phonetic =~ .* ^ Text == friends]")
sl_phonetic_parents = requery_hier(ae, sl_phonetic, level = "Phoneme")</code></pre>
</div>
<div id="extract-the-according-sample-values" class="section level1">
<h1><span class="header-section-number">3</span> extract the corresponding sample values</h1>
<pre><code class="r">td = get_trackdata(ae, sl_text, "MEDIAFILE_SAMPLES", resultType = "emuRtrackdata", verbose = FALSE)
# create sample numbers
sampleNrs = seq(sl_text$sample_start, sl_text$sample_end, 1)

all_sl = list(sl_phonetic, sl_phoneme, sl_syllable, sl_text)
all_parents = list(sl_phonetic_parents, sl_phoneme_parents, sl_syllable_parents)

lab_txt_size = 7
level_txt_size = 7

hierta_plot = ggplot(td, aes(x = sampleNrs, y = T1)) + geom_line(colour = "#E7E7E7") + theme_bw()

hierta_plot = hierta_plot + theme(axis.line = element_blank(), axis.text.x = element_blank(),
                                  axis.text.y = element_blank(), axis.ticks = element_blank(),
                                  axis.title.x = element_blank(),
                                  axis.title.y = element_blank(), legend.position = "none",
                                  panel.background = element_blank(), panel.grid.major = element_blank(),
                                  panel.grid.minor = element_blank(), plot.background = element_blank())

for(i in 1:length(all_sl)){
  cur_sl = all_sl[[i]]
  # horizontal midpoint of every segment on the current level
  cur_mid = cur_sl$sample_start + (cur_sl$sample_end - cur_sl$sample_start)/2

  minMaxDist = max(td$T1) - min(td$T1)
  propMinMaxDist = minMaxDist * 1/length(all_sl)

  cur_y_min = min(td$T1) + (i-1)*propMinMaxDist
  # plot with segments
  if(i %% 2 == 0){
    levelNameX = max(sampleNrs)
    levelVjust = 1.5
  }else{
    levelNameX = min(sampleNrs)
    levelVjust = -0.5
  }

  hierta_plot = hierta_plot +
    annotate("text", x = cur_mid, y = cur_y_min + propMinMaxDist/2,
             label = paste0("\\textit{", cur_sl$labels, "}")) +
    annotate("text", x = levelNameX, y = cur_y_min + propMinMaxDist/2,
             label = paste0("\\textit{", unique(cur_sl$level), "}"), angle = 90, vjust = levelVjust)

  # first level -&gt; draw time lines
  if(i == 1){
    hierta_plot = hierta_plot +
      annotate("segment", x = cur_sl$sample_start, y = cur_y_min + propMinMaxDist,
               xend = cur_sl$sample_start, yend = cur_y_min + propMinMaxDist / 2, colour = "#38363e") +
      annotate("segment", x = cur_sl$sample_start, y = cur_y_min + propMinMaxDist * 3/4,
               xend = cur_mid, yend = cur_y_min + propMinMaxDist * 3/4, colour = "#38363e") +
      annotate("segment", x = cur_mid, y = cur_y_min + propMinMaxDist * 3/4,
               xend = cur_mid, yend = cur_y_min + propMinMaxDist * 5/8, colour = "#38363e") +
      annotate("segment", x = cur_sl$sample_end, y = cur_y_min + propMinMaxDist/2,
               xend = cur_sl$sample_end, yend = cur_y_min, colour = "#888888") +
      annotate("segment", x = cur_sl$sample_end, y = cur_y_min + propMinMaxDist * 1/4,
               xend = cur_mid, yend = cur_y_min + propMinMaxDist * 1/4, colour = "#888888") +
      annotate("segment", x = cur_mid, y = cur_y_min + propMinMaxDist * 1/4,
               xend = cur_mid, yend = cur_y_min + propMinMaxDist * 3/8, colour = "#888888")
  }

  # SIC fix this
  if(i != length(all_sl)){
    cur_parent = all_parents[[i]]
    parent_mid = cur_parent$sample_start + (cur_parent$sample_end - cur_parent$sample_start)/2
    parent_y_min = min(td$T1) + (i)*propMinMaxDist
    # dashed link from each item to its parent on the next level
    hierta_plot = hierta_plot +
      annotate("segment", x = cur_mid, y = cur_y_min + propMinMaxDist/2 + propMinMaxDist/4,
               xend = parent_mid, yend = parent_y_min + propMinMaxDist/2 - propMinMaxDist/4,
               linetype = "dashed")
  }
}

hierta_plot</code></pre>
<p>Further, to our knowledge, the EMU-SDMS is the first system that makes use of a web application as its primary GUI for annotating speech. This unique approach enables the GUI component to be used in multiple ways. It can be used as a stand-alone annotation tool, it can be connected to a loaded emuDB via emuR’s <code>serve()</code> function, and it can communicate with other servers, which enables it to be used as a collaborative annotation tool. How this component can be used in these three scenarios is explained in depth in a later chapter.</p>
<p>As demonstrated in the default workflow, an additional unique feature provided by the EMU-SDMS is the ability to use the result of a query to extract derived signals (e.g., formants and RMS values) and complementary signals that match the segments of a query. This aids the user in answering questions related to derived speech signals. The usage tutorial of this manual gives a complete walk-through of how to go about answering such a question using the tools provided by the EMU-SDMS.</p>
<p>The features provided by the EMU-SDMS make it an all-in-one speech database management solution that is centered around R. It enriches the R platform by providing specialized speech signal processing, speech database management, data extraction and speech annotation capabilities. By achieving this without relying on any external software except the web browser, the EMU-SDMS significantly reduces the number of tools that speech and spoken language researchers have to deal with and helps to simplify answering research questions. As the only prerequisite for using the EMU-SDMS is a basic familiarity with the R platform, if the above features would improve your workflow, the EMU-SDMS is indeed for you.</p>
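<p>As a hedged illustration of extracting a derived signal for the segments of a query, the following sketch reads formant values at the temporal midpoint of each matching segment. It assumes that the emuR demo database defines a formant track named <code>fm</code> and contains segments labelled <code>V</code> on the <em>Phonetic</em> level; the variable names (<code>demo_dir</code>, <code>sl_vowel</code>, <code>td_mid</code>) are illustrative only.</p>
<pre><code class="r">library(emuR)

# create the emuR demo data once (assumption: it may already exist in tempdir())
demo_dir = file.path(tempdir(), "emuR_demoData")
if (!dir.exists(demo_dir)) create_emuRdemoData(dir = tempdir())
ae = load_emuDB(file.path(demo_dir, "ae_emuDB"), verbose = FALSE)

# query all segments labelled "V" on the Phonetic level
sl_vowel = query(ae, "Phonetic == V")

# extract the formant values of the (assumed) "fm" track,
# cut at the temporal midpoint (cut = 0.5) of every segment
td_mid = get_trackdata(ae, seglist = sl_vowel, ssffTrackName = "fm",
                       cut = 0.5, resultType = "emuRtrackdata", verbose = FALSE)

# T1 and T2 hold the first two formant values of each segment
head(td_mid[, c("labels", "T1", "T2")])</code></pre>
<p>Using <code>cut = 0.5</code> yields one row per segment, which is often a convenient starting point for a statistical comparison across segment groups.</p>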
</div>
<div class="footnotes">
<hr />
<ol>
<li id="fn1"><p>Sections of this chapter have been published in <span class="citation">@winkelmann:2017aa</span><a href="#fnref1" class="footnote-back">↩</a></p></li>
</ol>
</div>
<!--bookdown:body:end-->
            </section>

          </div>
        </div>
      </div>
<!--bookdown:link_prev-->
<!--bookdown:link_next-->
    </div>
  </div>
<!--bookdown:config-->

<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    var src = "true";
    if (src === "" || src === "true") src = "https://cdn.bootcss.com/mathjax/2.7.1/MathJax.js?config=TeX-MML-AM_CHTML";
    if (location.protocol !== "file:" && /^https?:/.test(src))
      src = src.replace(/^https?:/, '');
    script.src = src;
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>
</body>

</html>