<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Introduction to Elasticsearch</title>
<meta name="description" content="A short introduction to Elasticsearch">
<meta name="author" content="Eirik Ola Aksnes">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
<meta name="viewport"
content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
<link rel="stylesheet" href="css/reveal.css">
<link rel="stylesheet" href="css/theme/black.css" id="theme">
<!-- Code syntax highlighting -->
<link rel="stylesheet" href="lib/css/zenburn.css">
<style>
.background-image-overlay {
background: black !important;
padding: 10px;
opacity: 0.8;
}
</style>
<!-- Printing and PDF exports -->
<script>
var link = document.createElement('link');
link.rel = 'stylesheet';
link.type = 'text/css';
link.href = window.location.search.match(/print-pdf/gi) ? 'css/print/pdf.css' : 'css/print/paper.css';
document.getElementsByTagName('head')[0].appendChild(link);
</script>
<!--[if lt IE 9]>
<script src="lib/js/html5shiv.js"></script>
<![endif]-->
</head>
<body>
<div class="reveal">
<div class="slides">
<section data-background="images/relevance-scoring.jpg">
<h1>Introduction to Elasticsearch</h1>
<p>
<small>Eirik Ola Aksnes</small>
</p>
<aside class="notes">
<ul>
<li>
Welcome. Today I will give a very short introduction to Elasticsearch.
</li>
</ul>
</aside>
</section>
<section>
<h2>Agenda</h2>
<ul>
<li class="fragment">How does a search engine work</li>
<li class="fragment">Elasticsearch
<ul>
<li class="fragment">What it is</li>
<li class="fragment">How to get started</li>
<li class="fragment">My own experience</li>
<li class="fragment">Use cases</li>
</ul>
</li>
</ul>
<aside class="notes">
<ul>
<li>Today's agenda is as follows...</li>
</ul>
</aside>
</section>
<section>
<section>
<h2>How does a search engine work?</h2>
<aside class="notes">
<ul>
<li>
How does a search engine work?
</li>
</ul>
</aside>
</section>
<section data-background="images/overflow2.jpg">
<blockquote class="background-image-overlay">
Your document collection is big! <br />
Scan through all the documents every time you search for something?
</blockquote>
<aside class="notes">
<ul>
<li>
That would take forever.
</li>
</ul>
</aside>
</section>
<section data-background="images/kokeboka4.png">
<blockquote class="background-image-overlay">
Pre-process the documents and create an index!
</blockquote>
<aside class="notes">
<ul>
<li>
To make your queries fast and efficient, a search engine pre-processes the documents and creates an index.
</li>
</ul>
</aside>
</section>
<section data-background="white">
<h2>Create an inverted index</h2>
<img style="box-shadow: none;" data-src="images/inverted-index-1.svg">
<!--<aside class="notes">
The inverted index maps terms to documents containing the term.
</aside>-->
<aside class="notes">
<ul>
<li>So we build something called an "inverted index".</li>
<li>On the left-hand side we have three documents...</li>
<li>Since this is a BigOne conference, much of today's content will be pizza-related...</li>
<li>What happens is that an inverted index is built from these documents (the documents are
"indexed", as it is called).
</li>
<li>An inverted index (shown on the right-hand side now) contains all the words found in the documents and, for each word,
lists the documents that contain it...
</li>
<li>So the word "pizza" is found in documents 0 and 2.</li>
</ul>
</aside>
</section>
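<section data-markdown>
<script type="text/template">
## Inverted index in code

A minimal sketch of the idea, not how Lucene or Elasticsearch implement it. The three sample documents and the name `build_inverted_index` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample documents; the ids are their list positions.
documents = [
    "Turtles love pizza",
    "I love my turtles",
    "Pizza is love",
]

def build_inverted_index(docs):
    """Map each lowercased word to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

index = build_inverted_index(documents)
print(index["pizza"])  # prints [0, 2]
```
</script>
</section>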
<section data-background="white">
<h2>Find unique terms</h2>
<img style="box-shadow: none;" data-src="images/inverted-index-2.svg">
<aside class="notes">
<ul>
<li>
So how do we find the unique terms?
<ul>
<li>
Take, for example, the document "Turtles loves pizza"; it goes through several steps...
<ul>
<li>The document is split into words</li>
<li>All letters are lowercased</li>
<li>Words are reduced to their stems, e.g. "Loves" becomes "love"</li>
</ul>
</li>
<li>
This is a simplified example...
</li>
</ul>
</li>
</ul>
</aside>
</section>
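<section data-markdown>
<script type="text/template">
## Finding terms in code

The three steps (split, lowercase, stem) can be sketched as below. The `stem` helper is a toy assumption that only strips a trailing "s"; real analyzers use far more careful stemming rules.

```python
import re

def stem(word):
    """Toy stemmer (illustrative assumption): drop a trailing 's' from longer words."""
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word

def analyze(text):
    """Split text into words, lowercase them, then reduce each to a crude stem."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    lowered = [token.lower() for token in tokens]
    return [stem(token) for token in lowered]

print(analyze("Turtles loves pizza"))  # prints ['turtle', 'love', 'pizza']
```
</script>
</section>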
<section data-background="white">
<h2>Search against the inverted index</h2>
<img style="box-shadow: none;" data-src="images/inverted-index-3.svg">
</section>
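<section data-markdown>
<script type="text/template">
## Searching the index in code

A sketch of a lookup against an inverted index. The small `index` below is hypothetical, and requiring every query term to match (AND) is an assumption chosen for simplicity, not a statement about Elasticsearch defaults.

```python
# Hypothetical inverted index: term -> ids of the documents containing it.
index = {
    "turtle": [0, 1],
    "love": [0, 1, 2],
    "pizza": [0, 2],
}

def search(index, query):
    """Return ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*postings))

print(search(index, "love pizza"))    # prints [0, 2]
print(search(index, "turtle pizza"))  # prints [0]
```
</script>
</section>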
<section data-background="white">
<h2>Sort by relevance</h2>
<p>How well each document matches the query</p>
<img style="box-shadow: none;" data-src="images/inverted-index-4.svg">
<aside class="notes">
By default, Elasticsearch sorts matching results by their relevance score, that is, by how well each
document matches the query.
</aside>
</section>
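<section data-markdown>
<script type="text/template">
## Relevance in code

A simplified TF-IDF-style score, shown only to illustrate the idea of ranking. It is not Elasticsearch's actual scoring formula; the documents and function names are assumptions.

```python
import math

# Hypothetical, already-analyzed documents (lists of terms).
docs = [
    ["turtle", "love", "pizza"],
    ["pizza", "pizza", "pizza", "party"],
    ["turtle", "swim"],
]

def score(doc, query_terms, docs):
    """Sum tf * idf over the query terms; a higher score means a better match."""
    total = 0.0
    for term in query_terms:
        tf = doc.count(term)                     # occurrences in this document
        df = sum(1 for d in docs if term in d)   # documents containing the term
        if tf and df:
            total += tf * math.log(1 + len(docs) / df)
    return total

def ranked(query_terms, docs):
    """Ids of matching documents, best score first."""
    scored = [(score(d, query_terms, docs), i) for i, d in enumerate(docs)]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for s, i in ordered if s > 0]

print(ranked(["pizza"], docs))  # prints [1, 0]
```
</script>
</section>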
</section>
<section data-background="white">
<img style="box-shadow: none;" data-src="images/introduction.svg">
<aside class="notes">
<ul>
<li>I will now spend about one minute explaining what Elasticsearch is.
<ul>
<li>
Lucene
<ul>
<li>Cumbersome to use directly</li>
<li>Provides few features for scaling past a single machine</li>
</ul>
</li>
<li>
Real time
<ul>
<li>Indexing documents is fast</li>
<li>Data is available for search almost immediately after indexing</li>
</ul>
</li>
</ul>
</li>
</ul>
</aside>
</section>
<section>
<section>
<h2>How to get started with Elasticsearch?</h2>
<aside class="notes">
<ul>
<li>So how do you get started with Elasticsearch...</li>
</ul>
</aside>
</section>
<section>
<h2>It is that easy</h2>
<ul>
<li>Download Elasticsearch from www.elastic.co</li>
<li>Elasticsearch only requires Java to run</li>
</ul>
<pre><code class="hljs" data-trim contenteditable>
wget https://download.elasticsearch.org/elasticsearch/release/...
tar -zxvf elasticsearch-2.2.0.tar.gz
cd elasticsearch-2.2.0/bin
./elasticsearch
</code></pre>
</section>
<section>
<h2>Zero configuration</h2>
<ul>
<li>Elasticsearch just works
<ul>
<li>No configuration is needed</li>
<li>It has sensible default settings</li>
</ul>
</li>
</ul>
<aside class="notes">
It is easy to get started with Elasticsearch!
</aside>
</section>
<section>
<h2>Is Elasticsearch alive?</h2>
<p>You can access it at http://localhost:9200 in your web browser, which returns this:</p>
<pre class="fragment"><code class="hljs" data-trim contenteditable>
{
"status":200,
"name":"Cypher",
"cluster_name":"elasticsearch",
"version":{
"number":"1.5.2",
"build_hash":"62ff9868b4c8a0c45860bebb259e21980778ab1c",
"build_timestamp":"2015-04-27T09:21:06Z",
"build_snapshot":false,
"lucene_version":"4.10.4"
},
"tagline":"You Know, for Search"
}
</code></pre>
</section>
<section>
<h2>REST API</h2>
<ul>
<li>Elasticsearch hides the complexities of Lucene behind a REST API
<ul>
<li>POST (create)</li>
<li>GET (read)</li>
<li>PUT (update)</li>
<li>DELETE (delete)</li>
</ul>
</li>
</ul>
</section>
<section>
<h2>DEMO - CURL works just fine</h2>
<img height="300" data-src="images/crud.png"/>
<ul>
<li class="fragment">An index is like a database</li>
<li class="fragment">A type is like an SQL table</li>
</ul>
</section>
<section>
<h2>What is stored in Elasticsearch?</h2>
<p>JSON documents!</p>
<pre class="json"><code class="hljs" data-trim contenteditable>
{
"title": "Introduction to Elasticsearch",
"date": "2016-04-07",
"author": "Eirik Ola Aksnes"
}
</code></pre>
</section>
<section>
<h2>Let's do an example - A BigOne pizza website</h2>
<ul>
<li class="fragment">We are building a website to find BigOne pizzas</li>
<li class="fragment">We have a collection of BigOne pizzas
<br />
<img data-src="images/big-one-chicken.jpg" />
<img data-src="images/big-one-bacon.jpg" />
<img data-src="images/big-one-classic.jpg" />
</li>
<li class="fragment">We want simple text based searching</li>
</ul>
</section>
<section>
<h2>How to store the pizzas?</h2>
<p>The act of storing data in Elasticsearch is called indexing.</p>
<pre><code class="hljs" data-trim contenteditable>
$ curl -X POST localhost:9200/big-one/pizza/1 --data '{
"name": "California Sunset Chicken"
}'
$ curl -X POST localhost:9200/big-one/pizza/2 --data '{
"name": "American Bacon"
}'
$ curl -X POST localhost:9200/big-one/pizza/3 --data '{
"name": "Classic American"
}'
</code></pre>
<aside class="notes">
Indexing is much like the INSERT keyword in SQL, except that if the document already exists, the new
document replaces the old one. The URL path names the index (loosely comparable to an SQL database,
though I don't like this comparison) and the type (loosely comparable to an SQL table, though I don't
like this comparison either) of the document, followed by its ID.
<!--
curl -X POST localhost:9200/big-one/pizza/1 --data '{ "name": "California Sunset Chicken" }'
curl -X POST localhost:9200/big-one/pizza/2 --data '{ "name": "American Bacon" }'
curl -X POST localhost:9200/big-one/pizza/3 --data '{ "name": "Classic American"}'
-->
</aside>
</section>
<section>
<h2>Get</h2>
<pre><code class="hljs" data-trim contenteditable>
$ curl -X GET localhost:9200/big-one/pizza/1
</code></pre>
<p>Result:</p>
<pre class="fragment"><code class="hljs" data-trim contenteditable>
{
"_index":"big-one",
"_type":"pizza",
"_id":"1",
"_version":1,
"found":true,
"_source":{
"name":"California Sunset Chicken"
}
}
</code></pre>
</section>
<section>
<h2>Update</h2>
<pre><code class="hljs" data-trim contenteditable>
$ curl -X PUT localhost:9200/big-one/pizza/1 --data '{
"name":"California Sunset Chicken Awesome"
}'
</code></pre>
<p>Result:</p>
<pre class="fragment"><code class="hljs" data-trim contenteditable>
{
"_index":"big-one",
"_type":"pizza",
"_id":"1",
"_version":2,
"created":false
}
</code></pre>
</section>
<section>
<h2>Delete</h2>
<pre><code class="hljs" data-trim contenteditable>
$ curl -X DELETE localhost:9200/big-one/pizza/1
</code></pre>
</section>
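<section data-markdown>
<script type="text/template">
## The same requests from Python

A sketch that builds the create, read and delete requests with the standard library without sending them. `BASE` and `document_request` are assumed names; a running node at localhost:9200 is assumed if you later pass a request to `urllib.request.urlopen`.

```python
import json
from urllib.request import Request

BASE = "http://localhost:9200"  # assumed local node, as on the slides

def document_request(method, index, doc_type, doc_id, body=None):
    """Build (but do not send) an HTTP request for a single document."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    url = f"{BASE}/{index}/{doc_type}/{doc_id}"
    return Request(url, data=data, method=method)

create = document_request("POST", "big-one", "pizza", 1,
                          {"name": "California Sunset Chicken"})
read = document_request("GET", "big-one", "pizza", 1)
delete = document_request("DELETE", "big-one", "pizza", 1)

for req in (create, read, delete):
    print(req.get_method(), req.full_url)
```
</script>
</section>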
<!--<section>
<h2>So far, all we have is a NoSQL document store which is fast, reliable, scalable and easy to use! Now
to the really cool part, full-text search...</h2>
</section>-->
<section>
<h2>So far</h2>
<ul>
<li>All we have is a NoSQL document store that is
<ul>
<li>Fast</li>
<li>Scalable</li>
<li>Easy to use</li>
</ul>
</li>
<li>Now to the really cool part, full-text search...</li>
</ul>
</section>
<section>
<h2>Full-text search</h2>
<p>Find all the pizzas that contain the word "American"</p>
<pre><code class="hljs" data-trim contenteditable>
$ curl -X GET 'localhost:9200/big-one/pizza/_search?q=American'
</code></pre>
</section>
<section>
<h2>Full-text search - Result</h2>
<pre class="json"><code class="hljs">
{
"took":4,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":2,
"max_score":0.19178301,
"hits":[
{
"_index":"big-one",
"_type":"pizza",
"_id":"2",
"_score":0.19178301,
"_source":{
"name":"American Bacon"
}
},
{
"_index":"big-one",
"_type":"pizza",
"_id":"3",
"_score":0.19178301,
"_source":{
"name":"Classic American"
}
}
]
}
}
</code></pre>
<aside class="notes">
<p>Sorted by relevance!</p>
</aside>
</section>
<section>
<h2>Alternate Approach</h2>
<ul>
<li>Search using Query DSL</li>
</ul>
</section>
<section>
<h2>Full-text search</h2>
<p>Find the pizzas with a name that contains the word "American"</p>
<pre><code class="hljs" data-trim contenteditable>
$ curl -X GET localhost:9200/big-one/pizza/_search -d '{
"query":{
"match":{
"name":"American"
}
}
}'
</code></pre>
</section>
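<section data-markdown>
<script type="text/template">
## Building the query body in code

Because the Query DSL is plain JSON, the request body can be assembled as a dictionary and serialized. `match_query` is a hypothetical helper name used only for this sketch.

```python
import json

def match_query(field, text):
    """Build the body of a match query against a single field."""
    return {"query": {"match": {field: text}}}

body = match_query("name", "American")
print(json.dumps(body, indent=2))
```

The printed JSON is the same body that the curl command sends with `-d`.
</script>
</section>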
<section>
<h2>Full-text search - Result</h2>
<pre><code class="hljs" data-trim contenteditable>
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.19178301,
"hits": [
{
"_index": "big-one",
"_type": "pizza",
"_id": "2",
"_score": 0.19178301,
"_source": {
"name": "American Bacon"
}
},
{
"_index": "big-one",
"_type": "pizza",
"_id": "3",
"_score": 0.19178301,
"_source": {
"name": "Classic American"
}
}
]
}
}
</code></pre>
</section>
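<section data-markdown>
<script type="text/template">
## Reading the result in code

A sketch of pulling the matching names out of a search response. The JSON is copied from the result slide and trimmed to the fields used here; `hit_names` is an assumed helper name.

```python
import json

# Response from the result slide, trimmed to the fields used below.
response = json.loads("""
{
  "hits": {
    "total": 2,
    "hits": [
      {"_id": "2", "_score": 0.19178301, "_source": {"name": "American Bacon"}},
      {"_id": "3", "_score": 0.19178301, "_source": {"name": "Classic American"}}
    ]
  }
}
""")

def hit_names(resp):
    """Return the name of each hit, in the order the response lists them."""
    return [hit["_source"]["name"] for hit in resp["hits"]["hits"]]

print(hit_names(response))  # prints ['American Bacon', 'Classic American']
```
</script>
</section>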
</section>
<section>
<section>
<h2>Own experience with Elasticsearch</h2>
</section>
<section data-background="images/kokeboka4.png">
<aside class="notes">
<ul>
<li>
The reason I wanted to learn about search engines was...
</li>
<li>
That I have been collecting food recipes for a long time...
</li>
<li>
And over time it has become a great many recipes, almost 400 pages...
</li>
<li>
Which makes it almost impossible to find what you are looking for...
</li>
<li>
For example, I want to find all main courses that contain chicken...
</li>
</ul>
</aside>
</section>
<section>
<h2>Alt Mulig Mat</h2>
<ul>
<li>Text based searching</li>
<li>Structured searching (get all "Dessert" recipes)</li>
</ul>
<a href="https://altmuligmat.no" target="_blank"><img data-src="images/altmuligmat.png"/></a>
</section>
<section>
<h2>How to use Elasticsearch?</h2>
<p>Commonly used in addition to another database...</p>
<img height="350" data-src="images/overview.png"/>
</section>
</section>
<section>
<section>
<h2>Use Cases</h2>
<h5>What can Elasticsearch be used for?</h5>
</section>
<section>
<h2>For Big Data</h2>
<p>
GitHub uses Elasticsearch to search 20 TB of data, including 1.3 billion files and 130 billion lines
of code
</p>
<aside class="notes">
Relational databases:
<ul>
<li>This works well with smaller data sets, but is not very scalable</li>
<li>When the volume goes up, performance goes down (write operations)</li>
</ul>
</aside>
</section>
<section>
<h2>Text search</h2>
<p>With filtering, aggregations, highlighting, pagination...</p>
<img data-src="images/github2.png"/>
</section>
<section>
<h2>Pure Analytics</h2>
<p>Count things and summarize your data, often large volumes of timestamped data!</p>
<img data-src="images/analytics.jpg"/>
</section>
<section>
<h2>Centralized Logging</h2>
<p>Logs > Logstash > Elasticsearch > Kibana</p>
<img data-src="images/elk.png"/>
</section>
<section>
<h2>Geolocation</h2>
<img data-src="images/fouraquare.png"/>
</section>
</section>
<section>
<h2>The end!</h2>
<ul>
<li>It is easy to start building advanced search functionality
<ul>
<li>No configuration is needed</li>
<li>Just add data and start searching</li>
</ul>
</li>
</ul>
</section>
<section>
<h2>Questions?</h2>
</section>
<section>
<h2>References</h2>
<ul>
<li><a href="http://www.slideshare.net/jfaustin/introduction-to-elasticsearch-33976717" target="_blank">Introduction
to Elasticsearch</a></li>
<li>
<a href="https://nullwords.wordpress.com/2013/04/18/inverted-indexes-inside-how-search-engines-work/"
target="_blank">Inverted Indexes – Inside How Search Engines Work</a></li>
<li><a href="http://operational.io/elk-stack-for-network-operations-reloaded/" target="_blank">ELK Stack
for Network Operations</a></li>
<li><a href="http://www.slideshare.net/bigdatalondon/3-elastic-searchcostin-leau" target="_blank">Search
and Analytics (using Elasticsearch)</a></li>
<li><a href="http://www.slideshare.net/sdenthumdas/introduction-to-elasticsearch-47385272"
target="_blank">Introduction to Elastic-search</a></li>
</ul>
</section>
</div>
</div>
<script src="lib/js/head.min.js"></script>
<script src="js/reveal.js"></script>
<script>
// Full list of configuration options available at:
// https://github.com/hakimel/reveal.js#configuration
Reveal.initialize({
controls: true,
progress: true,
history: true,
center: true,
transition: 'slide', // none/fade/slide/convex/concave/zoom
// Optional reveal.js plugins
dependencies: [
{
src: 'lib/js/classList.js', condition: function () {
return !document.body.classList;
}
},
{
src: 'plugin/markdown/marked.js', condition: function () {
return !!document.querySelector('[data-markdown]');
}
},
{
src: 'plugin/markdown/markdown.js', condition: function () {
return !!document.querySelector('[data-markdown]');
}
},
{
src: 'plugin/highlight/highlight.js', async: true, callback: function () {
hljs.initHighlightingOnLoad();
}
},
{src: 'plugin/zoom-js/zoom.js', async: true},
{src: 'plugin/notes/notes.js', async: true}
]
});
</script>
</body>
</html>