Skip to content

BafS/ElasticSearchExperiment

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Elastic Search Experiments

CLD Cloud Computing class - 2016
HEIG-VD, Yverdon-les-Bains, Switzerland
Authors: Bafs, D34D10CK, BinaryBrain, ValentinMinder

Structure of the Project

The aim of this project is to have a first experience with Elastic Search.

We developped several ressources, including

Outline of Elastic Search Topic

1. How it works

Goal of this chapter: Explain globally your topic. Recall that your audience consists of application architects, developers and testers, but also business people (your boss). Don't forget to discuss cost and strategic questions.

Diagram of the Architecture

  • Real-time data gathering
  • Abstraction and wrappers with logstatch (for transport and log parsing)

Installation, Configuration and Usage

  • Installation and configuration procedure (no demo)
  • How to manage multiple servers

Pros and cons

Pros:

  • RESTful JSON API
  • Fast deployement
  • Aggregation
  • Designed to be distributed
  • Query DSL syntax is really flexible and easy to use
  • Open source

Cons:

  • Installation & configuration
  • Still relatively new project, not a lot of documentations, maybe not very stable.
  • Security: ElasticSearch does not provide any authentication or access control functionality.
  • Transactions: There is no support for transactions or processing on data manipulation
  • Durability: ES is distributed and fairly stable but backups and durability are not as high priority as in other data stores. This is probably the most important if you're going to make ES the primary store since losing your data is never good.

Reference: Slant, Quora

Demo with real data

  • Connect a server (Apache, nginx, database ...) that sends its data to the ELS
  • Create some errors/logs (Chaosmonkey, db shutdown, wrong login, kill processes)
  • Explore the logs to find the issues
  • Show that the same ELS contains logs from multiple sourice

2. Search capacities and Proof-of-Concept

Goal of this chapter: For the proof-of-concept choose a practical application that illustrates well your topic.

Query language

  • queries type
  • language type
  • possible questions to ask

Demo with simple data

  • Demonstration with "easy-to-understand" data, not meaningless obsure logs
  • Every attendee will understand and realize what data we're talking about
  • Demonstration of the search capacities
  • Source of data: Public transport stops

3. Use Elastic Search as a service ?

Goal of this chapter: Give general recommendations when it makes sense to use the cloud service / technology and when it doesn't.

Building a solution:

  • option 1: à deployer soi-même en entreprise: ELK: Elastic, Logstatch, Kibana = full stack solution
  • option 2: use an ELS-as-a-service like Amazon or others.

To compare:

  • option1 vs. option2: simplification ? advantages ? conditions ? constraints ? limitations ? capacities ?
  • is diagnostic offered with Amazon service?
  • how to connect my existing services (IT or Amazon) to Amazon ELS

Amazon vs. ELK Full-stack

Pros Amazon (or generally other providers)

  • All logs accessible in one place.
  • Hassle-free managed ELK you don’t need to maintain and scale
  • Use built-in Kibana or use your own Kibana or Grafana with Logsene
  • Nothing to install – use any log shipper you want
  • price: pay-as-you-go and per-service (no IT & infrastructure, maintanance, etc)

Cons Amazon:

  • version of Elasticsearch: not choosable
  • Elasticsearch 1.5.x and other versions have critical bugs
  • less flexible with library support, the technology moves fast!
  • limited choices in terms of VM characteristics: type, disk size, RAM)
  • requires cloud usage (country and legal issues, technical issues with traffic/bandwidth)

Source: Cloud Academy

Cost of Amazon ELS:

  • instance: the cheapest is t2.micro.elasticsearch is from 0.021USD/h, that's 15.12USD/month.
  • storage: from USD0.079 per GB / month on magnetic storage
  • transfer of large amounts to or from the internet costs. All the log data have to go IN first: its free from the internet, but might cost from certain services (up to USD 0.01/GB, 10USD/TB). Transfering data OUT to the internet is expansive (up to USD0.090 per GB, that's 90USD/TB). Therefore it's wiser to do all other computation also on an Amazon service (with free or cheap transfer rate). Usually Kibana will only transfer small amounts of data for what it needs to render the grafics or the search query.
  • Source: AWS

Competitors of Amazon ELS

Logsene

Logsene by SemaText is an alternative service that integrates with SPM Performance Monitoring. Available in the Cloud and On Premise. Correlate logs with performance metrics via SPM. UI view of logs.

Whoosh

Whoosh is developped in Python (ELS in Java). Not as powerfull but usefull if you code in python.

Apache Solr

Apache Solr needs a schema (not ELS), has bigger community. Better performances.

Releases

No releases published

Packages

No packages published