Business Service "dataspects Search and Wiki"

Knowledge management consists of optimizing the access to and the management of knowledge for the sake of decision and performance support.

Background

1. People are accustomed to searching for things on Google. That's why your enterprise knowledge management search engine must model itself on Google.
2. People are accustomed to reading things on Wikipedia. That's why your enterprise knowledge management presentation must model itself on Wikipedia.
3. People are accustomed to editing things in Microsoft Word. That's why your enterprise knowledge management editor must model itself on Microsoft Word.
4. People are accustomed to linking things using Properties. That's why your enterprise knowledge management structure must model itself on Semantic Web Concepts.
5. Organisations are accustomed to working with files and documents. That's why your enterprise knowledge management presentation must smoothly integrate files and documents (upload, preview, versioning).
6. Organisations are accustomed to working with printouts and PDF. That's why your enterprise knowledge management presentation must output printer-friendly formats and PDF.
7. Organisations are accustomed to working with hierarchical workflows and access control. That's why your enterprise knowledge management presentation must provide basic workflows and access control.
8. Organisations are accustomed to working with tables. That's why your enterprise knowledge management presentation must provide intuitive creation and design of tables within a visual editor.

This article is about dataspectsSystem, a system that facilitates managing all aspects (e.g. install, upgrade, configure, manage, restore) of Semantic MediaWikis and WYSIWYG editors in a narrow sense, and of complementary/auxiliary systems (e.g. search, backup, integration) in a broader sense.

dataspectsSystem

  • https://github.com/dataspects
  • Mutually Exclusive, Collectively Exhaustive (MECE)
  • A place for everything and everything in its place.
  • Convention over configuration
  • Completely decomposable

UX

Using the system must be:

  1. clear by intuition/GUI
  2. clear by mouseover message/hint
  3. clear by documentation at mouseclick on link/button to popup or toggled div
  4. clear by documentation at mouseclick on link/button to information in new window
  5. clear by calling/mailing to a supporting party

Going down this list, the costs in terms of transactional brain cycles and time units used or wasted increase logarithmically.

Architecture and concepts

The steps below cover two scenarios: "Create a new system" and "Restore a system snapshot".

Step 0

In the Vagrantfile specify

aAnsibleTags = [
  "install_docker",
  "start_docker",
  "install_user",
  "load_docker_images",
  "run_dockerized_mariadb",
  "run_dockerized_apache",
  "restore_duplicity_snapshot",
  "reconfigure_mediawiki",
  "set_up_dataspects_search",
  "install_mediawiki"
]
docker_network: {
  name: "myNetwork0",
  sub: "172.20.0.0"
}

Step 1

To create a new system, specify in the Vagrantfile

{
  apache_machine_ip_on_virtualbox_network: "192.168.50.10",
  mariadb_machine_ip_on_virtualbox_network: "192.168.50.11"
}
{
  db_container_name: "smwckDB",
  db_docker_image: "mariadb:10.3.6",
  db_container_ip_on_docker_network: "172.20.0.11",
  db_port_on_host: 3306,
  db_files_on_host: "/opt/mysql_data",
  mysql_root_password: "password"
}
{
  docker_images_to_load: [
    "dataspects_apache7.2_180611.tar"
  ],
  apache_container_name: "smwckApache",
  apache_image: "dataspects/apache7.2:180611",
  apache_container_ip_on_docker_network: "172.20.0.10",
  apache_port_on_host: "80",
  mediawiki_workspace_in_container: "/var/www/html",
  mediawiki_workspace_on_host: "/home/vagrant/mediawiki",
  mediawiki_version: "REL1_31"
}
vm.network "forwarded_port"

To restore a system snapshot, specify in the Vagrantfile

{
  db_files_on_host: "/opt/mysql_data",
  db_port_on_host: "3306",
  db_container_ip_on_docker_network: "172.20.0.5",
  db_container_port_on_docker_network: "3306",
  apache_port_on_host: "80",
  apache_container_ip_on_docker_network: "172.20.0.6",
  docker_images_to_load: [
    "dataspects_apache5.6_180529.tar",
    "dataspects_duplicity_180528.tar",
    "mariadb_10.3.6.tar"
  ],
  hContainerNames: {
    db_container_name: "smwckDB",
    apache_container_name: "smwckApache"
  }
}
vm.network "forwarded_port"
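The container IPs in Step 1 are expected to lie inside the docker network declared in Step 0. A minimal Ruby sketch of such a consistency check follows. The helper name `ips_outside_network` is hypothetical, and the /16 prefix length is an assumption, because the configuration only gives the base address ("sub").

```ruby
require "ipaddr"

# Assumption: the docker network is a /16. The source configuration only
# states the base address ("sub"), not a prefix length.
ASSUMED_PREFIX_LENGTH = 16

# Returns the config keys whose IP value lies outside the docker network.
# Only keys ending in "_ip_on_docker_network" are inspected.
def ips_outside_network(docker_network, config, prefix_length = ASSUMED_PREFIX_LENGTH)
  network = IPAddr.new("#{docker_network[:sub]}/#{prefix_length}")
  config
    .select { |key, _| key.to_s.end_with?("_ip_on_docker_network") }
    .reject { |_, ip| network.include?(IPAddr.new(ip)) }
    .keys
end

docker_network = { name: "myNetwork0", sub: "172.20.0.0" }
config = {
  db_container_ip_on_docker_network: "172.20.0.5",
  apache_container_ip_on_docker_network: "172.20.0.6",
  apache_port_on_host: "80"
}

# An empty list means every container IP fits the network.
puts ips_outside_network(docker_network, config).inspect
```

A check like this could run at the top of the Vagrantfile, before any machine is provisioned.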

Step 2

Create a myOSUserProfile.yml and pass its path in the environment variable OS_USER_PROFILE

---

# NO TRAILING SLASHES!

lsdr_restoring_os_user: smwckmain
lsdr_public_gpg_key: |
  -----BEGIN PGP PUBLIC KEY BLOCK-----
  ...
  -----END PGP PUBLIC KEY BLOCK-----
lsdr_secret_gpg_key: |
  -----BEGIN PGP PRIVATE KEY BLOCK-----
  ...
  -----END PGP PRIVATE KEY BLOCK-----
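Before running `vagrant up`, it may help to verify that the profile is complete. The following Ruby sketch assumes that all three keys shown above are mandatory. The helper `missing_profile_keys` is hypothetical and not part of dataspectsSystem.

```ruby
require "yaml"

# Assumption: all three keys shown in the profile are mandatory.
REQUIRED_KEYS = %w[
  lsdr_restoring_os_user
  lsdr_public_gpg_key
  lsdr_secret_gpg_key
].freeze

# Returns the required keys that are missing or empty in the YAML text.
def missing_profile_keys(yaml_text)
  profile = YAML.safe_load(yaml_text) || {}
  REQUIRED_KEYS.select { |key| profile[key].to_s.strip.empty? }
end

# Check the file named by OS_USER_PROFILE, if one is set and exists.
profile_path = ENV["OS_USER_PROFILE"]
if profile_path && File.file?(profile_path)
  missing = missing_profile_keys(File.read(profile_path))
  warn "#{profile_path} lacks: #{missing.join(', ')}" unless missing.empty?
end
```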

Step 3

~/dataspectsSystem$ \
  SYSTEM_ARCHITECTURE=multiple_machines \
  OS_USER_PROFILE="myOSUserProfile.yml" \
    vagrant up mariadb_machine apache_machine

To restore a system snapshot, create a myDuplicitySnapshotRestoreProfile.yml and pass its path in the environment variable RESTORE_DUPLICITY_SNAPSHOT_PROFILE

---

# https://github.com/dataspects/run-dockerized-mariadb
db_name: smwckdatabase
mysql_root_password: password
db_docker_image: mariadb:10.3.6

# https://github.com/dataspects/run-dockerized-apache
apache_image: dataspects/apache5.6:180529
mediawiki_workspace_on_host: "/home/smwckmain/Restored_SMWCK"
mediawiki_workspace_in_container: /var/www/html
mediawiki_script_path_in_container: m
mediawiki_server: http://p51:20100

# https://github.com/dataspects/restore-duplicity-snapshot
duplicity_image: dataspects/duplicity:180528
duplicity_snapshot_url: "/mnt/Backup_SMWCK"

# Restoring user
gnupg_key_passphrase: ""
gnupg_key_id: ""
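The "NO TRAILING SLASHES!" note in the OS user profile presumably applies to path values such as the ones in this restore profile as well. Under that assumption, a small Ruby sketch could flag offending values. The helper `keys_with_trailing_slash` is hypothetical, and the last value in the sample deliberately carries a trailing slash for illustration (the profile above does not).

```ruby
require "yaml"

# Returns the keys whose string values end in a slash.
# A lone "/" is not flagged.
def keys_with_trailing_slash(profile)
  profile
    .select { |_, value| value.is_a?(String) && value.length > 1 && value.end_with?("/") }
    .keys
end

# Sample data: the trailing slash on duplicity_snapshot_url is added on
# purpose to show what the check reports.
profile = YAML.safe_load(<<~YAML)
  ---
  mediawiki_workspace_on_host: "/home/smwckmain/Restored_SMWCK"
  mediawiki_workspace_in_container: /var/www/html
  duplicity_snapshot_url: "/mnt/Backup_SMWCK/"
YAML

puts keys_with_trailing_slash(profile).inspect
```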

Step 4

~/dataspectsSystem$ \
  SYSTEM_ARCHITECTURE=single_machine \
  DUPLICITY_SNAPSHOTS_SOURCE_URL="/home/lex/Dropbox/Backups/Backup_SMWCK/" \
  OS_USER_PROFILE="myOSUserProfile.yml" \
  RESTORE_DUPLICITY_SNAPSHOT_PROFILE="myDuplicitySnapshotRestoreProfile.yml" \
    vagrant up single_machine

Rebuild dataspects Search

In a single_machine setup the wiki operates in container smwckApache (172.20.0.6 on myNetwork0) exposing container port 80 as single_machine port 80.

The dataspectsMainAPI operates in a dataspects container (172.20.0.9 on myNetwork0), exposing container port 4567 as single_machine port 4567. It mounts the single_machine's /mnt/PROFILES/dataspectsSearch_config.yml as a volume at the container's /usr/src/dataspectsSearch_config.yml and runs puma using the container's internal config/puma.rb and config.ru.

Vagrantfile

Aspects

See The Twelve-Factor App

Segmentation Principles

  • mkdir ..., run docker ... are never abstracted

The stage for building blocks

  1. install plain vanilla OS instances on bare metal or virtual machines
  2. install and configure the minimum of packages to ensure the plain vanilla OS instances are healthy, performant and monitored
  3. install and configure the Docker platform on all machines
  4. install and configure specific users (name, groups, certificates, keys) on all machines

Currently dataspects System is segmented into these services:

Building Block "dataspects/apache"

Build image

Build environment:

~$ git clone https://github.com/dataspects/dockerized-apache.git

Dockerfile:

5.6

~/dockerized-apache$ vi 5.6/Dockerfile

7.2

~/dockerized-apache$ vi 7.2/Dockerfile

Build command:

~/dockerized-apache$ docker build --tag dataspects/apache5.6:180529 --file 5.6/Dockerfile .
~/dockerized-apache$ docker build --tag dataspects/apache7.2:180529 --file 7.2/Dockerfile .

Run (create and start) container

Run environment:

~$ tree
.
├── 000-default.conf
└── MediaWiki_root_folder

Run command:

~$ docker run \
  --network <docker_network_name> \
  --ip <apache_container_ip_on_docker_network> \
  --publish <apache_port_on_host>:80 \
  --volume /home/user/MediaWiki_root_folder:/var/www/html/w \
  --volume /home/user/000-default.conf:/etc/apache2/sites-available/000-default.conf \
  --name <apache_container_name> \
  --detach \
  dataspects/apache5.6:180529
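The placeholders in the run command correspond to the configuration keys from Step 1. As a sketch of how the two could be connected, the Ruby helper below assembles the argument list from such a hash. The function `apache_docker_run_args` and the key `docker_network_name` are hypothetical; the volume for 000-default.conf is left out for brevity.

```ruby
require "shellwords"

# Builds the argument list for the apache `docker run` call from a config
# hash. Hash#fetch raises KeyError when a required key is absent.
def apache_docker_run_args(cfg)
  workspace = "#{cfg.fetch(:mediawiki_workspace_on_host)}:" \
              "#{cfg.fetch(:mediawiki_workspace_in_container)}"
  [
    "docker", "run",
    "--network", cfg.fetch(:docker_network_name),
    "--ip", cfg.fetch(:apache_container_ip_on_docker_network),
    "--publish", "#{cfg.fetch(:apache_port_on_host)}:80",
    "--volume", workspace,
    "--name", cfg.fetch(:apache_container_name),
    "--detach",
    cfg.fetch(:apache_image)
  ]
end

# Values taken from the "create a new system" configuration in Step 1.
cfg = {
  docker_network_name: "myNetwork0",
  apache_container_ip_on_docker_network: "172.20.0.10",
  apache_port_on_host: "80",
  mediawiki_workspace_on_host: "/home/vagrant/mediawiki",
  mediawiki_workspace_in_container: "/var/www/html",
  apache_container_name: "smwckApache",
  apache_image: "dataspects/apache7.2:180611"
}

# Print the command as a single shell-escaped line.
puts Shellwords.join(apache_docker_run_args(cfg))
```

Returning an array rather than a string keeps each argument separate, so the list can be handed to `system(*args)` without further quoting.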

Building Block "mariadb"

Run (create and start) container

Run command:

~$ docker run \
  --network <docker_network_name> \
  --ip <db_container_ip_on_docker_network> \
  --name <db_container_name> \
  --publish <db_port_on_host>:3306 \
  --volume <db_files_on_host>:/var/lib/mysql \
  --env MYSQL_ROOT_PASSWORD=<mysql_root_password> \
  --detach \
  mariadb:10.3.6

Building Block "elasticsearch"

Building Block "dataspects/dataspects"

Build image

Build environment:

~$ git clone https://github.com/dataspects/dockerized-dataspects.git
~/dockerized-dataspects$ git clone https://github.com/dataspects/dataspects.git
~/dockerized-dataspects$ git clone https://github.com/dataspects/dataspectsMainAPI.git

Build command:

~/dockerized-dataspects$ docker build --tag dataspects .

Run (create and start) container

Run environment:

/mnt$ git clone https://github.com/dataspects/dataspectsOntologyEngineering.git
~/PROFILES$ tree
.
└── dataspects_profile.yml

Run command:

# Run dataspects
~$ docker run --rm \
  --volume /mnt:/mnt \
  --volume ~/PROFILES:/PROFILES \
  --workdir /usr/src/dataspects \
  dataspects \
  bundle exec bin/dataspects \
    -p /PROFILES/dataspects_profile.yml \
    manage /mnt/dataspectsOntologyEngineering/jobs/dataspects/index_github_dataspects.rb
# Run dataspectsMainAPI
~$ docker run --rm \
  --publish 4567:4567 \
  --workdir /usr/src/dataspectsMainAPI \
  dataspects \
  puma -C config/puma.rb -e production config.ru

Building Block "dataspects/duplicity"

Build image

Build environment Dockerfile Build command

Run (create and start) container

Run command:

Create snapshot

~$ docker run \
  --rm \
  --user=`id -u <snapshooting_os_user>` \
  -e PASSPHRASE='<gnupg_key_passphrase>' \
  -v $PWD/.cache:/home/duplicity/.cache/duplicity \
  -v $PWD/.gnupg:/home/duplicity/.gnupg \
  -v <source_directory_urn>:/source/ \
  -v <duplicity_target_snapshot_url>:/target/:rw \
  <duplicity_image> \
  duplicity \
  --allow-source-mismatch \
  --verbosity 9 \
  --encrypt-key <gnupg_key_id> \
  /source/ file:///target/
Restore snapshot
~$ docker run \
  --rm \
  --user=`id -u <lsdr_restoring_os_user>` \
  -e PASSPHRASE='<gnupg_key_passphrase>' \
  -v $PWD/.cache:/home/duplicity/.cache/duplicity \
  -v $PWD/.gnupg:/home/duplicity/.gnupg \
  -v <duplicity_snapshot_url>:/source/ \
  -v <restore_target_directory_urn>:/target/:rw \
  <duplicity_image> \
  duplicity \
  --allow-source-mismatch \
  --verbosity 9 \
  --encrypt-key <gnupg_key_id> \
  file:///source/ /target/

Profiling

dataspects Building Blocks shall be semantically profiled for faceting.

Implicit Semantic Profiling (preferred)
  • [[UsesAnsibleVariable::]] can be featurized by /\{{2}([.\w-]+)\}{2}/i skipping {{item}} (which is an Ansible meta variable for loops).
Explicit Semantic Profiling (if necessary)
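The regular expression given for implicit profiling can be tried out directly in Ruby. The sketch below scans a task text for Ansible variables, skips `item`, and emits the matching annotations. The helper `ansible_variables` and the sample task line are illustrative assumptions. Note that the expression, as written, does not match variables with spaces inside the braces.

```ruby
# The expression from the text: two opening braces, a name made of
# word characters, dots or hyphens, two closing braces.
ANSIBLE_VARIABLE = /\{{2}([.\w-]+)\}{2}/i

# Returns the distinct variable names found in the text, without the
# Ansible loop variable "item".
def ansible_variables(text)
  text.scan(ANSIBLE_VARIABLE).flatten.uniq - ["item"]
end

# Hypothetical task line for illustration.
task = "docker run --name {{db_container_name}} " \
       "--ip {{db_container_ip_on_docker_network}} {{item}} {{ spaced }}"

ansible_variables(task).each do |name|
  puts "[[UsesAnsibleVariable::#{name}]]"
end
```

The `{{ spaced }}` token is not reported, because the character class does not include whitespace.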

Conceptual Layers

Notice:
  • Write the code you wish you had.
  • Good architecture maximizes the number of decisions not made.

Conceptual Layer "Assertional Ontologies"

Conceptual Layer "Tools"

Conceptual Layer "Terminological Ontologies"

Conceptual Layer "Extensions"

  • This conceptual layer provides Ansible roles to install any selection of MediaWiki extensions.
    • Methods: Composer, Special:ExtensionDistributor, GitHub
  • These Ansible roles will extend/alter Conceptual Layer "Infrastructure" and Conceptual Layer "MediaWiki" according to the requirements of the MediaWiki extensions being installed.
relies on everything below

Conceptual Layer "MediaWiki"

  • This conceptual layer provides Ansible roles to install any version of MediaWiki that hasn't yet reached its end-of-life.
  • These Ansible roles will extend/alter Conceptual Layer "Infrastructure" in accordance with the requirements of the MediaWiki version being installed.
relies on everything below

Conceptual Layer "Web services"

relies on everything below

Conceptual Layer "Infrastructure"

This conceptual layer provides Ansible roles to provision one or more machines so that they are able to host a MediaWiki-centric system.

  • AnsibleRole "Upgrade infrastructure"

Web server

E.g. Apache

Database management system (DBMS)

E.g. MariaDB

  • AnsibleRole "Install DBMS"
  • AnsibleRole "Upgrade DBMS"
  • AnsibleRole "Secure DBMS"

Server frameworks

E.g. Node.js

Search engines

E.g. Elasticsearch

relies on everything below

Conceptual Layer "Server Operating System Abstraction"

This layer is necessary if no pertinent Ansible module or plugin is available and the decision is not to implement one.

E.g. for managing Apache virtual hosts, the location of the virtual host files differs: Ubuntu: /etc/apache2/sites-available/, CentOS: /etc/httpd/sites-available/.

Notice:

E.g. where should an OS-agnostic {{sites-available-directory}} Ansible variable be mapped to /etc/apache2/sites-available/ or /etc/httpd/sites-available/ respectively?
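One possible answer is a single lookup table owned by this layer, so that every layer above only refers to the OS-agnostic name. The Ruby sketch below illustrates the idea with the two paths named above. The constant and function names are hypothetical, and the document leaves open where such a mapping should actually live.

```ruby
# Mapping from distribution name to the directory holding virtual host
# files, using the two paths given in the text.
SITES_AVAILABLE_DIRECTORY = {
  "Ubuntu" => "/etc/apache2/sites-available/",
  "CentOS" => "/etc/httpd/sites-available/"
}.freeze

# Resolves the OS-agnostic directory for a distribution, failing loudly
# for distributions without a mapping.
def sites_available_directory(distribution)
  SITES_AVAILABLE_DIRECTORY.fetch(distribution) do
    raise ArgumentError, "no sites-available mapping for #{distribution}"
  end
end

# Full path of a virtual host file on the given distribution.
def virtual_host_file(distribution, file_name)
  File.join(sites_available_directory(distribution), file_name)
end

puts virtual_host_file("Ubuntu", "000-default.conf")
puts virtual_host_file("CentOS", "000-default.conf")
```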

relies on everything below

Conceptual Layer "Server Operating System"