Architecture "dataspects Search"

Component1127210566

Content

[Figure: Lex1804051221A.png]

dataspects Core

Core

Settings

Mappings

Entities

class Entity
  def initialize(oResource)

class Subject < Entity
  def initialize(oResource)
    super(oResource)

class Property < Entity
  def initialize(oResource)
    super(oResource)

Resources

The abstract Dataspects::Resource class defines defaults for subclasses:

  • default aEntities as ResourceEntitizer.the_entire_resource_is_one_single_entity
  • default resource-level annotations for its subclasses:
    • sHasResourceName
    • sHasResourceURL
    • ...
  • default entity-level annotations for entities extracted from its subclasses:
    • sHasEntityName
    • sHasEntityTitle
    • ...
class Resource
  attr_reader   :sHasResourceName, :sHasResourceURL
  attr_reader   :sHasResourceType
  attr_reader   :oResourceContent, :oResourceSilo
  def initialize(oResourceSilo)
  def aEntities
    oRE = ResourceEntitizer.new(self)
    oRE.the_entire_resource_is_one_single_entity
    return oRE.aEntities
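
The default `aEntities` behaviour outlined above could work roughly as follows. This is only a sketch: the `Sketch*` classes are hypothetical stand-ins, and the internals of `ResourceEntitizer` are an assumption, since the outline shows only its name, the method `the_entire_resource_is_one_single_entity` and the `aEntities` reader.

```ruby
# Hypothetical stand-in for Dataspects::Entity
class SketchEntity
  attr_reader :oResource

  def initialize(oResource)
    @oResource = oResource
  end
end

# Hypothetical stand-in for ResourceEntitizer (internals assumed)
class SketchResourceEntitizer
  attr_reader :aEntities

  def initialize(oResource)
    @oResource = oResource
    @aEntities = []
  end

  # Default strategy: wrap the whole resource in exactly one entity
  def the_entire_resource_is_one_single_entity
    @aEntities = [SketchEntity.new(@oResource)]
  end
end

# Hypothetical stand-in for Dataspects::Resource
class SketchResource
  def aEntities
    oRE = SketchResourceEntitizer.new(self)
    oRE.the_entire_resource_is_one_single_entity
    oRE.aEntities
  end
end

aEntities = SketchResource.new.aEntities
puts aEntities.length
```

A subclass that declares several entities per resource would override `aEntities` instead of relying on this default.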

class SemanticMediaWikiPage < Resource
  attr_reader   :sWikitext
  def initialize oResourceSilo, sSMWPageName
    super(oResourceSilo)
  def sHasEntityTitle
    return @oFullHTMLSource.xpath('//title').text
  def sHasEntityBlurb
    sRandomValueForPROPERTYNAME('HasEntityBlurb')

class HTMLDocument < Resource
  attr_accessor :nokogiriHTMLDoc_HasEntityHTMLContent
  attr_reader   :oHasSubjectTitle, :oHasSubjectType
  def initialize oResourceSilo, sHasResourceURL
    super(oResourceSilo)

ResourceSilos

class ResourceSilo
  attr_accessor :sResourceSiloLabel
  attr_reader   :sResourceSiloID
  def initialize(oProfiles, hOptions)
    @oProfiles = oProfiles

class SemanticMediaWiki < ResourceSilo
  attr_reader   :oAPI, :sAPIUrl, :sScriptPath, :sTopicURLPrefix
  attr_accessor :hSMWPageTitles
  def initialize(oProfiles, sSMWIdentifier, hOptions)
    super(oProfiles, hOptions)
    @sResourceSiloID = @sAPIUrl

class Sitemap < ResourceSilo
  def initialize(oProfiles, sSMWIdentifier, hOptions)
    super(oProfiles, hOptions)
    @sResourceSiloID = sSitemapURL


Entities

  • Knowledge is managed as entities. An entity is either a subject (topic) or a property (relationship).
  • Every entity has a provenance profile: it is extracted from a resource, which in turn is extracted from a resource silo.
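
The provenance chain described in the two points above can be illustrated with a minimal sketch. The `Sketch*` structs, the method `hProvenance` and the example URLs are hypothetical; only the silo -> resource -> entity relationship is taken from the text.

```ruby
# Hypothetical stand-ins for ResourceSilo, Resource and Entity
SketchSilo     = Struct.new(:sResourceSiloID)
SketchResource = Struct.new(:sHasResourceURL, :oResourceSilo)
SketchEntity   = Struct.new(:sHasEntityName, :oResource) do
  # Walk back from the entity to its resource and its resource silo
  def hProvenance
    {
      entity: sHasEntityName,
      resource: oResource.sHasResourceURL,
      silo: oResource.oResourceSilo.sResourceSiloID
    }
  end
end

oSilo     = SketchSilo.new('https://wiki.example.org/api.php')
oResource = SketchResource.new('https://wiki.example.org/wiki/Page_A', oSilo)
oEntity   = SketchEntity.new('Page A', oResource)

puts oEntity.hProvenance
```

Because each entity keeps a reference to its resource, and each resource to its silo, the provenance never has to be stored separately.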

Terminological

entity.rb
Learning bite:

Entities DO have annotations!

module Dataspects
  class Entity
    # Constants/defaults set in subclasses
    attr_reader :oHasEntityClass
    # Constants/defaults set in this class
    attr_reader :oHasEntityName, :oHasEntityTitle, :oHasEntityType
    attr_reader :oHasEntityContent
    attr_reader :aHasEntityKeywords
    attr_reader :aHasEntityAnnotations

    def initialize oResource
      @oResource = oResource # Now the entity knows from which resource it originated
      # Defaults
      @oHasEntityName = Annotation.new('HasEntityName', @oResource.default_HasEntityName)
      @oHasEntityTitle = Annotation.new('HasEntityTitle', @oResource.default_HasEntityTitle)
      @oHasEntityType = Annotation.new('HasEntityType', @oResource.default_HasEntityType)
      @oHasEntityContent = Annotation.new('HasEntityContent', @oResource.default_HasEntityContent)
      @aHasEntityKeywords = @oResource.default_HasEntityKeywords
      @aHasEntityAnnotations = @oResource.default_HasEntityAnnotations
    end

    def set_HasEntityName sHasEntityName
      @oHasEntityName = Annotation.new('HasEntityName', sHasEntityName)
    end

    def set_HasEntityTitle sHasEntityTitle
      @oHasEntityTitle = Annotation.new('HasEntityTitle', sHasEntityTitle)
    end

    def set_HasEntityType sHasEntityType
      @oHasEntityType = Annotation.new('HasEntityType', sHasEntityType)
    end

    def set_HasEntityContent sHasEntityContent
      @oHasEntityContent = Annotation.new('HasEntityContent', sHasEntityContent)
    end

    def set_HasEntityKeywords aKeywords
      @aHasEntityKeywords = aKeywords
    end

    def set_HasEntityAnnotations aAnnotations
      @aHasEntityAnnotations = aAnnotations
    end
  end
end
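property.rb
module Dataspects
  class Property < Entity
    def initialize oResource
      super(oResource)
      # Set per instance so that Entity's attr_reader :oHasEntityClass can see it
      @oHasEntityClass = Annotation.new('HasEntityClass', 'property')
    end
  end
end
subject_type.rb
module Dataspects
  class SubjectType < Entity
    def initialize oResource
      super(oResource)
      @oHasEntityClass = Annotation.new('HasEntityClass', 'subject-type')
    end
  end
end
subject_role.rb
module Dataspects
  class SubjectRole < Entity
    def initialize oResource
      super(oResource)
      @oHasEntityClass = Annotation.new('HasEntityClass', 'subject-role')
    end
  end
end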

Assertional

subject.rb
module Dataspects
  class Subject < Entity
    def initialize oResource
      super(oResource)
      @oHasEntityClass = Annotation.new(
        'HasEntityClass',
        'subject'
      )
    end
  end
end
annotation.rb

See Concept "DSKMF Reification"

module Dataspects
  class Annotation < Entity
    @oHasEntityClass = Annotation.new(
      'HasEntityClass',
      'annotation' # Reification
    )

    @oProperty
    @oValues

    def initialize sPropertyName, mPropertyValue
    end
annotation_values.rb
module Dataspects
  class AnnotationValues

    def initialize a_AnnotationValueObjects
    end
annotation_value.rb
module Dataspects
  class AnnotationValue

    @sValue

    def new_from_sSTRING sString
    end
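
The indexing code on this page reads annotations through the chain `oValues.oFirstValue.sValue` (and `.sLanguage`). The following sketch shows one way the three classes could fit together to support that chain. All `Sketch*` classes are hypothetical; treating `new_from_sSTRING` as a class-level constructor and defaulting `sLanguage` to an empty string are assumptions, not something the outlines above confirm.

```ruby
# Hypothetical stand-in for Dataspects::AnnotationValue
class SketchAnnotationValue
  attr_reader :sValue, :sLanguage

  # sLanguage defaults to '' (assumption): no language suffix
  def initialize(sValue, sLanguage = '')
    @sValue = sValue
    @sLanguage = sLanguage
  end

  # Assumed to be a class-level constructor
  def self.new_from_sSTRING(sString)
    new(sString)
  end
end

# Hypothetical stand-in for Dataspects::AnnotationValues
class SketchAnnotationValues
  def initialize(a_AnnotationValueObjects)
    @a_AnnotationValueObjects = a_AnnotationValueObjects
  end

  def oFirstValue
    @a_AnnotationValueObjects.first
  end
end

# Hypothetical stand-in for Dataspects::Annotation
class SketchAnnotation
  attr_reader :sPropertyName, :oValues

  # mPropertyValue may be a single value or an array of values
  def initialize(sPropertyName, mPropertyValue)
    @sPropertyName = sPropertyName
    aObjects = Array(mPropertyValue).map do |sString|
      SketchAnnotationValue.new_from_sSTRING(sString.to_s)
    end
    @oValues = SketchAnnotationValues.new(aObjects)
  end
end

oTitle = SketchAnnotation.new('HasEntityTitle', 'Architecture "dataspects Search"')
puts oTitle.oValues.oFirstValue.sValue
puts "HasEntityTitle#{oTitle.oValues.oFirstValue.sLanguage}"
```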

Elasticsearch Indexing

run_indexing_process_for_smw-cindykate.com.rb
require 'json'
require 'customizing_module_for_smw-cindykate.com.rb'

oSMWPages = Dataspects::Facet.new(@hOptions)
oSMWPages.from_oSEMANTICMEDIAWIKI(oSMW)
# The block below is passed to Facet.from_mCATEGORIES(mCategories, &block);
# @block.call(oResource) is then run in get_entities_batch_for_category()
oSMWPages.from_mCATEGORIES('Subject') do |oResource| # Direct iteration through facet filter
  oResource.aEntities.each do |oEntity|
    sJobContext = oResource.sHasResourceURL
    begin
      aEntityAnnotationsAtIndexing = [
        Dataspects::DSKMF.h_dskmfProperty(
          'TestAnnotation',
          ['value']
        )
      ]
      jsonDOC = {
        # Resource level
        HasResourceName: oResource.sHasResourceName,
        HasResourceURL: oResource.sHasResourceURL,
        HasResourceType: oResource.sHasResourceType,
        # Entity level
        HasEntityClass: oEntity.oHasEntityClass.oValues.oFirstValue.sValue,
        # Entity/subject level
        HasEntityName: oEntity.oHasEntityName.oValues.oFirstValue.sValue,
        HasEntityType: oEntity.oHasEntityType.oValues.oFirstValue.sValue,
        "HasEntityTitle#{oEntity.oHasEntityTitle.oValues.oFirstValue.sLanguage}":
          oEntity.oHasEntityTitle.oValues.oFirstValue.sValue,
        "HasEntityTypeAndEntityTitle#{oEntity.oHasEntityTitle.oValues.oFirstValue.sLanguage}":
          "#{oEntity.oHasEntityType.oValues.oFirstValue.sValue} \"#{oEntity.oHasEntityTitle.oValues.oFirstValue.sValue}\"",
        HasEntityKeywords: [
          "SMWCK"
        ]+oEntity.aHasEntityKeywords,
        "HasEntityContent#{oEntity.oHasEntityContent.oValues.oFirstValue.sLanguage}":
          oEntity.oHasEntityContent.oValues.oFirstValue.sValue,
        HasEntityAnnotations: aEntityAnnotationsAtIndexing+oEntity.aHasEntityAnnotations
      }
      Dataspects.logMessage("STORING: #{sJobContext}...")
      # to_json is the call that can raise JSON::GeneratorError, so it must sit inside begin/rescue
      oESC.store_jsonDOC_in_sINDEX(jsonDOC.to_json, sIndexName)
    rescue JSON::GeneratorError
      # Skip this entity instead of storing a broken document
      Dataspects.errorMessage("JSON::GeneratorError for #{sJobContext}")
    end
  end
end
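
The comment in the script above says that the block is handed to `Facet.from_mCATEGORIES(mCategories, &block)` and later invoked via `@block.call(oResource)` in `get_entities_batch_for_category()`. A minimal sketch of that pattern follows. `SketchFacet` and its in-memory hash of resources are hypothetical; the real facet presumably fetches pages from the wiki in batches.

```ruby
# Hypothetical stand-in for Dataspects::Facet
class SketchFacet
  # hResourcesByCategory replaces the wiki lookup for this sketch
  def initialize(hResourcesByCategory)
    @hResourcesByCategory = hResourcesByCategory
  end

  # Store the caller's block, then process every requested category
  def from_mCATEGORIES(mCategories, &block)
    @block = block
    Array(mCategories).each do |sCategory|
      get_entities_batch_for_category(sCategory)
    end
  end

  private

  # Call the stored block once per resource in the category
  def get_entities_batch_for_category(sCategory)
    @hResourcesByCategory.fetch(sCategory, []).each do |oResource|
      @block.call(oResource)
    end
  end
end

aSeen = []
oFacet = SketchFacet.new({ 'Subject' => ['Page A', 'Page B'], 'Other' => ['Page C'] })
oFacet.from_mCATEGORIES('Subject') { |oResource| aSeen << oResource }
puts aSeen.inspect
```

Only the resources of the requested category reach the block, which is what allows the indexing script to iterate "directly through the facet filter".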
customizing_module_for_smw-cindykate.com.rb
module Dataspects
  class SemanticMediaWikiPage
    # Define how this page declares entities
    def aEntities
      # This page declares a single subject
      oEntity = Subject.new(self)
      # CUSTOMIZATIONS
      # No customized HasEntityName -> fallback to SemanticMediaWikiPage default
      oEntity.set_HasEntityTitle(sRandomValueForPROPERTYNAME('HasTitle'))
      oEntity.set_HasEntityType(sRandomValueForPROPERTYNAME('HasType')[5..-1])
      @oFullHTMLSource.remove_section_from_HasEntityHTMLContent!("//div[@class='printfooter']")
      @oFullHTMLSource.remove_section_from_HasEntityHTMLContent!("//div[@id='footer']")
      @oFullHTMLSource.remove_section_from_HasEntityHTMLContent!("//div[@id='mw-navigation']")
      @oFullHTMLSource.remove_section_from_HasEntityHTMLContent!("//div[@id='catlinks']")
      oEntity.set_HasEntityContent(@oFullHTMLSource.sSanitizedHasEntityHTMLContent)
      oEntity.set_HasEntityKeywords(aKeywords_by_SMWPROPERTYNAME('HasEntityKeyword'))
      return [oEntity]
    end
  end
end

dataspects Main API

# dataspectsSearch_config.yml
---
dataspectsESCluster: 
sSMWArticlePath: 
sDomain: 
sIndexName:

Modules/Classes

# dataspectsMainAPI.rb
require 'sinatra/base'

class DataspectsMainAPI < Sinatra::Base
  post '/dataspectsSearchAPI' do
    ...
    oSic = Dataspects::SearchInterfaceComponent.new(h_dataspectsSearch_config, hPOSTFields)
    # Suggestions take precedence over a full search; with neither field
    # present, the empty search interface is rendered
    if hPOSTFields.key?('sQueryStringForSuggestions')
      oSic.html_suggestions(hPOSTFields['sQueryStringForSuggestions'])
    elsif hPOSTFields.key?('sQueryStringForSearch')
      oSic.html_searchResults
    else
      oSic.html_searchInterface
    end
  end
end
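
The dispatch order of the route above can be isolated in a small helper for illustration. `sketch_action_for` is a hypothetical name and not part of dataspects; it only mirrors the if/elsif/else chain and returns the name of the method that would be called.

```ruby
# Hypothetical helper mirroring the route's dispatch logic
def sketch_action_for(hPOSTFields)
  if hPOSTFields.key?('sQueryStringForSuggestions')
    :html_suggestions
  elsif hPOSTFields.key?('sQueryStringForSearch')
    :html_searchResults
  else
    :html_searchInterface
  end
end

puts sketch_action_for({ 'sQueryStringForSearch' => 'smw' })
puts sketch_action_for({})
```

Note that a request carrying both fields is answered with suggestions, because that key is checked first.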

dataspects Search Core

Index Document Mappings

doc

  • properties
    • Resource level: HasResourceName
    • Entity level: HasEntityType
    • Subject level: HasSubjectType
    • Nested HasSubjectPropertyInstances
      • properties
        • subjectPropertyName
        • Nested subjectPropertyValues
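
Expressed as a Ruby hash, the mapping outline above might look like the following sketch. Only the field names and the two nested levels come from the outline; the `keyword` field types are assumptions, and the properties inside `subjectPropertyValues` are left out because the outline does not list them.

```ruby
require 'json'

# Field types are assumptions; names and nesting follow the outline
hMapping = {
  properties: {
    # Resource level
    HasResourceName: { type: 'keyword' },
    # Entity level
    HasEntityType: { type: 'keyword' },
    # Subject level
    HasSubjectType: { type: 'keyword' },
    HasSubjectPropertyInstances: {
      type: 'nested',
      properties: {
        subjectPropertyName: { type: 'keyword' },
        subjectPropertyValues: { type: 'nested' }
      }
    }
  }
}

puts JSON.pretty_generate(hMapping)
```

Declaring `HasSubjectPropertyInstances` as `nested` keeps each property name together with its own values at query time, instead of flattening all names and values of a document into one pool.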

Modules/Classes

search_interface_component.rb

module Dataspects
  class SearchInterfaceComponent
    def initialize h_dataspectsSearch_config, hPOSTFields
      ...
      loadPlugins
    end
    def html_searchInterface
      ...
      return HTMLComponents.html_searchInterface(self)
    end
    def html_searchResults
      ...
      oResultForTermOnSystem = ResultsForTermInFacet.new(sQueryString, oFacet)
      html << oResultForTermOnSystem.html_searchResults
      return "<div id='searchResults'>#{html.join()}</div>"
    end
    def html_suggestions sQueryString
      ...
      return aHTML.join('')
    end

result_for_term_in_facet.rb

module Dataspects
  class ResultsForTermInFacet
    def initialize sQueryTerm, oFacet
      ...
    end
    def html_searchResults
      aElasticsearchHits = ElasticsearchQueries.aElasticsearchHits_standardSearch(@sQueryTerm, @oFacet)
      # Render each hit as one HTML fragment, then concatenate the fragments
      aLinks = aElasticsearchHits.map do |oElasticsearchHit|
        HTMLComponents.html_individualSearchResult(oElasticsearchHit)
      end
      return aLinks.join()
    end
  end
end

html_components.rb

module Dataspects
  module HTMLComponents
    def self.html_searchInterface oSic
      ...
    def self.html_individualSearchResult oElasticsearchHit
      ...

elasticsearch_queries.rb

module Dataspects
  module ElasticsearchQueries
    def self.aElasticsearchHits_standardSearch sQueryTerm, oFacet
      ...
      return esr.aElasticsearchHits
    end

User Interface

<!-- dataspectsSearchHTMLInterface.html -->
<!-- This could be already provided by context BEGIN -->
<script src="https://code.jquery.com/jquery-3.2.1.min.js"></script>
<script src="https://code.jquery.com/ui/1.12.1/jquery-ui.min.js"></script>
<link rel="stylesheet" href="https://code.jquery.com/ui/1.12.1/themes/base/jquery-ui.css"/>
<!-- This could be already provided by context END -->
<!-- dataspectsSearch BEGIN -->
<div id="dataspectsSearchMainContainerDiv"></div>
<link rel="stylesheet" href="/dataspectsSearchThemeStandard/dataspectsSearch.css"/>
<script src="/dataspectsSearch_library.js"></script>
<script src="/dataspectsSearch_config.js"></script>
<script src="/dataspectsSearch_loader.js"></script>
<!-- dataspectsSearch END -->

dataspects Plugin "mySearchEngine"

Index Document Mappings

Modules/Classes

# dataspectsSearch_config.yml
---
...
sPluginURLs: [
  dataspectsPlugins/mySearchEngine
]

dataspectsPlugins/mySearchEngine/


# html_components.rb
module Dataspects
  module HTMLComponents
    def self.html_searchInterface oSic
      ...
    def self.html_individualSearchResult oElasticsearchHit
      ...
result_for_term_in_facet.rb
search_interface_component.rb
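
How `loadPlugins` resolves `sPluginURLs` is not shown on this page. The sketch below assumes that each plugin directory may provide its own versions of the three files listed above, and it only computes the candidate paths without touching the file system. The config string, the `sIndexName` value and the constant `SKETCH_PLUGIN_FILES` are hypothetical.

```ruby
require 'yaml'

# Hypothetical config, shaped like dataspectsSearch_config.yml
sConfig = <<~YAML
  ---
  sIndexName: example_index
  sPluginURLs: [dataspectsPlugins/mySearchEngine]
YAML

# File names a plugin may override (taken from the listing above)
SKETCH_PLUGIN_FILES = %w[
  html_components.rb
  result_for_term_in_facet.rb
  search_interface_component.rb
].freeze

hConfig = YAML.safe_load(sConfig)

# Build the candidate override paths for every configured plugin
aPluginFiles = hConfig.fetch('sPluginURLs', []).flat_map do |sPluginURL|
  SKETCH_PLUGIN_FILES.map { |sFile| File.join(sPluginURL, sFile) }
end

puts aPluginFiles
```

A loader built on this idea could then load whichever of these paths exist, so that the plugin's definitions replace the core ones.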

User Interface