How would you implement sorting search results by relevance? The following code uses sort_by with some success but is limited. Using sort to compare two elements is difficult because it yields two Allocation objects containing terms, and I have no idea what to do with those.
Perhaps Picky already has something for that? In any case, I would like to sort results by most relevant, then by uri, then by body. It would be useful to be able to use some kind of sort { |a, b| ... } to compare two elements.
The code below prioritizes entries whose uri contains the term. It could be an interesting example for the manual (#140).
require("picky")
require("open-uri")
require("net/https")
require("pp")

Picky.logger = Picky::Loggers::Silent.new

# URI.open replaces the bare open(url) call, which no longer fetches URLs on Ruby 3.
doc = URI.open(
  "https://raw.githubusercontent.com/Shoes3/shoes3/master/static/manual-en.txt",
  ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE
).read

Entry = Struct.new :id, :uri, :body

# Split the manual into entries: a "= Title =" line starts a new entry,
# and every following non-blank line is appended to its body.
entry = nil
entries = []
doc.each_line { |n|
  if n =~ /^=+ (.*) =+$/
    entry = Entry.new(entries.size, n.strip, "")
    entries << entry
  elsif !n.strip.empty? && !entry.nil?
    entry.body += n
  end
}

index = Picky::Index.new :terms do
  indexing removes_characters: %r{[^a-z0-9\s\/\-\_\:\"\&\.]}i,
           splits_text_on: %r{[\s/\-\_\:\"\&/\.]}
  category :uri, :from => lambda { |doc| doc.uri.dup }
  category :body, :from => lambda { |doc| doc.body.dup }
end

search = Picky::Search.new index do
  searching removes_characters: %r{[^a-z0-9\s\/\-\_\:\"\&\.]}i,
            splits_text_on: %r{[\s/\-\_\:\"\&/\.]}
end

puts "total entries #{entries.size}"
entries.each { |n| index.add n }

term = "image"
retval = []
results = search.search(term, entries.size, 0)
# Rank ids whose uri contains the term first (key 0); all others are keyed by their id.
results.sort_by { |id| entries.detect { |n| n.id == id }.uri =~ /#{term}/ ? 0 : id }
results.ids.each do |id|
  entry = entries.detect { |n| n.id == id }
  retval << entry.uri unless entry.nil?
end
puts "total retval #{retval.uniq.size}"
pp retval.uniq
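
For the "uri first, then body" ordering asked about above, one possible approach is to sort the plain id list with an Array as the sort key, since Ruby compares arrays element by element. This is only a sketch in plain Ruby and does not use any Picky API: rank_ids is a hypothetical helper, the sample entries are made up, and the case-insensitive match is an assumption about what "contains the term" should mean.

```ruby
# Stand-in for the Entry struct from the code above.
Entry = Struct.new(:id, :uri, :body)

# rank_ids is a hypothetical helper, not part of Picky.
# It orders ids so that entries whose uri contains the term come first,
# then entries whose body contains it, with the id as the final tie-breaker.
def rank_ids(ids, entries, term)
  by_id   = entries.to_h { |e| [e.id, e] }   # hash lookup instead of repeated detect
  pattern = /#{Regexp.escape(term)}/i        # assumption: literal, case-insensitive match

  ids.sort_by do |id|
    entry = by_id.fetch(id)
    [
      entry.uri  =~ pattern ? 0 : 1,         # uri matches sort first
      entry.body =~ pattern ? 0 : 1,         # then body matches
      id                                     # stable, predictable tie-breaker
    ]
  end
end

# Made-up sample data, for illustration only.
entries = [
  Entry.new(0, "= Shoes =", "Shoes can draw an image."),
  Entry.new(1, "= Image =", "Loads a picture from disk."),
  Entry.new(2, "= Colors =", "Nothing relevant here."),
  Entry.new(3, "= Image Effects =", "Blur an image.")
]

ranked = rank_ids([0, 1, 2, 3], entries, "image")
p ranked # => [3, 1, 0, 2]
```

With the script above, the ids passed in could presumably be results.ids taken after the search. The Array key has the same effect as a two-element comparison of the form sort { |a, b| key(a) <=> key(b) }, without needing to deal with Allocation objects.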