Indexing is not tokenizing title or name

Bug #1205477 reported by Kapil Thangavelu
This bug affects 1 person
Affects: charmworld
Status: Triaged
Importance: High
Assigned to: Unassigned

Bug Description

It should be, afaics; i.e. when I search for rsyslog I should get results for both rsyslog-forwarder and rsyslog. Because the name is not being analyzed, 'rsyslog-forwarder' is never tokenized into 'rsyslog' and 'forwarder'. The same applies to bundle title: it is never going to match on title unless someone types in a sentence that matches exactly.
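The behaviour described above can be sketched in a few lines of Python. This is only an illustration, not charmworld's code or Elasticsearch's analyzer: the `tokenize` helper and both search functions are hypothetical, and the tokenizer simply approximates an analyzer that lowercases text and splits on punctuation.

```python
import re

# Hypothetical stand-in for an analyzer: lowercase, then split on
# anything that is not a letter or digit.
def tokenize(text):
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

NAMES = ["rsyslog", "rsyslog-forwarder"]

def search_not_analyzed(query, names=NAMES):
    # An un-analyzed field only matches the whole stored string.
    return [n for n in names if n == query]

def search_analyzed(query, names=NAMES):
    # An analyzed field matches any of the tokens produced at index time.
    return [n for n in names if query in tokenize(n)]

print(tokenize("rsyslog-forwarder"))   # ['rsyslog', 'forwarder']
print(search_not_analyzed("rsyslog"))  # ['rsyslog']
print(search_analyzed("rsyslog"))      # ['rsyslog', 'rsyslog-forwarder']
```

With the un-analyzed lookup, a search for rsyslog never reaches rsyslog-forwarder, which is the symptom reported here.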

Curtis Hovey (sinzui)
tags: added: elasticsearch
Changed in charmworld:
status: New → Triaged
importance: Undecided → High
Aaron Bentley (abentley) wrote :

If we analyze names, then searching on the exact, un-analyzed text will fail. It seems to me that "rsyslog-forwarder" failing to match "rsyslog-forwarder" is worse than it failing to match "rsyslog", since the term "rsyslog" is likely to appear elsewhere, in places that are analysed, such as the description and summary.

We don't have any rules for a "title" field, so it should be analyzed by default. AFAICT, cs:precise/rsyslog-forwarder-2 does not have a title. In any case, we do not search on a "title" field, so either way, it will not affect search results.
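The concern about analyzed names can be illustrated with a toy inverted index. This is a sketch under stated assumptions: `tokenize`, `build_index` and `term_lookup` are hypothetical helpers, and the lookup deliberately does not tokenize the query, mimicking an exact-term search against an analyzed field.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for an analyzer: lowercase and split on punctuation.
def tokenize(text):
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def build_index(names):
    # Map each token to the set of names that contain it.
    index = defaultdict(set)
    for name in names:
        for token in tokenize(name):
            index[token].add(name)
    return index

def term_lookup(index, term):
    # The query term is used as-is, without analysis.
    return sorted(index.get(term, set()))

index = build_index(["rsyslog", "rsyslog-forwarder"])

print(term_lookup(index, "rsyslog"))            # ['rsyslog', 'rsyslog-forwarder']
print(term_lookup(index, "rsyslog-forwarder"))  # []
```

Once the name is split at index time, the literal string "rsyslog-forwarder" no longer exists as a term, so the un-analyzed query finds nothing.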

Aaron Bentley (abentley) wrote :

Sorry, I missed that you were talking about bundle title.

Richard Harding (rharding) wrote :

This does appear to work now. Marking fix released. In production you have to uncheck reviewed charms, and in comingsoon it'll show up under the non-reviewed charms.

Changed in charmworld:
status: Triaged → Fix Released
Kapil Thangavelu (hazmat) wrote :

This isn't fixed. See another bug report on the same issue: 1220909

It only matches on a body text search because of the readme help content. Fielded search on title or description fails.

Aaron, query inputs can be tokenized too.

Kapil Thangavelu (hazmat) wrote :

Please ignore #4. It's not clear that title tokenization is happening (reverseproxy doesn't match), but indexing and search do appear to have improved enough on the overall charm metadata and readme that it's a non-issue.

Curtis Hovey (sinzui) wrote :

Thank you, Kapil. I was looking for this bug yesterday.

Changed in charmworld:
status: Fix Released → Triaged
Aaron Bentley (abentley) wrote :

Kapil, query inputs can be tokenized, but that is not a good choice when you are trying to do an exact match. When you are searching on "rsyslog-forwarder", you do not want "rsyslog" in your results.
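The trade-off in tokenizing the query can be sketched as follows. Everything here is an assumption for illustration: `tokenize`, `match_any` and `match_all` are hypothetical helpers, not charmworld or Elasticsearch functions. `match_any` models a tokenized query where any token may match; `match_all` is shown only as one possible alternative, where every query token must be present.

```python
import re

# Hypothetical stand-in for an analyzer: lowercase and split on punctuation.
def tokenize(text):
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

NAMES = ["rsyslog", "rsyslog-forwarder"]

def match_any(query, names=NAMES):
    # A name matches if it shares at least one token with the query.
    q = set(tokenize(query))
    return [n for n in names if q & set(tokenize(n))]

def match_all(query, names=NAMES):
    # A name matches only if it contains every token of the query.
    q = set(tokenize(query))
    return [n for n in names if q <= set(tokenize(n))]

print(match_any("rsyslog-forwarder"))  # ['rsyslog', 'rsyslog-forwarder']
print(match_all("rsyslog-forwarder"))  # ['rsyslog-forwarder']
```

With any-token matching, a search for "rsyslog-forwarder" also returns "rsyslog", which is the result objected to above. Whether that counts as noise or as a useful related hit is the point under discussion in this thread.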

Kapil Thangavelu (hazmat) wrote :

Yeah, I think you do; they're kind of related, and yes, they would both show up regardless, since both reference each other in their readme. Anyway, as I mentioned in comment #5, it does appear to be moot, as the search/indexing has improved significantly since this was filed, i.e. rsyslog matches rsyslog-forwarder, forward picks up rsyslog-forwarder, etc. I have to carefully construct an example to find one that doesn't match, as the readme content generally resolves to a hit.
