Intranet Search research and analysis console script
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
KARL3 | Fix Released | Medium | Chris Rossi |
Bug Description
We have an SOW for the second half of the year to provide an "Intranet Search": a search option that returns only the following:
1) Content under /offices
2) Content in communities that have a special marker
3) Profiles
This work is for two goals:
a. Better signal-to-noise ratio when using KARL for "business" content
b. Faster search performance by limiting content to a much smaller subset
The first step of this work is a research project to measure the impact and set a baseline we can use for a before/after comparison. I'd like a console script that can be run outside the app server so the measurements are more predictable. (Any other ideas to remove external influences are welcome.) The console script should do two things:
a. Time a current LiveSearch (prefix multigroup) across the whole database, then record the response time on this ticket.
b. Give a count of the current number of content resources in the entire database, versus a count of resources under /offices and /profiles (plus the communities Nat has in mind.)
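As a rough illustration of the two measurements, here is a minimal Python sketch. It is not the attached script and does not use the KARL catalog API: `fake_search` and `sample_paths` are hypothetical stand-ins for the real LiveSearch call and the real content tree, and the community marker check is left out.

```python
import time


def time_search(search, term):
    """Run one search and return (result_count, elapsed_seconds)."""
    start = time.perf_counter()
    results = search(term)
    elapsed = time.perf_counter() - start
    return len(results), elapsed


def count_by_prefix(paths, prefixes):
    """Return (total, subset): all paths, and those under any prefix."""
    # Normalize so "/offices" matches "/offices/x" but not "/officespace/x".
    normalized = tuple(p.rstrip("/") + "/" for p in prefixes)
    subset = sum(1 for path in paths if path.startswith(normalized))
    return len(paths), subset


# Hypothetical stand-in for the real search: prefix match on a tiny corpus.
def fake_search(term):
    corpus = ["paul", "staff", "stats", "calendar"]
    prefix = term.rstrip("*")
    return [word for word in corpus if word.startswith(prefix)]


# Hypothetical stand-in for the resource paths in the database.
sample_paths = [
    "/offices/nyc/report",
    "/profiles/paul",
    "/communities/chat/blog",
    "/offices/files/memo",
]

for term in ("paul", "sta*", "c*"):
    found, elapsed = time_search(fake_search, term)
    print(f"Searching {term}: found {found}, elapsed {elapsed:.4f} s")

total, subset = count_by_prefix(sample_paths, ["/offices", "/profiles"])
print(f"{subset} of {total} resources fall under /offices or /profiles")
```

Running each search more than once and recording every timing, not just the first, would help separate cold-cache from warm-cache behavior.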
This ticket is up for better ideas and thinking, so feel free to give feedback.
I've attached a script for getting live search timing information. You might have a better idea of which searches are most interesting to you. Feel free to tweak. The output looks something like:
karlstaging@karlstaging10 ~/staging/current $ bin/karlserve debug osf -S ~crossi/intranet-analysis/time_searches.py
Searching /?val=paul
Found 30
Elapsed: 5.46 s
Searching /?val=sta%2A
Found 30
Elapsed: 36.46 s
Searching /?val=c%2A
Found 30
Elapsed: 89.75 s
karlstaging@
Searching /?val=paul
Found 30
Elapsed: 4.47 s
Searching /?val=sta%2A
Found 30
Elapsed: 5.50 s
Searching /?val=c%2A
Found 30
Elapsed: 18.39 s
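For the before/after comparison, output in this format could be collected into records. The sketch below is one possible way to do that in Python, assuming every search prints the same three-line Searching/Found/Elapsed pattern shown above; the `parse_timings` helper is hypothetical and is not part of the attached script.

```python
import re
from urllib.parse import unquote

# Matches one Searching/Found/Elapsed triple as printed by the script.
PATTERN = re.compile(
    r"Searching /\?val=(?P<term>\S+)\s+"
    r"Found (?P<found>\d+)\s+"
    r"Elapsed: (?P<elapsed>[\d.]+) s"
)


def parse_timings(text):
    """Return a list of (term, found, elapsed_seconds) tuples."""
    return [
        (unquote(m["term"]), int(m["found"]), float(m["elapsed"]))
        for m in PATTERN.finditer(text)
    ]


# Sample input copied from the first run in this ticket.
sample = """Searching /?val=paul
Found 30
Elapsed: 5.46 s
Searching /?val=sta%2A
Found 30
Elapsed: 36.46 s
Searching /?val=c%2A
Found 30
Elapsed: 89.75 s
"""

for term, found, elapsed in parse_timings(sample):
    print(f"{term:<6} {found:>4} {elapsed:>7.2f} s")
```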