
Working on forum search

philip

48 months ago

Too early for a reliable ETA. My wild guess is that it'll be ready in dev in a day or two. However, we're not pushing any major code live until after Christmas so it'll probably be available close to New Year's.

(The last couple days I've done a boatload of refactoring, cleaning up various bits of code and templates. I suspect if I push that out now I'll break something... which isn't wise right before Christmas.)

Comments

  • 48 months ago
  • 3 points

don't push yourself too hard man. do what you can, take your time, and make it right.

this site is still amazing no matter what

  • 48 months ago
  • 1 point

This is something I have really wanted.

  • 48 months ago
  • 1 point

So, what goes into a forum search? Is it just text comparisons of all the threads/posts?

Like (and yea, just making this up on the spot so it's not following any actual coding... And I'm sure I'm leaving a lot out...):
Search: "Hi Philip"
For N = 1 to (number of threads in "Database") {
Match "Hi" = TRUE; ThreadN-relevance+1
Match "Philip" = TRUE; ThreadN-relevance+1
}
Sort ThreadN-relevance by Highest-Value, Most Recent
Output
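
For reference, the loop sketched above could look roughly like this in Python. It's only a minimal illustration of the naive "scan everything and count matches" idea, not how the site's search works. The `Thread` class, its fields, and `naive_search` are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    # Hypothetical record; the real forum schema is not known.
    title: str
    body: str
    posted_at: int  # larger value = more recent


def naive_search(query, threads):
    """Scan every thread and add one relevance point per query word found."""
    terms = query.lower().split()
    scored = []
    for thread in threads:
        words = (thread.title + " " + thread.body).lower().split()
        relevance = sum(1 for term in terms if term in words)
        if relevance > 0:
            scored.append((relevance, thread.posted_at, thread))
    # Highest relevance first, ties broken by most recent.
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [thread for _, _, thread in scored]
```

Every query has to touch every thread, so the cost grows with the size of the forum.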

  • 48 months ago
  • 2 points

Search is one of those things that can be either really simple or really complex. There's a whole domain on it called information retrieval.

Basic searches can do simple string comparisons, but pattern matching against tons of posts gets prohibitively slow. Search results need to come back in < 100ms, so simple pattern matching will be too slow once your database gets even modestly sized.

Instead, there are all sorts of indexing strategies where the posts are tokenized, categorized, and indexed. Those make the matching much faster, and often allow for all sorts of extra functionality, like automatically matching plural word forms and verb conjugations (typically called stemming).
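
As a rough sketch of the tokenize-and-index idea, here is a tiny inverted index in Python. The `stem` function is a deliberately crude stand-in for a real stemmer (it just strips a few suffixes), and all function names here are made up for illustration. Real search frameworks do far more than this.

```python
import re
from collections import defaultdict


def stem(word):
    # Crude suffix stripping, for illustration only; real stemmers are
    # much more careful about which endings they remove.
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    # Lowercase, split on anything that is not a letter or digit, then stem.
    return [stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]


def build_index(posts):
    """Map each stemmed token to the set of post ids that contain it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for token in tokenize(text):
            index[token].add(post_id)
    return index


def lookup(index, query):
    """Return ids of posts containing every (stemmed) query token."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    return set.intersection(*(index.get(t, set()) for t in tokens))
```

The index is built once up front, and a query then only touches the entries for its own tokens instead of rereading every post.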

I'm not an information retrieval expert by any stretch. Fortunately there are existing frameworks and setups that can be used. ElasticSearch, AWS CloudSearch, and others make it easy to build and use a performant index. You still need to know a bit about the domain to use it efficiently and effectively, but it takes a ton of the work out of it. For those, you basically set up an index structure, then upload all the "documents" (topics in our case), and then set up what fields to query.

  • 48 months ago
  • 1 point

Well, there's my reading for the day. (Seriously, thank you very much for the links!)

Very interesting... Definitely going to make me think more about how I search.

Thanks for taking the time to type up such a great response!

  • 48 months ago
  • 1 point

Awesome, you da man!

Take all the time you need.

  • 48 months ago
  • 1 point

Will it be a simple search or something advanced like searching by post, searching topic titles only, listing results as topics or as posts, searching within a specific timeframe, listing results in ascending or descending order etc?

Then you've got the issue of whether a search for something like "MSI Gaming" shows results for just "MSI Gaming", or for everything that has the word "MSI" or "Gaming" in it.
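
To make that distinction concrete, here is a small Python sketch of three matching modes: any word, all words, and exact phrase. The `matches` function and its `mode` names are hypothetical, made up for this example, and say nothing about which behavior the site will actually use.

```python
import re


def words(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(text, query, mode="any"):
    """Check whether text matches query under the given mode."""
    doc = words(text)
    terms = words(query)
    if not terms:
        return False
    if mode == "any":  # "MSI" or "Gaming" anywhere in the text
        return any(t in doc for t in terms)
    if mode == "all":  # both words, in any order or position
        return all(t in doc for t in terms)
    if mode == "phrase":  # the words next to each other, in order
        n = len(terms)
        return any(doc[i:i + n] == terms for i in range(len(doc) - n + 1))
    raise ValueError("unknown mode: " + mode)
```

For example, "Gaming on an MSI laptop" matches "MSI Gaming" in the any and all modes, but not in phrase mode.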

  • 48 months ago
  • 1 point

I imagine that they'll start with simple, then add advanced search options later.

  • 48 months ago
  • 1 point

Ding ding!

  • 48 months ago
  • 1 point

Ding ding with the Christmas bells, gotta have the spirit up Philip!

[comment deleted]
  • 48 months ago
  • 1 point

lol

[comment deleted]
[comment deleted]
[comment deleted by staff]
  • 48 months ago
  • 1 point

We've already completed build title/description search, but we don't index the comments. (Still debating whether we want to index the comments for those or not.)

[comment deleted by staff]
  • 48 months ago
  • 1 point

I can look into it, but if you've got specific examples I'd love to check them out to help identify what I may be doing wrong.
