Closed

Database architecture assistance/implementation for vertical search

This project received 9 bids from freelancers.

Employer working
Project Budget
N/A
Total Bids
9
Project Description

We need a database solution to handle storage, querying and a sort of faceting for the following structure of documents:

[STORAGE]

Document = {
  title: 'Document title',
  content: '500-word paragraph here',
  document_intervals: [
    {start_date: "07/31/2013", end_date: "08/15/2013"}
  ],
  tags: [
    {
      tag_id: 123,
      tag_intervals: [
        {start_date: "01/31/2013", end_date: "07/15/2013"},
        {start_date: "06/25/2013", end_date: "11/30/2013"}
      ],
      match_country: ["Romania", "France", "Spain"]
    },
    {
      tag_id: 146,
      tag_intervals: [] // an empty list means TAG 146 matches this document regardless of date
    }
  ]
}
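
To make the structure above concrete, here is a minimal Python sketch of how it might be loaded into typed objects. It is only an illustration: the class names, `parse_date` and `load_document` are hypothetical, and the MM/DD/YYYY date format is an assumption inferred from values such as "07/31/2013". A tag with no tag_intervals is loaded with an empty list, which is read as "matches regardless of date".

```python
from dataclasses import dataclass, field
from datetime import date, datetime

DATE_FORMAT = "%m/%d/%Y"  # assumption: month first, as "07/31/2013" implies


def parse_date(text: str) -> date:
    return datetime.strptime(text, DATE_FORMAT).date()


@dataclass(frozen=True)
class Interval:
    start: date
    end: date

    @classmethod
    def from_raw(cls, raw: dict) -> "Interval":
        return cls(parse_date(raw["start_date"]), parse_date(raw["end_date"]))


@dataclass
class Tag:
    tag_id: int
    # An empty list means the tag applies regardless of date.
    tag_intervals: list = field(default_factory=list)
    match_country: list = field(default_factory=list)


@dataclass
class Document:
    title: str
    content: str
    document_intervals: list
    tags: list


def load_document(raw: dict) -> Document:
    """Convert one raw document (nested dicts and lists) into typed objects."""
    tags = [
        Tag(
            tag_id=t["tag_id"],
            tag_intervals=[Interval.from_raw(i) for i in t.get("tag_intervals", [])],
            match_country=list(t.get("match_country", [])),
        )
        for t in raw.get("tags", [])
    ]
    return Document(
        title=raw["title"],
        content=raw["content"],
        document_intervals=[
            Interval.from_raw(i) for i in raw.get("document_intervals", [])
        ],
        tags=tags,
    )
```

Parsing dates once at load time keeps every later interval comparison a plain date comparison, whichever storage engine ends up holding the data.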

[SEARCHING]

The user should be able to search by TAG, by TIME INTERVAL, and by full-text search on CONTENT. Examples:

- "Show documents tagged with TAG 146"

- "Show documents that are tagged with TAG 123, but only if interval ["03/10/2013" - "03/10/2013"] overlaps at least one of the tag's TAG_INTERVALS"

- "Show documents where interval ["02/01/2013" - "03/01/2014"] overlaps at least one of the DOCUMENT_INTERVALS"

- "Show documents matching "some words" inside CONTENT"
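
As a reference for the expected semantics of the four examples above, here is a small in-memory Python sketch. It is not a scalable solution (a real engine would use indexes instead of a linear scan), and the function names, the tuple representation of intervals, and the naive substring matching standing in for full-text search are all assumptions made for illustration. Intervals are treated as closed, so a single-day query such as 03/10/2013 - 03/10/2013 is valid.

```python
from datetime import date


def overlaps(a_start, a_end, b_start, b_end):
    """Closed intervals overlap when each one starts no later than the other ends."""
    return a_start <= b_end and b_start <= a_end


def tag_matches(tag, tag_id, query=None):
    """True if `tag` is the wanted tag and is valid for the optional query interval."""
    if tag["tag_id"] != tag_id:
        return False
    intervals = tag.get("tag_intervals", [])
    if query is None or not intervals:
        return True  # no date filter, or the tag matches regardless of date
    return any(overlaps(query[0], query[1], s, e) for s, e in intervals)


def search(docs, tag_id=None, tag_interval=None, doc_interval=None, words=None):
    """Linear scan applying every filter that was supplied (reference semantics only)."""
    results = []
    for doc in docs:
        if tag_id is not None and not any(
            tag_matches(t, tag_id, tag_interval) for t in doc["tags"]
        ):
            continue
        if doc_interval is not None and not any(
            overlaps(doc_interval[0], doc_interval[1], s, e)
            for s, e in doc["document_intervals"]
        ):
            continue
        if words is not None:
            content = doc["content"].lower()
            # Naive substring test; a real engine would tokenize and rank.
            if not all(w in content for w in words.lower().split()):
                continue
        results.append(doc)
    return results


# Hypothetical sample data mirroring the document shown under [STORAGE].
DOCS = [
    {
        "title": "Document title",
        "content": "some words in a 500-word paragraph",
        "document_intervals": [(date(2013, 7, 31), date(2013, 8, 15))],
        "tags": [
            {
                "tag_id": 123,
                "tag_intervals": [
                    (date(2013, 1, 31), date(2013, 7, 15)),
                    (date(2013, 6, 25), date(2013, 11, 30)),
                ],
            },
            {"tag_id": 146, "tag_intervals": []},
        ],
    }
]
```

For example, `search(DOCS, tag_id=123, tag_interval=(date(2013, 3, 10), date(2013, 3, 10)))` returns the sample document because March 10 falls inside the tag's first interval.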

[NARROW SEARCH]

Once the documents are retrieved, we also need to show the matching tags and dates so the user can narrow their search (faceted search, but we don't necessarily need the item count for each tag).

Considerations for faceting:
- If the user's query contains a date interval, the "narrow-your-search" menu should only list tags that match that interval.
- Based on the DOCUMENT_INTERVALS of the matched documents, we also need to show the available days/months for further narrowing (which effectively means enabling/disabling clickable dates on a calendar).
- Even before any search is done, a selection of available tags should be displayed so the user can start their search.
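
The first two considerations above can be sketched in Python as follows. This is an illustration under assumptions rather than a recommended design: `facet_tags` and `available_days` are hypothetical names, intervals are (start, end) tuples of dates treated as closed, and expanding intervals day by day is only reasonable for short ranges.

```python
from datetime import date, timedelta


def overlaps(a_start, a_end, b_start, b_end):
    """Closed intervals overlap when each one starts no later than the other ends."""
    return a_start <= b_end and b_start <= a_end


def facet_tags(matched_docs, query=None):
    """Tag ids to offer for narrowing; with a query interval, keep only tags valid in it."""
    tag_ids = set()
    for doc in matched_docs:
        for tag in doc["tags"]:
            intervals = tag.get("tag_intervals", [])
            if query is None or not intervals or any(
                overlaps(query[0], query[1], s, e) for s, e in intervals
            ):
                tag_ids.add(tag["tag_id"])
    return tag_ids


def available_days(matched_docs):
    """Every calendar day covered by some DOCUMENT_INTERVAL of the matched documents."""
    days = set()
    for doc in matched_docs:
        for start, end in doc["document_intervals"]:
            for offset in range((end - start).days + 1):
                days.add(start + timedelta(days=offset))
    return days


# Hypothetical matched result set mirroring the document shown under [STORAGE].
MATCHED = [
    {
        "document_intervals": [(date(2013, 7, 31), date(2013, 8, 15))],
        "tags": [
            {
                "tag_id": 123,
                "tag_intervals": [
                    (date(2013, 1, 31), date(2013, 7, 15)),
                    (date(2013, 6, 25), date(2013, 11, 30)),
                ],
            },
            {"tag_id": 146, "tag_intervals": []},
        ],
    }
]
```

The set returned by `available_days` maps directly onto the calendar: a date is clickable only if it is in the set.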

[TAG AUTOCOMPLETE]

There are about 2 million tags in the database. We also need a solution for the bilingual autocomplete search bar.

- The autocomplete only takes tags into consideration (all suggestions are tags). If possible, only tags that actually match documents should appear.
- Each tag may have aliases. "Alexandre Dumas", "Al. Dumas", "Dumas" and "Alexander Dumas" would all be added as aliases, so that no matter how the user types the name, the autocomplete shows "Alexandre Dumas".
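
One possible way to get the alias behaviour described above is a sorted list of (alias, tag id) pairs searched by prefix with binary search. The sketch below shows the idea in Python. The `TagAutocomplete` class and its input shape are hypothetical, case-insensitive prefix matching is an assumption, and names in the second language are assumed to be stored as additional aliases. Limiting suggestions to tags that actually match documents could be done by only loading those tags.

```python
import bisect


class TagAutocomplete:
    """Prefix lookup over canonical names and aliases, returning canonical names."""

    def __init__(self, tags):
        # tags: {tag_id: (canonical_name, [aliases])}  (hypothetical input shape)
        self.names = {tag_id: name for tag_id, (name, _) in tags.items()}
        entries = []
        for tag_id, (name, aliases) in tags.items():
            for text in [name, *aliases]:
                entries.append((text.casefold(), tag_id))
        self.entries = sorted(entries)

    def suggest(self, prefix, limit=10):
        key = prefix.casefold()
        # (key,) sorts before any (key..., tag_id) pair, so this finds the first candidate.
        index = bisect.bisect_left(self.entries, (key,))
        seen, results = set(), []
        while index < len(self.entries) and len(results) < limit:
            text, tag_id = self.entries[index]
            if not text.startswith(key):
                break  # sorted order: no later entry can share the prefix
            if tag_id not in seen:  # several aliases may point to the same tag
                seen.add(tag_id)
                results.append(self.names[tag_id])
            index += 1
        return results


# Hypothetical sample tags; the aliases are the ones listed above.
TAGS = {
    1: ("Alexandre Dumas", ["Al. Dumas", "Dumas", "Alexander Dumas"]),
    2: ("Albert Camus", []),
}
autocomplete = TagAutocomplete(TAGS)
```

Each lookup costs one binary search plus a scan over the entries sharing the prefix, which stays cheap even with a few million aliases held in memory. A search engine's built-in suggester or a trie would be alternatives worth comparing.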

[YOUR JOB]:
- recommend (and possibly set up) the appropriate database engine(s)

- explain the data structures and algorithm(s) we have to implement so that search works as expected (we're fast learners and good students)
OR
- explain and implement this yourself (and also help integrate your work into our existing system)

- make sure the proposed solution is scalable and fast

[NOT YOUR JOB]:
- user interface
- admin control panels for adding/updating documents and assigning tags
- visual design of any kind
- matching documents with tags
[ The above are NOT your job, just to be clear. They will be done by other awesome people like yourself, but with a different skill set. We need you focused on the hard-core database stuff. ]

As you can see, this can be either
- a consulting job (if you're a very experienced search/database guru who doesn't want to actually write code, just share your awesomeness), or
- a database design/coding job (if you love PHP, want to be directly involved in the coding, and are a fast, efficient coder)
