How do we do something like this in Django? There are some existing ideas out there, for example Merquery. That post also has some links to other ideas for Python-based text searching. Merquery is pretty much just an idea at the moment, and the examples are SQLObject-oriented. What might a Django indexed search system look like? Here are some ideas:
- Every searchable model class has a method to index or re-index instances of it in the index database
- Index updates happen automatically on save() or other updates of the instance
- If a Django object is deleted, remove its entry from the index
- If you change the object directly with SQL, it's your job to call the re-index method
- There must be a way to re-build the index from scratch. This would iterate over all indexable objects.
- Comes with methods for indexing standard fields (minimally CharField and TextField) but extensible to other fields - imagine indexing PDF files...
- Ability to search over multiple classes - for example blog texts and blog comments.
- Ability to restrict search to some or all fields.
- Ability to use fields in queries - so you could search for anyone with firstName Fred and lastName not Smith.
- Ability to specify an ordering - by relevance, or some date field.
- Get from the search result back to the Django object. HyperEstraier stores a user-defined URI with each document it indexes, and this would have to map to Django objects. Best I can think of at the moment is a URI which is a string concatenation of model and id.
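
To make the bookkeeping in that list a bit more concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: `ToyIndex` is just a dict standing in for the real index database (it is not the HyperEstraier API), `Record` is a stand-in for a model instance rather than a Django class, and `make_uri`, `parse_uri`, `lookup` and `rebuild` are hypothetical names. It only shows the model-plus-id URI, indexing on save, removal on delete, and a rebuild from scratch.

```python
def make_uri(model_name, pk):
    """Build the document URI by concatenating model name and id."""
    return "%s:%s" % (model_name, pk)


def parse_uri(uri):
    """Split a URI back into (model_name, pk)."""
    model_name, pk = uri.rsplit(":", 1)
    return model_name, int(pk)


class ToyIndex:
    """In-memory stand-in for the index database (hypothetical)."""

    def __init__(self):
        self.docs = {}  # uri -> indexed text

    def put(self, uri, text):
        self.docs[uri] = text  # putting an existing URI re-indexes it

    def remove(self, uri):
        self.docs.pop(uri, None)

    def search(self, word):
        """Return the URIs of documents containing the word."""
        word = word.lower()
        return sorted(uri for uri, text in self.docs.items()
                      if word in text.lower().split())


class Record:
    """Stand-in for a model instance; not a Django class."""

    store = {}          # (model_name, pk) -> instance, plays the database
    index = ToyIndex()  # one index shared by all classes

    def __init__(self, pk, text):
        self.pk = pk
        self.text = text

    @property
    def uri(self):
        return make_uri(type(self).__name__, self.pk)

    def save(self):
        Record.store[(type(self).__name__, self.pk)] = self
        Record.index.put(self.uri, self.text)  # index update on save

    def delete(self):
        Record.store.pop((type(self).__name__, self.pk), None)
        Record.index.remove(self.uri)  # drop the index entry too


class Blog(Record):
    pass


class Comment(Record):
    pass


def lookup(uri):
    """Map a search-result URI back to the stored object."""
    return Record.store[parse_uri(uri)]


def rebuild():
    """Throw the index away and re-index every stored object."""
    Record.index.docs.clear()
    for obj in Record.store.values():
        Record.index.put(obj.uri, obj.text)
```

Because both classes share one index, a single search covers blog texts and comments alike, and `lookup` turns each hit back into an object. A change that bypasses `save()` stays invisible to the index until `rebuild()` (or a per-object re-index) is called, which is the "it's your job" case from the list.
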

    class Blog(IndexedModel):
        name = CharField()
        body = TextField()

By default this will index all fields that it knows how to index. This can be controlled with an inner class, in the same way as the Admin inner class:

    class Blog(IndexedModel):
        class Indexer:
            fields = ("name", "body",)
        name = CharField()
        body = TextField()
        secret = TextField()
        moo = CowField()

This would index the name and body fields, but not the secret field, nor the moo field, since it doesn't know how to deal with CowField types.
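
One way the field selection could work is sketched below, again in plain Python with stand-ins. `Field`, `CharField`, `TextField` and `CowField` here are empty placeholder classes, not Django's, and `FIELD_INDEXERS` and `indexable_fields` are names made up for this example. The idea is a registry of field types the indexer knows about, optionally narrowed by the `fields` tuple on the inner `Indexer` class.

```python
class Field:
    """Placeholder base class for model fields (not Django's)."""


class CharField(Field):
    pass


class TextField(Field):
    pass


class CowField(Field):
    pass  # a field type with no registered indexer


# Registry: field type -> function turning a value into indexable text.
# Supporting a new type (PDF files, say) would mean adding an entry here.
FIELD_INDEXERS = {CharField: str, TextField: str}


def indexable_fields(model):
    """Return the names of the fields on `model` that would be indexed."""
    # Fields in declaration order, keeping only types found in the registry.
    known = [name for name, value in vars(model).items()
             if isinstance(value, Field) and type(value) in FIELD_INDEXERS]
    indexer = vars(model).get("Indexer")
    if indexer is None:
        return known  # default: every field we know how to index
    wanted = getattr(indexer, "fields", known)
    return [name for name in known if name in wanted]


class Blog:
    class Indexer:
        fields = ("name", "body",)
    name = CharField()
    body = TextField()
    secret = TextField()
    moo = CowField()


class Plain:
    # No Indexer inner class, so the default applies.
    name = CharField()
    secret = TextField()
    moo = CowField()
```

Here `indexable_fields(Blog)` gives `["name", "body"]`, while `indexable_fields(Plain)` gives `["name", "secret"]`: the `CowField` is skipped in both cases because nothing is registered for it, even if it were listed in `fields`.
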
I haven't yet thought how to organise the search end of things... Comments on all of this welcome!
B