2001-09-17  <NotZed@Ximian.com>

	* disktail.c (tail_space): Fix the tail space calculation, it
	didn't always take into account the space used by a new tail node
	(I think).
	(tail_info): Add a failback assertion that end >= start. Fix all
	callers (needed to add blocks argument).

2001-08-16  Not Zed  <NotZed@Ximian.com>

	* dumpindex.c (main): Force open of internal data before using it.

	* ibex_block.c (ibex_use): Use a next pointer so we keep the list
	intact as we scan it.

2001-08-15  Not Zed  <NotZed@Ximian.com>

	* ibex_block.c (ibex_use): New function to limit use of fd's. Mark
	an ibex file in use, re-open if necessary, and close off another
	(lru) if necessary.
	(ibex_unuse): Mark an ibex as not in use.
	(ibex_open): Delay opening of the actual block file till later,
	and add the ibex to a global list.
	(ibex_save): Use/unuse around operations.
	(close_backend): Zero out blocks when closed.
	(ibex_close): Remove the ibex from the global list before closing
	it down.
	(ibex_index_buffer, ibex_find, ibex_unindex, ibex_find_name):
	Use/unuse around ops.

2001-08-10  Not Zed  <NotZed@Ximian.com>

	* wordindexmem.c (sync_cache_entry): NOOP if writing to a failed
	file.
	(word_index_pre): NOOP if failed file.
	(ibex_create_word_index_mem): Setup blocks value.

	** Added internal exception handling to libibex, in the case of
	errors with on-disk data, exceptions are returned.

	* block.c (ibex_block_cache_open): Detect fatal errors below us
	and clean up appropriately.
	(ibex_block_cache_fail): New function to handle the failure, and
	keep track of it.
	(ibex_block_cache_sync): Don't do anything if we've failed on this
	file.

	* disktail.c (tail_compress): Add blocks param so we can assert
	for exceptions.

	* hash.c, block.c, disktail.c: g_assert->ibex_block_cache_assert
	where dealing with external data.

	* hash.c (hash_info): Add index param so we can assert for
	exceptions.

	* ibex_block.c (ibex_index_buffer): Setjmp before calling into
	internal routines.
	(ibex_save): "
	(ibex_unindex): "
	(ibex_find): "
	(ibex_find_name): "
	(ibex_contains_name): "
	(ibex_reset): Function to reset the index file if we have an
	error, call when we have an error.

	* block.h (ibex_block_cache_assert): Create assertion/exception
	macros, and include a setjmp buffer for returning it.

2001-08-09  Not Zed  <NotZed@Ximian.com>

	* Makefile.am (libibex_la_SOURCES): Remove wordindex.c,
	wordindexmem is what's used.

2001-06-01  Peter Williams  <peterw@ximian.com>

	* Makefile.am (dumpindex_LDADD): Add GAL_LIBS here too.
	(testindex_LDADD): And here.

2001-04-25  Dan Winship  <danw@ximian.com>

	* Makefile.am (libibex_la_LIBADD): Add GAL_LIBS for gunicode stuff
	(until glib 2.0).
	(INCLUDES): Use EXTRA_GNOME_CFLAGS.
	(dumpindex_LDADD, testindex_LDADD): Fix.
	Remove references to mkindex and lookup.

	* ibex_block.c (ibex_normalise_word, utf8_category): Convert to
	gunicode interfaces.

	* ibex_db.c, lookup.c, mkindex.c: Unused, remove.

2001-03-26  Kjartan Maraas  <kmaraas@gnome.org>

	* disktail.c: Header shuffling. Move glibc headers before gnome
	stuff.

	* testindex.c: Same here.

	* wordindexmem.c: Added <string.h> and <stdlib.h> to quench
	warnings from newer gcc.

2000-12-24  Not Zed  <NotZed@HelixCode.com>

	* Merge from camel-mt-branch.

2000-12-18  Not Zed  <NotZed@HelixCode.com>

	* dumpindex.c (main): Same here.

	* testindex.c (main): Add a g_thread_init(). Sigh, glib's thread
	stuff is snot.
	(read_words): Setup another flat-out thread to test
	multithreadedness a little bit.

	* ibex_block.c (ibex_index_buffer): Add locking around internal
	calls.
	(ibex_open): Init the locking mutex.
	(ibex_close): Free the locking mutex.
	(ibex_unindex):
	(ibex_find):
	(ibex_find_name):
	(ibex_contains_name): Add locking around internal calls.

	* ibex_internal.h (struct ibex): Add a lock. Include config.h.

2000-12-13  Christopher James Lahey  <clahey@helixcode.com>

	* disktail.c (tail_compress):
	(tail_get): Added some casts to get rid of warnings.
	(tail_dump): #if 0ed this out to get rid of a warning.
	(ibex_diskarray_dump): Added a prototype.

	* ibex_block.c (ibex_index_buffer): Assigned cat the value 0 to
	start off with to avoid a warning.

2000-12-12  Christopher James Lahey  <clahey@helixcode.com>

	* wordindex.c (cache_sanity): Made cache_sanity only be included
	if d(x) is defined as x.

	* wordindexmem.c: Made node_sanity and cache_sanity only be
	included if d(x) is defined as x or if MALLOC_CHECK is defined.
	Made sync_value only be included if d(x) is defined as x.

2000-11-28  Not Zed  <NotZed@HelixCode.com>

	* index.h: Turn off index stats by default.

	* ibex_block.c (ibex_save): And here.
	(ibex_close): Debug out printfs.

	* wordindexmem.c (ibex_create_word_index_mem): And here.
	(num): Made buf static.

	* block.c (ibex_block_cache_open): Debug out some printfs.
	(ibex_block_read): And here.

2000-11-17  Not Zed  <NotZed@HelixCode.com>

	* wordindexmem.c (add_list): If we have the namecache active, and
	there is no name there, we add it directly and don't look it up
	first.

	* testindex.c: Some performance testing & stat gathering stuff.

2000-11-16  Not Zed  <NotZed@HelixCode.com>

	* wordindexmem.c (ibex_create_word_index_mem): Initialise nameinit
	& namecache.
	(contains_name): On first call, load all names into memory. We
	usually do a whole lot of lookups in a row, and this saves a lot
	of penalties on a big list, for not too much of a memory hit.
	(find_name): If we have the namelist in memory do a quick
	short-circuit check to see if we have to do further processing.
	(unindex_name): Cross check the namecache, if it is active.
	Remove it there too/or exit (no work to do).
	(word_flush): If we have the namecache active, destroy it now, as
	it is not needed anymore (for now).

2000-10-30  Kjartan Maraas  <kmaraas@gnome.org>

	* hash.c: #include <stdlib.h> to remove warning.

	* wordindex.c: #include <stdlib.h> and <string.h>.

2000-10-26  Not Zed  <NotZed@HelixCode.com>

	* block.c (ibex_block_cache_open): Use IBEX_VERSION rather than
	hardcoded version string.
	* ibex_internal.h (IBEX_VERSION): Bumped version again. This time
	I did change the index format.
	(IBEX_VERSION): Moved into block.h.

	* hash.c (struct _hashroot): Add a linked list of keys to the
	table.
	(struct _hashblock): Added a next pointer as a block number.
	(hash_insert): Link new key blocks into the key block list.
	(struct _HASHCursor): Renamed block to key and added a block item.
	(hash_cursor_next): Changed to go through the linked list of all
	hash items rather than through each hash chain separately.
	>> faster.
	(ibex_hash_dump_rec): Remove a warning.

2000-10-25  <jpr@helixcode.com>

	* ibex_block.c: No longer include <db.h>.

2000-10-25  Not Zed  <NotZed@HelixCode.com>

	* ibex_internal.h (IBEX_VERSION): Bumped to another version. The
	file format hasn't changed, but earlier bugs may create invalid
	files.

	* block.c (ibex_block_read): Use the root data directly.
	(ibex_block_cache_open): As well.
	(ibex_block_get): And here too.
	(ibex_block_cache_sync): Sync the root block directly here.

	* block.h: Pad root block out to 1024 bytes. Added root block to
	struct _memcache.

	* disktail.c (tail_get): Dirty the root block.
	(tail_get): Fix for changes to root access.
	(disk_remove): And here too.

	* wordindexmem.c (sync_cache_entry): Handle the case of not having
	any files in the list, which can happen now.
	(word_index_pre): Make sure we set the wordid on the new cache
	entry.

	* ibex_block.c (ibex_save): Sigh. Pass the right argument to
	index_post.

2000-10-24  JP Rosevear  <jpr@helixcode.com>

	* .cvsignore: Shush

2000-10-24  Not Zed  <NotZed@HelixCode.com>

	* block.c (ibex_block_cache_open): Create a word_index_mem for
	indexing the words, rather than a word_index.

	* ibex_block.c (ibex_index_buffer): If we haven't called index_pre
	yet, do it before indexing anything.
	(ibex_save): If we have called index_pre previously, call
	index_post.
	(ibex_close): And same for here.

	* index.h: Added a cursor class, and cursor retrieval function for
	iterating through an index's keys.
	* wordindexmem.c (ibex_create_word_index_mem): New word class,
	similar to wordindex, but meant to be faster for updates.
	(word_index_pre): Implement. We load all keys into memory.
	(word_index_post): Implement. We sync and free all keys.
	(find): Remove lru code, it's no longer a cache, but a lookup
	table.
	(add_index_cache): Remove lru code here too.
	(find_name): And here.
	(word_flush): Flush the hashtable direct.
	(word_close): Call flush to flush, rather than doing it ourselves.
	(add_index_cache): If we are in an index state, we can assume a
	cache miss == a new word.
	(word_index_post): Maintain whether or not we are in an index
	state, and the depth of the state.
	(word_index_pre): Likewise. Don't reread the index if we have
	already.
	(cache_sanity): Fixed for struct changes.

	* wordindex.h (IBEXWordClass): Added functions to prepare/cleanup
	for lots of indexing. i.e. can be used to optimise indexing speed
	at the cost of extra memory usage during the indexing process.

	* dumpindex.c: Dumps the contents of indexes.

	* hash.c (ibex_hash_dump_rec): Also print the word count.
	(hash_cursor_create): Create a new cursor for iterating through a
	hashtable.
	(hash_cursor_close): 'close' the cursor. It is up to the
	application to close any cursors it creates.
	(hash_cursor_next): Goto the next key id.
	(hash_cursor_next_key): Goto the next key, return the key.
	(hash_get_cursor): Return a cursor object.

	* wordindex.c (unindex_name): Cross-check the cache as well.
	(word_index_post):
	(word_index_pre): Added (empty) callbacks for pre/post functions.

2000-10-12  Not Zed  <NotZed@HelixCode.com>

	* ibex_internal.h (struct ibex): Bumped ibex rev.

	* block.c (ibex_block_cache_open): Bumped the ibex file revision
	because of the hash table size change.

	* index.h: Added some stat stuff.

	* wordindex.c (struct _wordcache): Changed files[] to be a pointer
	to an allocated block/or an individual item.
	(find): Fix for changes to struct.
	(find_name): "
	(sync_cache_entry): "
	(add): "
	(add_list): "
	(add_index_cache): Free the cache file array if it was created.
	(word_flush): And here.
	(word_close): And here too.
	(ibex_create_word_index): Double the size of the hashtables.
	(word_flush): Make sure we reset the wordcount to 0 if we remove
	the list items. DOH.
	(add_index_cache): Use a slightly more sophisticated aging
	algorithm to remove expired nodes.

2000-10-10  Not Zed  <NotZed@HelixCode.com>

	* hash.c (hash_find):
	(hash_remove):
	(hash_insert): Truncate key to MAX_KEYLEN bytes if it is too big
	to fit in a single block.

2000-09-28  Not Zed  <NotZed@HelixCode.com>

	* block.c (ibex_block_free): Make sure we map the 'free' block to
	a block number when unlinking a block (fixes a lot of assertion
	failures).
	(ibex_block_cache_open): Initialise sync flag on root block. If
	it is not set on open then the index could be in an invalid
	state, and should be rescanned.
	(ibex_block_cache_sync): Sync root block last, and set the sync
	flag.
	(ibex_block_cache_open): Mirror root block flags in block_cache
	struct.
	(ibex_block_cache_sync): Likewise.
	(ibex_block_read): If we write a dirty block, then we clear the
	sync flag if it's still set; we are no longer synced.

2000-09-19  Not Zed  <NotZed@HelixCode.com>

	** Merged from IBEX_DISK branch to head.

	* file.c:
	* find.c:
	* words.c:
	* index.c: Removed unused files.

	* block.h: Changed block to use only 24 bits for next and 8 for
	used, and fixed all relevant code. Some cleanup.

	* disktail.c (tail_get): If we use an empty tail node, then make
	sure we make it dirty.

2000-09-15  Not Zed  <NotZed@HelixCode.com>

	* wordindex.c (word_close): Free hashtable on exit too.

	* disktail.c: Implemented tail-node storage for the end of long
	lists, or for short lists. Should save significant disk space
	(5x?). Implemented special case for 1-item lists, where the
	tailnode pointer is used to store the index entry.

2000-09-14  Not Zed  <NotZed@HelixCode.com>

	* wordindex.c (add_index_key): Keys also handle tails.
	* hash.c (hash_set_data_block): Added new parameter to keys - a
	tail block (a full 32 bit block pointer).
	(hash_get_data_block): And same here.

2000-09-12  Not Zed  <NotZed@HelixCode.com>

	* wordindex.c (word_close): Don't close namestore twice.

2000-09-11  Not Zed  <NotZed@HelixCode.com>

	** Redid almost everything, on-disk hash table to store an index
	to index records, more on the way to modularisation (more to go),
	now stores reverse indexes for deleting.

2000-08-31  Not Zed  <NotZed@HelixCode.com>

	* block.c (add_key_mem): Initialise a memory based array for newly
	added index entries.
	(add_record): Changed to cache updates in memory until we hit a
	limit, and then flush them to disk.
	(get_record): Merge in-memory records with disk records.
	(remove_record): Remove from memory first, and if that fails, goto
	disk.
	(find_record): Check memory first, then disk if that fails.
	(add_datum_list): oops, copy size * sizeof(blockid_t)
	(add_indexed): Make sure we link in the head node when we create a
	new one.

2000-08-09  Christopher James Lahey  <clahey@helixcode.com>

	* file.c, find.c: Fixed some warnings.

2000-05-11  NotZed  <NotZed@HelixCode.com>

	* index.c (ibex_unindex): Make sure we mark the ibex as dirty.

2000-05-07  NotZed  <NotZed@HelixCode.com>

	* file.c (ibex_save): New function, only write out the ibex if it
	has changed.

2000-05-07  <notzed@helixcode.com>

	* file.c (ibex_open): Also close the fd after we're done.

	* find.c (ibex_contains_name): New function to find out if a file
	is indexed.

2000-05-02  Matt Loper  <matt@helixcode.com>

	* Makefile.am: set G_LOG_DOMAIN.

2000-04-12  NotZed  <NotZed@HelixCode.com>

	* find.c (ibex_dump_all): Debug function to dump the whole index
	to stdout.

	* words.c (get_ibex_file): Use g_strdup(), not strdup().

2000-04-11  NotZed  <NotZed@HelixCode.com>

	* file.c (write_word): Always write out all words we have (even if
	it's 0 ... the file expects it). No longer check for removed
	files.
	(store_word): Check for removed files here, and only add to the
	ordered tree if we have references left to this word.
	(ibex_write): First insert into the tree, to determine the
	wordcount to be saved in the output file, and then write that.
	(ibex_open): Remove some debug.

	* words.c (ibex_index_buffer): Always set 'unread', if it is a
	valid pointer (don't rely on caller to initialise it).

2000-03-26  NotZed  <NotZed@HelixCode.com>

	* lookup.c (main): Fixed call to ibex_open.

	* mkindex.c (main): Fixed call to ibex_open.

	* file.c (ibex_open): Changed to accept flags and mode equivalent
	to open(2).

2000-02-25  Dan Winship  <danw@helixcode.com>

	* *.c: add gtk-doc-style comments

2000-02-21  Matt Loper  <matt@helixcode.com>

	* .cvsignore: Added mkindex.

2000-02-21  NotZed  <NotZed@HelixCode.com>

	* Makefile.am: change noinst_LIBRARIES to noinst_LTLIBRARIES, and
	supply -static to LDFLAGS. Duh, and changed LDADD back to
	libibex.la.

2000-02-20  Matt Loper  <matt@helixcode.com>

	* Makefile.am: changed mkindex_LDADD to libibex.a instead of
	libibex.la.

2000-02-19  Matt Loper  <matt@helixcode.com>

	* .cvsignore: added lookup.

2000-02-18  Miguel de Icaza  <miguel@nuclecu.unam.mx>

	* Makefile.am (lookup_LDADD): For now, make a libibex.a library so
	we can link it with the camel provider. I hate libtool

2000-02-16  Dan Winship  <danw@helixcode.com>

	* Makefile.am: automakify

2000-02-16  NotZed  <NotZed@HelixCode.com>

	* find.[ch] (ibex_find_name): Finds if a word is indexed under a
	given name.

2000-02-14  NotZed  <notzed@zedzone.helixcode.com>

	* Makefile: Hack together a build using libtool. This should all
	be auto*'d at some point I guess.

2000-02-13  NotZed  <notzed@zedzone.helixcode.com>

	* Added ChangeLog file.