From fce26238c489db149f63164f13adf4bdb83d6062 Mon Sep 17 00:00:00 2001 From: Miguel de Icaza Date: Thu, 15 Jul 1999 00:11:56 +0000 Subject: Implemented base64 encoder based on CamelStreams. Should the 1999-07-13 Miguel de Icaza * camel/gmime-base64.c (gmime_encode_base64): Implemented base64 encoder based on CamelStreams. Should the encoder/decoder be a Stream itself? * camel/gmime-utils.c: include config.h here. * camel/url-util.c: ditto. * camel/gstring-util.c: ditto. * camel/gmime-content-field.c: ditto. * camel/camel-stream.c: ditto. * camel/camel-stream-fs.c: ditto. * camel/camel-store.c: ditto. * camel/camel-simple-data-wrapper.c: ditto. * camel/camel-session.c: ditto. * camel/camel-service.c: ditto. * camel/camel-mime-part.c: ditto. * camel/camel-mime-message.c: ditto. * camel/camel-log.c: ditto. * camel/camel-data-wrapper.c: ditto * camel/camel-folder.c: ditto. * camel/camel-stream.c (camel_stream_write): Moved api documentation to the places that they document. (camel_stream_class_init): Virtual classes do not need to have a default implementation. So null them all. (camel_stream_write): Return value from write. (camel_stream_available): implement. (camel_stream_write_strings): documented. * devel-docs/query/virtual-folder-in-depth.sgml: Small reformatting 1999-06-28 bertrand * tests/test2.c (main): now use CamelDataWrapper::contruct_form_stream to test svn path=/trunk/; revision=1024 --- devel-docs/query/virtual-folder-in-depth.sgml | 772 +++++++++++++------------- 1 file changed, 392 insertions(+), 380 deletions(-) (limited to 'devel-docs') diff --git a/devel-docs/query/virtual-folder-in-depth.sgml b/devel-docs/query/virtual-folder-in-depth.sgml index fc85132673..e6ba7ecf18 100644 --- a/devel-docs/query/virtual-folder-in-depth.sgml +++ b/devel-docs/query/virtual-folder-in-depth.sgml @@ -3,393 +3,405 @@
- - - - Giao - Nguyen - - - An in-depth look at the virtual folder mechanism - - - This document describes a different way of approaching mail - organization and how all things are possible in this brave new - world. This document does not describe physical storage issues nor - interface issues. - - - Historically mail has been organized into folders. These folders - usually mapped to a single storage medium. The relationship between - mail organization and storage medium was one to one. There was one - mail organization for every storage medium. This scheme had its - limitations. - - - Efforts at categorizations are only meaningful at the instance that - one categorized. To find any piece of data, regardless of how well - it was categorized, required some amount of searching. Therefore, any - attempts to nullify searching is doomed to fail. It's time to embrace - searching as a way of life. - - - These are the terms and their definitions. The example rules used are - based on the syntax for VM (http://www.wonderworks.com/vm/) by Kyle - Jones whose ideas form the basis for this. I'm only adding the - existence of summary files to aid in scaling. I currently use VM and - it's virtual-folder rules for my daily mail purposes. To date, my only - complaints are speed (it has no caches) and for the unitiated, it's - not very user-friendly. - - - Comments, questions, rants, etc. should be directed at Giao Nguyen - (grail@cafebabe.org) who will try to address issues in a timely - manner. - - - - - Definitions - - Store - - A location where mail can be found. This may be a file (Berkeley - mbox), directory (MH), IMAP server, POP3 server, Exchange server, - Lotus Notes server, a stack of Post-Its by your monitor fed through - some OCR system. - - - - Message - - An individual mail message. - - - - Vfolder - - A group of messages sharing some commonality. This is the result of a - query. The vfolder maybe contained in a store, but it is not necessary - that a store holds only one vfolder. There is always an implicit - vfolder rule which matches all messages. A store contains the vfolder - which is the result of the query (any). It's short for virtual folder - or maybe view folder. I dunno. - - - - Default-vfolder - - The vfolder defined by (any) applied to the store. This is not the - inbox. The inbox could easily be defined by a query. A default rule - for the inbox could be (new) but it doesn't have to be. Mine happens - to be (or (unread) (new)). - - - - Folder - - The classical mail folder approach: one message organization per - store. - - - - Query - - A search for messages. The result of this is a vfolder. There are two - kinds of queries: named queries and lambda queries. More on this - later. - - - - Summary file - - An external file that contains pointers to messages which are matches - for a named query. In addition to pointers, the summary file should - also contain signatures of the store for sanity checks. When the term - "index" is used as a verb, it means to build a summary file for a - given name-value pair. - - - + + + + Giao + Nguyen + + - - Queries - - Named queries are analogous to classical mail folders. Because named - queries maybe reused, summary files are kept as caches to reduce - the overall cost of viewing a vfolder. Summary files are superior to - folders in that they allow for the same messages to appear in multiple - vfolders without message duplications. Duplications of messages - defeats attempts at tagging a message with additional user information - like annotations. Named queries will define folders. - - - Lambda queries are similar to named queries except that they have no - name. These are created on the fly by the user to filter out or - include certain messages. - - - All queries can be layered on top of each other. A lambda query can be - layered on a named query and a named query can be layered on a lambda - query. The possibilities are endless. - - - The layerings can be done as boolean operations (and, or, not). Short - circuiting should be used. - - - Examples: - - (and (author "Giao") - (unread)) - - The (unread) query should only be evaluated on the results of (author - "Giao"). - - (or (author "Giao") - (unread)) - - Both of these queries should be evaluated. Any matches are added to the - resulting vfolder. - - - - - Summary files - - Summary files are only meaningful when applied to the context of the - default-vfolder of a store. - - - Summary files should be generated for queries of the form: - - (function "constant value") - - Summary files should never be generated for queries of the form: - - (function (function1)) - - (and (function "value") - (another-function "another value")) - - Given a query of the form: - - (and (function "value") - (another-function "another value")) - - The system should use one summary file for (function "value") and - another summary file for (another-function "another value"). I will - call the prior form the "plain form". - - - It should be noted that the signature of the store should be based on - the assumption that new data may have been added to the store since - the application generated the summary file. Signatures generated on - the entirety of the store will most likely be meaningless for things - like POP/IMAP servers. - - + An in-depth look at the virtual folder mechanism + + + This document describes a different way of approaching mail + organization and how all things are possible in this brave new + world. This document does not describe physical storage issues + nor interface issues. + + + Historically mail has been organized into folders. These + folders usually mapped to a single storage medium. The + relationship between mail organization and storage medium was + one to one. There was one mail organization for every storage + medium. This scheme had its limitations. + + + Efforts at categorizations are only meaningful at the instance that + one categorized. To find any piece of data, regardless of how well + it was categorized, required some amount of searching. Therefore, any + attempts to nullify searching is doomed to fail. It's time to embrace + searching as a way of life. + + + These are the terms and their definitions. The example rules used are + based on the syntax for VM (http://www.wonderworks.com/vm/) by Kyle + Jones whose ideas form the basis for this. I'm only adding the + existence of summary files to aid in scaling. I currently use VM and + it's virtual-folder rules for my daily mail purposes. To date, my only + complaints are speed (it has no caches) and for the unitiated, it's + not very user-friendly. + + + Comments, questions, rants, etc. should be directed at Giao Nguyen + (grail@cafebabe.org) who will try to address issues in a timely + manner. + + + - - Incremental indexing - - When new messages are detected, all known queries should be evaluated - on the new messages. vfolders should be notified of new messages that - are positive matches for their queries. The indexes generated by this - process should be merged into the current indexes for the vfolder. - - + + + Definitions + + Store + + A location where mail can be found. This may be a file (Berkeley + mbox), directory (MH), IMAP server, POP3 server, Exchange server, + Lotus Notes server, a stack of Post-Its by your monitor fed through + some OCR system. + + - - Can I have multiple stores? - - I don't see why not. Again, the inbox is a vfolder so you can get a - unified inbox consisting of all new mail sent to all your stores or - your can get inboxes for each store or any combination your heart - desire. You get your cake, eat it, and someone else cleans the dishes! - - + + Message + + An individual mail message. + + + + Vfolder + + A group of messages sharing some commonality. This is the result of a + query. The vfolder maybe contained in a store, but it is not necessary + that a store holds only one vfolder. There is always an implicit + vfolder rule which matches all messages. A store contains the vfolder + which is the result of the query (any). It's short for virtual folder + or maybe view folder. I dunno. + + + + Default-vfolder + + The vfolder defined by (any) applied to the store. This is not the + inbox. The inbox could easily be defined by a query. A default rule + for the inbox could be (new) but it doesn't have to be. Mine happens + to be (or (unread) (new)). + + + + Folder + + The classical mail folder approach: one message organization per + store. + + + + Query + + A search for messages. The result of this is a vfolder. There are two + kinds of queries: named queries and lambda queries. More on this + later. + + + + Summary file + + An external file that contains pointers to messages which are matches + for a named query. In addition to pointers, the summary file should + also contain signatures of the store for sanity checks. When the term + "index" is used as a verb, it means to build a summary file for a + given name-value pair. + + + - - Why all this? - - Consider the dynamic nature of the following query: - - (and (author "Giao") - (sent-after (today-midnight))) - - today-midnight would be a function that is evaluated at run-time to - calculate the appropriate object. - - + + + Queries + + Named queries are analogous to classical mail folders. Because named + queries maybe reused, summary files are kept as caches to reduce + the overall cost of viewing a vfolder. Summary files are superior to + folders in that they allow for the same messages to appear in multiple + vfolders without message duplications. Duplications of messages + defeats attempts at tagging a message with additional user information + like annotations. Named queries will define folders. + + + Lambda queries are similar to named queries except that they have no + name. These are created on the fly by the user to filter out or + include certain messages. + + + All queries can be layered on top of each other. A lambda query can be + layered on a named query and a named query can be layered on a lambda + query. The possibilities are endless. + + + The layerings can be done as boolean operations (and, or, not). Short + circuiting should be used. + + + Examples: + +(and (author "Giao") + (unread)) + + The (unread) query should only be evaluated on the results of (author + "Giao"). + +(or (author "Giao") + (unread)) + + Both of these queries should be evaluated. Any matches are added to the + resulting vfolder. + + - - Scenarios of usage and their solutions - - Mesage alterations - - This is a fuzzy area that should be left to the UI to handle. Messages - are altered. Read status are altered when a new message is read for - example. How do we handle this if our query is for unread messages? - Upon viewing the state would change. - - - One idea is to not evaluate the queries unless we're changing between - vfolder views. This assumes that one can only view a particular - vfolder at a time. For multi-vfolder viewing, a message change should - propagate through the vfolder system. Certain effects (as in our - example) would not be intuitive. - - - It would not be a clean solution to make special cases but they may be - necessary where certain defined fields are ignored when they are - changed. Some combination of the above rules can be used. I don't - think it's an easy solution. - - - - Message inclusion and exclusion - - Messages are included and excluded also with queries. The final query - will have the form of: - - (and (author "Giao") - (criteria value) - (not (criteria other-value))) - - Userland criterias may be a label of some sort. These may be userland - labels or Message-IDs. What are the performance issues involved in - this? With short circuiting, it's not a major problem. - - - The criterias and values are determined by the UI. The vfolder - mechanism isn't concerned with such issues. - - - Messages can be included and excluded at will. The idea is often - called "arbitrary inclusion/exclusion". This can be done by - Message-IDs or other fields. It's been noted that Message-IDs are not - unique. - - - I propose that any given vfolder is allocated an inclusion label and an - exclusion label. These should be randomly generated. This should be - part of the vfolder description. It should be noted that the vfolder - description has not been drafted yet. - - - The result is such that the rules for a given named query is: - - (and (user-query) - (label inclusion-label) - (not exclusion-label)) - - - - - Query scheduling - - Consider the following extremely dynamic queries: - - A: - (and (author "Giao") - (sent-after (today-midnight))) - - B: - (and (sent-after (today-midnight)) - (author "Giao")) - - C: - (or (author "Giao") - (sent-after (today-midnight))) - - Query A would be significantly faster because (author "Giao") is not - dynamic. A summary file could be generated for this query. Query B is - slow and can be optimized if there was a query compiler of some - sort. Query C demonstrates a query in which there is no good - optimization which can be applied. These come with a certain amount of - baggage. - - - It seems then that for boolean 'and' operations, plain forms should be - moved forward and other queries should be moved such that they are - evaluated later. I would expect that the majority of queries would be - of the plain form. - - - First is that the summary file is tied to the query and the store - where the query originates from. Second, a hashing function for - strings needs to be calculated for the query so that the query and the - summary file can be associated. This hashing function could be similar - to the hashing function described in Rob Pike's "The Practice of - Programming". (FIXME: Stick page number here) - - - - Archives - - Many people are concerned that archives won't be preserved, archives - aren't supported, and many other archive related issues. This is the - short version. - - - Archives are just that, archives. Archives are stores. Take your - vfolder, export it to a store. You are done. If you load up the store - again, then the default-vfolder of that store is the view of the - vfolder, except the query is different. - - - The point to vfolder is not to do away with classical folder - representation but to move the queries to the front where it would - make data management easier for people who don't think in terms of - files but in terms of queries because ordinary people don't think in - terms of files. - - - + + + Summary files + + Summary files are only meaningful when applied to the context of the + default-vfolder of a store. + + + Summary files should be generated for queries of the form: + +(function "constant value") + + Summary files should never be generated for queries of the form: + + (function (function1)) + + (and (function "value") + (another-function "another value")) + + Given a query of the form: + + (and (function "value") + (another-function "another value")) + + The system should use one summary file for (function "value") and + another summary file for (another-function "another value"). I will + call the prior form the "plain form". + + + It should be noted that the signature of the store should be based on + the assumption that new data may have been added to the store since + the application generated the summary file. Signatures generated on + the entirety of the store will most likely be meaningless for things + like POP/IMAP servers. + + - - Miscellany - - Annotations - - There should be a scheme to add annotations to messages. Common mail - user agents have used a tag in the message header to mark messages as - read/unread for example. Extending on this we have the ability to add - our own data to a message to add meaning to it. If we have a good - scheme for doing this, new possibilities are opened. - - - Keywords + + + Incremental indexing - When sending a message, a message could have certain keywords attached - to it. While this can be done with the subject line, the subject line - has a tendency to be munged by other mail applications. One popular - example is the "[rR]e:" prefix. Using the subject line also breaks the - "contract" with other mail user agents. Using keywords in another - field in the message header allows the sender to assist the recipient - in organizing data automatically. Note that the sender can only - provide hints as the sender is unlikely to know the organization - schemes of the recipient. + When new messages are detected, all known queries should be evaluated + on the new messages. vfolders should be notified of new messages that + are positive matches for their queries. The indexes generated by this + process should be merged into the current indexes for the vfolder. - - - - Scope - - Let us assume that we have multiple stores. Does a query work on a - given store? Or does it work on all stores? Or is it configurable such - that a query can work on a user-selected list of stores? - - - + - - Alternatives to the above - - Jim Meyer (purp@selequa.com) is putting some notes on where - annotations needs to be located. They'll be located here as well as - any contributions I may have to them. - - + + + Can I have multiple stores? + + I don't see why not. Again, the inbox is a vfolder so you can get a + unified inbox consisting of all new mail sent to all your stores or + your can get inboxes for each store or any combination your heart + desire. You get your cake, eat it, and someone else cleans the dishes! + + + + + + Why all this? + + Consider the dynamic nature of the following query: + +(and (author "Giao") + (sent-after (today-midnight))) + + today-midnight would be a function that is evaluated at run-time to + calculate the appropriate object. + + + + + + Scenarios of usage and their solutions + + Mesage alterations + + This is a fuzzy area that should be left to the UI to handle. Messages + are altered. Read status are altered when a new message is read for + example. How do we handle this if our query is for unread messages? + Upon viewing the state would change. + + + One idea is to not evaluate the queries unless we're changing between + vfolder views. This assumes that one can only view a particular + vfolder at a time. For multi-vfolder viewing, a message change should + propagate through the vfolder system. Certain effects (as in our + example) would not be intuitive. + + + It would not be a clean solution to make special cases but they may be + necessary where certain defined fields are ignored when they are + changed. Some combination of the above rules can be used. I don't + think it's an easy solution. + + + + Message inclusion and exclusion + + Messages are included and excluded also with queries. The final query + will have the form of: + + (and (author "Giao") + (criteria value) + (not (criteria other-value))) + + Userland criterias may be a label of some sort. These may be userland + labels or Message-IDs. What are the performance issues involved in + this? With short circuiting, it's not a major problem. + + + The criterias and values are determined by the UI. The vfolder + mechanism isn't concerned with such issues. + + + Messages can be included and excluded at will. The idea is often + called "arbitrary inclusion/exclusion". This can be done by + Message-IDs or other fields. It's been noted that Message-IDs are not + unique. + + + I propose that any given vfolder is allocated an inclusion label and an + exclusion label. These should be randomly generated. This should be + part of the vfolder description. It should be noted that the vfolder + description has not been drafted yet. + + + The result is such that the rules for a given named query is: + + (and (user-query) + (label inclusion-label) + (not exclusion-label)) + + + + + Query scheduling + + Consider the following extremely dynamic queries: + + A: + (and (author "Giao") + (sent-after (today-midnight))) + + B: + (and (sent-after (today-midnight)) + (author "Giao")) + + C: + (or (author "Giao") + (sent-after (today-midnight))) + + Query A would be significantly faster because (author "Giao") is not + dynamic. A summary file could be generated for this query. Query B is + slow and can be optimized if there was a query compiler of some + sort. Query C demonstrates a query in which there is no good + optimization which can be applied. These come with a certain amount of + baggage. + + + It seems then that for boolean 'and' operations, plain forms should be + moved forward and other queries should be moved such that they are + evaluated later. I would expect that the majority of queries would be + of the plain form. + + + First is that the summary file is tied to the query and the store + where the query originates from. Second, a hashing function for + strings needs to be calculated for the query so that the query and the + summary file can be associated. This hashing function could be similar + to the hashing function described in Rob Pike's "The Practice of + Programming". (FIXME: Stick page number here) + + + + Archives + + Many people are concerned that archives won't be preserved, archives + aren't supported, and many other archive related issues. This is the + short version. + + + Archives are just that, archives. Archives are stores. Take your + vfolder, export it to a store. You are done. If you load up the store + again, then the default-vfolder of that store is the view of the + vfolder, except the query is different. + + + The point to vfolder is not to do away with classical folder + representation but to move the queries to the front where it would + make data management easier for people who don't think in terms of + files but in terms of queries because ordinary people don't think in + terms of files. + + + + + + + Miscellany + + Annotations + + There should be a scheme to add annotations to messages. Common mail + user agents have used a tag in the message header to mark messages as + read/unread for example. Extending on this we have the ability to add + our own data to a message to add meaning to it. If we have a good + scheme for doing this, new possibilities are opened. + + + Keywords + + When sending a message, a message could have certain keywords attached + to it. While this can be done with the subject line, the subject line + has a tendency to be munged by other mail applications. One popular + example is the "[rR]e:" prefix. Using the subject line also breaks the + "contract" with other mail user agents. Using keywords in another + field in the message header allows the sender to assist the recipient + in organizing data automatically. Note that the sender can only + provide hints as the sender is unlikely to know the organization + schemes of the recipient. + + + + + Scope + + Let us assume that we have multiple stores. Does a query work on a + given store? Or does it work on all stores? Or is it configurable such + that a query can work on a user-selected list of stores? + + + + + + + Alternatives to the above + + Jim Meyer (purp@selequa.com) is putting some notes on where + annotations needs to be located. They'll be located here as well as + any contributions I may have to them. + +
-- cgit v1.2.3