At ayfie, we are all about innovation and creating pioneering legal tech products. Our new blog post series “ayfie’s power features” highlights our latest and most exciting product features. Last week, we introduced our new emoji extraction that is becoming more and more important for court cases. This week’s blog post is all about introducing you to another one of our brand new features that we are very proud of: a literal reinvention of email threading that is extremely fast and insightful.
Email accounts for more than 50 percent of documents in eDiscovery. Because the amount of digital evidence being created grows exponentially every year, even a small case usually involves thousands of email messages. Furthermore, emails are not only prevalent in electronic collections but also incredibly important because they preserve communication between parties along a time axis, providing insight into who knew what and when. It is the context of these conversations that makes them so necessary to eDiscovery and document review.
The problem with converting email conversations to reviewable documents lies in the need to maintain the original context. When the initial communication occurred, it was written and read in threads, where one email is a reply to another one or a forward to another person. That organization is lost when email is collected and processed into its raw format. In the raw email data format, the context is hard to grasp, as emails are no longer part of conversations but rather part of an unstructured pile of documents within the entire collection. Another obstacle when dealing with emails stems from duplicates. Duplication happens, for example, when the same email exists in the sender’s outbox and the recipient’s inbox. This dramatically expands the set of documents, dilutes the original conversation context by adding multiple copies of conversations, and therefore increases review time and costs.
The solution of "email threading" is meant to reconstruct the conversational structure of the emails from the raw documents, identify duplicates and annotate the emails with data that reconstitutes the conversation and supports the review process.
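To make the duplicate problem concrete, here is a minimal sketch of one common way to group copies of the same message. It is not ayfie's implementation: the `fingerprint` and `group_duplicates` helpers, the choice of fields (sender, subject, body), and the sample document IDs are all assumptions made for illustration.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(sender: str, subject: str, body: str) -> str:
    """Hash a normalized view of the message (hypothetical field choice)."""
    parts = [
        sender.strip().lower(),
        subject.strip().lower(),
        re.sub(r"\s+", " ", body).strip(),  # ignore line-wrap differences
    ]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def group_duplicates(emails):
    """Return lists of document IDs that share the same fingerprint."""
    groups = defaultdict(list)
    for doc_id, sender, subject, body in emails:
        groups[fingerprint(sender, subject, body)].append(doc_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# The same message collected from the sender's outbox and a recipient's inbox.
emails = [
    ("out-1", "alice@example.com", "Q3 report", "Please find the report attached."),
    ("in-7", "Alice@Example.com ", "Q3 report", "Please find the\nreport attached. "),
    ("in-8", "bob@example.com", "Re: Q3 report", "Thanks."),
]
duplicates = group_duplicates(emails)
```

Only one document per group then needs to be reviewed; the others can be tagged as duplicates.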
To maximize its usefulness, we have completely redone our existing email threading. The new version combines the experience and customer input we have gathered over the past years. Aside from significantly improving the richness of the generated metadata, the changes reduce processing time and memory consumption to a fraction of what they were. More on that later.
In this all-new email threading mechanism, we offer our customers a variety of different methods to reconstruct the threading structure of millions upon millions of emails even if some metadata might be missing.
It is crucial to not only look at external factors contained in the header of every email but really look into the content of each email in order to predict exactly how that email should be sorted into the overall structure and if the email needs to be individually reviewed or not.
ayfie’s email threading mechanism doesn’t only look at external metadata such as conversation indexes – which are produced by Microsoft programs like Outlook – to reconstruct email threads. Instead, we do the same thing a human reviewer would do: We look at the content of the email. We analyze the text and the reply/forward structure (e.g., indentation/quotation) and can reconstruct a threading structure even if metadata – recipients, subject line, sent date, etc. – is missing or corrupted.
ayfie’s email threading automatically detects highly repetitive content such as confidentiality footers, so that it can remove irrelevant text blocks and improve the threading quality even further.
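One simple way to picture the detection of repetitive text blocks is to count how many emails each paragraph appears in. The sketch below does exactly that; it is an illustration under stated assumptions, not ayfie's algorithm. The function names, the blank-line paragraph split, and the `min_share` threshold are hypothetical.

```python
import re
from collections import Counter


def normalize(block: str) -> str:
    """Collapse whitespace and case so reflowed copies compare equal."""
    return re.sub(r"\s+", " ", block).strip().lower()


def split_blocks(body: str) -> list[str]:
    """Split a body into paragraphs separated by blank lines."""
    return [b for b in re.split(r"\n\s*\n", body) if b.strip()]


def find_boilerplate(bodies: list[str], min_share: float = 0.5) -> set[str]:
    """Blocks occurring in at least `min_share` of all emails (assumed threshold)."""
    counts = Counter()
    for body in bodies:
        # Count each block once per email, however often it repeats inside it.
        counts.update({normalize(b) for b in split_blocks(body)})
    threshold = min_share * len(bodies)
    return {block for block, n in counts.items() if n >= threshold}


def strip_boilerplate(body: str, boilerplate: set[str]) -> str:
    """Remove the detected blocks before the text is compared for threading."""
    kept = [b.strip() for b in split_blocks(body) if normalize(b) not in boilerplate]
    return "\n\n".join(kept)


bodies = [
    "Hi Bob,\nsee attached.\n\nThis message is confidential.",
    "Thanks!\n\nThis message is  confidential.",
    "Meeting at 3pm.",
]
boilerplate = find_boilerplate(bodies)
cleaned = strip_boilerplate(bodies[0], boilerplate)
```

Stripping such blocks first keeps a shared footer from making unrelated emails look like quotes of each other.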
If this sounds interesting to you, and you want to find out what else ayfie’s all-new email threading mechanism can do, please book a demo here or continue reading for detailed information.
Missing and deleted emails are a problem in almost all eDiscovery cases. Quite often, not all documents have been made available for the review process. This can change the outcome of a trial significantly.
That’s why we created a robust mechanism for reconstructing so-called “ghost emails,” which don’t exist as a primary document in the collected data set but can be reconstructed from other emails quoting them or by analyzing additional information sources like the conversational index. The ayfie email threading process will rebuild the structure of email trees even when faced with multiple interleaved ghosts, as the following image shows:
Figure 1: The emails between A and D don't exist in the data set but were reconstructed from quotes in D and other metadata. The same goes for the parent email of H.
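The idea of rebuilding ghosts from quotes can be sketched in a few lines. The example below is a simplified illustration, not ayfie's mechanism: it assumes replies mark quoted text with `>` prefixes, that every quote level is non-empty, and that an email is identified by its own text. `Node`, `split_by_quote_depth`, and `build_thread` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    text: str
    parent: "Node | None" = None
    ghost: bool = False  # True when no collected document carries this text


def split_by_quote_depth(body: str) -> list[str]:
    """Index 0 is the email's own text, index k the text quoted k levels deep."""
    levels: dict[int, list[str]] = {}
    for line in body.splitlines():
        stripped = line.lstrip()
        depth = 0
        while stripped.startswith(">"):
            depth += 1
            stripped = stripped[1:].lstrip()
        if stripped:
            levels.setdefault(depth, []).append(stripped)
    max_depth = max(levels, default=0)
    return [" ".join(levels.get(d, [])) for d in range(max_depth + 1)]


def build_thread(bodies: list[str]) -> dict[str, Node]:
    """Link each email to its quoted ancestors, creating ghosts where needed."""
    chains = [split_by_quote_depth(b) for b in bodies]
    nodes = {chain[0]: Node(text=chain[0]) for chain in chains}
    for chain in chains:
        child = nodes[chain[0]]
        for quoted in chain[1:]:
            parent = nodes.get(quoted)
            if parent is None:
                # Quoted text with no matching document: reconstruct a ghost.
                parent = Node(text=quoted, ghost=True)
                nodes[quoted] = parent
            if child.parent is None:
                child.parent = parent
            child = parent
    return nodes


# The middle reply was never collected; it survives only as a quote.
root = "Kickoff on Monday?"
last = "Great, booked.\n> Monday works for me.\n>> Kickoff on Monday?"
nodes = build_thread([root, last])
```

Here the uncollected reply "Monday works for me." becomes a ghost node sitting between the two real emails.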
A key factor in reducing an attorney’s analysis time (and thus costs) is the “unique content marker,” which tags the documents, usually the last email in a thread branch, that include all the previous content of that branch.
Reading through an email carrying the unique content marker tag guarantees the reviewer will know everything about all included emails.
ayfie’s new email threading assists the review process by highlighting many properties of any given email in the context of its thread. It emphasizes things that could point to suspicious behavior, like changed subject lines and variations in recipients.
The reviewer can see for each individual email whether it is a root email, a ghost email, a reply, a forward, or a draft, whether it has an attachment, and what type the attachment is, for instance an Office document, an image, or a PDF file.
Below, you see an example of an email thread branch, where emails C, D, E, and G are replies (white arrow), email F is a forward (black arrow), and email H is a draft (pencil icon). Email D has an attachment (folder symbol). Email G has one or more duplicates in the data set, which don’t need to be individually reviewed (illustrated stack of emails). The list of recipients has changed on email G (head symbol).
Our next-generation email threading feature aggregates threading information from multiple sources into a single consistent view. In the example, dashed links represent links that stem from the conversational index; a regular line signifies that the parent was found by analyzing the content. Email A was detected to be the parent of email C (with one ghost in between) based on both conversational index and content. That is the best evidence possible! Email D was linked to email C by the conversational index alone, so email D does not fully quote email C. Email E was found to be a child of D by quotation alone. That’s why C and E are marked as “mail contains unique content” (pale red). Only if both emails E and C are inspected is it guaranteed that every text passage of this branch has been reviewed. Our tags are optimal in that they a) ensure that nothing escapes the review process and b) require only the absolute minimum number of emails to be reviewed.
Figure 2: Example of an email thread branch, where emails C, D, E, and G are replies, email F is a forwarded email, and email H is a draft.
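The tagging rule for the A, C, D, E part of this branch can be written down as a short sketch: an email needs review unless some child fully quotes it. This is an illustration of that rule only, not ayfie's code. The `Link` type, the `fully_quoted` flag, and the placeholder name `ghost` for the reconstructed email between A and C are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    child: str
    parent: str
    fully_quoted: bool  # True when the child's body contains the parent's full text


def unique_content_emails(emails: set[str], links: list[Link]) -> set[str]:
    """Emails whose text is not completely repeated in any of their children."""
    covered = {link.parent for link in links if link.fully_quoted}
    return {email for email in emails if email not in covered}


# Real documents in the branch; the ghost is not a reviewable document.
emails = {"A", "C", "D", "E"}
links = [
    Link(child="ghost", parent="A", fully_quoted=True),  # found via quoted content
    Link(child="C", parent="ghost", fully_quoted=True),
    Link(child="D", parent="C", fully_quoted=False),     # conversational index only
    Link(child="E", parent="D", fully_quoted=True),      # found by quotation
]
to_review = unique_content_emails(emails, links)
```

Under these assumptions the result contains exactly C and E: C because D does not repeat its text, and E because nothing follows it.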
As you can see, ayfie offers eDiscovery reviewers a precise, fast and easy way to sift through loads of emails without spending time reading irrelevant content. The wealth of information that ayfie’s brand-new email threading mechanism extracts and makes available to users ensures flexible ways of navigating into potentially compromising conversations in order to uncover the smoking gun faster and more efficiently.
Any legal practice, no matter how big the case, can benefit from automatic email analysis.
If you want to find out more, please contact us.
The next edition of our blog post series will feature ayfie's powerful extraction capabilities for content, entities and PII.