Hashes



Overview of Hashes
Lawyers use hashes almost every day, often without realizing it. Hashes are widely used to authenticate digital documents in criminal and civil cases. The Federal Court's Electronic Case Filing system uses MD5 hashes to verify digital signatures, and to verify proper storage of all pleadings and documents that are filed electronically.

Hashes are also an integral part of Casdex's archive system. We use hashes to help identify content and ensure that documents on our system have not been altered. With our system, a hash is generated when the document is archived. You can then use that hash to prove the document's authenticity and that the content has not changed since the file was archived.


What is a hash?
Simply put, a hash value is a compact representation of a digital document. The hash value is created by running a hashing algorithm on the document. The output of the hashing algorithm is a string of alphanumeric characters of a fixed length (32 hexadecimal characters for MD5, 40 for SHA-1), regardless of the size of the document. The document can be any type of digital file (pictures, video, audio, text, PDF, etc.), or even a disk or disk image. There are several different types of hashes; Casdex uses MD5 and SHA-1. In cryptographic terms, both are designed to be one-way and collision resistant.
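As a rough illustration of the process described above, the short Python sketch below runs both algorithms over a document's bytes using the standard hashlib module. The function names hash_document and hash_file are hypothetical and chosen only for this example; they are not part of Casdex's system.

```python
import hashlib


def hash_document(data: bytes) -> dict:
    """Return the MD5 and SHA-1 hash values of a document's bytes as hex strings."""
    return {
        "md5": hashlib.md5(data).hexdigest(),    # always 32 hex characters
        "sha1": hashlib.sha1(data).hexdigest(),  # always 40 hex characters
    }


def hash_file(path: str, chunk_size: int = 65536) -> dict:
    """Hash a file of any type or size by reading it in fixed-size chunks."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b"" (end of file)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return {"md5": md5.hexdigest(), "sha1": sha1.hexdigest()}


if __name__ == "__main__":
    print(hash_document(b"abc"))
```

Whatever the size of the input, the MD5 value is 32 characters long and the SHA-1 value is 40 characters long.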

One Way
Hashes are said to be "one-way" when it is computationally infeasible to produce or reproduce a document from a known hash. As a result, you can transmit a hash in place of information you wish to keep secure: anyone who already has the information can confirm that it matches the hash, but someone who sees only the hash cannot recover the information it represents. Most password systems work this way.
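The password case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses PBKDF2 with SHA-256 from the standard library rather than MD5 or SHA-1, and the names make_record, check_password, and ITERATIONS are hypothetical. The point it shows is that the system stores and compares hashes and never needs to keep the password itself.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def make_record(password: str) -> tuple:
    """Return (salt, digest) for storage; the password itself is not stored."""
    salt = os.urandom(16)  # random salt so equal passwords give different digests
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(candidate: str, record: tuple) -> bool:
    """Hash the candidate with the stored salt and compare the two digests."""
    salt, stored = record
    digest = hashlib.pbkdf2_hmac("sha256", candidate.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, stored)  # constant-time comparison


if __name__ == "__main__":
    record = make_record("correct horse")
    print(check_password("correct horse", record))
    print(check_password("wrong guess", record))
```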

Collision Resistant
Hashes are said to be "collision resistant" when it is computationally infeasible to find two documents with different content that produce the same hash. The chance that two different documents share the same MD5 hash by accident is about one in 10^38. For SHA-1, the chance is about one in 10^48. Casdex uses both hashes, so the chance of an accidental collision is between one in 10^48 and one in 10^86. For comparison, the chance of winning the Powerball lottery is about one in 10^8.
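These orders of magnitude follow from the output sizes of the two algorithms: MD5 produces 128 bits and SHA-1 produces 160 bits. The Python sketch below works through the arithmetic. It assumes that hash values behave like uniformly random, independent numbers, so it describes accidental matches between two given documents, not collisions that someone constructs deliberately. The helper name decimal_order is hypothetical.

```python
MD5_BITS = 128   # length of an MD5 hash value in bits
SHA1_BITS = 160  # length of a SHA-1 hash value in bits


def decimal_order(bits: int) -> int:
    """Return n such that 2**bits is roughly 10**n (decimal digits minus one)."""
    return len(str(2 ** bits)) - 1


md5_order = decimal_order(MD5_BITS)               # 38
sha1_order = decimal_order(SHA1_BITS)             # 48
both_order = decimal_order(MD5_BITS + SHA1_BITS)  # 86

if __name__ == "__main__":
    print(f"MD5 alone:     about one in 10^{md5_order}")
    print(f"SHA-1 alone:   about one in 10^{sha1_order}")
    print(f"Both together: about one in 10^{both_order}")
```

Requiring both hashes to match multiplies the two sets of possible values, which is why the exponents add (128 + 160 = 288 bits, or roughly 10^86).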

These two properties combine to make hashing documents uniquely suited for any number of file operations within the legal community. Forensic experts use hashes to confirm the contents of a document or disk have not changed through the process of examination and chain of custody. Police and investigators use hashes to search for illegal content without viewing the documents. E-Discovery teams use hashes as a way to quickly cull through terabytes of digital documents for litigation.
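Two of these uses can be sketched briefly in Python: culling duplicate documents by grouping them on their hash, and checking documents against a list of known hashes without opening them. The function names and the sample data are hypothetical, and SHA-1 is used alone here only to keep the example short.

```python
import hashlib
from collections import defaultdict


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 hash value of the data as a hex string."""
    return hashlib.sha1(data).hexdigest()


def group_duplicates(documents: dict) -> list:
    """Group document names whose contents produce the same hash."""
    by_hash = defaultdict(list)
    for name, data in documents.items():
        by_hash[sha1_hex(data)].append(name)
    # Keep only the hashes shared by more than one document
    return [sorted(names) for names in by_hash.values() if len(names) > 1]


def match_known(documents: dict, known_hashes: set) -> list:
    """Return names of documents whose hash appears in a set of known hashes."""
    return sorted(
        name for name, data in documents.items() if sha1_hex(data) in known_hashes
    )


if __name__ == "__main__":
    docs = {
        "memo_v1.txt": b"Quarterly results attached.",
        "memo_copy.txt": b"Quarterly results attached.",
        "letter.txt": b"Please see the enclosed contract.",
    }
    print(group_duplicates(docs))
    print(match_known(docs, {sha1_hex(b"Please see the enclosed contract.")}))
```

Because only hashes are compared, neither function needs to display or interpret the contents of any document.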

Casdex offers lawyers a new way to use hashes.


Hashes in Casdex
Casdex uses hashes to ensure document authenticity. In the same way a forensic expert can use a hash to prove that a document's contents have not changed, lawyers archiving their documents with Casdex can prove their documents are unchanged. Hashes not only prove authenticity of a given document, they meet the requirements of the Original Writing rule and can verify the contents of the document as well.

Attorneys can use Casdex to archive a contract, will, or any other document, and publish the hash to the interested parties. If a dispute arises, they can be secure in the knowledge that the legally operative copy of the document can easily be identified.
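The check an interested party might run later can be sketched as follows. This is a minimal illustration, not Casdex's actual interface: the names publish_hashes and matches_published are hypothetical, and the sketch simply recomputes both hashes and compares them to the published values.

```python
import hashlib
import hmac


def publish_hashes(data: bytes) -> dict:
    """Compute the hash values that would be shared with interested parties."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }


def matches_published(data: bytes, published: dict) -> bool:
    """Return True only if both recomputed hashes equal the published ones."""
    current = publish_hashes(data)
    return all(
        hmac.compare_digest(current[name], published[name].lower())
        for name in ("md5", "sha1")
    )


if __name__ == "__main__":
    original = b"This agreement is made between the parties named below."
    published = publish_hashes(original)

    altered = original.replace(b"below", b"above")
    print(matches_published(original, published))  # unchanged copy
    print(matches_published(altered, published))   # edited copy
```

Any change to the document, even a single character, produces different hash values, so an altered copy fails the comparison.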

In addition, Casdex shoulders the burden of managing a digital archive for its customers. With retention policies, workgroup and external document sharing, and document authentication, Casdex is a digital archive for the small law firm.


Hashes in Court
Case law is developing around hashes and their use in forensics, e-discovery, and authentication of evidence. The following are cases citing hashes as a method of authenticating documents or proving that contents have not been altered:

US District Court of Kansas
Williams v. Sprint/United Mgmt. Co. (Williams II), 2006 WL 3691604 (D. Kan. Dec. 12, 2006)

Defendant's concerns regarding maintaining the integrity of the spreadsheet's values and data could have been addressed by the less intrusive and more efficient use of "hash marks." For example, Defendant could have run the data through a mathematical process to generate a shorter symbolic reference to the original file, called a "hash mark" or "hash value," that is unique to that particular file. This "digital fingerprint" akin to a tamper-evident seal on a software package would have shown if the electronic spreadsheets were altered. When an electronic file is sent with a hash mark, others can read it, but the file cannot be altered without a change also occurring in the hash mark.

US District Court of Maryland
Lorraine v. Markel Am. Ins. Co., PWG-06-1893, at 26 (D. Md. May 4, 2007)

One method of authenticating electronic evidence under Rule 901(b)(4) is the use of "hash values" or "hash marks" when making documents. A hash value is:

A unique numerical identifier that can be assigned to a file, a group of files, or a portion of a file, based on a standard mathematical algorithm applied to the characteristics of the data set. The most commonly used algorithms, known as MD5 and SHA, will generate numerical values so distinctive that the chance that any two data sets will have the same hash value, no matter how similar they appear, is less than one in one billion. 'Hashing' is used to guarantee the authenticity of an original data set and can be used as a digital equivalent of the Bates stamp used in paper document production.

Hash values can be inserted into original electronic documents when they are created to provide them with distinctive characteristics that will permit their authentication under Rule 901(b)(4). Also, they can be used during discovery of electronic records to create a form of electronic "Bates stamp" that will help establish the document as authentic. This underscores a point that counsel often overlook. A party that seeks to introduce its own electronic records may have just as much difficulty authenticating them as one that attempts to introduce the electronic records of an adversary. Because it is so common for multiple versions of electronic documents to exist, it sometimes is difficult to establish that the version that is offered into evidence is the "final" or legally operative version. This can plague a party seeking to introduce a favorable version of its own electronic records, when the adverse party objects that it is not the legally operative version, given the production in discovery of multiple versions. Use of hash values when creating the "final" or "legally operative" version of an electronic record can insert distinctive characteristics into it that allow its authentication under Rule 901(b)(4).


External Links
The legal community is also discussing hashes in publications and law reviews.
© Copyright 2008 Casdex.com, inc.