In just three years, the total amount of data (as measured in megabytes, or MB) GovQA customers provided to requesters increased by 237%, from about 8,000 MB per quarter to about 27,000 MB per quarter. Per response, the average file size doubled from about 10 MB to about 20 MB. For perspective, a 150-200 page black-and-white scanned PDF is around 10 MB, while a 400-500 page PDF created by Microsoft Word is around 20 MB.
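For readers who want to check the growth figures, here is a minimal Python sketch of the arithmetic. The function name percent_increase is ours, not GovQA's, and the inputs are the rounded "about" figures quoted above, so the result is approximate.

```python
def percent_increase(old, new):
    """Return the percentage increase from old to new."""
    return (new - old) / old * 100

# Rounded quarterly totals quoted in the article, in MB.
quarterly_mb_before = 8_000
quarterly_mb_after = 27_000

growth = percent_increase(quarterly_mb_before, quarterly_mb_after)
print(f"Total data per quarter grew by about {growth:.1f}%")

# Average file size per response, in MB: a doubling is a 100% increase.
size_growth = percent_increase(10, 20)
print(f"Average file size grew by about {size_growth:.0f}%")
```

The first figure comes out to 237.5%, in line with the 237% reported above.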
A set of responsive documents used to look like a more-or-less tidy stack of papers. It might include photos, maps and graphics as well as written documents; but by and large you could thumb through the stack and have a reasonable sense of what was there and how it was organized.
Large files are more complex
Electronic files have completely changed the game of file complexity in public records requests, and the change touches the entire process lifecycle: record generation, storage, review and redaction, and transmittal to the requester. These file types include emails, text messages, reports and spreadsheets that can be in image form (PDF) or proprietary formats like .doc and .xls, photos, GIS data, plans and maps, audio recordings, and the 800-pound gorilla: video files.
Why have electronic files changed everything? Primarily because they contain much, much more data, and also because that data is more inscrutable to the eye. We can flip through a 300-page printed document and get a quick idea of what’s in it and whether it may be responsive much faster than we can scroll through a 300-page document. And audio and video technologies have been around long enough that we pretty much instinctively know how much information there will be to mentally process in 30 minutes of audio or video. But having an idea what we’re looking at with a 30 MB .shp file (GIS shape file) is still pretty much beyond us.
Technology...sometimes the solution, sometimes the problem
While we can keyword-search a 300-page electronic document much faster than a 300-page printed document, searching for keywords may mislead us into concluding a document isn’t responsive to a request when in fact it is. As court cases have shown, “in many contexts, the use of keywords without testing and refinement (or more sophisticated techniques) will in fact not be reasonably calculated to uncover all responsive material.”
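To see how an untested keyword can miss responsive records, consider this small Python sketch. Everything in it is hypothetical: the sample memos, the search terms, and the matches helper are invented for illustration and do not represent any particular search tool.

```python
def matches(text, terms):
    """Return True if any search term appears in the text (case-insensitive)."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)

# Hypothetical records; three of the four concern body camera video.
documents = {
    "memo_a": "Officer reviewed the body camera footage on Tuesday.",
    "memo_b": "BWC video from the incident was uploaded late.",
    "memo_c": "Bodycam recording is attached for the supervisor.",
    "memo_d": "Quarterly parking revenue report.",
}

# A single obvious keyword versus a list refined after testing.
naive_terms = ["body camera"]
refined_terms = ["body camera", "bodycam", "bwc", "body-worn"]

naive_hits = [name for name, text in documents.items() if matches(text, naive_terms)]
refined_hits = [name for name, text in documents.items() if matches(text, refined_terms)]

print("Naive search found:", naive_hits)
print("Refined search found:", refined_hits)
```

The single-term search finds only one of the three responsive memos; the refined list finds all three and still skips the unrelated report. Real collections need the same kind of testing against a sample before anyone trusts the results.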
Technology is also allowing us to transmit greater and greater quantities of data. While the file sizes of emails themselves haven’t changed much over time, what we can include with emails has: cloud technology now removes the limit on the total quantity of data that can be included in email attachments, and each single attachment can be up to 250 GB. So imagine an email chain that transmitted documents of this size to an initial circle of people and then perhaps on from there. If it’s possible the email or attachments are responsive to a request, you would have to figure out a whole system for confirming that the documents stayed the same as they were forwarded on, whether any of them might contain information that would have to be reviewed or redacted, and whether an attorney was copied at any point in the chain, raising possible privilege. You would also have to figure out if any emails were .pst files rather than .msg files, meaning they were stored locally rather than on the server.
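One common way to check whether an attachment stayed the same as it was forwarded is to compare cryptographic hashes of each copy. The Python sketch below is one possible approach, not a description of any GovQA feature; the file names and contents are made up, and group_identical is a hypothetical helper.

```python
import hashlib
from collections import defaultdict

def sha256_of(data: bytes) -> str:
    """Return the SHA-256 fingerprint of a file's contents."""
    return hashlib.sha256(data).hexdigest()

def group_identical(attachments):
    """Group attachment names whose contents are byte-for-byte identical.

    attachments maps a name to the file's bytes.
    """
    groups = defaultdict(list)
    for name, data in attachments.items():
        groups[sha256_of(data)].append(name)
    return list(groups.values())

# Hypothetical email chain: the third copy was edited before forwarding.
sample = {
    "msg1/budget.xlsx": b"original spreadsheet bytes",
    "msg2_fwd/budget.xlsx": b"original spreadsheet bytes",
    "msg3_fwd/budget.xlsx": b"edited spreadsheet bytes",
}

groups = group_identical(sample)
for group in groups:
    print(group)
```

Copies that land in the same group only need to be reviewed once; a copy in its own group was changed somewhere along the chain and needs a separate look. For attachments in the gigabyte range, the file would be read and hashed in chunks rather than loaded into memory whole.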
The ever-growing challenge of video
The mother of all large, complex files is the video file, particularly law enforcement-related video like officer bodycam footage.
First, these files are simply huge relative to other types of files governments generally deal with, and the data piles up quickly. The Chula Vista, CA police department learned from equipping a few officers with bodycams that 30 minutes of video was about 800 MB of data, and that its 200 officers would end up generating 33 terabytes of data per year, a massive amount of data to try to organize and store, and at tremendous cost. Even with technology such as cameras that compress file sizes by 50%, it’s still more than 15 terabytes of data. The City of Baltimore chose to forgo a body-worn camera program in part because it estimated video storage costs at $2.6 million per year.
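The Chula Vista figures can be sanity-checked with a few lines of Python. This is a rough sketch under one stated assumption: decimal units, where 1 terabyte equals 1,000,000 MB. The variable names are ours, and the implied recording time is derived from the quoted numbers rather than reported by the department.

```python
MB_PER_TB = 1_000_000  # assumption: decimal units (1 TB = 1,000,000 MB)

mb_per_hour = 800 * 2   # about 800 MB per 30 minutes of video
officers = 200
annual_tb = 33          # department-wide estimate quoted in the article

# Recording time per officer per year implied by the quoted figures.
implied_hours_per_officer = annual_tb * MB_PER_TB / officers / mb_per_hour

# Annual storage if cameras compress file sizes by 50%.
compressed_tb = annual_tb * 0.5

print(f"Implied recording time: {implied_hours_per_officer:.1f} hours per officer per year")
print(f"Storage with 50% compression: {compressed_tb:.1f} TB per year")
```

Under these assumptions, the 33-terabyte estimate corresponds to only about 103 hours of video per officer per year, and halving file sizes still leaves 16.5 terabytes to store annually.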
In terms of day-to-day staff time and expense, the costliest aspect of dealing with large law enforcement-related videos is review and redaction of a host of nondisclosable data, much of it related to PII (personally identifiable information). These videos include those from bodycams but also dashcams, traffic cams, citizens, surveillance cameras from businesses, and other sources.
A California Supreme Court ruling highlighted just how fraught video files can be for jurisdictions. The City of Hayward got into a court battle with the National Lawyers Guild over whether the latter had to pay thousands of dollars the City had incurred to review and redact six hours of video requested by the Guild. The Court ultimately concluded that California law did not intend for jurisdictions to recoup staff costs for redaction, so the City was out not only those costs but also the much greater costs of multiple court cases. Had the Guild not agreed to reduce its request from 90 hours of video to six hours, the City would have been out tens of thousands more.
The future is big files
The Pandora’s Box of ever-increasing file sizes and ever-more technology to manage these files has been opened, and what’s escaped isn’t going back in. There is too much pressure from too many sides. Technology companies are bombarding jurisdictions with promises of tools to meet their complex needs and solve the problem that the last round of technology created (Big video files? Get these cameras that compress the file sizes!). Citizens and elected officials are calling for new tools to increase accountability and transparency, and staff need to keep them productively engaged.
Where can you turn for answers?
GovQA regularly hosts and moderates roundtables and webinars with partners, associations, customers, and other subject matter experts to discuss the challenges you face. Join GovQA and your Peers in Public Records at the next GovQA event!
The Peers in Public Records Newsletter (formerly FOIA News) is a bi-monthly e-newsletter brought to you by GovQA. It is a collection of the latest trends in public record requests and government transparency initiatives, shared stories, live roundtables, informative case studies, and actionable knowledge that will help you calm the chaos and keep your organization compliant. Send your comments to email@example.com.
© Copyright 2021. PiPRIndex. All rights reserved.