Data, Data Everywhere/Nor Any Drop to Drink

By Stephen Embry on December 7, 2019

When the winds of change are blowing, some people are building shelters, and others are building windmills. (Chinese proverb)

I recently read Tools and Weapons: The Promise and Peril of The Digital Age by Brad Smith, Microsoft’s President. Smith discusses the challenges and opportunities posed by the digital age and artificial intelligence on a variety of fronts. In one chapter, Smith discusses one of my favorite subjects: the opportunities that data and data analytics provide for improved decision making.

His analysis, though, reminded me of the line from the Samuel Taylor Coleridge poem, The Rime of the Ancient Mariner, about the sailor surrounded by ocean water he cannot drink: “Water, water everywhere/Nor any drop to drink.”

When it comes to data, the problem is not the amount of data available but accessing it, making it drinkable. Data sits in silos at businesses, law firms, and courts. Accessing and sharing this data to create richer and more robust analytics is particularly a problem in the legal profession. On top of lots of other challenges, concepts of attorney-client privilege, the ethical duty of confidentiality, and lawyers’ general reluctance to share come into play.

But as I discussed in a post from some three years ago (!), the opportunities of accessing and using data to aid in legal decision making and even in preventing the underlying events leading to legal problems are significant.

Data Is Reusable

Smith correctly notes that the opportunity of big data stems from the fact that it can be used repeatedly by a variety of organizations in a variety of contexts without detracting from its utility.

Smith cites the work of one Matthew Trunnell, the Chief Data Officer at the Fred Hutchinson Cancer Research Center in Seattle, as an example of this reusable feature and opportunity. The Hutch, as it is known, is one of the world’s leading cancer research centers. Trunnell recognizes that the future of the Hutch and its success in fighting cancer depends largely on access to more and more data. Trunnell wants to enable health research organizations throughout the world to share cancer-related data and, by doing so, jump-start research and open analytical doors.

Trunnell imagines a world where data developed by all institutions would be available to researchers who could then mine this data for trends and patterns that might revolutionize cancer prevention and treatment. Microsoft is so convinced of the advantages of this approach that it, with SAP and Adobe, has launched the Open Data Initiative, designed to provide a technology platform and tools to enable organizations to access data while continuing to own and maintain control of the data they share. Microsoft has also announced a $4 million commitment to support the Hutch’s project to help identify and facilitate sharing data while, at the same time, protecting privacy.

Another example offered by Smith of the impact of data aggregation is how the Trump campaign used data in the 2016 election. According to Smith, the Trump team connected with as many organizations as possible to aggregate as much diverse data as possible. The Trump campaign, somewhat out of necessity, relied on something closer to the shared-data approach Trunnell describes. As a result of this aggregation of diverse data, the Trump campaign had much richer data and could predict such things as the late turn of the electorate in the Midwest toward Trump. The Clinton campaign, on the other hand, relied almost solely on its technical prowess.

But What About for Law?

Back in 2016, I wrote about the potential impact of data and analytics on legal-related issues. I was focused on the power of data analytics to influence claims handling by insurance companies or other entities that routinely deal with multiple claims. I reasoned that data and analytics could identify the factors that past claims have in common with new claims and so better predict what would happen with the new ones: a proverbial crystal ball.

Using analytics, those responsible for managing claims could make better predictions about a claim when it is made (or even before). Analytics could also aid claims handlers in making more educated decisions about how to handle claims.

Armed with these correlations, claims handlers could identify long-term, expensive claims sooner and devise strategies to handle or settle them. If a claims manager knew, for example, that a particular claim would likely result in litigation or a protracted adjustment battle, the handler could try to resolve the matter early on. Or, failing that, the handler would have a better chance to put strategies in place to mitigate the likely losses.
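To make the idea concrete, here is a minimal sketch of that kind of early flagging. Everything in it is hypothetical: the claim attributes, the sample history, and the 50 percent threshold are assumptions chosen for illustration, not anyone's actual claims model. It simply looks up how often past claims with the same profile ended in litigation.

```python
from collections import defaultdict

# Hypothetical history: (injury_type, claimant_has_counsel, went_to_litigation)
PAST_CLAIMS = [
    ("back", True, True),
    ("back", True, True),
    ("back", True, False),
    ("back", False, False),
    ("wrist", True, False),
    ("wrist", False, False),
]

def litigation_rates(past_claims):
    """Observed litigation rate for each (injury_type, has_counsel) profile."""
    counts = defaultdict(lambda: [0, 0])  # profile -> [litigated, total]
    for injury, has_counsel, litigated in past_claims:
        tally = counts[(injury, has_counsel)]
        tally[0] += int(litigated)
        tally[1] += 1
    return {profile: lit / total for profile, (lit, total) in counts.items()}

def flag_for_early_resolution(new_claim, rates, threshold=0.5):
    """Flag a new claim when similar past claims were litigated at or above the threshold."""
    rate = rates.get(new_claim)  # None when there is no comparable history
    return rate is not None and rate >= threshold

rates = litigation_rates(PAST_CLAIMS)
print(flag_for_early_resolution(("back", True), rates))   # similar claims often litigated
print(flag_for_early_resolution(("wrist", True), rates))  # similar claims rarely litigated
```

A real model would draw on far more factors and far more history, which is exactly why the size and quality of the data pool matters so much.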

Similarly, if litigation occurs, data-based analytical models could also help identify suits that would likely incur high defense costs and/or exposure. This ability would improve outside attorney selection and management: the claim could be assigned to an outside lawyer with the skill set and experience to handle significant exposure matters even if the exposure is not self-evident initially. Again, the result is better allocation of resources and better ability to predict indemnity and legal spend.

Above and beyond this, data analytics could lead to the identification and elimination of factors that increase the chance of injury or claims in a wide variety of contexts, including the workplace, hospitals, and assisted living facilities. Analytics might even help identify factors that often precede legal or medical malpractice.

Then, and even more so now, I believe data and analytics can go even further and predict, for example, who (due to such things as lifestyle and changes in lifestyle and/or health) might be prone to future claims. Data analytics could confirm whether things like weight gain, medication change, a decline in balance, or even divorce or a death in the family might all be factors that predispose someone to suffer an accident and assert a resulting claim. These factors could serve much like orange cones warning of a wet floor or other hazard, which reduce the risk of an accident or other event creating a legal hazard.

Given that the IoT is creating even more data, there is an even greater opportunity to learn when and under what circumstances events like accidents and injuries occur. This knowledge would allow for intervention and mitigation to reduce the chances of the events occurring at all. This in turn could prevent claims and legal issues from ever developing.

Indeed, since then, we have seen a plethora of products to better mine data for litigation as I recently discussed. And recently, the insurance giant, Liberty Mutual, announced a much more robust use of data to do many things I talked about in 2016.

The Challenges

This all sounds good. But as we all know, there are significant challenges to taking advantage of these opportunities.

The biggest obstacle, which I identified in 2016 and which Trunnell’s project still illustrates, is the inability to share data and create a more robust data pool. The more data (clean, accurate, and multiyear), the better the analytical outcomes and predictions. Again, there is no shortage of data: tons of it remain siloed in insurance companies, law firms, in-house legal departments, businesses, and even in state and federal courts. The problem is the inability to access all this data. As Trunnell recognizes, the challenge is to somehow aggregate and share all this data in a platform that could be accessed and used to create models to benefit multiple organizations.

Trunnell correctly notes that, like many businesses, legal departments and law firms, most research institutes have conducted and maintained their work in silos which they developed with in-house tools. Other problems include privacy concerns and the fact that data may not be stored in machine-readable formats, or may be formatted, labeled, and structured in different ways that make it harder to share and use in common. The collective impact, Trunnell observes, is that organizations find it difficult to partner with each other and that there is insufficient aggregation to support maximum machine learning.
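The formatting and labeling problem is easy to picture with a small sketch. The two "sources," their field names, and their date formats below are invented for illustration; the point is only that the same facts, labeled differently, must be mapped to a common schema before they can be pooled.

```python
from datetime import datetime

# Hypothetical field names used by two different data holders for the same facts.
FIELD_MAPS = {
    "insurer": {"ClaimNo": "claim_id", "LossDt": "loss_date", "PaidAmt": "amount_paid"},
    "law_firm": {"matter_ref": "claim_id", "date_of_loss": "loss_date", "indemnity": "amount_paid"},
}
# Hypothetical date conventions for each source.
DATE_FORMATS = {"insurer": "%m/%d/%Y", "law_firm": "%Y-%m-%d"}

def normalize(record, source):
    """Rename fields to the shared schema and standardize dates and dollar amounts."""
    field_map = FIELD_MAPS[source]
    row = {field_map[key]: value for key, value in record.items() if key in field_map}
    parsed = datetime.strptime(row["loss_date"], DATE_FORMATS[source])
    row["loss_date"] = parsed.date().isoformat()
    row["amount_paid"] = float(str(row["amount_paid"]).replace("$", "").replace(",", ""))
    return row

insurer_row = normalize({"ClaimNo": "A-1", "LossDt": "03/15/2018", "PaidAmt": "$12,500"}, "insurer")
firm_row = normalize({"matter_ref": "M-7", "date_of_loss": "2018-04-02", "indemnity": 8000}, "law_firm")
print(insurer_row)
print(firm_row)
```

Once both rows share the same labels and formats, they can sit in the same pool and feed the same model.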

The same is true in spades for data that could affect legal issues, claims, and even underlying events. Lawyers are faced not only with the impediments Trunnell mentions but also attorney-client privilege and client confidentiality issues. Not to mention the fact we don’t share very well generally.

Are There Solutions?

Despite these challenges, we are today seeing efforts to harness legal-related data across businesses. I recently wrote about a company that collects billing data from various companies and law firms, makes it anonymous, and then mines it to make comparisons of lawyer efficiencies and results.
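As a rough illustration of that collect, anonymize, and compare pattern, consider the sketch below. It is not that company's method: the firm names, the billing records, and the salt are all made up, and replacing a name with a salted hash is better described as pseudonymization than true anonymization, since other details in a record can still identify a firm or client.

```python
import hashlib
from collections import defaultdict

def pseudonym(name, salt):
    """Replace an identifying name with a short, stable token (salted SHA-256)."""
    return hashlib.sha256((salt + name).encode("utf-8")).hexdigest()[:10]

def anonymize(records, salt):
    """Return copies of the billing records with the firm name replaced by a token."""
    return [{**record, "firm": pseudonym(record["firm"], salt)} for record in records]

def effective_rate_by_firm(records):
    """Total fees divided by total hours for each (tokenized) firm."""
    totals = defaultdict(lambda: [0.0, 0.0])  # firm -> [fees, hours]
    for record in records:
        totals[record["firm"]][0] += record["fees"]
        totals[record["firm"]][1] += record["hours"]
    return {firm: fees / hours for firm, (fees, hours) in totals.items()}

SALT = "example-salt"  # hypothetical; a real salt would be kept secret
BILLING = [  # hypothetical billing records
    {"firm": "Firm A", "hours": 10, "fees": 3000},
    {"firm": "Firm A", "hours": 20, "fees": 5000},
    {"firm": "Firm B", "hours": 10, "fees": 4000},
]

shared = anonymize(BILLING, SALT)
rates = effective_rate_by_firm(shared)
print(rates)
```

The comparison still works because each firm keeps the same token across its records, while the pooled data no longer carries the names themselves.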

Similar attempts could be made across a wide range of industries. Businesses and even law firms should think of how the data could be used to address common industry issues that have no competitive impact, but which can improve the overall process and even reduce incidents. Microsoft, for example, has recognized the benefit of sharing data between and among its panel counsel and even other businesses.

As Smith notes, “The key to this type of …collaboration lies with human values and processes and not just a focus on technology. Organizations need to decide whether and how to share data, and if so, on what terms.” He notes this sharing could be achieved by developing a few foundational principles like concrete arrangements to protect privacy, formulas for protecting security, addressing fundamental questions around data ownership, and the development of tools to enable easier and less costly data sharing. In the legal setting, it would also mean protecting confidentiality and privilege.

Significant challenges? Yes. But given the opportunity, surely the development of these principles can be done. Trunnell says it the best: “the divide between those producing the data and those building novel tools is a huge missed opportunity for making impactful, life-changing—and potentially lifesaving—discoveries using the massive amount of scientific, educational, and clinical trial data being generated every day.”

The same is true in law.

Photo Attribution

Photo by @colinczerwinski on Unsplash

Photo by Laurenz Kleinheider on Unsplash

Photo by Markus Spiske on Unsplash


TechLaw Crossroads
