
Stanford’s Bias Sparks Controversy Over GenAI Products

Edward Bukstel


The Stanford Research Caused A Ruckus on Friday Before Memorial Day 2024

Earlier this week, I wrote about some basic questions that I had about the Harvey AI Legal LLM. It turns out that a whole lot of people had a whole lot of questions as well. It just didn’t make sense at a gut level that a company that had a significant head start in Generative AI for lawyers would seem to be flailing around.

What began as a few questions turned into thousands of views on LinkedIn and over 1,300 full article reads on Medium. The level of engagement from the legal community has been off the hook. A 77% read ratio on a 9-minute article is significant.

77% is Huge

I have not heard anything from Harvey folks challenging the items in the “There’s Something About Mary” post. Frankly, I didn’t expect much anyway. Even so, there have been a lot of DMs from all types of thoughtful people in legal technology and GenAI.

In fact, the only peep out of Harvey this week was a somewhat inconsequential nothing burger of a Press Release regarding a strategic partnership with Mistral. Calling it a Press Release is wrong, though; it was just an announcement posted on the Harvey website. There is no press release or announcement on the Mistral AI website.

Mistral AI, founded by Arthur Mensch, Guillaume Lample and Timothée Lacroix, has developed incredible open source and commercial frontier models. As we learned more about their vision we found that their passion for performance, transparency, and efficiency reflected the needs of many of our customers, as well as their commitment to portability and customisation.

At Harvey, our mission has always been to give lawyers superpowers. To do so, we have built a trusted platform used by law firms and enterprises to safely and accurately leverage generative AI in their workflows. For many of our customers, performance alone is not enough. Deploying generative AI in highly regulated industries requires extreme levels of security and transparency. Our partnership with Mistral AI will accelerate our ability to support our customers with their most sensitive matters.

In contrast to Harvey’s lack of response to questions about its product offering, the timely and focused responses to a Stanford research report that pilloried Thomson Reuters and LexisNexis are striking. The Stanford study has already come under fire for bogus research and a clickbait title.

Is Stanford Biased?

The Stanford Human-Centered Artificial Intelligence lab published a study on May 23, 2024. The study basically bashed Thomson Reuters’ Ask Practical Law GenAI product and the LexisNexis Lexis+ AI offering. In some cases, the Stanford researchers deliberately presented queries that ZERO actual attorneys would ever ask. The overarching problem with the Stanford study is that the researchers used the wrong tools and products. When you try to cut wood with a fishing pole, bad stuff happens.

While everyone in the US was getting ready for Memorial Day on May 24, 2024, Richard Tromans was dropping some facts on the Legal Technology and GenAI universe.

Stanford Debacle

Artificial Lawyer’s coverage of the pushback by both LexisNexis and Thomson Reuters was epic. Richard Tromans reached out to both LexisNexis and Thomson Reuters for comment on the Stanford findings. And guess what? They both responded.

In short, the academic researchers asked the wrong body of information and so got wonky responses. And, as Lambert states, he then went back and ran the same query through Westlaw, i.e. the main case law site for Thomson Reuters, and then the system did spot the error and get the right answer.

That then raises questions about how they approached LexisNexis as well. We need some legal research experts to do this again, that’s for sure.

One other point that struck this site was whether running deliberately misleading questions that are designed to fail is really a sound way of testing generative AI tools for lawyers? Although, it’s worth mentioning that these ‘fake questions’ were only part of the queries they tried out. Nevertheless, in the example noticed by Lambert the academics had intentionally sought to confuse the AI.

Clearly lawyers will make mistakes in their questions from time to time, but generally they know roughly what they are looking for, or at least aiming at. How often do lawyers type in queries that are totally factually conflicted and so cannot do anything but fail? 1% of the time? 20% of the time? More? It’s not clear.

Imagine if the Stanford researchers had sent an email or a DM to LexisNexis or Thomson Reuters prior to publishing their negative clickbait research. They clearly had the opportunity to make a phone call to someone who could verify they were at least using the right tool. Lawyers and the public in general deserve better from GenAI researchers.

Back in 2006, a research study about Alzheimer’s turned out to be a total fake, resulting in 16 years of flawed project funding and research that went sideways. GenAI is a humanity-level technology that will change the delivery and acquisition of legal services in the very near future. It is reasonable to believe that significant economies are going to change drastically as the technology rolls out over the next couple of years. Jumping down the rabbit hole, it’s not completely out of line to think there’s some influence from BigLaw wanting to maintain the status quo.

A May 23, 2024 article, What Does Big Law Stand to Gain From Slow AI Adoption?, adds lighter fluid to the BBQ.

Legal recruiter and consultant Frederick Shelton, CEO of Shelton & Steele, said even though he has personally demonstrated to firm leaders that generative AI can save clients 20% or more on legal costs, law firms are placing more emphasis on AI’s risks in their communications with clients.

“Anyone claiming that AI will not dramatically change pricing structures has an obvious agenda — preservation of an archaic business model and the almighty billable hour,” Shelton said.

The fact that law firms are trying to convince clients that the very technology they should be using stinks can really be counterproductive.

You might think the Stanford study was a one-off event regarding legal-technology-focused GenAI. Well, you’d be wrong.

Another negative finding about Legal GenAI in Jan 2024

These Stanford researchers are freaking busy studying Legal LLM companies. A few months prior to the May 2024 “problematic research,” Stanford Human-Centered Artificial Intelligence published “Hallucinating Law: Legal Mistakes with Large Language Models are Pervasive.” It doesn’t take a rocket scientist to understand that this article attacks Legal LLMs as well.

The Stanford Institute for Human-Centered Artificial Intelligence is not even affiliated with the law school at Stanford. Why are they spending so much time attacking GenAI legal technologies in such a sophomoric fashion?

It’s highly doubtful that anyone is deliberately falsifying research to destroy trust in Legal GenAI. But there definitely seems to be a bias going on at Stanford’s HAI. Maybe someone could propose a benchmarking protocol that would provide the right parameters to check each product’s hallucination mitigation procedures. Having third parties like the SALI Alliance or IEEE maintain these procedures and protocols could help prevent misinformation about Legal GenAI.
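To make that concrete, here is a minimal sketch of what such a protocol could look like. This is purely illustrative Python with hypothetical names; it is not anything published by Stanford, SALI, or IEEE. The idea: a vetted set of realistic queries with attorney-verified answers, run through each product’s appropriate research tool, with verdicts assigned by qualified human reviewers rather than by another model.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class BenchmarkItem:
    query: str         # a realistic question a practicing lawyer would ask
    ground_truth: str  # the answer/citation, verified in advance by attorneys

# Verdicts a qualified legal reviewer can assign to each response.
CORRECT, INCOMPLETE, HALLUCINATED = "correct", "incomplete", "hallucinated"

def run_benchmark(product_name: str,
                  ask: Callable[[str], str],
                  items: list[BenchmarkItem],
                  grade: Callable[[str, BenchmarkItem], str]) -> dict:
    """Run every vetted query against one product and tally reviewer verdicts."""
    tally = {CORRECT: 0, INCOMPLETE: 0, HALLUCINATED: 0}
    for item in items:
        response = ask(item.query)          # the product under test, via its proper tool
        tally[grade(response, item)] += 1   # verdict from a human legal reviewer
    return {
        "product": product_name,
        "hallucination_rate": tally[HALLUCINATED] / len(items),
        "verdicts": tally,
    }
```

The two design points are exactly the ones the Stanford study is being criticized for missing: the queries are ones real lawyers would actually ask, and each product is exercised through the right tool for the task (a case law question goes to a case law product, not a practice-guidance one).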

The potential problems with Stanford’s Legal GenAI research also point to a troubling trend in peer-reviewed research in general: more than 10,000 peer-reviewed papers were retracted in 2023.

Artificial Lawyer is much nicer in their assessment of the Stanford Research than I am:

The study was clearly driven by good intentions, but overall, and in particular in their public statement titled: ‘AI on Trial: Legal Models Hallucinate in 1 out of 6 Queries’, it feels like they’re jumping to conclusions a little too quickly. And that’s a shame, as this is important work that needs to be done. We need to share more factual information about how genAI works and how it can help lawyers. But, it needs to be valid.

In the future, the legal community and interested parties everywhere should expect that realistic workflows are also followed. We should always expect that the GenAI output is going to be reviewed by somebody. If it’s not reviewed by a lawyer or appropriate professional, it should be a “Swim at Your Own Risk” situation.
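As a purely illustrative sketch of that review gate (not any vendor’s actual workflow; all names here are made up), the principle is simply that unreviewed AI output never goes out as if it were final work product:

```python
from dataclasses import dataclass

@dataclass
class Draft:
    text: str                       # the GenAI-generated draft
    reviewed_by: str | None = None  # the lawyer or professional who signed off

def release(draft: Draft) -> str:
    """Release a draft; label it loudly if no qualified human has reviewed it."""
    if draft.reviewed_by is None:
        # Unreviewed output is a "Swim at Your Own Risk" situation.
        return "SWIM AT YOUR OWN RISK (unreviewed AI draft):\n" + draft.text
    return draft.text
```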

OMG, I was literally going to stop here, but Stanford just went full Harvey on May 25, 2024. Totally bonkers stuff from so-called researchers. We are talking Babylon Bee level poop.

First off, Richard Tromans’ follow-up on this slow-moving GenAI dumpster fire in legal technology and academia is Pulitzer-worthy. Artificial Lawyer requested statements from Thomson Reuters, LexisNexis, and Stanford. The responses came back from Thomson Reuters and LexisNexis within an hour or two.

Dumpster Fire

It took Stanford a little longer to get back to Richard Tromans. Although I don’t know the exact timestamps of the communication, I’m assuming Stanford was contacted at the same time as Thomson Reuters and LexisNexis. But really, nothing compares to the absolutely ridiculous comment from the Stanford researchers.

The Stanford excuse for not selecting the right product for the study is stupid AF. In what universe is it acceptable to publish research on a subject while using the wrong “subject”? The excuse presented to Richard Tromans demonstrates outright incompetence and misinformation through omission.

This whole situation is getting comical. Back in 2017, a group of researchers set out to study the bias of various journals in the social sciences. The group submitted completely bogus research that was published by “respected” peer-reviewed journals. In one case, the authors swapped the word “Jewish” for “White” in a chapter adapted from Hitler’s Mein Kampf. Shockingly, the paper was accepted by a peer-reviewed journal.

In digital health, there are organizations that serve as the gold-standard reviewers of application software used by health systems, insurers, and hospitals. For instance, KLAS Research provides unbiased third-party research on healthcare-specific applications. It is clear that we need an unbiased and competent source of truth about the effectiveness of generative AI in legal.

Looking a little deeper down the rabbit hole, it seems the Stanford Institute for Human-Centered AI is actually trying to position itself as a kind of “Consumer Reports” for AI.

For all of the good work that this organization is doing, there is an absolute blind spot for the Legal Technology space. Blind spots like this will cause people to make bad decisions based on faulty information.

Digital health has created focused organizations for generative AI responsible use and ethics. Shoutout to Jody Ranck for pointing out RAII and CHAI.

The researchers at Stanford could easily have reviewed how other third parties perform research on generative AI systems. For instance, the Responsible Artificial Intelligence Institute (RAII) has created a set of guidelines to consider when evaluating GenAI and Large Language Models.

Stanford researchers could have been more responsible in their research if they had contacted RAII first.

Really gotta hand it to Richard Tromans once again. Artificial Lawyer’s follow-up on this topic has been over-the-top exceptional. Please read the three articles published about this saga in a 24-hour span:

Problematic Stanford GenAI Study Takes Aim at Thomson Reuters + LexisNexis

Stanford GenAI Study Debacle — Thomson Reuters + LexisNexis Both Reply

The GenAI TR/Lexis Study — Stanford Replies ‘We Were Denied Access’

Ed Bukstel

CEO

giupedi.com

@ebukstel


Source link: https://medium.com/@Connected_Dots/stanford-research-bias-against-genai-products-from-lexisnexis-and-thomson-reuters-turns-into-0ccc73bbbbd1?source=rss——large_language_models-5
